Presentation Prepared by:
Kiran Kumar
Pentaho BI Consultant
Objective
By the end of this module, you will know:
Trainer Introduction
What is Data Warehousing ?
What is Data Warehouse Architecture ?
What is Dimensional Modelling & Design ?
What is Business Intelligence ?
Personal, Academic & Professional Information
Name Kiran Kumar
Academic BE
Companies Graymatter Software Service Pvt. Ltd. India
BI/DWH Technologies Exposure
Domain Knowledge
What is Data Warehouse
Loosely Speaking
Refers to a database that is maintained separately from an organization’s operational database.
Officially Speaking
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support
of management's decision-making process.
Data Warehouse Properties
DW
Integrated
Non-volatile
Time-Variant
Subject-Oriented
Subject Oriented: Retail Management Systems
Integrated: Retail Management Systems
Time Variant: Retail Management Systems
Non-Volatile: Retail Management Systems
Goals of Data Warehousing / Business Intelligence
• DW/BI system must make information easily accessible.
• DW/BI system must present information consistently.
• DW/BI system must adapt to change.
• DW/BI system must be a secure bastion that protects the information assets.
• DW/BI system must serve as the authoritative and trustworthy foundation for improved
decision making.
• DW/BI system must present information in a timely way.
• Business community must accept the DW/BI system to deem it successful.
Strategic uses of Data Warehousing
Industry | Functional areas of use | Strategic use
Airline | Operations; marketing | Crew assignment, aircraft development, mix of fares, analysis of route profitability, frequent flyer program promotions
Banking | Product development; Operations; marketing | Customer service, trend analysis, product and service promotions, reduction of IS expenses
Credit card | Product development; marketing | Customer service, new information service, fraud detection
Health care | Operations | Reduction of operational expenses
Investment and Insurance | Product development; Operations; marketing | Risk management, market movements analysis, customer tendencies analysis, portfolio management
Retail chain | Distribution; marketing | Trend analysis, buying pattern analysis, pricing policy, inventory control, sales promotions, optimal distribution channel
Telecommunications | Product development; Operations; marketing | New product and service promotions, reduction of IS budget, profitability analysis
Personal care | Distribution; marketing | Distribution decisions, product promotions, sales decisions, pricing policy
Public sector | Operations | Intelligence gathering
Evolution in Organizational use of data warehouses
• Off line Data Warehouse
Data warehouses at this stage are updated from data in the operational systems on a regular
basis and the data warehouse data is stored in a data structure designed to facilitate reporting.
• Real Time Data Warehouse
Data warehouses at this stage are updated every time an operational system performs a
transaction (e.g. an order or a delivery or a booking.)
Data Marts
• A data mart is a scaled down version of a data warehouse that focuses on a particular subject area.
• A data mart is a subset of an organizational data store, usually oriented to a specific purpose or
major data subject, that may be distributed to support business needs.
• Data marts are analytical data stores designed to focus on specific business functions for a specific
community within an organization.
• Usually designed to support the unique business requirements of a specified department or
business process
• Implemented as the first step in proving the usefulness of the technologies to solve business
problems
Reasons for creating a data mart
• Easy access to frequently needed data
• Creates collective view by a group of users
• Improves end-user response time
• Ease of creation in less time
• Lower cost than implementing a full Data warehouse
• Potential users are more clearly defined than in a full Data warehouse
From the Data Warehouse to Data Marts
[Diagram: three levels of structure: Data Warehouse (organizationally structured), departmentally structured, and individually structured. Axis labels: Less / More (history, normalized, detailed) and Data / Information.]
Characteristics of the Departmental Data Mart
• Small
• Flexible
• Customized by Department
• Source is departmentally
structured data warehouse
Data mart
Data warehouse
Inmon vs. Ralph Kimball Characteristics
Data warehousing Integration
[Diagram: data sources (databases) feed the data warehouse (data organization; storage). Analytical processing, data mining, and data visualization generate knowledge. End users use the data directly or use the generated knowledge for decision making and other tasks: CRM, DSS, EIS.]
Design the BI & DWH Architecture
DWH Architecture Cont..
• Data Source Layer
• Data Extraction Layer
• Staging Area
• ETL Layer
• Data Storage Layer
• Data Logic Layer
• Data Presentation Layer
• Metadata Layer
Advantages & Disadvantages of Data Warehouse
Advantage:
Data warehouses tend to have a very high query success as they have complete control
over the four main areas of data management systems.
• Bottom-Up Approach
• Clean data
• Indexes: multiple types
• Query processing: multiple options
• Security: data and access
• Easy report creation
• Enhanced access to data and information
Disadvantages:
• Preparation may be time consuming
• Long initial implementation time and associated high cost
• Because data must be extracted, transformed and loaded into the warehouse, there is an
element of latency in data warehouse data.
OLTP vs. OLAP Systems
To Summarize
Data, Data everywhere yet ...
• I can’t find the data I need
– data is scattered over the network
– many versions, subtle differences
• I can’t get the data I need
– need an expert to get the data
• I can’t understand the data I found
– available data poorly documented
• I can’t use the data I found
– results are unexpected
– data needs to be transformed from
one form to other
Business Intelligence
• One ultimate use of the data gathered and processed in the data life cycle is for business
intelligence.
• Business intelligence generally involves the creation or use of a data warehouse and/or data
mart for storage of data, and the use of front-end analytical tools such as Pentaho BI Suite,
SAP BO, MSBI, Oracle’s Sales Analyzer and Financial Analyzer, or MicroStrategy’s Web.
• Such tools can be employed by end users to access data, ask queries, request ad hoc (special)
reports, examine scenarios, create CRM activities, devise pricing strategies, and much more.
A producer wants to know….
Which are our lowest/highest margin customers?
Who are my customers and what products are they buying?
What is the most effective distribution channel?
What product promotions have the biggest impact on revenue?
What impact will new products/services have on revenue and margins?
Which customers are most likely to go to the competition?
How Business Intelligence works?
• The process starts with raw data, which are usually kept in corporate databases. For
example, a national retail chain that sells everything from grills and patio furniture to plastic
utensils has data about inventory, customer information, data about past promotions, and
sales numbers in various databases.
• Though all this information may be scattered across multiple systems and may seem
unrelated, business intelligence software can bring it together. This is done by using a data
warehouse.
• In the data warehouse (or mart) tables can be linked, and data cubes are formed. For
instance, inventory information is linked to sales numbers and customer databases, allowing
for deep analysis of information.
• Using the business intelligence software the user can ask queries, request ad-hoc reports, or
conduct any other analysis.
• For example, deep analysis can be carried out by performing multilayer queries. Because all
the databases are linked, one can search for what products a store has too much of, then
determine which of these products commonly sell with popular items, based on previous
sales. After planning a promotion to move the excess stock along with the popular products
(by bundling them together, for example), one can dig deeper to see where this promotion
would be most popular (and most profitable).
• The results of the request can be reports, predictions, alerts, and/or graphical presentations.
These can be disseminated to decision makers to help them in their decision-making tasks.
Dimension Tables
• A dimension table contains the text and descriptive information of the business entities
of an enterprise, represented as hierarchical, categorical information such as Customer,
Product, Date, Location, Department etc.
• The "1" side of a 1-M relationship with the fact table
• Also called lookup or reference tables
• Typically contain the attributes for the SQL answer set.
Type of Dimension Tables
• Standard / Common Dimension
• Conformed Dimension
• Junk Dimension
• Degenerate Dimension
• Role-Playing dimension
• Denormalized Flattened Dimension
• Snowflaked Dimension
• Outrigger Dimension
• Shrunken Dimension
Slowly Changing Dimensions
• Dimension attributes that change slowly over time, rather than on a regular,
time-based schedule.
• In a data warehouse there is a need to track changes in dimension attributes in order to report
historical data.
• Ex: A person changing his/her city from Bangalore to Mumbai.
Type of SCD:
– Type 1: Store only the current value ( Overwrite)
– Type 2: Maintain History changes ( Add New Row)
– Type 3: Create an attribute in the dimension record for previous value ( Add New Attribute)
– Type 4: Using historical table ( Add Mini – Dimension table)
– Type 5: Add Mini-Dimensional & Type 1 Outrigger
SCD 1 – Overwrite the Old Value
SCD 2 – Add a New Row
SCD 2 – Add a New Row
SCD 3 – Add a New Column
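The first three SCD types can be illustrated with a minimal Python sketch of the Bangalore-to-Mumbai example. The column names (customer_key, customer_id, valid_from, valid_to, is_current, previous_city) and the dates are assumptions made for illustration; they do not come from the slides.

```python
from datetime import date

def scd_type1(row, new_city):
    """Type 1: overwrite the attribute; no history is kept."""
    updated = dict(row)
    updated["city"] = new_city
    return updated

def scd_type2(rows, customer_id, new_city, change_date, next_key):
    """Type 2: expire the current row and add a new row with a new surrogate key."""
    result = []
    for row in rows:
        row = dict(row)
        if row["customer_id"] == customer_id and row["is_current"]:
            row["valid_to"] = change_date
            row["is_current"] = False
        result.append(row)
    result.append({
        "customer_key": next_key,      # new surrogate key (hypothetical column)
        "customer_id": customer_id,    # business key stays the same
        "city": new_city,
        "valid_from": change_date,
        "valid_to": None,
        "is_current": True,
    })
    return result

def scd_type3(row, new_city):
    """Type 3: keep the previous value in an extra column."""
    updated = dict(row)
    updated["previous_city"] = row["city"]
    updated["city"] = new_city
    return updated

# Hypothetical customer row before the move.
original = {
    "customer_key": 1, "customer_id": "C100", "city": "Bangalore",
    "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True,
}
moved_on = date(2024, 6, 1)

type1_row = scd_type1(original, "Mumbai")
type2_rows = scd_type2([original], "C100", "Mumbai", moved_on, next_key=2)
type3_row = scd_type3(original, "Mumbai")
```

Type 1 loses the old city, Type 2 keeps both rows with validity dates, and Type 3 keeps only one previous value.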
SCD 4
• What is Mini Dimension ?
– When a dimension has attributes that change rapidly or at frequent intervals, they are split
off to form a separate dimension table called a mini-dimension
Ex: Age of a Customer or Employee, Salary Band, Designation etc.
• Design aspects of Mini Dimension
– The mini dimension table should have its own surrogate key.
– There is no direct connection between the base & mini dimension tables.
– The fact table contains the keys of both the base & mini dimension tables.
• What is SCD4 ?
– Involves usage of 2 or more dimension table in
which one would act as a base dimension and
one or more mini dimension tables
• When to use ?
– Handling Rapidly changing attributes
SCD 5
• What is SCD 5 ?
– Scd 5 involves usage of one or more mini dimension tables and a base dimension table with a reference to mini
dimension key in the base dimension table.
– This reference key in base dimension should be of Type 1 in nature. Therefore it would reflect the current version of
mini dimension attributes in the dimension table
• When to use ?
– When there is a need to access the current values in the mini-dimension directly from the base dimension without
joining a fact table
• What is SCD 5 ?
– The Type 1 reference key should be updated in the base dimension, in all versions of the
dimension records, whenever the corresponding mini-dimension attribute values change
• Design aspects of Mini Dimension
– The mini dimension table should have its own surrogate key.
– There is a direct connection between the base & mini dimension tables.
– The fact table contains the keys of both the base & mini dimension tables.
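The relationship between the base dimension, the mini-dimension, and the fact table in SCD 4 and SCD 5 can be sketched as follows. This is a minimal illustration only; the table contents and the names mini_dim, customer_dim, current_mini_key, and sales_fact are hypothetical.

```python
# Hypothetical mini-dimension of rapidly changing attributes.
mini_dim = {
    1: {"age_band": "20-29", "salary_band": "Low"},
    2: {"age_band": "30-39", "salary_band": "Medium"},
}

# Base customer dimension. current_mini_key is the Type 1 reference that SCD 5 adds.
customer_dim = {
    10: {"name": "Asha", "current_mini_key": 1},
}

# Fact rows carry both keys (SCD 4), so the history lives in the facts.
sales_fact = [
    {"customer_key": 10, "mini_key": 1, "amount": 100.0},
]

def record_band_change(customer_key, new_mini_key):
    """Overwrite the Type 1 reference in the base dimension (SCD 5)."""
    customer_dim[customer_key]["current_mini_key"] = new_mini_key

def add_sale(customer_key, amount):
    """New facts pick up the mini-dimension key that is current at load time."""
    mini_key = customer_dim[customer_key]["current_mini_key"]
    sales_fact.append(
        {"customer_key": customer_key, "mini_key": mini_key, "amount": amount}
    )

record_band_change(10, 2)
add_sale(10, 250.0)

# Current profile, read from the base dimension without joining the fact table.
current_profile = mini_dim[customer_dim[10]["current_mini_key"]]
# Profile at the time of the first sale, read through the fact row.
profile_at_first_sale = mini_dim[sales_fact[0]["mini_key"]]
```

The older fact row still points at the earlier band, while the base dimension always points at the current one.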
Fact Tables
• Stores the performance measurements resulting from an organization’s business process events
• Store the low-level measurement data resulting from a business process in a single dimensional
model
• The term fact represents a business measure.
• Each row in a fact table corresponds to a measurement event
• Contains two or more foreign keys
• Tend to have huge numbers of records
• Useful facts tend to be numeric and additive
Types of Fact Table:
1. Transactional Fact Table
2. Factless Fact Table
3. Snapshot Fact Table
4. Accumulating Fact Table
5. Aggregate Fact Table
6. Consolidated Fact Tables
Transactional Fact table
• These fact tables represent an event that occurred at an instantaneous point in time.
A row exists in the fact table for a given customer or product only if a transaction has occurred
• Grain is the individual transaction
• Mostly Additive Facts
Periodic Snapshot Fact table
• Fact table summarizes many measurement events occurring over a standard period such as a
day, week, month or quarter
• Grain is the period, not the individual transaction
• If we have 1000 people living in a region at the end of month 1 and 1500 people living in the
same region at the end of month 2, the total number of people is not 2500: the month-end
counts cannot be summed across time
• Semi Additive & Non – Additive Facts
Aggregate Fact table
• Fact table contains Aggregated Data
• Mostly Additive Facts
Factless Fact table
• Factless fact table contains no measures
• Only Keys from Dimension tables
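A factless fact table can still answer questions by counting rows. The sketch below assumes a hypothetical student attendance process; the key names and values are illustrative only.

```python
from collections import Counter

# Hypothetical attendance events: each row holds only dimension keys, no measures.
attendance_fact = [
    {"date_key": 20240101, "student_key": 1, "course_key": 501},
    {"date_key": 20240101, "student_key": 2, "course_key": 501},
    {"date_key": 20240102, "student_key": 1, "course_key": 501},
    {"date_key": 20240102, "student_key": 1, "course_key": 502},
]

# The only "measure" is the number of rows in each group.
attendance_per_course = Counter(row["course_key"] for row in attendance_fact)
attendance_per_day = Counter(row["date_key"] for row in attendance_fact)
```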
Accumulated Fact table
Consolidated Fact table
It is often convenient to combine facts from multiple processes together into a
single consolidated fact table if they can be expressed at the same grain. For example, sales
actuals can be consolidated with sales forecasts in a single fact table to make the task of
analyzing actuals versus forecasts simple and fast, as compared to assembling a drill-across
application using separate fact tables. Consolidated fact tables add burden to the ETL
processing, but ease the analytic burden on the BI applications. They should be considered
for cross-process metrics that are frequently analyzed together.
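A minimal sketch of the actuals-versus-forecast consolidation described above, assuming both inputs share the grain "one row per month per product". The data, the column names, and the choice to treat a missing value as 0.0 are assumptions for illustration.

```python
# Hypothetical inputs at the same grain: (month, product) -> amount.
sales_actuals = {("2024-01", "Grill"): 120.0, ("2024-02", "Grill"): 90.0}
sales_forecast = {("2024-01", "Grill"): 100.0, ("2024-02", "Grill"): 110.0}

def consolidate(actuals, forecast):
    """Build one fact row per grain key holding both measures side by side."""
    rows = []
    for key in sorted(set(actuals) | set(forecast)):
        actual = actuals.get(key, 0.0)     # assumption: missing actual counts as 0.0
        planned = forecast.get(key, 0.0)   # assumption: missing forecast counts as 0.0
        rows.append({
            "month": key[0],
            "product": key[1],
            "actual_amount": actual,
            "forecast_amount": planned,
            "variance": actual - planned,
        })
    return rows

consolidated_fact = consolidate(sales_actuals, sales_forecast)
```

The merge work happens once, in the ETL step, so a report only has to read one table.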
Type of Fact / Measure
• Additive: Additive facts are facts that can be summed up through all of the dimensions in the
fact table.
• Semi-Additive: Semi-additive facts are facts that can be summed up for some of the
dimensions in the fact table, but not the others.
• Non-Additive: Non-additive facts are facts that cannot be summed up for any of the
dimensions present in the fact table.
Type of Fact / Measure Cont..
• Additive
The purpose of this table is to record the sales amount for each product in each store on a daily
basis. Sales_Amount is the fact. In this case, Sales_Amount is an additive fact, because you can sum up
this fact along any of the three dimensions present in the fact table -- date, store, and product. For
example, the sum of Sales_Amount for all 7 days in a week represents the total sales amount for that
week.
• Semi-Additive & Non-Additive:
The purpose of this table is to record the current balance for each account at the end of each day, as well
as the profit margin for each account for each day. Current_Balance and Profit_Margin are the
facts. Current_Balance is a semi-additive fact, as it makes sense to add them up for all accounts (what's
the total current balance for all accounts in the bank?), but it does not make sense to add them up
through time (adding up all current balances for a given account for each day of the month does not give
us any useful information). Profit_Margin is a non-additive fact, for it does not make sense to add them up
for the account level or the day level.
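The semi-additive behaviour of Current_Balance can be sketched in Python. The snapshot rows are hypothetical, and using the closing (latest) balance or the average across time is one common choice rather than the only one.

```python
# Hypothetical daily account snapshot rows: (day, account, current_balance).
balances = [
    ("2024-01-01", "A", 100.0),
    ("2024-01-01", "B", 50.0),
    ("2024-01-02", "A", 120.0),
    ("2024-01-02", "B", 40.0),
]

def total_balance_on(day):
    """Summing across accounts for a single day is meaningful."""
    return sum(bal for d, _, bal in balances if d == day)

def closing_balance(account):
    """Across time, take the latest snapshot instead of summing."""
    rows = [(d, bal) for d, acc, bal in balances if acc == account]
    return max(rows)[1]  # ISO date strings sort chronologically

def average_balance(account):
    """An average over the period is another valid aggregate across time."""
    rows = [bal for _, acc, bal in balances if acc == account]
    return sum(rows) / len(rows)
```

Summing account A over both days would give 220.0, a figure that matches no real balance.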
Dimensional Models
• A denormalized relational model
– Made up of tables with attributes
– Relationships defined by keys and foreign keys
• Organized for understandability and ease of reporting rather than update.
• Queried and maintained by SQL or special purpose management tools.
• Star Schemas Versus OLAP Cubes
– Dimensional models implemented in relational database management systems are
referred to as star schemas because of their resemblance to a star-like structure.
– Dimensional models implemented in multidimensional database environments are
referred to as online analytical processing (OLAP) cubes.
– Both stars and cubes have a common logical design with recognizable dimensions;
however, the physical implementation differs
OLAP
• OLAP stands for On-Line Analytical Processing
• For people on the business side, the key feature is "multidimensional": the ability to
analyze metrics in different dimensions such as time, geography,
gender, product, etc.
For example, sales for the company are up.
- What region is most responsible for this increase?
- Which store in this region is most responsible for the increase?
- What particular product category contributed the most to the increase?
Answering these types of questions in order means that you are performing an OLAP
analysis.
• In the OLAP world, there are two main types, plus a hybrid of the two:
1. Multidimensional OLAP (MOLAP)
2. Relational OLAP (ROLAP)
3. Hybrid OLAP (HOLAP), which refers to technologies that combine MOLAP and ROLAP.
MOLAP
• This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a
multidimensional cube. The storage is not in the relational database, but in proprietary
formats.
Advantages:
• Excellent performance: MOLAP cubes are built for fast data retrieval, and are optimal for
slicing and dicing operations.
• Can perform complex calculations: All calculations have been pre-generated when the cube is
created. Hence, complex calculations are not only doable, but they return quickly.
Disadvantages:
• Limited in the amount of data it can handle: Because all calculations are performed when the
cube is built, it is not possible to include a large amount of data in the cube itself. This is not
to say that the data in the cube cannot be derived from a large amount of data. Indeed, this
is possible. But in this case, only summary-level information will be included in the cube itself.
• Requires additional investment: Cube technology is often proprietary and does not already
exist in the organization. Therefore, to adopt MOLAP technology, chances are additional
investments in human and capital resources are needed.
MOLAP Operation
• Since OLAP servers are based on multidimensional view of data, we will discuss OLAP
operations in multidimensional data.
• Here is the list of OLAP operations:
1. Roll-up
2. Drill-down
3. Slice and dice
4. Pivot (rotate)
MOLAP Operation – Roll Up
• Roll-up performs aggregation on a data cube in any of the following ways:
– By climbing up a concept hierarchy for a dimension
– By dimension reduction
• The following diagram illustrates how roll-up works
– Roll-up is performed by climbing up a concept hierarchy for the dimension location.
– Initially the concept hierarchy was "street < city < province < country".
– On rolling up, the data is aggregated by ascending the location hierarchy from the level of city to the level of country.
– The data is grouped into countries rather than cities.
– When roll-up is performed by dimension reduction, one or more dimensions are removed from the data cube.
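Both forms of roll-up can be sketched in plain Python. The cube cells and unit counts are hypothetical, and a list of tuples stands in for a real multidimensional store.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, country, quarter, item, units).
cells = [
    ("Toronto", "Canada", "Q1", "Mobile", 10),
    ("Toronto", "Canada", "Q1", "Modem", 5),
    ("Vancouver", "Canada", "Q1", "Mobile", 7),
    ("Chicago", "USA", "Q1", "Mobile", 4),
]

def roll_up_to_country(rows):
    """Climb the location hierarchy: aggregate city-level cells per country."""
    totals = defaultdict(int)
    for _city, country, quarter, item, units in rows:
        totals[(country, quarter, item)] += units
    return dict(totals)

def roll_up_drop_item(rows):
    """Dimension reduction: remove the item dimension entirely."""
    totals = defaultdict(int)
    for city, _country, quarter, _item, units in rows:
        totals[(city, quarter)] += units
    return dict(totals)

by_country = roll_up_to_country(cells)
without_item = roll_up_drop_item(cells)
```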
MOLAP Operation – Drill Down
• Drill-down is the reverse operation of roll-up. It is performed by either of the following ways:
– By stepping down a concept hierarchy for a dimension
– By introducing a new dimension.
• The following diagram illustrates how drill-down works:
– Drill-down is performed by stepping down a concept hierarchy for the dimension time.
– Initially the concept hierarchy was "day < month < quarter < year."
– On drilling down, the time dimension is descended from the level of quarter to the level of month.
– When drill-down is performed by introducing a new dimension, one or more dimensions are added to the data cube.
– It navigates the data from less detailed data to highly detailed data.
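Drill-down along the time hierarchy can be sketched as the reverse of aggregation, assuming the month-level detail is still available. The figures are hypothetical.

```python
# Hypothetical detail rows kept at month level: (quarter, month, units).
monthly_sales = [
    ("Q1", "Jan", 30), ("Q1", "Feb", 20), ("Q1", "Mar", 50),
    ("Q2", "Apr", 40), ("Q2", "May", 10), ("Q2", "Jun", 25),
]

def by_quarter(rows):
    """The less detailed view: one total per quarter."""
    totals = {}
    for quarter, _month, units in rows:
        totals[quarter] = totals.get(quarter, 0) + units
    return totals

def drill_down(rows, quarter):
    """Descend from the quarter level to the month level for one quarter."""
    return {month: units for q, month, units in rows if q == quarter}

quarter_view = by_quarter(monthly_sales)
q1_months = drill_down(monthly_sales, "Q1")
```

The monthly values for a quarter always add back up to that quarter's total, which is a useful consistency check.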
MOLAP Operation – Slice
• The slice operation selects a single value on one dimension of a given cube and provides a new
sub-cube. Consider the following diagram that shows how slice works.
– Here Slice is performed for the dimension "time" using the criterion time = "Q1".
– It forms a new sub-cube containing only the cells that match that criterion.
MOLAP Operation – Dice
• Dice selects two or more dimensions from a given cube and provides a new sub-cube.
Consider the following diagram that shows the dice operation.
• The dice operation on the cube based on the following selection criteria involves three
dimensions.
– (location = "Toronto" or "Vancouver")
– (time = "Q1" or "Q2")
– (item = "Mobile" or "Modem")
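Slice and dice can both be sketched as filters over cube cells, using the selection criteria listed above. The cell values and the extra "Phone" and "Chicago" entries are hypothetical.

```python
# Hypothetical cube cells: (location, time, item, units).
cube = [
    ("Toronto", "Q1", "Mobile", 10),
    ("Toronto", "Q2", "Modem", 5),
    ("Vancouver", "Q1", "Modem", 7),
    ("Vancouver", "Q3", "Mobile", 9),
    ("Chicago", "Q1", "Phone", 4),
]

def slice_cube(cells, time):
    """Slice: fix one value on a single dimension (here, time)."""
    return [c for c in cells if c[1] == time]

def dice_cube(cells, locations, times, items):
    """Dice: keep cells that match value sets on several dimensions."""
    return [
        c for c in cells
        if c[0] in locations and c[1] in times and c[2] in items
    ]

q1_slice = slice_cube(cube, "Q1")
diced = dice_cube(
    cube,
    locations={"Toronto", "Vancouver"},
    times={"Q1", "Q2"},
    items={"Mobile", "Modem"},
)
```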
MOLAP Operation – Pivot
• The pivot operation is also known as rotation. It rotates the data axes in view in order to
provide an alternative presentation of data. Consider the following diagram that shows the
pivot operation
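A pivot only changes which dimension is shown on which axis; the values stay the same. A minimal sketch, with a hypothetical two-dimensional view stored as nested dictionaries:

```python
# Hypothetical 2-D view: rows are locations, columns are items.
view = {
    "Toronto": {"Mobile": 10, "Modem": 5},
    "Vancouver": {"Mobile": 7, "Modem": 3},
}

def pivot(table):
    """Rotate the axes: former column labels become row labels."""
    rotated = {}
    for row_label, columns in table.items():
        for col_label, value in columns.items():
            rotated.setdefault(col_label, {})[row_label] = value
    return rotated

rotated_view = pivot(view)
```

Pivoting twice returns the original view.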
ROLAP
• This methodology relies on manipulating the data stored in the relational database to give
the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action
of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement.
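The "WHERE clause" idea can be sketched with Python's built-in sqlite3 module. The sales table, its columns, and its rows are hypothetical, and a real ROLAP tool would generate such SQL itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (location TEXT, quarter TEXT, item TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Toronto", "Q1", "Mobile", 10),
        ("Toronto", "Q2", "Modem", 5),
        ("Vancouver", "Q1", "Modem", 7),
        ("Chicago", "Q1", "Mobile", 4),
    ],
)

# Slice: one WHERE condition on a single dimension.
slice_total = conn.execute(
    "SELECT SUM(units) FROM sales WHERE quarter = ?", ("Q1",)
).fetchone()[0]

# Dice: WHERE conditions on several dimensions at once.
dice_rows = conn.execute(
    "SELECT location, quarter, item, units FROM sales "
    "WHERE location IN ('Toronto', 'Vancouver') "
    "AND quarter IN ('Q1', 'Q2') "
    "ORDER BY location, quarter"
).fetchall()
conn.close()
```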
Advantages:
• Can handle large amounts of data: The data size limitation of ROLAP technology is the
limitation on data size of the underlying relational database. In other words, ROLAP itself
places no limitation on data amount.
• Can leverage functionalities inherent in the relational database: Often, relational database
already comes with a host of functionalities. ROLAP technologies, since they sit on top of the
relational database, can therefore leverage these functionalities.
Disadvantages:
• Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple
SQL queries) in the relational database, the query time can be long if the underlying data size
is large.
• Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL
statements to query the relational database, and SQL statements do not fit all needs (for
example, it is difficult to perform complex calculations using SQL), ROLAP technologies are
therefore traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by
building into the tool out-of-the-box complex functions as well as the ability to allow users to
define their own functions.
Types of Relational Dimensional Models
• Star Schema
• Snowflake Schema
• Fact Centipede Schema
• Fact Constellation Schema
Star Schema
Snowflake Schema
Same as Star Schema, but the dimension tables are normalized (split)
Fact Centipede Schema
Every dimension table is connected to the fact table
Fact Constellation Schema
• For each star schema it is possible to construct a fact constellation schema
(for example, by splitting the original star schema into several star schemas, each of which describes
facts at another level of the dimension hierarchies). The fact constellation architecture contains
multiple fact tables that share many dimension tables.
• The main shortcoming of the fact constellation schema is a more complicated design
because many variants for particular kinds of aggregation must be considered and selected.
Moreover, dimension tables are still large.
HOLAP
• HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For
summary-type information, HOLAP leverages cube technology for faster performance. When
detail information is needed, HOLAP can "drill through" from the cube into the underlying
relational data.
Difference between ERD & Dimensional Model
ERD:
• One table per entity
• Minimize data redundancy
• Optimize update / insert
• The transaction processing model
Dimensional Model:
• One fact table for data organization
• Maximize understandability
• Optimized for retrieval
• The data warehousing model
Choosing the Data Mart / Dimensional
Design Process
1. Select the business process
2. Declare the grain
3. Identify the dimensions
4. Identify the facts
Business Process
• Business processes are the operational activities performed by your organization, such as taking
an order or registering students.
• It is important to determine the identity of the transaction table and specify exactly what it
represents.
• Represent a process or reporting environment that is of value to the organization
Grain (unit of analysis)
• Atomic grain refers to the lowest level at which data is captured by a given business process
• The grain determines what each fact record represents: the level of detail
• For example
– Individual transactions
– Snapshots (points in time)
– Line items
• Generally better to focus on the smallest grain
Dimensions
• A table (or hierarchy of tables) connected with the fact table with keys and foreign keys
• Preferably single valued for each fact record (1:m)
• Connected with surrogate (generated) keys, not operational keys
• Dimension tables contain text or numeric attributes
Facts
• Normally numeric Keys and additive measures
• Measurements associated with fact table records at fact table granularity
• Non-key attributes in the fact table
Attributes in dimension tables are constants. Facts vary with the granularity of the fact
table
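The outcome of the four design steps (business process, grain, dimensions, facts) can be sketched as a small star schema using Python's built-in sqlite3 module. The retail sales process, the table and column names, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);
-- Grain: one row per product per store per day.
CREATE TABLE fact_sales (
    date_key INTEGER,
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key INTEGER REFERENCES dim_store(store_key),
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Grill', 'Outdoor'), (2, 'Patio chair', 'Outdoor');
INSERT INTO dim_store VALUES (1, 'Store A', 'North'), (2, 'Store B', 'South');
INSERT INTO fact_sales VALUES
    (20240101, 1, 1, 200.0),
    (20240101, 2, 1, 50.0),
    (20240102, 1, 2, 300.0);
""")

# Dimension attributes group and label the result; the fact supplies the measure.
sales_by_region = conn.execute(
    "SELECT s.region, SUM(f.sales_amount) "
    "FROM fact_sales f JOIN dim_store s ON f.store_key = s.store_key "
    "GROUP BY s.region ORDER BY s.region"
).fetchall()
conn.close()
```

The fact table holds only surrogate keys and the numeric measure, while the descriptive text lives in the dimension tables.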
Business Intelligence Data Warehouse System

More Related Content

What's hot

DAS Slides: Data Architect vs. Data Engineer vs. Data Modeler
DAS Slides: Data Architect vs. Data Engineer vs. Data ModelerDAS Slides: Data Architect vs. Data Engineer vs. Data Modeler
DAS Slides: Data Architect vs. Data Engineer vs. Data ModelerDATAVERSITY
 
Why Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoWhy Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoJusto Hidalgo
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing conceptspcherukumalla
 
Data Analytics
Data AnalyticsData Analytics
Data AnalyticsRavi Nayak
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence ArchitecturePhilippe Julio
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data martAmit Sarkar
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemJames Serra
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data WarehouseShanthi Mukkavilli
 
Data Stewards – Defining and Assigning
Data Stewards – Defining and AssigningData Stewards – Defining and Assigning
Data Stewards – Defining and AssigningDATAVERSITY
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business IntelligenceRonan Soares
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?DATAVERSITY
 
Business Intelligence tools comparison
Business Intelligence tools comparisonBusiness Intelligence tools comparison
Business Intelligence tools comparisonStratebi
 

What's hot (20)

DAS Slides: Data Architect vs. Data Engineer vs. Data Modeler
DAS Slides: Data Architect vs. Data Engineer vs. Data ModelerDAS Slides: Data Architect vs. Data Engineer vs. Data Modeler
DAS Slides: Data Architect vs. Data Engineer vs. Data Modeler
 
Why Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by DenodoWhy Data Virtualization? An Introduction by Denodo
Why Data Virtualization? An Introduction by Denodo
 
Date warehousing concepts
Date warehousing conceptsDate warehousing concepts
Date warehousing concepts
 
Data Analytics
Data AnalyticsData Analytics
Data Analytics
 
Business Intelligence
Business IntelligenceBusiness Intelligence
Business Intelligence
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence Architecture
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data mart
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
Data Stewards – Defining and Assigning
Data Stewards – Defining and AssigningData Stewards – Defining and Assigning
Data Stewards – Defining and Assigning
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
Data Visualization Tools
Data Visualization ToolsData Visualization Tools
Data Visualization Tools
 
Introduction to Business Intelligence
Introduction to Business IntelligenceIntroduction to Business Intelligence
Introduction to Business Intelligence
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Power BI visuals
Power BI visualsPower BI visuals
Power BI visuals
 
Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?Data Warehouse or Data Lake, Which Do I Choose?
Data Warehouse or Data Lake, Which Do I Choose?
 
Ppt
PptPpt
Ppt
 
Business Intelligence tools comparison
Business Intelligence tools comparisonBusiness Intelligence tools comparison
Business Intelligence tools comparison
 
Business Intelligence concepts
Business Intelligence conceptsBusiness Intelligence concepts
Business Intelligence concepts
 
data warehousing
data warehousingdata warehousing
data warehousing
 

Business Intelligence Data Warehouse System

  • 1. Presentation Prepared by: Kiran Kumar Pentaho BI Consultant
  • 2. Objective: At the end of this module, you will know:
      • Trainer Introduction
      • What is Data Warehousing?
      • What is Data Warehouse Architecture?
      • What is Dimensional Modelling & Design?
      • What is Business Intelligence?
  • 3. Personal, Academic & Professional Information
      • Name: Kiran Kumar
      • Academic: BE
      • Companies: Graymatter Software Service Pvt. Ltd., India
      • BI/DWH Technologies Exposure
      • Domain Knowledge
  • 4. What is a Data Warehouse?
      • Loosely speaking: a database that is maintained separately from an organization’s operational database.
      • Officially speaking: a data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
  • 6. Subject Oriented: Retail Management Systems
  • 8. Time Variant: Retail Management Systems
  • 9. Non-Volatile: Retail Management Systems
  • 10. Goals of Data Warehousing / Business Intelligence
      • The DW/BI system must make information easily accessible.
      • The DW/BI system must present information consistently.
      • The DW/BI system must adapt to change.
      • The DW/BI system must be a secure bastion that protects the information assets.
      • The DW/BI system must serve as the authoritative and trustworthy foundation for improved decision making.
      • The DW/BI system must present information in a timely way.
      • The business community must accept the DW/BI system to deem it successful.
  • 11. Strategic Uses of Data Warehousing
      Industry | Functional areas of use | Strategic use
      – Airline | Operations; marketing | Crew assignment, aircraft development, mix of fares, analysis of route profitability, frequent flyer program promotions
      – Banking | Product development; operations; marketing | Customer service, trend analysis, product and service promotions, reduction of IS expenses
      – Credit card | Product development; marketing | Customer service, new information service, fraud detection
      – Health care | Operations | Reduction of operational expenses
      – Investment and insurance | Product development; operations; marketing | Risk management, market movements analysis, customer tendencies analysis, portfolio management
      – Retail chain | Distribution; marketing | Trend analysis, buying pattern analysis, pricing policy, inventory control, sales promotions, optimal distribution channel
      – Telecommunications | Product development; operations; marketing | New product and service promotions, reduction of IS budget, profitability analysis
      – Personal care | Distribution; marketing | Distribution decisions, product promotions, sales decisions, pricing policy
      – Public sector | Operations | Intelligence gathering
  • 12. Evolution in Organizational Use of Data Warehouses
      • Offline Data Warehouse: Data warehouses at this stage are updated from data in the operational systems on a regular basis, and the data is stored in a structure designed to facilitate reporting.
      • Real-Time Data Warehouse: Data warehouses at this stage are updated every time an operational system performs a transaction (e.g. an order, a delivery or a booking).
  • 13. Data Marts
      • A data mart is a scaled-down version of a data warehouse that focuses on a particular subject area.
      • A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs.
      • Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization.
      • Usually designed to support the unique business requirements of a specified department or business process.
      • Implemented as the first step in proving the usefulness of the technologies to solve business problems.
      Reasons for creating a data mart:
      • Easy access to frequently needed data
      • Creates a collective view for a group of users
      • Improves end-user response time
      • Ease of creation in less time
      • Lower cost than implementing a full data warehouse
      • Potential users are more clearly defined than in a full data warehouse
  • 14. From the Data Warehouse to Data Marts [Diagram labels: Data Warehouse (Organizationally Structured); Departmentally Structured; Individually Structured; scale from Less to More for History, Normalized, Detailed; Data; Information]
  • 15. Characteristics of the Departmental Data Mart
      • Small
      • Flexible
      • Customized by department
      • Source is the departmentally structured data warehouse
      [Diagram labels: Data mart; Data warehouse]
  • 16. Inmon vs. Ralph Kimball Characteristics
  • 17. Data Warehousing Integration [Diagram labels: Data sources (databases); Data organization and storage; Data Warehouse (storage); Analytical processing, data mining; Data visualization; Generate knowledge; Use of knowledge; Direct use; Information; End users: decision making and other tasks (CRM, DSS, EIS)]
  • 18. Design the BI & DWH Architecture
  • 19. DWH Architecture (cont.)
      • Data Source Layer
      • Data Extraction Layer
      • Staging Area
      • ETL Layer
      • Data Storage Layer
      • Data Logic Layer
      • Data Presentation Layer
      • Metadata Layer
  • 20. Advantages & Disadvantages of a Data Warehouse
      Advantages: Data warehouses tend to have a very high query success rate, as they have complete control over the four main areas of data management systems.
      • Bottom-up approach
      • Clean data
      • Indexes: multiple types
      • Query processing: multiple options
      • Security: data and access
      • Easy report creation
      • Enhanced access to data and information
      Disadvantages:
      • Preparation may be time consuming
      • Long initial implementation time and associated high cost
      • Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
  • 21. OLTP vs. OLAP Systems
  • 23. Data, Data everywhere yet ...
      • I can’t find the data I need
          – data is scattered over the network
          – many versions, subtle differences
      • I can’t get the data I need
          – need an expert to get the data
      • I can’t understand the data I found
          – available data poorly documented
      • I can’t use the data I found
          – results are unexpected
          – data needs to be transformed from one form to another
  • 24. Business Intelligence
      • One ultimate use of the data gathered and processed in the data life cycle is for business intelligence.
      • Business intelligence generally involves the creation or use of a data warehouse and/or data mart for storage of data, and the use of front-end analytical tools such as Pentaho BI Suite, SAP BO, MSBI, Oracle’s Sales Analyzer and Financial Analyzer, or MicroStrategy’s Web.
      • Such tools can be employed by end users to access data, ask queries, request ad hoc (special) reports, examine scenarios, create CRM activities, devise pricing strategies, and much more.
  • 25. A producer wants to know...
      • Which are our lowest/highest margin customers?
      • Who are my customers and what products are they buying?
      • What is the most effective distribution channel?
      • What product promotions have the biggest impact on revenue?
      • What impact will new products/services have on revenue and margins?
      • Which customers are most likely to go to the competition?
  • 26. How Does Business Intelligence Work?
      • The process starts with raw data, which are usually kept in corporate databases. For example, a national retail chain that sells everything from grills and patio furniture to plastic utensils had data about inventory, customer information, data about past promotions, and sales numbers in various databases.
      • Though all this information may be scattered across multiple systems and may seem unrelated, business intelligence software can bring it together. This is done by using a data warehouse.
      • In the data warehouse (or mart), tables can be linked and data cubes are formed. For instance, inventory information is linked to sales numbers and customer databases, allowing for deep analysis of information.
      • Using the business intelligence software, the user can ask queries, request ad hoc reports, or conduct any other analysis.
      • For example, deep analysis can be carried out by performing multilayer queries. Because all the databases are linked, one can search for what products a store has too much of and determine which of these products commonly sell with popular items, based on previous sales. After planning a promotion to move the excess stock along with the popular products (by bundling them together, for example), one can dig deeper to see where this promotion would be most popular (and most profitable).
      • The results of the request can be reports, predictions, alerts, and/or graphical presentations. These can be disseminated to decision makers to help them in their decision-making tasks.
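The multilayer query described in slide 26 can be sketched in a few lines of Python. This is only an illustration: the inventory figures, the sales baskets and the overstock threshold are hypothetical, and a real BI tool would run an equivalent query against the linked warehouse tables.

```python
from collections import Counter

# Hypothetical inventory table: product -> units in stock.
inventory = {"grill": 12, "patio chair": 240, "plastic forks": 900, "charcoal": 35}

# Hypothetical sales history: each set is one customer purchase.
baskets = [
    {"grill", "charcoal"},
    {"grill", "plastic forks"},
    {"grill", "charcoal", "plastic forks"},
    {"patio chair"},
    {"grill", "plastic forks"},
]

OVERSTOCK_THRESHOLD = 200  # assumed cut-off for "too much stock"

# Layer 1: products the store has too much of.
overstocked = {p for p, units in inventory.items() if units > OVERSTOCK_THRESHOLD}

# Layer 2: the most popular item, by number of baskets it appears in.
popularity = Counter(item for basket in baskets for item in basket)
popular_item, _ = popularity.most_common(1)[0]

# Layer 3: how often each overstocked product sold together with the popular item.
pair_counts = Counter()
for basket in baskets:
    if popular_item in basket:
        for product in overstocked & basket:
            pair_counts[product] += 1

# Overstocked products ranked by how often they sell with the popular item.
bundle_candidates = pair_counts.most_common()
print(popular_item, bundle_candidates)
```

The ranked candidates are the products one might bundle with the popular item in a promotion; deciding where that promotion would be most profitable would need further data (for example store or region), which this sketch leaves out.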
  • 27. Dimension Tables
      • A dimension table contains text and descriptive information about the business entities of an enterprise, represented as hierarchical, categorical information such as Customer, Product, Date, Location, Department, etc.
      • It is the "1" side in a 1-M relationship
      • Also called lookup or reference tables
      • Typically contain the attributes for the SQL answer set
  • 28. Types of Dimension Tables
      • Standard / Common Dimension
      • Conformed Dimension
      • Junk Dimension
      • Degenerate Dimension
      • Role-Playing Dimension
      • Denormalized Flattened Dimension
      • Snowflaked Dimension
      • Outrigger Dimension
      • Shrunken Dimension
  • 29. Slowly Changing Dimensions
      • Dimension attributes that change slowly over time, rather than on a regular, time-based schedule.
      • In a data warehouse there is a need to track changes in dimension attributes in order to report historical data.
      • Ex: A person changing his/her city from Bangalore to Mumbai.
      Types of SCD:
          – Type 1: Store only the current value (Overwrite)
          – Type 2: Maintain the history of changes (Add New Row)
          – Type 3: Create an attribute in the dimension record for the previous value (Add New Attribute)
          – Type 4: Use a historical table (Add Mini-Dimension table)
          – Type 5: Add Mini-Dimension & Type 1 Outrigger
  • 30. SCD 1 – Overwrite the Old Value
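A minimal sketch of the Type 1 overwrite, using the Bangalore to Mumbai example from the SCD overview. The dimension layout, the key values and the function name `apply_scd1` are assumptions made for illustration, not part of the original deck.

```python
# Hypothetical customer dimension, keyed by surrogate key.
customer_dim = {
    1: {"customer_id": "C100", "name": "Asha", "city": "Bangalore"},
}

def apply_scd1(dim, customer_id, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in dim.values():
        if row["customer_id"] == customer_id:
            row["city"] = new_city

apply_scd1(customer_dim, "C100", "Mumbai")
print(customer_dim)
```

No new row or column is created, so reports run after the change see only the new city, including for historical facts.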
  • 31. SCD 2 – Add a New Row
  • 32. SCD 2 – Add a New Row
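A minimal sketch of the Type 2 approach for the same city change. The validity columns (`valid_from`, `valid_to`, `is_current`), the dates and the function name are hypothetical; they represent one common way to mark row versions, not necessarily the layout used on the slides.

```python
from datetime import date

# Hypothetical customer dimension; each row is one version of a customer.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "city": "Bangalore",
     "valid_from": date(2020, 1, 1), "valid_to": None, "is_current": True},
]

def apply_scd2(dim, customer_id, new_city, change_date):
    """Type 2: close the current row and append a new row with a new surrogate key."""
    for row in dim:
        if row["customer_id"] == customer_id and row["is_current"]:
            if row["city"] == new_city:
                return  # nothing changed, keep the current version
            row["valid_to"] = change_date
            row["is_current"] = False
    next_key = max(row["customer_key"] for row in dim) + 1
    dim.append({"customer_key": next_key, "customer_id": customer_id, "city": new_city,
                "valid_from": change_date, "valid_to": None, "is_current": True})

apply_scd2(customer_dim, "C100", "Mumbai", date(2024, 6, 1))
for row in customer_dim:
    print(row)
```

Facts loaded before the change keep pointing at surrogate key 1 (Bangalore), while later facts use key 2 (Mumbai), which is how the full history is preserved.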
  • 33. SCD 3 – Add a New Column
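A minimal sketch of the Type 3 approach, again with the city change as the example. The column names `current_city` and `previous_city` and the function name are assumptions for illustration.

```python
# Hypothetical customer dimension with an extra column for the previous value.
customer_dim = {
    1: {"customer_id": "C100", "current_city": "Bangalore", "previous_city": None},
}

def apply_scd3(dim, customer_id, new_city):
    """Type 3: move the current value into the 'previous' column, then overwrite."""
    for row in dim.values():
        if row["customer_id"] == customer_id and row["current_city"] != new_city:
            row["previous_city"] = row["current_city"]
            row["current_city"] = new_city

apply_scd3(customer_dim, "C100", "Mumbai")
print(customer_dim)
```

With a single extra column only one prior value is retained; a second move would overwrite "Bangalore" in `previous_city`.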
  • 34. SCD 4
      • What is a Mini-Dimension?
          – When a dimension has attributes that change rapidly or at frequent intervals, they are split off to form a separate dimension table called a mini-dimension. Ex: Age of a Customer or Employee, Salary Band, Designation, etc.
      • Design aspects of a Mini-Dimension
          – The mini-dimension table should have its own surrogate key.
          – There is no direct connection between the base and the mini-dimension table.
          – The fact table contains the primary keys of both the base and the mini-dimension table.
      • What is SCD 4?
          – Involves the use of two or more dimension tables, in which one acts as the base dimension and the others as mini-dimension tables.
      • When to use?
          – Handling rapidly changing attributes
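The design aspects listed for SCD 4 can be sketched as follows. All table names, keys, bands and amounts are hypothetical; the point is only that the fact rows carry both keys and that the base and mini-dimension are never joined to each other directly.

```python
# Base dimension: slowly changing attributes only.
customer_dim = {1: {"customer_id": "C100", "name": "Asha"}}

# Mini-dimension with its own surrogate key: one row per combination
# of the rapidly changing attributes (stored as bands).
demographics_mini_dim = {
    10: {"age_band": "20-29", "salary_band": "Low"},
    11: {"age_band": "30-39", "salary_band": "Medium"},
}

# Fact rows carry the keys of both the base and the mini-dimension.
sales_fact = [
    {"customer_key": 1, "demographics_key": 10, "sales_amount": 120.0},
    {"customer_key": 1, "demographics_key": 11, "sales_amount": 300.0},
]

def sales_by_age_band(facts, mini_dim):
    """Join fact rows to the mini-dimension and total sales per age band."""
    totals = {}
    for row in facts:
        band = mini_dim[row["demographics_key"]]["age_band"]
        totals[band] = totals.get(band, 0.0) + row["sales_amount"]
    return totals

print(sales_by_age_band(sales_fact, demographics_mini_dim))
```

When the customer moves to a new age or salary band, no dimension row is rewritten; later fact rows simply reference a different mini-dimension key.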
  • 35. SCD 5
      • What is SCD 5?
          – SCD 5 involves the use of one or more mini-dimension tables and a base dimension table that holds a reference to the mini-dimension key.
          – This reference key in the base dimension should be Type 1 in nature, so that it reflects the current version of the mini-dimension attributes in the dimension table.
      • When to use?
          – When there is a need to access the current values in the mini-dimension directly from the base dimension without joining through a fact table
      • What is SCD 5?
          – The Type 1 reference key should be updated in all versions of the base dimension records whenever the corresponding mini-dimension attribute values change.
      • Design aspects of a Mini-Dimension
          – The mini-dimension table should have its own surrogate key.
          – There is a direct connection between the base and the mini-dimension table.
          – The fact table contains the primary keys of both the base and the mini-dimension table.
  • 36. Fact Tables
      • Store the performance measurements resulting from an organization’s business process events
      • Store the low-level measurement data resulting from a business process in a single dimensional model
      • The term fact represents a business measure
      • Each row in a fact table corresponds to a measurement event
      • Contain two or more foreign keys
      • Tend to have huge numbers of records
      • Useful facts tend to be numeric and additive
      Types of Fact Table:
      1. Transactional Fact Table
      2. Factless Fact Table
      3. Snapshot Fact Table
      4. Accumulating Fact Table
      5. Aggregate Fact Table
      6. Consolidated Fact Table
  • 37. Transactional Fact Table
      • These fact tables represent an event that occurred at an instantaneous point in time. A row exists in the fact table for a given customer or product only if a transaction has occurred.
      • Grain is the individual transaction
      • Mostly additive facts
  • 38. Periodic Snapshot Fact Table
      • The fact table summarizes many measurement events occurring over a standard period such as a day, week, month or quarter
      • Grain is the period, not the individual transaction
      • If we have 1000 people living in a region at the end of month 1 and 1500 people living in the same region at the end of month 2, then the total number of people will not be 2500
      • Semi-additive & non-additive facts
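The population example from slide 38 can be made concrete with a small sketch. The 1000 and 1500 figures come from the slide; the second region ("South"), the region names and the helper functions are hypothetical additions so that a valid cross-region sum can be shown next to the invalid cross-period sum.

```python
# Monthly snapshot: people living in each region at month end.
population_snapshot = [
    {"month": 1, "region": "North", "population": 1000},
    {"month": 2, "region": "North", "population": 1500},
    {"month": 1, "region": "South", "population": 800},   # hypothetical extra region
    {"month": 2, "region": "South", "population": 700},
]

def total_across_regions(rows, month):
    """Meaningful: sum over the region dimension within a single period."""
    return sum(r["population"] for r in rows if r["month"] == month)

def latest_for_region(rows, region):
    """Over time, take the latest period's value instead of summing."""
    latest = max((r for r in rows if r["region"] == region), key=lambda r: r["month"])
    return latest["population"]

# Misleading: summing one region across periods counts the same people twice.
naive_sum = sum(r["population"] for r in population_snapshot if r["region"] == "North")

print(naive_sum, latest_for_region(population_snapshot, "North"),
      total_across_regions(population_snapshot, 2))
```

The naive sum yields 2500 for the first region, which is exactly the number the slide warns against; the latest snapshot value (1500) is the figure that answers "how many people live there now".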
  • 39. Aggregate Fact table • Fact table contains data pre-aggregated from a lower-grain fact table, typically to speed up queries • Mostly Additive Facts
  • 40. Factless Fact table • Factless fact table contains no measures • Only Keys from Dimension tables
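Even without measures, a factless fact table can be analyzed by counting its rows. The sketch below assumes a hypothetical class-attendance table; the key names and values are invented for illustration.

```python
from collections import Counter

# Hypothetical factless fact rows: (date_key, student_key, class_key).
# Each row only records that an attendance event happened; there is no measure.
attendance_fact = [
    (20240101, 1, 501),
    (20240101, 2, 501),
    (20240102, 1, 501),
    (20240102, 1, 502),
]

# Counting rows per class key answers "how many attendances did each class have?"
attendance_per_class = Counter(class_key for _, _, class_key in attendance_fact)
```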
  • 42. Consolidated Fact table It is often convenient to combine facts from multiple processes together into a single consolidated fact table if they can be expressed at the same grain. For example, sales actuals can be consolidated with sales forecasts in a single fact table to make the task of analyzing actuals versus forecasts simple and fast, as compared to assembling a drill-across application using separate fact tables. Consolidated fact tables add burden to the ETL processing, but ease the analytic burden on the BI applications. They should be considered for cross-process metrics that are frequently analyzed together.
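As a rough sketch of the actuals-versus-forecast case, the two processes can be merged once they share a grain. Here the assumed common grain is (month, product); the data and the `consolidate` helper are hypothetical.

```python
# Hypothetical facts from two processes, both keyed at the (month, product) grain.
sales_actuals = {("2024-01", "P1"): 120.0, ("2024-01", "P2"): 80.0}
sales_forecast = {
    ("2024-01", "P1"): 100.0,
    ("2024-01", "P2"): 90.0,
    ("2024-02", "P1"): 110.0,
}


def consolidate(actuals, forecast):
    """Build one row per grain key holding both measures side by side."""
    rows = {}
    for key in sorted(set(actuals) | set(forecast)):
        actual = actuals.get(key, 0.0)      # assumption: missing actuals count as 0
        planned = forecast.get(key, 0.0)    # assumption: missing forecast counts as 0
        rows[key] = {"actual": actual, "forecast": planned, "variance": actual - planned}
    return rows


consolidated_fact = consolidate(sales_actuals, sales_forecast)
```

The merge work happens once in ETL, so a report can read actual, forecast and variance from a single row instead of drilling across two fact tables.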
  • 43. Type of Fact / Measure • Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table. • Semi-Additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others. • Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
  • 44. Type of Fact / Measure (Cont.) • Additive: The purpose of this table is to record the sales amount for each product in each store on a daily basis. Sales_Amount is the fact. In this case, Sales_Amount is an additive fact, because you can sum up this fact along any of the three dimensions present in the fact table: date, store, and product. For example, the sum of Sales_Amount for all 7 days in a week represents the total sales amount for that week. • Semi-Additive & Non-Additive: The purpose of this table is to record the current balance for each account at the end of each day, as well as the profit margin for each account for each day. Current_Balance and Profit_Margin are the facts. Current_Balance is a semi-additive fact, as it makes sense to add it up across all accounts (what's the total current balance for all accounts in the bank?), but it does not make sense to add it up through time (adding up all current balances for a given account for each day of the month does not give us any useful information). Profit_Margin is a non-additive fact, for it does not make sense to add it up at either the account level or the day level.
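The semi-additive behaviour of Current_Balance can be illustrated with a small daily snapshot. The rows below are hypothetical, and they are assumed to be stored in date order.

```python
# Hypothetical daily balance snapshot: (date, account, current_balance).
# Rows are assumed to be in ascending date order.
snapshot = [
    ("2024-01-01", "A", 100),
    ("2024-01-01", "B", 200),
    ("2024-01-02", "A", 150),
    ("2024-01-02", "B", 250),
]

# Summing across accounts for a single day is meaningful.
total_jan2 = sum(balance for date, _, balance in snapshot if date == "2024-01-02")

# Summing one account across days gives a number with no business meaning.
balances_a = [balance for _, account, balance in snapshot if account == "A"]
misleading_sum_a = sum(balances_a)

# Across time, an average or the closing (latest) balance is used instead.
average_a = sum(balances_a) / len(balances_a)
closing_a = balances_a[-1]
```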
  • 45. Dimensional Models • A denormalized relational model – Made up of tables with attributes – Relationships defined by keys and foreign keys • Organized for understandability and ease of reporting rather than update. • Queried and maintained by SQL or special purpose management tools. • Star Schemas Versus OLAP Cubes – Dimensional models implemented in relational database management systems are referred to as star schemas because of their resemblance to a star-like structure. – Dimensional models implemented in multidimensional database environments are referred to as online analytical processing (OLAP) cubes. – Both stars and cubes have a common logical design with recognizable dimensions; however, the physical implementation differs
  • 46. OLAP • OLAP stands for On-Line Analytical Processing • For people on the business side, the key feature of OLAP is that it is "Multidimensional": the ability to analyze metrics across different dimensions such as time, geography, gender, product, etc. For example, sales for the company are up. - What region is most responsible for this increase? - Which store in this region is most responsible for the increase? - What particular product category contributed the most to the increase? Answering these types of questions in order means that you are performing an OLAP analysis. • In the OLAP world, there are mainly two different types, plus a hybrid of the two: 1. Multidimensional OLAP (MOLAP) 2. Relational OLAP (ROLAP) 3. Hybrid OLAP (HOLAP), which refers to technologies that combine MOLAP and ROLAP.
  • 47. MOLAP • This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats. Advantages: • Excellent performance: MOLAP cubes are built for fast data retrieval, and are optimal for slicing and dicing operations. • Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly. Disadvantages: • Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself. • Requires additional investment: Cube technology is often proprietary and does not already exist in the organization. Therefore, to adopt MOLAP technology, chances are that additional investments in human and capital resources are needed.
  • 48. MOLAP Operation • Since OLAP servers are based on a multidimensional view of data, we will discuss OLAP operations on multidimensional data. • Here is the list of OLAP operations: 1. Roll-up 2. Drill-down 3. Slice and dice 4. Pivot (rotate)
  • 49. MOLAP Operation – Roll Up • Roll-up performs aggregation on a data cube in either of the following ways: – By climbing up a concept hierarchy for a dimension – By dimension reduction • The following diagram illustrates how roll-up works – Roll-up is performed by climbing up a concept hierarchy for the dimension location. – Initially the concept hierarchy was "street < city < province < country". – On rolling up, the data is aggregated by ascending the location hierarchy from the level of city to the level of country. – The data is grouped into countries rather than cities. – When roll-up is performed by dimension reduction, one or more dimensions are removed from the data cube.
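Both forms of roll-up can be sketched over a tiny in-memory cube. The cities, countries and unit counts below are hypothetical sample data, not the figures from the original diagram.

```python
from collections import defaultdict

# Hypothetical location hierarchy (city < country) and cube cells.
city_to_country = {
    "Toronto": "Canada",
    "Vancouver": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}
sales_by_city = [  # (city, quarter, units)
    ("Toronto", "Q1", 605),
    ("Vancouver", "Q1", 825),
    ("Chicago", "Q1", 440),
    ("New York", "Q1", 1560),
    ("Toronto", "Q2", 680),
]


def roll_up_to_country(rows):
    """Climb the location hierarchy: aggregate city cells into country cells."""
    totals = defaultdict(int)
    for city, quarter, units in rows:
        totals[(city_to_country[city], quarter)] += units
    return dict(totals)


def roll_up_drop_time(rows):
    """Dimension reduction: remove the time dimension entirely."""
    totals = defaultdict(int)
    for city, _, units in rows:
        totals[city] += units
    return dict(totals)


by_country = roll_up_to_country(sales_by_city)
by_city_all_time = roll_up_drop_time(sales_by_city)
```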
  • 50. MOLAP Operation – Drill Down • Drill-down is the reverse operation of roll-up. It is performed in either of the following ways: – By stepping down a concept hierarchy for a dimension – By introducing a new dimension. • The following diagram illustrates how drill-down works: – Drill-down is performed by stepping down a concept hierarchy for the dimension time. – Initially the concept hierarchy was "day < month < quarter < year." – On drilling down, the time dimension is descended from the level of quarter to the level of month. – When drill-down is performed by introducing a new dimension, one or more dimensions are added to the data cube. – It navigates the data from less detailed data to highly detailed data.
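Drill-down only works if the more detailed level is actually stored. A minimal sketch, assuming hypothetical monthly figures from which the quarterly view is derived:

```python
# Hypothetical monthly detail and the month < quarter hierarchy.
monthly_sales = {"Jan": 100, "Feb": 120, "Mar": 90, "Apr": 130, "May": 110, "Jun": 95}
month_to_quarter = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

# The quarter-level view the user starts from (a roll-up of the monthly data).
quarterly_sales = {}
for month, units in monthly_sales.items():
    quarter = month_to_quarter[month]
    quarterly_sales[quarter] = quarterly_sales.get(quarter, 0) + units


def drill_down(quarter):
    """Step down the time hierarchy: return the months that make up one quarter."""
    return {m: u for m, u in monthly_sales.items() if month_to_quarter[m] == quarter}


q1_detail = drill_down("Q1")
```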
  • 51. MOLAP Operation – Slice • The slice operation applies a selection on one particular dimension of a given cube and provides a new sub-cube. Consider the following diagram that shows how slice works. – Here slice is performed for the dimension "time" using the criterion time = "Q1". – It forms a new sub-cube that keeps the remaining dimensions for that single value.
  • 52. MOLAP Operation – Dice • Dice applies selections on two or more dimensions of a given cube and provides a new sub-cube. Consider the following diagram that shows the dice operation. • The dice operation on the cube based on the following selection criteria involves three dimensions. – (location = "Toronto" or "Vancouver") – (time = "Q1" or "Q2") – (item = "Mobile" or "Modem")
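Slice and dice can both be sketched as filters over a list of cube cells. The cells and the helpers `slice_cube` and `dice_cube` are hypothetical; the dice criteria reuse the ones listed on the slide.

```python
# Hypothetical cube cells with three dimensions and one measure.
cube = [
    {"location": "Toronto", "time": "Q1", "item": "Mobile", "units": 605},
    {"location": "Toronto", "time": "Q2", "item": "Modem", "units": 680},
    {"location": "Vancouver", "time": "Q1", "item": "Phone", "units": 825},
    {"location": "Chicago", "time": "Q1", "item": "Mobile", "units": 440},
    {"location": "Vancouver", "time": "Q3", "item": "Modem", "units": 300},
]


def slice_cube(cells, dimension, value):
    """Fix one dimension to a single value and drop it from the result."""
    return [
        {k: v for k, v in cell.items() if k != dimension}
        for cell in cells
        if cell[dimension] == value
    ]


def dice_cube(cells, criteria):
    """Keep cells whose value is allowed on every dimension named in criteria."""
    return [
        cell for cell in cells
        if all(cell[dim] in allowed for dim, allowed in criteria.items())
    ]


q1_slice = slice_cube(cube, "time", "Q1")
diced = dice_cube(
    cube,
    {
        "location": {"Toronto", "Vancouver"},
        "time": {"Q1", "Q2"},
        "item": {"Mobile", "Modem"},
    },
)
```

The slice result has one dimension fewer than the cube, while the dice result keeps all three dimensions but covers a smaller range on each.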
  • 53. MOLAP Operation – Pivot • The pivot operation is also known as rotation. It rotates the data axes in view in order to provide an alternative presentation of data. Consider the following diagram that shows the pivot operation.
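For a two-dimensional view, pivoting amounts to swapping the row and column axes. A minimal sketch with hypothetical data, where rows are locations and columns are items:

```python
# Hypothetical 2-D view: outer keys are rows (location), inner keys are columns (item).
view = {
    "Toronto": {"Mobile": 605, "Modem": 680},
    "Vancouver": {"Mobile": 825, "Modem": 300},
}


def pivot(table):
    """Rotate the axes: columns become rows and rows become columns."""
    rotated = {}
    for row_label, columns in table.items():
        for col_label, value in columns.items():
            rotated.setdefault(col_label, {})[row_label] = value
    return rotated


rotated_view = pivot(view)
```

The values are unchanged; only the presentation differs, and pivoting twice returns the original view.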
  • 54. ROLAP • This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL statement. Advantages: • Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount. • Can leverage functionalities inherent in the relational database: A relational database often already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities. Disadvantages: • Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large. • Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are traditionally limited by what SQL can do. ROLAP vendors have mitigated this limitation by building out-of-the-box complex functions into the tool, as well as the ability to let users define their own functions.
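The idea that each slice or dice adds a WHERE condition can be sketched with Python's built-in `sqlite3` module. The `sales_fact` table, its columns and the `total_units` helper are hypothetical; a real ROLAP engine generates far more elaborate SQL.

```python
import sqlite3

# Hypothetical relational fact table held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (location TEXT, quarter TEXT, item TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [
        ("Toronto", "Q1", "Mobile", 605),
        ("Toronto", "Q2", "Modem", 680),
        ("Vancouver", "Q1", "Mobile", 825),
        ("Chicago", "Q1", "Modem", 440),
    ],
)

ALLOWED_COLUMNS = {"location", "quarter", "item"}


def total_units(filters):
    """Translate dimension filters into WHERE conditions and run the query."""
    for column in filters:
        if column not in ALLOWED_COLUMNS:  # column names cannot be bound as parameters
            raise ValueError(f"unknown dimension: {column}")
    sql = "SELECT SUM(units) FROM sales_fact"
    if filters:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in filters)
    return conn.execute(sql, list(filters.values())).fetchone()[0]


all_units = total_units({})
q1_units = total_units({"quarter": "Q1"})                         # slice on time
q1_mobile_units = total_units({"quarter": "Q1", "item": "Mobile"})  # further restriction
```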
  • 55. Types of Relational Dimensional Models • Star Schema • Snowflake Schema • Fact Centipede Schema • Fact Constellation Schema
  • 57. Snowflake Schema Same as Star Schema, but the dimension tables are normalized (split into multiple related tables)
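  • 58. Fact Centipede Schema Every dimension table, including each level of a hierarchy, is connected directly to the fact table, so the fact table ends up with a very large number of foreign keys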
  • 59. Fact Constellation Schema • For each star schema it is possible to construct a fact constellation schema (for example, by splitting the original star schema into several star schemas, each of which describes facts at a different level of the dimension hierarchies). The fact constellation architecture contains multiple fact tables that share many dimension tables. • The main shortcoming of the fact constellation schema is a more complicated design, because many variants for particular kinds of aggregation must be considered and selected. Moreover, dimension tables are still large.
  • 60. HOLAP • HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.
  • 61. Difference between ERD & Dimensional Model • ERD (the transaction processing model): – One table per entity – Minimize data redundancy – Optimize update / insert • Dimensional model (the data warehousing model): – One fact table for data organization – Maximize understandability – Optimized for retrieval
  • 62. Choosing the Data Mart / Dimensional Design Process 1. Select the business process 2. Declare the grain 3. Identify the dimensions 4. Identify the facts
  • 63. Business Process • Business processes are the operational activities performed by your organization, such as taking an order, registering students, etc. • It is important to determine the identity of the transaction table and specify exactly what it represents. • Represents a process or reporting environment that is of value to the organization
  • 64. Grain (unit of analysis) • Atomic grain refers to the lowest level at which data is captured by a given business process • The grain determines what each fact record represents: the level of detail • For example – Individual transactions – Snapshots (points in time) – Line items • Generally better to focus on the smallest grain
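One reason to prefer the smallest grain is that detailed rows can be rolled up in any direction, while a coarser table cannot be broken back down. A minimal sketch with hypothetical order line items:

```python
from collections import defaultdict

# Hypothetical fact rows at line-item grain: (order_id, line_no, product, amount).
line_items = [
    (1, 1, "P1", 10.0),
    (1, 2, "P2", 5.0),
    (2, 1, "P1", 10.0),
]

# Rolling up to order grain is straightforward.
order_totals = defaultdict(float)
for order_id, _, _, amount in line_items:
    order_totals[order_id] += amount

# Rolling up by product is also possible, but only because the line items were kept.
product_totals = defaultdict(float)
for _, _, product, amount in line_items:
    product_totals[product] += amount
```

Had the fact table been declared at order grain, only `order_totals` would be available and the per-product question could not be answered.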
  • 65. Dimensions • A table (or hierarchy of tables) connected with the fact table with keys and foreign keys • Preferably single valued for each fact record (1:m) • Connected with surrogate (generated) keys, not operational keys • Dimension tables contain text or numeric attributes
  • 66. Facts • Normally numeric Keys and additive measures • Measurements associated with fact table records at fact table granularity • Non-key attributes in the fact table Attributes in dimension tables are constants. Facts vary with the granularity of the fact table