SlideShare a Scribd company logo
1 of 28
1
May 6, 2016
Data Mining: Concepts and
Techniques
2
Introduction toIntroduction to
DataData
WarehousingWarehousing
Chapter 3: Data Warehousing and OLAPChapter 3: Data Warehousing and OLAP
Technology: An OverviewTechnology: An Overview
• What is a data warehouse?
• Data warehouse architecture
• From data warehousing to data mining
What is Data Warehouse?What is Data Warehouse?
• Defined in many different ways, but not rigorously.
o A decision support database that is maintained separately from
the organization’s operational database
o Support information processing by providing a solid platform of
consolidated, historical data for analysis.
• “A data warehouse is a subject-oriented, integrated, time-variant,
and nonvolatile collection of data in support of management’s
decision-making process.”—W. H. Inmon
o Data warehousing:The process of constructing and using
data warehouses
Data Warehouse—Subject-OrientedData Warehouse—Subject-Oriented
• Organized around major subjects, such as customer, product,
sales
• Focusing on the modeling and analysis of data for decision
makers, not on daily operations or transaction processing
• Provide a simple and concise view around particular subject issues
by excluding data that are not useful in the decision support
process
Data Warehouse—IntegratedData Warehouse—Integrated
• Constructed by integrating multiple, heterogeneous data sources
o relational databases, flat files, on-line transaction records
• Data cleaning and data integration techniques are applied.
o Ensure consistency in naming conventions, encoding structures,
attribute measures, etc. among different data sources
• E.g., Hotel price: currency, tax, breakfast covered, etc.
o When data is moved to the warehouse, it is converted.
Data Warehouse—Time VariantData Warehouse—Time Variant
• The time horizon for the data warehouse is significantly longer
than that of operational systems
o Operational database: current value data
o Data warehouse data: provide information from a historical
perspective (e.g., past 5-10 years)
• Every key structure in the data warehouse
o Contains an element of time, explicitly or implicitly
o But the key of operational data may or may not contain
“time element”
Data Warehouse—NonvolatileData Warehouse—Nonvolatile
• A physically separate store of data transformed from the
operational environment
• Operational update of data does not occur in the data
warehouse environment
o Does not require transaction processing, recovery, and
concurrency control mechanisms
o Requires only two operations in data accessing:
• initial loading of data and access of data
Data Warehouse vs. Heterogeneous DBMSData Warehouse vs. Heterogeneous DBMS
• Traditional heterogeneous DB integration: A query driven approach
o Build wrappers/mediators on top of heterogeneous databases
o When a query is posed to a client site, a meta-dictionary is used to
translate the query into queries appropriate for individual
heterogeneous sites involved, and the results are integrated into a
global answer set
o Complex information filtering, compete for resources
• Data warehouse: update-driven, high performance
o Information from heterogeneous sources is integrated in advance and
stored in warehouses for direct query and analysis
Data Warehouse vs. Operational DBMSData Warehouse vs. Operational DBMS
• OLTP (on-line transaction processing)
o Major task of traditional relational DBMS
o Day-to-day operations: purchasing, inventory, banking,
manufacturing, payroll, registration, accounting, etc.
• OLAP (on-line analytical processing)
o Major task of data warehouse system
o Data analysis and decision making
• Distinct features (OLTP vs. OLAP):
o User and system orientation: customer vs. market
o Data contents: current, detailed vs. historical, consolidated
o Database design: ER + application vs. star + subject
o View: current, local vs. evolutionary, integrated
OLTP vs. OLAPOLTP vs. OLAP
OLTP OLAP
users clerk, IT professional knowledge worker
function day to day operations decision support
DB design application-oriented subject-oriented
data current, up-to-date
detailed, flat relational
isolated
historical,
summarized, multidimensional
integrated, consolidated
usage repetitive ad-hoc
access read/write
index/hash on prim. key
lots of scans
unit of work short, simple transaction complex query
# records accessed tens millions
#users thousands hundreds
DB size 100MB-GB 100GB-TB
metric transaction throughput query throughput, response
Why Separate Data Warehouse?Why Separate Data Warehouse?
• High performance for both systems
o DBMS— tuned for OLTP: access methods, indexing, concurrency
control, recovery
o Warehouse—tuned for OLAP: complex OLAP queries,
multidimensional view, consolidation
• Different functions and different data:
o missing data: Decision support requires historical data which
operational DBs do not typically maintain
o data consolidation: DS requires consolidation (aggregation,
summarization) of data from heterogeneous sources
o data quality: different sources typically use inconsistent data
representations, codes and formats which have to be reconciled
Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview
• What is a data warehouse?
• Data warehouse architecture
• From data warehousing to data mining
Design of Data Warehouse: A Business Analysis FrameworkDesign of Data Warehouse: A Business Analysis Framework
• Four views regarding the design of a data warehouse
o Top-down view
• allows selection of the relevant information necessary for the
data warehouse
o Data source view
• exposes the information being captured, stored, and
managed by operational systems
o Data warehouse view
• consists of fact tables and dimension tables
o Business query view
• sees the perspectives of data in the warehouse from the view
of end-user
Data Warehouse Design ProcessData Warehouse Design Process
• Top-down, bottom-up approaches or a combination of both
o Top-down: Starts with overall design and planning (mature)
o Bottom-up: Starts with experiments and prototypes (rapid)
• From software engineering point of view
o Waterfall: structured and systematic analysis at each step before
proceeding to the next
o Spiral: rapid generation of increasingly functional systems, short turn
around time, quick turn around
• Typical data warehouse design process
o Choose a business process to model, e.g., orders, invoices, etc.
o Choose the grain (atomic level of data) of the business process
o Choose the dimensions that will apply to each fact table record
Data Warehouse: A Multi-Tiered ArchitectureData Warehouse: A Multi-Tiered Architecture
OLAP Engine
Metadata
Data Sources Front-End ToolsData Storage
Data
Warehouse
Extract
Transform
Load
Refresh
Analysis
Query
Reports
Data mining
Monitor
&
Integrator
Serve
Data Marts
Operational
DBs
Other
sources
OLAP Server
Three Data Warehouse ModelsThree Data Warehouse Models
• Enterprise warehouse
o collects all of the information about subjects spanning the entire
organization
• Data Mart
o a subset of corporate-wide data that is of value to a specific groups
of users. Its scope is confined to specific, selected groups, such as
marketing data mart
• Independent vs. dependent (directly from warehouse) data mart
• Virtual warehouse
o A set of views over operational databases
o Only some of the possible summary views may be materialized
Data Warehouse Development: A Recommended ApproachData Warehouse Development: A Recommended Approach
Multi-Tier Data
Warehouse
Enterprise
Data
Warehouse
Define a high-level corporate data model
Data
Mart
Data
Mart
Distributed
Data Marts
Model refinementModel refinement
Data Warehouse Back-End Tools and UtilitiesData Warehouse Back-End Tools and Utilities
• Data extraction
o get data from multiple, heterogeneous, and external sources
• Data cleaning
o detect errors in the data and rectify them when possible
• Data transformation
o convert data from legacy or host format to warehouse
format
• Load
o sort, summarize, consolidate, compute views, check integrity,
and build indicies and partitions
• Refresh
o propagate the updates from the data sources to the
warehouse
Metadata RepositoryMetadata Repository
• Meta data is the data defining warehouse objects. It stores:
• Description of the structure of the data warehouse
o schema, view, dimensions, hierarchies, derived data defn, data mart
locations and contents
• Operational meta-data
o data lineage (history of migrated data and transformation path), currency of
data (active, archived, or purged), monitoring information (warehouse
usage statistics, error reports, audit trails)
• The algorithms used for summarization
• The mapping from operational environment to the data warehouse
• Data related to system performance
o warehouse schema, view and derived data definitions
• Business data
o business terms and definitions, ownership of data, charging policies
OLAP Server ArchitecturesOLAP Server Architectures
• Relational OLAP (ROLAP)
o Use relational or extended-relational DBMS to store and manage
warehouse data and OLAP middle ware
o Include optimization of DBMS backend, implementation of aggregation
navigation logic, and additional tools and services
o Greater scalability
• Multidimensional OLAP (MOLAP)
o Sparse array-based multidimensional storage engine
o Fast indexing to pre-computed summarized data
• Hybrid OLAP (HOLAP) (e.g., Microsoft SQLServer)
o Flexibility, e.g., low level: relational, high-level: array
• Specialized SQL servers (e.g., Redbricks)
o Specialized support for SQL queries over star/snowflake schemas
Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview
• What is a data warehouse?
• Data warehouse architecture
• From data warehousing to data mining
Data Warehouse UsageData Warehouse Usage
• Three kinds of data warehouse applications
o Information processing
• supports querying, basic statistical analysis, and reporting using
crosstabs, tables, charts and graphs
o Analytical processing
• multidimensional analysis of data warehouse data
• supports basic OLAP operations, slice-dice, drilling, pivoting
o Data mining
• knowledge discovery from hidden patterns
• supports associations, constructing analytical models,
performing classification and prediction, and presenting the
mining results using visualization tools
From On-Line Analytical Processing (OLAP) to On Line Analytical Mining (OLAM)From On-Line Analytical Processing (OLAP) to On Line Analytical Mining (OLAM)
• Why online analytical mining?
o High quality of data in data warehouses
• DW contains integrated, consistent, cleaned data
o Available information processing structure surrounding data
warehouses
• ODBC, OLEDB, Web accessing, service facilities, reporting and
OLAP tools
o OLAP-based exploratory data analysis
• Mining with drilling, dicing, pivoting, etc.
o On-line selection of data mining functions
• Integration and swapping of multiple mining functions,
algorithms, and tasks
An OLAM System ArchitectureAn OLAM System Architecture
User GUI API
Data
Warehouse
Meta
Data
MDDB
OLAM
Engine
OLAP
Engine
Data Cube API
Database API
Data cleaning
Data integration
Layer3
OLAP/OLAM
Layer2
MDDB
Layer1
Data
Repository
Layer4
User Interface
Filtering&Integration Filtering
Databases
Mining query Mining result
Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview
• What is a data warehouse?
• A multi-dimensional data model
• Data warehouse architecture
• Data warehouse implementation
• From data warehousing to data mining
• Summary
Summary: Data Warehouse and OLAP TechnologySummary: Data Warehouse and OLAP Technology
• Why data warehousing?
• Data warehouse architecture
• From OLAP to OLAM (on-line analytical mining)
ThankThank You !!!You !!!
For More Information click below link:
Follow Us on:
http://vibranttechnologies.co.in/datawarehousing-classes-in-mumbai.html

More Related Content

What's hot

Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data WarehousingEyad Manna
 
Introduction Data warehouse
Introduction Data warehouseIntroduction Data warehouse
Introduction Data warehouseAmin Choroomi
 
Dwdm 2(data warehouse)
Dwdm 2(data warehouse)Dwdm 2(data warehouse)
Dwdm 2(data warehouse)Er Bansal
 
Introduction to data warehousing
Introduction to data warehousing   Introduction to data warehousing
Introduction to data warehousing Girish Dhareshwar
 
Data warehouseconceptsandarchitecture
Data warehouseconceptsandarchitectureData warehouseconceptsandarchitecture
Data warehouseconceptsandarchitecturesamaksh1982
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data miningmaxonlinetr
 
Seminar datawarehousing
Seminar datawarehousingSeminar datawarehousing
Seminar datawarehousingKavisha Uniyal
 
Data Warehousing Overview
Data Warehousing OverviewData Warehousing Overview
Data Warehousing OverviewAhmed Gamal
 
Introduction To Data Warehousing
Introduction To Data WarehousingIntroduction To Data Warehousing
Introduction To Data WarehousingAlex Meadows
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaRadhika Kotecha
 

What's hot (20)

Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing and Data mining
Data warehousing and Data mining Data warehousing and Data mining
Data warehousing and Data mining
 
Introduction Data warehouse
Introduction Data warehouseIntroduction Data warehouse
Introduction Data warehouse
 
data warehousing
data warehousingdata warehousing
data warehousing
 
Dwdm 2(data warehouse)
Dwdm 2(data warehouse)Dwdm 2(data warehouse)
Dwdm 2(data warehouse)
 
Introduction to data warehousing
Introduction to data warehousing   Introduction to data warehousing
Introduction to data warehousing
 
Data warehouseconceptsandarchitecture
Data warehouseconceptsandarchitectureData warehouseconceptsandarchitecture
Data warehouseconceptsandarchitecture
 
Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
 
Difference between data warehouse and data mining
Difference between data warehouse and data miningDifference between data warehouse and data mining
Difference between data warehouse and data mining
 
Data ware house
Data ware houseData ware house
Data ware house
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Seminar datawarehousing
Seminar datawarehousingSeminar datawarehousing
Seminar datawarehousing
 
Data warehousing ppt
Data warehousing pptData warehousing ppt
Data warehousing ppt
 
Data Warehousing Overview
Data Warehousing OverviewData Warehousing Overview
Data Warehousing Overview
 
Introduction To Data Warehousing
Introduction To Data WarehousingIntroduction To Data Warehousing
Introduction To Data Warehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika Kotecha
 
Open Source Datawarehouse
Open Source DatawarehouseOpen Source Datawarehouse
Open Source Datawarehouse
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 

Viewers also liked

Dspsainstahun3 120925225125-phpapp02
Dspsainstahun3 120925225125-phpapp02Dspsainstahun3 120925225125-phpapp02
Dspsainstahun3 120925225125-phpapp02Maimon Abdullah
 
WannaBiz Fund
WannaBiz FundWannaBiz Fund
WannaBiz FundClickky
 
Concepções de linguagem e ensino de português
Concepções de linguagem e ensino de portuguêsConcepções de linguagem e ensino de português
Concepções de linguagem e ensino de portuguêscelia.tanajura
 
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.WebCamp
 
Parabola demystified
Parabola demystifiedParabola demystified
Parabola demystifiedAshish Khare
 
7 Major Venue Types That Unique Venues Represents
7 Major Venue Types That Unique Venues Represents7 Major Venue Types That Unique Venues Represents
7 Major Venue Types That Unique Venues RepresentsUnique Venues
 
Ronald´S diary
Ronald´S diaryRonald´S diary
Ronald´S diaryronnian
 
Kata mutiara
Kata mutiaraKata mutiara
Kata mutiarawakman78
 
Tips For Attracting and Engaging Millennials At Your Event
Tips For Attracting and Engaging Millennials At Your EventTips For Attracting and Engaging Millennials At Your Event
Tips For Attracting and Engaging Millennials At Your EventUnique Venues
 
Choosing a Theme for Your Corporate Event
Choosing a Theme for Your Corporate EventChoosing a Theme for Your Corporate Event
Choosing a Theme for Your Corporate EventUnique Venues
 
Web design principles
Web design principlesWeb design principles
Web design principlesBen Peak
 
EL REINO DE IRÁS Y NO VOLVERÁS
EL REINO DE IRÁS Y NO VOLVERÁSEL REINO DE IRÁS Y NO VOLVERÁS
EL REINO DE IRÁS Y NO VOLVERÁSjjvallescalvo
 

Viewers also liked (18)

Dspsainstahun3 120925225125-phpapp02
Dspsainstahun3 120925225125-phpapp02Dspsainstahun3 120925225125-phpapp02
Dspsainstahun3 120925225125-phpapp02
 
Entertainment
EntertainmentEntertainment
Entertainment
 
Rec05 primero. módulo de paso
Rec05 primero. módulo de pasoRec05 primero. módulo de paso
Rec05 primero. módulo de paso
 
Módulo 2.1 proceso de enseñanza aprendizaje
Módulo 2.1 proceso de enseñanza aprendizajeMódulo 2.1 proceso de enseñanza aprendizaje
Módulo 2.1 proceso de enseñanza aprendizaje
 
WannaBiz Fund
WannaBiz FundWannaBiz Fund
WannaBiz Fund
 
Concepções de linguagem e ensino de português
Concepções de linguagem e ensino de portuguêsConcepções de linguagem e ensino de português
Concepções de linguagem e ensino de português
 
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.
WebCamp 2016: BizDev. Марина Никитчук : Искусство продажи мечты, а не сервиса.
 
Parabola demystified
Parabola demystifiedParabola demystified
Parabola demystified
 
Screens
ScreensScreens
Screens
 
7 Major Venue Types That Unique Venues Represents
7 Major Venue Types That Unique Venues Represents7 Major Venue Types That Unique Venues Represents
7 Major Venue Types That Unique Venues Represents
 
Ronald´S diary
Ronald´S diaryRonald´S diary
Ronald´S diary
 
Kata mutiara
Kata mutiaraKata mutiara
Kata mutiara
 
Tips For Attracting and Engaging Millennials At Your Event
Tips For Attracting and Engaging Millennials At Your EventTips For Attracting and Engaging Millennials At Your Event
Tips For Attracting and Engaging Millennials At Your Event
 
Choosing a Theme for Your Corporate Event
Choosing a Theme for Your Corporate EventChoosing a Theme for Your Corporate Event
Choosing a Theme for Your Corporate Event
 
Web design principles
Web design principlesWeb design principles
Web design principles
 
EL REINO DE IRÁS Y NO VOLVERÁS
EL REINO DE IRÁS Y NO VOLVERÁSEL REINO DE IRÁS Y NO VOLVERÁS
EL REINO DE IRÁS Y NO VOLVERÁS
 
Alexandr iii
Alexandr iiiAlexandr iii
Alexandr iii
 
My gmail account
My gmail accountMy gmail account
My gmail account
 

Similar to Data ware housing - Introduction to data ware housing process.

data warehousing
data warehousingdata warehousing
data warehousing143sohil
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxnikshaikh786
 
Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introductionMurli Jha
 
Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its conceptsGaurav Garg
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousingEr. Nawaraj Bhandari
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processingSamraiz Tejani
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptRafiulHasan19
 
Data warehousing.pptx
Data warehousing.pptxData warehousing.pptx
Data warehousing.pptxAnusuya123
 
Ch1 data-warehousing
Ch1 data-warehousingCh1 data-warehousing
Ch1 data-warehousingAhmad Shlool
 
Ch1 data-warehousing
Ch1 data-warehousingCh1 data-warehousing
Ch1 data-warehousingAhmad Shlool
 
DWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxDWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxSalehaMariyam
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data martAmit Sarkar
 

Similar to Data ware housing - Introduction to data ware housing process. (20)

data warehousing
data warehousingdata warehousing
data warehousing
 
Module 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptxModule 1_Data Warehousing Fundamentals.pptx
Module 1_Data Warehousing Fundamentals.pptx
 
Data warehouse introduction
Data warehouse introductionData warehouse introduction
Data warehouse introduction
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
Data warehouse system and its concepts
Data warehouse system and its conceptsData warehouse system and its concepts
Data warehouse system and its concepts
 
Introduction to data mining and data warehousing
Introduction to data mining and data warehousingIntroduction to data mining and data warehousing
Introduction to data mining and data warehousing
 
DATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptxDATA WAREHOUSING.2.pptx
DATA WAREHOUSING.2.pptx
 
Lecture1
Lecture1Lecture1
Lecture1
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Online analytical processing
Online analytical processingOnline analytical processing
Online analytical processing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
DW (1).ppt
DW (1).pptDW (1).ppt
DW (1).ppt
 
data mining and data warehousing
data mining and data warehousingdata mining and data warehousing
data mining and data warehousing
 
Various Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.pptVarious Applications of Data Warehouse.ppt
Various Applications of Data Warehouse.ppt
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehousing.pptx
Data warehousing.pptxData warehousing.pptx
Data warehousing.pptx
 
Ch1 data-warehousing
Ch1 data-warehousingCh1 data-warehousing
Ch1 data-warehousing
 
Ch1 data-warehousing
Ch1 data-warehousingCh1 data-warehousing
Ch1 data-warehousing
 
DWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptxDWDM Unit 1 (1).pptx
DWDM Unit 1 (1).pptx
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data mart
 

More from Vibrant Technologies & Computers

More from Vibrant Technologies & Computers (20)

Buisness analyst business analysis overview ppt 5
Buisness analyst business analysis overview ppt 5Buisness analyst business analysis overview ppt 5
Buisness analyst business analysis overview ppt 5
 
SQL Introduction to displaying data from multiple tables
SQL Introduction to displaying data from multiple tables  SQL Introduction to displaying data from multiple tables
SQL Introduction to displaying data from multiple tables
 
SQL- Introduction to MySQL
SQL- Introduction to MySQLSQL- Introduction to MySQL
SQL- Introduction to MySQL
 
SQL- Introduction to SQL database
SQL- Introduction to SQL database SQL- Introduction to SQL database
SQL- Introduction to SQL database
 
ITIL - introduction to ITIL
ITIL - introduction to ITILITIL - introduction to ITIL
ITIL - introduction to ITIL
 
Salesforce - Introduction to Security & Access
Salesforce -  Introduction to Security & Access Salesforce -  Introduction to Security & Access
Salesforce - Introduction to Security & Access
 
Data ware housing- Introduction to olap .
Data ware housing- Introduction to  olap .Data ware housing- Introduction to  olap .
Data ware housing- Introduction to olap .
 
Salesforce - classification of cloud computing
Salesforce - classification of cloud computingSalesforce - classification of cloud computing
Salesforce - classification of cloud computing
 
Salesforce - cloud computing fundamental
Salesforce - cloud computing fundamentalSalesforce - cloud computing fundamental
Salesforce - cloud computing fundamental
 
SQL- Introduction to PL/SQL
SQL- Introduction to  PL/SQLSQL- Introduction to  PL/SQL
SQL- Introduction to PL/SQL
 
SQL- Introduction to advanced sql concepts
SQL- Introduction to  advanced sql conceptsSQL- Introduction to  advanced sql concepts
SQL- Introduction to advanced sql concepts
 
SQL Inteoduction to SQL manipulating of data
SQL Inteoduction to SQL manipulating of data   SQL Inteoduction to SQL manipulating of data
SQL Inteoduction to SQL manipulating of data
 
SQL- Introduction to SQL Set Operations
SQL- Introduction to SQL Set OperationsSQL- Introduction to SQL Set Operations
SQL- Introduction to SQL Set Operations
 
Sas - Introduction to designing the data mart
Sas - Introduction to designing the data martSas - Introduction to designing the data mart
Sas - Introduction to designing the data mart
 
Sas - Introduction to working under change management
Sas - Introduction to working under change managementSas - Introduction to working under change management
Sas - Introduction to working under change management
 
SAS - overview of SAS
SAS - overview of SASSAS - overview of SAS
SAS - overview of SAS
 
Teradata - Architecture of Teradata
Teradata - Architecture of TeradataTeradata - Architecture of Teradata
Teradata - Architecture of Teradata
 
Teradata - Restoring Data
Teradata - Restoring Data Teradata - Restoring Data
Teradata - Restoring Data
 
Datastage database design and data modeling ppt 4
Datastage database design and data modeling ppt 4Datastage database design and data modeling ppt 4
Datastage database design and data modeling ppt 4
 
Sql server select queries ppt 18
Sql server select queries ppt 18Sql server select queries ppt 18
Sql server select queries ppt 18
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 

Data ware housing - Introduction to data ware housing process.

  • 1. 1
  • 2. May 6, 2016 Data Mining: Concepts and Techniques 2 Introduction toIntroduction to DataData WarehousingWarehousing
  • 3. Chapter 3: Data Warehousing and OLAPChapter 3: Data Warehousing and OLAP Technology: An OverviewTechnology: An Overview • What is a data warehouse? • Data warehouse architecture • From data warehousing to data mining
  • 4. What is Data Warehouse?What is Data Warehouse? • Defined in many different ways, but not rigorously. o A decision support database that is maintained separately from the organization’s operational database o Support information processing by providing a solid platform of consolidated, historical data for analysis. • “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.”—W. H. Inmon o Data warehousing:The process of constructing and using data warehouses
  • 5. Data Warehouse—Subject-OrientedData Warehouse—Subject-Oriented • Organized around major subjects, such as customer, product, sales • Focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing • Provide a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process
  • 6. Data Warehouse—IntegratedData Warehouse—Integrated • Constructed by integrating multiple, heterogeneous data sources o relational databases, flat files, on-line transaction records • Data cleaning and data integration techniques are applied. o Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources • E.g., Hotel price: currency, tax, breakfast covered, etc. o When data is moved to the warehouse, it is converted.
  • 7. Data Warehouse—Time VariantData Warehouse—Time Variant • The time horizon for the data warehouse is significantly longer than that of operational systems o Operational database: current value data o Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years) • Every key structure in the data warehouse o Contains an element of time, explicitly or implicitly o But the key of operational data may or may not contain “time element”
  • 8. Data Warehouse—NonvolatileData Warehouse—Nonvolatile • A physically separate store of data transformed from the operational environment • Operational update of data does not occur in the data warehouse environment o Does not require transaction processing, recovery, and concurrency control mechanisms o Requires only two operations in data accessing: • initial loading of data and access of data
  • 9. Data Warehouse vs. Heterogeneous DBMSData Warehouse vs. Heterogeneous DBMS • Traditional heterogeneous DB integration: A query driven approach o Build wrappers/mediators on top of heterogeneous databases o When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for individual heterogeneous sites involved, and the results are integrated into a global answer set o Complex information filtering, compete for resources • Data warehouse: update-driven, high performance o Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis
  • 10. Data Warehouse vs. Operational DBMSData Warehouse vs. Operational DBMS • OLTP (on-line transaction processing) o Major task of traditional relational DBMS o Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc. • OLAP (on-line analytical processing) o Major task of data warehouse system o Data analysis and decision making • Distinct features (OLTP vs. OLAP): o User and system orientation: customer vs. market o Data contents: current, detailed vs. historical, consolidated o Database design: ER + application vs. star + subject o View: current, local vs. evolutionary, integrated
  • 11. OLTP vs. OLAPOLTP vs. OLAP OLTP OLAP users clerk, IT professional knowledge worker function day to day operations decision support DB design application-oriented subject-oriented data current, up-to-date detailed, flat relational isolated historical, summarized, multidimensional integrated, consolidated usage repetitive ad-hoc access read/write index/hash on prim. key lots of scans unit of work short, simple transaction complex query # records accessed tens millions #users thousands hundreds DB size 100MB-GB 100GB-TB metric transaction throughput query throughput, response
  • 12. Why Separate Data Warehouse?Why Separate Data Warehouse? • High performance for both systems o DBMS— tuned for OLTP: access methods, indexing, concurrency control, recovery o Warehouse—tuned for OLAP: complex OLAP queries, multidimensional view, consolidation • Different functions and different data: o missing data: Decision support requires historical data which operational DBs do not typically maintain o data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources o data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled
  • 13. Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview • What is a data warehouse? • Data warehouse architecture • From data warehousing to data mining
  • 14. Design of Data Warehouse: A Business Analysis FrameworkDesign of Data Warehouse: A Business Analysis Framework • Four views regarding the design of a data warehouse o Top-down view • allows selection of the relevant information necessary for the data warehouse o Data source view • exposes the information being captured, stored, and managed by operational systems o Data warehouse view • consists of fact tables and dimension tables o Business query view • sees the perspectives of data in the warehouse from the view of end-user
  • 15. Data Warehouse Design ProcessData Warehouse Design Process • Top-down, bottom-up approaches or a combination of both o Top-down: Starts with overall design and planning (mature) o Bottom-up: Starts with experiments and prototypes (rapid) • From software engineering point of view o Waterfall: structured and systematic analysis at each step before proceeding to the next o Spiral: rapid generation of increasingly functional systems, short turn around time, quick turn around • Typical data warehouse design process o Choose a business process to model, e.g., orders, invoices, etc. o Choose the grain (atomic level of data) of the business process o Choose the dimensions that will apply to each fact table record
  • 16. Data Warehouse: A Multi-Tiered ArchitectureData Warehouse: A Multi-Tiered Architecture OLAP Engine Metadata Data Sources Front-End ToolsData Storage Data Warehouse Extract Transform Load Refresh Analysis Query Reports Data mining Monitor & Integrator Serve Data Marts Operational DBs Other sources OLAP Server
  • 17. Three Data Warehouse ModelsThree Data Warehouse Models • Enterprise warehouse o collects all of the information about subjects spanning the entire organization • Data Mart o a subset of corporate-wide data that is of value to a specific groups of users. Its scope is confined to specific, selected groups, such as marketing data mart • Independent vs. dependent (directly from warehouse) data mart • Virtual warehouse o A set of views over operational databases o Only some of the possible summary views may be materialized
  • 18. Data Warehouse Development: A Recommended ApproachData Warehouse Development: A Recommended Approach Multi-Tier Data Warehouse Enterprise Data Warehouse Define a high-level corporate data model Data Mart Data Mart Distributed Data Marts Model refinementModel refinement
  • 19. Data Warehouse Back-End Tools and UtilitiesData Warehouse Back-End Tools and Utilities • Data extraction o get data from multiple, heterogeneous, and external sources • Data cleaning o detect errors in the data and rectify them when possible • Data transformation o convert data from legacy or host format to warehouse format • Load o sort, summarize, consolidate, compute views, check integrity, and build indicies and partitions • Refresh o propagate the updates from the data sources to the warehouse
  • 20. Metadata RepositoryMetadata Repository • Meta data is the data defining warehouse objects. It stores: • Description of the structure of the data warehouse o schema, view, dimensions, hierarchies, derived data defn, data mart locations and contents • Operational meta-data o data lineage (history of migrated data and transformation path), currency of data (active, archived, or purged), monitoring information (warehouse usage statistics, error reports, audit trails) • The algorithms used for summarization • The mapping from operational environment to the data warehouse • Data related to system performance o warehouse schema, view and derived data definitions • Business data o business terms and definitions, ownership of data, charging policies
  • 21. OLAP Server ArchitecturesOLAP Server Architectures • Relational OLAP (ROLAP) o Use relational or extended-relational DBMS to store and manage warehouse data and OLAP middle ware o Include optimization of DBMS backend, implementation of aggregation navigation logic, and additional tools and services o Greater scalability • Multidimensional OLAP (MOLAP) o Sparse array-based multidimensional storage engine o Fast indexing to pre-computed summarized data • Hybrid OLAP (HOLAP) (e.g., Microsoft SQLServer) o Flexibility, e.g., low level: relational, high-level: array • Specialized SQL servers (e.g., Redbricks) o Specialized support for SQL queries over star/snowflake schemas
  • 22. Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview • What is a data warehouse? • Data warehouse architecture • From data warehousing to data mining
  • 23. Data Warehouse UsageData Warehouse Usage • Three kinds of data warehouse applications o Information processing • supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs o Analytical processing • multidimensional analysis of data warehouse data • supports basic OLAP operations, slice-dice, drilling, pivoting o Data mining • knowledge discovery from hidden patterns • supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools
  • 24. From On-Line Analytical Processing (OLAP) to On Line Analytical Mining (OLAM)From On-Line Analytical Processing (OLAP) to On Line Analytical Mining (OLAM) • Why online analytical mining? o High quality of data in data warehouses • DW contains integrated, consistent, cleaned data o Available information processing structure surrounding data warehouses • ODBC, OLEDB, Web accessing, service facilities, reporting and OLAP tools o OLAP-based exploratory data analysis • Mining with drilling, dicing, pivoting, etc. o On-line selection of data mining functions • Integration and swapping of multiple mining functions, algorithms, and tasks
  • 25. An OLAM System ArchitectureAn OLAM System Architecture User GUI API Data Warehouse Meta Data MDDB OLAM Engine OLAP Engine Data Cube API Database API Data cleaning Data integration Layer3 OLAP/OLAM Layer2 MDDB Layer1 Data Repository Layer4 User Interface Filtering&Integration Filtering Databases Mining query Mining result
  • 26. Chapter 3: Data Warehousing and OLAP Technology: An OverviewChapter 3: Data Warehousing and OLAP Technology: An Overview • What is a data warehouse? • A multi-dimensional data model • Data warehouse architecture • Data warehouse implementation • From data warehousing to data mining • Summary
  • 27. Summary: Data Warehouse and OLAP TechnologySummary: Data Warehouse and OLAP Technology • Why data warehousing? • Data warehouse architecture • From OLAP to OLAM (on-line analytical mining)
  • 28. ThankThank You !!!You !!! For More Information click below link: Follow Us on: http://vibranttechnologies.co.in/datawarehousing-classes-in-mumbai.html