2. LIVE Online Class
Class Recording in LMS
24/7 Post Class Support
Module Wise Quiz
Project Work
Verifiable Certificate
Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
www.edureka.co/datawarehousing
How it Works?
3. Slide3
For Queries during the session and class recording:
Post on Twitter @edurekaIN: #askEdureka
Post on Facebook /edurekaIN
Objectives of this Session
What is a Data Warehouse?
Data Warehouse Architecture
Why is a Data Warehouse used?
What is ETL?
What will you learn in the Datawarehousing and ETL course?
Hands On
4. Slide4
What is a Data Warehouse?
A Data Warehouse is a central location where consolidated data from multiple locations is stored
End users access it whenever they need information
A Data Warehouse is not loaded every time new data is generated
The business determines the schedule on which a Data Warehouse is loaded: daily, monthly, once a quarter, etc.
[Diagram: Source 1, Source 2, ..., Source n load into the Data Warehouse, which User 1, User 2, ..., User n access]
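The idea of consolidating several sources into one central store on a schedule can be sketched as follows. This is a minimal illustration, not the course's implementation: it uses Python's built-in sqlite3 module as a stand-in database, and the table name warehouse_events, the helper load_batch, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical extracts; in practice each would come from a separate system.
source_1 = [("2024-01-05", "policy_sale", 1200.0)]
source_2 = [("2024-01-06", "policy_sale", 800.0), ("2024-01-07", "claim", 300.0)]


def load_batch(conn, source_name, rows):
    """Append one scheduled batch from a single source to the central table."""
    conn.executemany(
        "INSERT INTO warehouse_events (source, event_date, event_type, amount) "
        "VALUES (?, ?, ?, ?)",
        [(source_name, *row) for row in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_events ("
    "source TEXT, event_date TEXT, event_type TEXT, amount REAL)"
)

# One scheduled run (for example, the daily load) picks up every source,
# rather than loading each record the moment it is generated.
for name, rows in [("source_1", source_1), ("source_2", source_2)]:
    load_batch(conn, name, rows)

total_rows = conn.execute("SELECT COUNT(*) FROM warehouse_events").fetchone()[0]
print(total_rows)
```

Keeping the source name on every row is one simple way to trace consolidated data back to where it came from.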
5. Slide5
Why do we need a Data Warehouse?
The primary reason for a Data Warehouse is to give a company an extra edge over its competitors
This extra edge can be gained by making smarter decisions
Smarter decisions can be made only if the executives responsible for them have data at their disposal
For example, consider some strategic questions that a manager or an executive has to answer to gain an edge over the company's competitors
Q: How do we increase the market share of this company by 5%?
Q: Which product is not doing well in the market?
Q: Which agent needs help with selling policies?
Q: What is the quality of the customer service provided and what improvements are needed?
Answering these questions may not be necessary to run a business day to day, but it is necessary for the business to survive and grow.
Strategic Questions
6. Slide6
Let's consider one of the strategic questions that a manager or an executive is trying to answer
What is the quality of the customer service provided and what improvements are needed?
How many customer feedback responses have we received in the last 6 months?
How many customers rated us Excellent, how many Average, and how many Bad?
What comments or improvement areas did customers who rated us Bad or Average highlight?
[Diagram: Question 1, Question 2 and Question 3 each query a subset of the Database, producing Result 1, Result 2 and Result 3]
Why is a Data Warehouse so Important?
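The three sub-questions about customer feedback map naturally onto three queries. The sketch below is illustrative only: the feedback table, its columns, the sample rows, and the fixed as_of date are assumptions, and sqlite3 stands in for whatever database actually holds the data. The second and third questions are restricted to the same 6-month window here, which is a choice made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback ("
    "customer_id INTEGER, rating TEXT, comment TEXT, feedback_date TEXT)"
)
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?)",
    [
        (1, "Excellent", "Quick claim settlement", "2024-05-10"),
        (2, "Average", "Long wait on the phone", "2024-04-02"),
        (3, "Bad", "Agent never called back", "2024-03-15"),
        (4, "Excellent", "", "2023-09-01"),  # older than the 6-month window
    ],
)

as_of = "2024-06-30"  # hypothetical reporting date, fixed for repeatability
recent = "feedback_date >= date(?, '-6 months')"

# Question 1: how many feedback responses in the last 6 months?
q1 = conn.execute(
    f"SELECT COUNT(*) FROM feedback WHERE {recent}", (as_of,)
).fetchone()[0]

# Question 2: how many Excellent, Average and Bad ratings in that window?
q2 = dict(
    conn.execute(
        f"SELECT rating, COUNT(*) FROM feedback WHERE {recent} GROUP BY rating",
        (as_of,),
    ).fetchall()
)

# Question 3: which comments came from customers who rated Bad or Average?
q3 = [
    row[0]
    for row in conn.execute(
        f"SELECT comment FROM feedback WHERE {recent} "
        "AND rating IN ('Bad', 'Average') ORDER BY feedback_date",
        (as_of,),
    )
]

print(q1, q2, q3)
```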
7. Slide7
Strategic questions can be answered by studying trends.
Question: What is the quality of the customer service provided and what improvements are needed?
Operational System (OLTP): doesn't provide trends
Data Warehouse: provides trends
Result provided is in a ready-to-access format
[Diagram: the question is posed to the Operational System (OLTP) and to the Data Warehouse; results are shown as Result 1, Result 2, Result 3]
Why is a Data Warehouse so Important?
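A trend is, in practice, an aggregate over stored history. The sketch below shows one way a month-by-month trend could be computed once historical feedback has been kept. The feedback_history table, the numeric score column (1 to 5), and the sample rows are assumptions made for illustration; sqlite3 is used only as a convenient stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback_history (feedback_date TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO feedback_history VALUES (?, ?)",
    [
        ("2024-01-10", 3),
        ("2024-01-20", 4),
        ("2024-02-05", 4),
        ("2024-02-18", 5),
        ("2024-03-02", 2),
    ],
)

# Average score per calendar month, oldest month first.
trend = conn.execute(
    "SELECT strftime('%Y-%m', feedback_date) AS month, AVG(score) "
    "FROM feedback_history GROUP BY month ORDER BY month"
).fetchall()

for month, avg_score in trend:
    print(month, avg_score)
```

The query works only because every past record is still available; a system that keeps just the current state has nothing to group by month.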
8. Slide8
What is ETL?
[Diagram: Source 1, Source 2 -> ETL -> Data Warehouse]
What and from where to Extract?
How to Transform?
Where to Load?
Tools available
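The three ETL questions (what to extract, how to transform, where to load) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not a replacement for an ETL tool: the CSV content, the agent_sales table, and the functions extract, transform and load are hypothetical, and sqlite3 stands in for the target warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source file content with inconsistent formatting.
RAW_CSV = """agent_id,agent_name,policies_sold
101, asha rao ,12
102,VIKRAM SHAH,7
103,Meera Nair,not available
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and normalise names, convert types, skip bad rows."""
    clean = []
    for row in rows:
        sold = row["policies_sold"].strip()
        if not sold.isdigit():
            continue  # reject rows whose count is not a whole number
        name = row["agent_name"].strip().title()
        clean.append((int(row["agent_id"]), name, int(sold)))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_sales ("
        "agent_id INTEGER PRIMARY KEY, agent_name TEXT, policies_sold INTEGER)"
    )
    conn.executemany("INSERT INTO agent_sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

loaded = conn.execute(
    "SELECT agent_id, agent_name, policies_sold FROM agent_sales ORDER BY agent_id"
).fetchall()
print(loaded)
```

Keeping the three stages as separate functions mirrors the three questions on the slide and makes each stage easy to test on its own.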
9. Slide9
Datawarehouse Architecture
[Diagram: Sources (File 1, Other Sources, Transactional Sources / OLTP) -> ETL -> Data Warehouse -> DM 1, DM 2, DM 3 (Data Access Layer) -> Reporting tools (Data Presentation Layer) -> User generates reports]
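The layering in the architecture can be illustrated with a small sketch. It assumes that "DM" stands for a data mart, that is, a subject-specific slice of the warehouse, and models one as a database view; the slide does not confirm either point. The table warehouse_sales, the view dm_sales_by_region, and the sample rows are hypothetical, and sqlite3 is only a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Warehouse layer: consolidated detail rows (already loaded by ETL).
conn.execute("CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("North", "Term Life", 500.0),
        ("North", "Health", 300.0),
        ("South", "Term Life", 200.0),
    ],
)

# Data access layer: a data mart modelled here as a view over the warehouse.
conn.execute(
    "CREATE VIEW dm_sales_by_region AS "
    "SELECT region, SUM(amount) AS total_amount "
    "FROM warehouse_sales GROUP BY region"
)

# Presentation layer: a reporting tool queries the mart, not the raw sources.
report = conn.execute(
    "SELECT region, total_amount FROM dm_sales_by_region ORDER BY region"
).fetchall()
print(report)
```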
10. Slide10
Advantages of a Data Warehouse
Standardizes data across an organization
Smarter decisions for companies: move towards fact-based decisions
REDUCE COSTS
» Drop products not doing well
» Negotiate for improvement with suppliers
INCREASE REVENUE
» Work on the high-selling products
» Customer satisfaction: know what is working and what is not
11. Slide11
Creating and Populating the Tables
Problem Statement
» From the data files provided, create and populate the tables based on the requirements
» Use PostgreSQL for creating tables and Talend Open Studio for loading tables
12. Slide12
Requirement in English statements
Identify entities and the relations between them
Attributes and facts
Develop a model
Create the tables using PostgreSQL
Populate the tables using ETL
Test the Jobs
Creating and Populating the Tables (Flow Diagram)
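The flow above (identify entities and relations, create the tables, populate them, test the jobs) can be sketched end to end. The hands-on uses PostgreSQL and Talend Open Studio; this sketch deliberately swaps in Python's built-in sqlite3 and plain inserts so that it runs anywhere. The customer and policy entities, their columns, and the sample rows are hypothetical and are not taken from the course data files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the relation between tables

# Model: two entities (customer, policy), one relation (a policy belongs to
# a customer), and one fact (the premium).
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    CREATE TABLE policy (
        policy_id   INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        premium     REAL NOT NULL
    );
    """
)

# Populate: stand-in for the ETL jobs, loading parent rows before child rows.
customers = [(1, "Asha Rao"), (2, "Vikram Shah")]
policies = [(10, 1, 250.0), (11, 1, 120.0), (12, 2, 300.0)]
conn.executemany("INSERT INTO customer VALUES (?, ?)", customers)
conn.executemany("INSERT INTO policy VALUES (?, ?, ?)", policies)
conn.commit()

# Test the jobs: row counts match the inputs and no policy lacks a customer.
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
policy_count = conn.execute("SELECT COUNT(*) FROM policy").fetchone()[0]
orphans = conn.execute(
    "SELECT COUNT(*) FROM policy p "
    "LEFT JOIN customer c ON c.customer_id = p.customer_id "
    "WHERE c.customer_id IS NULL"
).fetchone()[0]

print(customer_count, policy_count, orphans)
```

Comparing loaded row counts against the source and checking for orphaned rows are two basic checks; a real job would usually add checks on data types and value ranges as well.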