Data warehouse and business intelligence
project for the analysis of
Starbucks
Student Name: Sonali Gupta
Student ID: x01527245
Course: MSc Data Analytics
Table of Contents
INTRODUCTION
DATA SOURCES
TECHNOLOGY USED
DATA WAREHOUSE DESIGN AND ARCHITECTURE
Design of Data Warehouse
Business Query
Case Study 1
Case Study 2
Case Study 3
Conclusion
INTRODUCTION
Grabbing a cup of coffee in the morning is always delightful, as it provides a punch to energize
the day, and when coffee comes with a sense of ownership and plenty of offers, the only name that
comes to my mind is Starbucks. What delights me even more is having a cup of coffee at
Starbucks and trying every new variety of coffee and beverage. I am a big lover of
coffee, and when it comes to buying one I always look for a Starbucks. It gives me a
feeling of joy: the way they present coffee varieties chosen from around the globe
and the service they provide are commendable.
I used to wonder how Starbucks manages its inventory and handles its
business. This curiosity made me choose Starbucks as the topic of my data warehouse
project. This project is a working model of a data warehouse for Starbucks and demonstrates its
business intelligence capabilities.
Information related to Starbucks:
Starbucks is an American coffee company that was started in Seattle, Washington, in 1971. At present
the CEO of Starbucks is Kevin Johnson, and the company has approximately 23,768 locations worldwide.
Starbucks is known as the third largest fast food restaurant chain.
DATA SOURCES
1. Structured data
I found this data set on the Kaggle website. It contains the details of Starbucks
store locations worldwide. The columns are Brand, Store Number, Store Name,
Ownership Type, Street Address, City, State/Province, Country, Postcode, Phone Number,
Timezone, Longitude and Latitude. From this data I took the subset for the United States, which I
used in the project.
Link to the source – https://www.kaggle.com/starbucks/store-locations/data
2. Semi-structured data
I generated this data set using the Mockaroo API. It represents the Starbucks sales
report, and its columns are year, month, revenue details, number of visitors,
food sales quantity and beverage sales quantity.
Link to the source - http://my.api.mockaroo.com/
3. Unstructured data
Unstructured data does not fit a relational table. It is largely text-heavy data
that can include dates, points, ratings and comments. I generated this data set using the API of
Yelp.com. Yelp is used to publish reviews and ratings of local businesses (restaurants,
hotels). This data set holds the review ratings of Starbucks stores.
Link to the source - https://www.yelp.com/developers/documentation/v3/business_search
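As a rough illustration of taking the United States subset from the Kaggle file, the sketch below filters rows on the Country column. The column names come from the list above; the sample rows, the country code "US" and the function name filter_by_country are assumptions made for this example and are not taken from the project.

```python
import csv
import io

# Made-up sample rows for illustration only; the real file has more columns.
SAMPLE_CSV = """Brand,Store Number,Store Name,City,Country
Starbucks,1001-A,Example Store One,Seattle,US
Starbucks,1002-B,Example Store Two,Dublin,IE
Starbucks,1003-C,Example Store Three,Chicago,US
"""


def filter_by_country(csv_text, country_code="US"):
    """Return the rows whose Country column equals country_code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Country"] == country_code]


# Keep only the rows assumed to belong to the United States.
us_stores = filter_by_country(SAMPLE_CSV)
```

With a real download, the same function could be given the text of the CSV file instead of SAMPLE_CSV.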
TECHNOLOGY USED
The technologies used in this project are shown below:
Database Management
• SQL Server Management Studio (SSMS)
• SQL Server Integration Services (SSIS)
Programming Language
• R is used for Twitter sentiment analysis and data cleaning.
• SQL is used for the dimension tables and fact table of every component.
Additional Software
• Tableau for creating graphs.
DATA WAREHOUSE DESIGN AND ARCHITECTURE
Use of a data warehouse:
1. Data from various sources is integrated in real time, which supports business decisions,
gives users access to the data in future and saves time.
2. Historic data can be integrated in one place with common keys, common formats
and a common data model.
3. The quality of the data improves and reports are generated faster.
4. Business intelligence can be created, for example SSAS cubes.
Two methodologies are commonly used when designing the storage of a data warehouse for business
intelligence purposes: Inmon and Kimball. Each approach has
its own advantages.
Kimball’s methodology uses a dimensional design approach and is also known as the bottom-up
design. Data marts for reporting are created first and then integrated to form the data warehouse.
Star and snowflake schemas are easy to create with this approach, and the methodology delivers
business value in a short span of time. This is the reason I decided to choose this approach.
Inmon’s methodology is used for an enterprise data warehouse and is also known as the top-down
design. A normalised data model is created first, and then the data marts holding the data required
for specific business processes are built. This approach takes a lot of time and requires more ETL
work.
The analysis covers Starbucks stores in different areas of the USA: how much revenue is
generated, the number of visitors, and the months and years with the maximum and minimum sales of
food and beverages. Kimball’s approach is used to build this data warehouse.
These are the four steps for designing the dimensional data model:
1. Select the business process.
2. Declare the grain.
3. Identify the dimensions.
4. Identify the facts.
Before deciding on my dimensions and facts, I considered at a high level what my Starbucks data
warehouse would look like and what its performance metrics would be. In this project the Starbucks
data is kept at the atomic level, and I selected 3 dimensions according to the need for filtering and
grouping the facts.
Fig.1
Star Schema:
The star schema is the simplest form of data warehouse schema and takes its name from the fact that
its diagram resembles a star. A star schema consists of a fact table and dimension tables, where the
fact table is in the centre and the dimension tables are joined to it. The data is organised into
facts and dimensions.
Fact Table:
A fact table is a combination of foreign key columns and measure columns. Each foreign
key column references the primary key of a dimension table, and the measure columns contain the data
that is being analysed.
In the Starbucks data warehouse, the fact table contains store details, date, location, sales report
and Yelp rating data. All of these details help in analysing the business queries.
[Figure: architecture diagram. The data sources (Kaggle structured data, Mockaroo API, Yelp API) are loaded into the staging area, then into the Starbucks DW, the DW cube and the reports in BI.]
Dimension Table:
In a data warehouse, a dimension table is used to define a dimension with its keys, attributes and
values. Every dimension table has its own primary key, which is unique within the table. It contains
the details of each object in the data. The star schema of the dimension and fact tables is shown in
the figure below.
Fig.2
Benefits of Star Schema
If a star schema is well designed, it makes large data sets easy to understand and analyse. The main
benefits are described below:
• The ETL process is easy to create.
• Complexity is very low because the tables have direct relationships.
• Every dimension is directly connected to the fact table.
• Query Performance
• Load Performance and Administration
• Built in Referential Integrity
• Efficient Navigation through Data
Design of Data Warehouse
In this Starbucks data warehouse, three dimensions and one fact table have been created.
DimStoreDetails: The store details dimension consists of Store_id, Store_name, Store_number,
Ownership and Yelp rating. Store_id is the primary key of this dimension.
Dimlocation: The location dimension consists of Location_id, Latitude, Longitude, City, Country,
Postcode and Address. Location_id is the primary key of this dimension.
DimDate: The date dimension consists of Date_id, Year and Month. Date_id is the primary key
of this dimension.
Facttable: The Starbucks data warehouse has one fact table, which is connected
to all the dimension tables through foreign key relationships. It has four measure
columns:
1. Visitor_count – the number of visitors.
2. Revenue – the revenue of the store.
3. Beverage_count – the quantity of beverages sold.
4. Food_count – the quantity of food items sold.
Fig 3
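The design described above could be sketched as table definitions like the ones below. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of SQL Server, the column types are guesses, and the names FactTable and Yelp_rating are assumed spellings of the fact table and the Yelp rating column.

```python
import sqlite3

# Three dimension tables and one fact table, following the columns listed in the report.
# Column types are assumptions; the report does not state them.
SCHEMA = """
CREATE TABLE DimStoreDetails (
    Store_id     INTEGER PRIMARY KEY,
    Store_name   TEXT,
    Store_number TEXT,
    Ownership    TEXT,
    Yelp_rating  REAL
);
CREATE TABLE Dimlocation (
    Location_id INTEGER PRIMARY KEY,
    Latitude    REAL,
    Longitude   REAL,
    City        TEXT,
    Country     TEXT,
    Postcode    TEXT,
    Address     TEXT
);
CREATE TABLE DimDate (
    Date_id INTEGER PRIMARY KEY,
    Year    INTEGER,
    Month   INTEGER
);
CREATE TABLE FactTable (
    Store_id       INTEGER NOT NULL REFERENCES DimStoreDetails(Store_id),
    Location_id    INTEGER NOT NULL REFERENCES Dimlocation(Location_id),
    Date_id        INTEGER NOT NULL REFERENCES DimDate(Date_id),
    Visitor_count  INTEGER,
    Revenue        REAL,
    Beverage_count INTEGER,
    Food_count     INTEGER
);
"""


def create_star_schema():
    """Create the star schema in an in-memory SQLite database and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
    conn.executescript(SCHEMA)
    return conn


conn = create_star_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```

Because every foreign key in the fact table points straight at a dimension's primary key, a fact row that refers to a missing dimension row is rejected, which is the referential integrity benefit listed earlier.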
Extract Transform Load (ETL) Process
The main task of any data warehouse is to
rearrange, integrate and consolidate data from many systems. ETL means extracting
data from different sources, transforming it in the staging area and then loading it into the
destination. The SSIS tool is used for the ETL process. The first
step extracts the data into the staging database, the next is the transformation stage and the last
is the loading stage, where the data is loaded into the fact table. At the end of the ETL process,
the fact table is populated along with the dimension tables.
Extraction: I extracted the data from three different sources. The first data set was loaded directly
as a flat file, and the other two were extracted using APIs and stored in CSV format. The extraction
step loads the data into the staging area using OLE DB connections.
At this stage, the Yelp and Mockaroo data sets are not structured, so they were generated through
the APIs with the help of scraping and the R language.
The staging tables are truncated before each load so that duplicate copies of the data are not
generated.
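The truncate-then-load idea can be sketched as follows. The example is an assumption-laden stand-in: it uses Python's sqlite3 module, a hypothetical staging table called StagingSales, and DELETE FROM in place of SQL Server's TRUNCATE TABLE, which SQLite does not have. The rows are made up.

```python
import sqlite3


def reload_staging(conn, rows):
    """Empty the staging table, then load the fresh extract."""
    with conn:  # one transaction: commit on success, roll back on error
        # SQLite stand-in for TRUNCATE TABLE in SQL Server.
        conn.execute("DELETE FROM StagingSales")
        conn.executemany(
            "INSERT INTO StagingSales (Year, Month, Revenue) VALUES (?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StagingSales (Year INTEGER, Month INTEGER, Revenue REAL)")

extract = [(2017, 1, 100.0), (2017, 2, 150.0)]  # made-up rows for illustration
reload_staging(conn, extract)
reload_staging(conn, extract)  # a second run replaces the rows instead of duplicating them

row_count = conn.execute("SELECT COUNT(*) FROM StagingSales").fetchone()[0]
```

Running the load twice leaves the same number of rows as running it once, which is the point of truncating first.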
Transformation: After extraction, the data is transformed. I used lookups,
joins and SQL queries to load the data into the dimension tables.
These are the three dimensions:
1. dbo.DimDate
2. dbo.Dimlocation
3. dbo.DimStoreDetails
Loading: After the dimensions are populated, the next step is to populate the fact table. The fact
table includes the primary keys of all the dimensions, and lookups against the dimension
tables are used to populate these keys and the measures in the fact table.
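A lookup-based fact load can be approximated with joins, as in the sketch below. It is not the SSIS package from the project: it uses Python's sqlite3 module, a hypothetical StagingSales table, made-up rows, and only two of the three dimensions to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimStoreDetails (Store_id INTEGER PRIMARY KEY, Store_number TEXT UNIQUE);
CREATE TABLE DimDate (Date_id INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
CREATE TABLE StagingSales (Store_number TEXT, Year INTEGER, Month INTEGER,
                           Visitor_count INTEGER, Revenue REAL);
CREATE TABLE FactTable (Store_id INTEGER, Date_id INTEGER,
                        Visitor_count INTEGER, Revenue REAL);

-- Made-up rows for illustration only.
INSERT INTO DimStoreDetails VALUES (1, 'S-001'), (2, 'S-002');
INSERT INTO DimDate VALUES (10, 2017, 1), (11, 2017, 2);
INSERT INTO StagingSales VALUES ('S-001', 2017, 1, 120, 900.0),
                                ('S-002', 2017, 2, 80, 640.0),
                                ('S-999', 2017, 1, 5, 30.0);
""")

# Each JOIN acts as a lookup: it swaps a business key from staging
# for the primary key of the matching dimension row.
conn.execute("""
INSERT INTO FactTable (Store_id, Date_id, Visitor_count, Revenue)
SELECT s.Store_id, d.Date_id, st.Visitor_count, st.Revenue
FROM StagingSales AS st
JOIN DimStoreDetails AS s ON s.Store_number = st.Store_number
JOIN DimDate AS d ON d.Year = st.Year AND d.Month = st.Month
""")

fact_rows = conn.execute(
    "SELECT Store_id, Date_id, Visitor_count, Revenue FROM FactTable ORDER BY Store_id"
).fetchall()
```

The staging row for store 'S-999' has no match in the store dimension, so the inner join leaves it out of the fact table. A real load would need to decide whether such rows are dropped, logged or redirected.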
Deploying the cube: The cube is deployed with the help of SSAS, which is used to analyse the data on
the basis of the measures held in the fact table and the textual attributes held in the dimension
tables. Once the cube is deployed, we can run the business queries against the database.
[Figure: ETL flow from the external sources to the staging database, then the dimension tables and the fact table.]
Business Query
Case Study 1:
Which city has the maximum revenue?
This query uses store_sales_report and Store_details. The graph below shows how
much revenue is generated in each city.
Analysis:
From this bar chart, we can easily see that the maximum revenue is generated in New
York City, followed by Chicago.
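A query of this kind could look like the sketch below, written against the star schema rather than the report's own query, which is not shown. It uses Python's sqlite3 module, cut-down versions of the tables, and placeholder cities and revenue figures that are not project results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE FactTable (Location_id INTEGER, Revenue REAL);

-- Placeholder rows; these are not the project's figures.
INSERT INTO Dimlocation VALUES (1, 'City A'), (2, 'City B');
INSERT INTO FactTable VALUES (1, 500.0), (1, 250.0), (2, 600.0);
""")

# Total revenue per city, highest first.
REVENUE_BY_CITY = """
SELECT l.City, SUM(f.Revenue) AS Total_revenue
FROM FactTable AS f
JOIN Dimlocation AS l ON l.Location_id = f.Location_id
GROUP BY l.City
ORDER BY Total_revenue DESC
"""

ranking = conn.execute(REVENUE_BY_CITY).fetchall()
top_city = ranking[0][0]
```

The first row of the result is the city with the maximum revenue, which is what the bar chart conveys visually.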
Case Study 2:
Which city has the maximum number of visitors and beverage sales?
This query uses Store_details and store_name. The bubble graph below groups the cities
by visitors and beverages.
Analysis:
From this bubble chart representation, we can easily analyse the city
Case Study 3:
Which city has a high rating on the basis of food count and beverage count?
This data set contains yelp_Rating and store_sales_report. The bar graph below represents
this scenario:
Analysis:
After analysis, it is clearly seen that New York has a high rating on the basis of food count and
beverage count.
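One way to combine the rating with the food and beverage counts per city is sketched below. It is an illustration only: it uses Python's sqlite3 module, cut-down tables, the assumed column name Yelp_rating, and placeholder cities and numbers that are not project results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimStoreDetails (Store_id INTEGER PRIMARY KEY, Yelp_rating REAL);
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE FactTable (Store_id INTEGER, Location_id INTEGER,
                        Food_count INTEGER, Beverage_count INTEGER);

-- Placeholder rows; one fact row per store keeps the average simple.
INSERT INTO DimStoreDetails VALUES (1, 4.5), (2, 3.5), (3, 3.0);
INSERT INTO Dimlocation VALUES (1, 'City A'), (2, 'City B');
INSERT INTO FactTable VALUES (1, 1, 10, 20), (2, 1, 6, 4), (3, 2, 2, 3);
""")

# Average rating plus food and beverage totals per city, best rated first.
# Note: AVG runs over fact rows, so a store with more fact rows weighs more.
RATING_BY_CITY = """
SELECT l.City,
       AVG(s.Yelp_rating)    AS Avg_rating,
       SUM(f.Food_count)     AS Food_total,
       SUM(f.Beverage_count) AS Beverage_total
FROM FactTable AS f
JOIN DimStoreDetails AS s ON s.Store_id = f.Store_id
JOIN Dimlocation AS l ON l.Location_id = f.Location_id
GROUP BY l.City
ORDER BY Avg_rating DESC
"""

rating_rows = conn.execute(RATING_BY_CITY).fetchall()
```

If stores have many fact rows each, averaging the rating over distinct stores first would avoid weighting busy stores more heavily.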
Conclusion:
A data warehouse makes it easy to handle and analyse large amounts of data. Using the data
warehouse, we can easily find in which month Starbucks sales are high or low, which city has the
maximum revenue and rating, and much more. In the end, the analysis shows that New York
consistently gets good ratings and maintains high revenue.

More Related Content

What's hot

Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big DataJames Serra
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDatabricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and more
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and moreBig Data & Analytics - Use Cases in Mobile, E-commerce, Media and more
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and moreAmazon Web Services
 
Apache Spark sql
Apache Spark sqlApache Spark sql
Apache Spark sqlaftab alam
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar DatabaseBiju Nair
 
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...Medicaps University
 
Data Architecture Strategies
Data Architecture StrategiesData Architecture Strategies
Data Architecture StrategiesDATAVERSITY
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : BeginnersShweta Patnaik
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Accenture informatica interview question answers
Accenture informatica interview question answersAccenture informatica interview question answers
Accenture informatica interview question answersSweta Singh
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Simplilearn
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...Jochem van Grondelle
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overviewDataArt
 

What's hot (20)

Finding business value in Big Data
Finding business value in Big DataFinding business value in Big Data
Finding business value in Big Data
 
Dynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache SparkDynamic Partition Pruning in Apache Spark
Dynamic Partition Pruning in Apache Spark
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Oltp vs olap
Oltp vs olapOltp vs olap
Oltp vs olap
 
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and more
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and moreBig Data & Analytics - Use Cases in Mobile, E-commerce, Media and more
Big Data & Analytics - Use Cases in Mobile, E-commerce, Media and more
 
Data monetization pov
Data monetization   povData monetization   pov
Data monetization pov
 
Star schema PPT
Star schema PPTStar schema PPT
Star schema PPT
 
Apache Spark sql
Apache Spark sqlApache Spark sql
Apache Spark sql
 
ETL QA
ETL QAETL QA
ETL QA
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
 
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...
Data Warehousing (Need,Application,Architecture,Benefits), Data Mart, Schema,...
 
Data Architecture Strategies
Data Architecture StrategiesData Architecture Strategies
Data Architecture Strategies
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Accenture informatica interview question answers
Accenture informatica interview question answersAccenture informatica interview question answers
Accenture informatica interview question answers
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
 
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
 
Big Data
Big DataBig Data
Big Data
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
 
Map reduce vs spark
Map reduce vs sparkMap reduce vs spark
Map reduce vs spark
 

Similar to Dwbi Project

Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project reportsonalighai
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail storeSiddharth Chaudhary
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15AnwarrChaudary
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1guest9529cb
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNabclearnn
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional ModellingAshish Chandwani
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationKate Subramanian
 
Data warehouse
Data warehouseData warehouse
Data warehouse_123_
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxhajon27910
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousingwork
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional ModelingSunita Sahu
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptxjainyshah20
 
Dataware housing
Dataware housingDataware housing
Dataware housingwork
 

Similar to Dwbi Project (20)

Data warehousing and business intelligence project report
Data warehousing and business intelligence project reportData warehousing and business intelligence project report
Data warehousing and business intelligence project report
 
Data warehouse project on retail store
Data warehouse project on retail storeData warehouse project on retail store
Data warehouse project on retail store
 
Intro to Data warehousing lecture 15
Intro to Data warehousing   lecture 15Intro to Data warehousing   lecture 15
Intro to Data warehousing lecture 15
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARN
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
Date Analysis .pdf
Date Analysis .pdfDate Analysis .pdf
Date Analysis .pdf
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
A Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence ApplicationA Data Warehouse And Business Intelligence Application
A Data Warehouse And Business Intelligence Application
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Dw concepts
Dw conceptsDw concepts
Dw concepts
 
Resume
ResumeResume
Resume
 
BI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptxBI_LECTURE_4-2021.pptx
BI_LECTURE_4-2021.pptx
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Star schema
Star schemaStar schema
Star schema
 
Data warehouse logical design
Data warehouse logical designData warehouse logical design
Data warehouse logical design
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
Resume
ResumeResume
Resume
 
Dataware housing
Dataware housingDataware housing
Dataware housing
 

Recently uploaded

办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 

Recently uploaded (20)

办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 

Dwbi Project

  • 1. Data warehouse and business intelligent project for the analysis of Starbucks Student Name: Sonali Gupta Student ID: x01527245 Course: Msc. Data Analytics
  • 2. Table of Contents INTRODUCTION ...................................................................................................................... 3 DATA SOURCES................................................................................................................... 3 TECHNOLOGY USED ..........................................................................................................4 DATA WAREHOUSE DESIGN AND ARCHIETECTURE ....................................................................4 Design of Data Warehouse ......................................................................................................6 Business Query........................................................................................................................ 9 Case Study 1............................................................................................................................. 9 Case Study2:....................................................................................................................... 10 Case Study 3:...................................................................................................................... 10 Conclusion:............................................................................................................................ 11
  • 3. INTRODUCTION
    Grabbing a cup of coffee in the morning is always a delight, as it gives a burst of energy for the day, and when coffee comes with a sense of ownership and plenty of offers, the only name that comes to my mind is Starbucks. What delights me even more is having a cup of coffee at Starbucks and trying every new variety of coffee and the other beverages. I am a big coffee lover, and when I want to buy one I always look for a Starbucks. It gives me a feeling of joy: the way they present coffee varieties chosen from around the globe, and the service they provide, deserve applause. I used to wonder how Starbucks manages its inventory and handles its business. This curiosity made me choose Starbucks as the topic of my data warehouse project. The project is a working model of a data warehouse for Starbucks and shows its business intelligence capabilities.
    Information related to Starbucks: It is an American coffee company that was started in Seattle, Washington, in 1971. At present the CEO of Starbucks is Kevin Johnson, and the company has approximately 23,768 locations worldwide. Starbucks is known as the third-largest fast food restaurant chain.
    DATA SOURCES
    1. Structured data: I found this data set on the Kaggle website. It contains the details of Starbucks locations worldwide. The columns are Brand, Store Number, Store Name, Ownership Type, Street Address, City, State/Province, Country, Postcode, Phone Number, Timezone, Longitude and Latitude. From this data I took the subset for the United States, which I used in the project. Link of the source – https://www.kaggle.com/starbucks/store-locations/data.
    2. Semi-structured data: I generated this data set using the Mockaroo API. It is a Starbucks sales report, and its columns are year, month, revenue details, number of visitors, food sales and beverage sales. Link of the source - http://my.api.mockaroo.com/
    3. Unstructured data: This is data that does not fit into relational tables. It is mostly text-heavy and can include dates, points, ratings and comments. I generated this data set using the Yelp.com API. Yelp is used to publish reviews and ratings of local businesses (restaurants, hotels). This data set contains the review ratings of Starbucks stores. Link of the source - https://www.yelp.com/developers/documentation/v3/business_search
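As an illustration of how the United States subset might be taken from the Kaggle file, here is a minimal Python sketch. The column names follow the list above; the sample rows, the function name `filter_by_country` and the country code value `US` are assumptions made for the example, not taken from the project.

```python
import csv
import io

# Hypothetical sample rows. The column names follow the data set description;
# the values are invented for illustration only.
SAMPLE_CSV = """Brand,Store Number,Store Name,Ownership Type,City,State/Province,Country
Starbucks,S-001,Example Store A,Company Owned,New York,NY,US
Starbucks,S-002,Example Store B,Licensed,Toronto,ON,CA
Starbucks,S-003,Example Store C,Company Owned,Chicago,IL,US
"""


def filter_by_country(csv_text, country_code="US"):
    """Return the rows whose Country column equals country_code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Country"] == country_code]


us_stores = filter_by_country(SAMPLE_CSV)
print(len(us_stores), [row["City"] for row in us_stores])
```

With a real export, the same function could be given the text of the downloaded CSV file instead of the sample string.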
  • 4. TECHNOLOGY USED
    The technologies used in this project are listed below:
    Database Management
    • SQL Server Management Studio (SSMS)
    • SQL Server Integration Services (SSIS)
    Programming Language
    • R is used for Twitter sentiment analysis and data cleaning.
    • SQL is used for the dimension tables and the fact table of every component.
    Additional Software
    • Tableau for creating graphs.
    DATA WAREHOUSE DESIGN AND ARCHITECTURE
    Uses of a data warehouse:
    1. Data from various sources is integrated in real time, which supports business decisions, keeps the data accessible to users in the future, and saves time.
    2. Historic data can be integrated in one place with common keys, common formats and a common data model.
    3. Data quality improves and reports are generated faster.
    4. It enables business intelligence, for example SSAS cubes.
    When designing the storage of a data warehouse for business intelligence purposes, two methodologies are commonly used, Inmon and Kimball, and each approach has its own advantages.
    Kimball's methodology uses a dimensional design approach and is also known as bottom-up design. In this approach the data marts for reporting are created first and are then integrated to form the data warehouse, so star and snowflake schemas are easy to create. This methodology delivers business value in a short span of time, which is the reason I decided to choose it.
    Inmon's methodology is used for an enterprise data warehouse and is also known as top-down design. A normalised data model is created first, and then the data marts are built with the data required for each specific business process. This approach therefore takes a lot of time and requires more ETL work.
    The aim here is to analyse Starbucks stores in different areas of the USA: how much revenue is generated, the number of visitors, and in which month and year food and beverage sales are at their maximum and minimum. Kimball's approach is used to build this data warehouse. These are the four steps for designing a dimensional data model:
    1. Select the business process.
    2. Declare the grain.
    3. Identify the dimensions.
    4. Identify the facts.
  • 5. Before deciding on my dimensions and facts, I considered at a high level what my Starbucks data warehouse would look like and what its performance metrics would be. In this project the Starbucks data is kept at the atomic level, and I selected three dimensions according to the need for filtering and grouping the facts.
    Fig. 1: Data sources (Kaggle structured data, Mockaroo API, Yelp API), loading into the staging area, Starbucks DW, DW cube, reports in BI.
    Star Schema: The star schema is the simplest form of data warehouse schema, so called because its diagram resembles a star. It consists of a fact table and dimension tables: the fact table is in the centre and the dimension tables are joined to it. The data is organised into facts and dimensions.
    Fact Table: A fact table combines foreign key columns and measure columns. Each foreign key column references the primary key of a dimension table, and the measure columns contain the data being analysed. In the Starbucks data warehouse, the fact table brings together the store details, date, location, sales report and Yelp rating data. All of these details help in analysing the business queries.
  • 6. Dimension Table: In a data warehouse, a dimension table defines a dimension with its keys, attributes and values. Every dimension table has its own primary key, which is unique within the table, and holds the descriptive details of each object. The star schema of the dimension and fact tables is shown in the figure below.
    Fig. 2
    Benefits of Star Schema
    If a star schema is well designed, it is easy to understand and to use for analysing large data sets. The main benefits are:
    • The ETL process is easy to create.
    • Complexity is very low because the tables have direct relationships.
    • Every dimension is directly connected to the fact table.
    • Query performance
    • Load performance and administration
    • Built-in referential integrity
    • Efficient navigation through data
    Design of Data Warehouse
    In this Starbucks data warehouse, three dimensions and one fact table have been created.
    DimStoreDetails: The store details dimension consists of Store_id, Store_name, Store_number, Ownership and Yelp rating. Store_id is the primary key of this dimension.
    Dimlocation: The location dimension consists of Location_id, Latitude, Longitude, City, Country, Postcode and Address. Location_id is the primary key of this dimension.
  • 7. DimDate: The date dimension consists of Date_id, Year and Month. Date_id is the primary key of this dimension.
    Facttable: The Starbucks data warehouse has one fact table, which is connected to all the dimension tables through foreign key relationships. It has four measure columns:
    1. Visitor_count – the number of visitors.
    2. Revenue – the revenue of the store.
    3. Beverage_count – the beverage sales count.
    4. Food_count – the food sales count.
    Fig. 3
    Extract Transform Load (ETL) Process
    The main task of any data warehouse is to rearrange, integrate and consolidate data from many systems. ETL means that data is extracted from different sources, transformed in a staging area, and then loaded into the destination. SSIS is the tool used for the ETL process. The first step extracts the data into the staging database, the next step is the transformation stage, and the last step is the loading stage, where the data is loaded into the fact table. At the end of the ETL process, the fact table and the dimension tables are populated.
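The three dimensions and the fact table described above could be declared roughly as follows. This is a minimal sketch that uses Python's built-in `sqlite3` module in place of SQL Server so that it runs anywhere; the table and column names come from the design above, while the data types, the `Yelp_rating` column spelling and the function name `create_star_schema` are assumptions.

```python
import sqlite3

# Column names follow the design described above; the data types are assumed.
SCHEMA = """
CREATE TABLE DimStoreDetails (
    Store_id     INTEGER PRIMARY KEY,
    Store_name   TEXT,
    Store_number TEXT,
    Ownership    TEXT,
    Yelp_rating  REAL
);
CREATE TABLE Dimlocation (
    Location_id INTEGER PRIMARY KEY,
    Latitude    REAL,
    Longitude   REAL,
    City        TEXT,
    Country     TEXT,
    Postcode    TEXT,
    Address     TEXT
);
CREATE TABLE DimDate (
    Date_id INTEGER PRIMARY KEY,
    Year    INTEGER,
    Month   INTEGER
);
-- The fact table holds one foreign key per dimension plus the four measures.
CREATE TABLE FactTable (
    Store_id       INTEGER NOT NULL REFERENCES DimStoreDetails(Store_id),
    Location_id    INTEGER NOT NULL REFERENCES Dimlocation(Location_id),
    Date_id        INTEGER NOT NULL REFERENCES DimDate(Date_id),
    Visitor_count  INTEGER,
    Revenue        REAL,
    Beverage_count INTEGER,
    Food_count     INTEGER
);
"""


def create_star_schema():
    """Create the star schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


conn = create_star_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(sorted(tables))
```

Every dimension joins directly to the fact table through a single key, which is what keeps the star schema queries simple.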
  • 8. Extraction: I extracted the data from three different sources. The first data set was loaded directly from a flat file, and the other two were extracted through APIs and stored in CSV format. Extraction loads the data into the staging area through OLE DB connections. The Yelp and Mockaroo data sets are not plain structured files, so they were generated through the APIs with the help of scraping and the R language. The staging tables are truncated before each load so that no duplicate data is generated.
    Transformation: After extraction, the data is transformed. I used lookups, joins and SQL queries to load the data into the dimension tables. These are the three dimensions:
    1. dbo.DimDate
    2. dbo.Dimlocation
    3. dbo.DimStoreDetails
    Loading: After the dimensions are populated, the next step is to populate the fact table. The fact table includes the primary keys of all the dimensions; lookups are used to match the dimension keys and to populate the measures in the fact table.
    Deploying the cube: The cube is deployed with SSAS, which is used to analyse the data on the basis of the measures held in the fact table and the textual attributes held in the dimension tables. Once the cube is deployed, business queries can be run against the database.
    [Figure: ETL flow: External Source, Staging Database, Dimension table, Fact table]
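The staging, dimension and fact loading steps above can be sketched in a few lines. The project itself uses SSIS; the following is only an illustrative Python and SQLite equivalent, with a reduced set of columns, a hypothetical `Staging` table, a hypothetical `run_etl` function and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Staging (Store_name TEXT, City TEXT, Year INTEGER, Month INTEGER, Revenue REAL);
CREATE TABLE DimStoreDetails (Store_id INTEGER PRIMARY KEY, Store_name TEXT UNIQUE);
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT UNIQUE);
CREATE TABLE DimDate (Date_id INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER,
                      UNIQUE (Year, Month));
CREATE TABLE FactTable (Store_id INTEGER, Location_id INTEGER, Date_id INTEGER, Revenue REAL);
""")


def run_etl(conn, extracted_rows):
    """Load extracted rows into staging, then the dimensions, then the fact table."""
    # Truncate staging first so a repeated extract does not pile up duplicates.
    conn.execute("DELETE FROM Staging")
    conn.executemany("INSERT INTO Staging VALUES (?, ?, ?, ?, ?)", extracted_rows)

    # Populate each dimension with the distinct values found in staging.
    conn.execute("INSERT OR IGNORE INTO DimStoreDetails (Store_name) "
                 "SELECT DISTINCT Store_name FROM Staging")
    conn.execute("INSERT OR IGNORE INTO Dimlocation (City) "
                 "SELECT DISTINCT City FROM Staging")
    conn.execute("INSERT OR IGNORE INTO DimDate (Year, Month) "
                 "SELECT DISTINCT Year, Month FROM Staging")

    # Look up the dimension keys with joins and load the measures into the fact table.
    # A full reload would also need to clear FactTable; that is left out of this sketch.
    conn.execute("""
        INSERT INTO FactTable (Store_id, Location_id, Date_id, Revenue)
        SELECT s.Store_id, l.Location_id, d.Date_id, st.Revenue
        FROM Staging AS st
        JOIN DimStoreDetails AS s ON s.Store_name = st.Store_name
        JOIN Dimlocation AS l ON l.City = st.City
        JOIN DimDate AS d ON d.Year = st.Year AND d.Month = st.Month
    """)
    conn.commit()


# Invented sample rows: (Store_name, City, Year, Month, Revenue).
run_etl(conn, [
    ("Example Store A", "New York", 2017, 1, 100.0),
    ("Example Store A", "New York", 2017, 2, 120.0),
    ("Example Store B", "Chicago", 2017, 1, 80.0),
])
fact_rows = conn.execute("SELECT COUNT(*) FROM FactTable").fetchone()[0]
print(fact_rows)
```

The joins in the last statement play the role of the SSIS lookup: each staging row is matched to the surrogate key of its store, location and date before it is written to the fact table.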
  • 9. Business Query
    Case Study 1: Which city has the maximum revenue?
    This query uses store_sales_report and Store_details. The graph below shows how much revenue is generated in each city.
    Analysis: From this bar chart, we can easily see that the maximum revenue is generated in New York City, followed by Chicago.
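A query of this kind can be expressed as an aggregate over the fact table joined to the location dimension. The sketch below is an assumption about how such a query might look, written against a reduced SQLite version of the schema; the revenue figures are invented sample values and are not the project's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dimlocation (Location_id INTEGER PRIMARY KEY, City TEXT);
CREATE TABLE FactTable (Location_id INTEGER, Revenue REAL);
-- Invented sample values, for illustration only.
INSERT INTO Dimlocation VALUES (1, 'New York'), (2, 'Chicago'), (3, 'Seattle');
INSERT INTO FactTable VALUES (1, 500.0), (1, 300.0), (2, 600.0), (3, 200.0);
""")

# Total revenue per city, highest first.
REVENUE_BY_CITY = """
SELECT l.City, SUM(f.Revenue) AS Total_revenue
FROM FactTable AS f
JOIN Dimlocation AS l ON l.Location_id = f.Location_id
GROUP BY l.City
ORDER BY Total_revenue DESC
"""


def revenue_by_city(conn):
    """Return (city, total revenue) pairs sorted from highest to lowest revenue."""
    return conn.execute(REVENUE_BY_CITY).fetchall()


ranking = revenue_by_city(conn)
print(ranking[0])
```

The first row of the result is the city with the maximum revenue, and the full ranking is what a bar chart of revenue by city would plot.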
  • 10. Case Study 2: Which city has the maximum number of visitors and beverage sales?
    This query uses Store_details and store_name. The bubble graph below shows the cities grouped by visitors and beverages.
    Analysis: From this bubble chart representation, we can easily analyse the city
    Case Study 3: Which city has a high rating on the basis of food count and beverage count?
    This data set contains yelp_Rating and store_sales_report. The bar graph below represents this scenario:
  • 11. Analysis: After the analysis, it is clearly seen that New York has a high rating on the basis of food count and beverage count.
    Conclusion: A data warehouse makes it easy to handle and analyse large amounts of data. Using the data warehouse, we can easily find in which month Starbucks sales are high or low, which city has the maximum revenue and rating, and much more. In conclusion, New York consistently gets good ratings and maintains high revenue.