Electronic Commerce
CHIRILĂ Sorina-Georgiana -- GRAMA Mircea-Constantin
FEAA -- Data Mining -- Data Warehouses -- 2018
Summary
● Introduction
● Clickstream Source Data
● Clickstream Data Challenges
● Clickstream Dimensional Models
● Clickstream Session Fact Table
● Clickstream Page Event Fact Table
● Google Analytics
● Integrating Clickstream into Web Retailer’s Bus Matrix
● Profitability Across Channels Including Web
What is E-Commerce?
Examples of Web Stores
What is the Clickstream and how do you identify it?
The Clickstream is, by definition, the record of every page event
captured by a company's web servers. A page event is any user
click anywhere on a web page. These clicks are stored in
Clickstream Data records.
The Clickstream introduces four dimensions that are not found
in other data sources: Page, Event, Session and Referral.
Opodo - Spanish site to book cheap flights, hotels and
package holidays.
Clickstream Source Data
● The Clickstream is an evolving collection of data sources.
● Clickstream data is captured in several server log file formats and by several physical servers
simultaneously; these log file formats have optional data components that can help in
identifying visitors, sessions, and the true meaning of behavior.
● Clickstream data comes from both internal and external parties.
● Examples of external parties: referring partners, Internet Service Providers (ISPs), and the search
specification given to a search engine that then directs the visitor to the website.
● Clickstream data has two main disadvantages: it is stateless and sessions are anonymous.
● Stateless means that the log shows an isolated page retrieval event but does not provide a
clear tie to other page events elsewhere in the log; without contextual help it is difficult to identify a
complete visitor session.
● Anonymity of the session means that unless visitors agree to reveal their identity in some way,
you cannot be sure who they are.
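To make the stateless point concrete, here is a minimal sketch that parses a single web server log entry. It assumes the server writes the widely used Combined Log Format; the sample line, the regular expression name and the function name are made up for illustration and do not come from the slides. Note that the parsed event carries a host, a timestamp and a referrer, but nothing that ties it to a session or a known visitor.

```python
import re
from datetime import datetime

# Combined Log Format (assumed here):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return one page event as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["time"] = datetime.strptime(event["time"], "%d/%b/%Y:%H:%M:%S %z")
    return event

# Hypothetical log line, invented for this example.
sample = (
    '203.0.113.7 - - [12/Mar/2018:10:15:32 +0200] '
    '"GET /products/42 HTTP/1.1" 200 5120 '
    '"https://www.google.com/search?q=shoes" "Mozilla/5.0"'
)
event = parse_log_line(sample)
print(event["host"], event["path"], event["referrer"])
```

Every field in the result describes this one request only, which is why session and visitor identity have to be reconstructed from other clues.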
Clickstream Data Challenges
● Identifying the Visitor Origin
● Identifying the Session
● Identifying the Visitor
Identifying the Visitor Origin
● Your website may be the default page of the visitor's browser.
● A visitor may be directed to your site by a search at a portal such as Yahoo! or Google (an external
referral).
● Another common source of visitors is a browser bookmark.
● Your site may be reached as a result of a clickthrough: a deliberate click on a text or graphical link
on another site.
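The origins listed above can be approximated from the referrer field of a log entry. The sketch below is only an illustration under stated assumptions: the function name, the category labels, the short list of search portals and the `own_host` value are all hypothetical. An empty referrer cannot distinguish a bookmark from a default page or a typed URL, so those cases are grouped as "direct".

```python
from urllib.parse import urlparse

# Assumed, deliberately incomplete set of search portal names.
SEARCH_LABELS = {"google", "yahoo", "bing"}

def classify_origin(referrer, own_host="www.example-shop.com"):
    """Map a referrer string to a coarse visitor-origin category."""
    if not referrer or referrer == "-":
        # Bookmark, typed URL or browser default page: the log cannot tell which.
        return "direct"
    host = urlparse(referrer).hostname or ""
    if host == own_host:
        return "internal"
    if SEARCH_LABELS & set(host.split(".")):
        return "search"
    return "clickthrough"

print(classify_origin("https://www.google.com/search?q=shoes"))
print(classify_origin("https://partner.example.org/deals"))
print(classify_origin("-"))
```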
Identifying the Session
Condition for valid analysis: every visitor session (visit) on the website must have its own unique identity tag (session ID),
similar to a supermarket receipt number. The log does not supply one by default, so sessions have to be inferred or tagged, for example by:
● Collating time-contiguous log entries (for example, within one hour) from the same host (IP address).
● Letting the web server place a session-level cookie in the visitor's browser.
● Using HTTP's Secure Sockets Layer (SSL), which may include a login action by the visitor and an exchange of
encryption keys.
● Placing a session ID in a hidden field of each page returned to the visitor.
● Establishing a persistent cookie on the visitor's machine that is not deleted by the
browser when the session ends.
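The first approach in the list, collating time-contiguous entries from the same host, can be sketched as follows. This is one possible reading of the rule, not a definitive implementation: it assumes a new session starts whenever the gap since the host's previous entry exceeds one hour, and the function name and the `host#n` session ID format are invented for the example.

```python
from datetime import datetime, timedelta

def assign_sessions(events, max_gap=timedelta(hours=1)):
    """Tag (host, timestamp) log entries with an inferred session ID.

    A new session starts for a host when the gap since its previous
    entry exceeds max_gap (assumed to be one hour here).
    """
    last_seen = {}   # host -> timestamp of the previous entry
    counters = {}    # host -> number of sessions seen so far
    tagged = []
    for host, ts in sorted(events):
        previous = last_seen.get(host)
        if previous is None or ts - previous > max_gap:
            counters[host] = counters.get(host, 0) + 1
        last_seen[host] = ts
        tagged.append((host, ts, f"{host}#{counters[host]}"))
    return tagged

# Hypothetical log entries.
events = [
    ("203.0.113.7", datetime(2018, 3, 12, 10, 0)),
    ("203.0.113.7", datetime(2018, 3, 12, 10, 20)),
    ("203.0.113.7", datetime(2018, 3, 12, 13, 0)),
    ("198.51.100.4", datetime(2018, 3, 12, 10, 5)),
]
for row in assign_sessions(events):
    print(row)
```

The heuristic is fragile: several visitors behind one proxy share an IP address, and one visitor may change address mid-visit, which is why cookies or explicit session IDs are preferred when available.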
Identifying the Visitor
Identifying the visitor is a real problem for a site designer, webmaster or manager of the web analytics group, because:
● Web visitors want to remain anonymous and may be unwilling to provide personal identification or credit card
information, for example.
● If you demand a visitor's identity, they may not provide accurate information.
● You can't be sure which family member is visiting your site: the same computer may be used
by different people.
● You can't assume an individual is always at the same computer: they can access the same website
from an office computer, a home computer or a mobile device, and a different cookie is placed on
each machine.
Case study: Building a Web Site for a Retailer, using Data Warehouse concepts
Clickstream Dimensional Model
Portfolio list of dimensions for a web retailer could include:
Clickstream Dimensional Model
The four dimensions unique to the Clickstream:
● Page Dimension
● Event Dimension
● Session Dimension
● Referral Dimension
Page Dimension
The Page Dimension describes the page context for a web page (static or dynamic) event.
Event Dimension
The Event Dimension describes what happened on a particular page at a particular point in time.
Session Dimension
The Session Dimension provides one or more levels of diagnosis for the visitor's session as a whole. A
typical analysis question is: How many customers did not finish ordering? Where did
they stop?
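The question above can be approached by looking at the last page of every session that never reached the final ordering step. The sketch below is a rough illustration: the page function names (such as "order_confirmation"), the input layout and the function name are assumptions, and it counts every session that did not complete an order, not only those that started one.

```python
from collections import Counter

def abandonment_report(page_events, final_step="order_confirmation"):
    """Count, per last page seen, the sessions that never reached final_step.

    page_events: list of (session_id, page_function) in chronological order.
    """
    pages_by_session = {}
    for session_id, page in page_events:
        pages_by_session.setdefault(session_id, []).append(page)
    return Counter(
        pages[-1]
        for pages in pages_by_session.values()
        if final_step not in pages
    )

# Hypothetical page events for four sessions.
page_events = [
    ("s1", "home"), ("s1", "product"), ("s1", "cart"), ("s1", "order_confirmation"),
    ("s2", "home"), ("s2", "product"), ("s2", "cart"),
    ("s3", "home"), ("s3", "product"),
    ("s4", "home"), ("s4", "product"), ("s4", "cart"), ("s4", "checkout"),
]
report = abandonment_report(page_events)
print(sum(report.values()), "sessions did not finish ordering")
print(dict(report))
```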
Referral Dimension
The Referral Dimension describes how the customer arrived at the current page.
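As a rough sketch of how the four dimensions could be represented, the classes below model one row of each dimension and a page event row that points to them by key. All attribute names and sample values are illustrative assumptions, since the slides do not list the actual columns.

```python
from dataclasses import dataclass

# Attribute names are illustrative assumptions, not the deck's schema.

@dataclass(frozen=True)
class PageDim:
    page_key: int
    page_source: str      # "static" or "dynamic"
    page_function: str    # e.g. "product description", "shopping cart"

@dataclass(frozen=True)
class EventDim:
    event_key: int
    event_type: str       # e.g. "open page", "add to cart"

@dataclass(frozen=True)
class SessionDim:
    session_key: int
    session_type: str     # e.g. "browsing", "ordering"
    success_status: str   # e.g. "order completed", "abandoned"

@dataclass(frozen=True)
class ReferralDim:
    referral_key: int
    referral_type: str    # e.g. "search engine", "bookmark"
    referring_url: str

@dataclass(frozen=True)
class PageEventFact:
    """One page event, linked to the four dimensions by surrogate key."""
    page_key: int
    event_key: int
    session_key: int
    referral_key: int
    page_seconds: int

page = PageDim(1, "dynamic", "product description")
event = EventDim(1, "open page")
session = SessionDim(1, "ordering", "abandoned")
referral = ReferralDim(1, "search engine", "https://www.google.com/")
fact = PageEventFact(page.page_key, event.event_key,
                     session.session_key, referral.referral_key, page_seconds=35)
print(fact)
```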
Clickstream Session Fact Table
Designed to focus on complete visitor sessions while keeping the size under control: the table holds one row per completed session.
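A session-grain fact row can be derived by rolling up the individual page events of each session. The following sketch shows one way to do that; the measure names (session seconds, pages visited, orders placed, order amount), the input layout and the function name are assumptions made for the example. Session duration is approximated as the time between the first and last event, because the time spent on the last page is not visible in the log.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class SessionFact:
    session_id: str
    start_time: datetime
    session_seconds: int
    pages_visited: int
    orders_placed: int
    order_amount: float

def build_session_facts(page_events):
    """Aggregate page events (dicts) into one SessionFact per session.

    Each event needs the keys: session_id, time, order_amount (0 if no order).
    """
    grouped = {}
    for e in page_events:
        grouped.setdefault(e["session_id"], []).append(e)
    facts = []
    for session_id, events in grouped.items():
        times = [e["time"] for e in events]
        amounts = [e["order_amount"] for e in events if e["order_amount"] > 0]
        facts.append(SessionFact(
            session_id=session_id,
            start_time=min(times),
            session_seconds=int((max(times) - min(times)).total_seconds()),
            pages_visited=len(events),
            orders_placed=len(amounts),
            order_amount=float(sum(amounts)),
        ))
    return facts

# Hypothetical page events.
page_events = [
    {"session_id": "s1", "time": datetime(2018, 3, 12, 10, 0, 0), "order_amount": 0},
    {"session_id": "s1", "time": datetime(2018, 3, 12, 10, 2, 30), "order_amount": 0},
    {"session_id": "s1", "time": datetime(2018, 3, 12, 10, 5, 0), "order_amount": 59.9},
    {"session_id": "s2", "time": datetime(2018, 3, 12, 11, 0, 0), "order_amount": 0},
]
facts = build_session_facts(page_events)
for fact in facts:
    print(fact)
```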
Clickstream Page Event Fact Table
Aggregate Clickstream Fact Tables
Integrating Clickstream into Web Retailer’s Bus Matrix
Profitability Across Channels Including Web
Conclusion
● How many customers consulted your product information before ordering?
● How many customers looked at your product information and never ordered?
● How profitable is each channel (web sales, telesales and store sales)? Why?
● How profitable are your customer segments? Why?
● Which promotions work well on the web but do not work well in other channels? Why?
● When is your business most profitable? Why?
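Two of these questions can be illustrated with a few lines of code once clickstream and sales data sit side by side. The sketch below is only a toy example: the customer IDs, the channel names, the (channel, revenue, cost) row layout and both function names are hypothetical, and profit is simplified to revenue minus cost.

```python
from collections import defaultdict

def viewers_who_never_ordered(product_view_customers, ordering_customers):
    """Customers who looked at product information but placed no order."""
    return set(product_view_customers) - set(ordering_customers)

def profit_by_channel(sales_rows):
    """Sum revenue minus cost per channel from (channel, revenue, cost) rows."""
    totals = defaultdict(float)
    for channel, revenue, cost in sales_rows:
        totals[channel] += revenue - cost
    return dict(totals)

# Hypothetical data.
viewers = ["c1", "c2", "c3"]
buyers = ["c2"]
sales_rows = [
    ("web", 100.0, 60.0),
    ("web", 50.0, 30.0),
    ("store", 200.0, 150.0),
    ("telesales", 80.0, 90.0),
]
print(viewers_who_never_ordered(viewers, buyers))
print(profit_by_channel(sales_rows))
```

The "Why?" part of each question is not answered by such totals; it needs the dimensional context (page, session, referral, promotion) attached to the facts.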
Resources
● Book: The Data Warehouse Toolkit, Third Edition - Ralph Kimball, Margy Ross, Wiley, 2013
● https://www.safaribooksonline.com/library/view/designing-web-navigation/9780596528102/ch04.html
● https://www.c-sharpcorner.com/UploadFile/225740/introduction-of-session-in-Asp-Net/Images/Session%20in%20ASP.NET17.PNG
● http://www.vileda.com/media/wysiwyg/Webshop_AUS/FAQ/wow_r.jpg
● https://www.jasondavies.com/wordcloud/
● http://www.worldometers.info/
● https://www.slideshare.net/itsmenaguda4others/final-ppt-e-commerce-1

Electronic commerce and Data Warehouses
