Building a Graph Database in
Neo4j with Spark & Spark SQL
to Gain New Insights from Log
Data
ROBERT HRYNIEWICZ – DATA SCIENTIST/EVANGELIST,
HORTONWORKS
RACHEL POULSEN – DATA SCIENCE DIRECTOR, TIVO
Overview
• Overview of business problem
• Introduction to TiVo and TiVo data
• Challenges with using data to optimize UI navigation
• Solution
• Demo
• Next Steps
• Questions
TiVo
Background and context
• TiVo is a discovery platform for integrated entertainment
• Multiple ways to find content
TiVo Roamio Demo Video
• TiVo collects ~2 million logs a day from its boxes
• User action events, TiVo action events, inventory events
• These events are “memory-less”: they don’t know what happened prior to the event
Motivation
TiVo Business Initiatives
• Help users get to content they want faster
• Help users discover new content more easily
• Is feature X important to discovering and getting to content? (e.g. Is the guide still used to find content?)
Challenge
• Measuring a KPI for initiatives in a log stream that is “memory-less”
• Identifying events or a pattern of events that impacts the KPIs
Data Objective – “Path Analysis”
• Analysis that answers questions about the navigational paths users take to get to or from defined start and/or end points
Architecture Challenges
• Traditional data platform that was sample-based and SQL-based
• Relational databases
Technical Challenges
• Relational database challenges
  • Little flexibility in “click path” definition
  • Decisions about defining “paths” are made during the processing step
  • Many business assumptions have to be made with little insight
Solution
• Graph Database (Neo4j)
  • Relationships are first-class citizens
  • Simple abstractions
  • Enable sophisticated models
• “Path Analysis”
Prototype
One Day Graph Info
• Edges: 57K relationships
• Nodes:
  • 135 UI or “screen” nodes
  • 12 “watch content” nodes
Size reduction (~2,000x)
• 70 GB log data → 35 MB Neo4j DB (all nodes & edges)
• 1 day: Oct 1, 2015
[Figure: 70+ GB of logs reduced to UI nodes, Watch nodes, and edges totaling 35 MB]
Screens, Transitions, and Content
• Screen events
• Remote button press events
• Watch content events
Sample Graph
[Figure: sample graph with nodes Live Movie, My Shows, and TiVo Central (Home), linked by three “Switched to” edges]
Architecture Overview
What’s captured in the graph?
• Node (UI)
  • Name
  • Timestamp
• Node (Watch)
  • Type, e.g. Recorded
  • Genre, e.g. TV Show
  • Timestamp
• Edge
  • Average time
  • Total number of keys pressed
  • Key sequence, e.g. Home → Up/Down → Select/Play
  • Total number of times path taken
  • Unique number of users taking this path
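The properties listed above could be modeled as follows. This is a minimal sketch in plain Python rather than the Neo4j schema used in the talk; the class and field names are hypothetical, and deriving the key count from the key sequence is an assumption. The example values come from the sample edge on the next slide, whose direction is also assumed here.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class UINode:
    name: str        # screen name, e.g. "TiVo Central"
    timestamp: str   # date attached to the node, e.g. "1/1/2016"


@dataclass(frozen=True)
class WatchNode:
    watch_type: str  # e.g. "Recorded" or "Live"
    genre: str       # e.g. "TV Show" or "Movie"
    timestamp: str


@dataclass(frozen=True)
class Edge:
    source: Union[UINode, WatchNode]
    target: Union[UINode, WatchNode]
    average_time_s: float
    key_sequence: Tuple[str, ...]  # e.g. ("Home", "Down", "Select")
    times_taken: int
    unique_users: int

    @property
    def keys_pressed(self) -> int:
        # Assumption: the key count is the length of the key sequence.
        return len(self.key_sequence)


# Sample edge from the slides (direction TiVo Central -> Live Movie is assumed).
central = UINode(name="TiVo Central", timestamp="1/1/2016")
live_movie = WatchNode(watch_type="Live", genre="Movie", timestamp="1/1/2016")
edge = Edge(central, live_movie, 0.4, ("Home", "Down", "Select"), 50, 27)
print(edge.keys_pressed, edge.times_taken, edge.unique_users)
```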
What’s captured in the graph?
Example:
• Node: TiVo Central, 1/1/2016
• Node: Live Movie, 1/1/2016
• Edge between them:
  • 0.4s average time
  • 3 keys pressed
  • Home-Down-Select
  • 50 times path used
  • 27 unique users
Raw Log File example
…
1444809715713072|Watch|live|WBINDT|MV|506|EP019641150097...
1444809715812909|Key|HOME
1444880816123454|UI|TivoCentralScreen
1444809716234553|Key|DOWN
1444809716354363|Key|SELECT
1444809716518701|Trick3|PLAY|116|1|100|-1
1444809719888072|Key|PLAY
1444809719889072|Trick3|PLAY|119|1|100|-1
1444809726966880|Watch|rec|WFXTDT|SH|508|...
…
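A minimal sketch of how lines in this pipe-delimited format might be parsed and filtered down to Watch, UI, and Key events. It uses plain Python rather than the Spark job from the talk. The names `Event`, `parse_line`, and `filter_events` are hypothetical, and the only layout assumed is that the first field is a numeric timestamp and the second is the event type. The timestamp is kept as a raw integer because the slides do not state its unit.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Assumption: these event-type tokens match the ones in the sample log.
KEPT_TYPES = {"Watch", "UI", "Key"}


class Event(NamedTuple):
    ts: int                  # first field, kept as a raw integer
    kind: str                # second field, e.g. "Watch", "UI", "Key"
    fields: Tuple[str, ...]  # remaining fields, left uninterpreted


def parse_line(line: str) -> Optional[Event]:
    """Return an Event, or None for lines that do not look like log records."""
    parts = line.strip().split("|")
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    return Event(int(parts[0]), parts[1], tuple(parts[2:]))


def filter_events(lines: Iterable[str]) -> List[Event]:
    """Keep only Watch, UI, and Key events, in input order."""
    parsed = (parse_line(line) for line in lines)
    return [e for e in parsed if e is not None and e.kind in KEPT_TYPES]


sample = [
    "1444809715812909|Key|HOME",
    "1444809716518701|Trick3|PLAY|116|1|100|-1",
    "1444809726966880|Watch|rec|WFXTDT|SH|508|...",
    "…",
]
for event in filter_events(sample):
    print(event.kind, event.fields[0])
```

The `Trick3` record and the elision line are dropped; everything after the event type is passed through untouched so that later steps can decide how to interpret it.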
Filtered Log
...
Watch: LIVE MOVIE
Key: HOME
UI: TIVO HOME
Key: DOWN
Key: SELECT
Key: PLAY
Watch: REC SHOW
…
[Figure annotations: Watch and UI lines are marked Node; the Key lines between them are marked Edge; label: Same day]
Algorithm Overview (1 of 2)
1. Filter for desired events
• Remove non-Screen, non-Watch, non-Key events
2. Session-ize and order logs to reflect Screen/Watch/Edge events
3. Define display for Key Press events - two formats
• Normal: SELECT & UP x 2 & GUIDE & SELECT
• Compact: TIVO & 9 KEYS & SELECT
4. Generate an Edge if the transition time is below the maximum set by stakeholders (e.g. 5 min)
• For all logs, find the following sequence:
    Node X - timestamp x (start time)
    key A
    key B
    key C
    Node Y - timestamp y (end time)
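Steps 3 and 4 could be sketched as below, in plain Python rather than Spark. The input is assumed to be one session's events, already filtered and ordered, as `(timestamp_seconds, kind, name)` tuples. All function names are hypothetical. Two behaviors are assumptions: a transition at or over the limit is discarded with the later node becoming the new start, and the compact format counts the keys between the first and last key.

```python
from itertools import groupby
from typing import Iterable, List, NamedTuple, Sequence, Tuple

MAX_TRANSITION_S = 5 * 60  # example limit from the slide (5 min)


class Transition(NamedTuple):
    source: str
    target: str
    keys: Tuple[str, ...]
    seconds: int


def build_transitions(events: Iterable[Tuple[int, str, str]],
                      max_s: int = MAX_TRANSITION_S) -> List[Transition]:
    """Pair consecutive node events and collect the key presses between them."""
    out: List[Transition] = []
    start = None           # (timestamp, name) of the most recent node
    keys: List[str] = []
    for ts, kind, name in events:
        if kind == "Key":
            if start is not None:  # keys before the first node are ignored
                keys.append(name)
            continue
        if start is not None and ts - start[0] < max_s:
            out.append(Transition(start[1], name, tuple(keys), ts - start[0]))
        start, keys = (ts, name), []
    return out


def format_normal(keys: Sequence[str]) -> str:
    """Normal display: collapse consecutive repeats, e.g. 'UP x 2'."""
    parts = []
    for key, run in groupby(keys):
        n = len(list(run))
        parts.append(key if n == 1 else f"{key} x {n}")
    return " & ".join(parts)


def format_compact(keys: Sequence[str]) -> str:
    """Compact display: first key, count of keys in between, last key."""
    if len(keys) <= 2:
        return " & ".join(keys)
    return f"{keys[0]} & {len(keys) - 2} KEYS & {keys[-1]}"


# The filtered log from the earlier slide, with made-up whole-second timestamps.
events = [
    (0, "Watch", "LIVE MOVIE"), (1, "Key", "HOME"), (2, "UI", "TIVO HOME"),
    (3, "Key", "DOWN"), (4, "Key", "SELECT"), (5, "Key", "PLAY"),
    (12, "Watch", "REC SHOW"),
]
for t in build_transitions(events):
    print(t.source, "->", t.target, "|", format_normal(t.keys), "|", t.seconds)
```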
Algorithm Overview (2 of 2)
5. For each unique node-edge-node calculate:
• Average transition time
• Number of transitions
• Number of unique transitions
• Number of keys pressed
• Key sequence (normal or compact)
6. Export results to CSV files
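Steps 5 and 6 might look like the following, again in plain Python instead of Spark SQL. The sketch assumes each transition arrives as `(user_id, source, target, key_sequence, seconds)` and that a unique node-edge-node is identified by source, key sequence, and target. It reads "unique transitions" as the number of distinct users, matching the "unique users" edge property shown earlier; that reading is an assumption, as are the function names and CSV column names.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable, List, TextIO, Tuple

FIELDNAMES = ["source", "target", "key_sequence", "avg_seconds",
              "transitions", "unique_users", "keys_pressed"]

TransitionRow = Tuple[str, str, str, Tuple[str, ...], float]


def aggregate(transitions: Iterable[TransitionRow]) -> List[Dict[str, object]]:
    """Group by (source, key sequence, target) and compute per-edge metrics."""
    groups = defaultdict(list)
    for user, source, target, keys, seconds in transitions:
        groups[(source, keys, target)].append((user, seconds))
    rows = []
    for (source, keys, target), hits in sorted(groups.items()):
        rows.append({
            "source": source,
            "target": target,
            "key_sequence": "-".join(keys),
            "avg_seconds": sum(s for _, s in hits) / len(hits),
            "transitions": len(hits),
            "unique_users": len({u for u, _ in hits}),
            "keys_pressed": len(keys),
        })
    return rows


def write_csv(rows: List[Dict[str, object]], out: TextIO) -> None:
    """Write one CSV row per aggregated edge, with a header line."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up transitions for illustration only.
transitions = [
    ("u1", "TIVO HOME", "LIVE MOVIE", ("HOME", "DOWN", "SELECT"), 2),
    ("u2", "TIVO HOME", "LIVE MOVIE", ("HOME", "DOWN", "SELECT"), 4),
    ("u1", "TIVO HOME", "LIVE MOVIE", ("HOME", "DOWN", "SELECT"), 6),
    ("u3", "GUIDE", "LIVE MOVIE", ("SELECT",), 1),
]
rows = aggregate(transitions)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Writing to any text file object keeps the export step independent of where the CSV ends up; a real file opened with `newline=""` would work the same way as the in-memory buffer.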
DEMO
• What is the most popular path people take to get to content?
  • live vs. recorded
• What percent of total paths are most popular?
• What path is most popular? Overall? Unique?
• What app is most popular?
• What percent of total paths involve the Guide screen?
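The demo queries themselves are not reproduced in the slides. As a rough illustration of what two of these questions compute, the sketch below works over an in-memory list of aggregated edges in plain Python. The row layout, the function names, and every number in `edges` are hypothetical and are not TiVo results.

```python
from typing import Dict, List

EdgeRow = Dict[str, object]

# Hypothetical aggregated edges; the counts are invented for illustration.
edges: List[EdgeRow] = [
    {"source": "TIVO HOME", "target": "LIVE MOVIE", "transitions": 50, "unique_users": 27},
    {"source": "GUIDE", "target": "LIVE MOVIE", "transitions": 30, "unique_users": 29},
    {"source": "MY SHOWS", "target": "REC SHOW", "transitions": 20, "unique_users": 12},
]


def most_popular(rows: List[EdgeRow], metric: str = "transitions") -> EdgeRow:
    """Return the edge with the highest value of the chosen metric."""
    return max(rows, key=lambda row: row[metric])


def percent_involving(rows: List[EdgeRow], screen: str) -> float:
    """Percent of all transitions that start or end at the given screen."""
    total = sum(row["transitions"] for row in rows)
    if total == 0:
        return 0.0
    hits = sum(row["transitions"] for row in rows
               if screen in (row["source"], row["target"]))
    return 100.0 * hits / total


overall = most_popular(edges)                         # by total traversals
by_users = most_popular(edges, metric="unique_users")  # by distinct users
print(overall["source"], by_users["source"], percent_involving(edges, "GUIDE"))
```

Ranking by `transitions` versus `unique_users` can give different answers, which is the distinction behind the "Overall? Unique?" question.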
Business Advantages
• Measure KPIs for time to content and content discovery
• Optimize KPIs (understanding user behavior that impacts the KPIs)
• Enhance A/B testing by helping to answer “why?”
• Simplify user experience across products
• Increase engagement with new content
• Understand how feature usage interacts, rather than treating each feature as a mutually exclusive experience
Future Work
• Deploy to production: multi-day queries
• Add relationships and nodes for feature usage
• Classify paths (“discovery” or “known destination”)
• Exploratory analysis
Thanks!
• @RobHryniewicz
• @Bayesbabe

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dashnarutouzumaki53779
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Visualising and forecasting stocks using Dash
Visualising and forecasting stocks using DashVisualising and forecasting stocks using Dash
Visualising and forecasting stocks using Dash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insights from Log Data

  • 1. Building a Graph Database in Neo4j with Spark & Spark SQL to Gain New Insights from Log Data
      Robert Hryniewicz – Data Scientist/Evangelist, Hortonworks
      Rachel Poulsen – Data Science Director, TiVo
  • 2. Overview
      - Overview of business problem
      - Introduction to TiVo and TiVo data
      - Challenges with using data to optimize UI navigation
      - Solution
      - Demo
      - Next Steps
      - Questions
  • 3. TiVo: Background and context
      - TiVo is a discovery platform for integrated entertainment
      - Multiple ways to find content
      - TiVo Roamio Demo Video
      - TiVo collects ~2 million logs a day from its boxes
      - User action events, TiVo action events, inventory events
      - These events are “memory-less”: they carry no record of what happened before the event
  • 8. Motivation
      TiVo Business Initiatives
        - Help users get to the content they want faster
        - Help users discover new content more easily
        - Is feature X important to discovering and getting to content? (e.g., is the guide still used to find content?)
      Challenge
        - Measuring a KPI for initiatives in a log stream that is “memory-less”
        - Identifying events or patterns of events that impact the KPIs
      Data Objective: “Path Analysis”
        - Analysis that answers questions about the navigational paths users take to get to or from defined start and/or end points
  • 9. Architecture Challenges
      - Traditional data platform that was sample-based and SQL-based
      - Relational databases
  • 10. Technical Challenges
      - Relational database challenges
          - Little flexibility in “click path” definition
          - Decisions about defining “paths” are made during the processing step
          - Many business assumptions have to be made with little insight
  • 12. Solution
      - Graph Database (Neo4j)
      - Relationships are first-class citizens
      - Simple abstractions
      - Enable sophisticated models
      - “Path Analysis”
  • 13. Prototype: One Day Graph Info
      - Edges: 57K relationships
      - Nodes:
          - 135 UI or “screen” nodes
          - 12 “watch content” nodes
  • 14. Size reduction (2K times)
      - 70 GB log data → 35 MB Neo4j DB (all nodes & edges)
      - 1 day → Oct 1, 2015
      - Diagram: LOGS (70+ GB) → UI nodes, Watch nodes, Edges (35 MB)
  • 15. Screens, Transitions, and Content
      - Screen events
      - Remote button press events
      - Watch content events
  • 16. Sample Graph
      - Nodes: Live Movie, My Shows, TiVo Central (Home)
      - Edges: three “Switched to” relationships
  • 18. What’s captured in the graph?
      - Node (UI)
          - Name
          - Timestamp
      - Node (Watch)
          - Type, e.g. Recorded
          - Genre, e.g. TV Show
          - Timestamp
      - Edge
          - Average time
          - Total number of keys pressed
          - Key sequence, e.g. Home → Up/Down → Select/Play
          - Total number of times path taken
          - Number of unique users taking this path
  • 19. What’s captured in the graph? (example)
      - Node: TiVo Central, 1/1/2016
      - Edge: 0.4 s average time; 3 keys pressed (Home-Down-Select); path used 50 times; 27 unique users
      - Node: Live Movie, 1/1/2016
  • 20. Raw Log File example
      …
      1444809715713072|Watch|live|WBINDT|MV|506|EP019641150097...
      1444809715812909|Key|HOME
      1444880816123454|UI|TivoCentralScreen
      1444809716234553|Key|DOWN
      1444809716354363|Key|SELECT
      1444809716518701|Trick3|PLAY|116|1|100|-1
      1444809719888072|Key|PLAY
      1444809719889072|Trick3|PLAY|119|1|100|-1
      1444809726966880|Watch|rec|WFXTDT|SH|508|...
      …
  • 21. Filtered Log (same day)
      ...
      Watch: LIVE MOVIE    (Node)
      Key: HOME            (Edge)
      UI: TIVO HOME        (Node)
      Key: DOWN            (Edge, together with the next two key presses)
      Key: SELECT
      Key: PLAY
      Watch: REC SHOW      (Node)
      …
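
The step from the raw log on slide 20 to the filtered log on slide 21 could be sketched as follows. This is a plain-Python illustration, not the Spark job from the talk. The names `Event`, `parse_line` and `filter_log` are hypothetical, and reading the leading field as microseconds since the Unix epoch is an assumption based on its 16-digit length.

```python
from typing import NamedTuple, Optional

# Event types kept by the filter; the type names follow the sample log on slide 20.
KEPT_TYPES = {"UI", "Watch", "Key"}


class Event(NamedTuple):
    timestamp_us: int  # assumed unit: microseconds since the Unix epoch
    kind: str          # "UI", "Watch" or "Key"
    fields: tuple      # remaining pipe-separated fields, left untouched


def parse_line(line: str) -> Optional[Event]:
    """Parse one pipe-delimited log line; return None for lines to drop."""
    parts = line.strip().split("|")
    if len(parts) < 2:
        return None
    try:
        timestamp = int(parts[0])
    except ValueError:
        return None  # not a log record (for example an elision marker)
    if parts[1] not in KEPT_TYPES:
        return None  # e.g. Trick3 events are filtered out
    return Event(timestamp, parts[1], tuple(parts[2:]))


def filter_log(lines):
    """Keep Screen (UI), Watch and Key events, ordered by timestamp."""
    events = [e for e in map(parse_line, lines) if e is not None]
    return sorted(events, key=lambda e: e.timestamp_us)
```

Sorting by timestamp here stands in for the ordering part of the "session-ize and order" step; grouping events per box or per session is left out of the sketch.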
  • 22. Algorithm Overview (1 of 2)
      1. Filter for desired events
          - Remove non-Screen, non-Watch, non-Key events
      2. Session-ize and order logs to reflect Screen/Watch/Edge events
      3. Define display for Key Press events in two formats
          - Normal: SELECT & UP x 2 & GUIDE & SELECT
          - Compact: TIVO & 9 KEYS & SELECT
      4. Generate an Edge if the transition time is below the maximum set by stakeholders (e.g. 5 min)
          - For all logs, find the following sequence:
              Node X - timestamp x (start time)
              key A
              key B
              key C
              Node Y - timestamp y (end time)
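
Steps 3 and 4 above can be illustrated with a small plain-Python sketch (the talk used Spark; this is not that code). The function names are hypothetical, events are assumed to be already filtered, time-ordered and limited to one session, and the reading of the compact format as "first key, count of keys in between, last key" is an assumption drawn from the single example on the slide.

```python
from itertools import groupby

# The 5-minute example threshold from the slide, in microseconds (assumed unit).
MAX_TRANSITION_US = 5 * 60 * 1_000_000


def format_normal(keys):
    """Join keys with ' & ', collapsing consecutive repeats: UP, UP -> 'UP x 2'."""
    parts = []
    for key, run in groupby(keys):
        n = len(list(run))
        parts.append(key if n == 1 else f"{key} x {n}")
    return " & ".join(parts)


def format_compact(keys):
    """Keep the first and last key; summarize those in between as 'N KEYS'."""
    if len(keys) <= 2:
        return " & ".join(keys)
    return f"{keys[0]} & {len(keys) - 2} KEYS & {keys[-1]}"


def build_edges(events, max_us=MAX_TRANSITION_US):
    """Return (source, target, keys, elapsed_us) for each Node X -> keys -> Node Y.

    events: time-ordered (timestamp_us, kind, name) tuples for one session,
    where kind is "UI", "Watch" or "Key".
    """
    edges = []
    prev = None  # last node seen, as (timestamp_us, name)
    keys = []    # key presses since that node
    for ts, kind, name in events:
        if kind == "Key":
            if prev is not None:
                keys.append(name)
            continue
        # Any non-Key event is a node (a UI screen or a Watch event).
        if prev is not None and ts - prev[0] < max_us:
            edges.append((prev[1], name, tuple(keys), ts - prev[0]))
        prev, keys = (ts, name), []
    return edges
```

A transition that takes longer than the threshold produces no edge, but the later node still becomes the start of the next candidate transition.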
  • 23. Algorithm Overview (2 of 2)
      5. For each unique node-edge-node, calculate:
          1. Average transition time
          2. Number of transitions
          3. Number of unique transitions
          4. Number of keys pressed
          5. Key sequence (normal or compact)
      6. Export results to CSV files
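
The aggregation and export in steps 5 and 6 might look like the following in plain Python (again a sketch, not the Spark SQL used in the talk). The column names and the `aggregate` and `to_csv` helpers are hypothetical. "Number of unique transitions" is interpreted here as the number of distinct users, following the edge properties listed on slide 18; that reading is an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names for the exported edge file.
FIELDS = ["source", "target", "avg_time_s", "transitions",
          "unique_users", "keys_pressed", "key_sequence"]


def aggregate(transitions):
    """Summarize transitions per unique node-edge-node combination.

    transitions: iterable of (user_id, source, target, keys, elapsed_s).
    """
    groups = defaultdict(list)
    for user, src, dst, keys, elapsed_s in transitions:
        groups[(src, tuple(keys), dst)].append((user, elapsed_s))

    rows = []
    for (src, keys, dst), hits in sorted(groups.items()):
        rows.append({
            "source": src,
            "target": dst,
            "avg_time_s": sum(t for _, t in hits) / len(hits),
            "transitions": len(hits),
            "unique_users": len({u for u, _ in hits}),
            "keys_pressed": len(keys),
            "key_sequence": "-".join(keys),
        })
    return rows


def to_csv(rows):
    """Render the aggregated rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Grouping on the key sequence as well as the two nodes means that two different button sequences between the same pair of screens are counted as two separate paths.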
  • 24. DEMO
      - What is the most popular path people take to get to content?
          - Live vs. recorded
      - What percent of total paths are most popular?
      - What path is most popular? Overall? Unique?
      - What app is most popular?
      - What percent of total paths involve the Guide screen?
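
The queries used in the demo are not part of this transcript. As a rough illustration of what two of the questions above compute, here is a plain-Python sketch over aggregated edge rows. The row layout (`source`, `target`, `transitions`) and both function names are hypothetical, and any input data passed to them would be made up for illustration.

```python
def most_popular(rows, n=1):
    """Return the top-n paths by number of transitions."""
    return sorted(rows, key=lambda r: r["transitions"], reverse=True)[:n]


def share_involving(rows, screen):
    """Percent of all transitions whose source or target is `screen`."""
    total = sum(r["transitions"] for r in rows)
    if total == 0:
        return 0.0
    hits = sum(r["transitions"] for r in rows
               if screen in (r["source"], r["target"]))
    return 100.0 * hits / total
```

Ranking by a distinct-user count instead of the transition count would answer the "Unique?" variant of the popularity question.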
  • 25. Business Advantages
      - Measure KPIs for time to content and content discovery
      - Optimize KPIs (by understanding the user behavior that impacts them)
      - Enhance A/B testing by helping to answer “why?”
      - Simplify user experience across products
      - Increase engagement with new content
      - Understand feature usage as interactions, not only as mutually exclusive experiences
  • 26. Future Work
      - Deploy to production (multi-day queries)
      - Add relationships and nodes for feature usage
      - Classify paths (“discovery” or “known destination”)
      - Exploratory analysis

Editor's Notes

  1. What is “Path Analysis”? Analysis that answers questions about the navigational paths users take to get to or from defined start and/or end points.
  2. Expensive C code. No flexibility in “click path” definition (decisions made during processing, constrained number of “paths,” etc.).
  3. Why Neo4j: the team had prior exposure to the product; it is a mature product; Cypher is an expressive graph query language; analysts, rather than full-blown data scientists/statisticians, can run queries.
  4. Robert to clean up
  5. What is the most popular path people take to get to content (live vs. recorded content)? What percent of total paths are most popular? List top 5 paths with average time and percentage. What percent of total users are most popular? List top 5 paths with average time and percentage. In-degree connectivity.
  6. Evaluate whether consumers view programs differently depending on how they navigate to each program. Examine viewing/navigation options: via program guide; via Season Pass/recorded programs; by tuning directly to a station from set off or channel change.