Churn Prediction: Understanding your customers and taking action.
@datoinc
#churnPredictionDato
Hi! My name is …
Antoine Atallah
Principal Data Scientist
Dato toolkits team, novice powerlifter, Hawks fan.
Hi! My name is …
Karla Vega
Customer Success Manager
Aerospace engineer, dog trainer, running fan
@vegakp
About Us!
Questions?
• (Now) We love questions, so feel free to interrupt!
• (Later) Email us at antoine@dato.com or vega@dato.com
Extracting Insights from Data
Data Science Workflow
Ingest → Transform → Model → Insight
Log Journey
Lots of data → Insights → Profits
Mining Log Data
Logs are everywhere!
Different kinds of logs
• Raw logs
  • Each row contains an individual event for a user at a given time
• Aggregated logs
  • Each row contains the interactions for a user over a period of time
  • For instance, user activity in one-month rollups
  • This is the traditional data output of Business Intelligence infrastructures
• User side-data
  • Information about each user (demographics, etc.)
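To make the difference between raw and aggregated logs concrete, here is a minimal sketch that rolls raw events up into one row per user and month. The log layout and the `monthly_rollup` helper are assumptions for illustration, not part of the Dato toolkit.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw log: one row per event (user_id, timestamp, event type).
raw_log = [
    ("u1", datetime(2016, 1, 3), "visit"),
    ("u1", datetime(2016, 1, 9), "purchase"),
    ("u1", datetime(2016, 2, 1), "visit"),
    ("u2", datetime(2016, 1, 20), "visit"),
]

def monthly_rollup(events):
    """Count events per (user, year, month): one aggregated row per user and month."""
    counts = Counter()
    for user_id, ts, _event in events:
        counts[(user_id, ts.year, ts.month)] += 1
    return dict(counts)

aggregated = monthly_rollup(raw_log)
```

Each key of `aggregated` plays the role of one row in a one-month rollup.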
Logs contain usage patterns
[Figure: log timeline with a Small Purchase and a Large Purchase marked]
Different kinds of usage patterns
Kinds of Patterns
Visits, Purchases, Events Frequency
Visits, Purchase Quantity
Changes in value over time
Change in time between visits, purchases, events
Time since last action or visit
Demographic information (age, gender, …)
Types of items purchased (seasonality, quality)
…
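A few of the patterns listed above can be derived from nothing more than a user's event timestamps. The sketch below is one possible way to do it; the function name and feature names are hypothetical.

```python
from datetime import datetime

def usage_patterns(timestamps, now):
    """Derive a few of the listed patterns from one user's event times."""
    ts = sorted(timestamps)
    # Days between consecutive events.
    gaps = [(b - a).days for a, b in zip(ts, ts[1:])]
    return {
        "event_count": len(ts),                                  # frequency
        "days_since_last": (now - ts[-1]).days if ts else None,  # time since last action
        # Change in time between visits: last gap minus first gap.
        "change_in_gap": gaps[-1] - gaps[0] if len(gaps) >= 2 else None,
    }

visits = [datetime(2016, 1, 1), datetime(2016, 1, 3), datetime(2016, 1, 10)]
patterns = usage_patterns(visits, now=datetime(2016, 1, 20))
```

A growing gap between visits (a positive `change_in_gap`) is the kind of signal a churn model might pick up on.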
Retaining customers/visitors is important
• Acquiring a new customer costs more than retaining an existing one
• Gives a pulse on the health of the business
• Can help take preventive actions and act before it’s too late
• Can help create more effective marketing campaigns
What is Churn Prediction
What is Churn
• Churn prediction estimates the probability that a user will stop coming back (churn)
• Works by observing past user behavior
Churn Prediction
[Figure: timeline of daily activity logs for Jan 2015 – April 2016, with Apr 2016 marked]
More Precisely
• Churn prediction estimates the probability that a user will stop coming back (churn)
• Works by observing past user behavior
• We define a time boundary at which we want to predict churn
• Any user with no activity in the N days (default 30) after the boundary is considered to have churned
• The M days (default 60) before the boundary are used to generate features
• Multiple boundaries can be specified to extract more patterns
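The boundary logic above can be sketched in a few lines. This is an illustration only, not the toolkit's implementation: the function name is hypothetical, the single feature is a plain activity count, and it assumes only users active in the lookback window get a row.

```python
from datetime import datetime, timedelta

def label_and_features(events, boundary, churn_days=30, lookback_days=60):
    """events: list of (user_id, timestamp). Returns {user_id: (activity_count, churned)}."""
    lookback_start = boundary - timedelta(days=lookback_days)
    churn_end = boundary + timedelta(days=churn_days)
    counts, returned = {}, set()
    for user_id, ts in events:
        if lookback_start <= ts < boundary:
            # The M days before the boundary feed the features.
            counts[user_id] = counts.get(user_id, 0) + 1
        elif boundary <= ts < churn_end:
            # Activity in the N days after the boundary means "did not churn".
            returned.add(user_id)
    # Assumption: only users seen in the lookback window are labeled.
    return {u: (n, u not in returned) for u, n in counts.items()}

events = [
    ("u1", datetime(2015, 12, 20)), ("u1", datetime(2016, 1, 5)),
    ("u2", datetime(2015, 12, 1)), ("u2", datetime(2015, 12, 15)),
]
rows = label_and_features(events, boundary=datetime(2016, 1, 1))
```

Here `u1` comes back on January 5th and is labeled as retained, while `u2` shows no activity after the boundary and is labeled as churned.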
Feature and Label Generation
[Figure: feature and label generation on daily activity logs for Jan 2015 – April 2016, with Apr 2016 marked]
How to use Churn Prediction
Choosing Time Boundaries
• Time Boundaries are moments in the past used to observe user behavior and generate labels
• The time before the boundary is used to observe patterns
• The time after the boundary is used to generate labels
Boundaries and their meaning:
• January 1st 2016: uses the patterns from before January 1st 2016 to predict user churn after January 1st 2016
• January 1st 2016 and December 1st 2015: uses the patterns from before January 1st 2016 to predict user churn after January 1st 2016, and the patterns from before December 1st 2015 to predict user churn after December 1st 2015. This analyzes more patterns and builds a richer model
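One way to see why extra boundaries give a richer model: each boundary contributes its own set of labeled rows, so the same user can appear several times with different outcomes. The sketch below assumes a hypothetical `training_rows` helper and, for brevity, treats all history before a boundary as the observation window.

```python
from datetime import datetime, timedelta

def training_rows(events, boundaries, churn_days=30):
    """One (user_id, boundary, churned) row per boundary for each user active before it."""
    rows = []
    for boundary in boundaries:
        churn_end = boundary + timedelta(days=churn_days)
        seen_before = {u for u, ts in events if ts < boundary}
        seen_after = {u for u, ts in events if boundary <= ts < churn_end}
        for user_id in sorted(seen_before):
            rows.append((user_id, boundary, user_id not in seen_after))
    return rows

events = [("u1", datetime(2015, 11, 10)), ("u1", datetime(2015, 12, 5)),
          ("u2", datetime(2015, 11, 20))]
rows = training_rows(events, [datetime(2015, 12, 1), datetime(2016, 1, 1)])
```

With two boundaries the two users yield four training rows instead of two, and `u1` is seen both as retained (December boundary) and as churned (January boundary).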
Choosing a Churn Period
• The Churn Period is how far into the future we want to predict
• For training purposes, users who have not been active for this amount of time after a Time Boundary are considered to have churned
Churn Period and what it predicts:
• 7 Days: probability that each user leaves within the next week
• 30 Days: probability that each user leaves within the next month
• 3 Months: probability that each user leaves within the next quarter
Choosing Lookback Periods
• Lookback Periods define how far into the past we look to extract user behavior patterns (features)
• Multiple lookback periods can be provided to generate richer features
Lookback Periods and the resulting features:
• 3 Days: uses the 3 days before each Time Boundary to extract usage patterns
• 30 Days: uses the 30 days before each Time Boundary to extract usage patterns
• 7 Days, 1 Month: uses the week and the month before each Time Boundary to extract usage patterns
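As a rough illustration of how several lookback periods turn into several features, the sketch below counts one user's events in each window ending at the boundary. The helper and the feature names are hypothetical.

```python
from datetime import datetime, timedelta

def lookback_counts(timestamps, boundary, lookback_days=(7, 30)):
    """Count one user's events in each lookback window ending at the boundary."""
    features = {}
    for days in lookback_days:
        start = boundary - timedelta(days=days)
        features[f"events_last_{days}d"] = sum(start <= ts < boundary for ts in timestamps)
    return features

visits = [datetime(2015, 12, 5), datetime(2015, 12, 20), datetime(2015, 12, 28)]
features = lookback_counts(visits, boundary=datetime(2016, 1, 1))
```

Comparing the short window with the long one lets a model tell a user who is slowing down from one who was never very active.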
Choosing appropriate parameters
• If we want to predict churn for this quarter, we might set:
  • Churn Period to 3 Months (how far in the future we predict)
  • Lookback Periods to 2, 4, 8, 16 weeks (how far in the past to extract patterns from)
  • Time Boundaries to January 1st 2016, January 1st 2015, January 1st 2014
• Notice that we chose the same quarter each year for the Time Boundaries
• Choosing past data with the same underlying behavior will provide more accurate predictions
Choosing appropriate parameters
• If we want to predict churn for this month, we might set:
  • Churn Period to 1 Month (how far in the future we predict)
  • Lookback Periods to 7, 14, 30, 60 days (how far in the past to extract patterns from)
  • Time Boundaries to January 1st 2016, October 1st 2015, September 1st 2015, August 1st 2015
• In this case, we intentionally skipped over November and December 2015, since the holiday season may exhibit very different behavior
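The monthly example above, with holiday months left out, could be generated rather than typed by hand. This is a small sketch with a hypothetical `monthly_boundaries` helper; which months count as unrepresentative is an assumption to adapt to your own business.

```python
from datetime import date

def monthly_boundaries(start, end, skip_months=(11, 12)):
    """First-of-month boundaries from start to end inclusive, omitting skipped months."""
    boundaries = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        if month not in skip_months:
            boundaries.append(date(year, month, 1))
        # Advance one month, rolling over the year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return boundaries

boundaries = monthly_boundaries(date(2015, 8, 1), date(2016, 1, 1))
```

The result contains August, September, and October 2015 plus January 2016, matching the boundaries chosen on the slide.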
Key Takeaways
• Label generation is greatly simplified: choose a Churn Period
• Feature generation is greatly simplified: choose Lookback Periods and Time Boundaries
• Choose time frames that are representative of the period for which you want to predict churn
Interpreting the Results
Output of the model
• The Churn Prediction model returns a probability of churn for
each provided user
Using the Probabilities
[Figure: number of users plotted against churn probability]
• High probability of churn: might be hard to rescue these users
• Mid probability of churn: we should try to rescue these users
• Low probability of churn: send a thank-you note!
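The three groups can be expressed as a simple bucketing of the model's output. The 0.3 and 0.7 cut-offs below are arbitrary placeholders, not values from the talk; in practice they would be chosen from the shape of your own probability distribution.

```python
def churn_segment(probability, low=0.3, high=0.7):
    """Map a churn probability to a coarse segment (thresholds are illustrative)."""
    if probability >= high:
        return "high"
    if probability >= low:
        return "mid"
    return "low"

# Hypothetical model output: user id -> churn probability.
scores = {"u1": 0.92, "u2": 0.55, "u3": 0.08}
segments = {user: churn_segment(p) for user, p in scores.items()}
```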
Using the Probabilities
• We can target different users, using their probability of Churn as
a guideline
• Different marketing messages can be created based on the
probability of Churn
• The highest-probability users are not always the best to target,
depending on the cost of the action to take to retain them
• Gives a new dimension on the user base
• Can be used to monitor the health of the user population over
time
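One way to reason about why the highest-probability users are not always the best targets is a back-of-the-envelope expected-gain calculation. Everything in this sketch is an assumption for illustration: the formula, the customer value, the action cost, and especially the `save_rate` (the chance that the action actually retains a user who would otherwise churn).

```python
def expected_gain(p_churn, customer_value, action_cost, save_rate):
    """Expected net gain of a retention action for one user (illustrative formula)."""
    return p_churn * save_rate * customer_value - action_cost

# Hypothetical users: (user_id, churn probability, assumed save rate).
users = [("u1", 0.95, 0.05), ("u2", 0.60, 0.40)]
ranked = sorted(
    users,
    key=lambda u: expected_gain(u[1], customer_value=100.0, action_cost=5.0, save_rate=u[2]),
    reverse=True,
)
```

Under these made-up numbers the mid-probability user `u2` ranks first, because the near-certain churner `u1` is unlikely to be saved and the action costs more than it is expected to return.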
Demo
Summary
Log Data Mining ≠ Rocket Science
• Define time parameters to identify patterns and generate labels.
• Extract predictions to gain insights about your user population.
• Take action and help grow your healthy business.
Churn Prediction
SELECT questions FROM audience
WHERE difficulty = 'Easy';
Thanks!

Webinar - Pattern Mining Log Data - Vega (20160426)


Editor's Notes

  1. Not sure if demo first or later?