Webinar - Pattern Mining Log Data - Vega (20160426)

Presented by Karla Vega


  1. Churn Prediction: Understanding your customers and taking action. @datoinc #churnPredictionDato
  2. Hi! My name is … Antoine Atallah, Principal Data Scientist. Dato toolkits team, novice powerlifter, Hawks fan.
  3. Hi! My name is … Karla Vega, Customer Success Manager. Aerospace engineer, dog trainer, running fan. @vegakp
  4. About Us!
  5. Questions?
     • (Now) We love questions. Feel free to interrupt!
     • (Later) Email us: antoine@dato.com, vega@dato.com
  6. Extracting Insights from Data
  7. Data Science Workflow: Ingest → Transform → Model → Insight
  8. Log Journey: Lots of data → Insights → Profits
  9. Mining Log Data
  10. Logs are everywhere!
  11. Different kinds of logs
      • Raw logs
        • Each row contains an individual event for a user at a given time
      • Aggregated logs
        • Each row contains the interactions for a user over a period of time
        • For instance, user activity in one-month rollups
        • This is the traditional data output of Business Intelligence infrastructures
      • User side-data
        • Information about each user (demographics, etc.)
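The difference between raw and aggregated logs on slide 11 can be illustrated with a small sketch. The log rows, user names, and the `monthly_rollup` helper are hypothetical and only show one plausible way to turn per-event rows into one-month rollups.

```python
from collections import Counter
from datetime import date

# Hypothetical raw log: one row per individual event (user, event date).
raw_log = [
    ("alice", date(2016, 1, 3)),
    ("alice", date(2016, 1, 17)),
    ("alice", date(2016, 2, 2)),
    ("bob", date(2016, 1, 9)),
]


def monthly_rollup(events):
    """Aggregate raw events into counts per (user, year, month)."""
    return Counter((user, day.year, day.month) for user, day in events)


# Aggregated log: one entry per user per month.
rollup = monthly_rollup(raw_log)
for (user, year, month), n_events in sorted(rollup.items()):
    print(user, f"{year}-{month:02d}", n_events)
```

The raw form keeps the timing of every event, which the time-based patterns on the following slides depend on; the rollup is more compact but loses that detail.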
  12. Logs contain usage patterns (figure labels: Small Purchase, Large Purchase)
  13. Different kinds of usage patterns
      • Visit, purchase, and event frequency
      • Visit and purchase quantity
      • Changes in value over time
      • Change in time between visits, purchases, and events
      • Time since last action or visit
      • Demographic information (age, gender, …)
      • Types of items purchased (seasonality, quality)
      • …
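A few of the patterns on slide 13 (event frequency, time since last action, time between visits) can be computed from one user's event dates. This is a minimal sketch; the `usage_patterns` function and its field names are illustrative assumptions, not part of any toolkit named in the slides.

```python
from datetime import date


def usage_patterns(event_days, as_of):
    """Summarize one user's activity strictly before `as_of`."""
    days = sorted(d for d in event_days if d < as_of)
    if not days:
        return {"event_count": 0, "days_since_last": None, "mean_gap_days": None}
    # Gaps between consecutive events, in days.
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    return {
        "event_count": len(days),
        "days_since_last": (as_of - days[-1]).days,
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else None,
    }


patterns = usage_patterns(
    [date(2016, 1, 1), date(2016, 1, 5), date(2016, 1, 15)],
    as_of=date(2016, 2, 1),
)
print(patterns)
```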
  14. Retaining customers/visitors is important
      • Acquiring a new customer costs more than retaining an existing one
      • Gives a pulse on the health of the business
      • Can help take preventive actions and act before it’s too late
      • Can help create more effective marketing campaigns
  15. What is Churn Prediction
  16. What is Churn
      • Churn Prediction estimates the probability that a user will stop coming back (churn)
      • Works by observing past user behavior
  17. Churn Prediction (figure: daily activity logs for Jan 2015 – April 2016, marked at Apr 2016)
  18. More Precisely
      • Churn Prediction estimates the probability that a user will stop coming back (churn)
      • Works by observing past user behavior
      • We define a time boundary at which we want to predict churn
      • Any user not present in the N days (default 30) after the boundary is considered to have churned
      • The M days (default 60) before the boundary are used to generate features
      • Multiple boundaries can be specified to extract more patterns
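The definition on slide 18 can be sketched in plain Python: N days after the boundary produce the label, M days before it produce the features. The function name, the single event-count feature, and the choice to skip users with no activity in the lookback window are assumptions made for illustration, not the toolkit's actual behavior.

```python
from datetime import date, timedelta


def label_and_features(events, boundary, churn_days=30, lookback_days=60):
    """events: dict mapping user -> list of activity dates.

    Returns dict mapping user -> (events_in_lookback, churned).
    """
    lookback_start = boundary - timedelta(days=lookback_days)
    churn_end = boundary + timedelta(days=churn_days)
    rows = {}
    for user, days in events.items():
        n_recent = sum(lookback_start <= d < boundary for d in days)
        if n_recent == 0:
            continue  # assumption: users inactive in the lookback window are skipped
        # Churned means no activity in the N days after the boundary.
        churned = not any(boundary <= d < churn_end for d in days)
        rows[user] = (n_recent, churned)
    return rows


events = {
    "alice": [date(2015, 12, 10), date(2016, 1, 12)],
    "bob": [date(2015, 11, 20), date(2015, 12, 28)],
    "carol": [date(2015, 6, 1)],
}
rows = label_and_features(events, boundary=date(2016, 1, 1))
print(rows)
```

Here alice returns after the boundary and is labeled as retained, bob does not and is labeled as churned, and carol has no activity in the 60-day lookback window.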
  19. Feature and Label Generation (figure: daily activity logs for Jan 2015 – April 2016, marked at Apr 2016)
  20. How to use Churn Prediction
  21. Choosing Time Boundaries
      • Time Boundaries are moments in the past that are used to observe user behavior and generate labels
      • The time before the boundary is used to observe patterns
      • The time after the boundary is used to generate labels

      Boundaries | Meaning
      January 1st 2016 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016
      January 1st 2016, December 1st 2015 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016; will use the patterns from before December 1st 2015 to predict User Churn after December 1st 2015. This will analyze more patterns and build a richer model
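The effect of adding a second boundary on slide 21 can be seen in a short sketch: each boundary contributes its own set of labeled rows, so the training set grows. The `training_rows` helper, the 30-day churn window, and the sample users are hypothetical.

```python
from datetime import date, timedelta


def training_rows(events, boundaries, churn_days=30):
    """Build one labeled row per (user, boundary) pair."""
    rows = []
    for boundary in boundaries:
        churn_end = boundary + timedelta(days=churn_days)
        for user, days in events.items():
            seen_before = [d for d in days if d < boundary]
            if not seen_before:
                continue  # no behavior to observe before this boundary
            churned = not any(boundary <= d < churn_end for d in days)
            rows.append((user, boundary, len(seen_before), churned))
    return rows


events = {
    "alice": [date(2015, 11, 15), date(2015, 12, 20)],
    "bob": [date(2015, 12, 5)],
}
one_boundary = training_rows(events, [date(2016, 1, 1)])
two_boundaries = training_rows(events, [date(2016, 1, 1), date(2015, 12, 1)])
print(len(one_boundary), len(two_boundaries))
```

With the December boundary added, alice appears twice: once as retained (she came back on December 20th) and once as churned (no activity in January).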
  22. Choosing a Churn Period
      • The Churn Period corresponds to how far in the future we want to predict
      • It also means that, for training purposes, users who have not been active for this amount of time will be considered to have churned

      Churn Period | Predicts
      7 Days | Probability for each user to be leaving next week
      30 Days | Probability for each user to be leaving next month
      3 Months | Probability for each user to be leaving next quarter
  23. Choosing Lookback Periods
      • Lookback Periods are how far in the past we look to extract user behavior patterns (features)
      • Multiple lookback periods can be provided to generate richer features

      Lookback Periods | Features
      3 Days | Will use the 3 days before each Time Boundary to extract usage patterns
      30 Days | Will use the 30 days before each Time Boundary to extract usage patterns
      7 Days, 1 Month | Will use the week and the month before each Time Boundary to extract usage patterns
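Multiple lookback periods, as on slide 23, amount to computing the same statistic over several windows ending at the boundary. The sketch below counts events per window; the function and feature names are illustrative assumptions.

```python
from datetime import date, timedelta


def lookback_counts(days, boundary, lookback_days=(7, 30)):
    """Count a user's events in each window ending just before `boundary`."""
    return {
        f"events_last_{n}d": sum(
            boundary - timedelta(days=n) <= d < boundary for d in days
        )
        for n in lookback_days
    }


features = lookback_counts(
    [date(2015, 12, 5), date(2015, 12, 20), date(2015, 12, 28)],
    boundary=date(2016, 1, 1),
)
print(features)
```

A user with three events in the last month but only one in the last week looks different from a user whose three events all fall in the last week, which is the kind of richer signal that several lookback periods provide.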
  24. Choosing appropriate parameters
      • If we want to predict Churn for this quarter, we might want to set:
        • Churn Period to be 3 Months (how far in the future we predict)
        • Lookback Periods to be 2, 4, 8, 16 weeks (how far in the past to extract patterns from)
        • Time Boundaries to be January 1st 2016, January 1st 2015, January 1st 2014
      • Notice that we chose the same quarter each year for Time Boundary
      • Choosing past data with the same underlying behavior will provide more accurate predictions
  25. Choosing appropriate parameters
      • If we want to predict Churn for this month, we might want to set:
        • Churn Period to be 1 Month (how far in the future we predict)
        • Lookback Periods to be 7, 14, 30, 60 days (how far in the past to extract patterns from)
        • Time Boundaries to be January 1st 2016, October 1st 2015, September 1st 2015, August 1st 2015
      • In this case, we intentionally skipped over November and December 2015 since it is the holiday season, and may exhibit very different behavior
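The monthly example on slide 25 can be written down as plain Python values. This is only a sketch: the variable names and the `month_starts` helper are hypothetical, and treating "1 Month" as 30 days is an assumption.

```python
from datetime import date, timedelta

# Parameters from the monthly example (1 month approximated as 30 days).
churn_period = timedelta(days=30)
lookback_periods = [timedelta(days=n) for n in (7, 14, 30, 60)]


def month_starts(first, last, skip_months=()):
    """First-of-month dates from `first` to `last` inclusive, minus skipped months."""
    out = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        if month not in skip_months:
            out.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return out


# Skip the holiday season (November and December), as on the slide.
time_boundaries = month_starts(date(2015, 8, 1), date(2016, 1, 1), skip_months=(11, 12))
print(time_boundaries)
```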
  26. Key Takeaways
      • Label generation is extremely simplified (choose a Churn Period)
      • Feature generation is extremely simplified (choose Lookback Periods and Time Boundaries)
      • Choose representative time frames to predict churn in the desired time frame
  27. Interpreting the Results
  28. Output of the model
      • The Churn Prediction model returns a probability of churn for each provided user
  29. Using the Probabilities (figure: histogram of Number of Users by Churn Probability)
      • High Probability of Churn: Might be hard to rescue these users
      • Mid-Probability of Churn: We should try to rescue these users
      • Low-Probability of Churn: Send a thank-you note!
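The three groups on slide 29 can be produced by bucketing the per-user probabilities. The slides give no cut-off values, so the 0.3 and 0.7 thresholds below, like the user scores, are purely illustrative assumptions.

```python
from collections import Counter


def segment(prob, low=0.3, high=0.7):
    """Map a churn probability to a coarse segment (thresholds are hypothetical)."""
    if prob >= high:
        return "high"
    if prob >= low:
        return "mid"
    return "low"


# Hypothetical model output: user -> probability of churn.
scores = {"alice": 0.92, "bob": 0.55, "carol": 0.08, "dave": 0.41}
segments = {user: segment(p) for user, p in scores.items()}
counts = Counter(segments.values())
print(segments)
print(counts)
```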
  30. Using the Probabilities
      • We can target different users, using their probability of Churn as a guideline
      • Different marketing messages can be created based on the probability of Churn
      • The highest-probability users are not always the best to target, depending on the cost of the action to take to retain them
      • Gives a new dimension on the user base
      • Can be used to monitor the health of the user population over time
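The point that the highest-probability users are not always the best targets can be made concrete with a rough expected-value calculation. The formula and every number below (save rates, customer value, action cost) are assumptions for illustration; they do not come from the slides.

```python
def expected_gain(churn_prob, save_rate, customer_value, action_cost):
    """Expected net gain of one retention action.

    churn_prob: model's probability that the user churns
    save_rate: assumed chance the action keeps a user who would have churned
    """
    return churn_prob * save_rate * customer_value - action_cost


# A near-certain churner who is hard to rescue versus a mid-probability user.
gain_high = expected_gain(churn_prob=0.9, save_rate=0.05, customer_value=100, action_cost=5)
gain_mid = expected_gain(churn_prob=0.5, save_rate=0.4, customer_value=100, action_cost=5)
print(gain_high, gain_mid)
```

Under these assumed numbers the action loses money on the high-probability user and pays off on the mid-probability one, which matches the slide's advice to weigh the cost of the action before targeting.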
  31. Demo
  32. Summary (Churn Prediction)
      Log Data Mining ≠ Rocket Science
      • Define time parameters to identify patterns and generate labels.
      • Extract predictions to gain insights about your user population.
      • Take action and help grow your healthy business.
  33. SELECT questions FROM audience WHERE difficulty == “Easy”
      Thanks!
