THE DMEF
 CLV COMPETITION
AND HOW I ENDED UP IN 2ND PLACE
THE CHALLENGE
[Timeline: donations ($) observed from 1.1.2002 to 31.8.2006; donations to be predicted (?$) from 31.8.2006 to 31.8.2008]

Non-contractual setting, non-observable status
21,000 DONORS acquired in the first half of 2002
54,000 DONATIONS until mid-2006
THE GAME PLAN

• Understand the Data Set ➙ EDA

• Split Estimation for # Transactions and $ Value

• Implement Parametric Stochastic Models
   NBD, Pareto/NBD, BG/NBD, CBG/NBD, ...

• Benchmark Data Fit and Predictive Power

• Try to Improve Predictive Power
THE DATA SET
                                   SAMPLED TIMING PATTERNS
[Figure: Various Timing Patterns. One row per sampled Donor ID, one tick mark per donation; x-axis: Time Scale (2002 to 2006), y-axis: Donor ID.]
THE DATA SET
                   TRENDS AT AGGREGATE LEVEL


[Figure, two panels over Time (2002 to 2006): Nr of Donations (axis 0 to 8000; bar annotations 13%, 15%, 14%) and Avg Donation Amount (axis 0 to 50; bar annotations +24%, 10%, +12%).]
THE DATA SET
               TRENDS AT AGGREGATE LEVEL


[Figure, two panels over Time; values as labeled in the charts:]

Year   Percentage of Donors Who Have    Average Nr of Donations
       Donated Within that Year         per Active Donor
2002   27.8%                            1.42
2003   29.5%                            1.46
2004   23.5%                            1.51
2005   18.8%                            1.55
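The two yearly measures above can be derived from a plain donation log. A minimal sketch, assuming a hypothetical list of (donor_id, year) records and a known cohort size; neither the record layout nor the function name comes from the slides:

```python
from collections import defaultdict

def yearly_activity(donations, n_donors):
    """Return {year: (share_of_active_donors, avg_donations_per_active_donor)}.

    donations: iterable of (donor_id, year) pairs, one per donation (assumed layout).
    n_donors:  size of the full cohort, including donors with no donation that year.
    """
    per_year = defaultdict(lambda: defaultdict(int))
    for donor_id, year in donations:
        per_year[year][donor_id] += 1
    result = {}
    for year, counts in sorted(per_year.items()):
        active = len(counts)            # donors with at least one donation that year
        total = sum(counts.values())    # all donations that year
        result[year] = (active / n_donors, total / active)
    return result

# Toy data, not taken from the contest data set.
toy = [(1, 2002), (1, 2002), (2, 2002), (1, 2003), (3, 2003), (3, 2003), (3, 2003)]
stats = yearly_activity(toy, n_donors=4)
```

Dividing by the full cohort size rather than by the number of donors seen in a year is what makes the first measure a share of all acquired donors.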
THE DATA SET
                               INTERTRANSACTION TIMES


[Figure: Overall Distribution of Intertransaction Times. Histogram of Count (0 to 4000) against Nr of Months in between Donations (0 to 51), with labeled peaks at 1, 12 and 24 months.]
THE MODELS
                NBD ASSUMPTIONS (1959)


   A) The number of transactions follows a Poisson
   process with rate λ
   B) Heterogeneity in λ follows a Gamma distribution
   with shape parameter r and rate parameter α


"while there is not enough information to reliably estimate
 the purchase rate for each person, there will generally be
 enough to estimate the distribution of it over customers"
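Taken together, assumptions A and B imply that the number of transactions in a window of length t follows a negative binomial distribution. A minimal sketch of that probability mass function; the function name and the parameter values are illustrative assumptions, not estimates from the contest data:

```python
import math

def nbd_pmf(x, r, alpha, t=1.0):
    """P(X = x) transactions in a window of length t under the NBD model.

    Poisson counts with rate lambda, lambda ~ Gamma(shape r, rate alpha).
    Computed on the log scale to stay stable for large x.
    """
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + t))
             + x * math.log(t / (alpha + t)))
    return math.exp(log_p)

# Illustrative parameter values only.
r, alpha, t = 0.5, 2.0, 1.0
probs = [nbd_pmf(x, r, alpha, t) for x in range(200)]
mean = sum(x * p for x, p in enumerate(probs))   # should match r * t / alpha
```

The mean of the mixture is r·t/α, which gives a quick sanity check on any fitted pair (r, α).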
THE MODELS
             NBD - ESTIMATION




r = 0.475                avg IPT: 2.9 years
α = 489.5                med IPT: 6.6 years
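The link between the two estimates and the interpurchase times (IPT) can be sketched as follows. This assumes that α is measured in days, that "avg IPT" means 1/E[λ] and that "med IPT" means 1/median(λ); the slide states none of these conventions, so the result should only land in the same range as the figures above, not match them exactly.

```python
import random
import statistics

R, ALPHA = 0.475, 489.5      # estimates from the slide
DAYS_PER_YEAR = 365.25       # assumption: alpha is measured in days

def implied_ipt_years(r, alpha, n_draws=200_000, seed=1):
    """Average and median interpurchase time (in years) implied by Gamma(r, alpha)."""
    rng = random.Random(seed)
    # random.gammavariate takes shape and SCALE, so the rate alpha becomes 1/alpha.
    rates = [rng.gammavariate(r, 1.0 / alpha) for _ in range(n_draws)]
    avg_ipt = 1.0 / (r / alpha)                 # 1 / E[lambda]
    med_ipt = 1.0 / statistics.median(rates)    # 1 / median(lambda), by simulation
    return avg_ipt / DAYS_PER_YEAR, med_ipt / DAYS_PER_YEAR

avg_years, med_years = implied_ipt_years(R, ALPHA)
```

The large gap between average and median reflects the shape parameter below 1: most donors have a very low rate, while a few donate often.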
THE MODELS
                  PARETO/NBD ASSUMPTIONS (1987)

NBD part:
A) The number of transactions follows a Poisson
process with rate λ
B) Heterogeneity in λ follows a Gamma distribution
with shape parameter r and rate parameter α
Pareto part:
C) Customer lifetime is exponentially distributed
with death rate μ
D) Heterogeneity in μ follows a Gamma distribution
with shape parameter s and rate parameter β
E) λ and μ are distributed independently
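Assumptions A to E translate directly into a simulation of a single customer. A minimal sketch; the function name and all parameter values are illustrative assumptions, not estimates from the contest data:

```python
import random

def simulate_pareto_nbd(r, alpha, s, beta, T, rng):
    """Simulate one customer's transaction times in (0, T] under assumptions A to E."""
    lam = rng.gammavariate(r, 1.0 / alpha)   # purchase rate (B); gammavariate takes a scale
    mu = rng.gammavariate(s, 1.0 / beta)     # death rate (D), drawn independently (E)
    lifetime = rng.expovariate(mu)           # unobserved lifetime (C)
    horizon = min(T, lifetime)               # purchases stop at death or at T
    times, t = [], rng.expovariate(lam)      # Poisson process via exponential gaps (A)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(lam)
    return times

rng = random.Random(42)
# Illustrative parameters only.
cohort = [simulate_pareto_nbd(0.5, 10.0, 0.6, 12.0, T=52.0, rng=rng) for _ in range(2000)]
```

Only the transaction times would be visible in real data; the lifetime stays hidden, which is the non-observable status mentioned at the start.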
THE MODELS
           BG/NBD ASSUMPTIONS (2005)

A) The number of transactions follows a Poisson
process with rate λ
B) Heterogeneity in λ follows a Gamma distribution
with shape parameter r and rate parameter α
C) Directly after each purchase there is a constant
drop-out probability p
D) Heterogeneity in p follows a Beta distribution
with parameters a and b
E) λ and p are distributed independently
THE MODELS
           CBG/NBD ASSUMPTIONS (2007)

A) The number of transactions follows a Poisson
process with rate λ
B) Heterogeneity in λ follows a Gamma distribution
with shape parameter r and rate parameter α
C) At time zero and directly after each purchase
there is a constant drop-out probability p
D) Heterogeneity in p follows a Beta distribution
with parameters a and b
E) λ and p are distributed independently
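The only difference between BG/NBD and CBG/NBD is the extra drop-out opportunity at time zero. A minimal simulation sketch makes its effect on the share of donors without any repeat transaction visible; the function name, the flag and the parameter values are assumptions for illustration:

```python
import random

def simulate_bg_family(r, alpha, a, b, T, rng, dropout_at_zero=False):
    """Number of repeat transactions in (0, T].

    dropout_at_zero=False follows the BG/NBD assumptions,
    dropout_at_zero=True the CBG/NBD assumptions.
    """
    lam = rng.gammavariate(r, 1.0 / alpha)   # rate parameter alpha -> scale 1/alpha
    p = rng.betavariate(a, b)
    if dropout_at_zero and rng.random() < p:
        return 0                             # CBG/NBD: may drop out right at time zero
    count, t = 0, rng.expovariate(lam)
    while t <= T:
        count += 1
        if rng.random() < p:                 # drop-out chance directly after each purchase
            break
        t += rng.expovariate(lam)
    return count

rng = random.Random(11)
params = dict(r=1.0, alpha=10.0, a=1.0, b=3.0, T=52.0)   # illustrative values only
n = 20_000
zero_bg = sum(simulate_bg_family(rng=rng, **params) == 0 for _ in range(n)) / n
zero_cbg = sum(simulate_bg_family(rng=rng, dropout_at_zero=True, **params) == 0
               for _ in range(n)) / n
```

Under BG/NBD a donor with zero repeat transactions is still alive by construction; under CBG/NBD such a donor may already have dropped out.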
THE BENCHMARK
                                        DATA FIT

[Figure: Actual vs Fitted Frequency of Repeat Transactions. Frequency (0 to 10000) for 0, 1, 2, 3, 4, 5, 6 and 7+ repeat transactions; series: Observed, NBD, Pareto/NBD, BG/NBD, CBG/NBD.]

χ² NBD = 366.1
χ² Pareto/NBD = 391.5
χ² BG/NBD = 487.2
χ² CBG/NBD = 363.7
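A fit statistic of this kind can be computed from observed and fitted bin counts. A minimal sketch, assuming the values above are Pearson chi-square statistics over the bins 0 to 7+ (the slides do not say how they were computed); the counts below are made up:

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic over matching frequency bins."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up counts for the bins 0, 1, 2, 3, 4, 5, 6, 7+ (not the contest data).
observed = [100, 60, 30, 20, 10, 5, 3, 2]
expected = [90, 70, 30, 20, 10, 5, 3, 2]
stat = chi_square(observed, expected)
```

A lower value means the fitted frequencies sit closer to the observed ones.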
THE BENCHMARK
              PREDICTIVE POWER



[Diagram: Time Split. The 2002 to 2006 observation window is divided into a Calibration Period followed by a Validation Period.]
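The time split can be sketched per donor as follows. The function name, the time unit and the split point are hypothetical; the slides do not give the exact split date:

```python
def split_history(times, t_split):
    """Summarize one donor's transaction times around the split point.

    Returns (frequency, recency, holdout):
    number of transactions and time of the last transaction in the
    calibration period, plus the number of transactions in the validation period.
    """
    calibration = [t for t in times if t <= t_split]
    validation = [t for t in times if t > t_split]
    frequency = len(calibration)
    recency = max(calibration) if calibration else None
    return frequency, recency, len(validation)

# Times in years since 1.1.2002; toy values and a hypothetical split point.
summary = split_history([0.1, 1.2, 2.3, 3.9, 4.4], t_split=3.0)
```

Models are fitted on the calibration summaries only, and the holdout counts serve as the target for the predictive benchmark.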
THE BENCHMARK
                     PREDICTIVE POWER




MSLE = Mean Squared Logarithmic Error
RMSE = Root Mean Squared Error
MAE = Mean Absolute Error
Corr = Correlation
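The four measures can be written out as below. This is a sketch under one assumption: MSLE is computed on log(1 + x), a common convention that keeps donors with zero transactions well defined; the slides do not state the exact form used.

```python
import math

def msle(actual, predicted):
    """Mean squared logarithmic error, using log(1 + x) (assumed convention)."""
    return sum((math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def corr(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    va = sum((a - ma) ** 2 for a in actual)
    vp = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(va * vp)

# Toy values only.
actual = [0, 1, 2, 0, 3]
predicted = [0.5, 1.0, 1.5, 0.0, 2.0]
```

MSLE, RMSE and MAE are error measures (lower is better), while Corr rewards getting the ranking of donors right (higher is better).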
THE PROBLEM
 A SIMPLE LINEAR MODEL
THE APPROACH
INVESTIGATE THE ERRORS

[Figure, two panels: Timing Patterns for the 10 Worst Underestimated Donors and Timing Patterns for the 10 Worst Overestimated Donors. Each row shows one donor's donations as tick marks across the Calibration Period and the Validation Period.]
REGULARITY
IT'S NOT JUST ABOUT RECENCY AND FREQUENCY

Two Users with the Same Recency and Frequency

But one of them is more likely to be active after T.
THE POISSON PROCESS
             PROBLEMATIC IMPLICATIONS


      Poisson implies Exponentially Distributed IPT

• Mode Zero: The most likely time of purchase is immediately after a purchase. No dead period.
• Memoryless Property: No regularity within timing patterns. Succeeding interpurchase times are assumed to be uncorrelated.
THE SOLUTION
         CBG/CNBD-K ASSUMPTIONS (2008)

A) While active, transactions occur with Erlang-k
(rate parameter λ) distributed waiting times
B) Heterogeneity in λ follows a Gamma distribution
with shape parameter r and rate parameter α
C) Directly after each purchase there is a constant
drop-out probability p
D) Heterogeneity in p follows a Beta distribution
with parameters a and b
E) λ and p are distributed independently
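What assumption A changes can be seen by drawing Erlang-k waiting times as sums of k exponential stages and comparing their coefficient of variation. A minimal sketch; the helper names are assumptions, and the rate is set to k only to keep the mean waiting time at 1 for every k:

```python
import random
import statistics

def erlang_gaps(k, lam, n, rng):
    """Draw n Erlang-k waiting times as sums of k exponential(lam) stages."""
    return [sum(rng.expovariate(lam) for _ in range(k)) for _ in range(n)]

def coefficient_of_variation(values):
    """Standard deviation divided by mean; lower means more regular timing."""
    return statistics.pstdev(values) / statistics.fmean(values)

rng = random.Random(7)
# lam = k keeps the mean gap at 1, so only the regularity differs between the cases.
cv = {k: coefficient_of_variation(erlang_gaps(k, lam=float(k), n=10_000, rng=rng))
      for k in (1, 2, 3, 100)}
```

For k = 1 this is the exponential case of the Poisson process; the coefficient of variation shrinks like 1/sqrt(k), so larger k means more regular timing patterns.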
THE SOLUTION
                                       ERLANG-K
[Figure, four panels: Erlang 1, Erlang 2, Erlang 3 and Erlang 100. Each panel pairs a small plot (x-axis 0 to 5, y-axis 0.0 to 0.8) with several sampled timing patterns drawn as tick marks; the tick marks are spaced more evenly as k increases.]
THE SOLUTION
  CBG/CNBD-K - 2008
REGULARITY MEASURES
ESTIMATING 'K'

[Figure, left panel: Distribution of Estimated Gamma Shape Parameters. x-axis: Shape Parameter r (0 to 10), with reference marks at r=1 (Exponential IPTs) and r=2 (Erlang 2 IPTs).]

[Figure, right panel: Regularity Measure M. Density (0.0 to 2.5) over M (0.0 to 1.0); series: Actual Distribution of M, Distribution of M for r=2, Distribution of M for r=1.]
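The slides do not define M. A minimal sketch, assuming M is the ratio of one interpurchase time to the sum of itself and the next one: under Erlang-k waiting times this ratio follows a Beta(k, k) distribution (uniform for k = 1), which allows a method-of-moments estimate of k. The function names and the simulated data are illustrative:

```python
import random
import statistics

def regularity_measures(gaps):
    """M = gap_1 / (gap_1 + gap_2) for non-overlapping pairs of successive gaps."""
    return [g1 / (g1 + g2) for g1, g2 in zip(gaps[0::2], gaps[1::2])]

def estimate_k(ms):
    """Method-of-moments estimate, using Var(M) = 1 / (4 (2k + 1)) for Beta(k, k)."""
    var = statistics.pvariance(ms)
    return (1.0 / (4.0 * var) - 1.0) / 2.0

rng = random.Random(3)
# Simulated Erlang-2 gaps (sum of two exponentials), not contest data.
gaps_erlang2 = [rng.expovariate(1.0) + rng.expovariate(1.0) for _ in range(40_000)]
k_hat = estimate_k(regularity_measures(gaps_erlang2))
```

Because M is a ratio, it does not depend on the individual purchase rate λ, so values from donors with very different rates can be pooled into one distribution.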
THE BENCHMARK
Model        MSLE     RMSE    MAE     Corr    SUM

LM           0.0863   0.642   0.262   0.644   -31%

Pareto/NBD   0.0977   0.653   0.359   0.628   +22%

BG/NBD       0.0963   0.651   0.362   0.640   +19%

CBG/NBD      0.0959   0.650   0.360   0.639   +19%

CBG/CNBD-2   0.0831   0.632   0.293   0.660   -11%

CBG/CNBD-3   0.0816   0.637   0.275   0.663   -24%
THE CONTEST
                              PARTICIPANTS

Companies                  US Universities   International Universities
DataLab                    U Pennsylvania    U Frankfurt
Targetbase                 U Connecticut     Tech Uni Munich
Hewlett-Packard            UT Dallas         Leuven
SAS                        U Washington      PUC Chile
Alliance Data              OK State          U Duisburg-Essen
Thinkanalytics, LLC        Old Dominion U    Comenius U
DK Shifflet & Assoc Ltd.   Georgia State     BU Vienna
                           SUNY New Paltz
                           U Wisconsin W
THE CONTEST
                           MODELS
• Ad Hoc
• Linear Regression
• Hierarchical Bayesian
• BG/NBD, MBG-NBD, CBG-NBD, Pareto/NBD
• Bayesian Seemingly Unrelated Regressions
• Probit / logistic regression
• Tobit
• ARIMA
• ArtXP Time Series
• Support Vector Machines
• Trees
• Kohonen Networks
• Feedforward Neural Networks
• Stochastic Microanalytical Simulations

No Markov chain models though
THE CONTEST
OUTCOME TASK 1: CUSTOMER EQUITY
THE CONTEST
OUTCOME TASK 2: CUSTOMER LIFETIME VALUE
THE CONTEST
               WINNING MODEL



• HP Labs - published paper
• 8 Segments via Classification & Regression Trees
• Logit Model for Estimating Activeness
• Log-Linear Model for Estimating Donation Sum
• Also used R for computations
CONCLUSIONS
                              BY DMEF

• Even the Best Model is still 'bad' (factor 5.4)
• It is important to get to know your data with EDA
• CLV Models are not commodities
  "It's more the modeler than the model"
• Duke Teradata Churn Competition
• Organizations should follow the Contest approach
   • Split Data Sets (Modeling, Validation)
   • Stress Tests
   • Benchmark

Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...lizamodels9
 

Recently uploaded (20)

Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...
 
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130  Available With RoomVIP Kolkata Call Girl Howrah 👉 8250192130  Available With Room
VIP Kolkata Call Girl Howrah 👉 8250192130 Available With Room
 
Best Basmati Rice Manufacturers in India
Best Basmati Rice Manufacturers in IndiaBest Basmati Rice Manufacturers in India
Best Basmati Rice Manufacturers in India
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptx
 
Forklift Operations: Safety through Cartoons
Forklift Operations: Safety through CartoonsForklift Operations: Safety through Cartoons
Forklift Operations: Safety through Cartoons
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
 
Nepali Escort Girl Kakori \ 9548273370 Indian Call Girls Service Lucknow ₹,9517
Nepali Escort Girl Kakori \ 9548273370 Indian Call Girls Service Lucknow ₹,9517Nepali Escort Girl Kakori \ 9548273370 Indian Call Girls Service Lucknow ₹,9517
Nepali Escort Girl Kakori \ 9548273370 Indian Call Girls Service Lucknow ₹,9517
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Pune Just Call 9907093804 Top Class Call Girl Service Available
 
Event mailer assignment progress report .pdf
Event mailer assignment progress report .pdfEvent mailer assignment progress report .pdf
Event mailer assignment progress report .pdf
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for Success
 
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetCreating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
 
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
 

My Entry to the DMEF CLV Contest

  • 1. THE DMEF CLV COMPETITION AND HOW I ENDED UP ON 2ND PLACE
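  • 2. THE CHALLENGE [Timeline: donations ($) observed from 1.1.2002 to 31.8.2006; donations (?$?) to be predicted from 31.8.2006 to 31.8.2008] Non-contractual setting, non-observable status
  • 3. THE CHALLENGE [Timeline: donations ($) observed from 1.1.2002 to 31.8.2006; donations (?$?) to be predicted from 31.8.2006 to 31.8.2008] 21,000 DONORS acquired in the first half of 2002; 54,000 DONATIONS until mid-2006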
  • 4. THE GAME PLAN • Understand the Data Set ➙ EDA • Split Estimation for # Transactions and $ Value • Implement Parametric Stochastic Models: NBD, Pareto/NBD, BG/NBD, CBG/NBD, ... • Benchmark Data Fit and Predictive Power • Try to Improve Predictive Power
  • 5. THE DATA SET SAMPLED TIMING PATTERNS [Figure: Various Timing Patterns; one row of donation marks per sampled Donor ID (y-axis) along a Time Scale from 2002 to 2006 (x-axis)]
  • 6. THE DATA SET TRENDS AT AGGREGATE LEVEL [Figure, two panels over Time (2002 to 2006): Nr of Donations; Avg Donation Amount. Plot annotations: 13%, 15%, 14%, +24%, 10%, +12%]
  • 7. THE DATA SET TRENDS AT AGGREGATE LEVEL [Figure, two panels over Time (2002 to 2005): Percentage of Donors who Have Donated Within that Year (labels: 27.8%, 29.5%, 23.5%, 18.8%); Average Nr of Donations per Active Donor (labels: 1.55, 1.46, 1.51, 1.42)]
  • 8. THE DATA SET INTERTRANSACTION TIMES [Figure: Overall Distribution of Intertransaction Times; x-axis: Nr of Months in between Donations (0 to 51), y-axis: Count; marked values at 1, 12 and 24 months]
  • 9. THE MODELS NBD ASSUMPTIONS (1959) A) The number of transactions follows a Poisson process with rate λ. B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α. "While there is not enough information to reliably estimate the purchase rate for each person, there will generally be enough to estimate the distribution of it over customers."
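Assumptions A and B can be illustrated with a short simulation. The following is a minimal sketch in Python using only the standard library; the parameter values, the observation length `T` and the function names are illustrative assumptions and are not taken from the slides. Each simulated donor draws a rate λ from a Gamma distribution and then generates Poisson events, and the result is compared with the closed-form NBD probability.

```python
import math
import random

def nbd_pmf(x, r, alpha, T):
    """P(X = x) over a period of length T when lambda ~ Gamma(shape r, rate alpha)."""
    log_p = (math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
             + r * math.log(alpha / (alpha + T))
             + x * math.log(T / (alpha + T)))
    return math.exp(log_p)

def simulate_count(r, alpha, T, rng):
    """Assumption B: draw a donor's rate. Assumption A: count Poisson events in [0, T]."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # gammavariate expects a scale, i.e. 1 / rate
    if lam <= 0.0:                          # guard against numerical underflow
        return 0
    count, t = 0, rng.expovariate(lam)      # exponential gaps generate a Poisson process
    while t <= T:
        count += 1
        t += rng.expovariate(lam)
    return count

# Hypothetical parameters, chosen only for illustration.
r, alpha, T = 1.5, 2.0, 3.0
rng = random.Random(42)
counts = [simulate_count(r, alpha, T, rng) for _ in range(20000)]

share_zero = counts.count(0) / len(counts)
sim_mean = sum(counts) / len(counts)
print(f"simulated P(X=0) = {share_zero:.3f}, NBD P(X=0) = {nbd_pmf(0, r, alpha, T):.3f}")
print(f"simulated mean   = {sim_mean:.3f}, NBD mean   = {r * T / alpha:.3f}")
```

The simulated share of donors with zero transactions and the simulated mean should land close to the NBD values, which is the sense in which the population distribution can be estimated even when individual rates cannot.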
  • 10. THE MODELS NBD - ESTIMATION r = 0.475, α = 489.5; avg IPT (interpurchase time): 2.9 years, med IPT: 6.6 years
  • 11. THE MODELS PARETO/NBD ASSUMPTIONS (1987) NBD part: A) The number of transactions follows a Poisson process with rate λ. B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α. Pareto part: C) Customer lifetime is exponentially distributed with death rate μ. D) Heterogeneity in μ follows a Gamma distribution with shape parameter s and rate parameter β. E) λ and μ are distributed independently.
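The five Pareto/NBD assumptions translate almost line by line into a simulation. This is a sketch under stated assumptions: the parameter values and the helper name `simulate_pareto_nbd` are hypothetical. The expected number of transactions used as a check follows from E[λ] · E[min(lifetime, T)], which is valid because λ and μ are independent, and it requires s > 1.

```python
import random

def simulate_pareto_nbd(r, alpha, s, beta, T, rng):
    """Return (x, t_x): number of transactions in (0, T] and the time of the last one."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # B) purchase rate ~ Gamma(shape r, rate alpha)
    mu = rng.gammavariate(s, 1.0 / beta)    # D) death rate ~ Gamma(shape s, rate beta)
    lifetime = rng.expovariate(mu)          # C) exponentially distributed lifetime
    horizon = min(lifetime, T)              # purchases stop at death or at the end of observation
    x, t_x = 0, 0.0
    t = rng.expovariate(lam)                # A) Poisson process while the customer is alive
    while t <= horizon:
        x, t_x = x + 1, t
        t += rng.expovariate(lam)
    return x, t_x                           # E) lam and mu were drawn independently

# Hypothetical parameters, chosen only for illustration (s > 1 so the mean exists).
r, alpha, s, beta, T = 1.5, 2.0, 2.0, 4.0, 3.0
rng = random.Random(1)
results = [simulate_pareto_nbd(r, alpha, s, beta, T, rng) for _ in range(20000)]

sim_mean = sum(x for x, _ in results) / len(results)
expected_mean = (r * beta) / (alpha * (s - 1)) * (1 - (beta / (beta + T)) ** (s - 1))
print(f"simulated mean = {sim_mean:.3f}, expected mean = {expected_mean:.3f}")
```

The pair `(x, t_x)` corresponds to frequency and recency, the two summaries of a customer's history that these models condition on.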
  • 12. THE MODELS BG/NBD ASSUMPTIONS (2005) A) The number of transactions follows a Poisson process with rate λ. B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α. C) Directly after each purchase there is a constant drop-out probability p. D) Heterogeneity in p follows a Beta distribution with parameters a and b. E) λ and p are distributed independently.
  • 13. THE MODELS CBG/NBD ASSUMPTIONS (2007) A) The number of transactions follows a Poisson process with rate λ. B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α. C) At time zero and directly after each purchase there is a constant drop-out probability p. D) Heterogeneity in p follows a Beta distribution with parameters a and b. E) λ and p are distributed independently.
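The only difference between the BG/NBD and CBG/NBD assumptions is the extra drop-out opportunity at time zero. A minimal simulation sketch makes the effect visible; the parameter values and the `dropout_at_zero` flag are hypothetical and serve only to switch between the two variants.

```python
import random

def simulate_repeat_count(r, alpha, a, b, T, rng, dropout_at_zero):
    """Number of repeat transactions in (0, T] under BG/NBD or CBG/NBD style drop-out."""
    lam = rng.gammavariate(r, 1.0 / alpha)  # purchase rate ~ Gamma(shape r, rate alpha)
    p = rng.betavariate(a, b)               # drop-out probability ~ Beta(a, b)
    if dropout_at_zero and rng.random() < p:
        return 0                            # CBG/NBD only: may drop out right at time zero
    x = 0
    t = rng.expovariate(lam)
    while t <= T:
        x += 1
        if rng.random() < p:                # both models: may drop out after each purchase
            break
        t += rng.expovariate(lam)
    return x

# Hypothetical parameters, chosen only for illustration.
r, alpha, a, b, T = 1.5, 2.0, 1.0, 2.0, 3.0
rng = random.Random(3)
bg = [simulate_repeat_count(r, alpha, a, b, T, rng, False) for _ in range(20000)]
cbg = [simulate_repeat_count(r, alpha, a, b, T, rng, True) for _ in range(20000)]

share_zero_bg = bg.count(0) / len(bg)
share_zero_cbg = cbg.count(0) / len(cbg)
print(f"share with no repeat transaction: BG/NBD {share_zero_bg:.3f}, CBG/NBD {share_zero_cbg:.3f}")
```

With identical parameters the CBG/NBD variant produces more customers without any repeat transaction, because some of them are already inactive when the observation starts.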
  • 14. THE BENCHMARK DATA FIT [Figure: Actual vs Fitted Frequency of Repeat Transactions; categories 0 to 7+, y-axis: Frequency; series: Observed, NBD, Pareto/NBD, BG/NBD, CBG/NBD] χ² (NBD) = 366.1; χ² (Pareto/NBD) = 391.5; χ² (BG/NBD) = 487.2; χ² (CBG/NBD) = 363.7
  • 15. THE BENCHMARK PREDICTIVE POWER [Figure: Time Split of the 2002 to 2006 time axis into a Calibration Period followed by a Validation Period]
  • 16. THE BENCHMARK PREDICTIVE POWER MSLE = Mean Squared Logarithmic Error; RMSE = Root Mean Squared Error; MAE = Mean Absolute Error; Corr = Correlation
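The four measures can be computed in a few lines. The sketch below assumes the common definitions: MSLE is taken on log(1 + x) so that zero counts are allowed, and Corr is the Pearson correlation. The slides do not spell out the exact formulas, so both choices are assumptions, and the example numbers are made up.

```python
import math

def msle(actual, predicted):
    """Mean squared logarithmic error, using log(1 + x) (assumed definition)."""
    return sum((math.log1p(a) - math.log1p(p)) ** 2
               for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def corr(actual, predicted):
    """Pearson correlation (assumed to be the 'Corr' of the slide)."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Made-up validation-period counts and model forecasts for four donors.
actual = [0, 1, 2, 5]
predicted = [0.5, 1.0, 1.0, 3.0]
for name, fn in (("MSLE", msle), ("RMSE", rmse), ("MAE", mae), ("Corr", corr)):
    print(f"{name}: {fn(actual, predicted):.4f}")
```

MSLE penalises relative errors and is less dominated by a few heavy donors than RMSE, which is one reason to report several measures side by side.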
  • 17. THE PROBLEM A SIMPLE LINEAR MODEL
  • 18. THE APPROACH INVESTIGATE IN ERRORS [Figure, two panels, each spanning Calibration Period and Validation Period: Timing Patterns for the 10 Worst Underestimated Donors; Timing Patterns for the 10 Worst Overestimated Donors]
  • 19. REGULARITY IT'S NOT JUST ABOUT RECENCY AND FREQUENCY Two users with the same recency and frequency, but one of them is more likely to be active after T.
  • 20. THE POISSON PROCESS PROBLEMATIC IMPLICATIONS Poisson implies exponentially distributed IPT (interpurchase times) • Mode Zero: The most likely time of purchase is immediately after a purchase. No dead period. • Memoryless Property: No regularity within timing patterns. Succeeding interpurchase times are assumed to be uncorrelated.
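Both implications can be checked with the closed-form density and survival function of the Erlang-k distribution, of which the exponential is the case k = 1. The sketch below contrasts k = 1 with k = 2; the rate and the time points are arbitrary illustrative values.

```python
import math

def erlang_pdf(t, k, lam):
    """Density of an Erlang-k waiting time with rate lam (k = 1 is the exponential)."""
    return lam ** k * t ** (k - 1) * math.exp(-lam * t) / math.factorial(k - 1)

def erlang_survival(t, k, lam):
    """P(waiting time > t) for an Erlang-k waiting time with rate lam."""
    return math.exp(-lam * t) * sum((lam * t) ** n / math.factorial(n) for n in range(k))

lam, s, t = 1.0, 1.0, 1.0  # arbitrary illustrative values

for k in (1, 2):
    density_at_zero = erlang_pdf(0.0, k, lam)
    unconditional = erlang_survival(t, k, lam)
    # Probability of waiting another t units, given that s units have already passed.
    conditional = erlang_survival(s + t, k, lam) / erlang_survival(s, k, lam)
    print(f"k={k}: pdf(0)={density_at_zero:.3f}, "
          f"P(T>t)={unconditional:.3f}, P(T>s+t | T>s)={conditional:.3f}")
```

For k = 1 the density peaks at zero and the conditional and unconditional probabilities coincide, which is the memoryless property. For k = 2 the density at zero vanishes (a dead period), and having already waited makes the next purchase more imminent.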
  • 21. THE SOLUTION CBG/CNBD-K ASSUMPTIONS (2008) A) While active, transactions occur with Erlang-k (rate parameter λ) distributed waiting times. B) Heterogeneity in λ follows a Gamma distribution with shape parameter r and rate parameter α. C) Directly after each purchase there is a constant drop-out probability p. D) Heterogeneity in p follows a Beta distribution with parameters a and b. E) λ and p are distributed independently.
  • 22. THE SOLUTION ERLANG-K [Figure, four panels on a time axis from 0 to 5: simulated timing patterns for Erlang 1, Erlang 2, Erlang 3 and Erlang 100 waiting times]
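Timing patterns like those in the Erlang-k panels can be generated by summing k exponential waiting times. In this sketch the rate is set to k so that the mean waiting time stays at 1 for every k; that normalisation, the sample size and the function names are assumptions made for illustration. The coefficient of variation, which equals 1/√k for an Erlang-k distribution, serves as a simple summary of regularity.

```python
import random
import statistics

def erlang_waiting_time(k, rate, rng):
    """An Erlang-k waiting time is the sum of k independent exponential waiting times."""
    return sum(rng.expovariate(rate) for _ in range(k))

def coefficient_of_variation(k, n, rng):
    """Standard deviation divided by mean for n simulated Erlang-k waiting times."""
    # rate = k keeps the mean waiting time at 1, so only the regularity changes with k
    times = [erlang_waiting_time(k, float(k), rng) for _ in range(n)]
    return statistics.stdev(times) / statistics.mean(times)

rng = random.Random(11)
cv = {k: coefficient_of_variation(k, 5000, rng) for k in (1, 2, 3, 100)}
for k, value in cv.items():
    print(f"Erlang-{k}: simulated CV = {value:.3f}, theoretical CV = {k ** -0.5:.3f}")
```

As k grows the waiting times concentrate around their mean, so Erlang-100 behaves almost like a fixed donation schedule while Erlang-1 is the irregular Poisson case.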
  • 23. THE SOLUTION CBG/CNBD-K - 2008
  • 24. REGULARITY MEASURES ESTIMATING 'K' [Figure, two panels. Distribution of Estimated Gamma Shape Parameters: x-axis Shape Parameter r (0 to 10), with r=1 marked as Exponential IPTs and r=2 as Erlang 2 IPTs. Regularity Measure M: Density over M (0.0 to 1.0) for the Actual Distribution of M, the Distribution of M for r=2 and the Distribution of M for r=1]
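The slide does not define M. The sketch below assumes it is the regularity measure of Wheat and Morrison (1990), M = IPT₁ / (IPT₁ + IPT₂), computed from two successive interpurchase times of the same customer. Under that assumption, Gamma-distributed IPTs with shape k give M ~ Beta(k, k), so M is uniform for exponential IPTs, and Var(M) = 1 / (4(2k + 1)) can be inverted into a moment estimate of k. Function names and sample sizes are illustrative.

```python
import random
import statistics

def regularity_m(ipt1, ipt2):
    """Share of the first of two successive interpurchase times (assumed definition of M)."""
    return ipt1 / (ipt1 + ipt2)

def estimate_shape(m_values):
    """Moment estimate of k from Var(M) = 1 / (4 * (2k + 1))."""
    var_m = statistics.pvariance(m_values)
    return (1.0 - 4.0 * var_m) / (8.0 * var_m)

def simulate_m(k, n, rng):
    """M values for n customers whose IPTs are Gamma distributed with shape k."""
    return [regularity_m(rng.gammavariate(k, 1.0), rng.gammavariate(k, 1.0))
            for _ in range(n)]

rng = random.Random(7)
k_hat = {k: estimate_shape(simulate_m(k, 20000, rng)) for k in (1, 2)}
for k, value in k_hat.items():
    print(f"true shape {k}: estimated shape = {value:.2f}")
```

Because M is a ratio, the customer's individual rate cancels out, which is what allows a pooled estimate of k across customers with very different donation frequencies.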
  • 25. THE BENCHMARK
      Model         MSLE     RMSE    MAE     Corr    SUM
      LM            0.0863   0.642   0.262   0.644   -31%
      Pareto/NBD    0.0977   0.653   0.359   0.628   +22%
      BG/NBD        0.0963   0.651   0.362   0.640   +19%
      CBG/NBD       0.0959   0.650   0.360   0.639   +19%
      CBG/CNBD-2    0.0831   0.632   0.293   0.660   -11%
      CBG/CNBD-3    0.0816   0.637   0.275   0.663   -24%
  • 26. THE CONTEST PARTICIPANTS Companies: DataLab, Targetbase, Hewlett-Packard, SAS, Alliance Data, Thinkanalytics, LLC, DK Shiffet & Assoc Ltd. US Universities: U Pennsylvania, U Connecticut, UT Dallas, U Washington, OK State, Old Dominion U, Georgia State, SUNY New Platz, U Wisconsin W. International Universities: U Frankfurt, Tech Uni Munich, Leuven, PUC Chile, U Duisburg-Essen, Commenius U, BU Vienna.
  • 27. THE CONTEST MODELS • Ad Hoc • Linear Regression • Hierarchical Bayesian • BG/NBD, MBG-NBD, CBG-NBD, Pareto/NBD • Bayesian Seemingly Unrelated Regressions • Probit / logistic regression • Tobit • ARIMA • ArtXP Time Series • Support Vector Machines • Trees • Kohonen Networks • Feedforward Neural Networks • Stochastic Microanalytical Simulations No Markov chain models though
  • 28. THE CONTEST OUTCOME TASK 1: CUSTOMER EQUITY
  • 29. THE CONTEST OUTCOME TASK 2: CUSTOMER LIFETIME VALUE
  • 30. THE CONTEST WINNING MODEL • HP Labs - published paper • 8 Segments via Classification & Regression Trees • Logit Model for Estimating Activeness • Log-Linear Model for Estimating Donation Sum • Also used R for computations
  • 31. CONCLUSIONS BY DMEF • Even the Best Model is still 'bad' (factor 5.4) • It is important to get to know your data with EDA • CLV Models are not commodities: "It's more the modeler than the model" • Duke Teradata Churn Competition • Organizations should follow Contest approach • Split Data Sets (Modeling, Validation) • Stress Tests • Benchmark