Matching as an Alternative to A/B Testing
Christoph Safferling
Head of Game Analytics
Ubisoft Blue Byte
Games Industry Analytics Forum
May 9th, 2013
Self-selection in games
in games, we routinely change things and want to test whether the
change was successful
game changes: quest changes, new items, etc.
shop configurations: number of items, allocation, prices, etc.
...and many more examples!
players self-select into the group that maximises their utility
(fun)
most game variables are the result of a player’s decision:
exogeneity is (usually) not given: E[ε|X] ≠ 0
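To make the self-selection problem concrete, here is a minimal simulation sketch (not from the talk). The data-generating process, the variable names, and the true effect of 1.0 are all assumptions chosen for illustration: an unobserved "engagement" trait drives both opting in and the outcome, so the naive treated-minus-untreated gap overstates the true effect.

```python
import random
import statistics

def simulate_naive_gap(n=20000, true_effect=1.0, seed=0):
    """Naive treated-minus-untreated gap when players self-select (toy model)."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        engagement = rng.gauss(0.0, 1.0)  # hypothetical unobserved player trait
        # Engaged players opt in more often: this is the self-selection.
        d = 1 if engagement + rng.gauss(0.0, 1.0) > 0 else 0
        # The outcome depends on the treatment AND on the same unobserved trait.
        y = true_effect * d + 2.0 * engagement + rng.gauss(0.0, 1.0)
        (treated if d else untreated).append(y)
    return statistics.mean(treated) - statistics.mean(untreated)

naive_gap = simulate_naive_gap()
print(f"true effect: 1.0, naive gap: {naive_gap:.2f}")
```

In this toy setup the naive gap comes out well above 1.0, because the error term is correlated with the treatment decision.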
Treatment effects
test the outcome of a treatment effect
E[Y|X, D = 1] − E[Y|X, D = 0] = E[Y(1) − Y(0)|X]
with Y as the outcome, X as the observable data, and D as
the treatment dummy
we are interested in the average treatment effect on the treated:
ATT = E[Y(1) − Y(0)|D = 1]
= E[Y(1)|D = 1] − E[Y(0)|D = 1]
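One way to see why the ATT cannot simply be read off raw group means is the standard decomposition of the naive comparison. This is a textbook identity, written here in the slide's notation with X suppressed for brevity:

```latex
\begin{align*}
E[Y \mid D=1] - E[Y \mid D=0]
  &= E[Y(1) \mid D=1] - E[Y(0) \mid D=0] \\
  &= \underbrace{E[Y(1) \mid D=1] - E[Y(0) \mid D=1]}_{\text{ATT}}
   + \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
\end{align*}
```

The second term vanishes only if treated and untreated players would have had the same average outcome without the treatment.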
E[Y(0)|D = 1] is a counterfactual: unobservable
proper control groups (A/B testing!) provide a consistent
estimator
sometimes, A/B testing is not available/feasible
(one) different econometric modeling strategy: matching
estimator
reproduce the treatment group among the non-treated:
find individuals who differ only in their treatment status and
their outcomes (“statistical twins”)
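The "statistical twins" idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not the talk's implementation: the simulated data, the single observed covariate `x`, and the function names are hypothetical, and selection is assumed to depend on `x` only. Each treated player is paired with the untreated player closest in `x` (with replacement), and the average outcome gap estimates the ATT.

```python
import bisect
import random
import statistics

def make_players(n=4000, true_effect=1.0, seed=1):
    """Simulate (x, d, y) rows where opting in depends only on observed x."""
    rng = random.Random(seed)
    players = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                      # observed covariate
        d = 1 if x + rng.gauss(0.0, 1.0) > 0 else 0  # selection on x
        y = true_effect * d + 2.0 * x + rng.gauss(0.0, 1.0)
        players.append((x, d, y))
    return players

def att_nearest_neighbour(players):
    """ATT via one-nearest-neighbour matching on x, with replacement."""
    treated = [(x, y) for x, d, y in players if d == 1]
    controls = sorted((x, y) for x, d, y in players if d == 0)
    control_x = [x for x, _ in controls]
    gaps = []
    for x, y in treated:
        # The closest control is one of the two neighbours of the insertion point.
        i = bisect.bisect_left(control_x, x)
        neighbours = controls[max(i - 1, 0):i + 1]
        twin_x, twin_y = min(neighbours, key=lambda c: abs(c[0] - x))
        gaps.append(y - twin_y)
    return statistics.mean(gaps)

players = make_players()
naive = (statistics.mean(y for _, d, y in players if d == 1)
         - statistics.mean(y for _, d, y in players if d == 0))
att = att_nearest_neighbour(players)
print(f"naive gap: {naive:.2f}, matched ATT: {att:.2f} (true effect 1.0)")
```

With a single covariate a sorted list and `bisect` are enough; with several covariates a distance metric over the covariate vector would be needed instead.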
Assumptions and problems
Conditional Independence Assumption: given X, we assume
the potential outcomes Y(0), Y(1) to be independent of the treatment D.
→ conditional on observed characteristics, selection bias is
removed
Common Support is given: 0 < P(D = 1|X) < 1
→ we exclude unmatched observations
Curse of Dimensionality: adding variables to X improves the matching
quality, but makes finding matches more difficult!
→ e.g. for continuous variables: P(X1 = x) = 0
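One response to both problems is to coarsen the covariates into bins and match exactly on the bins, in the spirit of the coarsened exact matching ("cem") used in the result tables later in the deck. The sketch below is an assumption-laden illustration, not the Stata `cem` command: the bin widths, the toy data, and the function names are hypothetical. Strata that lack either treated or untreated players fall outside the common support and are dropped.

```python
from collections import defaultdict

def coarsen(value, width):
    """Map a continuous value to the index of its bin of the given width."""
    return int(value // width)

def cem_strata(players, widths):
    """Group outcomes by coarsened covariates; keep strata on common support."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for covariates, d, y in players:
        key = tuple(coarsen(v, w) for v, w in zip(covariates, widths))
        strata[key][d].append(y)
    # A stratum is usable only if it holds both treated and untreated players.
    return {k: g for k, g in strata.items() if g[0] and g[1]}

def cem_att(players, widths):
    """Average gap between each matched treated outcome and its stratum's control mean."""
    matched = cem_strata(players, widths)
    n_treated = sum(len(g[1]) for g in matched.values())
    if n_treated == 0:
        raise ValueError("no treated player has a match")
    total = 0.0
    for g in matched.values():
        control_mean = sum(g[0]) / len(g[0])
        total += sum(y - control_mean for y in g[1])
    return total / n_treated

# Toy rows: ((level, age), treated?, outcome); all values are made up.
players = [
    ((12.0, 23.0), 1, 10.0),
    ((14.0, 24.0), 0, 6.0),
    ((13.0, 21.0), 0, 8.0),
    ((37.0, 41.0), 1, 20.0),
    ((35.0, 44.0), 0, 15.0),
    ((58.0, 19.0), 1, 30.0),  # no untreated player in this stratum: dropped
]
widths = (10.0, 10.0)
att = cem_att(players, widths)
print(att)  # 4.0
```

Wider bins keep more observations but make the "twins" less alike; narrower bins do the opposite.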
Several matching algorithms
one-to-one matching estimators
  with/without replacement
  nearest-neighbour
  within-caliper
smoothed matching estimators
  k-nearest neighbour
  radius matching
weighted smoothed matching estimators
  kernel smoothing
  local linear regression smoothing
Mahalanobis distance matching
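As an illustration of the first family, the sketch below does greedy one-to-one nearest-neighbour matching without replacement, within a caliper. It is a minimal example under assumptions: the scalar score (for instance a propensity score), the player ids, the caliper of 0.05, and the greedy processing order are all hypothetical choices, and other orderings can give different pairs.

```python
def match_within_caliper(treated, controls, caliper):
    """Greedy one-to-one nearest-neighbour matching without replacement.

    treated and controls map a player id to a scalar score.
    Returns {treated_id: control_id}; a treated player with no control
    inside the caliper stays unmatched.
    """
    available = dict(controls)  # copy, because matched controls are removed
    pairs = {}
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]  # without replacement: each control used once
    return pairs

treated = {"t1": 0.30, "t2": 0.32, "t3": 0.90}
controls = {"c1": 0.31, "c2": 0.35, "c3": 0.60}
pairs = match_within_caliper(treated, controls, caliper=0.05)
print(pairs)  # {'t1': 'c1', 't2': 'c2'}
```

Here `t3` stays unmatched because its nearest control lies outside the caliper. Matching with replacement would amount to skipping the `del` line.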
http://xkcd.com/800/
Zeropayments in TSO Russia
payment conversion in TSO RU was low
one explanation: payment process “scary”
“zeropayments” guide the player through the payment
process, offering a small reward for completing a fake
payment
Results of the treatment
reference: lifetime pay-to-active, TSO RU                  a
paid at least once in addition to the zeropayment          5.9a
paid after their zeropayment                               3.5a
paid after their zeropayment, not paid before              1.6a
Matching results (tobit)
                    (1)           (2)           (5)           (6)
                    tobit full    tobit2 full   tobit cem     tobit2 cem
had zero payments   7.376         19.71         -356.3        -350.1
                    (0.974)       (0.931)       (0.270)       (0.276)
level               315.3∗∗       354.1∗∗       674.4         696.4
                    (0.007)       (0.000)       (0.177)       (0.179)
level squared       -0.796        -1.441        -9.274        -9.635
                    (0.709)       (0.416)       (0.291)       (0.289)
uniqueLogins        -26.27∗∗      -28.22∗∗      -33.35        -34.78
                    (0.018)       (0.007)       (0.199)       (0.204)
rating for week     -407.0†       -400.7†       39.74         42.50
                    (0.076)       (0.076)       (0.915)       (0.908)
guild               647.9∗∗       651.2∗∗       639.6         627.8
                    (0.012)       (0.011)       (0.388)       (0.400)
age                 53.18∗∗       52.37∗∗       185.4         171.8
                    (0.024)       (0.025)       (0.264)       (0.288)
(additional controls, including intercept)
N                   12376         19522         4114          6894
pseudo R²           0.162         0.189         0.139         0.158
p-values in parentheses
Matching results (zero-inflated negbin)
                    (1)           (2)           (5)           (6)
                    zinb full     zinb2 full    zinb cem      zinb2 cem
had zero payments   0.111         0.110         0.540∗∗       0.538∗∗
                    (0.463)       (0.466)       (0.005)       (0.006)
level               0.148∗∗       0.150∗∗       -0.153        -0.255†
                    (0.012)       (0.010)       (0.332)       (0.096)
level squared       -0.00211∗∗    -0.00213∗∗    0.00429       0.00617∗∗
                    (0.036)       (0.032)       (0.155)       (0.035)
uniqueLogins        -0.0180∗∗     -0.0180∗∗     -0.0308∗∗     -0.0310∗∗
                    (0.007)       (0.006)       (0.005)       (0.005)
rating for week     0.747∗∗       0.748∗∗       1.662∗∗       1.653∗∗
                    (0.000)       (0.000)       (0.000)       (0.000)
guild               -0.112        -0.112        0.280         0.297
                    (0.319)       (0.319)       (0.286)       (0.264)
age                 0.0383∗∗      0.0383∗∗      0.119         0.192†
                    (0.012)       (0.012)       (0.308)       (0.096)
(additional controls, including intercept and inflate regression)
N                   12376         19522         4114          6894
p-values in parentheses
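A hedged reading aid for the "had zero payments" row: the slide does not state the link function or the outcome's units. If the reported coefficients come from the count part of the model with the usual log link (an assumption), exponentiating them gives a multiplicative effect on the expected count.

```python
import math

# Coefficients on "had zero payments" from columns (1) and (5) of the table.
coef_full = 0.111  # zinb, full sample
coef_cem = 0.540   # zinb, matched (cem) sample

# Under a log link, exp(coefficient) is a rate ratio.
irr_full = math.exp(coef_full)
irr_cem = math.exp(coef_cem)
print(f"full sample: x{irr_full:.3f}, cem sample: x{irr_cem:.3f}")
# full sample: x1.117, cem sample: x1.716
```

Under that assumption, the matched-sample estimate corresponds to roughly 72% higher expected counts for players with a zeropayment, against about 12% (and not statistically significant) in the full sample.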
further reading
Rosenbaum, P. R., Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 (1), pp. 41-55.
Heckman, J. J., Ichimura, H., Todd, P. (1997). Matching as an Econometric Evaluation Estimator: Evidence From Evaluating a Job Training Programme. Review of Economic Studies 64, pp. 605-54.
Angrist, J. D., Krueger, A. B. (1999). Empirical Strategies in Labor Economics. pp. 1277-1366 in Handbook of Labor Economics, vol. 3, edited by O. C. Ashenfelter and D. Card. Amsterdam: Elsevier.
Blackwell, M., Iacus, S., King, G., Porro, G. (2009). cem: Coarsened exact matching in Stata. Stata Journal 9 (4), pp. 524-546.
Iacus, S., King, G., Porro, G. (June 2008). Matching for causal inference without balance checking. UNIMI – Research Papers in Economics, Business, and Statistics 1073, Università degli Studi di Milano.
Lechner, M. (2002). Some practical issues in the evaluation of heterogeneous labour market programmes by matching methods. Journal of the Royal Statistical Society, Series A, 165, pp. 59-82.
Leuven, E., Sianesi, B. (April 2003). Psmatch2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing. S432001 Statistical Software Components, Boston College Department of Economics.
