It’s the cornerstone of many of the biggest businesses in the US, including
Google & Amazon, and the backbone of most scientific undertakings.
But data is just a tool, and like almost every tool it has both uses and abuses,
not to mention just straight up errors. How many conflicting health studies
have you seen?
As a company Kongregate uses a lot of data, and some of you have probably
seen talks I’ve given before where I share a lot of that data. But a lot of the
time I’ve been unsure whether we’re this ship, charting a clean course to
treasure, or this ship, towards disaster. Both have happened! And since I
think that’s a pretty common phenomenon, I thought it would be a good talk
for GDC.
I love using numbers & testing to understand the world. I still probably spend
at least an hour a day poking around dashboards and spreadsheets because
it’s so much more fun for me than meetings.
I’m mostly self-taught, majored in Eastern European Studies, not math or
econ. Stumbled into direct marketing, specifically catalogs, after college, and
fell in love with data. Taught myself SQL because I hated to wait for IT to pull
my data, took math & econ classes to understand more theory. After 10
years in catalogs & e-commerce and a near-miss with econ grad school I co-
founded Kongregate partly to do something completely different. But it hasn’t
turned out to be that different after all. User acquisition in particular is
fundamentally similar between catalogs & games.
Part of the reason I’m telling you this is to make my first point:
And for an organization to do data right you can’t toss analysis back and forth
over a wall to quants. It takes intimate knowledge of a game (and its
development) to do good analysis, and multiple perspectives and theories are
good.
Sometimes it’s immediately obvious. One of the first games we launched on
mobile was an endless runner. It wasn’t filtering purchases from jailbroken
phones and was showing an average revenue per player of $500. That’s not
very plausible and easily caught. But most issues are much more subtle –
tracking pixels not firing correctly for a particular game on a particular
browser, tutorial steps being completed twice by some players but not by
others, clients reporting strange timestamps, etc. For this reason I
recommend never relying on any analytic system where you can’t go in and
inspect individual records. If you can’t check the detail there are some
problems you’ll never find and fix.
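As a rough illustration of why record-level access matters, here is a minimal Python sketch that sums revenue per player from individual purchase records and flags implausibly high totals for manual inspection. The record layout (`player_id`, `revenue`) and the $200 threshold are assumptions invented for this example, not a real schema, and a flagged player may well be a genuine big spender rather than bad data.

```python
from collections import defaultdict

def revenue_per_player(records):
    """Sum revenue by player from individual purchase records."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["player_id"]] += rec["revenue"]
    return dict(totals)

def flag_for_inspection(records, max_plausible=200.0):
    """Return player ids whose summed revenue exceeds a (hypothetical) threshold."""
    totals = revenue_per_player(records)
    return sorted(pid for pid, total in totals.items() if total > max_plausible)

# Made-up purchase records for illustration only.
records = [
    {"player_id": "a", "revenue": 4.99},
    {"player_id": "b", "revenue": 99.99},
    {"player_id": "b", "revenue": 99.99},
    {"player_id": "b", "revenue": 99.99},
    {"player_id": "c", "revenue": 0.99},
]
suspects = flag_for_inspection(records)
print(suspects)
```

The point of returning ids rather than an aggregate is that someone can then pull up those players' raw records and decide whether the purchases are real.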
Even when your data is accurate it can still be deceiving. This looks like 4
separate pictures photoshopped together to create an appealing color grid,
right?
Wrong.
So much of data is like these pictures – a set-up that appears
straightforwardly to be one thing from one angle turns out to be completely
different from another.
Except of course you know I’m setting you up.
People are playing game 1 longer than game 2, and buying repeatedly. But if
you just concentrated on daily monetization stats you could miss that entirely.
The witnesses may be lying or confused. The crime scene may have been
tampered with.
You can’t trust any one piece of evidence but by cross-checking them
against each other you can figure out what’s true and false.
Client data (our SDK, Adjust) vs server data
App stores
Benchmarking against other games
Benchmarking deltas
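One way to cross-check sources like these is to compare the same daily total from two systems and flag the days where they disagree. The sketch below assumes each source has already been reduced to a `{date: revenue}` dictionary; the function name, the sample numbers, and the 5% tolerance are all hypothetical.

```python
def find_discrepancies(client_daily, server_daily, tolerance=0.05):
    """Return (day, client, server) for days that are missing from one source
    or differ by more than `tolerance` relative to the larger value."""
    flagged = []
    for day in sorted(set(client_daily) | set(server_daily)):
        c = client_daily.get(day)
        s = server_daily.get(day)
        if c is None or s is None:
            flagged.append((day, c, s))
            continue
        baseline = max(abs(c), abs(s))
        if baseline > 0 and abs(c - s) / baseline > tolerance:
            flagged.append((day, c, s))
    return flagged

# Made-up daily revenue totals from two tracking systems.
client = {"2018-03-01": 1000.0, "2018-03-02": 1200.0, "2018-03-03": 900.0}
server = {"2018-03-01": 990.0, "2018-03-02": 800.0}

issues = find_discrepancies(client, server)
for day, c, s in issues:
    print(day, c, s)
```

Here the first day passes (a 1% gap), the second is flagged for a large gap, and the third is flagged because one source has no data at all.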
Your goal should be to create a 3-dimensional view of your players and your
game. How people move through and interact with different parts. It’s a living,
changing system and flat views are not enough.
We tend to think of playerbases as monolithic but really they are
aggregations of all sorts of subgroups created by time in game, platform,
device, browser, demographics, source – and these subgroups are shifting
around. Changes in key KPIs are more often the result of changes in the
audience than they are of changes in the game.
These examples show dramatic changes, but more subtle audience changes
are happening all the time. Tracking cohorts by date of install/registration is a
good way to track metrics independent of certain types of mix issues, but
then it’s easy to lose track of events and changes in the game. So as ever,
it’s about building a true picture across multiple sources.
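To make the mix effect concrete, here is a small Python sketch with invented numbers: two player segments whose per-segment ARPDAU never changes, but whose proportions shift after a user acquisition push. The segment names, DAU counts, and revenue figures are assumptions for illustration.

```python
def arpdau_report(segments):
    """Return (blended ARPDAU, per-segment ARPDAU) for {name: (dau, revenue)}."""
    total_dau = sum(dau for dau, _ in segments.values())
    total_revenue = sum(revenue for _, revenue in segments.values())
    per_segment = {name: revenue / dau for name, (dau, revenue) in segments.items()}
    return total_revenue / total_dau, per_segment

# Hypothetical: veterans monetize at $0.50/day, new installs at $0.05/day.
before_ua = {"veteran": (1000, 500.0), "new": (1000, 50.0)}
after_ua = {"veteran": (1000, 500.0), "new": (9000, 450.0)}

blended_before, segs_before = arpdau_report(before_ua)
blended_after, segs_after = arpdau_report(after_ua)

print(f"blended: {blended_before:.3f} -> {blended_after:.3f}")
for name in segs_before:
    print(f"{name}: {segs_before[name]:.3f} -> {segs_after[name]:.3f}")
```

In this toy case blended ARPDAU falls from about $0.275 to $0.095 even though neither segment's behavior changed; only the audience mix did.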
75% ARPDAU decline, then a modest recovery to ~50% of
previous high.
When you break out ARPDAU by player age you can see that
the decline isn’t nearly as dramatic. There’s some decline after a
big holiday sale, and then again some as we expanded UA
aggressively. But most of it is from fewer el
This is for a collectible card game where the player who goes first has a
substantial advantage.
On this chart of player win rates for Tyrant it looks like Mission 24 is very
difficult (50% win rate) and mission 25 is easy (95% win rate). It’s sort of true:
Mission 25 is relatively easy for those who attempt it. But by deck strength
it’s harder than 22, which has a 70% win rate. Mission 25 is easy for the
players who are strong enough & skilled enough to beat Mission 24, a
selected subgroup of those who attempted 24.
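The selection effect can be reproduced with a deliberately oversimplified model. In the Python sketch below, every player has a single "deck strength" number and wins a mission whenever that strength exceeds the mission's requirement; the strengths and requirements are invented and are not Tyrant's real data or rules.

```python
def win_rate(strengths, required):
    """Share of the given players whose strength exceeds the requirement."""
    wins = sum(1 for s in strengths if s > required)
    return wins / len(strengths)

# Hypothetical population: 100 players with strengths 1..100.
all_players = list(range(1, 101))
mission_24_required = 50
mission_25_required = 52  # slightly harder than mission 24

rate_24 = win_rate(all_players, mission_24_required)

# Only players who beat mission 24 ever attempt mission 25.
survivors = [s for s in all_players if s > mission_24_required]
rate_25_observed = win_rate(survivors, mission_25_required)

# What the win rate would be if everyone could attempt mission 25.
rate_25_everyone = win_rate(all_players, mission_25_required)

print(rate_24, rate_25_observed, rate_25_everyone)
```

Under these assumptions mission 24 shows a 50% win rate and mission 25 shows 96%, even though mission 25 is the harder mission and would only see 48% if the whole population attempted it.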
So for the last 10 minutes I’ve been ranting about how important it is to look
at audience mix split
The most important metrics (revenue, sessions, battles, etc.) in games all
follow power-law distributions. Your business (especially in free-to-play
games) is driven by outliers, and their presence or absence distorts almost
any data you look at.
Your outliers are your best players so it’s a good idea to do individual
analysis on them to understand who they are, what drives them, and what
they’re most likely to distort.
Binary “yes/no” metrics like % buyer, D7 retention, tutorial completion are a
lot more stable than averages involving revenue and engagement like
ARPPU, $/DAU, Avg Sessions, and can be looked at in much smaller
samples.
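A quick way to see the difference in stability is to remove a single big spender from a sample and compare how much each metric moves. The spend figures in this Python sketch are invented: 1,000 players, 19 of whom spend $10 and one of whom spends $2,000.

```python
def pct_buyer(spend):
    """Share of players who spent anything at all (a binary yes/no metric)."""
    return sum(1 for s in spend if s > 0) / len(spend)

def arppu(spend):
    """Average revenue per paying user."""
    buyers = [s for s in spend if s > 0]
    return sum(buyers) / len(buyers)

# Hypothetical sample: 980 non-buyers, 19 small buyers, 1 outlier.
with_outlier = [0.0] * 980 + [10.0] * 19 + [2000.0]
without_outlier = with_outlier[:-1]  # same sample minus the one big spender

print(f"% buyer: {pct_buyer(with_outlier):.4f} vs {pct_buyer(without_outlier):.4f}")
print(f"ARPPU:   {arppu(with_outlier):.2f} vs {arppu(without_outlier):.2f}")
```

Losing one player moves % buyer from 2.0% to about 1.9%, while ARPPU drops from $109.50 to $10.00.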
Sometimes we do it consciously, but more often it’s unconscious. I’ll look at a
group of cohorts and the best one is ALWAYS the most memorable. If you’re
in test market and hoping to hit 50% the days you hit that number will imprint
on your brain that your game has 50% D1 retention, even if the average is
45%.
Cherry Picking’s great and good friend!
Part of building a mental model of your game is having theories about
behavior, and if you have a theory you should test it. But it’s really easy to
look for the data that supports your theory and miss the data that contradicts
it, or even just muddies the picture. [Can I find an example]
How you visualize data has a big impact on how you perceive it.
Ice cream consumption and drowning are correlated, because they’re both
more likely to happen in hot weather. But “ice cream kills” would be a terrible
conclusion. We’ve all heard this a thousand times but we need to keep hearing it
like a mantra every day because we all make this same mistake over and
over and over. We’re humans, we’re wired to search for causation. It’s our
superpower and a curse.
Almost every metric you look at will be positively correlated with engagement
because the most engaged users do everything more. Maybe Facebook is
increasing engagement. Maybe only engaged players were willing to hit the
button and potentially spam their friends.
This is the real way to separate correlation from causation and understand
what’s really going on. But it’s not a magic bullet, because nothing is that
easy. Testing has real costs in engineering time & overhead, complexity, and
divisions/confusions for the players, and the more you’re running the worse
that gets.
There are also a lot of ways to screw up A/B testing even though it seems so
foolproof. Most A/B test traps are variations on themes I’ve mentioned but
some are new, particularly issues around how people get assigned to tests.
For example if you’re A/B testing your store, don’t assign people to the test
unless they interact with the store. It’s often easier to split people as they
arrive in your game, or some other thing, but a) there’s a chance you would
end up with non-equal distribution of interaction with the tested feature and
b) any signal from the test group would get lost in the noise of a larger
sample.
Tests can have unintended consequences, so you should look at additional
metrics beyond the one being tested to make sure that you get the full
picture. Commercial A/B products often make you choose one metric for a
test to prevent you from fishing for the good result to decide the test on. I
think it’s more important to understand the full effects of the change that you
made (though fishing is bad, too.)
Early results tend to be both volatile and fascinating – differences are
exaggerated or totally change direction. People tend to remember the early,
interesting results rather than the actual results. People also often want to
end the test early if they see a big swing, which is a bad idea. So I
recommend that you don’t look at early test results except to make sure the
test isn’t totally broken. How big should your test sample be? In my opinion
the bigger the better.
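The danger of acting on early swings can be shown with a simulated A/A test, where both arms are identical and any "significant" difference is a false alarm. This Python sketch is an illustration under assumed parameters (10% conversion, 2,000 users per arm, 10 evenly spaced looks, a two-proportion z-test at the usual 1.96 cutoff), not a description of any particular testing product.

```python
import math
import random

def z_score(conv_a, conv_b, n):
    """Two-proportion z-score for two arms of n users each."""
    p_pool = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(p_pool * (1 - p_pool) * 2 / n)
    if se == 0:
        return 0.0
    return (conv_b / n - conv_a / n) / se

def simulate_aa_tests(runs=200, users_per_arm=2000, peeks=10, rate=0.10, seed=1):
    """Return (false-alarm rate judging only at the end,
    false-alarm rate if any interim look is allowed to end the test)."""
    rng = random.Random(seed)
    step = users_per_arm // peeks
    sig_at_end = 0
    sig_at_any_peek = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        crossed = False
        for n in range(1, users_per_arm + 1):
            conv_a += rng.random() < rate  # both arms share the same true rate
            conv_b += rng.random() < rate
            if n % step == 0 and abs(z_score(conv_a, conv_b, n)) > 1.96:
                crossed = True
        sig_at_end += abs(z_score(conv_a, conv_b, users_per_arm)) > 1.96
        sig_at_any_peek += crossed
    return sig_at_end / runs, sig_at_any_peek / runs

end_rate, peek_rate = simulate_aa_tests()
print(f"false alarms at the end only: {end_rate:.1%}")
print(f"false alarms at any peek:     {peek_rate:.1%}")
```

Because the final look is also one of the peeks, the "any peek" rate can never be lower than the "end only" rate, and in runs like this it is usually noticeably higher.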
When people talk about A/B tests you’ll often hear things like “we’ve got a
statistically significant 5% lift!” And most people hear that and think that
means that the lift is definitely 5%. But that’s not how statistical significance
tests work.
Statistical significance tests start from the idea that there is some true lift,
and that if you ran the same test repeatedly there would be a bell curve
distribution of results, with the true lift as the average. Your 5% result could
be right on the mean, or it could be an outlier on either end. If it’s statistically
significant, then a result this large would be unlikely (usually a 5% chance or
less) if there were no lift at all. But the true lift could be 1% or 10%. Conversely,
if you do a test that doesn’t show a lift, or doesn’t pass the significance test
for a small lift, that doesn’t mean there ISN’T a lift.
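One way to see how wide the plausible range is around a "significant 5% lift" is to compute a confidence interval instead of a yes/no verdict. The Python sketch below uses the standard normal approximation for the difference between two conversion rates; the user counts and conversions are made up so that the observed relative lift is 5% on a 10% baseline.

```python
import math

def lift_confidence_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """Approximate 95% interval for the absolute difference in conversion
    rate (variant minus control), using the normal approximation."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff - z * se, diff, diff + z * se

# Hypothetical test: 40,000 users per arm, 10.0% vs 10.5% conversion.
low, diff, high = lift_confidence_interval(4000, 40000, 4200, 40000)
baseline = 4000 / 40000

print(f"observed relative lift: {diff / baseline:.1%}")
print(f"plausible range:        {low / baseline:.1%} to {high / baseline:.1%}")
```

With these numbers the interval excludes zero, so the result counts as significant, yet the relative lift it is compatible with runs from under 1% to over 9%.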
This is why I like to run A/B tests with larger sample sizes. It’s like running
the test again and averaging the results. It’s possible you’d get two outlier
results in the same direction, but that becomes less and less likely, and it
becomes more likely that your test results represent the true mean.
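The effect of sample size on uncertainty can be sketched directly: under the normal approximation, the margin of error on the difference between two arms shrinks with the square root of the sample size. The function name and the 10% conversion rate below are assumptions for illustration.

```python
import math

def ci_half_width(rate, users_per_arm, z=1.96):
    """Approximate 95% margin of error on the difference between two
    equal-sized arms that convert at roughly the same rate."""
    return z * math.sqrt(2 * rate * (1 - rate) / users_per_arm)

# Each step quadruples the sample size.
for n in (5000, 20000, 80000):
    print(n, round(ci_half_width(0.10, n), 4))
```

Quadrupling the sample only halves the margin of error, which is why small lifts need very large tests to pin down.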
Often 70-80% of a free-to-play game’s revenue will come from a small % of
buyers who spend more than $500.
Large sample sizes help here, too.
This can be really frustrating, even demoralizing for a team. When you’re
going through the effort to make and test changes, you want them to mean
something! You want to make progress. And then you get another non-result
on a test. But finding out what doesn’t matter can actually be really powerful.
Here’s an extreme example of this from the team at Butterscotch
Shenanigans, who made the game Crashlands. They had written up an
elaborate, detailed description and decided to test how much impact it had
using Google’s store testing system on Android against the most extreme
possible variant, no description at all. Just the accolades the game has
received.
They were kind enough to share the results and after 4 full months the test
shows absolutely no difference, and that actually tells you a lot: specifically
that the description has very little impact, and this is consistent with the
testing we’ve done on our own games, as well. Time and resources are a
constraint for virtually everybody, and knowing what is not important allows
you to concentrate more on things that do matter. We used to argue
endlessly over game names, but after doing test after test and not seeing
much difference we’re all much more relaxed about it.
But it’s important not to extrapolate too much. Just because you get a
particular
Specifically, late-game content is often very difficult to test, as is anything
that involves testing on late-game players.
Daniel Cook from Spry Fox tweeted this recently. He was talking about
YouTube and algorithms, but I think it helps frame some of the limitations of
testing. As a player plays a game, the game is shaping their expectations
and experience, and training them to behave in certain ways. So the same
player might react very differently based on how long they had been playing
the game. And when engaged players start talking to each other in chat and
forums they affect each other, too. Plus you run into small sample sizes with
lots of outliers and other fun problems I’ve already talked about.
Tyrant successful on a small core audience, but difficult to market
CPIs for live version of Castaway Cove are okay, but much higher than we’d
been targeting. Lots of ways we probably went wrong
So far data has helped us iterate on existing games, pointing us in the
direction that helped get us from Tyrant to Animation Throwdown. But in
But what we don’t know is as important as what we do know
Data is always going to tell you to make an existing successful game, but
better. It’s not going to tell you to make a game unlike anything people have
played before.
But what we don’t know is as important as what we do know
Detectives, CSIs, Astronomers, Cartographers, Explorers:
Emily Greer at GDC 2018: Data-Driven or Data-Blinded?

 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 

Recently uploaded (20)

Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 

Emily Greer at GDC 2018: Data-Driven or Data-Blinded?

  • 1.
  • 2. It’s the cornerstone of many of the biggest businesses in the US, including Google & Amazon, and the backbone of most scientific undertakings.
  • 3. But data is just a tool, and like almost every tool it has both uses and abuses, not to mention just straight up errors. How many conflicting health studies have you seen?
  • 4. As a company Kongregate uses a lot of data, and some of you have probably seen talks I’ve given before where I share a lot of that data. But a lot of the time I’ve been unsure whether we’re this ship, charting a clean course to treasure, or this ship, towards disaster. Both have happened! And since I think that’s a pretty common phenomenon, I thought it would be a good talk for GDC.
  • 5. I love using numbers & testing to understand the world. I still probably spend at least an hour a day poking around dashboards and spreadsheets because it’s so much more fun for me than meetings.
  • 6. I’m mostly self-taught, majored in Eastern European Studies, not math or econ. Stumbled into direct marketing, specifically catalogs, after college, and fell in love with data. Taught myself SQL because I hated to wait for IT to pull my data, took math & econ classes to understand more theory. After 10 years in catalogs & e-commerce and a near-miss with econ grad school I co-founded Kongregate partly to do something completely different. But it hasn’t turned out to be that different after all. User acquisition in particular is fundamentally similar between catalogs & games.
  • 7. Part of the reason I’m telling you this is to make my first point: And for an organization to do data right you can’t toss analysis back and forth over a wall to quants. It takes intimate knowledge of a game (and the development) to do good analysis and multiple perspectives and theories are good.
  • 8. Sometimes it’s immediately obvious. One of the first games we launched on mobile was an endless runner. It wasn’t filtering purchases from jailbroken phones and was showing an average revenue per player of $500. That’s not very plausible and easily caught. But most issues are much more subtle – tracking pixels not firing correctly for a particular game on a particular browser, tutorial steps being completed twice by some players but not by others, clients reporting strange timestamps, etc. For this reason I recommend never relying on any analytic system where you can’t go in and inspect individual records. If you can’t check the detail there are some problems you’ll never find and fix.
  • 9. Even when your data is accurate it can still be deceiving. This looks like 4 separate pictures photoshopped together to create an appealing color grid, right?
  • 10. Wrong. So much of data is like these pictures – a set-up that appears straightforwardly to be one thing from one angle, turns out to be completely different from another.
  • 11. Except of course you know I’m setting you up
  • 12. People are playing game 1 longer than game 2, and buying repeatedly. But if you just concentrated on daily monetization stats you could miss that entirely.
  • 13. The witnesses may be lying or confused. The crime scene may have been tampered with. You can’t trust any one piece of evidence but by cross-checking them against each other you can figure out what’s true and false.
  • 14. Client data (our SDK, Adjust) vs server data; app stores; benchmarking against other games; benchmarking deltas.
  • 15. Your goal should be to create a 3-dimensional view of your players and your game. How people move through and interact with different parts. It’s a living, changing system and flat views are not enough.
  • 16.
  • 17.
  • 18. We tend to think of playerbases as monolithic but really they are aggregations of all sorts of subgroups created by time in game, platform, device, browser, demographics, source – and these subgroups are shifting around. Changes in key KPIs are more often the result of changes in the audience than they are of changes in the game.
  • 19. These examples show dramatic changes, but more subtle audience changes are happening all the time. Tracking cohorts by date of install/registration is a good way to track metrics independent of certain types of mix issues, but then it’s easy to lose track of events and changes in the game. So as ever, it’s about building a true picture across multiple sources.
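To make the mix effect on the slides above concrete, here is a minimal sketch in Python. The segment names, DAU counts, and ARPDAU values are hypothetical, invented purely for illustration and not taken from Kongregate data. It shows blended ARPDAU falling sharply even though neither segment's ARPDAU changes, simply because a user acquisition push adds many new players.

```python
# Hypothetical numbers: segment -> (daily active users, ARPDAU in dollars).
# Per-segment ARPDAU is identical before and after; only the mix changes.
before = {"new (0-7 days)": (1_000, 0.05), "veteran (30+ days)": (1_000, 0.45)}
after = {"new (0-7 days)": (9_000, 0.05), "veteran (30+ days)": (1_000, 0.45)}


def blended_arpdau(segments):
    """Total revenue divided by total DAU across all segments."""
    revenue = sum(dau * arpdau for dau, arpdau in segments.values())
    total_dau = sum(dau for dau, _ in segments.values())
    return revenue / total_dau


print(round(blended_arpdau(before), 3))  # 0.25
print(round(blended_arpdau(after), 3))   # 0.09
```

In this made-up case the headline number drops by about two thirds while nothing about the game or any individual group of players has changed, which is why breaking the metric out by player age matters.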
  • 20. A 75% ARPDAU decline, then a modest recovery to ~50% of the previous high.
  • 21. When you break out ARPDAU by player age you can see that the decline isn’t nearly as dramatic. There’s some decline after a big holiday sale, and then again some as we expanded UA aggressively. But most of it is from fewer el
  • 22.
  • 23. This is for a collectible card game where the player who goes first has a substantial advantage.
  • 24. On this chart of player win rates for Tyrant it looks like Mission 24 is very difficult (50% win rate) and mission 25 is easy (95% win rate). It’s sort of true: Mission 25 is relatively easy for those who attempt it. But by deck strength it’s harder than 22, which has a 70% win rate. Mission 25 is easy for the players who are strong enough & skilled enough to beat Mission 24, a selected subgroup of those who attempted 24.
  • 25. So for the last 10 minutes I’ve been ranting about how important it is to look at audience mix split
  • 26. The most important metrics (revenue, sessions, battles, etc) in games are all power distributions. Your business (especially in free-to-play games) is driven by outliers, and their presence or absence distorts almost any data you look at.
  • 27. Your outliers are your best players so it’s a good idea to do individual analysis on them to understand who they are, what drives them, and what they’re most likely to distort. Binary “yes/no” metrics like % buyer, D7 retention, tutorial completion are a lot more stable than averages involving revenue and engagement like ARPPU, $/DAU, Avg Sessions, and can be looked at in much smaller samples.
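The stability claim on the slide above can be checked with a small simulation. This is a sketch under stated assumptions, not a model of any real game: the 5% buyer rate, the Pareto-distributed spend, and the sample sizes are all hypothetical choices. It draws many same-sized samples of players and compares how much "% buyer" and average revenue per player vary from sample to sample.

```python
import random
import statistics


def simulate_sample(rng, n_players=2_000, buyer_rate=0.05, tail_alpha=1.2):
    """Return (% buyer, average revenue per player) for one simulated sample."""
    buyers = 0
    revenue = 0.0
    for _ in range(n_players):
        if rng.random() < buyer_rate:
            buyers += 1
            # Heavy-tailed spend: most buyers spend a little, a few spend a lot.
            revenue += 5.0 * rng.paretovariate(tail_alpha)
    return buyers / n_players, revenue / n_players


def coefficient_of_variation(values):
    """Standard deviation relative to the mean, a unit-free measure of noise."""
    return statistics.pstdev(values) / statistics.mean(values)


rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [simulate_sample(rng) for _ in range(200)]
cv_buyer_rate = coefficient_of_variation([s[0] for s in samples])
cv_arpu = coefficient_of_variation([s[1] for s in samples])

print(f"relative noise in % buyer: {cv_buyer_rate:.2f}")
print(f"relative noise in revenue per player: {cv_arpu:.2f}")
```

Because a handful of very large spenders land in some samples and not others, the revenue average swings far more between samples than the yes/no buyer rate does.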
  • 28. Sometimes we do it consciously, but more often it’s unconscious. I’ll look at a group of cohorts and the best one is ALWAYS the most memorable. If you’re in test market and hoping to hit 50% the days you hit that number will imprint on your brain that your game has 50% D1 retention, even if the average is 45%.
  • 29. Cherry Picking’s great and good friend! Part of building a mental model of your game is having theories about behavior, and if you have a theory you should test it. But it’s really easy to look for the data that supports your theory and miss the data that contradicts it, or even just muddies the picture. [Can I find an example]
  • 30. How you visualize data has a big impact on how you perceive it.
  • 31. Ice cream consumption and drowning are correlated, because they’re both more likely to happen in hot weather. But “ice cream kills” would be a terrible conclusion. We’ve all heard this a thousand times but we need to keep hearing it like a mantra every day because we all make this same mistake over and over and over. We’re humans, we’re wired to search for causation. It’s our superpower and a curse.
  • 32. Almost every metric you look at will be positively correlated with engagement because the most engaged users do everything more. Maybe Facebook is increasing engagement. Maybe only engaged players were willing to hit the button and potentially spam their friends.
  • 33. This is the real way to separate correlation from causation and understand what’s really going on. But it’s not a magic bullet, because nothing is that easy. Testing has real costs in engineering time & overhead, complexity, and divisions/confusions for the players, and the more you’re running the worse that gets.
  • 34. There are also a lot of ways to screw up A/B testing even though it seems so foolproof. Most A/B test traps are variations on themes I’ve mentioned but some are new, particularly issues around how people get assigned to tests.
  • 35. For example, if you’re A/B testing your store, don’t assign people to the test unless they interact with the store. It’s often easier to split people as they arrive in your game, or at some other point, but a) there’s a chance you would end up with non-equal distribution of interaction with the tested feature and b) any signal from the test group would get lost in the noise of a larger sample.
  • 36.
  • 37. Tests can have unintended consequences, you should look at additional metrics beyond the one being tested to make sure that you get the full picture. Commercial A/B products often make you choose one metric for a test to prevent you from fishing for the good result to decide the test on. I think it’s more important to understand the full effects of the change that you made (though fishing is bad, too.)
  • 38.
  • 39. Early results tend to be both volatile and fascinating – differences are exaggerated or totally change direction. People tend to remember the early, interesting results rather than the actual results. People also often want to end the test early if they see a big swing, which is a bad idea. So I recommend that you don’t look at early test results except to make sure the test isn’t totally broken. How big should your test sample be? In my opinion the bigger the better.
  • 40. When people talk about A/B tests you’ll often hear things like “we’ve got a statistically significant 5% lift”! And most people hear that and think that means that the lift is definitely 5%. But that’s not how statistical significance tests work.
  • 41. Statistical significance tests assume that there is some true lift, and that if you run the same test repeatedly there will be a bell curve distribution of results, with the true lift as the average. Your 5% result could be right on the mean, or it could be an outlier on either end. If it’s statistically significant then a result this large would be unlikely (usually a 5% chance or less) if there were no lift at all. But the true lift could be 1% or 10%. Conversely, if you do a test that doesn’t show a lift, or doesn’t pass the significance test for a small lift, that doesn’t mean there ISN’T a lift. This is why I like to run A/B tests with larger sample sizes. It’s like running the test again and averaging the results. It’s possible you’d get two outlier results in the same direction, but that becomes less and less likely, and it becomes more likely that your test results represent the true mean.
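One way to see why a "significant 5% lift" is a range rather than a point is to compute a confidence interval for the lift. The sketch below uses a standard normal approximation for the difference between two conversion rates. The function name, the 10% baseline conversion, and the sample sizes are hypothetical, and dividing by the control rate to get a relative lift is a simplification that ignores uncertainty in the baseline.

```python
import math


def lift_confidence_interval(control_conv, control_n, test_conv, test_n, z=1.96):
    """Approximate 95% interval for the absolute difference in conversion rate."""
    p_control = control_conv / control_n
    p_test = test_conv / test_n
    standard_error = math.sqrt(
        p_control * (1 - p_control) / control_n + p_test * (1 - p_test) / test_n
    )
    diff = p_test - p_control
    return diff - z * standard_error, diff + z * standard_error


# Same observed result every time (10.0% vs 10.5%, a 5% relative lift),
# with increasing numbers of players per arm.
for n in (10_000, 100_000, 1_000_000):
    low, high = lift_confidence_interval(n // 10, n, n * 105 // 1000, n)
    baseline = 0.10
    print(f"n={n:>9,}: relative lift between {low / baseline:+.1%} and {high / baseline:+.1%}")
```

With these made-up numbers, 10,000 players per arm gives an interval that still includes zero, and 100,000 per arm gives a range of roughly 2% to 8% relative lift. The interval only tightens around 5% as the sample grows, which is the case for larger tests.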
  • 42. Often 70-80% of a free-to-play game’s revenue will come from a small % of buyers who spend more than $500.
  • 43. Large sample sizes help here, too.
  • 44. This can be really frustrating, even demoralizing for a team. When you’re going through the effort to make and test changes, you want them to mean something! You want to make progress. And then you get another non-result on a test. But finding out what doesn’t matter can actually be really powerful.
  • 45. Here’s an extreme example of this from the team at Butterscotch Shenanigans, who made the game Crashlands. They had written up an elaborate, detailed description and decided to test how much impact it had using Google’s store testing system on Android against the most extreme possible variant, no description at all. Just the accolades the game has received.
  • 46. They were kind enough to share the results and after 4 full months the test shows absolutely no difference, and that actually tells you a lot: specifically that the description has very little impact, and this is consistent with the testing we’ve done on our own games, as well. Time and resources are a constraint for virtually everybody, and knowing what is not important allows you to concentrate more on things that do matter. We used to argue endlessly over game names, but after doing test after test and not seeing much difference we’re all much more relaxed about it.
  • 47. But it’s important not to extrapolate too much. Just because you get a particular
  • 48. Specifically, late-game content is often very difficult to test, as is anything tested on late-game players. Daniel Cook from Spry Fox tweeted this recently. He was talking about YouTube and algorithms, but I think it helps frame some of the limitations of testing. As a player plays a game, the game is shaping their expectations and experience, and training them to behave in certain ways. So the same player might react very differently based on how long they had been playing the game. And when engaged players start talking to each other in chat and forums they affect each other, too. Plus you run into small sample sizes with lots of outliers and other fun problems I’ve already talked about.
  • 49.
  • 50.
  • 51.
  • 52. Tyrant successful on a small core audience, but difficult to market
  • 53.
  • 54. CPIs for live version of Castaway Cove are okay, but much higher than we’d been targeting. Lots of ways we probably went wrong
  • 55. So far data has helped us iterate on existing games, pointing us in the direction that helped get us from Tyrant to Animation Throwdown. But in
  • 56. But what we don’t know is as important as what we do know
  • 57. Data is always going to tell you to make an existing successful game, but better. It’s not going to tell you to make a game unlike anything people have played before.
  • 58.
  • 59. But what we don’t know is as important as what we do know
  • 60. Detectives, CSIs, Astronomers, Cartographers, Explorers: