SlideShare a Scribd company logo
1 of 51
Download to read offline
WHAT’S THE SCIENCE IN
DATA SCIENCE?
Skipper Seabold, Director of Data Science R&D, Product Lead
Civis Analytics
@jseabold
PyData LA
Los Angeles, CA
October 23, 2018
All great ideas come while you’re focused on something else. All
motivation to act comes from Twitter.
Four decades of
econom(etr)ic history
In 120 seconds.
Four decades of
econom(etr)ic history
A focus on research design takes the con out of econometrics
1983
“Let’s Take the Con out
of Econometrics”
Criticism of the current
state of econometric
research. Calls for focus
on threats to validity
(sensitivity analysis) of
non-randomized control
trials.
Quasi-Experimental
Design
Lends credibility,
especially to micro,
through “a clear-eyed
focus on research
design … [such as] it
would command in a
real experiment.”
1980s-
2000s
Randomized Control
Trials
In parallel, increasing
use of randomized
control trials in
economics, especially in
development micro.
1990s-
2000s
Worries about External
Validity
“Counterrevolutionary”
worries that the focus
on experimental and
quasi-experimental
research design comes
at the expense of
external validity.
2000s
“Good designs have a beneficial side effect: they
typically lend themselves to a simple explanation
of empirical methods and a straightforward
presentations of results.”
These changes lead to an increased relevance in policy decisions
First, some context.
What does this have to do
with data science?
Obtain, Scrub, Explore, Model, and iNterpret (OSEMN)
Mason and Wiggins, 2010
Have the “ability to [create] prototype-level versions of ... the steps needed to
derive new insights or build data products”
Analyzing the Analyzers, 2013
Use multidisciplinary methods to understand and have a measurable impact on
a business process or product
Me, Today
Data science exists to drive better business outcomes
Multidisciplinary teams use the (Data) Scientific Method to
measure impacts
Question
Start with a product or business question. E.g., how do our marketing
campaigns perform? What’s driving employee attrition?
Hypothesize
Research, if necessary, and write a falsifiable hypothesis. E.g.,
the ROI on our marketing campaigns is greater than break-even.
Research
Design
Design a strategy that allows you to test your hypothesis,
noting all threats to validity.
Analyze
Analyze all data and evidence. Test for threats to
validity.
Communicate
or Productize
Communicate results in a way that stakeholders
will understand or engineering can use.
The current and coming credibility crisis in data science
Question Objectives are often unclear and success is left undefined.
Hypothesize
Research, if necessary, and write a falsifiable hypothesis. E.g.,
the ROI on our marketing campaigns is greater than break-even.
Research
Design
Black-box(-ish) predictive models are often the focus.
Threats to validity are an afterthought or brushed aside.
Analyze
Analyze all data and evidence. Test for threats to
validity.
Communicate
or Productize
Decision-makers don’t understand the value of
data science projects.
Well, kind of.
So, data science is about
running experiments?
The randomized control trial is the gold standard of scientific
discovery
Population Random
Assignment
to Treatment
Treated
Control
Higher
Productivity
Split &
Measure
But sometimes a randomized control trial is a hard internal sell
(yes, these are often strawman arguments)
The results will be difficult
to interpret.
“We’ll never really know
whether what we tried was
the real reason people
stayed.”
Running a control ad is too
expensive.
“I have to pay what? Just to
buy space for some PSA?”
No one wants to be “B.”
“That’s taking food out of my
mouth.”
And even if we could, sometimes experiments can go wrong
Threats to
Validity
Insufficient
Randomization
Partial
Compliance
Attrition Spillovers
This is where the social
science comes in handy.
And I’m not just justifying my life
choices.
This is where the social
science comes in handy.
And I’m not just justifying my life
choices. This is mostly true.
This is where the social
science comes in handy.
In the absence of an experiment, we might try to measure an
intervention by running a regression
Here are some threats to validity in a regression approach
Is our sample truly
representative of our
population of interest?
We didn’t track who
attended trainings but did a
survey after. 40% responded
and 65% of those attended a
training.
Are we sure that we
understand the direction of
causation?
Low productivity offices
may have been targeted for
training programs.
Did we omit variables that
could plausibly explain our
outcome of interest?
People may pursue training
or further education on their
own.
Avoiding threats to validity
Research design strategies
from the social sciences
Use instrumental variables for overcoming simultaneity or
omitted variable biases
Can get around problems like “reverse causation,”
where the outcome causes changes in the
treatment or common unobserved confounders
where an unobserved variable affects the
outcome and the variable of interest leading to
spurious correlation.
We can replace X with “instruments” that are
correlated with X but not caused by Y or that
affect Y but only through X.
An example from the Fulton fish market: stormy weather as an
instrument
Use matching to overcome non-random treatment assignment
Makes up for the lack of a properly randomized
control group by “matching” treated people to
untreated people who look almost exactly like the
treated group to create a “pseudo-control” group.
Then look at the average treatment effect across
groups.
An illustration of a matched control experiment
Population Non Random
Treatment
Treated
Pseudo-
Control
Higher
Productivity
Match &
Measure
Use difference-in-differences for repeated measures to account
for underlying trends
Used when you don’t have an RCT, but you
observe one or more groups over two or more
periods of time, and there is an intervention
between time periods.
A simple example of difference-in-differences
Regression discontinuity exploits thresholds that determine
treatment
Exploits technical, natural, or randomly occurring
cutoffs in treatment assignment to measure the
treatment effect. Those observations close to but
on opposite sides of the cutoff should be alike in
all but whether they are treated or not.
A simple example of a regression discontinuity design
Somewhere in between a “true experiment” and a
quasi-experiment, a survey experiment gives
slightly changed survey instruments to randomly
selected survey participants. They provide an
interesting way to test many hypotheses and are
heavily and increasingly used in the social and
data sciences, particularly in political science.
Survey experiments can be another great tool
Some types of survey experiments
Survey
Experiment Description
Change the context or wording of questions. This can be used, for example,
to understand how some message or creative content influences a survey
taker vs. a control group.
Priming &
Framing
Change whether a sensitive item is included in a list of response options. May
be able to avoid types of biases that can arise in other approaches. Can be
used to measure brand favorability after some bad publicity, for example.
List
Change aspects of a hypothetical scenario. When combined with a conjoint
design, can be used, for example, to understand how users view different
aspects of a potential product without asking every variation of everyone.
Scenario-
based
The attention to research
design focuses the discussion
By using these methods, you discuss particular threats to
validity and whether identifying assumptions are met
Method Identifying Assumption
Assignment to treatment is random conditional on observed characteristics
(Selection on observables; Unconfoundedness). You have covariate balance
between the treatment and control groups (Overlap).
Matching
Look for clumping at the discontinuity. People may adjust their behavior, if
they understand the selection mechanism.
Regression
Discontinuity
The instrument z affects the outcome y only through the variable x, not
directly (Exclusion Restriction), & the instrument z is correlated with the
endogenous variable x (Relevance Restriction).
Instrumental
Variables
What about all the fancy
machine learning I hear
about?
We’re not limited to regression. There have been interesting
advances in machine learning aimed at causal inference.
Bayesian Additive Regression Trees (BART; Chipman, et al.)
Causal Trees & Forests (Imbens & Athey)
Interpretable Modeling (LIME, SHAP, etc.)
Double Machine Learning (Chernozhukov, Hansen, etc.)
G-Estimation (used in epidemiology)
SuperLearner (van der Laan & Polley)
Post-Selection Inference (Taylor, Barber, Imai & Ratkovic, etc.)
A non-random walk through some
(mostly) recent papers.
A few examples of research
design-based thinking at work
A non-random walk through some
(mostly) recent papers. Pro tip: start
or join a journal club at work.
A few examples of research
design-based thinking at work
“Here, There, and Everywhere: Correlated
Online Behaviors Can Lead to Overestimates of
the Effects of Advertising”
A retreat to experiments
by Randall A. Lewis, Justin M. Rao, and David H. Reiley
Observational experiment methods often overestimate the
causal impact of advertising due to “activity bias”
○ 200M impressions for Major Firm on Y! w/ 10% control
○ Tracked sign-ups on a Competitor Firm
○ Both saw more sign-ups for both ads! (And Y! usage)
○ Survey experiment on Amazon Mechanical Turk
○ Treatment (800) and control (800) of ad on Y! activity
○ Treatment and control had the same effects
○ 250m impressions on Yahoo! FP w/ 5% control
○ 5.4% increase in search traffic via experiment
○ Matching (1198%), Regression (800%), Diff-in-diff (100%)
“The Unfavorable Economics of Measuring the
Returns to Advertising”
The limits of measurement
by Randall A. Lewis and Justin M. Rao
It is extremely difficult but not impossible to measure the effect
of an advertising campaign
Noisy Effects
The standard deviation
of sales vs. the average
is around 10:1 across
many industries.
Small Effect
Sizes
Campaigns have a low
spend per-person, and
the ROI on a break-even
campaign is small.
Experiments &
Observational Methods
Designing an experiment to measure this
effect is hard. Observational methods
look attractive but overestimate effects
due to targeted advertising.
The power of an experiment is the likelihood of finding an effect
if it exists in the population
Experimental design for control vs. treatment digital ads
Study Aspect Description
New Sales &
New Account Sign-Ups
Measured
Outcome
Ran for 2-135 days (median 14); Cost $10k - $600k+; Measured
100k to 10m impressions
25
Experiments
Retail Sales (including department stores) &
Financial ServicesIndustries
Every single experiment studied did not have enough power to
measure a precise 10% return
Question Measurable?
12 of 25 are severely underpowered;
only 3 had sufficient power to
conclude they were “wildly
profitable” (ROI of 50%).
Profitable
Campaign
Every experiment is underpowered.
Greater than
10% Return
9 out of 25 fail to reject the null of no
impact; 10 out of 25 have enough
power to test.
No Impact vs
Positive Return
Result
The powered experiments generally
reveal a significant, positive effect for
advertising.
The median campaign would have to
be 9 times larger to have sufficient
power.
The median retail campaign would
have to be 61 times larger. For
financial services, 1241 times larger
(!).
“Courtyard by Marriott: Designing a Hotel
Facility with Consumer-Based Marketing Models”
Change the course of industry
By Jerry Wind, Paul E. Green, Douglas Shifflet, and Marsha
Scarbrough
“Marriott used conjoint analysis to design a new
hotel chain [in the early 80s].”
A well-designed survey experiment resulted in a wildly
successful product ideation for Marriott
The survey design asked two target segments about 7 facets,
considering 50 attributes with 2 to 8 levels each
7 FACETS
Leisure
ServicesRooms
SecurityFood
LoungeExternal
Factors
As of 1989, Marriott went from 3 tests cases in 1983 to 90
hotel in 1987 with more than $200 million in sales. The chain
was expected to grow to 300 hotels by 1994 with an expected
$1 billion in sales.
Captured within 4% of predicted market share.
Created 3,000 new jobs within Marriott with an expected
14,000 new jobs by 1994.
Affected a restructuring of all competitors the midprice-level
lodging industry (upgrades, prices, amenities, and followers).
How successful was this experiment?
Wrapping Up
Yes, Judea Pearl is a keynote speaker,
and I talked about causal inference
but didn’t mention do-calculus.
Wrapping Up
A credibility crisis in data science is preventable
Question
Work with stakeholders to focus on business relevant and measurable
outcomes from the start.
Hypothesize Be clear about the causal mechanisms that you will test.
Research
Design
Keep it sophisticatedly simple. A well thought out
research design leads to better communication of results.
Analyze
Be honest about any threats to validity. You can test
these in the future. This is scientific progress.
Communicate
or Productize
Go the last mile. This step enables data science
to have decision-making relevance.
Did I miss some good papers? Come @ me on twitter. Until then,
here are some more papers and books on my desk.
“A comparison of Approaches to Advertising Measurement: Evidence from Big
Field Experiments at Facebook”
“Confounding in Survey Experiments”
“Using Big Data to Estimate Consumer Surplus: The Case of Uber”
Mastering Metrics: The Path from Cause to Effect [Looks good!]
“A/B Testing” [Experiments at Microsoft using Bing EXP Platform]
“Measuring Consumer Sensitivity to Audio Advertising: A Field Experiment on
Pandora Internet Radio”
“Attribution Inference for Digital Advertising using Inhomogeneous Poisson
Models”

More Related Content

What's hot

Causal inference in practice
Causal inference in practiceCausal inference in practice
Causal inference in practiceAmit Sharma
 
Biomarker Strategies
Biomarker StrategiesBiomarker Strategies
Biomarker StrategiesTom Plasterer
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Matt Hansen
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Matt Hansen
 
Hypothesis Testing: Relationships (Compare 1:1)
Hypothesis Testing: Relationships (Compare 1:1)Hypothesis Testing: Relationships (Compare 1:1)
Hypothesis Testing: Relationships (Compare 1:1)Matt Hansen
 
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Matt Hansen
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingPenn State University
 
How to calculate power in statistics
How to calculate power in statisticsHow to calculate power in statistics
How to calculate power in statisticsStat Analytica
 
Predictability of popularity on online social media: Gaps between prediction ...
Predictability of popularity on online social media: Gaps between prediction ...Predictability of popularity on online social media: Gaps between prediction ...
Predictability of popularity on online social media: Gaps between prediction ...Amit Sharma
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Matt Hansen
 
Hypothesis Testing: Finding the Right Statistical Test
Hypothesis Testing: Finding the Right Statistical TestHypothesis Testing: Finding the Right Statistical Test
Hypothesis Testing: Finding the Right Statistical TestMatt Hansen
 
Introduction to Power Analysis
Introduction to Power AnalysisIntroduction to Power Analysis
Introduction to Power AnalysisDaria Bondareva
 
Biostatistics Workshop: Sample Size & Power
Biostatistics Workshop: Sample Size & PowerBiostatistics Workshop: Sample Size & Power
Biostatistics Workshop: Sample Size & PowerHopkinsCFAR
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyondMaarten van Smeden
 
Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Matt Hansen
 
Sample size and power
Sample size and powerSample size and power
Sample size and powerChristina K J
 

What's hot (20)

Causal inference in practice
Causal inference in practiceCausal inference in practice
Causal inference in practice
 
Biomarker Strategies
Biomarker StrategiesBiomarker Strategies
Biomarker Strategies
 
Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)Hypothesis Testing: Proportions (Compare 1:Standard)
Hypothesis Testing: Proportions (Compare 1:Standard)
 
Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)Hypothesis Testing: Proportions (Compare 2+ Factors)
Hypothesis Testing: Proportions (Compare 2+ Factors)
 
Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)Hypothesis Testing: Spread (Compare 2+ Factors)
Hypothesis Testing: Spread (Compare 2+ Factors)
 
Hypothesis Testing: Relationships (Compare 1:1)
Hypothesis Testing: Relationships (Compare 1:1)Hypothesis Testing: Relationships (Compare 1:1)
Hypothesis Testing: Relationships (Compare 1:1)
 
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
Hypothesis Testing: Central Tendency – Non-Normal (Nonparametric Overview)
 
Basic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-MakingBasic Statistical Concepts & Decision-Making
Basic Statistical Concepts & Decision-Making
 
How to calculate power in statistics
How to calculate power in statisticsHow to calculate power in statistics
How to calculate power in statistics
 
Predictability of popularity on online social media: Gaps between prediction ...
Predictability of popularity on online social media: Gaps between prediction ...Predictability of popularity on online social media: Gaps between prediction ...
Predictability of popularity on online social media: Gaps between prediction ...
 
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
Hypothesis Testing: Central Tendency – Normal (Compare 1:Standard)
 
Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)Hypothesis Testing: Spread (Compare 1:Standard)
Hypothesis Testing: Spread (Compare 1:Standard)
 
Hypothesis Testing: Finding the Right Statistical Test
Hypothesis Testing: Finding the Right Statistical TestHypothesis Testing: Finding the Right Statistical Test
Hypothesis Testing: Finding the Right Statistical Test
 
Introduction to Power Analysis
Introduction to Power AnalysisIntroduction to Power Analysis
Introduction to Power Analysis
 
Biostatistics Workshop: Sample Size & Power
Biostatistics Workshop: Sample Size & PowerBiostatistics Workshop: Sample Size & Power
Biostatistics Workshop: Sample Size & Power
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyond
 
Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)Hypothesis Testing: Proportions (Compare 1:1)
Hypothesis Testing: Proportions (Compare 1:1)
 
Sti2018 jws
Sti2018 jwsSti2018 jws
Sti2018 jws
 
Bayes rpp bristol
Bayes rpp bristolBayes rpp bristol
Bayes rpp bristol
 
Sample size and power
Sample size and powerSample size and power
Sample size and power
 

Similar to What's the Science in Data Science? - Skipper Seabold

Research design
Research designResearch design
Research designBalaji P
 
SAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.pptSAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.pptnarman1402
 
CORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewCORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewTrident University
 
S6 quantitative research 2019
S6 quantitative research 2019S6 quantitative research 2019
S6 quantitative research 2019collierdr709
 
Mock Scientific Research Paper
Mock Scientific Research PaperMock Scientific Research Paper
Mock Scientific Research PaperJessica Howard
 
Differences between qualitative
Differences between qualitativeDifferences between qualitative
Differences between qualitativeShakeel Ahmad
 
QUARTER 4 – WEEK 1-1.pptx
QUARTER 4 – WEEK 1-1.pptxQUARTER 4 – WEEK 1-1.pptx
QUARTER 4 – WEEK 1-1.pptxssuser2123ba
 
Constructs, variables, hypotheses
Constructs, variables, hypothesesConstructs, variables, hypotheses
Constructs, variables, hypothesesPedro Martinez
 
321423152 e-0016087606-session39134-201012122352 (1)
321423152 e-0016087606-session39134-201012122352 (1)321423152 e-0016087606-session39134-201012122352 (1)
321423152 e-0016087606-session39134-201012122352 (1)Iin Angriyani
 
Scientific inquiry
Scientific inquiryScientific inquiry
Scientific inquiryjdougherty
 
Chapter 3 part1-Design of Experiments
Chapter 3 part1-Design of ExperimentsChapter 3 part1-Design of Experiments
Chapter 3 part1-Design of Experimentsnszakir
 
Stats Workshop2010
Stats Workshop2010Stats Workshop2010
Stats Workshop2010anesah
 
Relevance of statistics sgd-slideshare
Relevance of statistics sgd-slideshareRelevance of statistics sgd-slideshare
Relevance of statistics sgd-slideshareSanjeev Deshmukh
 
I/O chapter 2 by Jason Manaois
I/O chapter 2 by Jason ManaoisI/O chapter 2 by Jason Manaois
I/O chapter 2 by Jason ManaoisRoi Xcel
 

Similar to What's the Science in Data Science? - Skipper Seabold (20)

What is research
What is researchWhat is research
What is research
 
Research design
Research designResearch design
Research design
 
SAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.pptSAMPLE_AND_OTHER.ppt
SAMPLE_AND_OTHER.ppt
 
CORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An OverviewCORE: Quantitative Research Methodology: An Overview
CORE: Quantitative Research Methodology: An Overview
 
S6 quantitative research 2019
S6 quantitative research 2019S6 quantitative research 2019
S6 quantitative research 2019
 
Mock Scientific Research Paper
Mock Scientific Research PaperMock Scientific Research Paper
Mock Scientific Research Paper
 
Differences between qualitative
Differences between qualitativeDifferences between qualitative
Differences between qualitative
 
QUARTER 4 – WEEK 1-1.pptx
QUARTER 4 – WEEK 1-1.pptxQUARTER 4 – WEEK 1-1.pptx
QUARTER 4 – WEEK 1-1.pptx
 
Methodology
MethodologyMethodology
Methodology
 
Introduction+to+research
Introduction+to+researchIntroduction+to+research
Introduction+to+research
 
Research Methodology
Research MethodologyResearch Methodology
Research Methodology
 
Constructs, variables, hypotheses
Constructs, variables, hypothesesConstructs, variables, hypotheses
Constructs, variables, hypotheses
 
321423152 e-0016087606-session39134-201012122352 (1)
321423152 e-0016087606-session39134-201012122352 (1)321423152 e-0016087606-session39134-201012122352 (1)
321423152 e-0016087606-session39134-201012122352 (1)
 
Scientific inquiry
Scientific inquiryScientific inquiry
Scientific inquiry
 
Chapter 3 part1-Design of Experiments
Chapter 3 part1-Design of ExperimentsChapter 3 part1-Design of Experiments
Chapter 3 part1-Design of Experiments
 
sampling
samplingsampling
sampling
 
Research Design simplified
Research Design simplifiedResearch Design simplified
Research Design simplified
 
Stats Workshop2010
Stats Workshop2010Stats Workshop2010
Stats Workshop2010
 
Relevance of statistics sgd-slideshare
Relevance of statistics sgd-slideshareRelevance of statistics sgd-slideshare
Relevance of statistics sgd-slideshare
 
I/O chapter 2 by Jason Manaois
I/O chapter 2 by Jason ManaoisI/O chapter 2 by Jason Manaois
I/O chapter 2 by Jason Manaois
 

More from PyData

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshPyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiPyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerPyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaPyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroPyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottPyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroPyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydPyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverPyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardPyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...PyData
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...PyData
 

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...
 

Recently uploaded

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

What's the Science in Data Science? - Skipper Seabold

  • 1. WHAT’S THE SCIENCE IN DATA SCIENCE? Skipper Seabold, Director of Data Science R&D, Product Lead Civis Analytics @jseabold PyData LA Los Angeles, CA October 23, 2018
  • 2. All great ideas come while you’re focused on something else. All motivation to act comes from Twitter.
  • 4. In 120 seconds. Four decades of econom(etr)ic history
  • 5. A focus on research design takes the con out of econometrics 1983 “Let’s Take the Con out of Econometrics” Criticism of the current state of econometric research. Calls for focus on threats to validity (sensitivity analysis) of non-randomized control trials. Quasi-Experimental Design Lends credibility, especially to micro, through “a clear-eyed focus on research design … [such as] it would command in a real experiment.” 1980s- 2000s Randomized Control Trials In parallel, increasing use of randomized control trials in economics, especially in development micro. 1990s- 2000s Worries about External Validity “Counterrevolutionary” worries that the focus on experimental and quasi-experimental research design comes at the expense of external validity. 2000s
  • 6. “Good designs have a beneficial side effect: they typically lend themselves to a simple explanation of empirical methods and a straightforward presentations of results.” These changes lead to an increased relevance in policy decisions
  • 7. First, some context. What does this have to do with data science?
  • 8. Obtain, Scrub, Explore, Model, and iNterpret (OSEMN) Mason and Wiggins, 2010 Have the “ability to [create] prototype-level versions of ... the steps needed to derive new insights or build data products” Analyzing the Analyzers, 2013 Use multidisciplinary methods to understand and have a measurable impact on a business process or product Me, Today Data science exists to drive better business outcomes
  • 9. Multidisciplinary teams use the (Data) Scientific Method to measure impacts Question Start with a product or business question. E.g., how do our marketing campaigns perform? What’s driving employee attrition? Hypothesize Research, if necessary, and write a falsifiable hypothesis. E.g., the ROI on our marketing campaigns is greater than break-even. Research Design Design a strategy that allows you to test your hypothesis, noting all threats to validity. Analyze Analyze all data and evidence. Test for threats to validity. Communicate or Productize Communicate results in a way that stakeholders will understand or engineering can use.
  • 10. The current and coming credibility crisis in data science Question Objectives are often unclear and success is left undefined. Hypothesize Research, if necessary, and write a falsifiable hypothesis. E.g., the ROI on our marketing campaigns is greater than break-even. Research Design Black-box(-ish) predictive models are often the focus. Threats to validity are an afterthought or brushed aside. Analyze Analyze all data and evidence. Test for threats to validity. Communicate or Productize Decision-makers don’t understand the value of data science projects.
  • 11. Well, kind of. So, data science is about running experiments?
  • 12. The randomized control trial is the gold standard of scientific discovery Population Random Assignment to Treatment Treated Control Higher Productivity Split & Measure
  • 13. But sometimes a randomized control trial is a hard internal sell (yes, these are often strawman arguments) The results will be difficult to interpret. “We’ll never really know whether what we tried was the real reason people stayed.” Running a control ad is too expensive. “I have to pay what? Just to buy space for some PSA?” No one wants to be “B.” “That’s taking food out of my mouth.”
  • 14. And even if we could, sometimes experiments can go wrong Threats to Validity Insufficient Randomization Partial Compliance Attrition Spillovers
  • 15. This is where the social science comes in handy.
  • 16. And I’m not just justifying my life choices. This is where the social science comes in handy.
  • 17. And I’m not just justifying my life choices. This is mostly true. This is where the social science comes in handy.
  • 18. In the absence of an experiment, we might try to measure an intervention by running a regression
  • 19. Here are some threats to validity in a regression approach Is our sample truly representative of our population of interest? We didn’t track who attended trainings but did a survey after. 40% responded and 65% of those attended a training. Are we sure that we understand the direction of causation? Low productivity offices may have been targeted for training programs. Did we omit variables that could plausibly explain our outcome of interest? People may pursue training or further education on their own.
  • 20. Avoiding threats to validity Research design strategies from the social sciences
  • 21. Use instrumental variables for overcoming simultaneity or omitted variable biases Can get around problems like “reverse causation,” where the outcome causes changes in the treatment or common unobserved confounders where an unobserved variable affects the outcome and the variable of interest leading to spurious correlation. We can replace X with “instruments” that are correlated with X but not caused by Y or that affect Y but only through X.
  • 22. An example from the Fulton fish market: stormy weather as an instrument
  • 23. Use matching to overcome non-random treatment assignment Makes up for the lack of a properly randomized control group by “matching” treated people to untreated people who look almost exactly like the treated group to create a “pseudo-control” group. Then look at the average treatment effect across groups.
  • 24. An illustration of a matched control experiment Population Non Random Treatment Treated Pseudo- Control Higher Productivity Match & Measure
  • 25. Use difference-in-differences for repeated measures to account for underlying trends Used when you don’t have an RCT, but you observe one or more groups over two or more periods of time, and there is an intervention between time periods.
  • 26. A simple example of difference-in-differences
  • 27. Regression discontinuity exploits thresholds that determine treatment Exploits technical, natural, or randomly occurring cutoffs in treatment assignment to measure the treatment effect. Those observations close to but on opposite sides of the cutoff should be alike in all but whether they are treated or not.
  • 28. A simple example of a regression discontinuity design
  • 29. Somewhere in between a “true experiment” and a quasi-experiment, a survey experiment gives slightly changed survey instruments to randomly selected survey participants. They provide an interesting way to test many hypotheses and are heavily and increasingly used in the social and data sciences, particularly in political science. Survey experiments can be another great tool
  • 30. Some types of survey experiments Survey Experiment Description Change the context or wording of questions. This can be used, for example, to understand how some message or creative content influences a survey taker vs. a control group. Priming & Framing Change whether a sensitive item is included in a list of response options. May be able to avoid types of biases that can arise in other approaches. Can be used to measure brand favorability after some bad publicity, for example. List Change aspects of a hypothetical scenario. When combined with a conjoint design, can be used, for example, to understand how users view different aspects of a potential product without asking every variation of everyone. Scenario- based
  • 31. The attention to research design focuses the discussion
  • 32. By using these methods, you discuss particular threats to validity and whether identifying assumptions are met Method Identifying Assumption Assignment to treatment is random conditional on observed characteristics (Selection on observables; Unconfoundedness). You have covariate balance between the treatment and control groups (Overlap). Matching Look for clumping at the discontinuity. People may adjust their behavior, if they understand the selection mechanism. Regression Discontinuity The instrument z affects the outcome y only through the variable x, not directly (Exclusion Restriction), & the instrument z is correlated with the endogenous variable x (Relevance Restriction). Instrumental Variables
  • 33. What about all the fancy machine learning I hear about?
  • 34. We’re not limited to regression. There have been interesting advances in machine learning aimed at causal inference. Bayesian Additive Regression Trees (BART; Chipman, et al.) Causal Trees & Forests (Imbens & Athey) Interpretable Modeling (LIME, SHAP, etc.) Double Machine Learning (Chernozhukov, Hansen, etc.) G-Estimation (used in epidemiology) SuperLearner (van der Laan & Polley) Post-Selection Inference (Taylor, Barber, Imai & Ratkovic, etc.)
  • 35. A non-random walk through some (mostly) recent papers. A few examples of research design-based thinking at work
  • 36. A non-random walk through some (mostly) recent papers. Pro tip: start or join a journal club at work. A few examples of research design-based thinking at work
  • 37. “Here, There, and Everywhere: Correlated Online Behaviors Can Lead to Overestimates of the Effects of Advertising” A retreat to experiments by Randall A. Lewis, Justin M. Rao, and David H. Reiley
  • 38. Observational experiment methods often overestimate the causal impact of advertising due to “activity bias” ○ 200M impressions for Major Firm on Y! w/ 10% control ○ Tracked sign-ups on a Competitor Firm ○ Both saw more sign-ups for both ads! (And Y! usage) ○ Survey experiment on Amazon Mechanical Turk ○ Treatment (800) and control (800) of ad on Y! activity ○ Treatment and control had the same effects ○ 250m impressions on Yahoo! FP w/ 5% control ○ 5.4% increase in search traffic via experiment ○ Matching (1198%), Regression (800%), Diff-in-diff (100%)
  • 39. “The Unfavorable Economics of Measuring the Returns to Advertising” The limits of measurement by Randall A. Lewis and Justin M. Rao
  • 40. It is extremely difficult but not impossible to measure the effect of an advertising campaign Noisy Effects The standard deviation of sales vs. the average is around 10:1 across many industries. Small Effect Sizes Campaigns have a low spend per-person, and the ROI on a break-even campaign is small. Experiments & Observational Methods Designing an experiment to measure this effect is hard. Observational methods look attractive but overestimate effects due to targeted advertising.
  • 41. The power of an experiment is the likelihood of finding an effect if it exists in the population
  • 42. Experimental design for control vs. treatment digital ads Study Aspect Description New Sales & New Account Sign-Ups Measured Outcome Ran for 2-135 days (median 14); Cost $10k - $600k+; Measured 100k to 10m impressions 25 Experiments Retail Sales (including department stores) & Financial ServicesIndustries
  • 43. Every single experiment studied did not have enough power to measure a precise 10% return Question Measurable? 12 of 25 are severely underpowered; only 3 had sufficient power to conclude they were “wildly profitable” (ROI of 50%). Profitable Campaign Every experiment is underpowered. Greater than 10% Return 9 out of 25 fail to reject the null of no impact; 10 out of 25 have enough power to test. No Impact vs Positive Return Result The powered experiments generally reveal a significant, positive effect for advertising. The median campaign would have to be 9 times larger to have sufficient power. The median retail campaign would have to be 61 times larger. For financial services, 1241 times larger (!).
  • 44. “Courtyard by Marriott: Designing a Hotel Facility with Consumer-Based Marketing Models” Change the course of industry By Jerry Wind, Paul E. Green, Douglas Shifflet, and Marsha Scarbrough
  • 45. “Marriott used conjoint analysis to design a new hotel chain [in the early 80s].” A well-designed survey experiment resulted in a wildly successful product ideation for Marriott
  • 46. The survey design asked two target segments about 7 facets, considering 50 attributes with 2 to 8 levels each 7 FACETS Leisure ServicesRooms SecurityFood LoungeExternal Factors
  • 47. As of 1989, Marriott went from 3 tests cases in 1983 to 90 hotel in 1987 with more than $200 million in sales. The chain was expected to grow to 300 hotels by 1994 with an expected $1 billion in sales. Captured within 4% of predicted market share. Created 3,000 new jobs within Marriott with an expected 14,000 new jobs by 1994. Affected a restructuring of all competitors the midprice-level lodging industry (upgrades, prices, amenities, and followers). How successful was this experiment?
  • 49. Yes, Judea Pearl is a keynote speaker, and I talked about causal inference but didn’t mention do-calculus. Wrapping Up
  • 50. A credibility crisis in data science is preventable Question Work with stakeholders to focus on business relevant and measurable outcomes from the start. Hypothesize Be clear about the causal mechanisms that you will test. Research Design Keep it sophisticatedly simple. A well thought out research design leads to better communication of results. Analyze Be honest about any threats to validity. You can test these in the future. This is scientific progress. Communicate or Productize Go the last mile. This step enables data science to have decision-making relevance.
  • 51. Did I miss some good papers? Come @ me on twitter. Until then, here are some more papers and books on my desk. “A comparison of Approaches to Advertising Measurement: Evidence from Big Field Experiments at Facebook” “Confounding in Survey Experiments” “Using Big Data to Estimate Consumer Surplus: The Case of Uber” Mastering Metrics: The Path from Cause to Effect [Looks good!] “A/B Testing” [Experiments at Microsoft using Bing EXP Platform] “Measuring Consumer Sensitivity to Audio Advertising: A Field Experiment on Pandora Internet Radio” “Attribution Inference for Digital Advertising using Inhomogeneous Poisson Models”