SlideShare a Scribd company logo
1 of 34
Download to read offline
Measuring Model Fairness
Stephen Hoover
PyData LA
2018 October 23
Measuring Model Fairness
Stephen Hoover
PyData LA
2018 October 23
J. Henry Hinnefeld, Peter Cooman, Nat Mammo, and Rupert Deese,
"Evaluating Fairness Metrics in the Presence of Dataset Bias";
https://arxiv.org/abs/1809.09245
What is fair?
https://pxhere.com/en/photo/199386
Models determine whether you can buy a home...
https://www.flickr.com/photos/cafecredit/26700612773
and how long you spend on parole...
https://www.flickr.com/photos/archivesnz/27160240521
and what advertisements you see.
https://www.flickr.com/photos/dno1967b/8283313605
and what advertisements you see.
https://www.flickr.com/photos/44313045@N08/6290270129
How do you measure if your model is fair?
https://pixabay.com/en/legal-scales-of-justice-judge-450202/
How do you measure if your model is fair?
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
How do you measure if your model is fair?
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
How do you measure if your model is fair?
http://www.northpointeinc.com/files/publications/Criminal-Justice-Behavior-COMPAS.pdf
https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
The bias comes from your data.
https://commons.wikimedia.org/wiki/File:Bias_t.png
Subtlety #1: Different groups can have different ground truth positive rates
https://www.breastcancer.org/symptoms/understand_bc/statistics
Certain fairness metrics make assumptions about the balance of ground
truth positive rates
Disparate Impact is a popular metric which assumes that the ground truth positive
rates for both groups are the same
Certifying and removing disparate impact, Feldman et al. (https://arxiv.org/abs/1412.3756)
Subtlety #2: Your data are a biased representation of ground truth
Datasets can contain label bias when a protected attribute affects the way
individuals are assigned labels.
In addition, the results indicate that students from African American and Latino families are more likely
than their White peers to receive expulsion or out of school suspension as consequences for the same or
similar problem behavior.
A dataset for predicting “student problem behavior” that used “has been suspended”
for its label could contain label bias.
”Race is not neutral: A national investigation of African American and Latino disproportionality in school discipline.” Skiba et al.
Subtlety #2: Your data are a biased representation of ground truth
Datasets can contain sample bias when a protected attribute affects the sampling
process that generated your data
We find that persons of African and Hispanic descent were stopped more frequently than whites, even
after controlling for precinct variability and race-specific estimates of crime participation.
A dataset for predicting contraband possession that used stop-and-frisk data could
contain sample bias.
”An analysis of the NYPD’s stop-and-frisk policy in the context of claims of racial bias” Gelman et al.
Certain fairness metrics are based on accuracy with respect to possibly
biased labels
Equal Opportunity is a popular metric which compares the True Positive rates
between protected groups
Equality of Opportunity in Supervised Learning, Hardt et al. (https://arxiv.org/pdf/1610.02413.pdf)
Subtlety #3: It matters whether the modeled decision’s consequences are
positive or negative
When a model is punitive you might care more about False Positives.
When a model is assistive you might care more about False Negatives.
The point is you have to think about these questions.
We can’t math our way out of thinking about fairness
You still need a person to think about the ethical implications of your model
Originally people thought ‘Models are just math, so they must be fair’
→ definitely not true
Now there’s a temptation to say ‘Adding this constraint will make my model fair’
→ still not automatically true
Can we detect real bias in real data?
Spoiler: it can be tough!
● Start with real data from Civis's work
○ Features are demographics, outcome is a probability
○ Consider racial bias; white versus African American
Can we detect real bias in real data?
Create artificial datasets with known causal bias; then we'll see if we can detect it.
● Start with real data from Civis's work
○ Features are demographics, outcome is a probability
○ Consider racial bias; white versus African American
● Two datasets:
○ Artificially balanced: select white subset and randomly re-assign race
○ Unmodified (imbalanced) dataset
Next introduce known sample and label bias
Label bias: you're in the dataset, but protected class affects your label
Use the original dataset but modify the labels
For unbiased data, use
Next introduce known sample and label bias
Sample bias: protected class affects whether you're in the sample at all
Create a modified dataset with labels taken from the original data
For unbiased data, use
Two experiments, four datasets each
Train an elastic net logistic regression classification model on each dataset
Train, predict, then measure
Apply each fairness metric to the model's probability predictions
Disparate Impact
Equal Opportunity
Equal Mis-Opportunity
Difference in Average Residuals
With balanced ground truth, all metrics detect bias
Good news!
No
bias
Sam
ple
bias
Label bias
Both
No bias
Sample bias
Label bias
Both
No bias
Sample bias
Label bias
Both
No
bias
Sam
ple
bias
Label bias
Both
No bias
Sample bias
Label bias
Both
No bias
Sample bias
Label bias
Both
With imbalanced ground truth, all metrics still detect bias...
...even when there isn't any bias in the "truth".
No
bias
Sam
ple
bias
Label bias
Both
No bias
Sample bias
Label bias
Both
No bias
Sample bias
Label bias
Both
Label bias is particularly hard to distinguish from imbalanced ground truth
Which is fair, since reality has label bias in this case.
There's no one-size fits all solution
Except for "think hard about your inputs and your outputs"
● These metrics can help
There's no one-size fits all solution
Except for "think hard about your inputs and your outputs"
● These metrics can help
● There are methods you can use in training to make your model more fair
○ (assuming you know what "fair" means for you)
https://arxiv.org/pdf/1412.3756.pdf
https://arxiv.org/pdf/1806.08010.pdf
http://proceedings.mlr.press/v28/zemel13.pdf
There's no one-size fits all solution
Except for "think hard about your inputs and your outputs"
● These metrics can help
● There are methods you can use in training to make your model more fair
● Use a diverse team to create the models!
https://imgur.com/gallery/hem9m
There's no one-size fits all solution
Except for "think hard about your inputs and your outputs"
● These metrics can help
● There are methods you can use in training to make your model more fair
● Use a diverse team to create the models!
● Know your data and check your predictions
https://pixabay.com/en/isolated-thinking-freedom-ape-1052504/
“Big Data processes codify the past. They do not invent the future. Doing
that requires moral imagination, and that’s something only humans can
provide. We have to explicitly embed better values into our algorithms,
creating Big Data models that follow our ethical lead. Sometimes that will
mean putting fairness ahead of profit.”
― Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases
Inequality and Threatens Democracy
“Big Data processes codify the past. They do not invent the future. Doing
that requires moral imagination, and that’s something only humans can
provide. We have to explicitly embed better values into our algorithms,
creating Big Data models that follow our ethical lead. Sometimes that will
mean putting fairness ahead of profit.”
― Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases
Inequality and Threatens Democracy
J. Henry Hinnefeld, Peter Cooman, Nat Mammo, and Rupert Deese,
"Evaluating Fairness Metrics in the Presence of Dataset Bias";
https://arxiv.org/abs/1809.09245
Stephen Hoover
@StephenActual

More Related Content

What's hot

Millner teacha15 math_presentation
Millner teacha15 math_presentationMillner teacha15 math_presentation
Millner teacha15 math_presentationlynmillner
 
The Ethics of AI – dealing with difficult choices in a non-binary world
The Ethics of AI – dealing with difficult choices in a non-binary worldThe Ethics of AI – dealing with difficult choices in a non-binary world
The Ethics of AI – dealing with difficult choices in a non-binary worldEric Reiss
 
MLSEV. Cluster Analysis and Anomaly Detection
MLSEV. Cluster Analysis and Anomaly DetectionMLSEV. Cluster Analysis and Anomaly Detection
MLSEV. Cluster Analysis and Anomaly DetectionBigML, Inc
 
Ethical issues facing Artificial Intelligence
Ethical issues facing Artificial IntelligenceEthical issues facing Artificial Intelligence
Ethical issues facing Artificial IntelligenceRah Abdelhak
 
Website Wellness and Preventive Care
Website Wellness and Preventive CareWebsite Wellness and Preventive Care
Website Wellness and Preventive CareNewCity
 
Identification1
Identification1Identification1
Identification1veesingh
 
Ethical Considerations in the Design of Artificial Intelligence
Ethical Considerations in the Design of Artificial IntelligenceEthical Considerations in the Design of Artificial Intelligence
Ethical Considerations in the Design of Artificial IntelligenceJohn C. Havens
 

What's hot (8)

ACP Digging Deeper
ACP Digging DeeperACP Digging Deeper
ACP Digging Deeper
 
Millner teacha15 math_presentation
Millner teacha15 math_presentationMillner teacha15 math_presentation
Millner teacha15 math_presentation
 
The Ethics of AI – dealing with difficult choices in a non-binary world
The Ethics of AI – dealing with difficult choices in a non-binary worldThe Ethics of AI – dealing with difficult choices in a non-binary world
The Ethics of AI – dealing with difficult choices in a non-binary world
 
MLSEV. Cluster Analysis and Anomaly Detection
MLSEV. Cluster Analysis and Anomaly DetectionMLSEV. Cluster Analysis and Anomaly Detection
MLSEV. Cluster Analysis and Anomaly Detection
 
Ethical issues facing Artificial Intelligence
Ethical issues facing Artificial IntelligenceEthical issues facing Artificial Intelligence
Ethical issues facing Artificial Intelligence
 
Website Wellness and Preventive Care
Website Wellness and Preventive CareWebsite Wellness and Preventive Care
Website Wellness and Preventive Care
 
Identification1
Identification1Identification1
Identification1
 
Ethical Considerations in the Design of Artificial Intelligence
Ethical Considerations in the Design of Artificial IntelligenceEthical Considerations in the Design of Artificial Intelligence
Ethical Considerations in the Design of Artificial Intelligence
 

Similar to Measuring Model Fairness - Stephen Hoover

Measures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairnessMeasures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairnessManojit Nandi
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AISeth Grimes
 
When the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biasesWhen the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biasesClément DUFFAU
 
Designing Against a Data Dystopia
Designing Against a Data DystopiaDesigning Against a Data Dystopia
Designing Against a Data DystopiaAgnes Pyrchla
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learningSara Hooker
 
Nabep analytics presentation
Nabep analytics presentationNabep analytics presentation
Nabep analytics presentationaarongblack1
 
IE_expressyourself_EssayH
IE_expressyourself_EssayHIE_expressyourself_EssayH
IE_expressyourself_EssayHjk6653284
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairnessAnthonyMelson
 
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...Data Driven Innovation
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learningRahul Sahai
 
The Data Errors we Make by Sean Taylor at Big Data Spain 2017
The Data Errors we Make by Sean Taylor at Big Data Spain 2017The Data Errors we Make by Sean Taylor at Big Data Spain 2017
The Data Errors we Make by Sean Taylor at Big Data Spain 2017Big Data Spain
 
AI Fails: Avoiding bias in your systems
AI Fails: Avoiding bias in your systemsAI Fails: Avoiding bias in your systems
AI Fails: Avoiding bias in your systemsDr Janet Bastiman
 
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversTurning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversUNCResearchHub
 
Statistics Assignment Help
Statistics Assignment HelpStatistics Assignment Help
Statistics Assignment Helphwmsocial
 
Lane-SlidesMania.pptx
Lane-SlidesMania.pptxLane-SlidesMania.pptx
Lane-SlidesMania.pptxAngeCustodio
 
Altron presentation on Emerging Technologies: Data Science and Artificial Int...
Altron presentation on Emerging Technologies: Data Science and Artificial Int...Altron presentation on Emerging Technologies: Data Science and Artificial Int...
Altron presentation on Emerging Technologies: Data Science and Artificial Int...Robert Williams
 
Ethics for Conversational AI
Ethics for Conversational AIEthics for Conversational AI
Ethics for Conversational AIVerena Rieser
 

Similar to Measuring Model Fairness - Stephen Hoover (20)

Measures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairnessMeasures and mismeasures of algorithmic fairness
Measures and mismeasures of algorithmic fairness
 
Fairness in Machine Learning and AI
Fairness in Machine Learning and AIFairness in Machine Learning and AI
Fairness in Machine Learning and AI
 
When the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biasesWhen the AIs failures send us back to our own societal biases
When the AIs failures send us back to our own societal biases
 
Designing Against a Data Dystopia
Designing Against a Data DystopiaDesigning Against a Data Dystopia
Designing Against a Data Dystopia
 
Module 1 introduction to machine learning
Module 1  introduction to machine learningModule 1  introduction to machine learning
Module 1 introduction to machine learning
 
Nabep analytics presentation
Nabep analytics presentationNabep analytics presentation
Nabep analytics presentation
 
IE_expressyourself_EssayH
IE_expressyourself_EssayHIE_expressyourself_EssayH
IE_expressyourself_EssayH
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairness
 
Algorithmic fairness
Algorithmic fairnessAlgorithmic fairness
Algorithmic fairness
 
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...
Don't blindly trust your ML System, it may change your life (Azzurra Ragone, ...
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
The Data Errors we Make by Sean Taylor at Big Data Spain 2017
The Data Errors we Make by Sean Taylor at Big Data Spain 2017The Data Errors we Make by Sean Taylor at Big Data Spain 2017
The Data Errors we Make by Sean Taylor at Big Data Spain 2017
 
AI Fails: Avoiding bias in your systems
AI Fails: Avoiding bias in your systemsAI Fails: Avoiding bias in your systems
AI Fails: Avoiding bias in your systems
 
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem SolversTurning Data into Infographics: An Interactive Workshop for Problem Solvers
Turning Data into Infographics: An Interactive Workshop for Problem Solvers
 
Statistics Assignment Help
Statistics Assignment HelpStatistics Assignment Help
Statistics Assignment Help
 
Lane-SlidesMania.pptx
Lane-SlidesMania.pptxLane-SlidesMania.pptx
Lane-SlidesMania.pptx
 
Altron presentation on Emerging Technologies: Data Science and Artificial Int...
Altron presentation on Emerging Technologies: Data Science and Artificial Int...Altron presentation on Emerging Technologies: Data Science and Artificial Int...
Altron presentation on Emerging Technologies: Data Science and Artificial Int...
 
Ethics for Conversational AI
Ethics for Conversational AIEthics for Conversational AI
Ethics for Conversational AI
 
A.I.pptx
A.I.pptxA.I.pptx
A.I.pptx
 
intro_big_data.pptx
intro_big_data.pptxintro_big_data.pptx
intro_big_data.pptx
 

More from PyData

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshPyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiPyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerPyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaPyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroPyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottPyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroPyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydPyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldPyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardPyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...PyData
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...PyData
 

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 
Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...Deprecating the state machine: building conversational AI with the Rasa stack...
Deprecating the state machine: building conversational AI with the Rasa stack...
 

Recently uploaded

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Measuring Model Fairness - Stephen Hoover

  • 1. Measuring Model Fairness Stephen Hoover PyData LA 2018 October 23
  • 2. Measuring Model Fairness Stephen Hoover PyData LA 2018 October 23 J. Henry Hinnefeld, Peter Cooman, Nat Mammo, and Rupert Deese, "Evaluating Fairness Metrics in the Presence of Dataset Bias"; https://arxiv.org/abs/1809.09245
  • 4. Models determine whether you can buy a home... https://www.flickr.com/photos/cafecredit/26700612773
  • 5. and how long you spend on parole... https://www.flickr.com/photos/archivesnz/27160240521
  • 6. and what advertisements you see. https://www.flickr.com/photos/dno1967b/8283313605
  • 7. and what advertisements you see. https://www.flickr.com/photos/44313045@N08/6290270129
  • 8. How do you measure if your model is fair? https://pixabay.com/en/legal-scales-of-justice-judge-450202/
  • 9. How do you measure if your model is fair? https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
  • 10. How do you measure if your model is fair? https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
  • 11. How do you measure if your model is fair? http://www.northpointeinc.com/files/publications/Criminal-Justice-Behavior-COMPAS.pdf https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
  • 12. The bias comes from your data. https://commons.wikimedia.org/wiki/File:Bias_t.png
  • 13. Subtlety #1: Different groups can have different ground truth positive rates https://www.breastcancer.org/symptoms/understand_bc/statistics
  • 14. Certain fairness metrics make assumptions about the balance of ground truth positive rates Disparate Impact is a popular metric which assumes that the ground truth positive rates for both groups are the same Certifying and removing disparate impact, Feldman et al. (https://arxiv.org/abs/1412.3756)
  • 15. Subtlety #2: Your data are a biased representation of ground truth Datasets can contain label bias when a protected attribute affects the way individuals are assigned labels. In addition, the results indicate that students from African American and Latino families are more likely than their White peers to receive expulsion or out of school suspension as consequences for the same or similar problem behavior. A dataset for predicting “student problem behavior” that used “has been suspended” for its label could contain label bias. ”Race is not neutral: A national investigation of African American and Latino disproportionality in school discipline.” Skiba et al.
  • 16. Subtlety #2: Your data are a biased representation of ground truth Datasets can contain sample bias when a protected attribute affects the sampling process that generated your data We find that persons of African and Hispanic descent were stopped more frequently than whites, even after controlling for precinct variability and race-specific estimates of crime participation. A dataset for predicting contraband possession that used stop-and-frisk data could contain sample bias. ”An analysis of the NYPD’s stop-and-frisk policy in the context of claims of racial bias” Gelman et al.
  • 17. Certain fairness metrics are based on accuracy with respect to possibly biased labels Equal Opportunity is a popular metric which compares the True Positive rates between protected groups Equality of Opportunity in Supervised Learning, Hardt et al. (https://arxiv.org/pdf/1610.02413.pdf)
  • 18. Subtlety #3: It matters whether the modeled decision’s consequences are positive or negative When a model is punitive you might care more about False Positives. When a model is assistive you might care more about False Negatives. The point is you have to think about these questions.
  • 19. We can’t math our way out of thinking about fairness You still need a person to think about the ethical implications of your model Originally people thought ‘Models are just math, so they must be fair’ → definitely not true Now there’s a temptation to say ‘Adding this constraint will make my model fair’ → still not automatically true
  • 20. Can we detect real bias in real data? Spoiler: it can be tough! ● Start with real data from Civis's work ○ Features are demographics, outcome is a probability ○ Consider racial bias; white versus African American
  • 21. Can we detect real bias in real data? Create artificial datasets with known causal bias; then we'll see if we can detect it. ● Start with real data from Civis's work ○ Features are demographics, outcome is a probability ○ Consider racial bias; white versus African American ● Two datasets: ○ Artificially balanced: select white subset and randomly re-assign race ○ Unmodified (imbalanced) dataset
  • 22. Next introduce known sample and label bias Label bias: you're in the dataset, but protected class affects your label Use the original dataset but modify the labels For unbiased data, use
  • 23. Next introduce known sample and label bias Sample bias: protected class affects whether you're in the sample at all Create a modified dataset with labels taken from the original data For unbiased data, use
  • 24. Two experiments, four datasets each Train an elastic net logistic regression classification model on each dataset
  • 25. Train, predict, then measure Apply each fairness metric to the model's probability predictions Disparate Impact Equal Opportunity Equal Mis-Opportunity Difference in Average Residuals
  • 26. With balanced ground truth, all metrics detect bias Good news! No bias Sam ple bias Label bias Both No bias Sample bias Label bias Both No bias Sample bias Label bias Both
  • 27. No bias Sam ple bias Label bias Both No bias Sample bias Label bias Both No bias Sample bias Label bias Both With imbalanced ground truth, all metrics still detect bias... ...even when there isn't any bias in the "truth".
  • 28. No bias Sam ple bias Label bias Both No bias Sample bias Label bias Both No bias Sample bias Label bias Both Label bias is particularly hard to distinguish from imbalanced ground truth Which is fair, since reality has label bias in this case.
  • 29. There's no one-size fits all solution Except for "think hard about your inputs and your outputs" ● These metrics can help
  • 30. There's no one-size fits all solution Except for "think hard about your inputs and your outputs" ● These metrics can help ● There are methods you can use in training to make your model more fair ○ (assuming you know what "fair" means for you) https://arxiv.org/pdf/1412.3756.pdf https://arxiv.org/pdf/1806.08010.pdf http://proceedings.mlr.press/v28/zemel13.pdf
  • 31. There's no one-size fits all solution Except for "think hard about your inputs and your outputs" ● These metrics can help ● There are methods you can use in training to make your model more fair ● Use a diverse team to create the models! https://imgur.com/gallery/hem9m
  • 32. There's no one-size fits all solution Except for "think hard about your inputs and your outputs" ● These metrics can help ● There are methods you can use in training to make your model more fair ● Use a diverse team to create the models! ● Know your data and check your predictions https://pixabay.com/en/isolated-thinking-freedom-ape-1052504/
  • 33. “Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.” ― Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
  • 34. “Big Data processes codify the past. They do not invent the future. Doing that requires moral imagination, and that’s something only humans can provide. We have to explicitly embed better values into our algorithms, creating Big Data models that follow our ethical lead. Sometimes that will mean putting fairness ahead of profit.” ― Cathy O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy J. Henry Hinnefeld, Peter Cooman, Nat Mammo, and Rupert Deese, "Evaluating Fairness Metrics in the Presence of Dataset Bias"; https://arxiv.org/abs/1809.09245 Stephen Hoover @StephenActual