SlideShare a Scribd company logo
1 of 20
CATS4ML:
Crowdsourcing Adverse Test Sets for ML
Welcome!
ˈl ɪ k ə r t
cats4ml.humancomputation.com
● Click on the three dots on the lower corner of
your Google Meets screen.
● Select “Change layout.”
● Choose your preferred layout:
○ Sidebar View
○ Spotlight View
○ Tiled View
● Select “Turn on captions” for Closed
Captioning
Your Google Meet View
Fill out our interest form to hear
about events and opportunities
at goo.gle/neurips-booth-form
Google at NeurIPS 2020
Build
for
everyone
ˈl ɪ k ə r t
TAKE HOME MESSAGE
data is the compass for AI - AI advances where
there is data
data quality must be addressed in AI practices
especially in the way we evaluate AI
improving evaluation of AI must consider ways to
measure variance and capture bias to bring us
one step closer to data excellence
to address variance in AI evaluation we propose a
number of novel metrics for reliability,
significance (metrology for AI) and disagreement
(CrowdTruth)
to address bias in AI evaluation we propose a
novel method for crowdsourcing adverse test
sets for ML models (CATS4ML)
ˈl ɪ k ə r t
TAKE HOME MESSAGE
data is the compass for AI - AI advances where
there is data
data quality must be addressed in AI practices
especially in the way we evaluate AI
improving evaluation of AI must consider ways to
measure variance and capture bias to bring us
one step closer to data excellence
to address variance in AI evaluation we propose a
number of novel metrics for reliability ,
significance (metrology for AI) and disagreement
(CrowdTruth)
to address bias in AI evaluation we propose a
novel method for crowdsourcing adverse test
sets for ML models (CATS4ML)
ˈl ɪ k ə r t
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
reactive
data improvement
ˈl ɪ k ə r t
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
to reach here
we need proactive
data improvement
ˈl ɪ k ə r t
Your AI model is as good as
your evaluation data
… but is your evaluation data
missing relevant examples?
cats4ml.humancomputation.com
ˈl ɪ k ə r t
Your AI model is as good as
your evaluation data
… but is your evaluation data
missing relevant examples?
How can we find such
examples, especially if they are
AI blindspots
(i.e. unknown unknowns)? cats4ml.humancomputation.com
ˈl ɪ k ə r t
offers a crowdsourced solution for finding
blindspots of your AI models
inspired by prior research in human computation
Beat the Machine: Challenging Humans to Find a Predictive Model's “Unknown Unknowns” (Attenberg, Ipeirotis, Provost, 2014)
Vulnerability Reward Program (Google)
CATS4ML Challenge
Crowdsourcing Adverse Test Sets for ML
ˈl ɪ k ə r t
offers a crowdsourced red team
for finding blindspots of your AI models
CATS4ML Challenge
Crowdsourcing Adverse Test Sets for ML
cats4ml.humancomputation.com
ˈl ɪ k ə r t
In this first version of the CATS4ML challenge
participants will discover AI blindspots in the Open Images Dataset
Lipstick?
Airplane?
Car?
Construction worker?
Thanksgiving?
These AI blindspots are real images with visual
patterns that confuse AI models
in ways humans might find meaningful
cats4ml.humancomputation.com
ˈl ɪ k ə r t
image1 label1
image2 label2
…
challenge
participants
labels comparison
Open
Images
dataset
submission
image 1: label 1: lipstick
image 2: label 2: thanksgiving
...
human-in-the-loop verification
image1-label1
image2-label2
...
image1-label1
image2-label2
...
submission scoring
vs.
The Challenge Process
cats4ml.humancomputation.com
ˈl ɪ k ə r t
cats4ml.humancomputation.com
ˈl ɪ k ə r t
cats4ml.humancomputation.com
ˈl ɪ k ə r t
cats4ml.humancomputation.com
ˈl ɪ k ə r t
cats4ml.humancomputation.com
The challenge will run until 30 April, 2021
ˈl ɪ k ə r t
cats4ml.humancomputation.com
Join us now!
● join the data challenge
individually or form a team
● discover interesting AI
blindspots in Open Images
Dataset & contribute these to
the challenge
● be part of this research effort
● winning contributions will be
promoted at next
CrowdCamp 2021
ˈl ɪ k ə r t
The Team
Praveen Paritosh Ka Wong
Lora Aroyo Devi Krishna
ˈl ɪ k ə r t
cats4ml.humancomputation.com
ˈl ɪ k ə r t
Share your voice about ML data!
We are inviting ML professionals to
participate in a short survey to
learn about your challenges and
needs with data.
Scan to participate now!

More Related Content

What's hot

Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
Business Intelligence and Big Data in Cloud
Business Intelligence and Big Data in CloudBusiness Intelligence and Big Data in Cloud
Business Intelligence and Big Data in CloudDing Li
 
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...Carol Smith
 
SEO & Artificial Intelligence: The new rules to stay on top!
SEO & Artificial Intelligence: The new rules to stay on top!SEO & Artificial Intelligence: The new rules to stay on top!
SEO & Artificial Intelligence: The new rules to stay on top!TheFamily
 
M2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparencyM2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparencyBoPeng76
 
Data Science Demystified
Data Science DemystifiedData Science Demystified
Data Science DemystifiedEmily Robinson
 
Codes of Ethics and the Ethics of Code
Codes of Ethics and the Ethics of CodeCodes of Ethics and the Ethics of Code
Codes of Ethics and the Ethics of CodeMark Underwood
 
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in TorontoDynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in TorontoCarol Smith
 
Applications of Machine Learning at USC
Applications of Machine Learning at USCApplications of Machine Learning at USC
Applications of Machine Learning at USCSri Ambati
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine LearningChase Aucoin
 
ESSIR 2013 - IR and Social Media
ESSIR 2013 - IR and Social MediaESSIR 2013 - IR and Social Media
ESSIR 2013 - IR and Social MediaArjen de Vries
 
Panel: AI for Social Good - Fairness, Ethics, Accountability, and Transparency
Panel: AI for Social Good - Fairness, Ethics, Accountability, and TransparencyPanel: AI for Social Good - Fairness, Ethics, Accountability, and Transparency
Panel: AI for Social Good - Fairness, Ethics, Accountability, and TransparencyAmazon Web Services
 
Data Science: Not Just For Big Data
Data Science: Not Just For Big DataData Science: Not Just For Big Data
Data Science: Not Just For Big DataRevolution Analytics
 
Google Cloud - Google's vision on AI
Google Cloud - Google's vision on AIGoogle Cloud - Google's vision on AI
Google Cloud - Google's vision on AIBigDataExpo
 
Applying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessApplying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessRussell Miles
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsKrishna Sankar
 
Digital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoDigital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoLora Aroyo
 
Looking beyond plain text for document representation in the enterprise
Looking beyond plain text for document representation in the enterpriseLooking beyond plain text for document representation in the enterprise
Looking beyond plain text for document representation in the enterpriseArjen de Vries
 

What's hot (19)

Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
Business Intelligence and Big Data in Cloud
Business Intelligence and Big Data in CloudBusiness Intelligence and Big Data in Cloud
Business Intelligence and Big Data in Cloud
 
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa...
 
SEO & Artificial Intelligence: The new rules to stay on top!
SEO & Artificial Intelligence: The new rules to stay on top!SEO & Artificial Intelligence: The new rules to stay on top!
SEO & Artificial Intelligence: The new rules to stay on top!
 
M2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparencyM2 l10 fairness, accountability, and transparency
M2 l10 fairness, accountability, and transparency
 
Data Science Demystified
Data Science DemystifiedData Science Demystified
Data Science Demystified
 
Codes of Ethics and the Ethics of Code
Codes of Ethics and the Ethics of CodeCodes of Ethics and the Ethics of Code
Codes of Ethics and the Ethics of Code
 
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in TorontoDynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto
 
Applications of Machine Learning at USC
Applications of Machine Learning at USCApplications of Machine Learning at USC
Applications of Machine Learning at USC
 
Azure Machine Learning
Azure Machine LearningAzure Machine Learning
Azure Machine Learning
 
ESSIR 2013 - IR and Social Media
ESSIR 2013 - IR and Social MediaESSIR 2013 - IR and Social Media
ESSIR 2013 - IR and Social Media
 
Panel: AI for Social Good - Fairness, Ethics, Accountability, and Transparency
Panel: AI for Social Good - Fairness, Ethics, Accountability, and TransparencyPanel: AI for Social Good - Fairness, Ethics, Accountability, and Transparency
Panel: AI for Social Good - Fairness, Ethics, Accountability, and Transparency
 
Data Science: Not Just For Big Data
Data Science: Not Just For Big DataData Science: Not Just For Big Data
Data Science: Not Just For Big Data
 
Google Cloud - Google's vision on AI
Google Cloud - Google's vision on AIGoogle Cloud - Google's vision on AI
Google Cloud - Google's vision on AI
 
Applying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessApplying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to Business
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
Digital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoDigital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora Aroyo
 
Looking beyond plain text for document representation in the enterprise
Looking beyond plain text for document representation in the enterpriseLooking beyond plain text for document representation in the enterprise
Looking beyond plain text for document representation in the enterprise
 
UseR 2017
UseR 2017UseR 2017
UseR 2017
 

Similar to CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning

Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...polochau
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistRebecca Bilbro
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBigML, Inc
 
Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Krishnaram Kenthapadi
 
VSSML18. Introduction to Machine Learning and the BigML Platform
VSSML18. Introduction to Machine Learning and the BigML PlatformVSSML18. Introduction to Machine Learning and the BigML Platform
VSSML18. Introduction to Machine Learning and the BigML PlatformBigML, Inc
 
Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering BigML, Inc
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Dhiana Deva
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBigML, Inc
 
MLSEV. Machine Learning: Technical Perspective
MLSEV. Machine Learning: Technical PerspectiveMLSEV. Machine Learning: Technical Perspective
MLSEV. Machine Learning: Technical PerspectiveBigML, Inc
 
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsMyth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsApplitools
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptxgdgsurrey
 
All the data and still not enough
All the data and still not enoughAll the data and still not enough
All the data and still not enoughAndreea Bodnari
 
Data Science at The New York Times
Data Science at The New York TimesData Science at The New York Times
Data Science at The New York Timeschris wiggins
 
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...QA or the Highway
 
BSSML17 - Basic Data Transformations
BSSML17 - Basic Data TransformationsBSSML17 - Basic Data Transformations
BSSML17 - Basic Data TransformationsBigML, Inc
 
How to use LLMs in synthesizing training data?
How to use LLMs in synthesizing training data?How to use LLMs in synthesizing training data?
How to use LLMs in synthesizing training data?Benjaminlapid1
 
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineSTQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineAlbert Gareev
 
Keepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech
 
The Disappearing Data Scientist
The Disappearing Data ScientistThe Disappearing Data Scientist
The Disappearing Data ScientistDATAVERSITY
 

Similar to CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning (20)

Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
Human-Centered AI: Scalable, Interactive Tools for Interpretation and Attribu...
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data Scientist
 
BSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, EvaluationsBSSML17 - Introduction, Models, Evaluations
BSSML17 - Introduction, Models, Evaluations
 
Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)Explainable AI in Industry (FAT* 2020 Tutorial)
Explainable AI in Industry (FAT* 2020 Tutorial)
 
VSSML18. Introduction to Machine Learning and the BigML Platform
VSSML18. Introduction to Machine Learning and the BigML PlatformVSSML18. Introduction to Machine Learning and the BigML Platform
VSSML18. Introduction to Machine Learning and the BigML Platform
 
Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering Web UI, Algorithms, and Feature Engineering
Web UI, Algorithms, and Feature Engineering
 
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
Machine Learning: Opening the Pandora's Box - Dhiana Deva @ QCon São Paulo 2019
 
BSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and EvaluationsBSSML16 L1. Introduction, Models, and Evaluations
BSSML16 L1. Introduction, Models, and Evaluations
 
MLSEV. Machine Learning: Technical Perspective
MLSEV. Machine Learning: Technical PerspectiveMLSEV. Machine Learning: Technical Perspective
MLSEV. Machine Learning: Technical Perspective
 
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan LippsMyth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
Myth vs Reality: Understanding AI/ML for QA Automation - w/ Jonathan Lipps
 
2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx2024-02-24_Session 1 - PMLE_UPDATED.pptx
2024-02-24_Session 1 - PMLE_UPDATED.pptx
 
All the data and still not enough
All the data and still not enoughAll the data and still not enough
All the data and still not enough
 
Data Science at The New York Times
Data Science at The New York TimesData Science at The New York Times
Data Science at The New York Times
 
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...
Scale your Testing and Quality with Automation Engineering and ML - Carlos Ki...
 
BSSML17 - Basic Data Transformations
BSSML17 - Basic Data TransformationsBSSML17 - Basic Data Transformations
BSSML17 - Basic Data Transformations
 
How to use LLMs in synthesizing training data?
How to use LLMs in synthesizing training data?How to use LLMs in synthesizing training data?
How to use LLMs in synthesizing training data?
 
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-MagazineSTQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
STQA-Vol9-Issue2-March-2012-Software-Testing-Magazine
 
Keepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivosKeepler Data Tech | Entendiendo tus propios modelos predictivos
Keepler Data Tech | Entendiendo tus propios modelos predictivos
 
Debugging AI
Debugging AIDebugging AI
Debugging AI
 
The Disappearing Data Scientist
The Disappearing Data ScientistThe Disappearing Data Scientist
The Disappearing Data Scientist
 

More from Lora Aroyo

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfLora Aroyo
 
Data excellence: Better data for better AI
Data excellence: Better data for better AIData excellence: Better data for better AI
Data excellence: Better data for better AILora Aroyo
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumLora Aroyo
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorLora Aroyo
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataLora Aroyo
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumLora Aroyo
 
FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18Lora Aroyo
 
Understanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsUnderstanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsLora Aroyo
 
StorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesStorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesLora Aroyo
 
Data Science with Humans in the Loop
Data Science with Humans in the LoopData Science with Humans in the Loop
Data Science with Humans in the LoopLora Aroyo
 
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...Lora Aroyo
 
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017Lora Aroyo
 
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort ZoneMy ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort ZoneLora Aroyo
 
Data Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityData Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityLora Aroyo
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchLora Aroyo
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital AgeLora Aroyo
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to SnapchatLora Aroyo
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyLora Aroyo
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...Lora Aroyo
 
Stitch by Stitch: Annotating Fashion at the Rijksmuseum
Stitch by Stitch: Annotating Fashion at the RijksmuseumStitch by Stitch: Annotating Fashion at the Rijksmuseum
Stitch by Stitch: Annotating Fashion at the RijksmuseumLora Aroyo
 

More from Lora Aroyo (20)

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
 
Data excellence: Better data for better AI
Data excellence: Better data for better AIData excellence: Better data for better AI
Data excellence: Better data for better AI
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH Symposium
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP Demonstrator
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked Data
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
 
FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18
 
Understanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsUnderstanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithms
 
StorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesStorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & Machines
 
Data Science with Humans in the Loop
Data Science with Humans in the LoopData Science with Humans in the Loop
Data Science with Humans in the Loop
 
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
 
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
 
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort ZoneMy ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
 
Data Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityData Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden University
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening Ceremony
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
 
Stitch by Stitch: Annotating Fashion at the Rijksmuseum
Stitch by Stitch: Annotating Fashion at the RijksmuseumStitch by Stitch: Annotating Fashion at the Rijksmuseum
Stitch by Stitch: Annotating Fashion at the Rijksmuseum
 

Recently uploaded

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 

CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning

  • 1. CATS4ML: Crowdsourcing Adverse Test Sets for ML Welcome! ˈl ɪ k ə r t cats4ml.humancomputation.com
  • 2. ● Click on the three dots on the lower corner of your Google Meets screen. ● Select “Change layout.” ● Choose your preferred layout: ○ Sidebar View ○ Spotlight View ○ Tiled View ● Select “Turn on captions” for Closed Captioning Your Google Meet View
  • 3. Fill out our interest form to hear about events and opportunities at goo.gle/neurips-booth-form Google at NeurIPS 2020 Build for everyone
  • 4. ˈl ɪ k ə r t TAKE HOME MESSAGE data is the compass for AI - AI advances where there is data data quality must be addressed in AI practices especially in the way we evaluate AI improving evaluation of AI must consider ways to measure variance and capture bias to bring us one step closer to data excellence to address variance in AI evaluation we propose a number of novel metrics for reliability, significance (metrology for AI) and disagreement (CrowdTruth) to address bias in AI evaluation we propose a novel method for crowdsourcing adverse test sets for ML models (CATS4ML)
  • 5. ˈl ɪ k ə r t TAKE HOME MESSAGE data is the compass for AI - AI advances where there is data data quality must be addressed in AI practices especially in the way we evaluate AI improving evaluation of AI must consider ways to measure variance and capture bias to bring us one step closer to data excellence to address variance in AI evaluation we propose a number of novel metrics for reliability , significance (metrology for AI) and disagreement (CrowdTruth) to address bias in AI evaluation we propose a novel method for crowdsourcing adverse test sets for ML models (CATS4ML)
  • 6. ˈl ɪ k ə r t The Life of AI Data “It exists!” → “It is bigger!” → “It is better!” but before it got better ... reactive data improvement
  • 7. ˈl ɪ k ə r t The Life of AI Data “It exists!” → “It is bigger!” → “It is better!” to reach here we need proactive data improvement
  • 8. ˈl ɪ k ə r t Your AI model is as good as your evaluation data … but is your evaluation data missing relevant examples? cats4ml.humancomputation.com
  • 9. ˈl ɪ k ə r t Your AI model is as good as your evaluation data … but is your evaluation data missing relevant examples? How can we find such examples, especially if they are AI blindspots (i.e. unknown unknowns)? cats4ml.humancomputation.com
  • 10. ˈl ɪ k ə r t offers a crowdsourced solution for finding blindspots of your AI models inspired by prior research in human computation Beat the Machine: Challenging Humans to Find a Predictive Model's “Unknown Unknowns” (Attenberg, Ipeirotis, Provost, 2014) Vulnerability Reward Program (Google) CATS4ML Challenge Crowdsourcing Adverse Test Sets for ML
  • 11. ˈl ɪ k ə r t offers a crowdsourced red team for finding blindspots of your AI models CATS4ML Challenge Crowdsourcing Adverse Test Sets for ML cats4ml.humancomputation.com
  • 12. ˈl ɪ k ə r t In this first version of the CATS4ML challenge participants will discover AI blindspots in the Open Images Dataset Lipstick? Airplane? Car? Construction worker? Thanksgiving? These AI blindspots are real images with visual patterns that confuse AI models in ways humans might find meaningful cats4ml.humancomputation.com
  • 13. ˈl ɪ k ə r t image1 label1 image2 label2 … challenge participants labels comparison Open Images dataset submission image 1: label 1: lipstick image 2: label 2: thanksgiving ... human-in-the-loop verification image1-label1 image2-label2 ... image1-label1 image2-label2 ... submission scoring vs. The Challenge Process cats4ml.humancomputation.com
  • 14. ˈl ɪ k ə r t cats4ml.humancomputation.com
  • 15. ˈl ɪ k ə r t cats4ml.humancomputation.com
  • 16. ˈl ɪ k ə r t cats4ml.humancomputation.com
  • 17. ˈl ɪ k ə r t cats4ml.humancomputation.com The challenge will run until 30 April, 2021
  • 18. ˈl ɪ k ə r t cats4ml.humancomputation.com Join us now! ● join the data challenge individually or form a team ● discover interesting AI blindspots in Open Images Dataset & contribute these to the challenge ● be part of this research effort ● winning contributions will be promoted at next CrowdCamp 2021
  • 19. ˈl ɪ k ə r t The Team Praveen Paritosh Ka Wong Lora Aroyo Devi Krishna ˈl ɪ k ə r t cats4ml.humancomputation.com
  • 20. ˈl ɪ k ə r t Share your voice about ML data! We are inviting ML professionals to participate in a short survey to learn about your challenges and needs with data. Scan to participate now!