The document introduces CATS4ML (Crowdsourcing Adverse Test Sets for ML), a crowdsourcing challenge for discovering blindspots in machine learning models. Participants look for images in the Open Images Dataset that confuse AI models and submit them as image-label pairs. The goal is to crowdsource adverse test sets that capture bias and improve the evaluation of AI. The challenge runs until 30 April 2021 and invites individuals and teams to find interesting AI blindspots and contribute them for human-in-the-loop verification and scoring. Winning contributions will be promoted at CrowdCamp 2021.
2. Your Google Meet View
● Click on the three dots in the lower corner of your Google Meet screen.
● Select “Change layout.”
● Choose your preferred layout:
○ Sidebar View
○ Spotlight View
○ Tiled View
● Select “Turn on captions” for closed captioning.
3. Fill out our interest form to hear about events and opportunities at goo.gle/neurips-booth-form
Google at NeurIPS 2020
Build for everyone
4. ˈl ɪ k ə r t
TAKE HOME MESSAGE
● Data is the compass for AI: AI advances where there is data.
● Data quality must be addressed in AI practices, especially in the way we evaluate AI.
● Improving evaluation of AI must consider ways to measure variance and capture bias, to bring us one step closer to data excellence.
● To address variance in AI evaluation, we propose a number of novel metrics for reliability, significance (metrology for AI), and disagreement (CrowdTruth).
● To address bias in AI evaluation, we propose a novel method for crowdsourcing adverse test sets for ML models (CATS4ML).
6. The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ... reactive data improvement
7. The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
to reach here we need proactive data improvement
9. Your AI model is as good as your evaluation data
… but is your evaluation data missing relevant examples?
How can we find such examples, especially if they are AI blindspots (i.e. unknown unknowns)?
cats4ml.humancomputation.com
10. CATS4ML Challenge: Crowdsourcing Adverse Test Sets for ML
offers a crowdsourced solution for finding blindspots of your AI models
inspired by prior research in human computation:
● Beat the Machine: Challenging Humans to Find a Predictive Model's “Unknown Unknowns” (Attenberg, Ipeirotis, Provost, 2014)
● Vulnerability Reward Program (Google)
11. CATS4ML Challenge: Crowdsourcing Adverse Test Sets for ML
offers a crowdsourced red team for finding blindspots of your AI models
12. In this first version of the CATS4ML challenge, participants will discover AI blindspots in the Open Images Dataset.
Lipstick? Airplane? Car? Construction worker? Thanksgiving?
These AI blindspots are real images with visual patterns that confuse AI models in ways humans might find meaningful.
13. The Challenge Process
[Diagram. Elements shown: challenge participants; the Open Images dataset; a submission of image-label pairs (e.g. image 1: lipstick, image 2: thanksgiving); labels comparison; human-in-the-loop verification; submission scoring.]
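The challenge process (participants submit image-label pairs, the pairs are compared against existing labels, verified by humans, and then scored) can be sketched roughly as follows. This is a minimal illustration only: the `Submission` type, the `score_submissions` function, the dictionary layout, and the scoring rule (one point when human verification agrees with the participant and disagrees with the model) are all assumptions made for this example, not the challenge's published scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """Hypothetical record for one submitted image-label pair."""
    image_id: str   # assumed identifier of an Open Images image
    label: str      # target label, e.g. "lipstick"
    claimed: bool   # participant's claim: is the label present in the image?


def score_submissions(submissions, model_labels, verified_labels):
    """Count pairs where humans agree with the participant but not the model.

    model_labels and verified_labels map (image_id, label) -> bool.
    The scoring rule is an assumption made for illustration.
    """
    score = 0
    for s in submissions:
        key = (s.image_id, s.label)
        if key not in model_labels or key not in verified_labels:
            continue  # skip pairs that have not been compared or verified
        human = verified_labels[key]
        model = model_labels[key]
        if human == s.claimed and human != model:
            score += 1  # a confirmed blindspot: the model was wrong
    return score


# Toy data, invented for this sketch
submissions = [
    Submission("img1", "lipstick", False),
    Submission("img2", "thanksgiving", True),
    Submission("img3", "car", True),
]
model_labels = {
    ("img1", "lipstick"): True,
    ("img2", "thanksgiving"): False,
    ("img3", "car"): True,
}
verified_labels = {
    ("img1", "lipstick"): False,
    ("img2", "thanksgiving"): False,
    ("img3", "car"): True,
}

total = score_submissions(submissions, model_labels, verified_labels)
print(total)
```

In this toy data only the first pair counts: the human verifiers side with the participant and against the model. The second pair fails verification, and the third is a case where the model was already correct, so neither reveals a blindspot under the assumed rule.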
17. The challenge will run until 30 April 2021
18. Join us now!
cats4ml.humancomputation.com
● Join the data challenge individually or form a team.
● Discover interesting AI blindspots in the Open Images Dataset and contribute them to the challenge.
● Be part of this research effort.
● Winning contributions will be promoted at CrowdCamp 2021.
19. The Team
Praveen Paritosh, Ka Wong, Lora Aroyo, Devi Krishna
20. Share your voice about ML data!
We are inviting ML professionals to participate in a short survey so that we can learn about your challenges and needs with data.
Scan to participate now!