When Recommendation Systems Go Bad: Machine learning and recommendation systems have changed the way we interact with not just the internet, but many of the basic products and services we use to run our lives.
While the reach and impact of big data and algorithms will continue to grow, how do we ensure that people are treated justly? Certainly there are already algorithms in use that determine if someone will receive a job interview or be accepted into a school. Misuse of data in many of these cases could have serious public relations, legal, and ethical consequences.
As the people that build these systems, we have a social responsibility to consider their effect on humanity, and we should do whatever we can to prevent these models from perpetuating some of the prejudice and bias that exist in our society today.
In this talk I intend to cover some examples of recommendation systems that have gone wrong across various industries, as well as why they went wrong and what can be done about it. The first step towards solving this larger issue is raising awareness, but there are concrete technical approaches that can be employed as well. Three that will be covered are:
- Accepting simplicity with interpretable models.
- Data segregation via ensemble modeling.
- Designing test data sets for capturing unintended bias.
15. Recommendation Systems: Learning to Rank
● Active area of research
● Use an ML model to solve a ranking problem
○ Pointwise: logistic regression on a binary label; use the output for ranking
○ Listwise: optimize the entire list
● Performance metrics:
○ Mean Average Precision (MAP)
○ Precision at K (P@K)
○ Discounted Cumulative Gain (DCG)
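The ranking metrics above are straightforward to compute from scratch; a minimal sketch in plain Python, using toy data and illustrative item IDs:

```python
import math

def precision_at_k(relevant, ranked, k):
    """Fraction of the top-k ranked items that are relevant (P@K)."""
    return sum(1 for item in ranked[:k] if item in relevant) / k

def dcg_at_k(gains, k):
    """Discounted Cumulative Gain: each position's gain is
    discounted logarithmically by its rank."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

# Toy ranking: items A..E, where A and C are the relevant ones.
relevant = {"A", "C"}
ranked = ["A", "B", "C", "D", "E"]
print(precision_at_k(relevant, ranked, 3))  # 2 of the top 3 are relevant
print(dcg_at_k([1, 0, 1, 0, 0], 5))        # 1/log2(2) + 1/log2(4) = 1.5
```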
17. Data Science Impacts Lives
● Ads you see
● Apps you download
● Friends’ activity / Facebook feed
● News you’re exposed to
● Whether a product is available
● Whether you can get a ride
● The price you pay for things
● Admittance into college
● Job openings you find/get
● Whether you can get a loan
19. You just wanted a kitchen scale; now Amazon thinks you’re a drug dealer
22. Ego
● Member/customer/user first
● Focus on building the best product, not on being the most clever data scientist
● It’s much harder to spin a positive user story than a story about how smart you are
24. Ethics
● We have accepted that machine learning can seem creepy; how do we prevent it from becoming immoral?
● We have an ethical obligation not to teach machines to be prejudiced.
27. Interpretable Models
● For simple problems, simple solutions are often worth a small concession in performance
● Inspectable models make it easier to debug problems in data collection, feature engineering, etc.
● Only include features that work the way you want
● Don’t include feature interactions that you don’t want
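As a sketch of the inspectability argument: a plain logistic regression keeps one weight per feature, so every learned coefficient can be read and sanity-checked. This is a minimal from-scratch version on toy data; the feature names are illustrative, not from any real system.

```python
import math

def train_logreg(X, y, lr=0.1, epochs=1000):
    """Plain logistic regression via per-sample gradient descent.
    Returns one weight per feature plus a bias, so every learned
    coefficient can be inspected directly."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            for j in range(len(w)):
                w[j] -= lr * err * xi[j]
            b -= lr * err
    return w, b

# Toy data: features = [shared_interest, scaled_distance],
# label = did the user join the group?
X = [[1, 0.1], [1, 0.9], [0, 0.2], [0, 0.8], [1, 0.3], [0, 0.5]]
y = [1, 1, 0, 0, 1, 0]
w, b = train_logreg(X, y)

# Inspection step: a clearly positive weight on shared_interest is what
# we expect; anything else flags a data or feature-engineering bug.
print("weights:", w, "bias:", b)
```

The point is not the tiny model but the workflow: when each weight maps to a named feature, "does this feature work the way I want?" becomes a question you can answer by looking.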
29. Feature Engineering and Interactions
● Good Feature:
○ Join! You’re interested in Tech x Meetup is about Tech
● Good Feature:
○ Don’t join! Group is intended only for Women x You are a Man
● Bad Feature:
○ Don’t join! Group is mostly Men x You are a Woman
● Horrible Feature:
○ Don’t join! Meetup is about Tech x You are a Woman
Meetup is not interested in propagating gender stereotypes
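One way to enforce this is to construct feature crosses by hand instead of letting the model learn interactions silently. A hedged sketch; the field names here are hypothetical and not Meetup's actual schema:

```python
def build_features(user, group):
    """Construct model features explicitly, so every interaction
    is an intentional choice rather than one learned silently."""
    return {
        "interest_match": int(user["interest"] == group["topic"]),
        # Deliberate cross: respect groups restricted to one gender.
        "gender_restricted_mismatch": int(
            group.get("gender_restriction") is not None
            and group["gender_restriction"] != user["gender"]
        ),
        # Deliberately ABSENT: any topic x gender cross, so the model
        # cannot learn "don't recommend tech groups to women".
    }

user = {"interest": "tech", "gender": "F"}
group = {"topic": "tech", "gender_restriction": None}
print(build_features(user, group))
# {'interest_match': 1, 'gender_restricted_mismatch': 0}
```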
30. Ensemble Models and Data Segregation
● Ensemble models: combine the outputs of several classifiers for increased accuracy
● If you have features that are useful but you’re worried about their interactions (and your model learns interactions automatically), use ensemble modeling to restrict those features to separate models.
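A minimal sketch of the segregation idea, with stand-in scoring functions in place of trained classifiers: the features each sub-model sees never meet inside one learner, so cross-slice interactions cannot form.

```python
def score_topic_model(topic_feats):
    # Stand-in for a model trained only on interest/topic features.
    return 0.9 if topic_feats["interest_match"] else 0.2

def score_activity_model(activity_feats):
    # Stand-in for a model trained only on engagement features.
    return min(1.0, activity_feats["events_attended"] / 10)

def ensemble_score(topic_feats, activity_feats, weights=(0.7, 0.3)):
    """Combine the segregated sub-models with a fixed weighted average.
    Because the combination is a simple linear blend, no interaction
    between the two feature slices can be learned."""
    s1 = score_topic_model(topic_feats)
    s2 = score_activity_model(activity_feats)
    return weights[0] * s1 + weights[1] * s2

print(ensemble_score({"interest_match": 1}, {"events_attended": 5}))
# 0.7*0.9 + 0.3*0.5 = 0.78
```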
32. Fake Profiles, Tracked Ads
● Ad for “200k+” executive-job career coaching
● Male group: 1,852 impressions
● Female group: 318 impressions
33. Diversity-Controlled Testing
● CMU’s AdFisher crawls ads with simulated user profiles
● The same technique can find bias in your own models!
● Generate test data: randomize the sensitive feature in a real data set
● Run the model
● Evaluate for unacceptable biased treatment
● You must identify which features are sensitive and which outcomes are unwanted
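The test procedure above can be sketched in a few lines: hold each real row fixed, randomize only the sensitive field, and compare positive rates per value. The model and field names here are hypothetical.

```python
import random

def audit_sensitive_feature(model, rows, feature, values, trials=200, seed=0):
    """Randomize one sensitive feature across real rows and measure the
    model's positive rate per value; large gaps flag possible bias."""
    rng = random.Random(seed)
    counts = {v: 0 for v in values}
    totals = {v: 0 for v in values}
    for _ in range(trials):
        row = dict(rng.choice(rows))
        v = rng.choice(values)
        row[feature] = v          # counterfactual: only this field changes
        totals[v] += 1
        counts[v] += int(model(row))
    return {v: counts[v] / totals[v] for v in values if totals[v]}

# A deliberately biased toy model, to show the audit catching it.
biased = lambda row: row["gender"] == "M" and row["score"] > 0.5
rows = [{"gender": "M", "score": s / 10} for s in range(10)]
rates = audit_sensitive_feature(biased, rows, "gender", ["M", "F"])
print(rates)  # the positive rate for "F" is 0.0: a clear red flag
```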
35. Tay.ai
● Twitter bot
● “Garbage in, garbage out”
● Responsibility?
“In the span of 15 hours Tay referred to feminism as a ‘cult’ and a ‘cancer,’ as well as noting ‘gender equality = feminism’ and ‘i love feminism now.’ Tweeting ‘Bruce Jenner’ at the bot got similar mixed responses, ranging from ‘caitlyn jenner is a hero & is a stunning, beautiful woman!’ to the transphobic ‘caitlyn jenner isn’t a real woman yet she won woman of the year?’”
36. Diverse Test Data
● Outliers can matter
● The real world is messy
● Some people will mess with you
● Some people look/act differently than you
Defense. Diversity. Design.
37. You know racist computers are a bad idea. Don’t let your company invent racist computers.