Quora ML Workshop: Content Moderation & Machine Learning

Presentation by Alana Glassco, anti-abuse engineer at Smyte, at Quora ML Workshop: Protecting Online Spaces with Applied Machine Learning, on September 27, 2017.

Published in: Technology

  1. Be Nice, Be Respectful: Protecting Online Spaces with Applied Machine Learning
  2. Content Moderation & Machine Learning: Common Pitfalls & How to Avoid Them
     Alana Glassco, Anti-abuse Engineer at Smyte, Alana@smyte.com
  3. Content Policies
     ● Context is key
     ● Not black & white
     ● Designed for humans, not machines
  4. Content moderation flow
  5. Content moderation flow
  6. Tips & tricks
  7. Understand the problem
     ● Business goals
     ● Nature of the problem
     ● Is ML a good fit?
  8. For example...
     ● Business goals
       ○ Enforce company values
       ○ Gain good press
     ● Nature of the problem
       ○ Short-term
       ○ High FP cost
     ● Is ML a good fit?
       ○ No

     ● Business goals
       ○ Reduce bad press
       ○ Recover advertising loss
     ● Nature of the problem
       ○ Long-term
       ○ High FN cost
     ● Is ML a good fit?
       ○ Yes
  9. Get the right training data
     ● Understand policies in practice
     ● “Free” data won’t cut it
     ● Invest in a human review team
  10. Example: building a “spam” classifier
      ● Repetitive content
        ○ Behavioral signals: Bots / fake accounts
        ○ Optics: Looks fine in isolation
        ○ Severity: Harms reputation
      ● Keyword stuffing
        ○ Behavioral signals: Real users
        ○ Optics: Easy to identify
        ○ Severity: Harms search results
      ● Artificial traffic
        ○ Behavioral signals: Bots / fake accounts
        ○ Optics: Invisible w/o account signals
        ○ Severity: Harms ranking
      ● Scams / phishing
        ○ Behavioral signals: Bots or real users
        ○ Optics: Looks bad to a trained reviewer
        ○ Severity: Harms users
  11. Design a solution
      ● Model selection
      ● Implementation
      ● Maintenance & retraining
  12. Questions? alana@smyte.com
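
Slide 8 contrasts a problem where false positives (FP) are costly with one where false negatives (FN) are costly. As a rough illustration that is not taken from the talk, the Python sketch below shows how those two costs could shift the score threshold at which a classifier flags content. The scores, labels, cost values, and the function names `total_cost` and `best_threshold` are all hypothetical.

```python
# Illustrative sketch only: data, costs, and function names are assumptions,
# not material from the presentation.

def total_cost(scores, labels, threshold, fp_cost, fn_cost):
    """Cost of flagging every item whose score is >= threshold."""
    cost = 0.0
    for score, is_abusive in zip(scores, labels):
        flagged = score >= threshold
        if flagged and not is_abusive:
            cost += fp_cost  # good content wrongly actioned
        elif not flagged and is_abusive:
            cost += fn_cost  # abusive content missed
    return cost


def best_threshold(scores, labels, fp_cost, fn_cost):
    """Return the candidate threshold with the lowest total cost."""
    # Candidates: each observed score, plus "flag nothing" (infinity).
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, fp_cost, fn_cost),
    )


# Hypothetical classifier scores and human-review labels (1 = abusive).
scores = [0.10, 0.30, 0.40, 0.60, 0.70, 0.90]
labels = [0, 1, 0, 1, 0, 1]

# High FP cost: flag only the most confident cases.
strict = best_threshold(scores, labels, fp_cost=10, fn_cost=1)

# High FN cost: flag aggressively, tolerating some false positives.
lenient = best_threshold(scores, labels, fp_cost=1, fn_cost=10)

print(strict, lenient)
```

On this toy data the high-FP-cost setting selects a much higher threshold than the high-FN-cost setting, which is one way the "nature of the problem" could feed into the design of a solution.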
