2. About Me
• Former member of the Search team at @WalmartLabs
• Former Head of the Metrics & Measurements team
  • I also led the Human Evaluation team
• About the Metrics and Measurements team
  • A team of engineers, analysts and scientists in charge of providing accurate and exhaustive measurements
  • We also had an auditing role towards adjacent teams
• What do we measure?
  • Engineering metrics related to model and data quality
  • Business metrics (revenue, etc.)
  • More exotic customer-centric metrics (customer value, customer satisfaction, model impact, etc.)
• Currently Head of Data Science at Atlassian
  • In charge of the Search & Smarts team
6. Outline
❑ Humans & Big Data
  • The role of human beings in the era of Big Data
  • Why do we need to tag data?
  • How to get tagged data?
❑ The Era of Crowdsourcing
  • What is Crowdsourcing?
  • Use cases and details about Crowdsourcing
  • Traditional crowds vs. curated crowds
❑ The Human-in-the-Loop Paradigm
  • Definition and details about Human-in-the-Loop ML
  • Introduction to Active Learning
9. Humans & Big Data: The Role of Human Beings in the Era of Machine Learning
10. The Era of Very Big Data
❑ VOLUME
  • More data was created from 2013 to 2015 than in the entire previous history of the human race
  • By 2020, accumulated data will reach 44 trillion gigabytes
❑ VELOCITY
  • By 2020, ~1.7 MB of new data / second / human being
  • 1.2 trillion search queries on Google per year
❑ VARIETY
  • 31 million messages and 2.8 million videos per minute on Facebook
  • Up to 300 hours of video / minute are uploaded to YouTube
  • In 2015, 1 trillion photos were taken; billions were shared online
[Image: a data center at Google]
14. Supervised vs. Unsupervised Machine Learning
Supervised ML requires tagged data
• Classification: a problem where the output variable is a category
  examples: SVM, random forests, Bayesian classifiers
• Regression: a problem where the output variable is a real value
  examples: linear regression, random forests
Typical applications: image recognition, speech recognition
Unsupervised ML doesn't require tagged data
• Clustering: discovery of inherent groupings in the data
  examples: k-means, hierarchical clustering
• Association rules: discovery of rules describing the data
  example: the Apriori algorithm
Typical applications: feature learning, autoencoders
The Case of Deep Learning: both supervised and unsupervised applications
NB: Deep Learning algorithms are data-greedy…
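To make the split concrete, here is a minimal sketch contrasting the two regimes (scikit-learn with its built-in Iris toy dataset, chosen purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)  # features X, labels y

# Supervised: the classifier is trained on (X, y) pairs -- it needs tagged data.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)
print("predicted class:", clf.predict(X[:1]))

# Unsupervised: k-means only ever sees X and discovers groupings on its own.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km.fit(X)  # no labels involved
print("cluster assignments:", km.labels_[:5])
```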
17. Tagged Data
• Gathering quality tagged training data is a common bottleneck in ML
  • Expensive
  • Quality control is hard and requires a second human pass
  • Hardly scalable → heavy use of sampling strategies
• How do companies doing Machine Learning get tagged data?
  • Implicit tagging: customer engagement
  • Explicit tagging: manual labor
• A few strategies to get tagged data for cheap/free:
  • Games (Google Quick, Draw! — https://quickdraw.withgoogle.com/)
  • Incentivization (extra lives or bonuses in games)
20. The Wisdom from the Crowd
Why human input matters: the use case of image colorization
[Image: a colorization model, alongside a tagged training data set for image recognition (watermelon, grapes, bananas, pineapple, orange)]
→ Colorization is straightforward to humans because they can 'tap' into their general knowledge
"Bananas are generally …" — this kind of 'general' knowledge is:
• obvious for human beings
• tedious for machines
24. What is Crowdsourcing?
Crowdsourcing: the process of getting labor or funding, usually online, from a crowd of people
➢ Crowdsourcing = 'crowd' + 'outsourcing'
➢ The act of taking a function once performed by employees and outsourcing it to an undefined (generally large) network of people in the form of an open call
History of Crowdsourcing
• The term was first used in 2005 by the editors at Wired
• The official definition was published in the Wired article "The Rise of Crowdsourcing", June 2006
• It describes how businesses were using the Internet to "outsource work to the crowd"
What Crowdsourcing helps with:
• Scale → peer-production (for jobs to be performed collaboratively)
• Reach → connecting with a large network of potential laborers (if tasks are undertaken by sole individuals)
28. The Nature of Crowdsourcing
Microtasks
• Data generation: user-generated content such as reviews, pictures, translations, etc.
• Data validation: validation of translations, etc.
• Data tagging: image tagging, product categorization, etc.
• Data curation: curation of news feeds, etc.
Macrotasks
• Solution development: algorithm improvement, etc.
• Crowd contests: design competitions, algorithmic competitions, etc.
Funding
32. Some Cool Crowdsourcing Applications
Mapping
• Photo Sphere
• Google Maps crowdsources info for wheelchair-accessible places
Traffic
• Google Traffic
• Waze: a traffic reporting app
Translation
• Google Translate
Epidemiology
• Flu tracking applications
36. Companies Based on Crowdsourcing
• Quora is a question-and-answer site where questions are asked, answered, edited and organized by its community of users.
• Waze is a community-based traffic and navigation app where drivers share real-time traffic and road info.
• Kaggle is a platform for predictive modelling competitions in which companies post data and data miners compete to produce the best models.
• Stack Overflow is a platform for users to ask and answer questions, and to vote questions and answers up or down and edit them.
• Flickr is an image and video hosting website that is widely used by bloggers to host images that they embed in social media.
38. The Challenges of Crowdsourcing
Reliability
• Retail: absence of emotional involvement (judges are not actually spending money on items)
• Waze: locals were sending fake information to limit traffic in their area
Relevance of knowledge
• Retail: judges might not have appropriate knowledge of the items they are evaluating
Subjectivity
• Search: relevance scores vary depending on profile and personal preferences
Speed & cost
• Human evaluations take time, and can only be performed sporadically and on samples
• Not practical for measurement purposes
42. Crowdsourcing vs. Curated Crowds
Traditional Crowdsourcing Model
+ Speed: many hands generate light work
+ Lower cost: typically a few pennies per task
- No quality control
- Lack of control: little to no incentive to deliver on time
- High maintenance: clear instructions needed; automated understanding checks
- Lower reliability: high overlap required
- Lack of confidentiality: anyone can see your tasks
Curated Crowd
+ Quality control: judges are subject to quality metrics, and removed if they don't deliver the required quality
+ Better quality: very little overlap needed
+ Expertise: judges become experts at the required task
+ Constraints on the crowd: judges are less likely to drop out
- More expensive: typically the primary source of income for judges
- Consistency required: frequent tasks are needed to keep skills sharp
43. Crowdsourcing Applications in e-Commerce
Catalog Curation
• Product description curation
• Product tagging & categorization
• Product deduplication
• Taxonomy testing
Search Relevance Evaluation
• Relevance scores (query-item pair scores)
• Engine comparison (ranking-to-ranking)
Review Moderation
• Removal/flagging of obscene reviews
Mystery Shopping
• Analysis and discovery of new trends
• Evaluation of new products
• Competitive analysis
[Image: the example of Product Tagging]
48. Use Case: Evaluation of Search Engine Relevance
Side-by-Side Engine Comparison (Ranking A vs. Ranking B)
• Judge 1: prefers ranking A
• Judge 2: prefers ranking A
• Judge 3: prefers ranking B
→ Human evaluation makes it possible to measure the intangible with little risk
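A minimal sketch of aggregating such side-by-side judgments into a winner by simple majority vote (the judgment data is hypothetical):

```python
from collections import Counter

# Each judge's side-by-side preference, as on the slide above.
judgments = ["A", "A", "B"]  # Judge 1, Judge 2, Judge 3

votes = Counter(judgments)
winner, count = votes.most_common(1)[0]
print(f"Ranking {winner} preferred by {count} of {len(judgments)} judges")
# -> Ranking A preferred by 2 of 3 judges
```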
49. Use Case: Evaluation of Search Engine Relevance
Query-Item Relevance Scoring for Measurement of Ranking Quality
[Image: two rankings with per-item relevance scores, e.g. 5/5, 5/5, 5/5, 4/5, 3/5, 2/5 vs. 5/5 across the board]
Discounted cumulative gain, where $rel_i$ is the graded relevance of the item at position $i$ and $REL_p$ is the list of items ideally ordered by relevance up to position $p$:

$$DCG_p = \sum_{i=1}^{p} \frac{rel_i}{\log_2(i+1)} \qquad IDCG_p = \sum_{i=1}^{|REL_p|} \frac{2^{rel_i} - 1}{\log_2(i+1)} \qquad nDCG_p = \frac{DCG_p}{IDCG_p}$$
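A minimal sketch of nDCG in Python; note the slide's $IDCG$ uses the exponential gain, while the sketch below uses the linear gain $rel_i$ for both the ranking and its ideal reordering, which is the other standard variant:

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: rel_i / log2(i + 1), positions starting at 1."""
    return sum(rel / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """Normalized DCG: DCG of the ranking divided by DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

print(ndcg([5, 5, 5, 4, 3, 2]))  # 1.0 -- already ideally ordered
print(ndcg([2, 3, 4, 5, 5, 5]))  # < 1.0 -- best items ranked too low
```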
51. The Dream of Automation
The 4 Industrial Revolutions
• FIRST REVOLUTION – 1784: mechanical production, railroads, steam power
• SECOND REVOLUTION – 1870: mass production, electrical power, assembly lines
• THIRD REVOLUTION – 1969: automated production, electronics, computers
• FOURTH REVOLUTION – ongoing: artificial intelligence, big data
→ Automation is not a new idea
Automation: the use of various control systems for operating equipment such as machinery and processes with minimal or reduced human intervention
Why?
• Automate boring/repetitive tasks
• Perform tasks at scale
• Perform tasks with enhanced precision
• Deliver consistent products
• Use machines where they outperform humans
55. When Full Automation can't be Achieved… Human-in-the-Loop
Human-in-the-Loop (HITL) is defined as a model or a system that requires human interaction
The idea of using human beings to enhance the machine is not new
• We have been doing Human-in-the-Loop all along…
• Example: autopilot technology for planes
Human intervention/presence is useful:
• To handle corner cases (outlier management)
• To "keep an eye" on the system (sanity check)
• To correct unwanted behavior (refinement)
• To validate appropriate behavior (validation)
58. Human-in-the-Loop Paradigm
Pareto Principle
aka the 80/20 rule, the law of the vital few, or the principle of factor sparsity: states that, for many events, roughly 80% of the effects come from 20% of the causes
ML version of the Pareto Principle:
• Evidence suggests that some of the most accurate ML systems to date need:
  • 80% computer/AI-driven work
  • 19% human input
  • 1% unknown randomness to balance things out
• The combination of machine and human intervention achieves maximum machine accuracy
How can human knowledge be incorporated into ML models?
A. Helping label the original dataset that will be fed into an ML model
B. Helping correct inaccurate predictions that arise as the system goes live
63. Human-in-the-Loop Use Case #1
An example of the HITL approach: face recognition
[Image: face recognition suggesting names — Mary, Roberto, Victoria, Laura, Sebastian, Cecelia]
Accuracy
• Facebook's DeepFace software reaches 97.25% accuracy
HITL as a feedback loop
• When the confidence is below a certain threshold, the system:
  • suggests a label
  • asks the uploader to validate/approve or correct the suggestion
• The new data is used to improve the accuracy of the algorithm
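A minimal sketch of this feedback loop (the names and the 0.9 threshold are hypothetical; `model` stands for any classifier that exposes a confidence alongside its prediction):

```python
CONFIDENCE_THRESHOLD = 0.9  # hypothetical cut-off for auto-accepting a prediction

def tag_photo(model, photo, ask_uploader, training_set):
    """Route a prediction: auto-accept if confident, otherwise ask a human."""
    label, confidence = model.predict_with_confidence(photo)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label  # confident enough: no human needed
    # Low confidence: suggest the label and let the uploader validate or correct it.
    validated = ask_uploader(photo, suggested=label)
    # The human-validated pair becomes new tagged data for the next retraining.
    training_set.append((photo, validated))
    return validated
```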
66. Human-in-the-Loop Use Case #2
An example of the HITL approach: autonomous vehicles
Teaching the machine
• Driving systems were trained using a human to oversee the process
Accuracy considerations
• The autopilot system is now over 99% accurate
• However, 99% accuracy means that people can die 1% of the time (!!)
• Though we have seen huge advances in the accuracy of pure machine-driven systems, they tend to fall short of acceptable accuracy rates
Corner cases
• Fun fact: Volvo's self-driving cars fail in Australia because of kangaroos ("Volvo's driverless cars 'confused' by kangaroos")
• Reaching 100% is hard because of corner cases
• A HITL approach helps get the accuracy to ~100%
69. The Success of Human-in-the-Loop: The Example of Chess
The Human vs. the Machine
• In 1997, chess champion Garry Kasparov is beaten by the IBM supercomputer Deep Blue
[Image: Garry Kasparov]
Freestyle or "Advanced" Chess
• Advanced: a human chess master works with a computer to find the best possible move
• Freestyle: a team can be made of any combination of human beings + computers
• In 2005, Steven Cramton, Zackary Stephen and their 3 computers win a Freestyle Chess tournament
Why it works
• Computers are great at reading tough tactical situations
• But humans are better at understanding long-term strategy
• Humans use computers to limit "blunders", while using their intuition to force the opponent into board states that confuse the computer(s)
74. Active Learning
Active Learning: a special case of semi-supervised ML in which a learning algorithm can interactively query the user (oracle) to obtain the desired outputs at new data points, maximizing validity and relevance
General Strategy
If D is the entire data set, at each iteration i, D is broken up into three subsets:
1. D(K,i): data points where the label is known
2. D(U,i): data points where the label is unknown
3. D(Q,i): data points for which the label is queried (sometimes, even when the label is known)
Benefits
• Query labels only when necessary (lower cost)
Next Generation Algorithms
• Proactive learning:
  • relaxes the assumption that the oracle is always right
  • casts the problem as an optimization problem with a budget constraint
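To make the D(K)/D(U)/D(Q) iteration concrete, here is a minimal sketch using uncertainty sampling; `model` is any scikit-learn-style classifier, and `oracle` is a hypothetical callable standing in for the human judge:

```python
import numpy as np

def active_learning_loop(model, X_known, y_known, X_unknown, oracle, n_iters=100):
    """At each iteration, query the oracle on the least-confident unknown point,
    move it from D_U to D_K, and refit the model."""
    for _ in range(min(n_iters, len(X_unknown))):
        model.fit(X_known, y_known)
        probs = model.predict_proba(X_unknown)
        # D_Q: the single point the model is least sure about (uncertainty sampling).
        query_idx = int(np.argmin(probs.max(axis=1)))
        label = oracle(X_unknown[query_idx])  # ask the human for the true label
        X_known = np.vstack([X_known, X_unknown[query_idx:query_idx + 1]])
        y_known = np.append(y_known, label)
        X_unknown = np.delete(X_unknown, query_idx, axis=0)
    return model.fit(X_known, y_known)
```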
79. Active Learning: How does it Work?
Machine Learning needs:
• Logic (algorithm)
• Data
• Optimization
• Feedback ← Human-in-the-Loop
Active Learning = a Machine Learning algorithm using an "oracle" to reduce mistakes/uncertainty
Query Strategy: labels are queried for
• Data points for which model uncertainty is high (uncertainty sampling)
• Data points for which the different models of an ensemble method disagree the most (query by committee)
• Data points causing the most changes on the model (expected model change)
• Data points causing overall variance to be high (variance reduction)
[Diagram: the active learning algorithm selects/removes a single example from the unlabeled data; the human oracle provides the correct label; the labeled example is added to the labeled data, which updates the classifier]
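For illustration, here are minimal sketches of the first two query scores (higher score = more worth querying); each row of `probs` is assumed to be one model's predicted class distribution for one data point:

```python
import numpy as np

def least_confidence(probs):
    """Uncertainty sampling score: 1 - probability of the most likely class."""
    return 1.0 - probs.max(axis=1)

def vote_entropy(committee_preds, n_classes):
    """Query-by-committee score: entropy of the committee's label votes per point."""
    scores = []
    for votes in committee_preds.T:          # one column of votes per data point
        freq = np.bincount(votes, minlength=n_classes) / len(votes)
        nonzero = freq[freq > 0]
        scores.append(float(-(nonzero * np.log(nonzero)).sum()))
    return np.array(scores)

probs = np.array([[0.9, 0.1], [0.55, 0.45]])    # hypothetical class probabilities
print(least_confidence(probs))                  # [0.1 0.45] -> query the 2nd point
committee = np.array([[0, 1], [0, 1], [0, 0]])  # 3 models x 2 data points
print(vote_entropy(committee, n_classes=2))     # [0. 0.64] -> query the 2nd point
```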
82. Active Learning: How does it Work?
[Diagram: Machine Learning classifier → is the confidence level high? YES → output; NO → annotation by a human oracle (Human-in-the-Loop Active Learning)]
By adding a human feedback loop, we allow the system to:
• actively learn
• correct itself where it got it wrong
• improve the algorithm over iterations
84. Active Learning at Walmart e-Commerce: 3 Use Cases using Active Learning in the context of Search/Retail
85. Active Learning at Walmart e-Commerce
❑ Machine Learning Lifecycle Management (Programming by Feedback)
• Automatic monitoring of input and output values for the ML algorithm
• An algorithm detects failings and outliers in real time and suggests an action
• A human validates the action, creating tagged data for full automation
❑ Diagnosis of Catalog Data Issues (Reinforcement Learning)
• The algorithm uncovers demoted items and suggests the most likely reason for the demotion
• An engineer manually confirms/corrects the suggestion, generating training data for full automation
❑ Refinement of the Query Tagging Algorithm (Optimization)
• The human evaluation team manually measures the accuracy of the query tagging model
• Mistagged queries are used to discover patterns specific to problematic queries, which are reported to engineers
• The sample is enriched with problematic queries (so the evaluation team can diagnose problems with the algorithms)
• Example: "red t-shirt Size M" → red = color, t-shirt = product type, M = size (a toy tagger sketch follows below)
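A purely illustrative, lexicon-based sketch of this kind of query tagging (the lexicon and helper are hypothetical toys, not Walmart's production tagger); tokens the lexicon cannot tag are exactly the ones worth routing to human judges:

```python
# Hypothetical lexicon -- a toy baseline for query tagging.
LEXICON = {
    "red": "color", "blue": "color",
    "t-shirt": "product_type", "jeans": "product_type",
    "s": "size", "m": "size", "l": "size",
}

def tag_query(query):
    """Tag each token; UNKNOWN tokens are candidates for human review."""
    tags = []
    for token in query.lower().split():
        if token == "size":
            continue  # in "Size M", the value token "M" carries the tag
        tags.append((token, LEXICON.get(token, "UNKNOWN")))
    return tags

print(tag_query("red t-shirt Size M"))
# [('red', 'color'), ('t-shirt', 'product_type'), ('m', 'size')]
```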
88. Conclusion and Takeaways
• Why do humans and machines complement each other?
  • Human beings are memory-constrained
  • Computers are knowledge-constrained
• Tagged data is more important than ever
  • But getting quality data is challenging given the volume of data
  • Crowdsourcing offers more flexibility to tag data at scale
• The Human-in-the-Loop paradigm
  • Improves the accuracy of machine learning algorithms (classifiers)
  • Many examples of successful endeavors using "Augmented Intelligence"
  • Active Learning is a booming area of ML/AI