Executive Briefing
Why managing machines is harder than you think
Peter Skomoroch - @peteskomoroch
Strata Data Conference, London - May 1, 2019
Background: Machine Learning & Data Products
Peter Skomoroch
@peteskomoroch
• Co-Founder and CEO of SkipFlag, Enterprise AI
startup acquired in 2018 by Workday
• 18+ years building machine learning products
• Principal Data Scientist, ran Data Products team at
LinkedIn. ML & Search at MIT, AOL, ProfitLogic
• Co-Host of O’Reilly AI Bots Podcast, Startup Advisor
Better, Faster Decisions at Scale
• Machine learning drove massive growth
at consumer internet companies over the
last decade
• A wave of AI startups and vertical
machine learning applications has
emerged across other industries
• For many problems, machine learning
makes better, faster, and more
repeatable decisions at scale
• Amazon, Google, and Microsoft are now
re-organizing themselves around AI
Data Products
Automated systems that collect and learn from data to
make user-facing decisions with machine learning
Machine Learning Projects are Hard
• The transition to machine learning will be about 100x harder than the
transition to mobile
• Companies that adopt an experimental culture can still succeed
• Some of the biggest challenges are organizational, not technical
• Data-driven companies like Google and Facebook have a strategic
advantage building ML products based on their data & compute assets,
large user population, tracking & instrumentation, and AI talent
If you only do things where you know
the answer in advance, your company
goes away.
Jeff Bezos
Founder, Chairman & CEO of Amazon.com
Experimental Culture
• Machine learning shifts
engineering from a deterministic
process to a probabilistic one
• Take intelligent risks
• Most successful ML products are
experiments at massive scale
• Companies driven by analytics
and experimental insights are
more likely to succeed
Data Pipelines & Analytics Before AI
Credit: @mrogati
ML Algorithms Need Lots of Labelled Data
Common Crawl: ~4B pages monthly
Combined Pools of Data Give Better Results
https://www.flickr.com/photos/nakrnsm/3814916578
• Learning patterns across large
numbers of customers is the
power behind recommendations
from companies like Amazon and
Netflix
• The more precise or nuanced a
prediction, the more data will
need to be pooled
• You need large amounts of
labelled training data
• Transfer learning may help push
these limits further
Democratize Data Access
• Allow teams across your company to combine real data to improve
their product areas, design with data, and discover new insights
• Share derived data and input features for ML models across teams
• At LinkedIn we had a rich repository of signals like connection
strength, inferred skills, and other datasets that greatly accelerated
new product development
• Empower small teams to build things
quickly and compound returns on
feature engineering & derived data
See https://www.confluent.io/ebook/i-heart-logs-event-data-stream-processing-and-data-integration/
Product Management for Machine Learning
Image source: Martin Eriksson https://www.mindtheproduct.com/2011/10/what-exactly-is-a-product-manager/
• A Data Product Manager (PM)
has core product skills (strategy,
roadmaps, prioritization, etc.)
along with an intuitive grasp of
ML
• They help identify and prioritize
the highest value applications for
machine learning and do what it
takes to make them successful
Good ML Product Managers Have Data Expertise
• Know the difference between easy, hard, and impossible machine
learning problems
• Even if something is feasible from a machine learning perspective,
the level of effort may not justify building the feature
• Know your company’s data inside and out, including quality issues,
limitations, biases, and gaps that need to be addressed
• Develop an intuitive understanding of your company’s data and how
it can be used to solve customer problems
Apply ML to a Metric the Business Cares About
Machine Learning Product Development
1. Verify you are solving the right problem
2. Theory + model design (in parallel with UI design)
3. Data collection, labelling, and cleaning
4. Feature engineering, model training, offline validation
5. Model deployment, monitoring & large scale training
• Iterate: repeat process, refine live model & improve
• 80% of effort and gains come from iterations after shipping v1.0
• Use derived data from the system to build new products
ML Adds Uncertainty to Product Roadmaps
• PMs are often uncomfortable with expensive ideas that have an
uncertain probability of success
• Many organizations will struggle to justify the expense of projects that
require significant research investment upfront
• Some ML products may need to be split into time-boxed projects that
get to market in a shorter time frame
• What can you productize now vs. much later on?
• Keep track of dependencies on other teams and have a “Plan B”
Every single company I've worked at
and talked to has the same problem
without a single exception so far —
poor data quality, especially tracking
data
Ruslan Belkin
VP of Engineering, Salesforce.com
Data Quality & Standardization
• Guide user input when you can
• Use auto-suggest fields
• Validate user inputs, emails
• Collect user tags, votes, ratings
• Track impressions, queries, clicks
• Sessionize logs
• Disambiguate and annotate
entities (company names,
locations, etc.)
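One of the items above, sessionizing logs, can be sketched in a few lines. This is a minimal illustration, not something from the talk: the event shape (`user_id`, timestamp in seconds) and the 30-minute inactivity gap are assumptions chosen for the example.

```python
from collections import defaultdict

# Assumed inactivity threshold; the talk does not specify one.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts whenever the time since the user's previous
    event is greater than `gap` seconds.
    """
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - user_sessions[-1][-1] > gap:
                user_sessions.append([ts])  # gap exceeded: open a new session
            else:
                user_sessions[-1].append(ts)
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical click log: user "a" returns after a long pause.
events = [("a", 0), ("a", 600), ("a", 4000), ("b", 100)]
result = sessionize(events)
```

Here user "a" ends up with two sessions because the pause before the third event is longer than the assumed gap. The right gap depends on the product and is worth checking against real usage data.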
Testing Machine Learning Products
• Algorithm work that drags on without being integrated into the product,
where real users can see and test it, is risky
• Ship a complete MVP in production ASAP, benchmark, and iterate
• Beware unintended consequences from seemingly small product changes
• Remember the prototype is not the product - see what happens when you
use a more realistic data set or scale up your inputs
• Real-world data changes over time; ensure your model tests and
benchmarks keep up with changes in underlying data
• Machine learning systems tend to fail in unexpected ways
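One way to notice that underlying data has changed is to compare the distribution of an input feature at training time against recent production values. The sketch below uses the population stability index (PSI) as the comparison. PSI is a choice made for this example, not a method named in the talk, and the bin count and sample data are assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population stability index between two numeric samples.

    Bins are equal-width over the range of `expected`; values of
    `actual` outside that range are clamped into the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical feature values: a training snapshot and a shifted live sample.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]
same_score = psi(baseline, baseline)
drift_score = psi(baseline, shifted)
```

An identical sample scores zero and a shifted one scores higher. What score should trigger a retrain or an alert is a judgment call for each team and feature.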
Look at Your Input Data & Prediction Errors
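A simple way to start looking at prediction errors is to break the error rate down by segment rather than relying on a single aggregate number. The following is a minimal sketch. The record shape (`segment`, `label`, `prediction`) and the segment names are hypothetical.

```python
from collections import defaultdict


def error_rate_by_segment(records):
    """Return the fraction of wrong predictions per segment.

    `records` is an iterable of (segment, label, prediction) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for segment, label, prediction in records:
        totals[segment] += 1
        if label != prediction:
            errors[segment] += 1
    return {segment: errors[segment] / totals[segment] for segment in totals}


# Hypothetical evaluation records
records = [
    ("mobile", 1, 1),
    ("mobile", 0, 1),
    ("web", 1, 1),
    ("web", 0, 0),
]
rates = error_rate_by_segment(records)
```

In this toy data the overall error rate is 25%, but all of the errors come from one segment, which is the kind of pattern an aggregate metric hides.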
Flywheel Effects & Data Products
• Users generate data as a side effect of
using most software products
• That data, in turn, can improve the
product’s algorithms and enable new
types of recommendations, leading to
more data
• These “Flywheels” get better the more
customers use them, leading to unique
competitive moats
• This works well in platforms, networks, or
marketplaces where value compounds
* https://medium.freecodecamp.org/the-business-implications-of-machine-learning-11480b99184d
Final Thoughts
• Machine learning products are hard to
build, but within reach of teams who
invest in data infrastructure
• Some of the biggest challenges are
organizational, not technical
• Good product leaders are a key factor in
shipping successful ML products
• Find a machine learning application with
a direct connection to a metric your
organization values and ship it
Send me questions! @peteskomoroch
Q&A / Discussion

More Related Content

What's hot

From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOpsCarl W. Handlin
 
Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup Omid Vahdaty
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPDatabricks
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023CoriFaklaris1
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesDianaGray10
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Google BigQuery Best Practices
Google BigQuery Best PracticesGoogle BigQuery Best Practices
Google BigQuery Best PracticesMatillion
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Adrien Blind
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Sri Ambati
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfLiming Zhu
 
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYGENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYAndre Muscat
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architecturesDaniel Marcous
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdfSuresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdfAWS Chicago
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsDataPhoenix
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scaleMaxim Salnikov
 
Algoworks - Custom Software Development Company with CRM, ECM, Mobile Consu...
Algoworks -  Custom Software Development Company with  CRM, ECM, Mobile Consu...Algoworks -  Custom Software Development Company with  CRM, ECM, Mobile Consu...
Algoworks - Custom Software Development Company with CRM, ECM, Mobile Consu...Ajeet Singh
 

What's hot (20)

MLOps.pptx
MLOps.pptxMLOps.pptx
MLOps.pptx
 
From Data Science to MLOps
From Data Science to MLOpsFrom Data Science to MLOps
From Data Science to MLOps
 
Data Pipline Observability meetup
Data Pipline Observability meetup Data Pipline Observability meetup
Data Pipline Observability meetup
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCPBuilding End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
 
An Introduction to Generative AI - May 18, 2023
An Introduction  to Generative AI - May 18, 2023An Introduction  to Generative AI - May 18, 2023
An Introduction to Generative AI - May 18, 2023
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Google BigQuery Best Practices
Google BigQuery Best PracticesGoogle BigQuery Best Practices
Google BigQuery Best Practices
 
Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)Introdution to Dataops and AIOps (or MLOps)
Introdution to Dataops and AIOps (or MLOps)
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdf
 
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYGENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architectures
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdfSuresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
Suresh Poopandi_Generative AI On AWS-MidWestCommunityDay-Final.pdf
 
The A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOpsThe A-Z of Data: Introduction to MLOps
The A-Z of Data: Introduction to MLOps
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
 
Github copilot
Github copilotGithub copilot
Github copilot
 
Algoworks - Custom Software Development Company with CRM, ECM, Mobile Consu...
Algoworks -  Custom Software Development Company with  CRM, ECM, Mobile Consu...Algoworks -  Custom Software Development Company with  CRM, ECM, Mobile Consu...
Algoworks - Custom Software Development Company with CRM, ECM, Mobile Consu...
 

Similar to Executive Briefing: Why managing machines is harder than you think

Product Management for AI
Product Management for AIProduct Management for AI
Product Management for AIPeter Skomoroch
 
Bridging the AI Gap: Building Stakeholder Support
Bridging the AI Gap: Building Stakeholder SupportBridging the AI Gap: Building Stakeholder Support
Bridging the AI Gap: Building Stakeholder SupportPeter Skomoroch
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOProduct School
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning ModelsTash Bickley
 
How an AI-backed recommendation system can help increase revenue for your onl...
How an AI-backed recommendation system can help increase revenue for your onl...How an AI-backed recommendation system can help increase revenue for your onl...
How an AI-backed recommendation system can help increase revenue for your onl...Skyl.ai
 
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...Skyl.ai
 
Five Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyFive Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyPerficient, Inc.
 
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...Comit Projects Ltd
 
Tips --Break Down the Barriers to Better Data Analytics
Tips --Break Down the Barriers to Better Data AnalyticsTips --Break Down the Barriers to Better Data Analytics
Tips --Break Down the Barriers to Better Data AnalyticsAbhishek Sood
 
Getting Knowledge Transfer Right Enterprise Wide Webinar
Getting Knowledge Transfer Right Enterprise Wide WebinarGetting Knowledge Transfer Right Enterprise Wide Webinar
Getting Knowledge Transfer Right Enterprise Wide WebinarConcept Searching, Inc
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
 
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...Concept Searching, Inc
 
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...Manju Devadas
 
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...Skyl.ai
 
Pluto7 - Tableau Webinar on enabling Organization to be Data Driven in 201...
Pluto7   -  Tableau Webinar on enabling Organization to be Data Driven in 201...Pluto7   -  Tableau Webinar on enabling Organization to be Data Driven in 201...
Pluto7 - Tableau Webinar on enabling Organization to be Data Driven in 201...Manju Devadas
 
FTFCU - How to Become a Data Driven Organization
FTFCU - How to Become a Data Driven OrganizationFTFCU - How to Become a Data Driven Organization
FTFCU - How to Become a Data Driven OrganizationNaveen Jain
 
Atlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfAtlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfSubrat Kumar Dash
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsApplause
 
Analytics in manufacturing
Analytics in manufacturingAnalytics in manufacturing
Analytics in manufacturingSaurav Kumar
 

Similar to Executive Briefing: Why managing machines is harder than you think (20)

Product Management for AI
Product Management for AIProduct Management for AI
Product Management for AI
 
Bridging the AI Gap: Building Stakeholder Support
Bridging the AI Gap: Building Stakeholder SupportBridging the AI Gap: Building Stakeholder Support
Bridging the AI Gap: Building Stakeholder Support
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPO
 
Productionising Machine Learning Models
Productionising Machine Learning ModelsProductionising Machine Learning Models
Productionising Machine Learning Models
 
How an AI-backed recommendation system can help increase revenue for your onl...
How an AI-backed recommendation system can help increase revenue for your onl...How an AI-backed recommendation system can help increase revenue for your onl...
How an AI-backed recommendation system can help increase revenue for your onl...
 
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...
test - Future of Ecommerce: How to Improve the Online Shopping Experience Usi...
 
Five Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data StrategyFive Attributes to a Successful Big Data Strategy
Five Attributes to a Successful Big Data Strategy
 
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...
ETDP 2015 D1 SMAC & the Journey from Automation to Digital Factory - Snjeev K...
 
Tips --Break Down the Barriers to Better Data Analytics
Tips --Break Down the Barriers to Better Data AnalyticsTips --Break Down the Barriers to Better Data Analytics
Tips --Break Down the Barriers to Better Data Analytics
 
Getting Knowledge Transfer Right Enterprise Wide Webinar
Getting Knowledge Transfer Right Enterprise Wide WebinarGetting Knowledge Transfer Right Enterprise Wide Webinar
Getting Knowledge Transfer Right Enterprise Wide Webinar
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...
You Spoke, We Listened – Achieving a New Level of Search Optimization with Go...
 
Managing AI Products
Managing AI ProductsManaging AI Products
Managing AI Products
 
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...
Machine Learning Application to Manufacturing using Tableau, Tableau and Goog...
 
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...
Future of Ecommerce: How to Improve the Online Shopping Experience Using Mach...
 
Pluto7 - Tableau Webinar on enabling Organization to be Data Driven in 201...
Pluto7   -  Tableau Webinar on enabling Organization to be Data Driven in 201...Pluto7   -  Tableau Webinar on enabling Organization to be Data Driven in 201...
Pluto7 - Tableau Webinar on enabling Organization to be Data Driven in 201...
 
FTFCU - How to Become a Data Driven Organization
FTFCU - How to Become a Data Driven OrganizationFTFCU - How to Become a Data Driven Organization
FTFCU - How to Become a Data Driven Organization
 
Atlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdfAtlan_Product metering_Subrat.pdf
Atlan_Product metering_Subrat.pdf
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI Applications
 
Analytics in manufacturing
Analytics in manufacturingAnalytics in manufacturing
Analytics in manufacturing
 

More from Peter Skomoroch

Managing Machines: The New AI Dev Stack
Managing Machines: The New AI Dev StackManaging Machines: The New AI Dev Stack
Managing Machines: The New AI Dev StackPeter Skomoroch
 
Building Competitive Moats With Data
Building Competitive Moats With DataBuilding Competitive Moats With Data
Building Competitive Moats With DataPeter Skomoroch
 
O'Reilly Strata: Distilling Data Exhaust
O'Reilly Strata: Distilling Data ExhaustO'Reilly Strata: Distilling Data Exhaust
O'Reilly Strata: Distilling Data ExhaustPeter Skomoroch
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsPeter Skomoroch
 
Skills, Reputation, and Search
Skills, Reputation, and SearchSkills, Reputation, and Search
Skills, Reputation, and SearchPeter Skomoroch
 
LinkedIn Endorsements: Reputation, Virality, and Social Tagging
LinkedIn Endorsements: Reputation, Virality, and Social TaggingLinkedIn Endorsements: Reputation, Virality, and Social Tagging
LinkedIn Endorsements: Reputation, Virality, and Social TaggingPeter Skomoroch
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data ProductsPeter Skomoroch
 
Practical Problem Solving with Data - Onlab Data Conference, Tokyo
Practical Problem Solving with Data - Onlab Data Conference, TokyoPractical Problem Solving with Data - Onlab Data Conference, Tokyo
Practical Problem Solving with Data - Onlab Data Conference, TokyoPeter Skomoroch
 
Street Fighting Data Science
Street Fighting Data ScienceStreet Fighting Data Science
Street Fighting Data SciencePeter Skomoroch
 
Data Mashups -Data Science Summit
Data Mashups -Data Science SummitData Mashups -Data Science Summit
Data Mashups -Data Science SummitPeter Skomoroch
 
Geo Analytics Tutorial - Where 2.0 2011
Geo Analytics Tutorial - Where 2.0 2011Geo Analytics Tutorial - Where 2.0 2011
Geo Analytics Tutorial - Where 2.0 2011Peter Skomoroch
 
Rapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopRapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopPeter Skomoroch
 
Prototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.orgPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.orgPeter Skomoroch
 

More from Peter Skomoroch (14)

Managing Machines: The New AI Dev Stack
Managing Machines: The New AI Dev StackManaging Machines: The New AI Dev Stack
Managing Machines: The New AI Dev Stack
 
Building Competitive Moats With Data
Building Competitive Moats With DataBuilding Competitive Moats With Data
Building Competitive Moats With Data
 
O'Reilly Strata: Distilling Data Exhaust
O'Reilly Strata: Distilling Data ExhaustO'Reilly Strata: Distilling Data Exhaust
O'Reilly Strata: Distilling Data Exhaust
 
SF Data Science: Developing Data Products
SF Data Science: Developing Data ProductsSF Data Science: Developing Data Products
SF Data Science: Developing Data Products
 
Skills, Reputation, and Search
Skills, Reputation, and SearchSkills, Reputation, and Search
Skills, Reputation, and Search
 
LinkedIn Endorsements: Reputation, Virality, and Social Tagging
LinkedIn Endorsements: Reputation, Virality, and Social TaggingLinkedIn Endorsements: Reputation, Virality, and Social Tagging
LinkedIn Endorsements: Reputation, Virality, and Social Tagging
 
Developing Data Products
Developing Data ProductsDeveloping Data Products
Developing Data Products
 
Practical Problem Solving with Data - Onlab Data Conference, Tokyo
Practical Problem Solving with Data - Onlab Data Conference, TokyoPractical Problem Solving with Data - Onlab Data Conference, Tokyo
Practical Problem Solving with Data - Onlab Data Conference, Tokyo
 
Street Fighting Data Science
Street Fighting Data ScienceStreet Fighting Data Science
Street Fighting Data Science
 
Data Mashups -Data Science Summit
Data Mashups -Data Science SummitData Mashups -Data Science Summit
Data Mashups -Data Science Summit
 
Geo Analytics Tutorial - Where 2.0 2011
Geo Analytics Tutorial - Where 2.0 2011Geo Analytics Tutorial - Where 2.0 2011
Geo Analytics Tutorial - Where 2.0 2011
 
Rapid Data Exploration With Hadoop
Rapid Data Exploration With HadoopRapid Data Exploration With Hadoop
Rapid Data Exploration With Hadoop
 
Prototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.orgPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org
 
Elasticwulf Pycon Talk
Elasticwulf Pycon TalkElasticwulf Pycon Talk
Elasticwulf Pycon Talk
 

Recently uploaded

Day 0- Bootcamp Roadmap for PLC Bootcamp
Day 0- Bootcamp Roadmap for PLC BootcampDay 0- Bootcamp Roadmap for PLC Bootcamp
Day 0- Bootcamp Roadmap for PLC BootcampPLCLeadershipDevelop
 
Call Now Pooja Mehta : 7738631006 Door Step Call Girls Rate 100% Satisfactio...
Call Now Pooja Mehta :  7738631006 Door Step Call Girls Rate 100% Satisfactio...Call Now Pooja Mehta :  7738631006 Door Step Call Girls Rate 100% Satisfactio...
Call Now Pooja Mehta : 7738631006 Door Step Call Girls Rate 100% Satisfactio...Pooja Nehwal
 
Strategic Management, Vision Mission, Internal Analsysis
Strategic Management, Vision Mission, Internal AnalsysisStrategic Management, Vision Mission, Internal Analsysis
Strategic Management, Vision Mission, Internal Analsysistanmayarora45
 
Beyond the Codes_Repositioning towards sustainable development
Beyond the Codes_Repositioning towards sustainable developmentBeyond the Codes_Repositioning towards sustainable development
Beyond the Codes_Repositioning towards sustainable developmentNimot Muili
 
Reviewing and summarization of university ranking system to.pptx
Reviewing and summarization of university ranking system  to.pptxReviewing and summarization of university ranking system  to.pptx
Reviewing and summarization of university ranking system to.pptxAss.Prof. Dr. Mogeeb Mosleh
 
Safety T fire missions army field Artillery
Safety T fire missions army field ArtillerySafety T fire missions army field Artillery
Safety T fire missions army field ArtilleryKennethSwanberg
 
internal analysis on strategic management
internal analysis on strategic managementinternal analysis on strategic management
internal analysis on strategic managementharfimakarim
 
International Ocean Transportation p.pdf
International Ocean Transportation p.pdfInternational Ocean Transportation p.pdf
International Ocean Transportation p.pdfAlejandromexEspino
 
Call now : 9892124323 Nalasopara Beautiful Call Girls Vasai virar Best Call G...
Call now : 9892124323 Nalasopara Beautiful Call Girls Vasai virar Best Call G...Call now : 9892124323 Nalasopara Beautiful Call Girls Vasai virar Best Call G...
Call now : 9892124323 Nalasopara Beautiful Call Girls Vasai virar Best Call G...Pooja Nehwal
 
Dealing with Poor Performance - get the full picture from 3C Performance Mana...
Dealing with Poor Performance - get the full picture from 3C Performance Mana...Dealing with Poor Performance - get the full picture from 3C Performance Mana...
Dealing with Poor Performance - get the full picture from 3C Performance Mana...Hedda Bird
 
BDSM⚡Call Girls in Sector 99 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 99 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 99 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 99 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
Agile Coaching Change Management Framework.pptx
Agile Coaching Change Management Framework.pptxAgile Coaching Change Management Framework.pptx
Agile Coaching Change Management Framework.pptxalinstan901
 
GENUINE Babe,Call Girls IN Baderpur Delhi | +91-8377087607
GENUINE Babe,Call Girls IN Baderpur  Delhi | +91-8377087607GENUINE Babe,Call Girls IN Baderpur  Delhi | +91-8377087607
GENUINE Babe,Call Girls IN Baderpur Delhi | +91-8377087607dollysharma2066
 

Executive Briefing: Why managing machines is harder than you think

  • 1. Executive Briefing: Why managing machines is harder than you think
      Peter Skomoroch - @peteskomoroch
      Strata Data Conference, London - May 1, 2019
  • 2. Background: Machine Learning & Data Products
      Peter Skomoroch, @peteskomoroch
      • Co-Founder and CEO of SkipFlag, Enterprise AI startup acquired in 2018 by Workday
      • 18+ years building machine learning products
      • Principal Data Scientist, ran Data Products team at LinkedIn. ML & Search at MIT, AOL, ProfitLogic
      • Co-Host of O’Reilly AI Bots Podcast, Startup Advisor
  • 3. Better, Faster Decisions at Scale
      • Machine learning drove massive growth at consumer internet companies over the last decade
      • A wave of AI startups and vertical machine learning applications has emerged across other industries
      • For many problems, machine learning makes better, faster, and more repeatable decisions at scale
      • Amazon, Google, and Microsoft are now re-organizing themselves around AI
  • 4. Data Products: automated systems that collect and learn from data to make user-facing decisions with machine learning
  • 5. Machine Learning Projects are Hard
      • The transition to machine learning will be about 100x harder than the transition to mobile
      • Companies that adopt an experimental culture can still succeed
      • Some of the biggest challenges are organizational, not technical
      • Data-driven companies like Google and Facebook have a strategic advantage building ML products based on their data & compute assets, large user population, tracking & instrumentation, and AI talent
  • 6. Experimental Culture
      "If you only do things where you know the answer in advance, your company goes away."
      Jeff Bezos, Founder, Chairman & CEO of Amazon.com
      • Machine Learning shifts engineering from a deterministic process to a probabilistic one
      • Take intelligent risks
      • Most successful ML products are experiments at massive scale
      • Companies driven by analytics and experimental insights are more likely to succeed
  • 7. Data Pipelines & Analytics Before AI (Credit: @mrogati)
  • 8. ML Algorithms Need Lots of Labelled Data
      Common Crawl: ~4B pages monthly
  • 9. Combined Pools of Data Give Better Results
      • Learning patterns across large numbers of customers is the power behind recommendations from companies like Amazon and Netflix
      • The more precise or nuanced a prediction, the more data will need to be pooled
      • You need large amounts of labelled training data
      • Transfer learning may help push these limits further
      https://www.flickr.com/photos/nakrnsm/3814916578
  • 10. Democratize Data Access
      • Allow teams across your company to combine real data to improve their product areas, design with data, and discover new insights
      • Share derived data and input features for ML models across teams
      • At LinkedIn we had a rich repository of signals like connection strength, inferred skills, and other datasets that greatly accelerated new product development
      • Empower small teams to build things quickly and compound returns on feature engineering & derived data
      See https://www.confluent.io/ebook/i-heart-logs-event-data-stream-processing-and-data-integration/
  • 11. Product Management for Machine Learning
      • A Data Product Manager (PM) has core product skills (strategy, roadmaps, prioritization, etc.) along with an intuitive grasp of ML
      • They help identify and prioritize the highest value applications for machine learning and do what it takes to make them successful
      Image source: Martin Eriksson https://www.mindtheproduct.com/2011/10/what-exactly-is-a-product-manager/
  • 12. Good ML Product Managers Have Data Expertise
      • Know the difference between easy, hard, and impossible machine learning problems
      • Even if something is feasible from a machine learning perspective, the level of effort may not justify building the feature
      • Know your company’s data inside and out including quality issues, limitations, biases, and gaps that need to be addressed
      • Develop an intuitive understanding of your company’s data and how it can be used to solve customer problems
  • 13. Apply ML to a Metric the Business Cares About
  • 14. Machine Learning Product Development
      1. Verify you are solving the right problem
      2. Theory + model design (in parallel with UI design)
      3. Data collection, labelling, and cleaning
      4. Feature engineering, model training, offline validation
      5. Model deployment, monitoring & large-scale training
      • Iterate: repeat the process, refine the live model & improve
      • 80% of effort and gains come from iterations after shipping v1.0
      • Use derived data from the system to build new products
  • 15. ML Adds Uncertainty to Product Roadmaps
      • PMs are often uncomfortable with expensive ideas that have an uncertain probability of success
      • Many organizations will struggle to justify the expense of projects that require significant research investment upfront
      • Some ML products may need to be split into time-boxed projects that get to market in a shorter time frame
      • What can you productize now vs. much later on?
      • Keep track of dependencies on other teams and have a “Plan B”
  • 16. Data Quality & Standardization
      "Every single company I've worked at and talked to has the same problem without a single exception so far: poor data quality, especially tracking data"
      Ruslan Belkin, VP of Engineering, Salesforce.com
      • Guide user input when you can
      • Use auto suggest fields
      • Validate user inputs, emails
      • Collect user tags, votes, ratings
      • Track impressions, queries, clicks
      • Sessionize logs
      • Disambiguate and annotate entities (company names, locations, etc.)
  • 17. Testing Machine Learning Products
      • Algorithm work that drags on without integration in the product, where it can be seen and tested by real users, is risky
      • Ship a complete MVP in production ASAP, benchmark, and iterate
      • Beware unintended consequences from seemingly small product changes
      • Remember the prototype is not the product: see what happens when you use a more realistic data set or scale up your inputs
      • Real-world data changes over time; ensure your model tests and benchmarks keep up with changes in the underlying data
      • Machine learning systems tend to fail in unexpected ways
  • 18. Look at Your Input Data & Prediction Errors
  • 19. Flywheel Effects & Data Products
      • Users generate data as a side effect of using most software products
      • That data, in turn, can improve the product’s algorithms and enable new types of recommendations, leading to more data
      • These “Flywheels” get better the more customers use them, leading to unique competitive moats
      • This works well in platforms, networks, or marketplaces where value compounds
      * https://medium.freecodecamp.org/the-business-implications-of-machine-learning-11480b99184d
  • 20. Final Thoughts
      • Machine learning products are hard to build, but within reach of teams who invest in data infrastructure
      • Some of the biggest challenges are organizational, not technical
      • Good product leaders are a key factor in shipping successful ML products
      • Find a machine learning application with a direct connection to a metric your organization values and ship it
      Send me questions! @peteskomoroch
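
The data quality slide lists "Sessionize logs" without saying how. As a rough illustration only, the sketch below groups click events into per-user sessions using an inactivity gap. The function name `sessionize`, the `(user_id, timestamp)` event shape, and the 30-minute gap are all assumptions chosen for the example; none of them come from the talk.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize(events, gap=timedelta(minutes=30)):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts whenever a user's next event arrives more than
    `gap` after their previous one. The 30-minute default is an assumed
    convention, not a value from the slides.
    """
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()  # raw logs are not guaranteed to arrive in order
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - user_sessions[-1][-1] > gap:
                user_sessions.append([ts])  # long silence: open a new session
            else:
                user_sessions[-1].append(ts)  # still the same session
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical events for illustration.
events = [
    ("u1", datetime(2019, 5, 1, 9, 0)),
    ("u1", datetime(2019, 5, 1, 9, 10)),
    ("u1", datetime(2019, 5, 1, 11, 0)),
    ("u2", datetime(2019, 5, 1, 9, 5)),
]
result = sessionize(events)
```

Here the two morning events for `u1` fall into one session and the 11:00 event opens a second one. In a production pipeline the same grouping would more likely run in a stream processor or a SQL window function than in in-memory Python.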