Copyright © 2015 Criteo
New challenges for scalable machine
learning in online advertising
Olivier Koch
Engineering Program Manager, Criteo
ICML Online Advertising Systems Workshop
June 24, 2016
What we do
Advertiser Publisher
Machine learning applications at Criteo
• Bidding (2nd price auctions)
• Product recommendation
• Banner look and feel selection
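A minimal sketch of the second-price mechanism named in the bidding bullet, assuming sealed bids and a single ad slot. The function names and the rule of bidding the expected value of a display are illustrative assumptions, not Criteo's production logic.

```python
def second_price_auction(bids):
    """Return (winner, price) for a sealed-bid second-price auction.

    `bids` maps a bidder name to its bid. The highest bidder wins and
    pays the second-highest bid (0.0 if it is the only bidder).
    """
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


def expected_value_bid(p_click, value_per_click):
    """Hypothetical bidding rule: bid the expected value of the display.

    In a second-price auction, bidding one's true value is a dominant
    strategy, so the quality of the bid rests on the click prediction.
    """
    return p_click * value_per_click


winner, price = second_price_auction({"a": 1.0, "b": 2.5, "c": 2.0})
```

Here bidder "b" wins and pays the bid of "c", so the price paid does not depend on the winner's own bid.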
Machine learning at Criteo
• Supervised learning using standard regression methods / optimization algorithms (SGD, L-BFGS)
• Distribution on Hadoop (MapReduce, Spark)
• 3B displays / day
• 40 PB of data -- 15,000 servers
• 7 data centers worldwide
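As an illustration of the "standard regression methods" trained with SGD, here is a minimal single-machine sketch of logistic regression for click prediction. The feature layout, learning rate and epoch count are assumptions chosen for the toy example; a production system would distribute this over Hadoop or Spark as the slide notes.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    """Predicted click probability for feature vector x."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def sgd_logistic_regression(rows, labels, lr=0.1, epochs=500, seed=0):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    rng = random.Random(seed)
    w = [0.0] * len(rows[0])
    b = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit examples in a fresh random order
        for i in order:
            x, y = rows[i], labels[i]
            g = predict(w, b, x) - y  # gradient of the log loss w.r.t. the logit
            for j in range(len(w)):
                w[j] -= lr * g * x[j]
            b -= lr * g
    return w, b


# Toy data: one feature, clicks (label 1) happen for larger feature values.
rows = [[0.0], [1.0], [2.0], [3.0]]
labels = [0, 0, 1, 1]
w, b = sgd_logistic_regression(rows, labels)
```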
The good news
• New generations of algorithms
• NLP (word embeddings), reinforcement learning, policy learning, deep networks
• Releases of ML infrastructures
• Caffe on Spark, TensorFlow, Torch, PhotonML, GPUs inside clusters
→ strong traction in the academic/industrial community
The good news (cont'd)
• A lot of data is available
• Interactions with banners : clicks
• Interactions with products/advertisers : sales, baskets, home views, listings, visit history
• New data is coming
• Mobile, cross-device, (offline)
Now what?
Challenges in online advertising 1/3
• The technical debt of large-scale machine learning systems
• AB tests = snapshots. Are we missing long term effects?
• Some models become hard to improve. Are we overfitting or using the wrong metrics?
• We need to deal with a growing number of models – e.g. automate feature engineering
Challenges in online advertising 2/3
• We want to provide a better online advertising experience
• Personalized
• Cross-device
• Long tail (new users, new products)
Challenges in online advertising 3/3
• Credit assignment and incrementality
• Several clicks might be needed to generate a sale
• We should probably optimize a series of bids as opposed to single bids
• What is the optimal credit assignment scheme?
• We optimize what clients give us
• Attributed sales may not be the right target
• Global sales increases are noisy
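To make the credit assignment question concrete, the sketch below compares three common baseline schemes for splitting one sale across the clicks that preceded it. The slide does not say which scheme is used or which is optimal; the function names and the half-life parameter are assumptions for illustration.

```python
def last_click_credit(n_clicks, sale_value):
    """Give the whole sale to the final click before the purchase."""
    if n_clicks < 1:
        raise ValueError("need at least one click")
    credit = [0.0] * n_clicks
    credit[-1] = sale_value
    return credit


def linear_credit(n_clicks, sale_value):
    """Split the sale evenly across all clicks."""
    if n_clicks < 1:
        raise ValueError("need at least one click")
    return [sale_value / n_clicks] * n_clicks


def time_decay_credit(click_times, sale_time, sale_value, half_life):
    """Weight each click by 0.5 ** (age / half_life), then normalize."""
    if not click_times:
        raise ValueError("need at least one click")
    weights = [0.5 ** ((sale_time - t) / half_life) for t in click_times]
    total = sum(weights)
    return [sale_value * w / total for w in weights]


# Three clicks at hours 0, 24 and 48; the sale happens at hour 48.
decayed = time_decay_credit([0, 24, 48], sale_time=48, sale_value=90.0, half_life=24)
```

Each scheme rewards a different bidding behavior, which is one reason attributed sales may not be the right optimization target.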
Machine learning to the rescue
• Offline metrics – counterfactual analysis
• Optimal bidding strategies under uncertainty -- reinforcement learning
• Classification/prediction of time series
• Long tail (users, products) -- transfer learning, factorization
• Probabilistic match of devices
Offline metrics – counterfactual analysis
• Option 1 : run a controlled experiment (AB test)
• How would the system behave if I replaced model M by model M*?
• Takes time to conclude
• Costs money if M* is worse than M (often)
• Does not measure long-term effects
• Option 2 : use counterfactual analysis
• How would the system have performed if, when the data was collected, we had replaced model M by model M*?
• Requires real-time randomization -- cost/exploration trade-off
• Works best when M* is close to M
• Trades time for computation and storage
• Ignores future users’ and advertisers’ reactions
Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, Bottou et al.
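One standard way to answer the Option 2 question is inverse propensity scoring (importance sampling). The sketch below is a minimal version, assuming each logged event stores the probability with which the logging policy M chose its action; the log layout, function names and clipping parameter are assumptions, not the exact estimator of the cited paper.

```python
def ips_estimate(logs, new_policy_prob, clip=None):
    """Estimate the average reward M* would have obtained on logged data.

    `logs` is a list of (context, action, reward, logged_prob) tuples, where
    logged_prob is the probability with which policy M chose `action`.
    `new_policy_prob(context, action)` is the probability under M*.
    `clip`, if set, caps the importance weights to reduce variance (at the
    cost of some bias).
    """
    if not logs:
        raise ValueError("no logged events")
    total = 0.0
    for context, action, reward, logged_prob in logs:
        if logged_prob <= 0.0:
            # Without randomization in M, this action cannot be reweighted.
            raise ValueError("logged propensity must be positive")
        weight = new_policy_prob(context, action) / logged_prob
        if clip is not None:
            weight = min(weight, clip)
        total += weight * reward
    return total / len(logs)


# Toy log: M picked banner "A" or "B" uniformly at random; only "A" got clicks.
logs = [("u1", "A", 1.0, 0.5), ("u2", "B", 0.0, 0.5),
        ("u3", "A", 1.0, 0.5), ("u4", "B", 0.0, 0.5)]


def always_a(context, action):
    return 1.0 if action == "A" else 0.0


def same_as_logging(context, action):
    return 0.5


value_new = ips_estimate(logs, always_a)
value_old = ips_estimate(logs, same_as_logging)
```

The division by the logged probability is why real-time randomization is required, and the weights stay close to 1 (low variance) when M* is close to M.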
Optimal bidding strategies
• A user is seen more than 20 times a day on average
• Each action we take has an impact on the user, the advertiser and the competition
• Option 1 : model the environment and bid accordingly
• Cannot go beyond the proxy being optimized
• Option 2 : no model, randomized experiments
• Hard problem : very high-dimensional state space and very sparse rewards
Conclusions
• Machine learning applies well to online advertising at scale
• New algorithms, new infrastructures and more data are coming
• A number of challenges remain unresolved…
• … come help us solve them!
Thanks! Questions?
o.koch@criteo.com
Dataset released: http://bit.ly/criteodata

Ensuring Technical Readiness For Copilot in Microsoft 365
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

New challenges for scalable machine learning in online advertising

  • 1. New challenges for scalable machine learning in online advertising. Olivier Koch, Engineering Program Manager, Criteo. ICML Online Advertising Systems Workshop, June 24, 2016. Copyright © 2015 Criteo
  • 2. What we do (diagram: Advertiser, Publisher)
  • 3. Machine learning applications at Criteo
      • Bidding (2nd price auctions)
      • Product recommendation
      • Banner look and feel selection
  • 4. Machine learning at Criteo
      • Supervised learning using standard regression methods / optimization algorithms (SGD, L-BFGS)
      • Distribution on Hadoop (MapReduce, Spark)
      • 3B displays / day
      • 40 PB of data, 15,000 servers
      • 7 data centers worldwide
  • 5. The good news
      • New generations of algorithms
      • NLP (word embeddings), reinforcement learning, policy learning, deep networks
      • Releases of ML infrastructures
      • Caffe on Spark, TensorFlow, Torch, PhotonML, GPUs inside clusters
      → strong traction in the academic/industrial community
  • 6. The good news (cont'd)
      • A lot of data is available
      • Interactions with banners: clicks
      • Interactions with products/advertisers: sales, baskets, home views, listings, visit history
      • New data is coming
      • Mobile, cross-device, (offline)
  • 7. Now what?
  • 8. Challenges in online advertising 1/3
      • The technical debt of large-scale machine learning systems
      • AB tests = snapshots. Are we missing long-term effects?
      • Some models become hard to improve. Are we overfitting or using the wrong metrics?
      • We need to deal with a growing number of models, e.g. by automating feature engineering
  • 9. Challenges in online advertising 2/3
      • We want to provide a better online advertising experience
      • Personalized
      • Cross-device
      • Long tail (new users, new products)
  • 10. Challenges in online advertising 3/3
      • Credit assignment and incrementality
      • Several clicks might be needed to generate a sale
      • We should probably optimize a series of bids as opposed to single bids
      • What is the optimal credit assignment scheme?
      • We optimize what clients give us
      • Attributed sales may not be the right target
      • Global sales increases are noisy
  • 11. Machine learning to the rescue
      • Offline metrics: counterfactual analysis
      • Optimal bidding strategies under uncertainty: reinforcement learning
      • Classification/prediction of time series
      • Long tail (users, products): transfer learning, factorization
      • Probabilistic match of devices
  • 13. Offline metrics: counterfactual analysis
      • Option 1: run a controlled experiment (AB test)
      • How would the system behave if I replaced model M by model M*?
      • Takes time to conclude
      • Costs money if M* is worse than M (which is often the case)
      • Does not measure long-term effects
      • Option 2: use counterfactual analysis
      • How would the system have performed if, when the data was collected, we had replaced model M by model M*?
      • Requires real-time randomization (cost/exploration trade-off)
      • Works best when M* is close to M
      • Trades time for computation and storage
      • Ignores future users' and advertisers' reactions
      • Reference: Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, Bottou et al.
  • 14. Optimal bidding strategies
      • A user is seen more than 20 times a day on average
      • Each action we take has an impact on the user, the advertiser and the competition
      • Option 1: model the environment and bid accordingly
      • Cannot go beyond the proxy being optimized
      • Option 2: no model, randomized experiments
      • Hard problem: very high-dimensional state space and very sparse rewards
  • 15. Conclusions
      • Machine learning applies well to online advertising at scale
      • New algorithms, new infrastructures and more data are coming
      • A number of challenges remain unresolved…
      • … come help us solve them!
  • 16. Thanks! Questions? o.koch@criteo.com
      • Dataset released: http://bit.ly/criteodata
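
Slide 13 contrasts AB tests with counterfactual analysis, which asks how a candidate model M* would have performed on data logged under a randomized model M. One common way to answer that question is inverse propensity scoring (importance sampling); the sketch below illustrates that estimator only and is not Criteo's implementation. The banner names, click rates, policy probabilities and function names (`ips_estimate`, `simulate_logs`, `policy_m`, `policy_m_star`) are all made up for the example.

```python
import random


def ips_estimate(logs, new_policy):
    """Estimate the average reward new_policy would have obtained.

    logs: list of (context, action, propensity, reward) tuples, where
    propensity is the probability with which the logging policy chose
    the logged action. Each reward is reweighted by the ratio of the
    new policy's probability to the logging propensity.
    """
    total = 0.0
    for context, action, propensity, reward in logs:
        weight = new_policy(context, action) / propensity
        total += weight * reward
    return total / len(logs)


def simulate_logs(n, logging_policy, click_rate, actions, seed=0):
    """Generate synthetic logs under a randomized logging policy."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        context = None  # contexts are ignored in this toy example
        probs = [logging_policy(context, a) for a in actions]
        action = rng.choices(actions, weights=probs)[0]
        reward = 1.0 if rng.random() < click_rate[action] else 0.0
        logs.append((context, action, logging_policy(context, action), reward))
    return logs


# Hypothetical setup: two banners with made-up click rates.
ACTIONS = ["banner_a", "banner_b"]
CLICK_RATE = {"banner_a": 0.05, "banner_b": 0.10}


def policy_m(context, action):
    """Logging model M: randomized, mostly shows banner_a."""
    return 0.8 if action == "banner_a" else 0.2


def policy_m_star(context, action):
    """Candidate model M*: close to M, shifts weight toward banner_b."""
    return 0.6 if action == "banner_a" else 0.4


logs = simulate_logs(200_000, policy_m, CLICK_RATE, ACTIONS)
estimate = ips_estimate(logs, policy_m_star)
# Value of M* computed directly from the made-up click rates.
true_value = 0.6 * CLICK_RATE["banner_a"] + 0.4 * CLICK_RATE["banner_b"]
print(f"IPS estimate for M*: {estimate:.4f} (simulated truth: {true_value:.4f})")
```

The sketch reflects two points from the slide: the logging policy must be randomized so that every action M* might take has a nonzero logged propensity, and the estimate is most reliable when M* is close to M, because the importance weights then stay near 1 and the variance stays low.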