Copyright © 2015 Criteo
Large-Scale Real-Time Product
Recommendation at Criteo
Romain Lerallut, Diane Gasselin
RecSys Vienna, Sept 18, 2015
« The largest internet company you’ve never heard of » 
• Founded in 2005, in the adtech business since 2008
• Recommendation was our first product
• Disruptive business models
• 1700 people worldwide (50+% joined less than a year ago)
• 300+ engineers
• 26 offices
• Live in 130 countries
• 1B unique users
We buy
• Inventory ! (ad spaces)
• Billions of times a day
• All over the Internet
• For 95% of the population
=> Funding the Web
A technology company first and foremost
We sell
• Clicks !
• (that convert)
• (that convert a lot)
=> Delight to our clients ! 
We take the risk
You pay only for what you get
Learn on huge volumes of data
10 000 displays
lead to
50 clicks
lead to
1 sale
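The funnel above can be turned into rates with a few lines of arithmetic. This is only a sketch built on the three numbers from the slide; the helper `displays_needed` is a hypothetical name added for illustration.

```python
# Funnel figures taken from the slide.
displays = 10_000
clicks = 50
sales = 1

click_through_rate = clicks / displays   # clicks per display
conversion_rate = sales / clicks         # sales per click
sales_per_display = sales / displays     # sales per display


def displays_needed(target_sales: int) -> int:
    """Displays required to observe `target_sales` sales at the slide's rates."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    return target_sales * displays // sales


print(f"CTR: {click_through_rate:.2%}")
print(f"Conversion rate: {conversion_rate:.0%}")
print(f"Displays for 1000 sales: {displays_needed(1000):,}")
```

Positive examples (sales) are rare, which is one reason learning needs huge volumes of display data.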
Physical infrastructure
7 in-house data centers on 3 continents
~ 15000 servers, largest Hadoop cluster in Europe
More than 35 PB of Big Data storage
Traffic
800k HTTP requests / sec (peak activity)
29000 impressions / sec (peak activity)
<10 ms to process RTB request
<100 ms to process reco request
(Big) Data Sources
• Ad display data : 20B events / day
• User behavior data : 2B events / day
• Catalog data : 1M+ products / client, 10k clients
How do we do it ?
Recommend products for a user
• What we want: reco(user) = products
• But 1B users x 3B products !
• And we need to scale and keep it fresh
• What we can do :
• Pre-select products offline (source)
• Refine recommendation online
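To see why the offline pre-selection matters, here is a back-of-the-envelope sketch. The user and product counts come from the slide; `CANDIDATES_PER_REQUEST` is a hypothetical figure chosen only for illustration, not a number from the talk.

```python
USERS = 1_000_000_000         # from the slide
PRODUCTS = 3_000_000_000      # from the slide
CANDIDATES_PER_REQUEST = 100  # hypothetical: size of the pre-selected set

# Scoring every (user, product) pair is out of reach.
exhaustive_pairs = USERS * PRODUCTS

# With offline pre-selection, only a handful of candidates are scored per user.
preselected_pairs = USERS * CANDIDATES_PER_REQUEST

reduction = exhaustive_pairs // preselected_pairs

print(f"Exhaustive pairs:  {exhaustive_pairs:.1e}")
print(f"Preselected pairs: {preselected_pairs:.1e}")
print(f"Reduction factor:  {reduction:,}")
```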
Offline : prepare sources
[Diagram] Advertiser events feed two kinds of source computation:
• Co-events (Item View – Item View, Item Sale – Item Sale) : Similarities, Complementarities
• Top N : Best of, Best of by category
Volumes shown on the diagram : 50B ; 350M keys, 12B values ; 50M keys, 1B values
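A co-event source of the "Item View – Item View" kind can be sketched as a co-occurrence count. This is a minimal in-memory illustration, assuming events are already grouped per user session; the function name, the `top_n` parameter and the sample data are hypothetical, and the production version runs as Map-Reduce jobs according to the deck.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_view_similarities(sessions, top_n=2):
    """For each product, return the products most often viewed in the same session."""
    pair_counts = defaultdict(Counter)
    for viewed in sessions:
        # Count each unordered pair once per session, in both directions.
        for a, b in combinations(sorted(set(viewed)), 2):
            pair_counts[a][b] += 1
            pair_counts[b][a] += 1
    # Keep only the top N co-viewed products per key.
    return {
        item: [other for other, _ in counts.most_common(top_n)]
        for item, counts in pair_counts.items()
    }


# Hypothetical sessions: each list holds the products one user viewed.
sessions = [
    ["orange_shoes", "red_shoes", "socks"],
    ["orange_shoes", "red_shoes"],
    ["orange_shoes", "laces"],
]
similar = co_view_similarities(sessions)
print(similar["orange_shoes"])
```

The result is a key-value table (product to list of products), which matches the "keys / values" vocabulary on the diagram.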
Offline : prepare sources
User X saw orange shoes
Some candidate products for user X :
• Historical
• Similar
• Complementary
• Best-of (other users : most viewed products on the client website)
Reco overview
[Architecture diagram] Components : Advertiser events ; Display, Click, Sale logs ; Source computation (Map-Reduce jobs, OFFLINE) ; Sources ; Prediction models ; Catalog ; Recommendation Service
Figures shown on the diagram : 12h, 4h, 6h ; 4.5B, 500M, 50B ; 100K qps
ML model
• Logistic regression models because :
• They scale
• They are fast
• They can handle lots of features (with a bit of magic)
• Feature families : product-specific, user-specific, user-product interactions, display-specific
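The slide does not say what the "bit of magic" is. One common way to let logistic regression handle very many categorical features is the hashing trick, so the sketch below assumes that technique; it is not confirmed by the deck. Feature names, `NUM_BUCKETS` and the training data are all hypothetical.

```python
import math
import zlib

NUM_BUCKETS = 2 ** 18  # hypothetical size of the hashed weight vector


def hashed_indices(features):
    """Map each 'name=value' feature to a bucket index (hashing trick)."""
    return [
        zlib.crc32(f"{name}={value}".encode()) % NUM_BUCKETS
        for name, value in sorted(features.items())
    ]


def predict(weights, indices):
    """Logistic regression: sigmoid of the sum of the active weights."""
    z = sum(weights[i] for i in indices)
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(weights, indices, label, learning_rate=0.1):
    """One stochastic gradient step on the log loss."""
    error = predict(weights, indices) - label
    for i in indices:
        weights[i] -= learning_rate * error


# One feature per family named on the slide (values are made up).
clicked = hashed_indices({
    "product_category": "shoes",
    "user_recency": "viewed_today",
    "user_x_product": "viewed_today|shoes",
    "display_position": "first",
})
skipped = hashed_indices({
    "product_category": "hats",
    "user_recency": "viewed_last_month",
    "user_x_product": "viewed_last_month|hats",
    "display_position": "last",
})

weights = [0.0] * NUM_BUCKETS
for _ in range(200):
    sgd_step(weights, clicked, label=1)
    sgd_step(weights, skipped, label=0)

print(predict(weights, clicked), predict(weights, skipped))
```

Prediction cost is proportional to the number of active features, not to the size of the weight vector, which is what makes this kind of model fast at serving time.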
Online: sources
Similarities Most viewed Most bought
Online: merge of products
Similarities Most viewed Most bought
Online: scoring
Similarities Most viewed Most bought
0.02 0.12 0.06 0.18 0.03 0.05 0.01 0.005 0.011 0.013 0.004 0.007
Online: scoring (sorted by score)
Similarities Most viewed Most bought
0.18 0.12 0.06 0.05 0.03 0.02 0.013 0.011 0.01 0.007 0.005 0.004
Online: candidates
0.18 0.12 0.06 0.05 0.03 0.02 0.013 0.011 0.01 0.007 0.005 0.004
[Banner mock-up with four SHOP buttons and a -50% badge]
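The online steps (fetch sources, merge, score, sort, keep the best candidates) can be sketched as follows. Product ids, source contents and the assignment of scores to products are hypothetical; in the real system the score would come from the prediction model rather than a lookup table.

```python
def merge_sources(sources):
    """Merge candidate lists from all sources, dropping duplicates, keeping order."""
    merged, seen = [], set()
    for products in sources.values():
        for product in products:
            if product not in seen:
                seen.add(product)
                merged.append(product)
    return merged


def top_candidates(sources, score, k):
    """Score every merged candidate and keep the k best."""
    return sorted(merge_sources(sources), key=score, reverse=True)[:k]


# Hypothetical candidates per source; some products appear in several sources.
sources = {
    "similarities": ["p1", "p2", "p3", "p4"],
    "most_viewed": ["p5", "p6", "p2", "p7"],
    "most_bought": ["p8", "p6", "p9", "p10"],
}

# Stand-in for the prediction model: a fixed score per product.
predicted = {
    "p1": 0.02, "p2": 0.12, "p3": 0.06, "p4": 0.18, "p5": 0.03,
    "p6": 0.05, "p7": 0.01, "p8": 0.005, "p9": 0.011, "p10": 0.013,
}

banner = top_candidates(sources, predicted.__getitem__, k=4)
print(banner)  # ['p4', 'p2', 'p3', 'p6']
```

Deduplicating before scoring means a product found by several sources is scored only once.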
Evaluation
The basics : online A/B testing
• It is the only truth we have
• 50% of users on model A
• 50% of users on model B
• But it is onerous
• If the tested model is not good, we lose money, fast !
• Tests are long (~2 weeks needed to get good confidence intervals)
• Code has to be prod-ready (no bugs, good performance) : we run 24/7
• Can be heavy on the infrastructure
• And it does not take long-term effects into account
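A minimal sketch of the two mechanics behind such a test: a stable 50/50 user split and a confidence interval on the difference in conversion rate. The hashing scheme, the `salt` value and the counts in the usage example are assumptions for illustration; the interval uses the standard normal approximation for two proportions.

```python
import hashlib
import math


def assign_group(user_id, salt="ab-test-1"):
    """Deterministically assign a user to model A or B (roughly 50/50)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def diff_confidence_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """95% interval (by default) for rate(B) - rate(A), normal approximation."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    std_err = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * std_err, diff + z * std_err


# Hypothetical counts: conversions and users in each population.
low, high = diff_confidence_interval(1000, 100_000, 1100, 100_000)
print(assign_group("user-123"), low, high)
```

With rare conversions the standard error shrinks slowly, which is consistent with tests needing weeks of traffic before the interval excludes zero.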
The test framework for prediction
• ALTERNATIVE : Framework that replays production logs (offline)
• 30 000 tests / year
• Replay ~x100
• BUT : we only have data on products we display (exploration is costly)
• SO : we can only make sure we are not completely mistaken
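The idea of replaying logs can be sketched as scoring a candidate model on logged displays and comparing it to a baseline with a prediction metric. Log loss is used here as an assumed metric; the deck does not name one. The log format, model functions and data are hypothetical.

```python
import math


def replay_log_loss(log, model):
    """Average log loss of `model` over logged (features, clicked) displays."""
    total = 0.0
    count = 0
    for features, clicked in log:
        # Clamp the prediction so that log() never receives 0.
        p = min(max(model(features), 1e-12), 1 - 1e-12)
        total -= clicked * math.log(p) + (1 - clicked) * math.log(1 - p)
        count += 1
    return total / count


# Hypothetical production log: features of the display and whether it was clicked.
log = [
    ({"category": "shoes"}, 1),
    ({"category": "shoes"}, 1),
    ({"category": "hats"}, 0),
    ({"category": "shoes"}, 0),
]


def baseline(features):
    return 0.5


def candidate(features):
    return 0.6 if features["category"] == "shoes" else 0.1


baseline_loss = replay_log_loss(log, baseline)
candidate_loss = replay_log_loss(log, candidate)
print(baseline_loss, candidate_loss)
```

Note that every entry in the log is a product that was actually displayed, which is exactly the limitation the slide points out.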
Ultimate solution: offline ab-testing
• Find the best offline predictor for online performance
• “Counterfactual Reasoning and Learning Systems”, Léon Bottou (Microsoft Research, Redmond, WA), Jonas Peters (Max Planck Institute, Tübingen), Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson
• But we haven’t succeeded in making it precisely match reality... YET
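Counterfactual estimation of this kind is commonly built on importance sampling. Below is a basic inverse propensity scoring (IPS) estimator as a generic sketch; it is not the speakers' implementation, and the log format, policy function and data are hypothetical.

```python
def ips_estimate(logs, new_policy_prob):
    """Estimate the average reward of a new policy from logs of another policy.

    Each log entry is (context, action, reward, logged_prob), where
    logged_prob is the probability with which the logging policy chose
    `action` in `context`.
    """
    total = 0.0
    for context, action, reward, logged_prob in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        weight = new_policy_prob(context, action) / logged_prob
        total += weight * reward
    return total / len(logs)


# Hypothetical logs: the logging policy showed product "a" or "b" uniformly.
logs = [
    ("u1", "a", 1.0, 0.5),
    ("u2", "b", 0.0, 0.5),
    ("u3", "a", 1.0, 0.5),
    ("u4", "b", 0.0, 0.5),
]


def always_a(context, action):
    """New policy to evaluate: always show product 'a'."""
    return 1.0 if action == "a" else 0.0


print(ips_estimate(logs, always_a))  # 1.0
```

The estimator only works for actions the logging policy had a non-zero chance of taking, so it depends on some exploration in production.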
What’s next ?
What’s next for us : Upcoming challenges
• Long(er)-term user profiles
• More and better product information (images, semantic, NLP)
• Instant-update of similarities
• (because batch computation is soooo last year)
• Joint product scoring
• (score full banner and not products independently)
What’s next for you : Fancy a try ?
On your own:
• We published datasets for click prediction
• 4GB display-click data : Kaggle challenge in 2014 http://bit.ly/1vgw2XC
• 1TB display-click data (industry’s largest dataset) : http://bit.ly/1PyH4Vq
• 4 billion observations
• 156 billion feature-values
• available on Microsoft Azure
• used by edX (UC Berkeley)
• We would be happy to share reco-centric data !
With us !
http://labs.criteo.com/jobs/
Questions?
Thank you !
r.lerallut@criteo.com @Rlerallut
d.gasselin@criteo.com
@recsysfr

Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Sheetaleventcompany
 

Recently uploaded (20)

(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
VVIP Pune Call Girls Sinhagad WhatSapp Number 8005736733 With Elite Staff And...
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Call Now ☎ 8264348440 !! Call Girls in Rani Bagh Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Rani Bagh Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Rani Bagh Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Rani Bagh Escort Service Delhi N.C.R.
 
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
 
Al Barsha Night Partner +0567686026 Call Girls Dubai
Al Barsha Night Partner +0567686026 Call Girls  DubaiAl Barsha Night Partner +0567686026 Call Girls  Dubai
Al Barsha Night Partner +0567686026 Call Girls Dubai
 
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
Shikrapur - Call Girls in Pune Neha 8005736733 | 100% Gennuine High Class Ind...
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls DubaiDubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
 
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
Ganeshkhind ! Call Girls Pune - 450+ Call Girl Cash Payment 8005736733 Neha T...
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceBusty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
Call Girls Service Chandigarh Lucky ❤️ 7710465962 Independent Call Girls In C...
 

RecSys 2015: Large-scale real-time product recommendation at Criteo

  • 1. Copyright © 2015 Criteo Large-Scale Real-Time Product Recommendation at Criteo Romain Lerallut, Diane Gasselin RecSys Vienna, Sept 18, 2015
  • 3. "The largest internet company you’ve never heard of" • Founded in 2005, in the adtech business since 2008 • Recommendation was our first product • Disruptive business models • 1,700 people worldwide (50+% for less than a year) • 300+ engineers • 26 offices • Live in 130 countries • 1B unique users
  • 4. A technology company first and foremost. We buy: • Inventory (ad spaces)! • Billions of times a day • All over the Internet • For 95% of the population => Funding the Web. We sell: • Clicks! • (that convert) • (that convert a lot) => Delight to our clients! We take the risk: you pay only for what you get
  • 5. Learn on huge volumes of data: 10,000 displays
  • 6. Learn on huge volumes of data: 10,000 displays lead to 50 clicks
  • 7. Learn on huge volumes of data: 10,000 displays lead to 50 clicks, which lead to 1 sale
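The funnel on slides 5 to 7 implies very small rates, which is why so much data is needed to learn anything. A quick sketch of the arithmetic, using only the three figures from the slides (the variable names are ours):

```python
# Figures taken from the slides: 10,000 displays -> 50 clicks -> 1 sale.
displays, clicks, sales = 10_000, 50, 1

ctr = clicks / displays              # click-through rate per display
cvr = sales / clicks                 # conversion rate per click
sales_per_display = sales / displays

print(f"CTR = {ctr:.2%}, CVR = {cvr:.2%}, sales/display = {sales_per_display:.4%}")
# prints: CTR = 0.50%, CVR = 2.00%, sales/display = 0.0100%
```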
  • 8. Traffic: 800k HTTP requests / sec (peak activity); 29,000 impressions / sec (peak activity)
  • 9. Traffic: 800k HTTP requests / sec (peak activity); 29,000 impressions / sec (peak activity); <10 ms to process an RTB request; <100 ms to process a reco request
  • 10. Physical infrastructure: 7 in-house data centers on 3 continents. Traffic: 800k HTTP requests / sec (peak activity); 29,000 impressions / sec (peak activity); <10 ms to process an RTB request; <100 ms to process a reco request
  • 11. Physical infrastructure: 7 in-house data centers on 3 continents. Big Data: ~15,000 servers, largest Hadoop cluster in Europe; more than 35 PB of storage. Traffic: 800k HTTP requests / sec (peak activity); 29,000 impressions / sec (peak activity); <10 ms to process an RTB request; <100 ms to process a reco request
  • 12. (Big) Data Sources: ad display data, 20B events / day; user behavior data, 2B events / day; catalog data, 1M+ products / client, 10k clients
  • 13. How do we do it?
  • 14. Recommend products for a user • What we want: reco(user) = products • But 1B users x 3B products! • And we need to scale and keep it fresh • What we can do: • Pre-select products offline (sources) • Refine the recommendation online
  • 15. Offline: prepare sources (diagram). Labels: advertiser events; co-events (item view - item view, item sale - item sale); similarities; complementarities; top N; best-of; best-of by category. Figures shown on the diagram: 50B; 350M keys, 12B values; 50M keys, 1B values
  • 16. Offline: prepare sources (illustration). User X saw orange shoes. Some candidate products for user X: historical, similar, complementary, best-of (other users: most viewed products on the client website)
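The "item view - item view" co-events on slide 15 suggest counting which products are viewed by the same users and keeping the top N per product. A minimal sketch of that idea, assuming an in-memory dictionary of views per user; the function name, the data layout, and the sample data are hypothetical, and the production system runs this as Map-Reduce jobs over billions of events rather than in a single process:

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, List


def top_n_coviewed(views_by_user: Dict[str, List[str]], n: int) -> Dict[str, List[str]]:
    """For each product, return the n products most often viewed by the same users."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for items in views_by_user.values():
        # Every ordered pair of distinct products seen by one user is a co-view.
        for a, b in permutations(set(items), 2):
            counts[a][b] += 1
    return {item: [other for other, _ in c.most_common(n)] for item, c in counts.items()}


# Hypothetical toy data: three users, three products.
views = {
    "u1": ["orange_shoes", "red_shoes"],
    "u2": ["orange_shoes", "red_shoes", "hat"],
    "u3": ["orange_shoes", "red_shoes"],
}
similar = top_n_coviewed(views, n=2)
print(similar["orange_shoes"])  # prints: ['red_shoes', 'hat']
```

The result is a key-value table (product to list of similar products), which is the shape the "keys" and "values" figures on slide 15 point to.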
  • 17. Reco overview (diagram). Labels: OFFLINE; advertiser events; display, click, sale logs; source computation (Map-Reduce jobs); prediction models; sources; catalog; recommendation service. Figures shown on the diagram: 12h, 4h, 6h, 4.5B, 500M, 100K qps, 50B
  • 18. ML model • Logistic regression models because: • They scale • They are fast • They can handle lots of features (with a bit of magic) • Features: product-specific, user-specific, user-product interactions, display-specific
  • 19. Online: sources: similarities, most viewed, most bought
  • 20. Online: merge of products from the sources: similarities, most viewed, most bought
  • 21. Online: scoring of the merged products (similarities, most viewed, most bought): 0.02 0.12 0.06 0.18 0.03 0.05 0.01 0.005 0.011 0.013 0.004 0.007
  • 22. Online: scoring, sorted by score (similarities, most viewed, most bought): 0.18 0.12 0.06 0.05 0.03 0.02 0.013 0.011 0.01 0.007 0.005 0.004
  • 23. Online: candidates: 0.18 0.12 0.06 0.05 0.03 0.02 0.013 0.011 0.01 0.007 0.005 0.004 (banner illustration: SHOP, -50%)
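Slides 19 to 23 walk through the online step: fetch candidates from each source, merge them, score each with the logistic regression model, sort, and keep the best ones for the banner. A minimal sketch of that flow, assuming two made-up features per product; the product ids, feature values, weights, and function names are all hypothetical and are not taken from the talk:

```python
import math
from typing import Dict, List

# Hypothetical pre-computed sources (offline step): source name -> candidate product ids.
SOURCES: Dict[str, List[str]] = {
    "similarities": ["p1", "p2", "p3"],
    "most_viewed": ["p3", "p4", "p5"],
    "most_bought": ["p5", "p6"],
}

# Hypothetical features per product: [popularity, similarity to the last viewed product].
FEATURES: Dict[str, List[float]] = {
    "p1": [0.2, 0.9], "p2": [0.1, 0.5], "p3": [0.8, 0.7],
    "p4": [0.9, 0.0], "p5": [0.6, 0.1], "p6": [0.3, 0.0],
}
WEIGHTS = [1.0, 2.0]  # made-up model weights
BIAS = -4.0           # made-up intercept


def merge_candidates(sources: Dict[str, List[str]]) -> List[str]:
    """Union of all sources, de-duplicated, keeping first-seen order."""
    seen: Dict[str, None] = {}
    for products in sources.values():
        for product in products:
            seen.setdefault(product, None)
    return list(seen)


def score(features: List[float]) -> float:
    """Logistic regression: sigmoid of a weighted sum of the features."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def recommend(sources: Dict[str, List[str]], k: int) -> List[str]:
    """Merge, score, sort by descending score, keep the top k."""
    candidates = merge_candidates(sources)
    ranked = sorted(candidates, key=lambda p: score(FEATURES[p]), reverse=True)
    return ranked[:k]


print(recommend(SOURCES, k=3))  # prints: ['p3', 'p1', 'p2']
```

Because the sources are pre-computed, the online work is limited to a handful of lookups and a dot product per candidate, which is what makes a budget of under 100 ms per reco request plausible.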
  • 24. Evaluation
  • 25. The basics: online A/B testing • It is the only truth we have • 50% of users on model A • 50% of users on model B (illustration: two "My company BUY! BUY! BUY!" banners)
  • 26. The basics: online A/B testing • It is the only truth we have • 50% of users on model A • 50% of users on model B • But it is onerous • If the model is not good, we lose money, fast! • Tests are long (~2 weeks needed to have good confidence intervals) • Code has to be prod-ready (no bugs, good performance), since we run 24/7 • Can be heavy on the infrastructure • And it does not take long-term effects into account
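The 50/50 split and the need for "good confidence intervals" on slides 25 and 26 can be illustrated as follows. This is a sketch under our own assumptions, not the procedure described in the talk: users are assigned by hashing their id with a per-test salt, and the interval is a standard normal-approximation interval on the difference of two rates. The function names, the salt, and the counts in the usage line are hypothetical.

```python
import hashlib
import math
from typing import Tuple


def assign_bucket(user_id: str, salt: str = "test-42") -> str:
    """Deterministic 50/50 split: the same user always lands in the same bucket."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def diff_confidence_interval(
    conv_a: int, n_a: int, conv_b: int, n_b: int, z: float = 1.96
) -> Tuple[float, float]:
    """Approximate 95% interval for (rate of B - rate of A), normal approximation."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Hypothetical counts: clicks out of displays for each model.
low, high = diff_confidence_interval(conv_a=500, n_a=100_000, conv_b=540, n_b=100_000)
print(low < 0 < high)  # True here: the interval still contains zero
```

With rates this small, the interval only shrinks below the size of the effect after many events, which is consistent with the test duration quoted on the slide.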
  • 27. The test framework for prediction • ALTERNATIVE: a framework that replays production logs (offline) • 30,000 tests / year • Replay ~x100 • BUT: we only have data on the products we display (exploration is costly) • SO: we can only make sure we are not completely mistaken
  • 28. Ultimate solution: offline A/B testing • Find the best offline predictor for online performance • "Counterfactual Reasoning and Learning Systems": Léon Bottou (Microsoft Research, Redmond, WA), Jonas Peters (Max Planck Institute, Tübingen), Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson • But we haven’t succeeded in making it precisely match reality...
  • 29. Ultimate solution: offline A/B testing • Find the best offline predictor for online performance • "Counterfactual Reasoning and Learning Systems": Léon Bottou (Microsoft Research, Redmond, WA), Jonas Peters (Max Planck Institute, Tübingen), Joaquin Quiñonero-Candela, Denis X. Charles, D. Max Chickering, Elon Portugaly, Dipankar Ray, Patrice Simard, Ed Snelson • But we haven’t succeeded in making it precisely match reality... YET
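One common building block for this kind of offline A/B testing is importance sampling over logged displays: each logged reward is reweighted by how much more (or less) likely the new policy would have been to show the same product. The sketch below is a generic inverse-propensity estimator, offered as an illustration of the idea rather than as the method used by the authors; the record layout, names, and toy logs are assumptions.

```python
from typing import Callable, List, NamedTuple


class LoggedEvent(NamedTuple):
    context: str       # e.g. a user or request identifier
    action: str        # the product that was actually displayed
    propensity: float  # probability the logging policy had of displaying it
    reward: float      # e.g. 1.0 if the display was clicked, else 0.0


def ips_estimate(
    logs: List[LoggedEvent],
    new_policy_prob: Callable[[str, str], float],
) -> float:
    """Estimate the average reward the new policy would have obtained on these logs."""
    total = 0.0
    for event in logs:
        # Reweight by the ratio of new-policy to logging-policy probabilities.
        weight = new_policy_prob(event.context, event.action) / event.propensity
        total += weight * event.reward
    return total / len(logs)


# Hypothetical logs: the logging policy showed "a" or "b" with probability 0.5 each.
logs = [
    LoggedEvent("u1", "a", 0.5, 1.0),
    LoggedEvent("u2", "b", 0.5, 0.0),
]
always_a = lambda context, action: 1.0 if action == "a" else 0.0
print(ips_estimate(logs, always_a))  # prints: 1.0
```

The estimator needs a non-zero logging probability for every product the new policy might show, which ties back to the point on slide 27: data exists only for displayed products, and exploration is costly.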
  • 30. What’s next?
  • 31. What’s next for us: upcoming challenges • Long(er)-term user profiles
  • 32. What’s next for us: upcoming challenges • Long(er)-term user profiles • More and better product information (images, semantics, NLP)
  • 33. What’s next for us: upcoming challenges • Long(er)-term user profiles • More and better product information (images, semantics, NLP) • Instant update of similarities • (because batch computation is soooo last year)
  • 34. What’s next for us: upcoming challenges • Long(er)-term user profiles • More and better product information (images, semantics, NLP) • Instant update of similarities • (because batch computation is soooo last year) • Joint product scoring • (score the full banner rather than each product independently)
  • 35. What’s next for you: fancy a try? • With us! http://labs.criteo.com/jobs/ • On your own: we published datasets for click prediction • 4GB display-click data: Kaggle challenge in 2014 http://bit.ly/1vgw2XC • 1TB display-click data (industry’s largest dataset): http://bit.ly/1PyH4Vq • 4 billion observations • 156 billion feature values • available on Microsoft Azure • used by edX (UC Berkeley) • We would be happy to share reco-centric data!
  • 36. Questions?
  • 37. Thank you! r.lerallut@criteo.com @Rlerallut d.gasselin@criteo.com @recsysfr