Personalized Job Recommendation
System at LinkedIn: Practical
Challenges and Lessons Learned
Krishnaram Kenthapadi
Staff Software Engineer - LinkedIn
Benjamin Le
Senior Software Engineer - LinkedIn
Ganesh Venkataraman
Engineering Manager - Airbnb
Practical Challenges in JYMBII (Jobs You May Be Interested In)
• Candidate Selection
• Personalized Relevance Models at Scale
• Jobs Marketplace
Candidate Selection
Why Candidate Selection?
• Need to meet the latency requirements of an online recommendation system
• Only a subset of jobs is relevant to a user, based on their domain and expertise
• Enables scoring with more complex and computationally expensive models
How to combine query clauses?
Activity Based Clauses
• past_applied_title ^ job_title
• past_searched_loc ^ job_location
Profile – Job Match Clauses
• user_title ^ job_title
• user_skills ^ job_skills
Explicit Preference Clauses
• title_pref ^ job_title
• location_pref ^ job_location
Latent Preference Clauses
• aspiring_sen ^ job_seniority
• location_pref ^ job_location
Decision Tree Based Approach
[Grover et al., CIKM 2017]
• Train on top-k ranked documents
as positives and tail end as
negatives.
• Extract combinations of clauses
from the decision tree by traversing
root-to-leaf paths.
• Do a weighted combination of
the clauses.
Decision Tree Based Query Generation
• Trees are a natural way to
learn combinations of clauses
• Weighted combinations can
be learned by looking at
purity of the nodes in a
WAND query
[Figure: example decision tree with split nodes Title Match, Seniority Match, and Function Match; Yes/No branches lead to Positive or Negative leaves]
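The root-to-leaf extraction described above can be sketched as follows. This is a minimal illustration, not the method from [Grover et al., CIKM 2017]: the tree shape, the leaf counts, and the names `Node` and `positive_paths` are hypothetical. The sketch also assumes that only "yes" branches contribute a clause, and that a leaf counts as positive when more than half of its training examples are positives. Leaf purity is used as the weight of each extracted clause combination.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Internal nodes carry a clause name; leaves carry positive/negative counts.
    clause: Optional[str] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    positives: int = 0
    negatives: int = 0

    def is_leaf(self) -> bool:
        return self.clause is None


def positive_paths(node: Node, path: Tuple[str, ...] = ()) -> List[Tuple[Tuple[str, ...], float]]:
    """Return (clauses on the path, leaf purity) for every majority-positive leaf."""
    if node.is_leaf():
        total = node.positives + node.negatives
        purity = node.positives / total if total else 0.0
        return [(path, purity)] if purity > 0.5 else []
    # Assumption of this sketch: a "yes" branch adds a required clause,
    # while a "no" branch adds nothing to the query.
    return (positive_paths(node.yes, path + (node.clause,))
            + positive_paths(node.no, path))


# Hypothetical tree and counts, for illustration only.
tree = Node(
    clause="title_match",
    yes=Node(
        clause="function_match",
        yes=Node(positives=90, negatives=10),
        no=Node(positives=60, negatives=40),
    ),
    no=Node(
        clause="seniority_match",
        yes=Node(positives=70, negatives=30),
        no=Node(positives=5, negatives=95),
    ),
)

weighted_clauses = positive_paths(tree)
for clauses, weight in weighted_clauses:
    print(" AND ".join(clauses), weight)
```

Each extracted pair is a conjunction of clauses plus a weight, which is the shape a weighted query needs.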
Online A/B Testing
• No drop in engagement metrics observed
Personalized Relevance
Models
Generalized Linear Mixed Models (GLMM)
• Combines several linear models into one additive model
• Fixed Effect – Population Average Model
• Random Effects – Entity Specific Models
Response Prediction (Logistic Regression)
• Global Fixed Effect Model: Content-Based Similarity
• Per-user Random Effect Models (User 1, User 2, ...): Personalization
• Per-job Random Effect Models (Job 1, Job 2, ...): Collaboration
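The additive structure above can be sketched as a single scoring function. This is a minimal sketch under stated assumptions: the function `glmm_score`, the dictionary-based feature vectors, and the example weights are hypothetical and not taken from the talk. The fixed effect scores user-job pair features, each user's random effect scores job features, and each job's random effect scores user features. The three scores are summed on the logit scale before the logistic link is applied.

```python
import math
from typing import Dict

Vector = Dict[str, float]


def dot(weights: Vector, features: Vector) -> float:
    """Sparse dot product over the features that are present."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def glmm_score(
    pair_features: Vector,
    user_features: Vector,
    job_features: Vector,
    fixed: Vector,
    user_effects: Dict[str, Vector],
    job_effects: Dict[str, Vector],
    user_id: str,
    job_id: str,
) -> float:
    """P(user applies to job) as a logistic function of the summed effects."""
    logit = dot(fixed, pair_features)                        # population average
    logit += dot(user_effects.get(user_id, {}), job_features)  # per-user model
    logit += dot(job_effects.get(job_id, {}), user_features)   # per-job model
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical weights: an entity without a random effect model
# falls back to the fixed effect alone.
fixed = {"title_sim": 2.0}
user_effects = {"user_2": {"job=Software Engineer": 1.0}}
job_effects: Dict[str, Vector] = {}

cold = glmm_score({"title_sim": 0.5}, {}, {"job=Software Engineer": 1.0},
                  fixed, user_effects, job_effects, "new_user", "job_1")
warm = glmm_score({"title_sim": 0.5}, {}, {"job=Software Engineer": 1.0},
                  fixed, user_effects, job_effects, "user_2", "job_1")
print(cold, warm)
```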
Features
• Dense vector bag-of-words (BoW) similarity features in the global model for generalization
• e.g., similarity in title text is a good predictor of response
• Sparse cross features in the global, user, and job models for memorization
• e.g., memorize that computer science students will transition to entry-level engineering roles
• Vector BoW Similarity Feature: Sim(User Title BoW, Job Title BoW)
• Global Model Cross Feature: AND(user = Comp Sci. Student, job = Software Engineer)
• User Model Cross Feature: AND(user = User 2, job = Software Engineer)
• Job Model Cross Feature: AND(user = Comp Sci. Student, job = Job 1)
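The two feature families above can be illustrated with a short sketch. It assumes whitespace tokenization and cosine similarity over raw term counts, which may differ from the dense vectors used in production. The helper names `bow`, `cosine_similarity`, and `cross_feature` are hypothetical.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag of words: lowercase, whitespace-tokenized term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    shared = a.keys() & b.keys()
    numerator = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return numerator / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cross_feature(user_attr: str, job_attr: str) -> str:
    """Name of a sparse indicator feature that fires for one (user, job) combination."""
    return f"AND(user={user_attr}, job={job_attr})"


# Generalization: similar titles score high even for unseen pairs.
title_sim = cosine_similarity(bow("Software Engineer"), bow("Senior Software Engineer"))

# Memorization: one weight per specific combination.
global_cross = cross_feature("Comp Sci. Student", "Software Engineer")
print(title_sim, global_cross)
```

The similarity feature is a single number shared by all pairs, while each cross feature gets its own weight, which is why the first generalizes and the second memorizes.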
Training a GLMM at Scale
• Millions of random effect models * thousands of features per model
= Billions of parameters
• Infeasible to run traditional fitting methods on this very large linear
model with industry scale datasets
• Key Idea: For each entity's random effect, only the labeled data
associated with that entity is needed to fit its model
Parallel Block-wise Coordinate Descent
[Zhang et al., KDD 2016]
1. Global Fixed Effect Model: all labeled data is first used to train the fixed effect model.
2. Random Effect Models (e.g., Nadia’s, Ben’s, Ganesh’s, Liang’s): labeled data is partitioned by entity to train the random effect models in parallel. Repeat for each random effect.
3. After training the random effects, cycle back and train the fixed effect model again if the convergence criteria are not met.
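The alternating procedure can be sketched in a few dozen lines. This is a toy, single-machine illustration and not the GLMix implementation from [Zhang et al., KDD 2016]: it assumes one random effect type, plain gradient descent, and a hypothetical L2 penalty of 0.1 on the random effects. The function names and all hyperparameters are made up for the example. The idea it shows is that each block is fitted while the other block's score is held fixed as an offset.

```python
import math
from collections import defaultdict


def dot(weights, features):
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, init, l2=0.0, steps=200, lr=0.5):
    """Gradient descent for logistic regression with a fixed per-row offset.

    rows: list of (features, offset, label); the offset is the score already
    explained by the other model blocks and is held constant here.
    """
    weights = dict(init)
    for _ in range(steps):
        grad = defaultdict(float)
        for features, offset, label in rows:
            error = sigmoid(offset + dot(weights, features)) - label
            for name, value in features.items():
                grad[name] += error * value
        for name, g in grad.items():
            w = weights.get(name, 0.0)
            weights[name] = w - lr * (g / len(rows) + l2 * w)
    return weights


def train_glmm(data, max_sweeps=10, tol=1e-4):
    """data: list of (entity_id, features, label) rows."""
    fixed, random_effects = {}, defaultdict(dict)
    for sweep in range(max_sweeps):
        previous = dict(fixed)
        # Step 1: fit the fixed effect on all rows, random-effect scores as offsets.
        rows = [(x, dot(random_effects[e], x), y) for e, x, y in data]
        fixed = fit_logistic(rows, fixed)
        # Step 2: partition rows by entity; each entity only needs its own rows,
        # so these fits are independent and could run in parallel.
        by_entity = defaultdict(list)
        for e, x, y in data:
            by_entity[e].append((x, dot(fixed, x), y))
        for e, entity_rows in by_entity.items():
            random_effects[e] = fit_logistic(entity_rows, random_effects[e], l2=0.1)
        # Step 3: stop once the fixed effect has stopped moving between sweeps.
        delta = max((abs(fixed[k] - previous.get(k, 0.0)) for k in fixed), default=0.0)
        if sweep > 0 and delta < tol:
            break
    return fixed, dict(random_effects)


# Toy data: user "a" always applies, user "b" never does.
data = [("a", {"bias": 1.0}, 1)] * 4 + [("b", {"bias": 1.0}, 0)] * 4
fixed, random_effects = train_glmm(data)
print(fixed, random_effects)
```

On this toy data the population-level weight stays near zero, and the per-user models absorb the difference between the two users.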
Online A/B Testing
Jobs Marketplace
The Ideal Jobs Marketplace
• Maximize number of confirmed hires while minimizing number of job
applications
• This maximizes the utility of both job seekers and job posters
• Ranking by P(User u applies to Job j | u, j) only optimizes for user
engagement
1. Recommend highly relevant jobs to users
2. Ensure each job posting
• Receives sufficient number of applications from qualified candidates to
guarantee a hire
• But does not overwhelm the job poster with too many applications
Potential Solution?
• Rank by likelihood that user will apply for the job and pass the
interview and accept the job offer?
• Data on whether a candidate passed an interview is confidential
• Data about the offer to the candidate is confidential too
• More importantly, modeling this requires a careful understanding of the potential
bias and unfairness of a model due to societal bias in the data
• Practically, we solve the job application redistribution problem instead
• Ensure a job does not receive too many or too few applications
Diminishing Return of #Applications
Our High-level Idea: Early Intervention
[Borisyuk et al., KDD 2017]
• Say a job expires at time T
• At any time t < T
• Predict #applications it would receive at time T
• Given data received from time 0 to t
• If too few => Boost ranking score P(User u applies to Job j | u, j)
• If too many => Penalize ranking score P(User u applies to Job j | u, j)
• Otherwise => No intervention
• Key: Forecasting model of #applications per Job, using signals from:
• # Applies / Impressions the job has received so far
• Other features (x_jt), e.g.:
• Seasonality (time of day, day of week)
• Job attributes: title, company, industry, qualifications, …
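The intervention logic above can be sketched as follows. The forecast here is a deliberately naive stand-in: it extrapolates the apply rate seen so far and ignores the seasonality and job-attribute signals that the forecasting model in [Borisyuk et al., KDD 2017] uses. The thresholds `too_few` and `too_many` and the multipliers `boost` and `penalty` are hypothetical parameters chosen for illustration.

```python
def forecast_final_applications(applies_so_far: int, t: float, T: float) -> float:
    """Naive forecast: assume the apply rate observed in [0, t] continues until T."""
    if not 0 < t <= T:
        raise ValueError("expected 0 < t <= T")
    return applies_so_far * (T / t)


def adjust_score(p_apply: float, forecast: float, too_few: float, too_many: float,
                 boost: float = 1.2, penalty: float = 0.8) -> float:
    """Boost, penalize, or leave the ranking score based on the forecast."""
    if forecast < too_few:
        return min(1.0, p_apply * boost)   # job is heading for too few applications
    if forecast > too_many:
        return p_apply * penalty           # job is heading for too many applications
    return p_apply                         # otherwise: no intervention


# A job halfway to expiry (t=5, T=10) with 2 applications so far.
forecast = forecast_final_applications(applies_so_far=2, t=5.0, T=10.0)
boosted = adjust_score(0.5, forecast, too_few=8, too_many=100)
print(forecast, boosted)
```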
Online A/B Testing
• Control model for ranking: optimize for user engagement only
• Split the jobs into 3 buckets every day:
• Bucket 1: Received <8 applications up to now
• Bucket 2: Received [8, 100] applications up to now
• Bucket 3: Received >100 applications up to now
[Figure label: Re-distribute]
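The daily bucketing rule translates directly into code. The thresholds 8 and 100 come from the slide; the function names and the example job counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def bucket(applications_so_far: int) -> int:
    """Map a job's application count to bucket 1, 2, or 3."""
    if applications_so_far < 8:
        return 1
    if applications_so_far <= 100:
        return 2
    return 3


def split_into_buckets(applications: Dict[str, int]) -> Dict[int, List[str]]:
    """Group job ids by bucket; intended to be recomputed every day."""
    buckets: Dict[int, List[str]] = defaultdict(list)
    for job_id, count in applications.items():
        buckets[bucket(count)].append(job_id)
    return dict(buckets)


# Hypothetical counts for three jobs.
daily = split_into_buckets({"job_a": 3, "job_b": 42, "job_c": 250})
print(daily)
```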
Summary
• Model candidate selection query generation using decision trees
• Personalization at Scale through GLMM
• Realizing the ideal jobs marketplace through application
redistribution
• But a lot of research work still needed to
• Reformulate problem to model optimizing for a healthy marketplace directly
• Understand and quantify bias and fairness in those potential new models
References
• [Borisyuk et al., 2016] CaSMoS: A framework for learning candidate
selection models over structured queries and documents, KDD 2016
• [Borisyuk et al., 2017] LiJAR: A System for Job Application
Redistribution towards Efficient Career Marketplace, KDD 2017
• [Grover et al., 2017] Latency reduction via decision tree based query
construction, CIKM 2017
• [Zhang et al., 2016] GLMix: Generalized Linear Mixed Models For
Large-Scale Response Prediction, KDD 2016
Appendix
Jobs You May Be Interested In (JYMBII)
Problem Formulation
• Rank jobs by P(User u applies to Job j | u, j)
• Model response given:
• User: Career History, Skills, Education, Connections
• Job: Job Title, Description, Location, Company
JYMBII Infrastructure
[Figure: architecture diagram with numbered steps 1 to 7, split into an Offline System and an Online System. Components shown: User Interaction Logs; Offline Modeling Workflow + User / Item derived features; User; Query Construction; User Feature Store; Search-based Candidate Selection & Retrieval; Search Index of Items; Item derived features; Recommendation Ranking; Ranking Model Store; Additional Re-ranking/Filtering Steps]
Understanding WAND Query
Query: “Quality Assurance Engineer”
AND Query: “Quality AND Assurance AND Engineer”
[Figure: two example documents; one matches (✅) and one does not (❌)]
WAND: “(Quality[5] AND Assurance[5] AND Engineer[1]) [10]”
[Figure: two example documents; both match (✅ ✅)]
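The difference between the two queries can be sketched in code. A WAND (weak AND) query assigns a weight to each term and matches a document when the weights of the terms it contains sum to at least the threshold. The weights and threshold below are the ones on the slide; the example documents and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def and_match(query_terms: Iterable[str], doc_terms: Set[str]) -> bool:
    """Strict AND: every query term must be present."""
    return all(term in doc_terms for term in query_terms)


def wand_match(weights: Dict[str, int], threshold: int, doc_terms: Set[str]) -> bool:
    """Weak AND: the matched terms' weights must reach the threshold."""
    return sum(w for term, w in weights.items() if term in doc_terms) >= threshold


weights = {"quality": 5, "assurance": 5, "engineer": 1}
threshold = 10

# Hypothetical job titles, for illustration only.
exact = tokens("Quality Assurance Engineer")
partial = tokens("Quality Assurance Analyst")

for doc in (exact, partial):
    print(and_match(weights, doc), wand_match(weights, threshold, doc))
```

With these weights a title only needs "Quality" and "Assurance" to reach the threshold, so the partial title is retrieved by WAND but rejected by the strict AND query.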
Offline Evaluation
• Utilize offline query replay to validate
query against current baseline
• Replay baseline and new query to
compute metrics from the retention of
actions and operational metrics
• Applied jobs retained
• Hits retrieved
• Kendall’s Tau
• Mimicking production ranking through
replay to get more reliable estimate of
online metrics
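Two of the replay metrics above can be sketched as follows. The definitions are assumptions made for illustration: "applied jobs retained" is taken to be the fraction of previously applied jobs that the new query still retrieves, and Kendall's Tau is computed in its simplest form (tau-a, no ties) over the jobs that both rankings return. The job ids are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def applied_jobs_retained(applied: Sequence[str], retrieved: Sequence[str]) -> float:
    """Fraction of applied jobs that the new query still retrieves."""
    applied_set = set(applied)
    if not applied_set:
        raise ValueError("no applied jobs to evaluate")
    return len(applied_set & set(retrieved)) / len(applied_set)


def kendall_tau(baseline: List[str], candidate: List[str]) -> float:
    """Kendall's tau-a between two rankings, over the items they share."""
    position = {job: rank for rank, job in enumerate(candidate)}
    common = [job for job in baseline if job in position]
    if len(common) < 2:
        raise ValueError("need at least two shared items")
    concordant = discordant = 0
    # Each pair (a, b) is listed in baseline order, so a ranks above b there.
    for a, b in combinations(common, 2):
        if position[a] < position[b]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


retained = applied_jobs_retained(["j1", "j2"], ["j2", "j3", "j4"])
tau = kendall_tau(["j1", "j2", "j3"], ["j1", "j3", "j2"])
print(retained, tau)
```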
Class 1 | NFPA 72 | Overview Fire Alarm System
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 

Practical Challenges and Lessons Learned from Building a Personalized Job Recommendation System

  • 1. Personalized Job Recommendation System at LinkedIn: Practical Challenges and Lessons Learned. Krishnaram Kenthapadi (Staff Software Engineer, LinkedIn); Benjamin Le (Senior Software Engineer, LinkedIn); Ganesh Venkataraman (Engineering Manager, Airbnb)
  • 2. Practical Challenges in JYMBII • Candidate Selection • Personalized Relevance Models at Scale • Jobs Marketplace
  • 4. Why Candidate Selection? • Need to meet latency requirements of an online recommendation system • Only a subset of jobs is relevant to a user based on their domain and expertise • Enables scoring with more complex and computationally expensive models
  • 5. How to combine query clauses? • Activity Based Clauses: past_applied_title ^ job_title, past_searched_loc ^ job_location • Profile – Job Match Clauses: user_title ^ job_title, user_skills ^ job_skills • Explicit Preference Clauses: title_pref ^ job_title, location_pref ^ job_location • Latent Preference Clauses: aspiring_sen ^ job_seniority, location_pref ^ job_location
  • 6. Decision Tree Based Approach [Grover et al., CIKM 2017] • Train on top-k ranked documents as positives and the tail end as negatives. • Extract combinations of clauses from the decision tree by traversing root-to-leaf paths. • Do a weighted combination of the clauses.
  • 7. Decision Tree Based Query Generation • Trees are a natural way to learn combinations of clauses • Weighted combinations can be learned by looking at purity of the nodes in a WAND query [Figure: decision tree with Title Match, Seniority Match, and Function Match nodes, Yes/No branches, and Positive/Negative leaves]
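The path-extraction step described on slides 6 and 7 can be sketched roughly as follows. This is a minimal illustration, not the method from Grover et al.: the `Node` class, the tree shape, the leaf counts, and the rule "keep a path if its leaf purity exceeds 0.5" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node: internal nodes test a clause, leaves hold label counts."""
    clause: Optional[str] = None      # clause tested here; None marks a leaf
    yes: Optional["Node"] = None      # subtree when the clause matches
    no: Optional["Node"] = None       # subtree when the clause does not match
    positives: int = 0                # leaf only: positive training documents
    negatives: int = 0                # leaf only: negative training documents


def extract_conjunctions(node: Node, path: Tuple[str, ...] = ()) -> List[Tuple[Tuple[str, ...], float]]:
    """Walk root-to-leaf paths and return (matched clauses, leaf purity) pairs.

    Only clauses taken on a "yes" branch are recorded, and only leaves that
    are mostly positive are kept (an assumed cut-off of 0.5).
    """
    if node.clause is None:
        total = node.positives + node.negatives
        purity = node.positives / total if total else 0.0
        return [(path, purity)] if purity > 0.5 and path else []
    found = extract_conjunctions(node.yes, path + (node.clause,))
    found += extract_conjunctions(node.no, path)
    return found


# Hypothetical tree; the counts are made up for illustration.
tree = Node(
    clause="title_match",
    yes=Node(
        clause="seniority_match",
        yes=Node(positives=90, negatives=10),
        no=Node(positives=20, negatives=80),
    ),
    no=Node(
        clause="function_match",
        yes=Node(positives=60, negatives=40),
        no=Node(positives=5, negatives=95),
    ),
)

conjunctions = extract_conjunctions(tree)
```

Each returned tuple pairs a conjunction of clauses with a weight taken from leaf purity, which is one plausible way to feed a weighted query.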
  • 8. Online A/B Testing • No drop in engagement metrics observed
  • 10. Generalized Linear Mixed Models (GLMM) • Mixture of linear models into an additive model • Fixed Effect – Population Average Model • Random Effects – Entity Specific Models [Figure: Response Prediction (Logistic Regression) combining a Global Fixed Effect Model (Content-Based Similarity), User 1 and User 2 Random Effect Models (Personalization), and Job 1 and Job 2 Random Effect Models (Collaboration)]
  • 11. Features • Dense Vector Bag of Words Similarity Features in the global model for Generalization • e.g. similarity in title text is a good predictor of response • Sparse Cross Features in the global, user, and job models for Memorization • e.g. memorize that computer science students will transition to entry-level engineering roles • Vector BoW Similarity Feature: Sim(User Title BoW, Job Title BoW) • Global Model Cross Feature: AND(user = Comp Sci. Student, job = Software Engineer) • User Model Cross Feature: AND(user = User 2, job = Software Engineer) • Job Model Cross Feature: AND(user = Comp Sci. Student, job = Job 1)
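As a rough illustration of how the fixed effect and the two random effects on slides 10 and 11 add up into one logistic prediction, here is a minimal sketch. The function names, feature names, and weights are hypothetical; the only thing taken from the slides is the additive structure (global model plus per-user model plus per-job model, passed through a logistic link).

```python
import math
from typing import Dict

Features = Dict[str, float]
Weights = Dict[str, float]


def dot(weights: Weights, features: Features) -> float:
    """Sparse dot product; features without a learned weight contribute 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def glmm_apply_probability(
    global_w: Weights, user_w: Weights, job_w: Weights,
    global_x: Features, user_x: Features, job_x: Features,
) -> float:
    """P(apply) = sigmoid(fixed effect + user random effect + job random effect).

    user_w holds this user's own coefficients over job-side features;
    job_w holds this job's own coefficients over user-side features.
    """
    logit = dot(global_w, global_x) + dot(user_w, user_x) + dot(job_w, job_x)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical example: the global model sees a title similarity feature,
# and User 2's random effect has memorized an affinity for Software Engineer jobs.
global_only = glmm_apply_probability(
    {"title_sim": 2.0}, {}, {},
    {"title_sim": 0.5}, {"job=software_engineer": 1.0}, {},
)
personalized = glmm_apply_probability(
    {"title_sim": 2.0}, {"job=software_engineer": 1.5}, {},
    {"title_sim": 0.5}, {"job=software_engineer": 1.0}, {},
)
```

With an empty user model the score falls back to the population-average prediction, which is why the same formula can serve users with little or no history.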
  • 12. Training a GLMM at Scale • Millions of random effect models × thousands of features per model = billions of parameters • Infeasible to run traditional fitting methods on this very large linear model with industry-scale datasets • Key Idea: For each entity's random effect, only the labeled data associated with that entity is needed to fit its model
  • 13. Parallel Block-wise Coordinate Descent [Zhang et al., KDD 2016]
  • 14. Training a GLMM at Scale • All labeled data is first used to train the Global Fixed Effect Model
  • 15. Training a GLMM at Scale • Labeled data is partitioned by entity to train random effect models in parallel (Nadia’s, Ben’s, Ganesh’s, and Liang’s Random Effect Models) • Repeat for each random effect
  • 16. Training a GLMM at Scale • After training the random effects, cycle back and train the Global Fixed Effect Model again if the convergence criteria are not met
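The alternating schedule on slides 14 to 16 can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the algorithm from Zhang et al.: it uses only a per-user random effect, plain gradient descent as the inner solver, a fixed number of outer passes instead of a convergence test, and made-up regularization values. The names `fit_logistic` and `train_glmm` are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def fit_logistic(rows, dim, l2, steps=200, lr=0.5):
    """Gradient descent on L2-regularized log loss.

    rows: list of (x, offset, y); the offset is the score contributed by the
    other models, held fixed while this block is being fit.
    """
    w = [0.0] * dim
    for _ in range(steps):
        grad = [l2 * wi for wi in w]
        for x, offset, y in rows:
            err = sigmoid(offset + dot(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi / len(rows)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def train_glmm(data, dim_global, dim_user, outer_iters=5):
    """data: list of (user_id, x_global, x_user, y)."""
    fixed = [0.0] * dim_global
    random_effects = defaultdict(lambda: [0.0] * dim_user)
    for _ in range(outer_iters):
        # Step 1: fit the fixed effect on all labeled data,
        # treating the current random-effect scores as offsets.
        rows = [(xg, dot(random_effects[u], xu), y) for u, xg, xu, y in data]
        fixed = fit_logistic(rows, dim_global, l2=0.01)
        # Step 2: partition by user. Each partition only needs that user's
        # rows, so these fits are independent and could run in parallel.
        by_user = defaultdict(list)
        for u, xg, xu, y in data:
            by_user[u].append((xu, dot(fixed, xg), y))
        for u, user_rows in by_user.items():
            random_effects[u] = fit_logistic(user_rows, dim_user, l2=1.0)
    return fixed, dict(random_effects)


# Toy data: user "a" always applies, user "b" never does.
toy = [("a", [1.0], [1.0], 1)] * 4 + [("b", [1.0], [1.0], 0)] * 4
fixed_w, user_w = train_glmm(toy, dim_global=1, dim_user=1)
```

On this toy data the fixed effect stays near zero because the population is split evenly, while the two random effects move in opposite directions, which is the per-entity deviation the slides describe.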
  • 19. The Ideal Jobs Marketplace • Maximize number of confirmed hires while minimizing number of job applications • This maximizes the utility of job seekers and job posters • Ranking by P(User u applies to Job j | u, j) only optimizes for user engagement 1. Recommend highly relevant jobs to users 2. Ensure each job posting • Receives a sufficient number of applications from qualified candidates to guarantee a hire • But does not overwhelm the job poster with too many applications
  • 20. The Ideal Jobs Marketplace
  • 21. Potential Solution? • Rank by likelihood that user will apply for the job and pass the interview and accept the job offer? • Data on whether a candidate passed an interview is confidential • Data about the offer to the candidate is confidential too • More importantly, modeling this requires careful understanding of potential bias and unfairness of a model due to societal bias in the data • Practically, we solve the job application redistribution problem instead • Ensure a job does not receive too many or too few applications
  • 22. Diminishing Return of #Applications
  • 23. Our High-level Idea: Early Intervention [Borisyuk et al., KDD 2017] • Say a job expires at time T • At any time t < T • Predict #applications it would receive at time T • Given data received from time 0 to t • If too few => Boost ranking score P(User u applies to Job j | u, j) • If too many => Penalize ranking score P(User u applies to Job j | u, j) • Otherwise => No intervention • Key: Forecasting model of #applications per Job, using signals from: • # Applies / Impressions the job has received so far • Other features (x_jt): e.g. • Seasonality (time of day, day of week) • Job attributes: title, company, industry, qualifications, …
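The boost / penalize / no-intervention rule on slide 23 can be illustrated with a small sketch. Everything specific here is an assumption: the forecast is a naive linear extrapolation standing in for the forecasting model the slide mentions, the thresholds of 8 and 100 are borrowed from the bucket boundaries on slide 24 purely as defaults, and the boost and penalty factors are made up.

```python
def forecast_applications(applies_so_far: int, t: float, T: float) -> float:
    """Naive stand-in forecast: assume applications keep arriving at the same rate.

    A real forecasting model would also use impressions, seasonality,
    and job attributes, as listed on the slide.
    """
    if not 0 < t < T:
        raise ValueError("expected 0 < t < T")
    return applies_so_far * T / t


def adjust_score(
    p_apply: float,
    applies_so_far: int,
    t: float,
    T: float,
    too_few: float = 8,      # hypothetical lower threshold
    too_many: float = 100,   # hypothetical upper threshold
    boost: float = 1.2,      # hypothetical multiplier
    penalty: float = 0.8,    # hypothetical multiplier
) -> float:
    """Boost, penalize, or leave the ranking score based on the forecast."""
    predicted = forecast_applications(applies_so_far, t, T)
    if predicted < too_few:
        return min(1.0, p_apply * boost)
    if predicted > too_many:
        return p_apply * penalty
    return p_apply


# Halfway through a 10-day posting:
starved = adjust_score(0.5, applies_so_far=1, t=5, T=10)     # forecast 2
healthy = adjust_score(0.5, applies_so_far=20, t=5, T=10)    # forecast 40
flooded = adjust_score(0.5, applies_so_far=80, t=5, T=10)    # forecast 160
```

Multiplying the score is only one possible intervention; an additive adjustment to the logit would preserve the probability range without the `min` clamp.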
  • 24. Online A/B Testing • Control Model for Ranking: Optimize for user engagement only • Split the jobs into 3 buckets every day: • Bucket 1: Received <8 applications up to now • Bucket 2: Received [8, 100] applications up to now • Bucket 3: Received >100 applications up to now • Re-distribute
  • 25. Summary • Model candidate selection query generation using decision trees • Personalization at Scale through GLMM • Realizing the ideal jobs marketplace through application redistribution • But a lot of research work is still needed to • Reformulate the problem to model optimizing for a healthy marketplace directly • Understand and quantify bias and fairness in those potential new models
  • 26. References • [Borisyuk et al., 2016] CaSMoS: A framework for learning candidate selection models over structured queries and documents, KDD 2016 • [Borisyuk et al., 2017] LiJAR: A System for Job Application Redistribution towards Efficient Career Marketplace, KDD 2017 • [Grover et al., 2017] Latency reduction via decision tree based query construction, CIKM 2017 • [Zhang et al., 2016] GLMix: Generalized Linear Mixed Models For Large-Scale Response Prediction, KDD 2016
  • 28. Jobs You May Be Interested In (JYMBII)
  • 29. Problem Formulation • Rank jobs by P(User u applies to Job j | u, j) • Model response given: • User: Career History, Skills, Education, Connections • Job: Job Title, Description, Location, Company
  • 30. JYMBII Infrastructure [Figure: Offline System and Online System with steps numbered 1 to 7; labeled components: User Interaction Logs; Offline Modeling Workflow + User / Item derived features; User; Query Construction; User Feature Store; Search-based Candidate Selection & Retrieval; Search Index of Items; Item derived features; Recommendation Ranking; Ranking Model Store; Additional Re-ranking/Filtering Steps]
  • 31. Understanding WAND Query • Query: “Quality Assurance Engineer” • AND Query: “Quality AND Assurance AND Engineer” [Figure: two example results, one matched (✅) and one not matched (❌)]
  • 32. Understanding WAND Query • Query: “Quality Assurance Engineer” • WAND: “(Quality[5] AND Assurance[5] AND Engineer[1]) [10]” [Figure: both example results matched (✅ ✅)]
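The difference between the strict AND query and the WAND query on slides 31 and 32 can be made concrete with a small sketch. It assumes the usual weak-AND semantics, where a document matches when the summed weights of the query terms it contains reach the threshold; the term weights and the threshold of 10 come from the slide, while the two example documents and the function names are made up.

```python
from typing import Dict, Set


def and_match(doc_terms: Set[str], query_terms: Set[str]) -> bool:
    """Strict AND: every query term must be present in the document."""
    return query_terms <= doc_terms


def wand_match(doc_terms: Set[str], weighted_terms: Dict[str, int], threshold: int) -> bool:
    """Weak AND: the weights of the matched terms must sum to at least the threshold."""
    score = sum(weight for term, weight in weighted_terms.items() if term in doc_terms)
    return score >= threshold


# Weights and threshold as written on the slide: (Quality[5] AND Assurance[5] AND Engineer[1]) [10]
wand_query = {"quality": 5, "assurance": 5, "engineer": 1}

# Hypothetical job titles, tokenized and lowercased.
exact_title = {"quality", "assurance", "engineer"}
near_title = {"quality", "assurance", "analyst"}

and_results = [and_match(doc, set(wand_query)) for doc in (exact_title, near_title)]
wand_results = [wand_match(doc, wand_query, 10) for doc in (exact_title, near_title)]
```

Here the strict AND query drops the second title because it lacks "engineer", while the WAND query keeps it because the two heavily weighted terms alone reach the threshold.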
  • 33. Offline Evaluation • Utilize offline query replay to validate a query against the current baseline • Replay baseline and new query to compute metrics from the retention of actions and operational metrics • Applied jobs retained • Hits retrieved • Kendall’s Tau • Mimicking production ranking through replay to get a more reliable estimate of online metrics
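Two of the replay metrics listed on slide 33 are simple enough to sketch directly. The code below is an illustration under assumptions, not the production evaluation: it computes Kendall's tau (the tau-a variant, with no ties) between the baseline and new-query rankings of the same jobs, and the fraction of applied jobs that the new query still retrieves. The function names and the job ids are hypothetical.

```python
from itertools import combinations
from typing import Iterable, List


def kendall_tau(rank_a: List[str], rank_b: List[str]) -> float:
    """Kendall's tau-a for two rankings of the same items, best first, no ties."""
    position_b = {job: index for index, job in enumerate(rank_b)}
    concordant = discordant = 0
    # combinations() yields (x, y) with x ranked above y in rank_a.
    for x, y in combinations(rank_a, 2):
        if position_b[x] < position_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def applied_jobs_retained(applied: Iterable[str], retrieved: Iterable[str]) -> float:
    """Fraction of jobs the user applied to that the replayed query still retrieves."""
    applied_set = set(applied)
    return len(applied_set & set(retrieved)) / len(applied_set)


baseline_ranking = ["job_a", "job_b", "job_c"]
new_ranking = ["job_a", "job_c", "job_b"]

tau = kendall_tau(baseline_ranking, new_ranking)
retained = applied_jobs_retained(["job_a", "job_d"], new_ranking)
```

A tau near 1 suggests the new query preserves the baseline ordering, and a retention near 1 suggests it does not drop jobs users actually applied to.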

Editor's Notes

  1. - Passive Job Seekers: Allow them to discover what is available in the marketplace. - Not a lot of data for passives; show them jobs that make the most sense given their current career history and experience - Active Job Seekers: Reinforce their job seeking experience. Show them jobs similar to those they applied to that they may have missed. Make sure they don't miss opportunities - Powers a lot of modules including jobs home, feed, email, ads
  2. Cite any examples
  3. - Content Based Recommendations, Personalization, and Collaboration are incorporated through a GLMM - Generalized: link function - Mixed: due to the ensemble of models in an additive model - Fixed effect model - Random effects for variation - All linear models
  4. - Examples of features in each of the models. Generalization is used most of the time to share learning across examples in a linear model - Sparse features for memorization - Need to choose good features to memorize - Random effect sparse features model personal affinity
  5. - Goal: Get people hired - Confirmed Hires - Time Lag on signal - Proxy in Total Job Applies - Metric: Total Job Applies - Optimize probability of Apply. Not View. Showing users popular/attractive jobs not as important as showing them actual good matches - User, Job, Activity