AIM3 – Scalable Data Analysis and Data Mining



             11 – Latent factor models for Collaborative Filtering
              Sebastian Schelter, Christoph Boden, Volker Markl




         Fachgebiet Datenbanksysteme und Informationsmanagement
                        Technische Universität Berlin

20.06.2012
                         http://www.dima.tu-berlin.de/
                                  DIMA – TU Berlin                   1
Recap: Item-Based Collaborative Filtering


Item-based Collaborative Filtering


    • compute pairwise similarities of the columns of
      the rating matrix using some similarity measure
    • store top 20 to 50 most similar items per item
      in the item-similarity matrix
    • prediction: use a weighted sum over all items
      similar to the unknown item that have been
      rated by the current user


              $$p_{ui} = \frac{\sum_{j \in S(i,u)} s_{ij} \, r_{uj}}{\sum_{j \in S(i,u)} |s_{ij}|}$$
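The weighted-sum prediction can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionaries and the toy data are hypothetical, and the sketch assumes that the denominator sums the absolute values of the similarities.

```python
def predict_rating(user_ratings, similarities):
    """Weighted-sum prediction p_ui for one user u and one unknown item i.

    user_ratings: dict item -> rating r_uj given by user u
    similarities: dict item -> similarity s_ij to the unknown item i
                  (assumed to hold only the stored top-k neighbours of i)
    """
    # S(i, u): items similar to i that the user has actually rated
    neighbours = [j for j in similarities if j in user_ratings]
    denominator = sum(abs(similarities[j]) for j in neighbours)
    if denominator == 0:
        return None  # no usable neighbours, so no prediction
    numerator = sum(similarities[j] * user_ratings[j] for j in neighbours)
    return numerator / denominator


# Hypothetical toy data: the user rated items "a" and "b", but not "c".
ratings_of_user = {"a": 4.0, "b": 2.0}
similar_to_i = {"a": 0.5, "b": 0.5, "c": 0.9}

prediction = predict_rating(ratings_of_user, similar_to_i)
print(prediction)
```

Item "c" is ignored although it is the most similar one, because the user has not rated it.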



Drawbacks of similarity-based neighborhood methods


   • the assumption that a rating is defined by all the
     user's ratings for commonly co-rated items is
     hard to justify in general

   • lack of bias correction

   • every co-rated item is looked at in isolation:
     say a movie is similar to „Lord of the Rings“, do
     we want each part of the trilogy to contribute as
     a single similar item?

   • the best choice of similarity measure is based on
     experimentation, not on mathematical reasoning

Latent factor models


■ Idea

    • ratings are deeply influenced by a set of factors that are
      very specific to the domain (e.g. amount of action in movies,
      complexity of characters)

    • these factors are in general not obvious; we might be able to
      think of some of them, but it is hard to estimate their impact on
      the ratings

    • the goal is to infer those so-called latent factors from the
      rating data by using mathematical techniques




Latent factor models

■ Approach

    • users and items are characterized by latent
      factors, each user and item is mapped onto
      a latent feature space

          $$u_i, m_j \in \mathbb{R}^{n_f}$$

    • each rating is approximated by the dot
      product of the user feature vector
      and the item feature vector

          $$r_{ij} \approx m_j^T u_i$$

    • prediction of unknown ratings also uses
      this dot product

    • squared error as a measure of loss

          $$\left( r_{ij} - m_j^T u_i \right)^2$$
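As a small worked example with made-up numbers (two latent features, all values hypothetical), the prediction and the loss for a single rating could look like this:

$$u_i = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad m_j = \begin{pmatrix} 0.5 \\ 1.5 \end{pmatrix}, \quad m_j^T u_i = 0.5 \cdot 1 + 1.5 \cdot 2 = 3.5$$

$$r_{ij} = 4 \;\Rightarrow\; \left( r_{ij} - m_j^T u_i \right)^2 = (4 - 3.5)^2 = 0.25$$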




Latent factor models


■ Approach

    • decomposition of the rating matrix into the product of a user
      feature and an item feature matrix
    • row in U: vector of a user's affinity to the features
    • row in M: vector of an item's relation to the features

    • closely related to Singular Value Decomposition, which
      produces an optimal low-rank approximation of a matrix



              $$R \approx U M^T$$
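The relation to SVD can be illustrated with NumPy on a small matrix in which every entry is known. The matrix and the choice k = 2 are hypothetical; folding the singular values into the user matrix is one of several possible conventions.

```python
import numpy as np

# Hypothetical, fully known 4x3 matrix (rows: users, columns: items).
R = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Thin SVD: R = A * diag(s) * Bt, singular values in descending order.
A, s, Bt = np.linalg.svd(R, full_matrices=False)

k = 2                   # number of latent features to keep
U = A[:, :k] * s[:k]    # user feature matrix, one row per user
M = Bt[:k, :].T         # item feature matrix, one row per item

R_approx = U @ M.T      # rank-k approximation of R
error = np.linalg.norm(R - R_approx)  # Frobenius norm of the residual
print(U.shape, M.shape, error)
```

For a fully known matrix the residual norm equals the first discarded singular value, which is what makes the truncated SVD the optimal rank-k approximation.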




Latent factor models


■ Properties of the decomposition
   • automatically ranks features by their „impact“ on the ratings
   • features might not necessarily be intuitively understandable




Latent factor models

■ Problematic situation with explicit feedback data

    • the rating matrix is not only sparse but also only partially defined:
      missing entries cannot be interpreted as 0, they are just
      unknown
    • standard decomposition algorithms like the Lanczos method for
      SVD are not applicable

■ Solution

    • decomposition has to be done using the known ratings only
    • find the set of user and item feature vectors that minimizes the
      squared error to the known ratings



                     $$\min_{U,M} \sum_{i,j} \left( r_{ij} - m_j^T u_i \right)^2$$
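A minimal sketch of this objective, assuming the known ratings are kept in a dictionary keyed by (user, item). The names and the toy data are hypothetical. Unknown ratings are simply absent and therefore never treated as 0.

```python
def squared_error(known_ratings, user_features, item_features):
    """Sum of squared errors over the known ratings only."""
    total = 0.0
    for (i, j), r_ij in known_ratings.items():
        u_i = user_features[i]
        m_j = item_features[j]
        prediction = sum(m * u for m, u in zip(m_j, u_i))
        total += (r_ij - prediction) ** 2
    return total


# Hypothetical data: 2 users, 2 items; the rating of user 1 for item 0
# is unknown and does not appear in the dictionary at all.
known = {(0, 0): 4.0, (0, 1): 1.0, (1, 1): 5.0}
U = [[1.0, 0.0], [0.0, 2.0]]   # one feature vector per user
M = [[4.0, 1.0], [1.0, 2.0]]   # one feature vector per item

loss = squared_error(known, U, M)
print(loss)
```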




Latent factor models


■ quality of the decomposition is not measured with respect to
  the reconstruction error to the original data, but with
  respect to the generalization to unseen data
■ regularization necessary to avoid overfitting

■ model has hyperparameters (regularization, learning rate)
  that need to be chosen

■ process: split data into training, test and validation set
    □   train model using the training set
    □   choose hyperparameters according to performance on the test set
    □   evaluate generalization on the validation set
    □   ensure that each datapoint is used in each set once
        (cross-validation)
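The cross-validation step could be sketched as follows. The helper name, the fold count and the toy ratings are hypothetical; the sketch only shows how every datapoint ends up in the held-out part exactly once, not how the held-out part is used afterwards.

```python
import random


def k_fold_splits(datapoints, k, seed=0):
    """Yield (training, held_out) pairs; each datapoint is held out once."""
    shuffled = list(datapoints)
    random.Random(seed).shuffle(shuffled)
    # fold f takes every k-th element, starting at position f
    folds = [shuffled[f::k] for f in range(k)]
    for f in range(k):
        held_out = folds[f]
        training = [x for g in range(k) if g != f for x in folds[g]]
        yield training, held_out


# 12 hypothetical ratings as (user, item, rating) triples.
ratings = [(u, i, 3.0) for u in range(4) for i in range(3)]
splits = list(k_fold_splits(ratings, k=3))

for training, held_out in splits:
    print(len(training), len(held_out))
```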



Stochastic Gradient Descent


   • add a regularization term

       $$\min_{U,M} \sum_{i,j} \left[ \left( r_{ij} - m_j^T u_i \right)^2 + \lambda \left( \|u_i\|^2 + \|m_j\|^2 \right) \right]$$

   • loop through all ratings in the training set, compute
     associated prediction error

       $$e_{ij} = r_{ij} - m_j^T u_i$$

   • modify parameters in the opposite direction of the gradient

       $$u_i \leftarrow u_i + \gamma \left( e_{ij} \, m_j - \lambda u_i \right)$$

       $$m_j \leftarrow m_j + \gamma \left( e_{ij} \, u_i - \lambda m_j \right)$$

   • problem: approach is inherently sequential (although recent
     research might have unveiled a parallelization technique)
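The update rules above can be sketched in plain Python. Everything beyond the update rules themselves is an assumption made for illustration: the function names, the initialization with small random values, the hyperparameter values and the toy ratings.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sgd_factorize(ratings, n_users, n_items, n_features=2,
                  gamma=0.01, lam=0.05, epochs=200, seed=0):
    """Factorize known ratings [(i, j, r_ij), ...] with plain SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(0.0, 0.1) for _ in range(n_features)]
         for _ in range(n_users)]
    M = [[rng.uniform(0.0, 0.1) for _ in range(n_features)]
         for _ in range(n_items)]
    order = list(ratings)
    for _ in range(epochs):
        rng.shuffle(order)  # visit the training ratings in random order
        for i, j, r_ij in order:
            e_ij = r_ij - dot(M[j], U[i])  # prediction error
            u_old = U[i][:]                # old u_i is needed to update m_j
            # step in the opposite direction of the gradient
            U[i] = [u + gamma * (e_ij * m - lam * u)
                    for u, m in zip(U[i], M[j])]
            M[j] = [m + gamma * (e_ij * u - lam * m)
                    for u, m in zip(u_old, M[j])]
    return U, M


def squared_error(ratings, U, M):
    return sum((r - dot(M[j], U[i])) ** 2 for i, j, r in ratings)


# Hypothetical training ratings: (user index, item index, rating).
train = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
         (1, 2, 1.0), (2, 1, 1.0), (2, 2, 5.0)]

U0, M0 = sgd_factorize(train, n_users=3, n_items=3, epochs=0)
U, M = sgd_factorize(train, n_users=3, n_items=3, epochs=200)
print(squared_error(train, U0, M0), squared_error(train, U, M))
```

The inner loop shows why the approach is sequential: every update reads feature vectors that the previous update may just have changed.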



Alternating Least Squares with Weighted λ-Regularization
■ Model

    • feature matrices are modeled directly by using only
      the observed ratings
    • add a regularization term to avoid overfitting
    • minimize regularized error of:

          f U, M   =    r   ij
                                      m j ui  + λ
                                         T    2
                                                       n   u
                                                                 i
                                                                     ui
                                                                          2
                                                                              +      nm
                                                                                           j
                                                                                               m   j
                                                                                                       2
                                                                                                           
Solving technique

    • fixing one of the unknown variable to make this a simple
      quadratic equation
    • rotate between fixing u and m until convergence
      („Alternating Least Squares“)
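A compact NumPy sketch of the alternating scheme, under several assumptions: ratings are held in a dense array together with a boolean mask of known entries, the features are initialized randomly, and the number of iterations is fixed instead of testing for convergence. With M fixed, each user vector solves the small linear system $(M_I^T M_I + \lambda n_{u_i} E)\, u_i = M_I^T r_i$, where $M_I$ holds the feature vectors of the items rated by user i. Function names and toy data are hypothetical.

```python
import numpy as np


def als_wr(R, known, n_features=2, lam=0.1, iterations=10, seed=0):
    """ALS with weighted-lambda regularization.

    R:     (n_users, n_items) array of ratings
    known: boolean array of the same shape, True where r_ij is observed
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.random((n_users, n_features))
    M = rng.random((n_items, n_features))
    eye = np.eye(n_features)
    for _ in range(iterations):
        # fix M: one small regularized least-squares problem per user
        for i in range(n_users):
            items = np.flatnonzero(known[i])
            if items.size == 0:
                continue  # a user without ratings keeps the old vector
            M_i = M[items]
            A = M_i.T @ M_i + lam * items.size * eye
            U[i] = np.linalg.solve(A, M_i.T @ R[i, items])
        # fix U: one small regularized least-squares problem per item
        for j in range(n_items):
            users = np.flatnonzero(known[:, j])
            if users.size == 0:
                continue
            U_j = U[users]
            A = U_j.T @ U_j + lam * users.size * eye
            M[j] = np.linalg.solve(A, U_j.T @ R[users, j])
    return U, M


def objective(R, known, U, M, lam):
    """Regularized error f(U, M), evaluated on the known ratings only."""
    err = np.where(known, R - U @ M.T, 0.0)
    n_u = known.sum(axis=1)
    n_m = known.sum(axis=0)
    reg = (n_u * (U ** 2).sum(axis=1)).sum() + (n_m * (M ** 2).sum(axis=1)).sum()
    return (err ** 2).sum() + lam * reg


# Hypothetical toy data; entries where `known` is False are placeholders.
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
known = np.array([[True, True, False],
                  [True, False, True],
                  [False, True, True]])

U1, M1 = als_wr(R, known, iterations=1)
U, M = als_wr(R, known, iterations=10)
print(objective(R, known, U1, M1, 0.1), objective(R, known, U, M, 0.1))
```

Because every half-step solves its subproblem exactly, the regularized error cannot increase from one iteration to the next.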



ALS-WR is scalable


■ Which properties make this approach scalable?

    • all the features in one iteration can be computed
      independently of each other
    • only a small portion of the data is necessary to compute
      a feature vector

■ Parallelization with Map/Reduce

    • Computing user feature vectors: the mappers need to send
      each user's rating vector and the feature vectors of his/her
      rated items to the same reducer

    • Computing item feature vectors: the mappers need to send
      each item's rating vector and the feature vectors of users who
      rated it to the same reducer
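The data flow for the user step can be imitated in plain Python. This is not an actual Map/Reduce job: the function names and the toy data are hypothetical, and the reducer is simplified to a single latent feature so that the least-squares solution is a scalar.

```python
from collections import defaultdict


def map_rating(user, item, rating, item_features):
    """Mapper: key the rating by user, attach the rated item's feature."""
    return user, (rating, item_features[item])


def shuffle(pairs):
    """Group all emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_user(values, lam):
    """Reducer: recompute one user's feature from that user's data only."""
    n_u = len(values)  # number of ratings of this user
    numerator = sum(m * r for r, m in values)
    denominator = sum(m * m for r, m in values) + lam * n_u
    return numerator / denominator


# Hypothetical ratings and current item features (one feature per item).
ratings = [("alice", "x", 4.0), ("alice", "y", 2.0), ("bob", "x", 5.0)]
item_features = {"x": 2.0, "y": 1.0}

emitted = [map_rating(u, i, r, item_features) for u, i, r in ratings]
user_features = {user: reduce_user(values, lam=0.5)
                 for user, values in shuffle(emitted).items()}
print(user_features)
```

Each reducer call touches only one user's ratings and the features of the items that user rated, so the calls are independent of each other.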

Incorporating biases


■ Problem: explicit feedback data is highly biased
    □ some users tend to rate more extremely than others
    □ some items tend to get higher ratings than others


■ Solution: explicitly model biases
    □ the bias of a rating is modeled as a combination of the overall average
      rating μ over all items, the user bias and the item bias

         $$b_{ij} = \mu + b_i + b_j$$

    □ the rating bias can be incorporated into the prediction

         $$\hat{r}_{ij} = \mu + b_i + b_j + m_j^T u_i$$
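A minimal sketch of the biased prediction, with hypothetical numbers: an overall average of 3.7, a critical user who rates 0.3 below average, and an item that is rated 0.5 above average.

```python
def predict_with_bias(mu, b_user, b_item, u_i, m_j):
    """Prediction: overall average + user bias + item bias + interaction."""
    interaction = sum(m * u for m, u in zip(m_j, u_i))
    return mu + b_user + b_item + interaction


# Hypothetical values for one user/item pair with two latent features.
predicted = predict_with_bias(mu=3.7, b_user=-0.3, b_item=0.5,
                              u_i=[1.0, 0.5], m_j=[0.3, -0.4])
print(round(predicted, 2))
```

The latent factors only have to explain the part of the rating that the biases leave over, here 0.1.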




Latent factor models


■ implicit feedback data is very different from explicit data!

    □ e.g. use the number of clicks on a product page of an online shop

    □   the whole matrix is defined!
    □   no negative feedback
    □   interactions that did not happen produce zero values
    □   however we should have only little confidence in these (maybe the user
        never had the chance to interact with these items)

    □ using standard decomposition techniques like SVD would give us a
      decomposition that is biased towards the zero entries, again not
      applicable




Latent factor models

■ Solution for working with implicit data:
  weighted matrix factorization

                                                                                           1        rij  0
■ create a binary preference matrix P                                             p ij    
                                                                                             0       rij  0
                                                                                           

■ each entry in this matrix can be weighted
  by a confidence function
    □ zero values should get low confidence                                       c ( i , j )  1   rij

    □ values that are based on a lot of interactions
      should get high confidence


■ confidence is incorporated into the model
    □ the factorization will ‚prefer‘ more confident values


  f U, M     =                        T
                     c ( i , j ) p ij  m j u i   
                                                      2
                                                          + λ      ui
                                                                          2
                                                                              +            m    j
                                                                                                      2
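The preference, the confidence and the weighted loss can be sketched as follows. The function names, the value of α and the toy click counts are hypothetical. Note that the loss runs over all entries of the matrix, including the zeros.

```python
def preference(r_ij):
    """Binary preference p_ij: 1 if any interaction was observed."""
    return 1.0 if r_ij > 0 else 0.0


def confidence(r_ij, alpha):
    """Confidence c(i, j) = 1 + alpha * r_ij, lowest for zero entries."""
    return 1.0 + alpha * r_ij


def weighted_loss(R, U, M, alpha, lam):
    """Confidence-weighted squared error over ALL entries plus regularization."""
    loss = 0.0
    for i, row in enumerate(R):
        for j, r_ij in enumerate(row):
            prediction = sum(m * u for m, u in zip(M[j], U[i]))
            loss += confidence(r_ij, alpha) * (preference(r_ij) - prediction) ** 2
    reg = sum(x * x for u in U for x in u) + sum(x * x for m in M for x in m)
    return loss + lam * reg


# Hypothetical click counts: 2 users, 2 items, one latent feature.
clicks = [[3, 0],
          [0, 1]]
U = [[1.0], [0.0]]
M = [[1.0], [0.0]]

loss = weighted_loss(clicks, U, M, alpha=2.0, lam=0.5)
print(loss)
```

In this toy setup the only mispredicted entry is the single click of the second user, and its error is weighted by the confidence 1 + 2 · 1 = 3.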
                                                                                                          
Sources


   • Sarwar et al.: „Item-Based Collaborative Filtering
     Recommendation Algorithms“, 2001
   • Koren et al.: „Matrix Factorization Techniques for Recommender
     Systems“, 2009
   • Funk: „Netflix Update: Try This at Home“,
     http://sifter.org/~simon/journal/20061211.html, 2006
   • Zhou et al.: „Large-scale Parallel Collaborative Filtering for the
     Netflix Prize“, 2008
   • Hu et al.: „Collaborative Filtering for Implicit Feedback
     Datasets“, 2008





More Related Content

What's hot

Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Simplilearn
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixJaya Kawale
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network Yan Xu
 
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.Sunghoon Joo
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated RecommendationsHarald Steck
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introductionLiang Xiang
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsJustin Basilico
 
ML Infrastracture @ Dropbox
ML Infrastracture @ Dropbox ML Infrastracture @ Dropbox
ML Infrastracture @ Dropbox Tsahi Glik
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsJustin Basilico
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationDat Nguyen
 
Transformer xl
Transformer xlTransformer xl
Transformer xlSan Kim
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGANNAVER Engineering
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Anoop Deoras
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Xavier Amatriain
 
2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classificationKrish_ver2
 
Learning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseLearning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseHasan H Topcu
 
Relational knowledge distillation
Relational knowledge distillationRelational knowledge distillation
Relational knowledge distillationNAVER Engineering
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningMohamed Loey
 

What's hot (20)

Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
A Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at NetflixA Multi-Armed Bandit Framework For Recommendations at Netflix
A Multi-Armed Bandit Framework For Recommendations at Netflix
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
 
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
PR-373: Revisiting ResNets: Improved Training and Scaling Strategies.
 
Calibrated Recommendations
Calibrated RecommendationsCalibrated Recommendations
Calibrated Recommendations
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Déjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender SystemsDéjà Vu: The Importance of Time and Causality in Recommender Systems
Déjà Vu: The Importance of Time and Causality in Recommender Systems
 
ML Infrastracture @ Dropbox
ML Infrastracture @ Dropbox ML Infrastracture @ Dropbox
ML Infrastracture @ Dropbox
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
Mask-RCNN for Instance Segmentation
Mask-RCNN for Instance SegmentationMask-RCNN for Instance Segmentation
Mask-RCNN for Instance Segmentation
 
Transformer xl
Transformer xlTransformer xl
Transformer xl
 
Finding connections among images using CycleGAN
Finding connections among images using CycleGANFinding connections among images using CycleGAN
Finding connections among images using CycleGAN
 
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019Tutorial on Deep Learning in Recommender System, Lars summer school 2019
Tutorial on Deep Learning in Recommender System, Lars summer school 2019
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 
2.4 rule based classification
2.4 rule based classification2.4 rule based classification
2.4 rule based classification
 
Learning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwiseLearning to Rank - From pairwise approach to listwise
Learning to Rank - From pairwise approach to listwise
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Relational knowledge distillation
Relational knowledge distillationRelational knowledge distillation
Relational knowledge distillation
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
Content based filtering
Content based filteringContent based filtering
Content based filtering
 

Viewers also liked

Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutData Science London
 
国際化時代の40カ国語言語判定
国際化時代の40カ国語言語判定国際化時代の40カ国語言語判定
国際化時代の40カ国語言語判定Shuyo Nakatani
 
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013Shuyo Nakatani
 
RecSys 2015: Large-scale real-time product recommendation at Criteo
RecSys 2015: Large-scale real-time product recommendation at CriteoRecSys 2015: Large-scale real-time product recommendation at Criteo
RecSys 2015: Large-scale real-time product recommendation at CriteoRomain Lerallut
 
情報推薦システム入門:講義スライド
情報推薦システム入門:講義スライド情報推薦システム入門:講義スライド
情報推薦システム入門:講義スライドKenta Oku
 
JP Chaosmap 2015-2016
JP Chaosmap 2015-2016JP Chaosmap 2015-2016
JP Chaosmap 2015-2016Hiroshi Kondo
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBenjamin Bengfort
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesAlejandro Correa Bahnsen, PhD
 
機会学習ハッカソン:ランダムフォレスト
機会学習ハッカソン:ランダムフォレスト機会学習ハッカソン:ランダムフォレスト
機会学習ハッカソン:ランダムフォレストTeppei Baba
 

Viewers also liked (11)

Simple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in MahoutSimple Matrix Factorization for Recommendation in Mahout
Simple Matrix Factorization for Recommendation in Mahout
 
国際化時代の40カ国語言語判定
国際化時代の40カ国語言語判定国際化時代の40カ国語言語判定
国際化時代の40カ国語言語判定
 
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013
どの言語でつぶやかれたのか、機械が知る方法 #WebDBf2013
 
RecSys 2015: Large-scale real-time product recommendation at Criteo
RecSys 2015: Large-scale real-time product recommendation at CriteoRecSys 2015: Large-scale real-time product recommendation at Criteo
RecSys 2015: Large-scale real-time product recommendation at Criteo
 
coordinate descent 法について
coordinate descent 法についてcoordinate descent 法について
coordinate descent 法について
 
情報推薦システム入門:講義スライド
情報推薦システム入門:講義スライド情報推薦システム入門:講義スライド
情報推薦システム入門:講義スライド
 
Deep forest
Deep forestDeep forest
Deep forest
 
JP Chaosmap 2015-2016
JP Chaosmap 2015-2016JP Chaosmap 2015-2016
JP Chaosmap 2015-2016
 
Beginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix FactorizationBeginners Guide to Non-Negative Matrix Factorization
Beginners Guide to Non-Negative Matrix Factorization
 
Ensembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slidesEnsembles of example dependent cost-sensitive decision trees slides
Ensembles of example dependent cost-sensitive decision trees slides
 
機会学習ハッカソン:ランダムフォレスト
機会学習ハッカソン:ランダムフォレスト機会学習ハッカソン:ランダムフォレスト
機会学習ハッカソン:ランダムフォレスト
 

Similar to Latent factor models for Collaborative Filtering

Modeling Economic Relationships.pptx
Modeling Economic Relationships.pptxModeling Economic Relationships.pptx
Modeling Economic Relationships.pptxBaijuPallayil
 
Irt 1 pl, 2pl, 3pl.pdf
Irt 1 pl, 2pl, 3pl.pdfIrt 1 pl, 2pl, 3pl.pdf
Irt 1 pl, 2pl, 3pl.pdfCarlo Magno
 
CEB-02-Cost-Estimating-Techniques.pdf
CEB-02-Cost-Estimating-Techniques.pdfCEB-02-Cost-Estimating-Techniques.pdf
CEB-02-Cost-Estimating-Techniques.pdfwhenn1
 
Explainable insights on algorithm performance
Explainable insights on algorithm performanceExplainable insights on algorithm performance
Explainable insights on algorithm performanceCSIRO
 
A framework for trustworthiness assessment based on fidelity in cyber and phy...
A framework for trustworthiness assessment based on fidelity in cyber and phy...A framework for trustworthiness assessment based on fidelity in cyber and phy...
A framework for trustworthiness assessment based on fidelity in cyber and phy...Vincenzo De Florio
 
Aspiring Minds | Automata
Aspiring Minds | Automata Aspiring Minds | Automata
Aspiring Minds | Automata Aspiring Minds
 
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsParameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsSSA KPI
 
AI driven classification framework for advanced Test Automation
AI driven classification framework for advanced Test AutomationAI driven classification framework for advanced Test Automation
AI driven classification framework for advanced Test AutomationSTePINForum
 
AI for PM.pptx
AI for PM.pptxAI for PM.pptx
AI for PM.pptxNatan Katz
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learningSanghamitra Deb
 
Models Of Modeling
Models Of ModelingModels Of Modeling
Models Of ModelingRoger Smith
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringChangsung Moon
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedKrishnaram Kenthapadi
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Harald Steck
 

Similar to Latent factor models for Collaborative Filtering (20)

Modeling Economic Relationships.pptx
Modeling Economic Relationships.pptxModeling Economic Relationships.pptx
Modeling Economic Relationships.pptx
 
Irt 1 pl, 2pl, 3pl.pdf
Irt 1 pl, 2pl, 3pl.pdfIrt 1 pl, 2pl, 3pl.pdf
Irt 1 pl, 2pl, 3pl.pdf
 
CEB-02-Cost-Estimating-Techniques.pdf
CEB-02-Cost-Estimating-Techniques.pdfCEB-02-Cost-Estimating-Techniques.pdf
CEB-02-Cost-Estimating-Techniques.pdf
 
.pptx
.pptx.pptx
.pptx
 
Explainable insights on algorithm performance
Explainable insights on algorithm performanceExplainable insights on algorithm performance
Explainable insights on algorithm performance
 
A framework for trustworthiness assessment based on fidelity in cyber and phy...
A framework for trustworthiness assessment based on fidelity in cyber and phy...A framework for trustworthiness assessment based on fidelity in cyber and phy...
A framework for trustworthiness assessment based on fidelity in cyber and phy...
 
Aspiring Minds | Automata
Aspiring Minds | Automata Aspiring Minds | Automata
Aspiring Minds | Automata
 
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its ApplicationsParameter Estimation for Semiparametric Models with CMARS and Its Applications
Parameter Estimation for Semiparametric Models with CMARS and Its Applications
 
Quantitative Design Tools
Quantitative Design ToolsQuantitative Design Tools
Quantitative Design Tools
 
Entity2rec recsys
Entity2rec recsysEntity2rec recsys
Entity2rec recsys
 
AI driven classification framework for advanced Test Automation
AI driven classification framework for advanced Test AutomationAI driven classification framework for advanced Test Automation
AI driven classification framework for advanced Test Automation
 
AI for PM.pptx
AI for PM.pptxAI for PM.pptx
AI for PM.pptx
 
Modeling and analysis
Modeling and analysisModeling and analysis
Modeling and analysis
 
Logistic Regression Analysis
Logistic Regression AnalysisLogistic Regression Analysis
Logistic Regression Analysis
 
Lecture 1.pptx
Lecture 1.pptxLecture 1.pptx
Lecture 1.pptx
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Models Of Modeling
Models Of ModelingModels Of Modeling
Models Of Modeling
 
Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Responsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons LearnedResponsible AI in Industry: Practical Challenges and Lessons Learned
Responsible AI in Industry: Practical Challenges and Lessons Learned
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
 

More from sscdotopen

Co-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and SparkCo-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and Sparksscdotopen
 
Bringing Algebraic Semantics to Mahout
Bringing Algebraic Semantics to MahoutBringing Algebraic Semantics to Mahout
Bringing Algebraic Semantics to Mahoutsscdotopen
 
Next directions in Mahout's recommenders
Next directions in Mahout's recommendersNext directions in Mahout's recommenders
Next directions in Mahout's recommenderssscdotopen
 
New Directions in Mahout's Recommenders
New Directions in Mahout's RecommendersNew Directions in Mahout's Recommenders
New Directions in Mahout's Recommenderssscdotopen
 
Introduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache MahoutIntroduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache Mahoutsscdotopen
 
Scalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReduceScalable Similarity-Based Neighborhood Methods with MapReduce
Scalable Similarity-Based Neighborhood Methods with MapReducesscdotopen
 
Large Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache GiraphLarge Scale Graph Processing with Apache Giraph
Large Scale Graph Processing with Apache Giraphsscdotopen
 
Introducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph ProcessingIntroducing Apache Giraph for Large Scale Graph Processing
Introducing Apache Giraph for Large Scale Graph Processingsscdotopen
 

Latent factor models for Collaborative Filtering

  • 1. AIM3 – Scalable Data Analysis and Data Mining 11 – Latent factor models for Collaborative Filtering Sebastian Schelter, Christoph Boden, Volker Markl Fachgebiet Datenbanksysteme und Informationsmanagement Technische Universität Berlin 20.06.2012 http://www.dima.tu-berlin.de/ DIMA – TU Berlin 1
  • 2. Recap: Item-Based Collaborative Filtering
      • compute pairwise similarities of the columns of the rating matrix using some similarity measure
      • store the top 20 to 50 most similar items per item in the item-similarity matrix
      • prediction: use a weighted sum over all items similar to the unknown item that have been rated by the current user:
        $p_{ui} = \frac{\sum_{j \in S(i,u)} s_{ij} \, r_{uj}}{\sum_{j \in S(i,u)} |s_{ij}|}$
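The prediction rule on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function name `predict_item_based`, the dense arrays, the toy data, and the convention that a 0 marks a missing rating are all assumptions made for the example and do not come from the lecture.

```python
import numpy as np

def predict_item_based(ratings, sim, u, i, k=2):
    """Weighted-sum prediction p_ui for user u and item i.

    ratings: (n_users, n_items) array; 0 marks a missing rating (assumption).
    sim:     (n_items, n_items) item-similarity matrix.
    k:       number of most similar items kept per item (20 to 50 on the slide).
    """
    # the k items most similar to i, excluding i itself
    candidates = np.argsort(-sim[i])
    candidates = candidates[candidates != i][:k]
    # S(i, u): those of the similar items that user u has actually rated
    neighbors = candidates[ratings[u, candidates] != 0]
    denom = np.abs(sim[i, neighbors]).sum()
    if denom == 0:
        return float("nan")  # no usable neighbor, so no prediction
    return float(sim[i, neighbors] @ ratings[u, neighbors] / denom)

# hypothetical toy data: 2 users, 3 items
ratings = np.array([[5.0, 3.0, 0.0],
                    [4.0, 0.0, 4.0]])
sim = np.array([[1.0, 0.5, 0.8],
                [0.5, 1.0, 0.2],
                [0.8, 0.2, 1.0]])
p = predict_item_based(ratings, sim, u=0, i=2)
```

For user 0 and item 2 the neighbors are items 0 and 1, so the prediction is (0.8 · 5 + 0.2 · 3) / (0.8 + 0.2).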
  • 3. Drawbacks of similarity-based neighborhood methods
      • the assumption that a rating is defined by all the user's ratings for commonly co-rated items is hard to justify in general
      • lack of bias correction
      • every co-rated item is looked at in isolation: say a movie was similar to "Lord of the Rings", do we want each part of the trilogy to contribute as a single similar item?
      • the best choice of similarity measure is based on experimentation, not on mathematical reasons
  • 4. Latent factor models
      ■ Idea
      • ratings are deeply influenced by a set of factors that are very specific to the domain (e.g. amount of action in movies, complexity of characters)
      • these factors are in general not obvious; we might be able to think of some of them, but it is hard to estimate their impact on the ratings
      • the goal is to infer those so-called latent factors from the rating data by using mathematical techniques
  • 5. Latent factor models
      ■ Approach
      • users and items are characterized by latent factors; each user and item is mapped onto a latent feature space: $u_i, m_j \in \mathbb{R}^{n_f}$
      • each rating is approximated by the dot product of the user feature vector and the item feature vector: $r_{ij} \approx m_j^T u_i$
      • prediction of unknown ratings also uses this dot product
      • squared error as a measure of loss: $(r_{ij} - m_j^T u_i)^2$
  • 6. Latent factor models
      ■ Approach
      • decomposition of the rating matrix into the product of a user feature matrix and an item feature matrix: $R \approx U M^T$
      • row in U: vector of a user's affinity to the features
      • row in M: vector of an item's relation to the features
      • closely related to Singular Value Decomposition, which produces an optimal low-rank approximation of a matrix
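For a fully observed matrix, the link to SVD can be illustrated with NumPy: truncating the SVD to $n_f$ singular values yields factors with $R \approx U M^T$. The toy matrix and the choice to split each singular value evenly between the two factors are assumptions made for this sketch.

```python
import numpy as np

# hypothetical, fully observed 4 x 3 rating matrix
R = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 1.0],
              [1.0, 1.0, 5.0],
              [2.0, 1.0, 4.0]])
n_f = 2  # number of latent features to keep

# thin SVD: R = P @ diag(s) @ Qt, singular values sorted in descending order
P, s, Qt = np.linalg.svd(R, full_matrices=False)

U = P[:, :n_f] * np.sqrt(s[:n_f])   # rows: user feature vectors
M = Qt[:n_f].T * np.sqrt(s[:n_f])   # rows: item feature vectors
R_hat = U @ M.T                     # rank-n_f approximation of R

# Frobenius-norm reconstruction error of the truncated decomposition
err = np.linalg.norm(R - R_hat)
```

Because only one singular value is discarded here, the reconstruction error equals that discarded value `s[2]`. The descending order of `s` is also what lets the decomposition rank features by their impact.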
  • 7. Latent factor models
      ■ Properties of the decomposition
      • automatically ranks features by their "impact" on the ratings
      • features might not necessarily be intuitively understandable
  • 8. Latent factor models
      ■ Problematic situation with explicit feedback data
      • the rating matrix is not only sparse but only partially defined: missing entries cannot be interpreted as 0, they are simply unknown
      • standard decomposition algorithms such as the Lanczos method for SVD are not applicable
      ■ Solution
      • the decomposition has to be done using the known ratings only
      • find the set of user and item feature vectors that minimizes the squared error to the known ratings, where the sum runs over the known ratings $(i, j)$ only:
        $\min_{U, M} \sum_{i,j} (r_{ij} - m_j^T u_i)^2$
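The difference between treating missing entries as unknown and treating them as 0 shows up directly in the loss. The following sketch assumes a boolean mask `known` that marks the observed ratings; the helper name `known_squared_error` and the numbers are hypothetical.

```python
import numpy as np

def known_squared_error(R, known, U, M):
    """Sum of squared errors over the known ratings only."""
    diff = (R - U @ M.T)[known]   # residuals of the observed entries
    return float(diff @ diff)

# hypothetical 2 x 2 example; entries where known is False are unobserved
R = np.array([[5.0, 0.0],
              [0.0, 3.0]])
known = np.array([[True, False],
                  [False, True]])
U = np.array([[1.0], [1.0]])      # one latent feature per user
M = np.array([[4.0], [3.0]])      # one latent feature per item

loss = known_squared_error(R, known, U, M)
# for comparison: the error if the unknown entries were interpreted as 0
naive = float(((R - U @ M.T) ** 2).sum())
```

Here the masked loss only counts the two observed ratings, whereas the naive variant is dominated by the two unobserved cells.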
  • 9. Latent factor models
      ■ quality of the decomposition is measured not by the reconstruction error on the original data, but by the generalization to unseen data
      ■ regularization is necessary to avoid overfitting
      ■ the model has hyperparameters (regularization, learning rate) that need to be chosen
      ■ process: split the data into training, test and validation sets
          □ train the model using the training set
          □ choose hyperparameters according to performance on the test set
          □ evaluate generalization on the validation set
          □ ensure that each data point is used in each set once (cross-validation)
  • 10. Stochastic Gradient Descent
      • add a regularization term:
        $\min_{U, M} \sum_{i,j} \left[ (r_{ij} - m_j^T u_i)^2 + \lambda \left( \|u_i\|^2 + \|m_j\|^2 \right) \right]$
      • loop through all ratings in the training set and compute the associated prediction error:
        $e_{ij} = r_{ij} - m_j^T u_i$
      • modify the parameters in the opposite direction of the gradient:
        $u_i \leftarrow u_i + \gamma \, (e_{ij} \, m_j - \lambda u_i)$
        $m_j \leftarrow m_j + \gamma \, (e_{ij} \, u_i - \lambda m_j)$
      • problem: the approach is inherently sequential (although recent research might have unveiled a parallelization technique)
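A minimal sketch of these update rules in NumPy follows. The function names, the toy ratings, the random initialization, and the values chosen for the learning rate `gamma` and the regularization weight `lam` are assumptions for illustration, not settings from the lecture.

```python
import numpy as np

def sgd_factorize(ratings, n_users, n_items, n_f=2, gamma=0.02, lam=0.05,
                  epochs=500, seed=0):
    """ratings: list of (i, j, r_ij) triples for user i and item j."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, n_f))
    M = 0.1 * rng.standard_normal((n_items, n_f))
    for _ in range(epochs):
        # visit the training ratings in random order
        for idx in rng.permutation(len(ratings)):
            i, j, r = ratings[idx]
            e = r - M[j] @ U[i]            # prediction error e_ij
            u_old = U[i].copy()            # both updates use the old u_i
            U[i] += gamma * (e * M[j] - lam * U[i])
            M[j] += gamma * (e * u_old - lam * M[j])
    return U, M

def rmse(ratings, U, M):
    """Root mean squared error over the given rating triples."""
    return float(np.sqrt(np.mean([(r - M[j] @ U[i]) ** 2
                                  for i, j, r in ratings])))

# hypothetical training ratings: 3 users, 2 items
train = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0),
         (1, 1, 1.0), (2, 0, 1.0), (2, 1, 5.0)]

U0, M0 = sgd_factorize(train, 3, 2, epochs=0)   # untrained initialization
U, M = sgd_factorize(train, 3, 2)
before, after = rmse(train, U0, M0), rmse(train, U, M)
```

Each update touches only one user vector and one item vector, but it depends on the values left behind by the previous update, which is why the loop is sequential.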
  • 11. Alternating Least Squares with Weighted λ-Regularization
      ■ Model
      • feature matrices are modeled directly by using only the observed ratings
      • add a regularization term to avoid overfitting
      • minimize the regularized error, where $n_{u_i}$ is the number of ratings of user $i$ and $n_{m_j}$ is the number of ratings of item $j$:
        $f(U, M) = \sum_{i,j} (r_{ij} - m_j^T u_i)^2 + \lambda \left( \sum_i n_{u_i} \|u_i\|^2 + \sum_j n_{m_j} \|m_j\|^2 \right)$
      ■ Solving technique
      • fix one of the unknown variables so that the problem becomes a simple quadratic (least-squares) problem
      • alternate between fixing U and M until convergence ("Alternating Least Squares")
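With M fixed, setting the gradient of $f$ with respect to a single $u_i$ to zero gives the regularized normal equations $(M_i^T M_i + \lambda n_{u_i} I) \, u_i = M_i^T r_i$, where $M_i$ stacks the feature vectors of the items rated by user $i$. The sketch below applies this alternately to both sides. All names, the toy data, the random initialization, and the fixed iteration count are assumptions for illustration.

```python
import numpy as np

def solve_side(R, known, fixed, lam):
    """Recompute one side's feature vectors while the other side is fixed."""
    n_f = fixed.shape[1]
    out = np.zeros((R.shape[0], n_f))
    for i in range(R.shape[0]):
        idx = np.flatnonzero(known[i])          # entries observed in row i
        if idx.size == 0:
            continue                            # no ratings: keep zero vector
        F = fixed[idx]
        A = F.T @ F + lam * idx.size * np.eye(n_f)   # weighted regularization
        out[i] = np.linalg.solve(A, F.T @ R[i, idx])
    return out

def als_wr(R, known, n_f=2, lam=0.05, iterations=20, seed=0):
    rng = np.random.default_rng(seed)
    M = rng.random((R.shape[1], n_f))
    for _ in range(iterations):
        U = solve_side(R, known, M, lam)        # fix M, solve for U
        M = solve_side(R.T, known.T, U, lam)    # fix U, solve for M
    return U, M

def objective(R, known, U, M, lam):
    """f(U, M): squared error on known ratings plus weighted regularization."""
    err = ((R - U @ M.T)[known] ** 2).sum()
    reg = lam * ((known.sum(axis=1) * (U ** 2).sum(axis=1)).sum()
                 + (known.sum(axis=0) * (M ** 2).sum(axis=1)).sum())
    return float(err + reg)

# hypothetical ratings; False in `known` marks an unobserved entry
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 0.0, 1.0],
              [0.0, 1.0, 5.0]])
known = R > 0

U1, M1 = als_wr(R, known, iterations=1)
U, M = als_wr(R, known, iterations=20)
f1, f20 = objective(R, known, U1, M1, 0.05), objective(R, known, U, M, 0.05)
```

Each half-step minimizes $f$ exactly over one block of variables, so the objective cannot increase from one iteration to the next. Every row is solved independently of the others.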
  • 12. ALS-WR is scalable
      ■ Which properties make this approach scalable?
      • all the feature vectors in one iteration can be computed independently of each other
      • only a small portion of the data is necessary to compute a feature vector
      ■ Parallelization with Map/Reduce
      • computing user feature vectors: the mappers need to send each user's rating vector and the feature vectors of his/her rated items to the same reducer
      • computing item feature vectors: the mappers need to send each item's rating vector and the feature vectors of the users who rated it to the same reducer
  • 13. Incorporating biases
      ■ Problem: explicit feedback data is highly biased
          □ some users tend to rate more extremely than others
          □ some items tend to get higher ratings than others
      ■ Solution: explicitly model biases
          □ the bias of a rating is modeled as a combination of the overall average rating $\mu$, the user bias $b_i$ and the item bias $b_j$:
            $b_{ij} = \mu + b_i + b_j$
          □ the rating bias can be incorporated into the prediction:
            $\hat{r}_{ij} = \mu + b_i + b_j + m_j^T u_i$
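The biased prediction is a small extension of the plain dot product. In this sketch the function name `predict_with_bias` and all numeric values are made up for illustration; in practice the biases would be learned together with the feature vectors.

```python
import numpy as np

def predict_with_bias(mu, b_user, b_item, U, M, i, j):
    """r_hat_ij = mu + b_i + b_j + m_j^T u_i for user i and item j."""
    return float(mu + b_user[i] + b_item[j] + M[j] @ U[i])

mu = 3.5                              # overall average rating (hypothetical)
b_user = np.array([0.25, -0.5])       # user 0 rates above average, user 1 below
b_item = np.array([0.5, -0.25])       # item 0 is rated above average
U = np.array([[1.0, 0.0],
              [0.0, 1.0]])            # user feature vectors
M = np.array([[0.5, 0.0],
              [0.0, 0.5]])            # item feature vectors

r_hat = predict_with_bias(mu, b_user, b_item, U, M, i=0, j=0)
```

For user 0 and item 0 this evaluates to 3.5 + 0.25 + 0.5 + 0.5.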
  • 14. Latent factor models
      ■ implicit feedback data is very different from explicit data!
          □ e.g. use the number of clicks on a product page of an online shop
          □ the whole matrix is defined!
          □ no negative feedback
          □ interactions that did not happen produce zero values
          □ however, we should have only little confidence in these (maybe the user never had the chance to interact with these items)
          □ using standard decomposition techniques like SVD would give us a decomposition that is biased towards the zero entries, so these are again not applicable
  • 15. Latent factor models
      ■ Solution for working with implicit data: weighted matrix factorization
      ■ create a binary preference matrix P:
        $p_{ij} = \begin{cases} 1 & r_{ij} > 0 \\ 0 & r_{ij} = 0 \end{cases}$
      ■ each entry in this matrix can be weighted by a confidence function:
        $c(i, j) = 1 + \alpha \, r_{ij}$
          □ zero values should get low confidence
          □ values that are based on a lot of interactions should get high confidence
      ■ confidence is incorporated into the model
          □ the factorization will "prefer" more confident values:
            $f(U, M) = \sum_{i,j} c(i, j) \, (p_{ij} - m_j^T u_i)^2 + \lambda \left( \sum_i \|u_i\|^2 + \sum_j \|m_j\|^2 \right)$
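The weighted objective can be evaluated directly on a small interaction matrix. The function name `implicit_objective`, the interaction counts, and the values of `alpha` and `lam` below are arbitrary assumptions for this sketch.

```python
import numpy as np

def implicit_objective(R, U, M, alpha=2.0, lam=0.1):
    """Confidence-weighted squared error over ALL entries plus regularization."""
    P = (R > 0).astype(float)         # binary preference matrix
    C = 1.0 + alpha * R               # confidence c(i, j) = 1 + alpha * r_ij
    err = (C * (P - U @ M.T) ** 2).sum()
    reg = lam * ((U ** 2).sum() + (M ** 2).sum())
    return float(err + reg)

# hypothetical interaction counts (e.g. clicks); 0 means no interaction
R = np.array([[3.0, 0.0],
              [0.0, 1.0]])

# with all-zero factors, every preference p_ij = 1 is missed
U = np.zeros((2, 1))
M = np.zeros((2, 1))
f_zero = implicit_objective(R, U, M)
```

With all-zero factors only the two observed interactions contribute, weighted by their confidences 1 + 2 · 3 and 1 + 2 · 1. Unlike the explicit-feedback loss, the sum runs over every cell of the matrix, including the zeros.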
  • 16. Sources
      • Sarwar et al.: "Item-Based Collaborative Filtering Recommendation Algorithms", 2001
      • Koren et al.: "Matrix Factorization Techniques for Recommender Systems", 2009
      • Funk: "Netflix Update: Try This at Home", http://sifter.org/~simon/journal/20061211.html, 2006
      • Zhou et al.: "Large-scale Parallel Collaborative Filtering for the Netflix Prize", 2008
      • Hu et al.: "Collaborative Filtering for Implicit Feedback Datasets", 2008