
[CIKM 2014] Deviation-Based Contextual SLIM Recommenders

Context-aware recommender systems (CARS) help improve the effectiveness of recommendations by adapting to users' preferences in different contextual situations. One approach to CARS that has been shown to be particularly effective is Context-Aware Matrix Factorization (CAMF). CAMF incorporates contextual dependencies into the standard matrix factorization (MF) process, where users and items are represented as collections of weights over various latent factors. In this paper, we introduce another CARS approach based on an extension of matrix factorization, namely, the Sparse Linear Method (SLIM). We develop a family of deviation-based contextual SLIM (CSLIM) recommendation algorithms by learning rating deviations in different contextual conditions. Our CSLIM approach is better at explaining the underlying reasons behind contextual recommendations, and our experimental evaluations over five context-aware data sets demonstrate that these CSLIM algorithms outperform the state-of-the-art CARS algorithms in the top-N recommendation task. We also discuss the criteria for selecting the appropriate CSLIM algorithm in advance based on the underlying characteristics of the data.


  1. Deviation-Based Contextual SLIM Recommenders. Yong Zheng, Bamshad Mobasher, Robin Burke. DePaul University, Chicago, IL, USA. @CIKM 2014, Shanghai, China, Nov 4, 2014
  2. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  3. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  4. Traditional Recommender Systems (RS). Example: a user-item 2D rating matrix over users U1-U5 and items T1-T5, with most cells empty (sparse). A traditional recommender maps Users × Items → Ratings.
  5. Context-aware RS (CARS). Motivation: recommendation cannot ignore contexts, because users' preferences change from one contextual situation to another (e.g., watching a movie alone versus with a companion).
  6. Context-aware RS (CARS). Example: a user-item contextual rating matrix. In CARS: Users × Items × Contexts → Ratings.
  7. Context-aware RS (CARS). Example: a user-item contextual rating matrix. Terminology: a context dimension is a contextual variable such as time, location, or companion; a context condition is a value in a specific dimension, e.g., weekend and weekday are two conditions in the context dimension "Time".
  8. Context-aware RS (CARS). Representational CARS (R-CARS): assuming known influential contextual variables are available (e.g., location, time, mood), the task is to build CARS algorithms that adapt to users' preferences in different contextual situations.
  9. Context-aware RS (CARS). Most research in R-CARS focuses on the development of context-aware collaborative filtering (CACF): CF + Contexts → CACF.
  10. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  11. Collaborative Filtering (CF). CF is one of the most popular families of recommendation algorithms. 1) Memory-based CF, such as user-based and item-based CF. Pros: good for explanation. Cons: sparsity problems. 2) Model-based CF, such as matrix factorization. Pros: good performance. Cons: cold start, hard to explain. 3) Hybrid CF recommendation algorithms, such as content-based hybrid CF. Pros: further improvement. Cons: running costs.
  12. Item-based CF (ItemKNN; Sarwar et al., 2001). Rating prediction: $P_{u,i} = \frac{\sum_{j \in N(i)} R_{u,j} \cdot \mathrm{sim}(i,j)}{\sum_{j \in N(i)} \mathrm{sim}(i,j)}$, where $N(i)$ is the neighborhood of item $i$. Cons: item-item similarity calculations and neighborhood selection rely on co-ratings. What if the number of co-ratings is limited?
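The ItemKNN prediction above can be sketched in a few lines (a minimal illustration; the matrix values and neighborhood choice are invented for this example, not the paper's data):

```python
import numpy as np

def cosine_item_sim(R):
    """Item-item cosine similarity from a rating matrix (0 = unrated)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for empty columns
    Rn = R / norms
    return Rn.T @ Rn

def predict_itemknn(R, sim, u, i, neighbors):
    """P_{u,i}: similarity-weighted average of user u's ratings on
    item i's neighbors, as in the formula above."""
    num = den = 0.0
    for j in neighbors:
        if R[u, j] > 0:              # only items the user actually rated
            num += R[u, j] * sim[i, j]
            den += sim[i, j]
    return num / den if den > 0 else 0.0

# Toy 5x5 rating matrix (illustrative values only, 0 = unrated)
R = np.array([[3, 0, 2, 0, 0],
              [3, 3, 0, 0, 4],
              [2, 4, 0, 1, 0],
              [2, 0, 0, 5, 5],
              [3, 2, 4, 0, 2]], dtype=float)
sim = cosine_item_sim(R)
# Predict U2's missing rating on T3, using all other items as neighbors
p = predict_itemknn(R, sim, u=1, i=2, neighbors=[0, 1, 3, 4])
```

Because the prediction is a weighted average of the user's existing ratings (here 3, 3, and 4), it always falls inside their range, which is exactly the limitation the slide points out: the weights are only as trustworthy as the co-ratings behind them.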
  13. SLIM (Ning et al., 2011). The Sparse Linear Method (SLIM) can be viewed as another form of collaborative filtering. Ranking score prediction: $S_{i,j} = R_{i,:} \cdot W_{:,j} = \sum_{h=1, h \neq j}^{N} R_{i,h} W_{h,j}$, where matrix $R$ is the user-item rating matrix and matrix $W$ is an item-item coefficient matrix, which plays the role of a similarity matrix. We refer to this approach as SLIM-I, since $W$ represents item-item coefficients.
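The SLIM-I scoring step is just a matrix product with a zeroed diagonal (a sketch of scoring only; learning the sparse W via the regularized least-squares problem of Ning et al. is omitted, and the values below are invented):

```python
import numpy as np

def slim_scores(R, W):
    """SLIM-I ranking scores S = R @ W, with diag(W) forced to zero
    (the h != j constraint in the sum) so an item never scores itself."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    return R @ W

# Toy example: 2 users x 3 items (illustrative values only)
R = np.array([[3.0, 0.0, 2.0],
              [0.0, 4.0, 1.0]])
W = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.0, 1.0]])
S = slim_scores(R, W)
# S[0, 0] = R[0, 1]*W[1, 0] + R[0, 2]*W[2, 0] = 0*0.5 + 2*0.2 = 0.4
# S[0, 1] = R[0, 0]*W[0, 1] + R[0, 2]*W[2, 1] = 3*0.5 + 2*0.0 = 1.5
```

Items are then ranked for each user by their row of S; no co-rating-based similarity computation is needed, since W is learned directly.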
  14. Comparison between ItemKNN and SLIM-I. Rating prediction in ItemKNN: $P_{u,i} = \frac{\sum_{j \in N(i)} R_{u,j} \cdot \mathrm{sim}(i,j)}{\sum_{j \in N(i)} \mathrm{sim}(i,j)}$. Ranking score prediction in SLIM-I: $S_{i,j} = R_{i,:} \cdot W_{:,j} = \sum_{h=1, h \neq j}^{N} R_{i,h} W_{h,j}$. Pros of SLIM-I: matrix $W$ is learned by directly minimizing prediction/ranking error; in other words, item-item coefficients are no longer computed from co-ratings, which makes them more reliable and allows optimization directly toward ranking. SLIM-I has been demonstrated to outperform UserKNN, ItemKNN, matrix factorization, and other traditional RS algorithms.
  15. SLIM-I and SLIM-U. SLIM-I is the SLIM counterpart of ItemKNN: $W$ is an item-item coefficient matrix. SLIM-U is the SLIM counterpart of UserKNN: $W$ is a user-user coefficient matrix.
  16. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  17. CSLIM: Contextual SLIM Recommenders. We use SLIM-I as an example to show how to build CSLIM-I; contexts can be incorporated into SLIM-U to formulate CSLIM-U models analogously. Ranking prediction in SLIM-I: $S_{i,j} = \sum_{h=1, h \neq j}^{N} R_{i,h} W_{h,j}$. Incorporating contexts, CSLIM has a uniform ranking prediction: $S_{i,j,c} = \sum_{h=1, h \neq j}^{N} R_{i,h,c} W_{h,j}$. CSLIM aggregates contextual ratings with item-item coefficients. There are two key points: 1) the ratings being aggregated should all be placed under the same context $c$; 2) accordingly, $W$ indicates coefficients under the same context.
  18. CSLIM: Contextual SLIM Recommenders. The challenge is how to estimate $R_{i,h,c}$, since contextual ratings are usually sparse: it is not guaranteed that the same user has rated other items in the same context $c$. Ranking prediction in CSLIM-I: $S_{i,j,c} = \sum_{h=1, h \neq j}^{N} \hat{R}_{i,h,c} W_{h,j}$. We use a deviation-based approach to estimate it: matrix $R$ is the user-item 2D rating matrix (non-contextual ratings); matrix $W$ is the item-item coefficient matrix; matrix $D$ estimates rating deviations in contexts. Here, $D$ is a CI matrix (rows are items, columns are contextual conditions), so this approach is named CSLIM-I-CI.
  19. CSLIM: Contextual SLIM Recommenders. The deviation-based estimate in CSLIM-I-CI: $\hat{R}_{i,j,c} = R_{i,j} + \sum_{l=1}^{L} D_{j,l}\, c_l$, where $R$ is the non-contextual rating matrix, $D$ is the contextual rating deviation matrix, $W$ is the item-item coefficient matrix, and $c$ is a binary context vector, e.g., (Weekend = 1, Weekday = 0, At Home = 0, At Park = 1). We use this estimate even when a real contextual rating in situation $c$ is known, since we want to learn as many cells of $D$ as possible.
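The deviation-based estimate and the CSLIM ranking sum combine as follows (a minimal sketch; all matrix values are invented for illustration):

```python
import numpy as np

def estimate_contextual_rating(R, D, u, j, c):
    """Deviation-based estimate: R_hat[u, j, c] = R[u, j] + sum_l D[j, l] * c[l]."""
    return R[u, j] + D[j] @ c

def cslim_i_ci_score(R, D, W, u, j, c):
    """CSLIM-I-CI ranking score: aggregate estimated contextual ratings
    on the other items with the item-item coefficients W."""
    s = 0.0
    for h in range(R.shape[1]):
        if h != j and R[u, h] > 0:   # skip item j itself and unrated items
            s += estimate_contextual_rating(R, D, u, h, c) * W[h, j]
    return s

# Toy data: 1 user, 3 items, 2 contextual conditions
R = np.array([[3.0, 0.0, 2.0]])      # non-contextual ratings (0 = unrated)
D = np.array([[0.5, -0.5],           # deviation of each item
              [0.0,  0.0],           # under each contextual condition
              [1.0,  0.0]])
W = np.array([[0.0, 0.2, 0.0],       # item-item coefficients
              [0.2, 0.0, 0.0],
              [0.1, 0.1, 0.0]])
c = np.array([1.0, 0.0])             # first condition active
score = cslim_i_ci_score(R, D, W, u=0, j=1, c=c)
# (3 + 0.5) * 0.2 + (2 + 1.0) * 0.1 = 0.7 + 0.3 = 1.0
```

In training, D and W are learned jointly against the ranking objective; the sketch above only shows the forward scoring pass.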
  20. CSLIM: Contextual SLIM Recommenders. There are three ways to model contextual rating deviations (CRD) in $D$: 1) $D$ is a CI matrix, assuming there is a CRD for each <item, context> pair; 2) $D$ is a CU matrix, assuming there is a CRD for each <user, context> pair; 3) $D$ is a vector, assuming the CRD depends only on the context. Incorporating contexts into SLIM-I yields CSLIM-I-CI, CSLIM-I-CU, and CSLIM-I-C; incorporating them into SLIM-U yields CSLIM-U-CI, CSLIM-U-CU, and CSLIM-U-C. Altogether, we have built six deviation-based CSLIM models.
  21. Further Step: General CSLIM Approaches. Cons: CSLIM requires users' non-contextual ratings on items; when such ratings are missing, we proposed using the average of the user's contextual ratings on the item as a substitute, which our experiments showed to be feasible. However, we would like to build more general CSLIM (GCSLIM) models that do not require non-contextual ratings at all. Simply, we model matrix $D$ as a CC matrix, where each cell represents the CRD between two contextual conditions. GCSLIM-I-CC can then estimate the rating deviation from one contextual rating to another (same item, but different contexts).
  22. Further Step: General CSLIM Approaches. For example, suppose we want to estimate R<u1, t1, {Weekday, At home}> and we already know the rating R<u1, t1, {Weekend, At cinema}>. Matrix $D$ lets us learn and estimate CRD(Weekday, Weekend) and CRD(At home, At cinema). Therefore: R<u1, t1, {Weekday, At home}> = R<u1, t1, {Weekend, At cinema}> + CRD(Weekday, Weekend) + CRD(At home, At cinema). Similarly, matrix $D$ can be paired with users or items; e.g., we may assume the CRD between contexts differs from user to user.
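In code, this GCSLIM-I-CC estimation is a simple lookup-and-add (the deviation values below are made up for illustration, not learned values from the paper):

```python
# Hypothetical learned condition-to-condition deviations (cells of the CC
# matrix D), stored as a dict for readability
crd = {("Weekday", "Weekend"): -0.4,
       ("At home", "At cinema"): -0.7}

# Known contextual rating R<u1, t1, {Weekend, At cinema}>
known = 4.5

# Estimated rating R<u1, t1, {Weekday, At home}>: shift the known rating
# by the deviation between the two conditions in each context dimension
estimated = known + crd[("Weekday", "Weekend")] + crd[("At home", "At cinema")]
# 4.5 - 0.4 - 0.7 = 3.4
```

The negative deviations here would mean this user tends to rate lower on weekdays at home than on weekends at the cinema; the sign and magnitude are learned per condition pair during training.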
  23. Further Step: General CSLIM Approaches. Two challenges in GCSLIM approaches: 1) For each <user, item> pair, there may be several ratings in different contexts. Which contextual rating should be applied? Using all of them increases computational cost; selecting just one can be done in three ways: MostSimilar, LeastSimilar, and Random. Our experiments showed that randomly picking one works well. See our papers for more details.
  24. Further Step: General CSLIM Approaches. Two challenges in GCSLIM approaches: 2) How to couple matrix $D$ with the user or item dimension? Assigning a separate $D$ to each user/item increases computational cost. Solution: cluster users/items into small groups and let the users/items in a group share the same matrix $D$. We will explore this in future work.
  25. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  26. Data Sets. The current situation in the CARS research domain: 1) the number of data sets is limited; 2) the data are either small or sparse; 3) larger data sets either do not exist or are not publicly accessible. Most data were collected from surveys. All the data sets used can be found at http://tiny.cc/contextdata. Due to limited time, we only present results on the restaurant and music data in these slides; see more results in our CIKM paper.
  27. Baseline Approaches. We choose state-of-the-art CACF algorithms as baselines: 1) Differential Context Modeling (DCM): DCM incorporates contexts into UserKNN/ItemKNN, but it suffers from sparsity and performs worst in terms of precision, recall, and MAP. 2) Context-aware Splitting Approaches (CASA): CASA is a contextual transformation approach in which contextual data are converted into a 2D user-item rating matrix, after which a traditional approach (MF in this case) is applied to the transformed data. 3) Context-Aware Matrix Factorization (CAMF): CAMF incorporates contexts into MF, modeling CRD in a similar way to CSLIM. 4) Tensor Factorization (TF): TF is an independent context-aware algorithm, since contexts are assumed to be independent of the user and item dimensions. TF's computational cost grows as the number of contexts increases.
  28. Evaluation Protocols. 1) 5-fold cross-validation: all algorithms were run on the same 5 folds of the data. 2) Top-N recommendation evaluation. Metrics: precision, recall, and MAP (mean average precision). Precision and recall measure accuracy; MAP measures position in the rankings. Research questions: 1) Does CSLIM outperform the state-of-the-art CARS algorithms? 2) How about GCSLIM? Is it better than CSLIM? 3) With so many CSLIM algorithms, are there guidelines for pre-selecting the appropriate one?
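The three metrics can be sketched for a single user as follows (hypothetical helper functions, shown only to pin down the standard definitions; MAP is the mean of the per-user average precision):

```python
def precision_recall_at_n(ranked, relevant, n):
    """Precision@N and Recall@N for one user's top-N recommendation list."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n, hits / len(relevant)

def average_precision_at_n(ranked, relevant, n):
    """Average precision for one user; averaging this over users gives MAP."""
    hits, total = 0, 0.0
    for k, item in enumerate(ranked[:n], start=1):
        if item in relevant:
            hits += 1
            total += hits / k        # precision at each hit position
    return total / min(len(relevant), n) if relevant else 0.0

# Toy example: a ranked list of 5 items, 2 of which the user liked
ranked = ["T1", "T2", "T3", "T4", "T5"]
relevant = {"T1", "T3"}
p, r = precision_recall_at_n(ranked, relevant, n=5)   # 0.4, 1.0
ap = average_precision_at_n(ranked, relevant, n=5)    # (1/1 + 2/3) / 2
```

Note how the average precision rewards placing hits near the top of the list, which is why MAP is used to assess ranking quality rather than accuracy alone.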
  29. Evaluation Results. Research questions: 1) Does CSLIM outperform the state-of-the-art CARS algorithms? 2) How about GCSLIM? Is it better than CSLIM? 3) With so many CSLIM algorithms, are there guidelines for pre-selecting the appropriate one?
  30. Evaluation Results. Research questions: 1) Does CSLIM outperform the state-of-the-art CARS algorithms? 2) How about GCSLIM? Is it better than CSLIM? 3) With so many CSLIM algorithms, are there guidelines for pre-selecting the appropriate one?
  31. Evaluation Results. Research questions: 1) Does CSLIM outperform the state-of-the-art CARS algorithms? 2) How about GCSLIM? Is it better than CSLIM? 3) With so many CSLIM algorithms, are there guidelines for pre-selecting the appropriate one? There are two pieces in a CSLIM algorithm's name; for example, CSLIM-I-CI: 1) CSLIM-I indicates we build on an ItemKNN-style CF approach; 2) -CI indicates we model CRD as a CI matrix. Questions: 1) Should CSLIM-I/ItemKNN or CSLIM-U/UserKNN be used? Answer: it depends on the average number of ratings per item versus the average number of ratings per user. 2) Should -CI, -CU, or -C be applied? Answer: it depends on whether contexts are more dependent on users or on items. For more details, see our CIKM paper.
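The first selection criterion compares two simple data statistics, which can be computed as below (a hypothetical helper, not the paper's code; how the two averages are compared to pick a variant is data-dependent, as the slide notes):

```python
import numpy as np

def rating_density_stats(R):
    """Average number of ratings per user and per item, from a
    user-item rating matrix where 0 means unrated."""
    rated = R > 0
    avg_per_user = rated.sum(axis=1).mean()
    avg_per_item = rated.sum(axis=0).mean()
    return avg_per_user, avg_per_item

# Toy matrix: 2 users x 3 items, 4 ratings in total
R = np.array([[3, 0, 2],
              [0, 4, 1]])
per_user, per_item = rating_density_stats(R)   # 2.0 per user, 4/3 per item
```

Intuitively, whichever dimension carries more ratings on average gives the corresponding KNN-style model (item-based or user-based) more signal to learn its coefficient matrix from.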
  32. Evaluation Results. How about runtime efficiency? In CSLIM and GCSLIM, the matrices $D$ and $W$ must be learned during training, and different challenges arise: 1) Large numbers of users/items/ratings: the non-contextual rating matrix $R$ (or the rating space $P$) becomes very large, as does the matrix $W$. Solution: adopt a KNN strategy; instead of all ratings, use only the top-N neighbors (items or users). 2) Large numbers of contexts: what if there are very many contextual conditions? Usually, in the CARS domain, the number of contextual dimensions is under 10 and the number of contextual conditions is at most around 100. Solution: there are many ways to pre-select influential contexts, which helps reduce the number of contexts.
  33. Outline of the Talk • Context-aware Recommender Systems (CARS) • Collaborative Filtering and SLIM Recommenders • CSLIM: Contextualizing SLIM Recommenders • Experimental Evaluations • Conclusions and Future Work
  34. Conclusions. 1) CSLIM has been demonstrated to outperform the state-of-the-art CARS algorithms. 2) GCSLIM sometimes yields further improvements, but it is not guaranteed to beat the CSLIM algorithms; it depends on how sparse the contextual ratings are. 3) We identify influential factors and rules for selecting the appropriate CSLIM algorithm in advance. Future work: 1) examine CSLIM and GCSLIM on larger data sets; 2) compare against more models, e.g., factorization machines; 3) couple the CC matrix with users/items in the GCSLIM approach; 4) incorporate contexts into the matrix $W$ instead of adding the matrix $D$.
  35. Deviation-Based Contextual SLIM Recommenders. Yong Zheng, Bamshad Mobasher, Robin Burke. DePaul University, Chicago, IL, USA. @CIKM 2014, Shanghai, China, Nov 4, 2014. Thanks! Questions?
