Fast ALS-Based Matrix Factorization for Recommender Systems
David Zibriczky
LAWA Workpackage Meeting
16th January, 2013
Problem setting
Item Recommendation
• Classical item recommendation problem (see Netflix)
• Explicit feedback (ratings)

[Figure: one user's ratings for The Matrix, The Matrix 2, Twilight and The Matrix 3: one known rating (5), the rest unknown (?)]
Collaborative Filtering (Explicit)
• Classical item recommendation problem (see Netflix)
• Explicit feedback (ratings)
• Collaborative Filtering
  • Based on other users

[Figure: ratings of several users for The Matrix, The Matrix 2, Twilight and The Matrix 3; the other users' ratings (mostly 4s and 5s) help estimate the active user's missing ratings (?)]
Collaborative Filtering (Implicit)
• Items are not only movies (live content, products, holidays, …)
• Implicit feedback (buy, view, …)
• Less information about preference

[Figure: implicit feedback events of a user on four items, with the unknown preferences marked ?]
Industrial motivation
• Keeping the response time low
• Up-to-date user models: the adaptation should be fast
• Items may change rapidly, so training time can become a bottleneck of live performance
• Increasing amount of data from a customer → increasing training time
• Limited resources
Model
Preference Matrix
• Matrix representation
• Implicit feedback: assume positive preference, value = 1
• How to estimate the unknown preferences?
• Sorting items by the estimates → item recommendation

R      Item1  Item2  Item3  Item4
User1    1      ?      ?      ?
User2    ?      ?      1      ?
User3    1      1      ?      ?
User4    ?      1      ?      1
Matrix Factorization
๐‘น = ๐‘ท๐‘ธ ๐‘ป
๐‘Ÿ ๐‘ข๐‘– = ๐’‘ ๐‘ข
๐‘‡ ๐’’๐‘–
๐‘น ๐‘ต๐’™๐‘ด: preference matrix
๐‘ท ๐‘ต๐’™๐‘ฒ: user feature matrix
๐‘ธ ๐‘ด๐’™๐‘ฒ: item feature matrix
๐‘ต: #users
๐‘ด: #items
๐‘ฒ: #features
๐‘ฒ โ‰ช ๐‘ด, ๐‘ฒ โ‰ช ๐‘ต
16th January, 20139 LAWA Workpackage Meeting
R Item1 Item2 Item3 โ€ฆ
User1
User2 ๐’“ ๐‘ข๐‘–
User3
โ€ฆ
P
๐’‘ ๐‘ข
๐‘‡
QT
๐’’๐‘–
๐’‘ ๐’– โ‰” ๐‘ท ๐’– ๐‘ป
๐’’๐’Š โ‰” ๐‘ธ ๐’Š ๐‘ป
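A minimal numpy sketch of the model above: the prediction r̂_ui = p_u^T q_i and a top-N recommendation by sorting the estimates, as described on the Preference Matrix slide. The toy sizes and random initialization are my own, purely for illustration.

```python
import numpy as np

# Toy sizes: N users, M items, K latent features with K << N, M
N, M, K = 4, 4, 2
rng = np.random.default_rng(0)
P = rng.normal(scale=0.1, size=(N, K))   # user feature matrix
Q = rng.normal(scale=0.1, size=(M, K))   # item feature matrix

def predict(u, i):
    """Estimated preference r_hat_ui = p_u^T q_i."""
    return P[u] @ Q[i]

def recommend(u, known_items, top_n=2):
    """Sort items by estimated preference and drop the ones already consumed."""
    scores = Q @ P[u]                      # all estimates for user u at once
    scores[list(known_items)] = -np.inf    # never re-recommend known items
    return np.argsort(-scores)[:top_n]
```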
Objective Function
Preference Matrix
R      Item1  Item2  Item3  Item4
User1    1
User2                  1
User3    1      1
User4           1             1
Preference Matrix
• Zero value for the unknown preferences (zero examples). In practice: many 0s, few 1s.

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1
Confidence Matrix
• Zero value for the unknown preferences (zero examples). In practice: many 0s, few 1s.
• c_ui: confidence of a known feedback (a constant, or a function of the context of the event)
• Zero examples matter less than positive examples, but they still matter.

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

C      Item1  Item2  Item3  Item4
User1   c11     1      1      1
User2    1      1     c23     1
User3   c31    c32     1      1
User4    1     c42     1     c44
Weighted Sum of Squared Errors
• Objective function:

  f(P, Q) = WSSE = Σ_(u,i) c_ui (r_ui − r̂_ui)²

• Find the P and Q that minimize f (a code sketch follows this slide).

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

C      Item1  Item2  Item3  Item4
User1   c11     1      1      1
User2    1      1     c23     1
User3   c31    c32     1      1
User4    1     c42     1     c44
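A minimal numpy sketch of the objective above, evaluated on the 4×4 example from the slides. The confidence value 10 for positive feedback is an arbitrary illustrative choice of mine, not taken from the slides.

```python
import numpy as np

def weighted_sse(P, Q, R, C):
    """f(P, Q) = sum over (u, i) of c_ui * (r_ui - r_hat_ui)^2, with r_hat = P Q^T."""
    R_hat = P @ Q.T
    return np.sum(C * (R - R_hat) ** 2)

# The 4x4 implicit preference matrix from the slides and one possible confidence matrix:
R = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 1, 0, 1]], dtype=float)
C = np.where(R > 0, 10.0, 1.0)   # c_ui > 1 on known feedback, 1 on zero examples
```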
Optimizer
Optimizer – Alternating Least Squares
• Ridge Regression (alternate: with Q fixed, solve for every p_u; with P fixed, solve for every q_i)
  • p_u = (Q^T C_u Q)^(−1) Q^T C_u r_u
  • q_i = (P^T C_i P)^(−1) P^T C_i r_i
  where C_u = diag(c_u1, …, c_uM), r_u is the u-th row of R, and C_i, r_i are the item-side counterparts
  (a code sketch of this solve follows this slide)

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

Initial factors:
P                          Q^T
-0.2   0.6                  0.1  -0.4   0.8   0.6
 0.6   0.4                  0.6   0.7  -0.7  -0.2
 0.7   0.2
 0.5  -0.2
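A minimal numpy sketch of the naive alternating solve above, assuming dense R and C as on the slides. The optional ridge term lam·I is my own addition for numerical stability; it is not part of the slide's formula.

```python
import numpy as np

def ridge_solve(X, conf, target, lam=0.0):
    """p = (X^T C X + lam*I)^(-1) X^T C target, with C = diag(conf)."""
    K = X.shape[1]
    A = X.T @ (conf[:, None] * X) + lam * np.eye(K)
    b = X.T @ (conf * target)
    return np.linalg.solve(A, b)

def als_iteration(P, Q, R, C, lam=0.0):
    """One full sweep: recompute every p_u with Q fixed, then every q_i with P fixed."""
    for u in range(R.shape[0]):
        P[u] = ridge_solve(Q, C[u], R[u], lam)        # p_u = (Q^T C_u Q)^-1 Q^T C_u r_u
    for i in range(R.shape[1]):
        Q[i] = ridge_solve(P, C[:, i], R[:, i], lam)  # q_i = (P^T C_i P)^-1 P^T C_i r_i
    return P, Q
```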
Optimizer – Alternating Least Squares
• Ridge Regression
  • p_u = (Q^T C_u Q)^(−1) Q^T C_u r_u
  • q_i = (P^T C_i P)^(−1) P^T C_i r_i

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

After recomputing Q^T (P fixed):
P                          Q^T
-0.2   0.6                  0.3  -0.3   0.7   0.7
 0.6   0.4                  0.7   0.8  -0.5  -0.1
 0.7   0.2
 0.5  -0.2
Optimizer – Alternating Least Squares
• Ridge Regression
  • p_u = (Q^T C_u Q)^(−1) Q^T C_u r_u
  • q_i = (P^T C_i P)^(−1) P^T C_i r_i

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

After recomputing P (Q fixed):
P                          Q^T
-0.2   0.7                  0.3  -0.3   0.7   0.7
 0.6   0.5                  0.7   0.8  -0.5  -0.1
 0.8   0.2
 0.6  -0.2
Optimizer – Alternating Least Squares
• Complexity of the naive solution: O(IK²NM + IK³(N + M))
  (E: number of examples, I: number of iterations)
• Improvement (Hu, Koren, Volinsky):
  • Ridge Regression: p_u = (Q^T C_u Q)^(−1) Q^T C_u r_u
  • Q^T C_u Q = Q^T Q + Q^T (C_u − I) Q = COVQ0 + COVQ+; computing this naively is the costly O(IK²NM) part
  • COVQ0 is user independent, so it needs to be computed only once, at the start of each iteration
  • Computing COVQ+ involves only the #P(u)+ positive examples of the user, not all M items
    (#P(u)+: number of positive examples of user u)
  • Complexity: O(IK²E + IK³(N + M)) = O(IK²(E + K(N + M)))
  • Codename: IALS (a code sketch of the user step follows this slide)
• Complexity issues on large datasets:
  • If K is low, O(IK²E) is dominant
  • If K is high, O(IK³(N + M)) is dominant
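A minimal numpy sketch of the IALS user step described above, assuming 0/1 implicit feedback: Q^T Q is precomputed once per sweep and shared, and only the user's positive examples are touched. The optional ridge term lam·I is my addition, not shown on the slide.

```python
import numpy as np

def ials_user_step(Q, QtQ, pos_items, pos_conf, lam=0.0):
    """IALS user update: exploit Q^T C_u Q = Q^T Q + Q^T (C_u - I) Q.

    QtQ       : precomputed Q^T Q (COVQ0, user independent, built once per sweep)
    pos_items : indices of the user's positive examples (r_ui = 1)
    pos_conf  : their confidences c_ui (> 1); every zero example has confidence 1
    """
    K = Q.shape[1]
    Q_pos = Q[pos_items]                                       # (#P(u)+, K)
    A = QtQ + Q_pos.T @ ((pos_conf - 1.0)[:, None] * Q_pos)    # COVQ0 + COVQ+
    A += lam * np.eye(K)                                       # optional ridge term (my addition)
    b = Q_pos.T @ pos_conf                                     # Q^T C_u r_u: only positives contribute
    return np.linalg.solve(A, b)

# Per sweep: QtQ = Q.T @ Q is computed once, in O(K^2 M), and reused for all users.
```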
Problem: Complexity
Ridge Regression with Coordinate Descent
• One user (User1), K = 3 features, p_u not yet known

R      Item1  Item2  Item3  Item4
User1    1      0      0      0

Q^T
 0.9  -0.4   0.8   0.6
 0.6   0.7  -0.7  -0.2
-0.1  -0.4  -0.1   0.6

p_u:    ?      ?      ?
Ridge Regression with Coordinate Descent
• Initialize p_u with zero values

R      Item1  Item2  Item3  Item4
User1    1      0      0      0

Q^T
 0.9  -0.4   0.8   0.6
 0.6   0.7  -0.7  -0.2
-0.1  -0.4  -0.1   0.6

p_u:    0      0      0
Ridge Regression with Coordinate Descent
• Target vector: e_u = C_u (r_u − p_u Q^T)
• Optimize only one feature of p_u at a time:
  p_uk = Σ_{i=1..M} c_ui q_ik e_ui / Σ_{i=1..M} c_ui q_ik q_ik = SQE / SQQ
• Refresh the residuals after each feature update: e_ui := e_ui − Δp_uk c_ui q_ik
• Apply more iterations (a code sketch of this loop follows this slide)

R      Item1  Item2  Item3  Item4
User1    1      0      0      0

Q^T
 0.9  -0.4   0.8   0.6
 0.6   0.7  -0.7  -0.2
-0.1  -0.4  -0.1   0.6

p_u after successive single-feature updates (K = 3):
(0.51, 0, 0) → (0.51, 0.10, 0) → (0.51, 0.10, 0.08) → (0.47, 0.10, 0.08) → (0.46, 0.11, 0.07)
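A minimal numpy sketch of the per-user coordinate-descent loop above. As an assumption of mine about the bookkeeping, the residual is kept unweighted (e_ui = r_ui − p_u·q_i) and the confidences enter the SQE/SQQ sums, which keeps every single-feature update exact.

```python
import numpy as np

def cd_user_update(p_u, Q, c_u, r_u, n_passes=2):
    """Coordinate descent for one user: optimize one feature of p_u at a time.

    The residual e_ui = r_ui - p_u^T q_i is kept up to date, so each feature
    update costs O(M) instead of a full K x K ridge solve.
    """
    e = r_u - Q @ p_u                          # residuals over all items
    for _ in range(n_passes):                  # "apply more iterations"
        for k in range(Q.shape[1]):
            sqe = np.sum(c_u * Q[:, k] * e)    # SQE
            sqq = np.sum(c_u * Q[:, k] ** 2)   # SQQ
            delta = sqe / sqq                  # change of the k-th feature
            p_u[k] += delta
            e -= delta * Q[:, k]               # refresh the residuals
    return p_u
```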
Optimizer – Coordinate Descent
• Ridge Regression with Coordinate Descent over the whole matrix, alternating: with Q^T fixed, the rows of P are recomputed one user at a time; then with P fixed, the rows of Q^T are recomputed one item at a time; and so on.

R      Item1  Item2  Item3  Item4
User1    1      0      0      0
User2    0      0      1      0
User3    1      1      0      0
User4    0      1      0      1

Sweep 1: Q^T fixed at
 0.1   0.4   1.1   0.6
 0.6   0.7   1.5   1.0
P is filled in row by row, giving
 0.3  -0.1
 0.1  -0.5
-0.4   0.2
 0.5  -0.4

Sweep 2: P fixed, Q^T recomputed row by row, giving
 0.1   0.4  -0.1   0.2
 0.6   0.7   0.8   0.5

Sweep 3: Q^T fixed, P recomputed again, giving
 0.2  -0.1
 0.1  -0.4
-0.3   0.1
 0.5  -0.6
Optimizer – Coordinate Descent
• Complexity of the naive solution: O(IKNM)
• Ridge Regression with CD computes the features from the examples directly, so the covariance-precomputation trick of IALS cannot be applied here.
Optimizer – Coordinate Descent Improvement
• Synthetic examples (Pilászy, Zibriczky, Tikk)
• Solution of Ridge Regression with CD: p_uk = Σ_{i=1..M} c_ui q_ik e_ui / Σ_{i=1..M} c_ui q_ik q_ik = SQE / SQQ
• Calculate the statistics of a user who watched nothing (SQE0 and SQQ0)
• The solution is then computed incrementally: p_uk = SQE / SQQ = (SQE0 + SQE+) / (SQQ0 + SQQ+)   (M + #P(u)+ steps)
• Eigenvalue decomposition: Q^T Q = S Λ S^T = (Λ^(1/2) S^T)^T (Λ^(1/2) S^T) = G^T G, with G = Λ^(1/2) S^T
• The zero examples are compressed into K synthetic examples: Q (M×K) → G (K×K)
• SGG0 = SQQ0, but it takes only K steps to compute: p_uk = (SGE0 + SQE+) / (SGG0 + SQQ+)   (K + #P(u)+ steps)
• SGE0 is computed the same way as SQE0, but over the K synthetic examples instead of the M items.
• Complexity: O(IK(E + KM + KN)) = O(IK(E + K(M + N)))
  (a code sketch of the compression follows this slide)
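A minimal numpy sketch of the compression step above. The only property used is G^T G = Q^T Q, obtained here via eigendecomposition; the sanity checks on random data at the end are my illustration, not part of the slides.

```python
import numpy as np

def synthetic_examples(Q):
    """Compress the M zero examples into K synthetic examples G with G^T G = Q^T Q.

    Q^T Q = S Λ S^T (eigendecomposition of a symmetric PSD matrix) gives
    G = Λ^(1/2) S^T, a K x K matrix, so per-feature zero-example statistics
    can be accumulated in K steps instead of M.
    """
    eigvals, S = np.linalg.eigh(Q.T @ Q)
    return np.sqrt(np.clip(eigvals, 0.0, None))[:, None] * S.T

# Sanity check on random data (illustration only):
rng = np.random.default_rng(0)
Q = rng.normal(size=(1000, 10))        # M = 1000 items, K = 10 features
G = synthetic_examples(Q)              # 10 x 10
assert np.allclose(G.T @ G, Q.T @ Q)   # same normal-equation contribution
k = 3
assert np.isclose(np.sum(Q[:, k] ** 2), np.sum(G[:, k] ** 2))   # SQQ0 from K rows
```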
Optimizer – Coordinate Descent
• Complexity of the naive solution: O(IKNM)
• Ridge Regression with CD computes the features from the examples directly; the covariance precomputation of IALS cannot be applied here.
• Synthetic examples solve this. Codename: IALS1
• Complexity reduction (IALS → IALS1): O(IK²(E + K(N + M))) → O(IK(E + K(M + N)))
• IALS1 requires a higher K for the same accuracy as IALS.
Optimizer – Coordinate Descent
...does it work in practice?
Comparison
• Average Rank Position (ARP) on a subset of a proprietary implicit feedback dataset; lower is better.
• IALS1 offers better time-accuracy trade-offs, especially when K is large.

K       IALS ARP   IALS time (s)   IALS1 ARP   IALS1 time (s)
5       0.1903        153          0.1898         112
10      0.1578        254          0.1588         134
20      0.1427        644          0.1432         209
50      0.1334       2862          0.1344         525
100     0.1314      11441          0.1325        1361
250     0.1311      92944          0.1312        6651
500     N/A           N/A          0.1282       24697
1000    N/A           N/A          0.1242      104611

[Chart: ARP versus training time (s, log scale) for IALS and IALS1]
Conclusion
• Explicit feedback is rarely provided, or not at all.
• Implicit feedback is more general.
• Alternating Least Squares has complexity issues on large datasets.
• An efficient solution uses approximation and synthetic examples.
• IALS1 offers better time-accuracy trade-offs, especially when K is large.
• IALS is an approximation algorithm too, so why not make it even more approximate?
Other algorithms
Model – Tensor Factorization
• Different preferences during the day
• Time period 1: 06:00-14:00

[Table R1: user×item preference matrix observed in time period 1 — a few 1s, mostly unknown]
Model – Tensor Factorization
• Different preferences during the day
• Time period 2: 14:00-22:00

[Tables R1 and R2: per-period user×item preference matrices for time periods 1 and 2]
Model – Tensor Factorization
• Different preferences during the day
• Time period 3: 22:00-06:00

[Tables R1, R2 and R3: per-period user×item preference matrices for the three time periods]
Model – Tensor Factorization

[Figure: the per-period preference matrices R1, R2, R3 stacked into a tensor and factorized into P, Q^T and T]

R (N×M per time period, N×M×L overall): preference data
P (N×K): user feature matrix
Q (M×K): item feature matrix
T (L×K): time feature matrix
N: #users, M: #items, L: #time periods, K: #features

r̂_uit = Σ_k p_uk q_ik t_tk,   i.e.  R = P ∘ Q ∘ T
(a code sketch of this prediction follows this slide)
Comparison – ITALS vs. IALS
• Data sets: Netflix (ratings of 5), IPTV provider VOD rentals, grocery purchases
• Evaluation metric: Recall@20, Precision-Recall@20
• Number of features: 20

Test case (K = 20)    IALS    ITALS
Netflix Probe         0.087   0.097
Netflix Time Split    0.054   0.071
IPTV VOD 1 day        0.063   0.112
IPTV VOD 1 week       0.055   0.100
Grocery               0.065   0.103
Comparison – ITALS vs. IALS
Objective Function – Ranking-based objective function
• Ranking-based objective function approach:
  • r_ui − r_uj: difference in preference between items i and j
  • r̂_ui − r̂_uj: estimated difference in preference between items i and j
  • s_j: importance of item j in the objective function
• Model: Matrix Factorization
• Optimizer: Alternating Least Squares
• Name: RankALS (a code sketch of the objective follows this slide)

f(θ) = Σ_{u∈U} Σ_{i∈I} c_ui Σ_{j∈I} s_j [ (r_ui − r_uj) − (r̂_ui − r̂_uj) ]²
Comparison – RankIALS vs. IALS
Related Publications
• Alternating Least Squares with Coordinate Descent
  I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets. RecSys 2010
• Tensor Factorization
  B. Hidasi, D. Tikk: Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Implicit Feedback. ECML PKDD 2012
• Personalized Ranking
  G. Takács, D. Tikk: Alternating least squares for personalized ranking. RecSys 2012
• IPTV Case Study
  D. Zibriczky, B. Hidasi, Z. Petres, D. Tikk: Personalized recommendation of linear content on interactive TV platforms: beating the cold start and noisy implicit user feedback. TVMMP @ UMAP 2012
