Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases
1. ENHANCING MATRIX FACTORIZATION
THROUGH INITIALIZATION FOR
IMPLICIT FEEDBACK DATABASES
Balázs Hidasi
Domonkos Tikk
Gravity R&D Ltd.
Budapest University of Technology and Economics
3. MATRIX FACTORIZATION
Collaborative Filtering
One of the most common approaches
Approximates the rating matrix as product of low-
rank matrices
Items
Q
Users
R ≈ P
3/19
4. MATRIX FACTORIZATION
Initialize P and Q with small random numbers
Teach P and Q
Alternating Least Squares
Gradient Descent
Etc.
Transforms the data to a feature space
Separately for users and items
Noise reduction
Compression
Generalization
4/19
5. IMPLICIT FEEDBACK
No ratings
User-item interactions (events)
Much noisier
Presence of an event might not be positive feedback
Absence of an event does not mean negative
feedback
No negative feedback is available!
More common problem
MF for implicit feedback
Less accurate results due to noise
Mostly ALS is used 5/19
Scalability problems (rating matrix is dense)
6. CONCEPT
Good MF model
The feature vectors of similar entities are similar
If data is too noisy similar entities won’t be similar by
their features
Start MF from a „good” point
Feature vector similarities are OK
Data is more than just events
Metadata
Info about items/users
Contextual data
In what context did the event occured
Can we incorporate those to help implicit MF? 6/19
7. NAIVE APPROACH
Describe items using any data we have (detailed
later)
Long, sparse vectors for item description
Compress these vectors to dense feature vectors
PCA, MLP, MF, …
Length of desired vectors = Number of features in MF
Use these features as starting points
7/19
8. NAIVE APPROACH
Compression and also noise reduction
Does not really care about similarities
But often feature similarities are not that bad
If MF is used
Half of the results is thrown out
Descriptors
features Descriptor features
≈
Items
Item
Description of items
8/19
9. SIMFACTOR ALGORITHM
Try to preserve similarities better
Starting from an MF of item description
Descriptors
Descriptors features
features
Description of items
≈
Items
Item
(D)
Similarities of items: DD’
Description of items
Some metrics require transformation on D
(D’)
Item
Description of items
similarities
(S)
= (D) 9/19
10. SIMFACTOR ALGORITHM
Descriptors features
features
Item Descriptors features Item
≈ Item
(X)
similarities (Y’) features
(S) (X’)
(Y)
Similarity approximation
features
Item Item
Item
≈
(X)
similarities Y’Y features
(S) (X’)
Y’Y KxK symmetric 10/19
Eigendecomposition
11. SIMFACTOR ALGORITHM
Y’Y = U λ U’
λ diagonal λ = SQRT(λ) * SQRT(λ)
features
Item Item
≈
Item
SQRT SQRT
(X)
similarities U (λ) (λ) U’ features
(S) (X’)
X*U*SQRT(λ) = (SQRT(λ)*U’*X’)’=F
F is MxK matrix
S F * F’ F used for initialization
Item
similarities ≈ F’ 11/19
F
(S)
12. CREATING THE DESCRIPTION MATRIX
„Any” data about the entity
Vector-space reprezentation
For Items:
Metadata vector (title, category, description, etc)
Event vector (who bought the item)
Context-state vector (in which context state was it
bought)
Context-event (in which context state who bought it)
For Users:
All above except metadata
Currently: Choose one source for D matrix
12/19
Context used: seasonality
13. EXPERIMENTS: SIMILARITY PRESERVATION
Real life dataset: online grocery shopping events
SimFactor RMSE improvement over naive in similarity
approximation
52,36%
48,70%
26,22%
16,86%
13,39% 12,38%
10,81%
Item context state User context state Item context- User context- Item event data User event data Item metadata
event event
13/19
SimFactor approximates similarities better
14. EXPERIMENTS: INITIALIZATION
Using different description matrices
And both naive and SimFactor initialization
Baseline: random init
Evaluation metric: recall@50
14/19
15. EXPERIMENTS: GROCERY DB
Up to 6% improvement
Best methods use SimFactor and user context data
Top5 methods on Grocery DB
5,71%
4,88%
4,30%
4,12% 4,04%
15/19
User context state User context state User context event User event data User context event
(SimFactor) (Naive) (SimFactor) (SimFactor) (Naive)
16. EXPERIMENTS: „IMPLICITIZED” MOVIELENS
Keeping 5 star ratings implicit events
Up to 10% improvement
Best methods use SimFactor and item context data
Top5 methods on MovieLens DB
10%
9,17% 9,17% 9,17% 9,17%
16/19
Item context state User context state Item context event Item context event Item context state
(SimFactor) (SimFactor) (SimFactor) (Naive) (Naive)
17. DISCUSSION OF RESULTS
SimFactor yields better results than naive
Context information yields better results than other
descriptions
Context information separates well between entities
Grocery: User context
People’s routines
Different types of shoppings in different times
MovieLens: Item context
Different types of movies watched on different hours
Context-based similarity
17/19
18. WHY CONTEXT?
Grocery example
Correlation between context states by users low
ITEM USER
Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun
Mon 1,00 0,79 0,79 0,78 0,76 0,70 0,74 Mon 1,00 0,36 0,34 0,34 0,35 0,29 0,19
Tue 0,79 1,00 0,79 0,78 0,76 0,69 0,73 Tue 0,36 1,00 0,34 0,34 0,33 0,29 0,19
Wed 0,79 0,79 1,00 0,79 0,76 0,70 0,74 Wed 0,34 0,34 1,00 0,36 0,35 0,27 0,17
Thu 0,78 0,78 0,79 1,00 0,76 0,71 0,74 Thu 0,34 0,34 0,36 1,00 0,39 0,30 0,16
Fri 0,76 0,76 0,76 0,76 1,00 0,71 0,72 Fri 0,35 0,33 0,35 0,39 1,00 0,32 0,16
Sat 0,70 0,69 0,70 0,71 0,71 1,00 0,71 Sat 0,29 0,29 0,27 0,30 0,32 1,00 0,33
Sun 0,74 0,73 0,74 0,74 0,72 0,71 1,00 Sun 0,19 0,19 0,17 0,16 0,16 0,33 1,00
Why can context aware algorithms be efficient?
Different recommendations in different context states
18/19
Context differentiates well between entities
Easier subtasks
19. CONCLUSION & FUTURE WORK
SimFactor Similarity preserving compression
Similarity based MF initialization:
Description matrix from any data
Apply SimFactor
Use output as initial features for MF
Context differentitates between entities well
Future work:
Mixed description matrix (multiple data sources)
Multiple description matrix
Using different context information
Using different similarity metrics 19/19