2. Agenda
• Use case
• Google search, Amazon online shopping
• What is a recommendation system
• Hands-on practices
• Baseline
• Content-based / collaborative filtering (CF)
• LightFM
• NCF
3. Prepare training environment
• Start container
• RUN => cd /tmp; mkdir docker_20201209
• RUN => docker run --rm -e JUPYTER_ENABLE_LAB=yes -v "/tmp/work":/home/jovyan/work -p 20000:8888 jupyter/datascience-notebook
4. Prepare training environment
• Place the docker_20201216 folder in the /tmp/work folder
• Code: https://github.com/orozcohsu/ntunhs_2020/tree/master/docker_20201216
9. What is a recommendation system
• Which option is better for driving more jam sales?
• https://medium.com/@przemekszustak/less-is-more-the-paradox-of-choice-behavioural-economics-in-ux-318849b2d70
10. What is a recommendation system
• Recommendation types
• Demographic-based recommendation
• Content-based recommendation
• Collaborative filtering (CF) recommendation
• User-based CF
• Item-based CF
• Model-based recommendation
• Machine learning algorithms (LightFM, ALS, SVD…)
• Deep learning
19. What is a recommendation system
• Model optimization
• Precision@K: take the top K recommended items and measure what fraction of them the user actually reacted to
• Use Precision@K to optimize model performance
• Model evaluation
• A/B test => online evaluation
• RMSE, Precision, Recall, F1 => offline evaluation
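As a quick illustration, Precision@K can be computed like this (a minimal sketch; the function and variable names are my own, not from the demo notebooks):

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items the user actually reacted to."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# toy example: 2 of the top-5 recommendations were actually clicked
print(precision_at_k(["a", "b", "c", "d", "e"], {"b", "e"}, k=5))  # 0.4
```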
21. Hands-on practices
• Split the data into a training set and a validation set
Training: [appearance > 1]
Validation: [appearance = 1]
Leave-one-out evaluation for the baseline
appearance = 1 means the latest view of each visitor
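The split above can be sketched in plain Python (the event-log layout and names here are hypothetical, for illustration only):

```python
# Hypothetical event log: (visitor_id, item_id, timestamp), one row per view.
events = [
    (1, "a", 1), (1, "b", 2), (1, "c", 3),
    (2, "x", 1), (2, "y", 2),
]

# Group views per visitor so we can rank them from newest to oldest.
by_visitor = {}
for visitor, item, ts in events:
    by_visitor.setdefault(visitor, []).append((ts, item))

train, valid = [], []
for visitor, views in by_visitor.items():
    views.sort(reverse=True)                 # newest first
    for appearance, (ts, item) in enumerate(views, start=1):
        if appearance == 1:                  # latest view -> validation
            valid.append((visitor, item))
        else:                                # earlier views -> training
            train.append((visitor, item))
```

Each visitor contributes exactly one (their latest) view to the validation set, which is what makes this a leave-one-out split.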
22. Hands-on practices
• MRR (mean reciprocal rank)
https://en.wikipedia.org/wiki/Mean_reciprocal_rank
The earlier the first relevant item appears in the ranking, the higher the score, so higher is BETTER!
Baseline MRR = 0.02096631601316712
Not so good
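MRR averages the reciprocal of the rank at which the first relevant item appears. A minimal sketch (names are my own):

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """first_relevant_ranks: 1-based rank of the first relevant item for
    each query/user; 0 means no relevant item was found in the list."""
    rr = [1.0 / r if r else 0.0 for r in first_relevant_ranks]
    return sum(rr) / len(rr)

# relevant item found at position 1, position 3, and not found at all:
print(mean_reciprocal_rank([1, 3, 0]))  # (1 + 1/3 + 0) / 3 ≈ 0.444
```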
28. Hands-on practices [cb]
• Demo code
• RUN => 03_Content_Based_Filtering.ipynb
TF-IDF measures how distinctive a word is: it compares how often the word appears in a document with how many documents the word appears in.
Keywords are extracted from text with TF-IDF.
A matrix-vector dot product is used to compute similarity.
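A minimal pure-Python sketch of the idea (the notebook itself uses its own pipeline; the toy documents and helper names here are mine):

```python
import math

def tfidf(docs):
    """Minimal TF-IDF: docs is a list of token lists.
    Returns one {term: weight} dict per document."""
    n = len(docs)
    df = {}                                  # document frequency per term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        tf = {t: doc.count(t) / len(doc) for t in set(doc)}
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [["strawberry", "jam"], ["strawberry", "tart"], ["car", "engine"]]
vecs = tfidf(docs)
# doc 0 shares "strawberry" with doc 1 but nothing with doc 2
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
```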
30. Hands-on practices [LightFM]
• It makes it possible to incorporate both item and user metadata into traditional matrix factorization algorithms
• It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalize to new items (via item features) and to new users (via user features)
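The "sum of feature embeddings" idea can be sketched in NumPy (this is my own toy illustration of the concept, not LightFM's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, dim = 5, 3           # 5 user features, 3-dim latent embeddings

# One latent vector per *feature* (not per user), as in LightFM.
feature_emb = rng.normal(size=(n_features, dim))

# A user is a binary indicator vector over features,
# e.g. here features 2 and 4 are active (say, an id and an age bucket).
user_features = np.array([0, 0, 1, 0, 1])

# The user's representation is the sum of its active features' embeddings,
# so a brand-new user with known metadata still gets a useful vector.
user_vec = user_features @ feature_emb

item_vec = rng.normal(size=dim)
score = user_vec @ item_vec      # predicted affinity (before the loss/link)
```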
32. Hands-on practices [LightFM]
• Demo code
• RUN => 04_LightFM_small.ipynb
• YOU RUN => 04_LightFM.ipynb
no_components: the dimensionality of the feature latent embeddings
learning_schedule: 'adagrad', 'adadelta'
loss: 'logistic', 'bpr', 'warp', 'warp-kos'
learning_rate: initial learning rate for the adagrad learning schedule
Ref. https://making.lyst.com/lightfm/docs/_modules/lightfm/lightfm.html
33. Hands-on practices [NCF]
• Confusion matrix
• Acc = (TP+TN)/(TP+FP+FN+TN)
• Recall = TP/(TP+FN)
• Precision = TP/(TP+FP)

                     Fact (H1 = True)   Fact (H1 = False)
Prediction (True)    TP                 FP
Prediction (False)   FN                 TN

Add some background (negative) examples if your precision is low.
Example: add FP (false-positive) examples from testing as negative examples to train a new model.
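The three formulas above in code form (counts are made up for illustration):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, and precision from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return acc, recall, precision

# e.g. 40 TP, 10 FP, 20 FN, 30 TN
acc, rec, prec = classification_metrics(40, 10, 20, 30)
# accuracy 0.7, recall ~0.667, precision 0.8
```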
34. Hands-on practices [NCF]
• Demo code
• YOU RUN => 05_NCF.ipynb
NCF MRR = 0.24179100068337941
The difference between implicit and explicit feedback is that with implicit feedback we only observe the items each user may be interested in, not how much they liked them.
To train a deep-learning model, we need to create some negative training examples. The fastest way is to randomly generate them.
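Random negative sampling can be sketched like this (a toy version with my own names, not the notebook's implementation):

```python
import random

def sample_negatives(positives, all_items, n_per_user, seed=0):
    """For each user, randomly pick items they never interacted with and
    label them 0, to pair with the observed positives (label 1)."""
    rng = random.Random(seed)
    rows = []
    for user, seen in positives.items():
        candidates = [i for i in all_items if i not in seen]
        for item in rng.sample(candidates, min(n_per_user, len(candidates))):
            rows.append((user, item, 0))     # label 0 = negative example
    return rows

positives = {"u1": {"a", "b"}, "u2": {"c"}}
all_items = ["a", "b", "c", "d", "e"]
negs = sample_negatives(positives, all_items, n_per_user=2)
```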
35. Hands-on practices [NCF]
• MLP (multilayer perceptron)
• Copy and paste the get_mlp_model function code from 05_NCF.ipynb into the Net2Vis webpage
• https://viscom.net2vis.uni-ulm.de/FAPokAxSXSrKPD9xCYjbDerXwa3WAZzeoL5ST6b1pOkCdPHZDo
36. Reference
• Credit to https://github.com/khuangaf/tibame_recommender_system
• We did not cover the deep-learning topics in much depth:
• MLP
• CNN
• Dense
• Flatten
• Activation
• Add
• Dropout
• …