2. Agenda
• Use case
• Google search, Amazon online shopping
• What is a recommendation system
• Hands-on practices
• Baseline
• Content-based / collaborative filtering (CF)
• LightFM
• NCF
3. Prepare training environment
• Start container
• RUN => cd /tmp; mkdir docker_20201209
• RUN => docker run --rm -e JUPYTER_ENABLE_LAB=yes -v "/tmp/work":/home/jovyan/work -p 20000:8888 jupyter/datascience-notebook
4. Prepare training environment
• Place the docker_20201216 folder in the /tmp/work folder
• Code: https://github.com/orozcohsu/ntunhs_2020/tree/master/docker_20201216
9. What is a recommendation system
• Which option is better for driving more jam sales?
• https://medium.com/@przemekszustak/less-is-more-the-paradox-of-choice-behavioural-economics-in-ux-318849b2d70
10. What is a recommendation system
• Recommendation types
• Demographic-based recommendation
• Content-based recommendation
• Collaborative filtering (CF) recommendation
• User-based CF
• Item-based CF
• Model-based recommendation
• Machine learning algorithms (LightFM, ALS, SVD…)
• Deep learning
19. What is a recommendation system
• Model optimization
• Precision@K: take the top K recommended items and measure what fraction of them the user actually reacted to
• Use Precision@K to optimize model performance
• Model evaluation
• A/B test => online evaluation
• RMSE, Precision, Recall, F1 => offline evaluation
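As a quick illustration, Precision@K can be computed like this (a minimal sketch; the function and variable names are my own, not from the demo notebooks):

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items the user actually reacted to."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# toy example: 2 of the top-5 recommendations were actually clicked
print(precision_at_k(["a", "b", "c", "d", "e"], {"b", "e"}, k=5))  # 0.4
```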
21. Hands-on practices
• Split the data into a training set and a validation set
Training: [appearance > 1]
Validation: [appearance = 1]
Leave-one-out evaluation for the baseline
appearance = 1 means the latest view of each visitor
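The split above can be sketched in plain Python (the event-log layout and names here are hypothetical, for illustration only):

```python
# Hypothetical event log: (visitor_id, item_id, timestamp), one row per view.
events = [
    (1, "a", 1), (1, "b", 2), (1, "c", 3),
    (2, "x", 1), (2, "y", 2),
]

# Group views per visitor so we can rank them from newest to oldest.
by_visitor = {}
for visitor, item, ts in events:
    by_visitor.setdefault(visitor, []).append((ts, item))

train, valid = [], []
for visitor, views in by_visitor.items():
    views.sort(reverse=True)                 # newest first
    for appearance, (ts, item) in enumerate(views, start=1):
        if appearance == 1:                  # latest view -> validation
            valid.append((visitor, item))
        else:                                # earlier views -> training
            train.append((visitor, item))
```

Each visitor contributes exactly one (their latest) view to the validation set, which is what makes this a leave-one-out split.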
22. Hands-on practices
• MRR (mean reciprocal rank)
https://en.wikipedia.org/wiki/Mean_reciprocal_rank
The earlier the first relevant item appears in the ranking, the higher the score, so higher is BETTER!
Baseline MRR = 0.02096631601316712
Not so good
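MRR averages the reciprocal of the rank at which the first relevant item appears. A minimal sketch (names are my own):

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """first_relevant_ranks: 1-based rank of the first relevant item for
    each query/user; 0 means no relevant item was found in the list."""
    rr = [1.0 / r if r else 0.0 for r in first_relevant_ranks]
    return sum(rr) / len(rr)

# relevant item found at position 1, position 3, and not found at all:
print(mean_reciprocal_rank([1, 3, 0]))  # (1 + 1/3 + 0) / 3 ≈ 0.444
```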
28. Hands-on practices [cb]
• Demo code
• RUN => 03_Content_Based_Filtering.ipynb
TF-IDF measures how distinctive a word is: it compares how often the word appears in a document with how many documents the word appears in.
Keywords are extracted from text with TF-IDF.
A matrix-vector dot product is used to compute similarity.
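A minimal pure-Python sketch of the idea (the notebook itself uses its own pipeline; the toy documents and helper names here are mine):

```python
import math

def tfidf(docs):
    """Minimal TF-IDF: docs is a list of token lists.
    Returns one {term: weight} dict per document."""
    n = len(docs)
    df = {}                                  # document frequency per term
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    vectors = []
    for doc in docs:
        tf = {t: doc.count(t) / len(doc) for t in set(doc)}
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [["strawberry", "jam"], ["strawberry", "tart"], ["car", "engine"]]
vecs = tfidf(docs)
# doc 0 shares "strawberry" with doc 1 but nothing with doc 2
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
```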
30. Hands-on practices [LightFM]
• It makes it possible to incorporate both item and user metadata into traditional matrix factorization algorithms
• It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalize to new items (via item features) and to new users (via user features)
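The "sum of feature embeddings" idea can be sketched in NumPy (this is my own toy illustration of the concept, not LightFM's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, dim = 5, 3           # 5 user features, 3-dim latent embeddings

# One latent vector per *feature* (not per user), as in LightFM.
feature_emb = rng.normal(size=(n_features, dim))

# A user is a binary indicator vector over features,
# e.g. here features 2 and 4 are active (say, an id and an age bucket).
user_features = np.array([0, 0, 1, 0, 1])

# The user's representation is the sum of its active features' embeddings,
# so a brand-new user with known metadata still gets a useful vector.
user_vec = user_features @ feature_emb

item_vec = rng.normal(size=dim)
score = user_vec @ item_vec      # predicted affinity (before the loss/link)
```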
32. Hands-on practices [LightFM]
• Demo code
• RUN => 04_LightFM_small.ipynb
• YOU RUN => 04_LightFM.ipynb
no_components: the dimensionality of the feature latent embeddings
learning_schedule: 'adagrad', 'adadelta'
loss: 'logistic', 'bpr', 'warp', 'warp-kos'
learning_rate: initial learning rate for the adagrad learning schedule
Ref. https://making.lyst.com/lightfm/docs/_modules/lightfm/lightfm.html
33. Hands-on practices [NCF]
• Confusion matrix
• Acc = (TP+TN)/(TP+FP+FN+TN)
• Recall = TP/(TP+FN)
• Precision = TP/(TP+FP)

                     Fact (H1 = True)   Fact (H1 = False)
Prediction (True)    TP                 FP
Prediction (False)   FN                 TN

Add some background (negative) examples if your precision is low.
Example: add FP (false-positive) examples from testing as negative examples to train a new model.
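The three formulas above in code form (counts are made up for illustration):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, and precision from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return acc, recall, precision

# e.g. 40 TP, 10 FP, 20 FN, 30 TN
acc, rec, prec = classification_metrics(40, 10, 20, 30)
# accuracy 0.7, recall ~0.667, precision 0.8
```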
34. Hands-on practices [NCF]
• Demo code
• YOU RUN => 05_NCF.ipynb
NCF MRR = 0.24179100068337941
The difference between implicit and explicit feedback is that with implicit feedback we only observe the items each user may be interested in, not how much they liked them.
To train a deep-learning model, we need to create some negative training examples. The fastest way is to randomly generate them.
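Random negative sampling can be sketched like this (a toy version with my own names, not the notebook's implementation):

```python
import random

def sample_negatives(positives, all_items, n_per_user, seed=0):
    """For each user, randomly pick items they never interacted with and
    label them 0, to pair with the observed positives (label 1)."""
    rng = random.Random(seed)
    rows = []
    for user, seen in positives.items():
        candidates = [i for i in all_items if i not in seen]
        for item in rng.sample(candidates, min(n_per_user, len(candidates))):
            rows.append((user, item, 0))     # label 0 = negative example
    return rows

positives = {"u1": {"a", "b"}, "u2": {"c"}}
all_items = ["a", "b", "c", "d", "e"]
negs = sample_negatives(positives, all_items, n_per_user=2)
```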
35. Hands-on practices [NCF]
• MLP (multilayer perceptron)
• Copy and paste the get_mlp_model function code from 05_NCF.ipynb into the Net2Vis webpage
• https://viscom.net2vis.uni-ulm.de/FAPokAxSXSrKPD9xCYjbDerXwa3WAZzeoL5ST6b1pOkCdPHZDo
36. Reference
• Credit to https://github.com/khuangaf/tibame_recommender_system
• We did not cover the deep-learning topics in much depth:
• MLP
• CNN
• Dense
• Flatten
• Activation
• Add
• Dropout
• …