Docker
Recommendation hands on
2020-12-16
Orozco Hsu
Agenda
• Use case
  • Google search, Amazon online shopping
• What is a recommendation system
• Hands-on practices
  • Baseline
  • Content-based / CF
  • LightFM
  • NCF
Prepare training environment
• Start container
• RUN => cd /tmp; mkdir docker_20201209
• RUN => docker run --rm -e JUPYTER_ENABLE_LAB=yes -v "/tmp/work":/home/jovyan/work -p 20000:8888 jupyter/datascience-notebook
Prepare training environment
• Place docker_20201216 folder in /tmp/work folder
• Code: https://github.com/orozcohsu/ntunhs_2020/tree/master/docker_20201216
Prepare training environment
• Open the Jupyter notebook in a browser (host port 20000, as mapped by the docker run command)
Use case
Recsys
What is a recommendation system
• Which one brings in more jam sales?
• https://medium.com/@przemekszustak/less-is-more-the-paradox-of-choice-behavioural-economics-in-ux-318849b2d70
What is a recommendation system
• Recommendation types
  • Demographic-based Recommendation
  • Content-based Recommendation
  • Collaborative Filtering Recommendation
    • User-based CF
    • Item-based CF
  • Model-based Recommendation
    • Machine learning algorithms (LightFM, ALS, SVD…)
    • Deep learning
What is a recommendation system
• Demographic-based Recommendation
What is a recommendation system
• Content-based Recommendation
What is a recommendation system
• Collaborative Filtering Recommendation
  • User-based
  • Item-based
What is a recommendation system
Ref. https://towardsdatascience.com/introduction-to-recommender-systems-6c66cf15ada
What is a recommendation system
• Rule of thumb: if sparsity (the share of missing user-item ratings) is greater than 50%, CF is not recommended
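A minimal sketch of how sparsity could be measured before deciding on CF. The rating matrix below is made up for illustration, and `sparsity` is a hypothetical helper, not code from the notebooks.

```python
def sparsity(ratings):
    """Fraction of user-item cells that have no rating (None)."""
    cells = [r for row in ratings for r in row]
    missing = sum(1 for r in cells if r is None)
    return missing / len(cells)

# Hypothetical 3 users x 4 items rating matrix; None means "not rated".
ratings = [
    [5, None, 3, None],
    [None, None, 4, 1],
    [2, None, None, None],
]

# 7 of the 12 cells are empty.
print(f"sparsity = {sparsity(ratings):.2%}")
```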
What is a recommendation system
• Normalize the ratings
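One common way to normalize ratings is to subtract each user's mean rating, so that generous and strict raters become comparable. The sketch below assumes that approach; the slide does not say which normalization the notebooks use, and `mean_center` is a hypothetical name.

```python
def mean_center(ratings):
    """Subtract each user's mean rating; unrated cells (None) stay None."""
    centered = []
    for row in ratings:
        rated = [r for r in row if r is not None]
        mean = sum(rated) / len(rated) if rated else 0.0
        centered.append([None if r is None else r - mean for r in row])
    return centered

# Hypothetical ratings: one row per user, one column per item.
ratings = [
    [5, 4, None],   # a generous rater
    [2, None, 1],   # a strict rater
]
centered = mean_center(ratings)
print(centered)
```

After centering, both users show the same preference pattern on the items they rated (above average, then below average), even though their raw scores differ.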
What is a recommendation system
• Model-based Recommendation (matrix decomposition)
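To make matrix decomposition concrete, here is a small sketch that factors a rating matrix into user factors P and item factors Q with stochastic gradient descent. It is an illustration under assumptions (toy data, hand-picked hyperparameters, hypothetical function names), not the method used by any particular library.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating = dot product of user and item factor vectors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings only."""
    errs = [(r - predict(P, Q, u, i)) ** 2
            for u, row in enumerate(ratings)
            for i, r in enumerate(row) if r is not None]
    return (sum(errs) / len(errs)) ** 0.5

def factorize(ratings, n_factors=2, epochs=500, lr=0.01, reg=0.02, seed=0):
    """Fit P (users x factors) and Q (items x factors) by SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in ratings]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(n_factors)] for _ in ratings[0]]
    observed = [(u, i, r) for u, row in enumerate(ratings)
                for i, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)  # gradient step, L2 penalty
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical 4 users x 4 items matrix; None means "not rated".
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [None, 1, 5, 4],
]
P0, Q0 = factorize(ratings, epochs=0)  # untrained factors, for comparison
P, Q = factorize(ratings)
print("RMSE before:", rmse(ratings, P0, Q0), "after:", rmse(ratings, P, Q))
print("predicted rating of user 0 for item 2:", predict(P, Q, 0, 2))
```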
What is a recommendation system
• Model optimization
  • Precision@K: take the top K recommended items and measure the fraction the user actually reacted to
  • Use Precision@K to optimize model performance
• Model evaluation
  • A/B test => online evaluation
  • RMSE, Precision, Recall, F1 => offline evaluation
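Precision@K as described above can be sketched in a few lines. The item names are invented for the example, and `precision_at_k` is a hypothetical helper.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommended items the user actually reacted to."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical ranked recommendations and the items the user reacted to.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "e", "z"}

print(precision_at_k(recommended, relevant, k=3))  # only "b" is in the top 3
print(precision_at_k(recommended, relevant, k=5))  # "b" and "e" are in the top 5
```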
Hands-on practices
• Raw data (visitors’ page views)
Hands-on practices
• Split the data into training and validation sets
Training: [appearance > 1]
Validation: [appearance = 1]
Hold-one-out split for the baseline evaluation
appearance = 1 marks the latest view of each visitor
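The split can be sketched as follows, assuming each page view is a (visitor, item, timestamp) tuple and that appearance counts a visitor's views from newest to oldest. The data layout and the function name `hold_one_out` are assumptions; the notebooks may organize this differently.

```python
def hold_one_out(views):
    """Split views into (train, valid); valid holds each visitor's latest view."""
    newest_first = sorted(views, key=lambda v: v[2], reverse=True)
    appearance = {}
    train, valid = [], []
    for visitor, item, ts in newest_first:
        appearance[visitor] = appearance.get(visitor, 0) + 1
        if appearance[visitor] == 1:   # latest view of this visitor
            valid.append((visitor, item, ts))
        else:                          # appearance > 1: older views
            train.append((visitor, item, ts))
    return train, valid

# Hypothetical page views: (visitor, item, timestamp).
views = [
    ("u1", "a", 1), ("u1", "b", 2), ("u1", "c", 3),
    ("u2", "a", 5), ("u2", "d", 4),
]
train, valid = hold_one_out(views)
print("train:", sorted(train))
print("valid:", sorted(valid))
```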
Hands-on practices
• MRR (Mean reciprocal rank)
https://en.wikipedia.org/wiki/Mean_reciprocal_rank
The earlier the item the visitor actually selected appears in the ranked list, the higher the score; a higher MRR is better.
Baseline MRR = 0.02096631601316712
Not so good
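For reference, MRR averages 1/rank of the held-out item over all visitors, counting 0 when the item is not in the list. A minimal sketch with invented data (`mean_reciprocal_rank` is a hypothetical helper):

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of each target in its ranked list (0 if absent)."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)

# Hypothetical recommendations for three visitors and their held-out items.
ranked_lists = [
    ["a", "b", "c"],   # target "a" at rank 1 -> 1
    ["b", "c", "a"],   # target "a" at rank 3 -> 1/3
    ["c", "a", "b"],   # target "z" missing   -> 0
]
targets = ["a", "a", "z"]

mrr = mean_reciprocal_rank(ranked_lists, targets)
print(mrr)
```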
Hands-on practices [baseline]
• Demo code
• RUN => 01_baseline_small.ipynb
• YOU RUN => 01_baseline.ipynb
Hands-on practices [ubcf, ibcf]
• Raw data (users’ ratings)
[Figure: user rating table; similarity between users measured with Pearson correlation]
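A small sketch of Pearson correlation between two users, computed only over items both have rated. The rating vectors are invented, and returning 0.0 when there is too little overlap is an assumption made for this example.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation over the items both users rated (None = unrated)."""
    common = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    n = len(common)
    if n < 2:
        return 0.0  # assumption: not enough overlap means "no similarity"
    mean_u = sum(a for a, _ in common) / n
    mean_v = sum(b for _, b in common) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in common)
    var_u = sum((a - mean_u) ** 2 for a, _ in common)
    var_v = sum((b - mean_v) ** 2 for _, b in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant rater has no defined correlation
    return cov / sqrt(var_u * var_v)

# Hypothetical ratings of four items by three users.
user1 = [5, 3, None, 1]
user2 = [4, 2, 5, None]   # same taste as user1, slightly stricter
user3 = [1, 3, 2, 5]      # opposite taste to user1

print(pearson(user1, user2))
print(pearson(user1, user3))
```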
Hands-on practices [ubcf, ibcf]
• Demo code
• RUN => 02_Collaborative_Filtering_small.ipynb
• YOU RUN => 02_Collaborative_Filtering.ipynb
Hands-on practices [ubcf, ibcf]
• ubcf
[Figure: sources user2movie (User1, User4, …; item1 to item5) and user_similarity_matrix, combined into the recommendation results]
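A sketch of how user-based CF might combine the two sources named on the slide, `user2movie` and `user_similarity_matrix`: the predicted rating is a similarity-weighted average of the ratings given by similar users. All values and the function `predict_ubcf` are hypothetical; the notebook's actual data structures may differ.

```python
def predict_ubcf(user, item, user2movie, user_similarity_matrix):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, ratings in user2movie.items():
        if other == user or item not in ratings:
            continue
        sim = user_similarity_matrix[user].get(other, 0.0)
        if sim <= 0:
            continue  # ignore dissimilar users
        num += sim * ratings[item]
        den += sim
    return num / den if den else None  # None: no similar user rated the item

# Hypothetical data in the shape suggested by the slide.
user2movie = {
    "User1": {"item1": 5, "item2": 3},
    "User2": {"item1": 4, "item3": 4},
    "User4": {"item2": 1, "item3": 2},
}
user_similarity_matrix = {
    "User1": {"User2": 0.8, "User4": 0.2},
}

# (0.8 * 4 + 0.2 * 2) / (0.8 + 0.2)
print(predict_ubcf("User1", "item3", user2movie, user_similarity_matrix))
```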
Hands-on practices [ubcf, ibcf]
• ibcf
[Figure: sources user2movie (User1, User4, …; item1 to item5) and item_similarity_matrix]
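Item-based CF can be sketched in the same spirit: score each unseen item by its similarity to the items the user already rated. Cosine similarity over rating columns (missing ratings treated as 0) is an assumption here, as are the data and the name `recommend_ibcf`.

```python
from math import sqrt

def item_vector(user2movie, item):
    """Column of ratings for one item, one entry per user (0 if unrated)."""
    return [ratings.get(item, 0) for ratings in user2movie.values()]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend_ibcf(user, user2movie):
    """Rank unseen items by similarity-weighted sum of the user's own ratings."""
    rated = user2movie[user]
    all_items = {i for r in user2movie.values() for i in r}
    scores = {}
    for candidate in all_items - set(rated):
        cand_vec = item_vector(user2movie, candidate)
        scores[candidate] = sum(
            cosine(cand_vec, item_vector(user2movie, j)) * r
            for j, r in rated.items()
        )
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ratings.
user2movie = {
    "User1": {"item1": 5, "item2": 4},
    "User2": {"item1": 5, "item2": 5, "item3": 1},
    "User4": {"item3": 5, "item4": 4},
}
print(recommend_ibcf("User1", user2movie))
```

In this toy data item3 ranks ahead of item4 for User1, because User2 rated item3 together with the items User1 liked, while item4 shares no raters with them.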
Hands-on practices [cb]
• Demo code
• RUN => 03_Content_Based_Filtering.ipynb
TF-IDF measures how distinctive a word is by comparing the number of times the word appears in a document with the number of documents the word appears in
Extract keywords from text with TF-IDF
A matrix-vector dot product is used to compute similarity
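The idea can be sketched without any library: build L2-normalized TF-IDF rows, then take the dot product of the matrix with one row to get similarities. This uses the plain tf × log(N/df) weighting, which is an assumption; library implementations often use smoothed variants. The document texts are invented.

```python
from math import log, sqrt

def tfidf_matrix(docs):
    """One L2-normalized TF-IDF row per document (plain tf * log(N/df))."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    n_docs = len(docs)
    df = {w: sum(1 for doc in tokenized if w in doc) for w in vocab}
    matrix = []
    for doc in tokenized:
        row = [doc.count(w) / len(doc) * log(n_docs / df[w]) for w in vocab]
        norm = sqrt(sum(x * x for x in row))
        matrix.append([x / norm for x in row] if norm else row)
    return matrix

# Hypothetical item descriptions.
docs = [
    "harry potter wizard school magic",
    "young wizard learns magic at school",
    "cooking pasta with tomato sauce",
]
matrix = tfidf_matrix(docs)

# Matrix dot vector: similarity of every document to document 0.
query = matrix[0]
scores = [sum(a * b for a, b in zip(row, query)) for row in matrix]
print(scores)
```

Because the rows are normalized, each dot product is a cosine similarity: the document scores about 1 against itself, a positive value against the other wizard text, and 0 against the cooking text.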
Hands-on practices [cb]
• Find the items most similar to Harry Potter
Hands-on practices [LightFM]
• It makes it possible to incorporate both item and user metadata into the traditional matrix factorization algorithms
• It represents each user and item as the sum of the latent representations of their features, thus allowing recommendations to generalize to new items (via item features) and to new users (via user features)
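The "sum of latent representations" idea can be illustrated without the library. In this sketch a user or item embedding is the sum of its feature embeddings and the score is their dot product; bias terms are left out for brevity, and every feature name and number is made up.

```python
def embed(features, feature_embeddings):
    """Embedding of a user or item = sum of its features' latent vectors."""
    dim = len(next(iter(feature_embeddings.values())))
    total = [0.0] * dim
    for f in features:
        for d, value in enumerate(feature_embeddings[f]):
            total[d] += value
    return total

def score(user_features, item_features, user_emb, item_emb):
    """Affinity = dot product of the summed user and item embeddings."""
    u = embed(user_features, user_emb)
    i = embed(item_features, item_emb)
    return sum(a * b for a, b in zip(u, i))

# Hypothetical learned feature embeddings (2 latent dimensions).
user_emb = {"age:20s": [1.0, 0.0], "likes:fantasy": [0.0, 2.0]}
item_emb = {
    "genre:fantasy": [0.0, 1.0],
    "genre:cooking": [1.0, -1.0],
    "format:book": [0.5, 0.5],
}

# A brand-new user and brand-new items can be scored from features alone.
new_user = ["age:20s", "likes:fantasy"]
fantasy_book = ["genre:fantasy", "format:book"]
cook_book = ["genre:cooking", "format:book"]
print(score(new_user, fantasy_book, user_emb, item_emb))
print(score(new_user, cook_book, user_emb, item_emb))
```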
Hands-on practices [LightFM]
• Factor the user/movie matrix into user and movie feature matrices with F features
Hands-on practices [LightFM]
• Demo code
• RUN => 04_LightFM_small.ipynb
• YOU RUN => 04_LightFM.ipynb
no_components: the dimensionality of the feature latent embeddings
learning_schedule: 'adagrad', 'adadelta'
loss: 'logistic', 'bpr', 'warp', 'warp-kos'
learning_rate: initial learning rate for the adagrad learning schedule
Ref. https://making.lyst.com/lightfm/docs/_modules/lightfm/lightfm.html
Hands-on practices [NCF]
• Confusion matrix
• Acc = (TP+TN)/(TP+FP+FN+TN)
• Recall = (TP)/(TP+FN)
• Precision = (TP)/(TP+FP)
                     Fact (H1 = True)   Fact (H1 = False)
Prediction (True)    TP                 FP
Prediction (False)   FN                 TN
Add some fake background (negative examples) if your precision is low
Example: add FP (false-positive) examples from testing as negative examples to train a new model
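The three formulas translate directly into code. The counts below are invented for illustration, and `confusion_metrics` is a hypothetical helper.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, recall and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }

# Hypothetical counts from a validation run.
metrics = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```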
Hands-on practices [NCF]
• Demo code
• YOU RUN => 05_NCF.ipynb
NCF MRR = 0.24179100068337941
Implicit feedback differs from explicit feedback in that we only see the items each user may be interested in.
To train a deep-learning model, we need to create some negative training examples. The fastest way is to generate them at random.
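Random negative sampling can be sketched as follows: for every observed (user, item) pair, draw a few items the user has not interacted with and label them 0. The number of negatives per positive, the seed, and the name `sample_negatives` are assumptions for this example, not values taken from 05_NCF.ipynb.

```python
import random

def sample_negatives(positives, all_items, n_neg=2, seed=0):
    """Return (user, item, label) rows: label 1 = observed, 0 = sampled negative."""
    rng = random.Random(seed)
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)
    examples = [(user, item, 1) for user, item in positives]
    for user, _ in positives:
        # Candidates are items this user has never interacted with.
        candidates = [i for i in all_items if i not in seen[user]]
        for neg in rng.sample(candidates, min(n_neg, len(candidates))):
            examples.append((user, neg, 0))
    return examples

# Hypothetical implicit feedback: (user, item) pairs that were viewed.
positives = [("u1", "a"), ("u1", "b"), ("u2", "c")]
all_items = ["a", "b", "c", "d", "e"]

examples = sample_negatives(positives, all_items)
for row in examples:
    print(row)
```

A randomly drawn negative may still be an item the user would have liked, so these labels are noisy; that is the trade-off for speed.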
Hands-on practices [NCF]
• MLP (multilayer perceptron)
• Copy and paste the get_mlp_model function code from 05_NCF.ipynb into the Net2Vis web page
• https://viscom.net2vis.uni-ulm.de/FAPokAxSXSrKPD9xCYjbDerXwa3WAZzeoL5ST6b1pOkCdPHZDo
Reference
• Credit to https://github.com/khuangaf/tibame_recommender_system
• We didn’t cover much of the deep-learning material:
• MLP
• CNN
• Dense
• Flatten
• Activation
• Add
• Dropout
• …
