SESSION - BASED RECOMMENDATIONS WITH
RECURRENT NEURAL NETWORKS
ICLR 2016
Balázs Hidasi
Gravity R&D, Head of Data Mining and Research
Ph.D. thesis: Context-aware factorization methods for implicit
feedback based recommendation problems (2016)
Alexandros Karatzoglou
Google, Staff Research Scientist (since Sep 2019)
Telefonica Research, Scientific Director
Joint work
● Parallel Recurrent Neural Network Architectures for Feature-rich Session-based Recommendations (RecSys '16)
● Recurrent Neural Networks with Top-k Gains for Session-based Recommendations (CIKM '18)
Author
About us
Gravity is a personalization engine vendor offering a product
portfolio called Yusp, filling multiple needs using the same
underlying technology.
Mission
To enable businesses to be relevant and not forgotten.
Gravity R&D Inc.
Source: https://www.yusp.com/
Gravity R&D Inc.
Source: https://www.yusp.com/
Source: https://ko.wikipedia.org/wiki/%ED%85%94%EB%A0%88%ED%8F%AC%EB%8B%88%EC%B9%B4
● In e-commerce systems,
● it is hard to track users
● even when it is possible,
- users often have only one or two sessions
- which should be handled independently
● MF is hard to use here and not accurate
● item-to-item similarity is used instead
- while effective,
- it only takes into account the last click
of the user, ignoring the past clicks
● Even if the user viewed cosmetics -> vacuum cleaner -> milk -> clothes,
● the earlier information is ignored and only clothes-related items are recommended
Related Work : Co-Occurrence, Item2Vec
Source: https://brunch.co.kr/@goodvc78/16
Extended version of General Factorization Framework
- two kinds of latent representation for items
- representation as the item itself
- representation as part of a session
- does not consider ordering within the session
🤔
Source: https://medium.com/snipfeed/time-dependent-recommender-systems-part-2-92e8dfaf1199
Contribution
- first to use an RNN for this task (there had been earlier attempts to apply deep learning)
- Session-Parallel Mini-batch
- Sampling on the output
- Ranking Loss
Model Architecture
Model I/O
Input : item click (1-of-N encoding) or event (weighted sum of items)
Output : item of the next event
Without an embedding layer, 1-of-N encoding is always better.
Model I/O
1-of-N encoding vs. weighted sum of items
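The paper leaves the exact event weighting open, so here is a minimal sketch of the two input options; the exponential decay scheme and function names are illustrative assumptions, not the paper's recipe.

```python
def one_hot(item_index, n_items):
    """1-of-N encoding of the current click."""
    v = [0.0] * n_items
    v[item_index] = 1.0
    return v

def weighted_session(events, n_items, decay=0.8):
    """Weighted sum of all items seen so far in the session,
    discounting older clicks (decay value assumed for illustration)."""
    v = [0.0] * n_items
    for age, item in enumerate(reversed(events)):
        v[item] += decay ** age
    return v
```

The one-hot variant feeds only the current click to the network; the weighted sum compresses the whole session prefix into one vector.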
Model Architecture : GRU(Gated Recurrent Unit)
Basic RNN structure
Source: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
Model Architecture : GRU(Gated Recurrent Unit)
In experiments, GRU outperformed the classic RNN and LSTM
Source: https://towardsdatascience.com/illustrated-guide-to-lstms-and-gru-s-a-step-by-step-explanation-44e9eb85bf21
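One GRU step can be sketched as below, using scalar weights and a toy click signal purely for illustration (the real model uses weight matrices over the item vocabulary):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One GRU step for scalar input/state; p holds the six weights."""
    z = sigmoid(p["Wz"] * x + p["Uz"] * h)                 # update gate
    r = sigmoid(p["Wr"] * x + p["Ur"] * h)                 # reset gate
    h_tilde = math.tanh(p["Wh"] * x + p["Uh"] * (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde                       # blend old and new state

params = {k: 0.5 for k in ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]}
h = 0.0
for x in [1.0, 0.0, 1.0]:  # a toy click sequence
    h = gru_step(x, h, params)
```

The update gate z decides how much of the candidate state replaces the old one, which is why GRUs handle longer dependencies better than a vanilla RNN.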
SESSION-PARALLEL MINI-BATCHES
1) session lengths vary, even more than sentence lengths do;
some sessions consist of only 2 events
2) to capture how a session evolves over time
Sessions are assumed to be independent, so the appropriate hidden state is reset when a session switch occurs
-> good luck implementing that in code...
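A minimal sketch of session-parallel batching: each of `batch_size` lanes walks one session, and when a session ends, the next unused session takes over that lane and its hidden state is flagged for reset. Simplifications: the generator stops once sessions run out (dropping tail events of still-active lanes) and assumes every session has at least 2 events; names are made up for illustration.

```python
def session_parallel_batches(sessions, batch_size):
    """Yield (inputs, targets, reset_mask) steps over `sessions`."""
    next_session = batch_size
    # each lane tracks [session index, event position within it]
    lanes = [[i, 0] for i in range(batch_size)]
    while True:
        inputs, targets, reset = [], [], []
        for lane in lanes:
            s, pos = lane
            if s is None:
                return  # ran out of sessions
            inputs.append(sessions[s][pos])
            targets.append(sessions[s][pos + 1])
            if pos + 2 >= len(sessions[s]):  # session exhausted after this step
                reset.append(True)           # hidden state of this lane must be reset
                lane[0] = next_session if next_session < len(sessions) else None
                lane[1] = 0
                next_session += 1
            else:
                reset.append(False)
                lane[1] += 1
        yield inputs, targets, reset
```

During training, the GRU's hidden state is zeroed at the positions where `reset_mask` is True before the next step is fed in.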
Sampling on the output
Source: Deep Neural Networks for YouTube Recommendations
Why?
● Calculating a score for every item is impractical at catalog scale
● We have to sample items when calculating scores
How?
● Sample based on popularity
● With implicit data,
● unseen data → dislike? or didn't know about it?
● e.g. IU's new album: skipped out of disinterest, or simply unaware of it?
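Popularity-proportional negative sampling can be sketched as below; the function name and the use of raw click counts as the popularity proxy are assumptions for illustration. In GRU4Rec this effect comes for free: the other sessions' items within the same mini-batch serve as negatives, and those naturally follow the popularity distribution.

```python
import random

def popularity_negatives(click_counts, positives, k, rng=random):
    """Draw k negative items, each with probability proportional to its
    click count, skipping items that are positives for this step."""
    items = list(click_counts)
    weights = [click_counts[i] for i in items]
    negatives = []
    while len(negatives) < k:
        item = rng.choices(items, weights=weights, k=1)[0]
        if item not in positives:
            negatives.append(item)
    return negatives
```

Popular items the user did not click are the most informative negatives: the user very likely knew about them and still skipped them.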
Sampling on the output
Source: https://instagram-engineering.com/powered-by-ai-instagrams-explore-recommender-system-7ca901d2a882
Ranking Loss
Input : Killing Eve -> Before Sunset -> Elle -> One Fine Spring Day -> Be Melodramatic -> ??
Output ground truth : When Harry Met Sally
Candidates : [Believer, When Harry Met Sally, The Yellow Sea, Joker, Before Sunrise]
Predictions : [0.1, 0.8, 0.2, 0.5, 0.9]
After sorting : [Before Sunrise, When Harry Met Sally, Joker, The Yellow Sea, Believer]
Loss
Pointwise : push When Harry Met Sally toward 1 and everything else toward 0.
Pairwise : widen the score gap in each pair (When Harry Met Sally, Believer), (When Harry Met Sally, The Yellow Sea), (When Harry Met Sally, Joker), (When Harry Met Sally, Before Sunrise)!
Listwise : turn [Before Sunrise, When Harry Met Sally, Joker, The Yellow Sea, Believer] into
[When Harry Met Sally, Before Sunrise, Joker, The Yellow Sea, Believer]!
→ requires sorting, O(N log N) → Nope!
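The pointwise and pairwise objectives above can be sketched as follows (a toy, un-batched version; function names are illustrative):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pointwise_loss(scores, target):
    """Binary cross-entropy: target item toward 1, all others toward 0."""
    loss = 0.0
    for i, s in enumerate(scores):
        p = sigmoid(s)
        loss += -math.log(p) if i == target else -math.log(1 - p)
    return loss / len(scores)

def pairwise_loss(scores, target):
    """Pairwise loss: the target should outscore every negative."""
    negs = [s for i, s in enumerate(scores) if i != target]
    return -sum(math.log(sigmoid(scores[target] - s)) for s in negs) / len(negs)
```

On the slide's toy predictions with target index 1, the pairwise loss is dominated by the pair against the highest-scoring negative (0.9), exactly the item that wrongly outranks the ground truth.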
Bayesian Personalized Ranking
UAI 2009
Source: https://arxiv.org/pdf/1205.2618.pdf
- When dealing with implicit data,
- seen items should be ranked higher than unseen items
- we don't know the ranking among seen items
- nor among unseen items
Bayesian Personalized Ranking
UAI 2009
Source: https://arxiv.org/pdf/1205.2618.pdf
BPR-max
Recurrent Neural Networks with Top-k Gains for Session-based Recommendations
Source: https://www.slideshare.net/balazshidasi/gru4rec-v2-recurrent-neural-networks-with-topk-gains-for-sessionbased-recommendations
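A sketch of the BPR-max idea from the follow-up Top-k Gains paper: each pairwise comparison is weighted by the softmax of the negative scores, plus a regularization term on the negative scores. This is a scalar toy version from my reading of that paper; the regularization weight is an illustrative default.

```python
import math

def bpr_max_loss(scores, target, reg=1.0):
    """BPR-max: softmax-weighted pairwise sigmoids plus score regularization."""
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    r_i = scores[target]
    negs = [s for j, s in enumerate(scores) if j != target]
    # softmax over negative scores (stabilized by subtracting the max)
    m = max(negs)
    exps = [math.exp(s - m) for s in negs]
    z = sum(exps)
    weights = [e / z for e in exps]
    main = -math.log(sum(w * sig(r_i - r_j) for w, r_j in zip(weights, negs)))
    penalty = reg * sum(w * r_j * r_j for w, r_j in zip(weights, negs))
    return main + penalty
```

The softmax weighting focuses the gradient on the negatives that currently score closest to the top, which is what improves top-k ranking.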
Experiments - Datasets (RSC15)
● Click and purchase logs collected from users of an e-commerce site.
● The paper uses only the click data.
● Sessions of length 1 are excluded; 6 months of data are used.
● Train Set : 7,966,257 sessions of 31,637,239 clicks on 37,483 items
→ 3.97 clicks / session
● The sessions of the days subsequent to the training period form the Test Set.
● Items of the Test Set that do not appear in the Train Set are all removed.
● Test Set : 15,324 sessions of 71,222 events
→ 4.64 clicks / session
● a YouTube-like service
● similar to RSC15
● an item2item recommender was already running, so it probably influenced user behavior
Selection Bias?
● very long sessions are suspected to be bots
Experiments - Datasets (YouTube-like OTT)
Metric : Recall@20 and MRR@20
Source: https://ko.wikipedia.org/wiki/%EC%A0%95%EB%B0%80%EB%8F%84%EC%99%80_%EC%9E%AC%ED%98%84%EC%9C%A8
https://www.blabladata.com/2014/10/26/evaluating-recommender-systems/
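The two metrics can be computed as below; a straightforward sketch, using the fact that in this setup each test case has exactly one target item (the next click).

```python
def recall_and_mrr_at_k(ranked_lists, targets, k=20):
    """Recall@k: fraction of cases whose target appears in the top k.
    MRR@k: mean of 1/rank for targets ranked within the top k, else 0."""
    hits, rr = 0, 0.0
    for ranked, target in zip(ranked_lists, targets):
        topk = ranked[:k]
        if target in topk:
            hits += 1
            rr += 1.0 / (topk.index(target) + 1)
    n = len(targets)
    return hits / n, rr / n
```

Recall@20 ignores position within the top 20 (it matches a UI where 20 items are shown at once), while MRR@20 rewards placing the target near the top.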
Baselines Experiments
Item-KNN is by far the strongest → used as the baseline model
GRU Model Experiments
● weight initialization : uniform on [-x, x]
● adagrad > rmsprop
● GRU > LSTM, RNN
● a single GRU layer is enough (probably because sessions are relatively short)
● increasing the GRU size helps
● tanh works best as the activation
● little difference in training / inference speed between 100 and 1000 units
● being able to retrain frequently matters for a recommender system
● much better than KNN
● cross-entropy is not a pairwise loss
● with CE, performance drops as the number of units grows
● little difference between TOP1 and BPR
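For reference, the TOP1 loss compared above pushes each sampled negative below the target and regularizes negative scores toward zero; a toy scalar sketch:

```python
import math

def top1_loss(scores, target):
    """TOP1: average over negatives of sigmoid(r_j - r_i) + sigmoid(r_j^2)."""
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    r_i = scores[target]
    negs = [s for j, s in enumerate(scores) if j != target]
    return sum(sig(r_j - r_i) + sig(r_j * r_j) for r_j in negs) / len(negs)
```

The second term keeps negative scores from drifting upward over training, a role the BPR formulation lacks.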
GRU Model Experiments
Author Code
Code released by the authors, in Theano : https://github.com/hidasib/GRU4Rec
A short review of this code : https://www.notion.so/zmin/Code-Review-Session-Based-RecSys-7edaa306e3424426aa6956e903fdcc2a
A Keras implementation of this code : https://github.com/pcerdam/KerasGRU4Rec
From RNN to Transformer
Source: Behavior Sequence Transformer for E-commerce Recommendation in Alibaba
From RNN to CNN
