Factorization Machines
- Introduction
Bartłomiej Twardowski
18.10.2016
Warsaw Data Science Meetup
Polish English?
• Support Vector Machines => “maszyna wektorów nośnych”
• Matrix Factorization => “faktoryzacja macierzy”
• Factorization Machines => “maszyna faktoryzująca”?
• LMGTFY :-) Let’s stick to the English name then!
Motivation
• one of the most successful models, with great expressiveness
• a great starting point for context-aware recommendations
• considered part of the basic toolbox for advertisers and Kagglers
• an FFM presentation from years ago was given at RecSys 2016 (still, almost nothing new in it :-( )
• I considered it a fun and original subject for a meetup
2015.10.6 - meetup about recommender systems
Not motivated enough?
Success stories.
1. competitions
2. appears more and more often in DS job offers
Factorization Machines
• S. Rendle 2010 [1]
• combines the advantages of Support Vector Machines (SVM) with factorization models
• generic (real-valued features)
• incredibly good for sparse data
• model expressiveness
MF - quick recap
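Simplest problem formulation [3]:
• U - user set, I - item set
• matrix $R \in \mathbb{R}^{|U| \times |I|}$ contains user ratings
• find the best representation in a k-dimensional latent space for users P (|U| × k) and items Q (|I| × k), so that the matrix $\hat{R}$ is defined as:
$$\hat{R} = P Q^T$$
• to predict a rating, with $p_u$ and $q_i$ the rows of P and Q for user u and item i:
$$\hat{r}_{ui} = p_u^T q_i$$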
MF - quick recap
with regularization[4]:
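The objective on this slide was an image and did not survive. A common form, assumed here in the spirit of [4], minimizes the squared error over known ratings plus an L2 penalty: $\sum_{(u,i)} (r_{ui} - p_u^T q_i)^2 + \lambda (\|p_u\|^2 + \|q_i\|^2)$, with $p_u$, $q_i$ the rows of P and Q. Below is a minimal SGD sketch of that assumed objective; `train_mf`, `rmse`, `lam`, `lr` and the toy ratings are hypothetical, and the hyperparameters are arbitrary.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=2, lam=0.05, lr=0.02, epochs=500, seed=0):
    """Fit P (n_users x k) and Q (n_items x k) by SGD on squared error + L2 penalty."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            p_u = P[u].copy()  # keep the old user vector for the item update
            # Gradient step; the constant factor 2 is absorbed into lr.
            P[u] += lr * (err * Q[i] - lam * P[u])
            Q[i] += lr * (err * p_u - lam * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error on the given (user, item, rating) triples."""
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings])))

# Toy (user index, item index, rating) triples, made up for illustration.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (0, 2, 1.0), (1, 2, 4.0),
           (1, 3, 5.0), (2, 0, 1.0), (2, 2, 5.0)]
P, Q = train_mf(ratings, n_users=3, n_items=4)
print("train RMSE:", rmse(ratings, P, Q))
```

Larger `lam` shrinks the factors and trades training error for smoother predictions on unseen (user, item) pairs.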
Linear & Poly2 models
simple linear regression model:
$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i$$
adding two-way interactions:
$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} v_{i,j} x_i x_j$$
FM Model
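for two-way interactions:
$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle x_i x_j$$
model parameters:
$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^{n}, \quad V \in \mathbb{R}^{n \times k}$$
For each $x_i$ we have a dedicated vector $v_i$ with k features.
Then, instead of a weight $w_{ij}$ for each feature interaction (written $v_{i,j}$ in the Poly2 model), we have a dot product:
$$\langle v_i, v_j \rangle = \sum_{f=1}^{k} v_{i,f} v_{j,f}$$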
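A direct, deliberately naive reading of the two-way FM model in code may help here. This is only a sketch: `fm_predict_naive` and the toy values are made up for illustration, and each pairwise weight is the dot product of the two feature vectors (rows of `V`).

```python
import numpy as np

def fm_predict_naive(x, w0, w, V):
    """Two-way FM: w0 + <w, x> + sum over i < j of <V[i], V[j]> * x[i] * x[j]."""
    n = len(x)
    y = w0 + float(w @ x)
    for i in range(n):
        for j in range(i + 1, n):
            # k multiplications per pair, n*(n-1)/2 pairs in total
            y += float(V[i] @ V[j]) * x[i] * x[j]
    return y

x = np.array([1.0, 0.0, 2.0])
w0 = 0.5
w = np.array([0.1, 0.2, 0.3])
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])   # one k=2 vector per feature

y = fm_predict_naive(x, w0, w, V)
print(y)  # 0.5 + 0.7 + 2.0, i.e. about 3.2
```

The double loop visits every feature pair, which is where the quadratic cost comes from.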
Wait, it’s O(kn²)! Not linear!
Making it O(kn)
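$$\begin{bmatrix}
x_{11} & x_{12} & x_{13} & \dots & x_{1n} \\
x_{21} & x_{22} & x_{23} & \dots & x_{2n} \\
x_{31} & x_{32} & x_{33} & \dots & x_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_{d1} & x_{d2} & x_{d3} & \dots & x_{dn}
\end{bmatrix}$$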
Simplified version
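for the k = 1, n = 2 perspective:
$$(a + b)^2 = a^2 + 2ab + b^2$$
$$ab = \frac{1}{2}\left[(a + b)^2 - a^2 - b^2\right]$$
let:
$$v_1 x_1 = a, \quad v_2 x_2 = b$$
then:
$$v_1 v_2 x_1 x_2 = \frac{1}{2}\left[(v_1 x_1 + v_2 x_2)^2 - (v_1 x_1)^2 - (v_2 x_2)^2\right]$$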
And now it looks very familiar :-)
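The same trick for general k and n, as given in [1], rewrites the pairwise term as $\frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} x_i \right)^2 - \sum_{i=1}^{n} v_{i,f}^2 x_i^2 \right]$, which needs only O(kn) operations. A small numerical check of that identity, with hypothetical helper names and random toy data:

```python
import numpy as np

def pairwise_naive(x, V):
    """O(k n^2): explicit sum over all pairs i < j."""
    n = len(x)
    return sum(float(V[i] @ V[j]) * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))

def pairwise_linear(x, V):
    """O(k n): half of (square of sums minus sum of squares), per factor f."""
    s = V.T @ x                    # shape (k,): sum_i v_{i,f} x_i
    s_sq = (V ** 2).T @ (x ** 2)   # shape (k,): sum_i v_{i,f}^2 x_i^2
    return 0.5 * float(np.sum(s ** 2 - s_sq))

rng = np.random.default_rng(0)
x = rng.standard_normal(6)
V = rng.standard_normal((6, 3))
print(abs(pairwise_naive(x, V) - pairwise_linear(x, V)))  # tiny float round-off
```

With sparse x only the non-zero entries contribute to both sums, so the cost drops further to k times the number of non-zeros.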
FM vs SVM
• FM combines the advantages of SVM and factorization models
• general predictor working on real-valued features (like SVM)
• estimates interactions well under huge sparsity, where SVMs fail (e.g. recommender systems)
• the model equation of FMs can be calculated in linear time
• comparable to a polynomial kernel in SVM, but works for very sparse data and works fast
Use case: Context-Aware
Recommender Systems
• U = {Alice (A), Bob (B), Charlie (C), ...}
• I = {Titanic (TI), Notting Hill (NH), Star Wars (SW), Star Trek (ST), ...}
• S = {(A, TI, 2010-1, 5), (A, NH, 2010-2, 3), (A, SW, 2010-4, 1), (B, SW, 2009-5, 4), (B, ST, 2009-8, 5), (C, TI, 2009-9, 1), (C, SW, 2009-12, 5)}
• Example from [1]
Example of input data
preparation
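The figure for this slide is not available. As a rough sketch of the idea, each observation in S can become one real-valued row: a one-hot block for the user, a one-hot block for the item, and the time as a plain number. The column layout, the helper names, and the choice to count months with January 2009 as 1 are assumptions of this sketch, and the example in [1] uses additional feature blocks.

```python
import numpy as np

users = ["A", "B", "C"]
items = ["TI", "NH", "SW", "ST"]
S = [("A", "TI", "2010-1", 5), ("A", "NH", "2010-2", 3), ("A", "SW", "2010-4", 1),
     ("B", "SW", "2009-5", 4), ("B", "ST", "2009-8", 5),
     ("C", "TI", "2009-9", 1), ("C", "SW", "2009-12", 5)]

def months_since_2009(stamp):
    """Turn 'YYYY-M' into a month counter where January 2009 maps to 1."""
    year, month = (int(part) for part in stamp.split("-"))
    return (year - 2009) * 12 + month

def encode(S, users, items):
    """Build the design matrix X (one row per observation) and the target vector y."""
    X = np.zeros((len(S), len(users) + len(items) + 1))
    y = np.zeros(len(S))
    for row, (u, i, t, r) in enumerate(S):
        X[row, users.index(u)] = 1.0                # user indicator
        X[row, len(users) + items.index(i)] = 1.0   # item indicator
        X[row, -1] = months_since_2009(t)           # time as a real value
        y[row] = r
    return X, y

X, y = encode(S, users, items)
print(X[0])  # Alice, Titanic, month 13
```

Every row has only three non-zero entries no matter how many users and items exist, which is the sparse setting FM is meant for.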
Why use FM for this?
The drawback of tensor factorization models and
even more for specialized factorization models is
that [1]:
(1) they are not applicable to standard prediction
data (e.g. a real valued feature vector)
(2) that specialized models are usually derived
individually for a specific task requiring effort in
modeling and design of a learning algorithm.
How about ranking?
Go for pairwise approach!
http://www.tongji.edu.cn/~qiliu/lor_vs.html
Model expressiveness
FM ~ MF
given
the model will then mimic a biased MF:
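The two formulas on this slide were images. Following [1], the assumed setup is $n = |U \cup I|$ with $x_j = 1$ only for the active user u and item i (and 0 elsewhere), in which case the FM equation collapses to $\hat{y}(x) = w_0 + w_u + w_i + \langle v_u, v_i \rangle$. A small numerical check of that claim; the function name, indices and random parameters are made up for illustration.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Two-way FM using the linear-time form of the pairwise term."""
    s = V.T @ x
    s_sq = (V ** 2).T @ (x ** 2)
    return w0 + float(w @ x) + 0.5 * float(np.sum(s ** 2 - s_sq))

n_users, n_items, k = 3, 4, 2
rng = np.random.default_rng(1)
w0 = 0.3
w = rng.standard_normal(n_users + n_items)
V = rng.standard_normal((n_users + n_items, k))

u, i = 1, 2                       # hypothetical active user and item
x = np.zeros(n_users + n_items)
x[u] = 1.0                        # user indicator
x[n_users + i] = 1.0              # item indicator (items follow the users)

fm_score = fm_predict(x, w0, w, V)
mf_score = w0 + w[u] + w[n_users + i] + float(V[u] @ V[n_users + i])
print(abs(fm_score - mf_score))   # tiny float round-off
```

All other pairwise terms vanish because every other $x_j$ is zero, which leaves exactly the global bias, the user and item biases, and the user-item dot product.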
FM ~ PITF
given user × item × tag interactions as:
FM will mimic a pairwise interaction
tensor factorization model (PITF) [7]:
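The formulas here were images as well. As far as [1] describes it, the encoding is $n = |U \cup I \cup T|$ with indicators for the active user u, item i and tag t, so the FM equation expands to:

$$\hat{y}(x) = w_0 + w_u + w_i + w_t + \langle v_u, v_i \rangle + \langle v_u, v_t \rangle + \langle v_i, v_t \rangle$$

When the model is trained for pairwise ranking between two tags of the same (u, i) pair, every term that does not depend on t cancels, which leaves:

$$\hat{y}(x) := w_t + \langle v_u, v_t \rangle + \langle v_i, v_t \rangle$$

This is close to PITF; per [1], the remaining differences are the tag bias $w_t$ and the fact that the tag factors $v_t$ are shared between the user-tag and item-tag interactions.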
And others
(e.g. factorized NN, KNN++, SVD++, …)
presented in [2].
Field-aware FM
• Has been used to win two CTR competitions [5].
• Introduces grouped features (fields), e.g. user, color, time.
• Learns a different set of latent factors for every pair of fields:
$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_{i,f(j)}, v_{j,f(i)} \rangle x_i x_j$$
where f(i) is the field of feature i.
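A minimal sketch of the field-aware prediction may make the indexing clearer: each feature keeps one latent vector per field, and the pair (i, j) uses feature i's vector for the field of j together with feature j's vector for the field of i. The function name, array layout and toy numbers are assumptions of this sketch, not taken from libffm.

```python
import numpy as np

def ffm_predict(x, fields, w0, w, V):
    """V has shape (n_features, n_fields, k); fields[i] is the field of feature i."""
    n = len(x)
    y = w0 + float(w @ x)
    for i in range(n):
        if x[i] == 0.0:
            continue                      # skip inactive features
        for j in range(i + 1, n):
            if x[j] == 0.0:
                continue
            y += float(V[i, fields[j]] @ V[j, fields[i]]) * x[i] * x[j]
    return y

fields = [0, 1, 1]                        # feature 0 in field 0, features 1 and 2 in field 1
V = np.arange(6, dtype=float).reshape(3, 2, 1)
x = np.ones(3)
y = ffm_predict(x, fields, w0=1.0, w=np.zeros(3), V=V)
print(y)  # 1 + (1*2) + (1*4) + (3*5) = 22.0
```

Compared with plain FM, the parameter count grows from n·k to n·(number of fields)·k, which is the price for field-specific interactions.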
Available implementations
• libfm (http://www.libfm.org/), SGD/ALS/MCMC
• FM for Julia (https://github.com/btwardow/FactorizationMachines.jl)
• fastFM (https://github.com/ibayer/fastFM)
• DiFacto (https://github.com/dmlc/difacto)
• lightfm
• spark-libFM, libffm
My experiments with FM on GPU
The same implementation moved from numpy to Theano was
~7x faster! Without using any special GPU tricks.
Going for click prediction?
• feature engineering (counting features, like historical CTR)
• hashing trick
• L1, FTRL, using e.g. Vowpal Wabbit (vw)
• making new features - e.g. decision tree encoding
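The hashing trick from the list above can be sketched in a few lines: string features are mapped straight to column indices by a hash function, so no feature dictionary has to be kept in memory. The helper name, the bucket count and the use of MD5 as a stable hash are choices made for this sketch; real tools such as vw use their own hash functions.

```python
import hashlib
import numpy as np

def hash_features(tokens, n_buckets=16):
    """Map string features to a fixed-size count vector via a stable hash."""
    x = np.zeros(n_buckets)
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).hexdigest()
        x[int(digest, 16) % n_buckets] += 1.0   # collisions simply add up
    return x

x = hash_features(["user=A", "item=TI", "month=2010-1"])
print(x.nonzero()[0])  # bucket indices that received at least one feature
```

A small `n_buckets` makes collisions likely; in practice the bucket count is chosen large enough that collisions rarely hurt accuracy.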
How about now? :-)
References
[1] Rendle, Steffen. "Factorization machines." 2010 IEEE International
Conference on Data Mining. IEEE, 2010.
[2] Rendle, Steffen. "Factorization machines with libfm." ACM
Transactions on Intelligent Systems and Technology (TIST) 3.3 (2012): 57.
[3] Takács, Gábor, et al. "Matrix factorization and neighbor based
algorithms for the netflix prize problem." Proceedings of the 2008 ACM
conference on Recommender systems. ACM, 2008.
[4] Paterek, Arkadiusz. "Improving regularized singular value
decomposition for collaborative filtering." Proceedings of KDD cup and
workshop. Vol. 2007. 2007.
[5] http://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pdf
[6] Srebro, N., J. D. M. Rennie, and T. S. Jaakkola. "Maximum-margin matrix factorization." Advances in Neural Information Processing Systems 17. MIT Press, 2005. 1329–1336.
[7] Rendle, S., and L. Schmidt-Thieme. "Pairwise interaction tensor factorization for personalized tag recommendation." Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM '10). ACM, 2010. 81–90.
Q&A
@btwardow, Bartłomiej Twardowski
B.Twardowski@ii.pw.edu.pl
