
Efficient Top-N Recommendation by Linear Regression

Slides from the Large Scale Recommender Systems workshop at ACM RecSys, 13 October 2013.


  1. Efficient Top-N Recommendation by Linear Regression
     Mark Levy, Mendeley
  2. What and why?
     ● Social network products @Mendeley
  5. What and why?
     ● Social network products @Mendeley
     ● ERASM Eurostars project
     ● Make newsfeed more interesting
     ● First task: who to follow
  6. What and why?
     ● Multiple datasets
       – 2M users, many active
       – ~100M documents
       – author keywords, social tags
       – 50M physical PDFs
       – 15M <user,document> pairs per month
       – user-document matrix currently ~250M non-zeros
  7. What and why?
     ● Approach as an item similarity problem
       – with a constrained target item set
       – use some side data
     ● Possible state of the art:
       – SLIM
       – matrix factorization and then neighbourhood
       – pimped old skool neighbourhood method
       – something dynamic (?)
  8. SLIM
     ● Learn sparse item similarity weights
     ● No explicit neighbourhood or metric
     ● L1 regularization gives sparsity
     ● Bound-constrained least squares (reconstructed below):
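The optimization problem itself was an image on the slide and did not survive extraction; the following is a reconstruction assuming the standard per-column SLIM formulation from Ning & Karypis [1] (cited on slide 10):

```latex
% SLIM objective, solved independently for each item column j of R
\min_{\mathbf{w}_j} \;
    \tfrac{1}{2}\,\lVert \mathbf{r}_j - R\,\mathbf{w}_j \rVert_2^2
  + \tfrac{\beta}{2}\,\lVert \mathbf{w}_j \rVert_2^2
  + \lambda\,\lVert \mathbf{w}_j \rVert_1
\quad \text{subject to} \quad \mathbf{w}_j \ge 0, \; w_{jj} = 0
```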
  9. SLIM
     [Diagram: R × W ≈ R over users and items; each target column rj of R is fit from R itself, yielding a sparse weight column wj]
     model.fit(R, rj)
     wj = model.coef_
  10. SLIM
      Good:
      – Outperforms MF methods on implicit ratings data [1]
      – Easy extensions to include side data [2]
      Not so good:
      – Reported to be slow beyond small datasets [1]

      [1] X. Ning and G. Karypis, "SLIM: Sparse Linear Methods for Top-N Recommender Systems," Proc. IEEE ICDM, 2011.
      [2] X. Ning and G. Karypis, "Sparse Linear Methods with Side Information for Top-N Recommendations," Proc. ACM RecSys, 2012.
  11. From SLIM to regression
      ● Avoid constraints
      ● Regularized regression, learn with SGD
      ● Easy to implement
      ● Faster on large datasets
  12. From SLIM to regression
      [Diagram: R′ × W ≈ R, where R′ is the training matrix with the target column rj withheld, so rj is regressed on the remaining items]
      model.fit(R′, rj)
      wj = model.coef_
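A minimal sketch of this per-item regression, assuming scikit-learn's SGDRegressor (slide 16 says the real implementation sits on top of scikit-learn, but the function below, its name, and its hyperparameters are illustrative assumptions, not the mrec code):

```python
import numpy as np
from scipy.sparse import csc_matrix
from sklearn.linear_model import SGDRegressor

def fit_item_weights(R, j, l1_reg=1e-3, l2_reg=1e-4):
    """Learn sparse similarity weights wj for item j.

    R : csc_matrix of shape (num_users, num_items), implicit counts.
    Returns a dense weight vector over items with wj[j] == 0.
    """
    rj = R[:, j].toarray().ravel()      # target column
    R_prime = R.copy()                  # R' = R with column j zeroed out,
    R_prime.data[R_prime.indptr[j]:R_prime.indptr[j + 1]] = 0  # no self-match
    alpha = l1_reg + l2_reg
    model = SGDRegressor(penalty='elasticnet',
                         alpha=alpha,
                         l1_ratio=l1_reg / alpha,  # the L1 part gives sparsity
                         fit_intercept=False)
    model.fit(R_prime, rj)              # as on the slide
    wj = model.coef_.copy()
    wj[j] = 0.0                         # guard against self-similarity
    return wj

# Usage on toy data: one regression per item, trivially parallelizable.
R = csc_matrix(np.random.binomial(1, 0.2, size=(50, 20)).astype(float))
W = np.column_stack([fit_item_weights(R, j) for j in range(R.shape[1])])
```

Dropping SLIM's non-negativity bound is what lets plain SGD with an elastic-net penalty do the work, which is the speed-up claimed on slide 11.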
  13.–15. Results on reference datasets
      [Three slides of evaluation charts; the figures were not preserved in the slide text]
  16. Results on Mendeley data
      ● Stacked readership counts, keyword counts
      ● 5M docs/keywords, 1M users, 140M non-zeros
      ● Constrained to ~100k target users
      ● Python implementation on top of scikit-learn
        – trivially parallelized with IPython
        – eats CPU but easy on AWS
      ● 10% improvement over nearest neighbours
  17. Our use case
      [Diagram: the training matrix R′ stacks readership counts (all users × items) with keyword counts; R′ × W ≈ R, with predictions needed only for the ~100k target users]
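A minimal sketch of assembling the stacked matrix, assuming readership and keyword counts arrive as scipy.sparse matrices over the same item columns; the toy sizes and variable names are illustrative (the real matrices are ~140M non-zeros per slide 16):

```python
import numpy as np
from scipy.sparse import csr_matrix, vstack

num_users, num_keywords, num_items = 4, 3, 5
readership = csr_matrix(np.random.poisson(0.5, (num_users, num_items)).astype(float))
keyword_counts = csr_matrix(np.random.poisson(0.5, (num_keywords, num_items)).astype(float))

# Keywords become extra "user" rows over the same item columns, so a single
# per-item regression can exploit both readership and keyword evidence.
R_stacked = vstack([readership, keyword_counts]).tocsc()
assert R_stacked.shape == (num_users + num_keywords, num_items)
```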
  18. Tuning regularization constants
      ● Relate directly to business logic
        – want the sparsest similarity lists that are not too sparse
        – "too sparse" = number of items with < k similar items
      ● Grid search with a small sample of items (sketched below)
      ● Empirically corresponds well to optimising recommendation accuracy on a validation set
        – but faster and easier
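A hedged sketch of this tuning loop, reusing the fit_item_weights helper from the earlier sketch; the grid values, k, and the 5% budget are illustrative assumptions, not values from the talk:

```python
import numpy as np

def count_too_sparse(R, sample_items, l1_reg, l2_reg, k=10):
    """Number of sampled items whose similarity list has < k nonzeros."""
    bad = 0
    for j in sample_items:
        wj = fit_item_weights(R, j, l1_reg=l1_reg, l2_reg=l2_reg)
        if np.count_nonzero(wj) < k:
            bad += 1
    return bad

def pick_l1(R, sample_items, grid=(0.1, 0.03, 0.01, 0.003, 0.001), l2_reg=1e-4):
    """Pick the largest (sparsest) L1 whose 'too sparse' rate is under 5%."""
    budget = 0.05 * len(sample_items)
    for l1_reg in grid:  # ordered from strongest to weakest L1
        if count_too_sparse(R, sample_items, l1_reg, l2_reg) <= budget:
            return l1_reg
    return grid[-1]
```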
  19. Software release: mrec
      ● Wrote our own small framework
      ● Includes parallelized SLIM
      ● Support for evaluation
      ● BSD licence
      ● https://github.com/mendeley/mrec
      ● Please use it or contribute!
  20. Thanks for listening
      mark.levy@mendeley.com
      @gamboviol
      https://github.com/gamboviol
      https://github.com/mendeley/mrec
  21. Real World Requirements
      ● Cheap to compute
      ● Explicable
      ● Easy to tweak + combine with business rules
      ● Not a black box
      ● Work without ratings
      ● Handle multiple data sources
      ● Offer something to anon users
  22. Cold Start?
      ● Can't work miracles for new items
      ● Serve new users asap
      ● Fine to run a special system for either case
      ● Most content leaks
      ● … or is leaked
      ● All serious commercial content is annotated
  23. Degrees of Freedom for k-NN
      ● Input numbers from mining logs
      ● Temporal "modelling" (e.g. fake users)
      ● Data pruning (scalability, popularity bias, quality)
      ● Preprocessing (tf-idf, log/sqrt, …)
      ● Hand-crafted similarity metric
      ● Hand-crafted aggregation formula
      ● Postprocessing (popularity matching)
      ● Diversification
      ● Attention profile
