From Amazon, to Spotify, to thermostats, recommendation systems are everywhere. The ability to provide recommendations for your users is becoming a crucial feature for modern applications. In this talk I'll show you how you can use Ruby to build recommendation systems for your users. You don't need a PhD to build a simple recommendation engine -- all you need is Ruby. Together we'll dive into the dark arts of machine learning and you'll discover that writing a basic recommendation engine is not as hard as you might have imagined. Using Ruby I'll teach you some of the common algorithms used in recommender systems, such as: Collaborative Filtering, K-Nearest Neighbor, and Pearson Correlation Coefficient. At the end of the talk you should be on your way to writing your own basic recommendation system in Ruby.
5. Outline
1) What is a recommendation system?
2) Collaborative filtering based recommendations
3) Content based recommendations
4) Hybrid systems - the best of both worlds
5) Evaluating your recommendation system
6) Resources & existing libraries
6. What this Talk is Not
• Everything there is to know about recommendation systems
• Bleeding edge machine learning
• How to use a specific library
16. Two Types of CF
1. Memory Based - Uses similarity between users or items. The dataset is usually kept in memory.
2. Model Based - A model is generated to “explain” the observed ratings.
17. User Based CF
(User x Item) Matrix + Similarity Function = Top-K most similar users
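The equation above can be sketched in a few lines of Ruby. This is a minimal illustration, not the talk's actual code: it assumes cosine similarity as the similarity function (the slide does not name one), treats unrated items as 0, and uses the ratings from the table on the following slide with User 5's two unknown ratings stored as 0. The names `RATINGS`, `cosine_similarity`, and `top_k_similar` are hypothetical.

```ruby
# (User x Item) matrix: one row of ratings per user, 0 means "not rated".
# User 5's two unknown ratings are stored as 0 (an assumption).
RATINGS = {
  "User 1" => [0, 1, 0, 5, 0],
  "User 2" => [1, 2, 1, 0, 5],
  "User 3" => [2, 5, 0, 0, 2],
  "User 4" => [5, 4, 4, 1, 1],
  "User 5" => [2, 4, 2, 0, 0]
}.freeze

# Similarity function: cosine of the angle between two rating vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norms = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |y| y * y })
  norms.zero? ? 0.0 : dot / norms
end

# Top-K most similar users, as [name, score] pairs, best match first.
def top_k_similar(target, ratings, k)
  ratings
    .reject { |user, _| user == target }
    .map { |user, row| [user, cosine_similarity(ratings[target], row)] }
    .max_by(k) { |_, score| score }
end

top_k_similar("User 5", RATINGS, 2).each do |user, score|
  puts format("%s  %.3f", user, score)
end
```

Treating "not rated" as 0 is the simplest choice, but it pulls users apart whenever they have rated different items; restricting the comparison to co-rated items is a common alternative.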
18. Collaborative Filtering
          Video 1   Video 2   Video 3   Video 4   Video 5
User 1       0         1         0         5         0
User 2       1         2         1         0         5
User 3       2         5         0         0         2
User 4       5         4         4         1         1
User 5       2         4         2         ?         ?
* 0 denotes not rated
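One way to fill in the two "?" cells is sketched below. It is an illustration under stated assumptions rather than the method from the talk: the "?" cells are assumed to be User 5's ratings for Video 4 and Video 5, similarity is the Pearson correlation over co-rated videos, and the prediction is a mean-centered weighted average over every user who rated the video. The names `VIDEO_RATINGS`, `mean`, `pearson`, and `predict` are hypothetical.

```ruby
# 0 denotes "not rated". User 5's "?" cells are assumed to be Video 4
# and Video 5 and are stored as 0.
VIDEO_RATINGS = {
  "User 1" => [0, 1, 0, 5, 0],
  "User 2" => [1, 2, 1, 0, 5],
  "User 3" => [2, 5, 0, 0, 2],
  "User 4" => [5, 4, 4, 1, 1],
  "User 5" => [2, 4, 2, 0, 0]
}.freeze

def mean(values)
  values.sum.to_f / values.size
end

# Pearson correlation computed only over the videos both users rated.
# Returns 0.0 when fewer than two videos are shared.
def pearson(a, b)
  pairs = a.zip(b).select { |x, y| x > 0 && y > 0 }
  return 0.0 if pairs.size < 2

  xs, ys = pairs.transpose
  mx = mean(xs)
  my = mean(ys)
  cov = pairs.sum { |x, y| (x - mx) * (y - my) }
  spread = Math.sqrt(xs.sum { |x| (x - mx)**2 }) *
           Math.sqrt(ys.sum { |y| (y - my)**2 })
  spread.zero? ? 0.0 : cov / spread
end

# Predicted rating = the user's own mean rating plus the similarity-weighted
# average of how far each neighbour's rating sits from that neighbour's mean.
def predict(ratings, user, item)
  target = ratings[user]
  target_mean = mean(target.select(&:positive?))
  weighted = 0.0
  total_weight = 0.0
  ratings.each do |other, row|
    next if other == user || row[item].zero?

    sim = pearson(target, row)
    weighted += sim * (row[item] - mean(row.select(&:positive?)))
    total_weight += sim.abs
  end
  total_weight.zero? ? target_mean : target_mean + weighted / total_weight
end

puts format("Video 4: %.2f", predict(VIDEO_RATINGS, "User 5", 3))
puts format("Video 5: %.2f", predict(VIDEO_RATINGS, "User 5", 4))
```

With a matrix this small, a prediction can rest on a single neighbour, so the numbers are only meant to show the mechanics. Restricting the loop to the K most similar users turns this into a K-nearest-neighbor prediction.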
33. Content Based Recommendations
Classify items based on their features, then recommend other items from the same class.
34. Content Based Algorithms
• K-means clustering
• Random Forest
• Support Vector Machines
• ...
• Insert your favorite ML algorithm
35. Content Based Algorithms
          Type of content   Duration   Maturity Rating
Video 1   comedy            60         G
Video 2   action            120        G
Video 3   comedy            34         PG-13
Video 4   romantic          15         R
Video 5   sports            120        G
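Before any of these algorithms can use the table above, each row has to become a vector of numbers. The sketch below shows one possible encoding, with every choice being an assumption rather than something the slides specify: the type of content is one-hot encoded, the duration is divided by the largest duration in the table, and the maturity rating is mapped onto an ordered G, PG, PG-13, R scale. The names `VIDEOS`, `feature_vector`, `distance`, and `most_similar` are hypothetical.

```ruby
VIDEOS = {
  "Video 1" => { type: "comedy",   duration: 60,  rating: "G" },
  "Video 2" => { type: "action",   duration: 120, rating: "G" },
  "Video 3" => { type: "comedy",   duration: 34,  rating: "PG-13" },
  "Video 4" => { type: "romantic", duration: 15,  rating: "R" },
  "Video 5" => { type: "sports",   duration: 120, rating: "G" }
}.freeze

TYPES = VIDEOS.values.map { |v| v[:type] }.uniq.freeze
RATING_SCALE = %w[G PG PG-13 R].freeze # assumed ordering, mildest first
MAX_DURATION = VIDEOS.values.map { |v| v[:duration] }.max.to_f

# One-hot type, then duration and rating, each scaled to the range 0..1.
def feature_vector(video)
  one_hot = TYPES.map { |t| t == video[:type] ? 1.0 : 0.0 }
  one_hot + [video[:duration] / MAX_DURATION,
             RATING_SCALE.index(video[:rating]) / (RATING_SCALE.size - 1).to_f]
end

# Euclidean distance between two feature vectors.
def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Name of the video whose features are closest to the given one.
def most_similar(name)
  target = feature_vector(VIDEOS[name])
  VIDEOS.reject { |other, _| other == name }
        .min_by { |_, video| distance(target, feature_vector(video)) }
        .first
end

puts most_similar("Video 1")
```

Scaling every feature to the same range keeps the duration, which has the largest raw values, from dominating the distance.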
36. K-means Clustering
Group items into K clusters. Assign a new item to its nearest cluster and recommend items from that cluster.
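A bare-bones k-means can be written in plain Ruby. The sketch below is illustrative only: it assumes items are already numeric feature vectors (the sample `POINTS` are made up), and it seeds the centroids with the first K points so the result is repeatable, where real implementations usually pick random or k-means++ seeds. All names are hypothetical.

```ruby
# Made-up 2-D feature vectors, purely for illustration.
POINTS = [
  [1.0, 1.0], [1.5, 2.0], [1.0, 0.5],
  [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]
].freeze

def distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Index of the centroid closest to the point.
def nearest_index(point, centroids)
  centroids.each_index.min_by { |i| distance(point, centroids[i]) }
end

# Mean of a list of points, one dimension at a time.
def centroid(points)
  points.transpose.map { |dim| dim.sum / dim.size.to_f }
end

# Returns [centroids, clusters]; clusters maps centroid index => points.
def k_means(points, k, iterations: 20)
  centroids = points.first(k) # deterministic seeding (a simplification)
  iterations.times do
    clusters = points.group_by { |p| nearest_index(p, centroids) }
    updated = centroids.each_with_index.map do |old, i|
      clusters.key?(i) ? centroid(clusters[i]) : old
    end
    break if updated == centroids

    centroids = updated
  end
  [centroids, points.group_by { |p| nearest_index(p, centroids) }]
end

centroids, clusters = k_means(POINTS, 2)

# Assign a new item to a cluster and pick items from that cluster.
new_item = [8.2, 8.7]
recommendations = clusters[nearest_index(new_item, centroids)]
p recommendations
```

The loop stops as soon as an iteration leaves the centroids unchanged, or after `iterations` passes, whichever comes first.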
38. Problems With Content Based Recommendations
• Unsupervised Learning is hard
• Training data limited or expensive
• Doesn’t take user into account
• Limited by features of content
47. Summary of What We’ve Learned
• Collaborative Filtering using similar users
• Content clustering using k-means
• Combining 2 algorithms to boost quality
• How to evaluate your recommender
48. Don’t Reinvent the Wheel
• Apache Mahout
• JRuby mahout gem
• SciRuby
• Recommenderlab for R
49. Resources & Further Reading
• Recommender Systems: An Introduction
• Linden, Greg, Brent Smith, and Jeremy York. "Amazon.com recommendations: Item-to-item collaborative filtering."
• Resnick, Paul, et al. "GroupLens: an open architecture for collaborative filtering of netnews."
• ACM RecSys Conference Proceedings