Recommender Systems

Lior Rokach
Department of Information Systems Engineering
Ben-Gurion University of the Negev

About Me

Prof. Lior Rokach
Department of Information Systems Engineering
Faculty of Engineering Sciences
Head of the Machine Learning Lab
Ben-Gurion University of the Negev

Email: liorrk@bgu.ac.il
http://www.ise.bgu.ac.il/faculty/liorr/

PhD (2004) from Tel Aviv University

Are You Being Served?

• What are you looking for?
• Demographic – Age, Gender, etc.
• Context
  - Casual/Event
  - Season
  - Gift
• Purchase History
  - Loyal Customer
• What is the customer currently wearing?
  - Style
  - Color
• Social
  - Friends and Family
  - Companion

Recommender Systems

• A recommender system (RS) helps people who lack the personal experience or competence to evaluate the potentially overwhelming number of alternatives offered by a Web site.
• In their simplest form, RSs present their users with personalized, ranked lists of items.
• They provide consumers with information to help them decide which items to purchase.

What movie should I watch?

• The Internet Movie Database (IMDb)
provides information about
actors, films, television shows, television
stars, video games and production crew
personnel.
• Owned by Amazon.com since 1998
• 796,328 titles and 2,127,371 people
• More than 50M users per month.

The Netflix Prize story

• In October 2006, Netflix announced it would award $1 million to whoever created a movie-recommending algorithm 10% better than its own.
 Within two weeks, the DVD rental company had received 169
submissions, including three that were slightly superior to
Cinematch, Netflix's recommendation software
 After a month, more than a thousand programs had been entered, and
the top scorers were almost halfway to the goal
 But what started out looking simple suddenly got hard. The rate of
improvement began to slow. The same three or four teams clogged
the top of the leader-board.
 Progress was almost imperceptible, and people began to say a 10
percent improvement might not be possible.
 Three years later, on 21st of September 2009, Netflix announced the
winner.

30.07.2012

Where should I spend my vacation?

Tripadvisor.com
“I would like to escape from this ugly and tedious work life and relax for two weeks in a sunny place. I am fed up with these crowded and noisy places … just the sand and the sea … and some ‘adventure’.”

“I would like to bring my wife and my children on a holiday … it should not be too expensive. I prefer mountainous places … not too far from home. Children's parks, easy paths and good cuisine are a must.”

“I want to experience the contact with a completely different culture. I would like to be fascinated by the people and learn to look at my life in a totally different way.”

Usage in the market/products Recommendation Procedure SWOT

State-of-the-art solutions
Methods Summary
Model Analysis

Examined Solutions
Jinni, Taste Kid, Nanocrowd, Clerkdogs, Criticker, IMDb, Flixster, Movielens, Netflix, Shazam, Pandora, LastFM, YooChoose, Think Analytics, iTunes, Amazon

Method Commonness (number of the 16 examined solutions marked for each method)
Collaborative Filtering                                           12
Content-Based Techniques                                          11
Knowledge-Based Techniques                                         7
Stereotype-Based Recommender Systems                               7
Ontologies and Semantic Web Technologies for Recommender Systems   3
Hybrid Techniques                                                  7
Ensemble Techniques for Improving Recommendation                   1 (marked “future”)
Context Dependent Recommender Systems                              6
Conversational/Critiquing Recommender Systems                      2
Community Based Recommender Systems and Recommender Systems 2.0    5

Presenting the three selected methods

1 Collaborative Filtering: “Customers who bought this item also bought…”

2 Ensemble: “The wisdom of crowds”

3 Context Based: “Tell me the music that I want to listen to NOW”

4 Cross Domain: “Can movies and books collaborate?”

5 Community: “Tell me who your friends are, and I will tell you who you are.”

6 Group: “Can you recommend a movie for me and my friends?”

Method 1

Collaborative Filtering

Description
• The method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future.

Selected Techniques
• kNN - Nearest Neighbor
• SVD - Matrix Factorization
• Similarity Weights Optimization (SWO)

Collaborative Filtering

Overview

The Idea

• Predict the opinion the user will have of the different items, and recommend the “best” items to each user, based on the user’s previous likings and the opinions of other like-minded users.

[Figure: rating matrix with positive ratings, negative ratings, and an unknown rating (?) to be predicted]

How does it work?

“People who liked this also liked…”

User-to-User
• Recommendations are made by finding users with similar tastes. Jane and Tim both liked Item 2 and disliked Item 3; it seems they might have similar taste, which suggests that in general Jane agrees with Tim. This makes Item 1 a good recommendation for Tim.
• This approach does not scale well for millions of users.

Item-to-Item
• Recommendations are made by finding items that have similar appeal to many users. Tom and Sandra are two users who liked both Item 1 and Item 4. That suggests that, in general, people who liked Item 4 will also like Item 1, so Item 1 will be recommended to Tim.
• This approach is scalable to millions of users and millions of items.
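
The two flavours can be sketched on a toy matrix. This is only an illustration under stated assumptions: the ratings below are invented so that they echo the Jane/Tim and Tom/Sandra story, cosine similarity is one common choice rather than the one the slides prescribe, and the helper names are hypothetical.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are Items 1-4.
# +1 = liked, -1 = disliked, 0 = not rated.
users = ["Jane", "Tim", "Tom", "Sandra"]
ratings = np.array([
    [1, 1, -1, 0],   # Jane
    [0, 1, -1, 0],   # Tim (Item 1 is unknown)
    [1, 0, 0, 1],    # Tom
    [1, 0, 0, 1],    # Sandra
], dtype=float)

def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0

def most_similar_user(target):
    """User-to-user: compare rows of the rating matrix."""
    t = users.index(target)
    scores = {u: cosine(ratings[t], ratings[i])
              for i, u in enumerate(users) if i != t}
    return max(scores, key=scores.get)

def most_similar_item(item):
    """Item-to-item: compare columns of the rating matrix (0-based index)."""
    scores = {j: cosine(ratings[:, item], ratings[:, j])
              for j in range(ratings.shape[1]) if j != item}
    return max(scores, key=scores.get)

print(most_similar_user("Tim"))   # Tim's closest neighbour by taste
print(most_similar_item(3))       # index of the item most similar to Item 4
```

In this toy data the user-to-user view pairs Tim with Jane, and the item-to-item view pairs Item 4 with Item 1, which is the reasoning both paragraphs above follow.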

Rating Matrix

Sample of a matrix
• The ratings of users and items are represented in a matrix.
• All CF methods are based on such a rating matrix.

[Figure: sample rating matrix]
• Items: the items in the system.
• Users: the users in the system.
• Ratings: each item may have a rating.

What is new?

A few words about the techniques

• Collaborative filtering is one of the most common recommendation methods in the market today.

• Up until two years ago, the kNN (k-nearest neighbors) technique was the norm. SVD (Singular Value Decomposition), which proved successful in the Netflix Prize competition, became common in the last year. SWO is a newer technique that seeks to enhance the veteran kNN.

• The following slides present the three techniques. It is important to get acquainted with them, as they will be employed by the Ensemble.

Method 1

Selected Techniques Explained

Technique 1

kNN - Nearest Neighbor

High level explanation

k-nearest neighbors algorithm
• A method for classifying objects based on the closest training examples in the feature space.
• It is assumed that similar samples are grouped together.
• “k” is the number of neighbors consulted; neighbors are chosen by a proximity measure.

Recommendation example
• Finding the most relevant song by comparing it to a set of songs already heard.

kNN - Nearest Neighbor

Schematic example

[Figure: binary rating matrix with items 1 to 14 as rows and the current user and other users as columns; 1 = Like, 0 = Dislike, ? = Unknown rating]
• Current user: the user model is the interaction history. This user did not rate the item. We will try to predict a rating according to his nearest neighbors.
• Other users: there are other users who rated the same item. We are interested in the nearest neighbors.
• Nearest neighbor: we are looking for the nearest neighbor, the one with the lowest Hamming distance. Hamming distances in the example: 5, 6, 6, 5, 4, 8.
• Prediction: the prediction was made based on the nearest neighbor.

Hamming Distance
• The Hamming distance is named after Richard Hamming.
• In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
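
A minimal sketch of the schematic, assuming binary like/dislike histories and k = 1. The vectors below are invented for illustration and are not the values shown in the figure; the function names are hypothetical.

```python
import numpy as np

def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return int(np.sum(a != b))

def predict_nearest(current, others, target_ratings):
    """Copy the target-item rating of the neighbour with the lowest distance."""
    distances = [hamming(current, other) for other in others]
    nearest = int(np.argmin(distances))
    return target_ratings[nearest], distances

# Hypothetical interaction histories (1 = like, 0 = dislike) on shared items.
current_user = np.array([1, 0, 1, 1, 0, 1])
other_users = np.array([
    [0, 1, 1, 0, 0, 1],
    [1, 0, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
])
# How the other users rated the item the current user has not rated.
target_item = np.array([0, 1, 0])

prediction, distances = predict_nearest(current_user, other_users, target_item)
print(distances, prediction)
```

With k greater than 1, the same distances would be used to pick several neighbours whose ratings are then averaged or put to a vote.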

Method 1

Technique 2

SVD - Singular Value Decomposition

Matrix factorization technique

[Figure: SVD sample matrix]
• SVD is extraordinarily useful and has many applications, such as data analysis, signal processing, pattern recognition, image compression, weather prediction, and Latent Semantic Analysis (LSA).
• Probably the most popular model among Netflix Prize contestants.
• Has become the Collaborative Filtering standard.
• The Singular Value Decomposition (SVD) is a widely used technique to decompose a matrix into several component matrices, exposing many of the useful and interesting properties of the original matrix.

• In the Recommender Systems field, SVD models users and items as vectors of latent features whose inner product produces the rating of the item by the user.
• With SVD, a matrix is factored into a series of linear approximations that expose the underlying structure of the matrix.
• The goal is to uncover latent features that explain the observed ratings.
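
The latent-feature idea can be sketched with NumPy's dense SVD. The matrix, the choice of two latent features, and the way the singular values are split between user and item vectors are assumptions made for this illustration. Real rating matrices are mostly empty, so practical "SVD" recommenders usually fit the factors to the observed ratings only; that step is not shown here.

```python
import numpy as np

# Hypothetical, fully observed toy rating matrix (rows = users, columns = items).
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

U, s, Vt = np.linalg.svd(R, full_matrices=False)

f = 2  # number of latent features kept (an arbitrary choice for this sketch)
user_factors = U[:, :f] * np.sqrt(s[:f])              # one row per user
item_factors = (Vt[:f, :] * np.sqrt(s[:f, None])).T   # one row per item

def predict(u, i):
    """Predicted rating = inner product of the user and item latent vectors."""
    return float(user_factors[u] @ item_factors[i])

R_hat = user_factors @ item_factors.T  # rank-f approximation of R
print(np.round(R_hat, 2))
print(round(predict(0, 0), 2))
```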

Latent Factor Models

Schematic example

[Figure: the SVD process maps the “Users & Ratings” matrix to “Latent Concepts or Factors”]
• User rating: the input to the SVD process.
• Hidden concept: SVD reveals hidden connections and their strength.
• Revealed concept: for example, males who like watching serious movies.

[Figure: “Users & Ratings” matrix and “Latent Concepts or Factors”]
• Recommendation: SVD revealed a movie this user might like!

[Figure: Concept space]

Method 1

Technique 3

SWO - Similarity Weights Optimization

SWO vs. Nearest Neighbor

kNN
• The similarity function (Pearson, Cosine) is used for both:
  - Determining the nearest neighbors.
  - Determining the weights in the weighted average of the prediction.

SWO
• The similarity function (Pearson, Cosine) is used to determine the neighbors.
• The weights for the weighted average are found via an optimization process which minimizes the total prediction error.
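
The contrast can be sketched as follows. The data are invented, and ordinary least squares is used here as a stand-in for the optimization process, which the slides do not specify; the point is only that the weights come from minimizing prediction error instead of being read off the similarity function.

```python
import numpy as np

# Hypothetical data: ratings that 5 users gave to 2 neighbour items (columns),
# and the ratings the same users gave to the target item.
neighbour_ratings = np.array([
    [5.0, 3.0],
    [4.0, 2.0],
    [2.0, 4.0],
    [1.0, 5.0],
    [3.0, 3.0],
])
target_ratings = np.array([4.6, 3.6, 2.4, 1.8, 3.0])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# kNN-style weights: similarities to the target item, normalised to sum to 1.
sims = np.array([cosine(neighbour_ratings[:, j], target_ratings)
                 for j in range(neighbour_ratings.shape[1])])
knn_weights = sims / sims.sum()

# SWO-style weights: chosen to minimise the squared prediction error.
swo_weights, *_ = np.linalg.lstsq(neighbour_ratings, target_ratings, rcond=None)

def rmse(weights):
    errors = neighbour_ratings @ weights - target_ratings
    return float(np.sqrt(np.mean(errors ** 2)))

print("kNN weights", np.round(knn_weights, 3), "RMSE", round(rmse(knn_weights), 3))
print("SWO weights", np.round(swo_weights, 3), "RMSE", round(rmse(swo_weights), 3))
```

On the training data the optimized weights can never do worse than the similarity-derived ones, since they minimize exactly that error; whether the gain carries over to unseen ratings has to be checked on held-out data.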

Data Normalization

• We need to identify relations and mix ratings across items and users.
• However, user- and item-specific variability masks the fundamental relationships.
• Examples:
  - Some items are systematically rated higher.
  - Some items were rated by users who tend to rate low.
  - Ratings change over time.
• Normalization is critical to the success of a kNN approach.

• Remove data characteristics that are unlikely to be explained by kNN.
• Common practice is to use centering: remove user and item means.
• A more comprehensive approach eliminates additional interfering variability, such as time effects.
• Here, we normalize by removing the baseline estimates.
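
One way to picture baseline removal is sketched below. The slides do not say how the baseline estimates are computed, so this assumes the simple form "global mean plus item offset plus user offset", estimated by plain averaging on invented data.

```python
import numpy as np

# Hypothetical ratings; np.nan marks an unobserved rating.
R = np.array([
    [5.0, 4.0, np.nan],
    [3.0, np.nan, 1.0],
    [np.nan, 5.0, 2.0],
])
observed = ~np.isnan(R)

mu = np.nanmean(R)                                  # global mean
item_bias = np.nanmean(R - mu, axis=0)              # item offset, one per column
user_bias = np.nanmean(R - mu - item_bias, axis=1)  # user offset, one per row

# Baseline estimate for every (user, item) cell: mu + user offset + item offset.
baseline = mu + user_bias[:, None] + item_bias[None, :]

# Normalised ratings (residuals): what is left for kNN to explain.
residuals = np.where(observed, R - baseline, np.nan)
print(np.round(residuals, 2))
```

Time effects, which the slide mentions as a further source of variability, would add more terms to the baseline and are left out of this sketch.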

Neighborhood modeling through global optimization

A basic model


Ensemble

Description
• Ensemble methodology imitates the human nature to seek advice before making any crucial decision.
• “Two heads are better than one”.
• The innovation is adopting the Ensemble concept from the general machine learning field to the Recommender System domain.

Selected Techniques
• Bagging (Breiman, 1996)
• AdaBoost (Freund and Schapire, 1996)
• Random Parameter Manipulation

Ensemble at 30,000 feet

Overview

• When important decisions have to be made, society often places its trust in groups of people. We have parliaments, juries, committees, and boards of directors, whom we are happy to have make decisions for us.

• Ensemble imitates the human nature to seek advice before making any crucial decision. It is achieved by weighing the individual opinions and combining them before reaching a final decision, hence the names “The Wisdom of Crowds” and “Committee of Experts”.

• We can ensure that, in the worst case, the ensemble's results are no worse than those of the worst classifier in the ensemble.

What is it?
• If you think about it, Ensemble is not a question to be answered.
• So what is it then?
• Ensemble is the answer.
• So what is the question?
• How to improve results!

Improving results…

Why do we care? Because...
• Having improved results will prevent cases like this. [Figure]

A short story

Francis Galton

 Galton promoted statistics and invented the concept of
correlation.
 In 1906 Galton visited a livestock fair and stumbled upon an
intriguing contest.
 An ox was on display, and the villagers were invited to guess
the animal's weight.
 Nearly 800 gave it a go and, not surprisingly, not one hit the
exact mark: 1,198 pounds.
 Astonishingly, however, the average of those 800 guesses came
close - very close indeed. It was 1,197 pounds.

Does it always work?

Does Ensemble always work? No.
• Not all crowds (groups) are wise.
• For example, crazed investors in a stock market bubble.

Schematic Example

[Figure: Recommender 1, Recommender 2, and Recommender 3 merged into a Combined Recommender]

Problem Example
• Linear recommenders cannot solve non-linearly separable problems.

Weak Learners
• And they all may be just weak learners.
• However, their combination can.

Why use Ensembles?

Statistical Reasons, Risk Reduction
• Out of many recommender models with similar training / test errors, which one shall we pick? If we just pick one at random, we risk the possibility of choosing a really poor one.
• Combining / averaging them may prevent us from making one such unfortunate decision.

Computational Reasons
• Every time we run a recommendation algorithm, we may find different local optima.
• Combining their outputs may allow us to find a solution that is closer to the global minimum.

Too little data / too much data
• Generating multiple recommenders with the re-sampling of the available data / mutually exclusive subsets of the available data.

Representational Reasons
• The recommender space may not contain the solution to a given particular problem. However, an ensemble of such recommenders may.

The Diversity Paradox

Diversity vs. Accuracy
• On one hand, we expect the ensemble members to be as good as possible, so they all target the same goal.
• On the other hand, they have to be independent, which means different, hence lowering the accuracy.

There’s no real paradox…
• Ideally, all committee members would be right about everything!
• If not, they should be wrong about different things.

Single-model Ensemble RS

Example configuration

[Figure: Training Rating Matrix → Rating Matrix 1 … Rating Matrix M → Inducer → RS 1 … RS M → Ensemble RS → ratings]
• Step 1: Users and items ratings input.
• Step 2: Generate different variations of the same input.
• Step 3: The actual CF method and technique.
• Step 4: Produce several recommendations.
• Step 5: Combine the different recommendations.
• Step 6: Generates more accurate predictions than each individual RS.

Netflix Prize

The Competition

The Netflix Prize story

• In October 2006, Netflix announced it would award $1 million to whoever created a movie-recommending algorithm 10% better than its own.
 Within two weeks, the DVD rental company had received 169
submissions, including three that were slightly superior to Cinematch,
Netflix's recommendation software
 After a month, more than a thousand programs had been entered, and
the top scorers were almost halfway to the goal
 But what started out looking simple suddenly got hard. The rate of
improvement began to slow. The same three or four teams clogged
the top of the leader-board.
 Progress was almost imperceptible, and people began to say a 10
percent improvement might not be possible.
 Three years later, on 21st of September 2009, Netflix announced the
winner.

The winning team used an Ensemble

FACT: Actually, the top 100 solutions were Ensemble-based.

And the winner is…

We have a winner! So why bother?
• You may ask yourself, why do we need to further research and develop the Ensemble?
• Because it was solved in a manually tailored way, combining a set of predefined methods.
• There is plenty of room for improvement.

The real winner

The real winner is the method!
• One could say that the Ensemble techniques and methods helped tip the scales.

• While the algorithms and a good knowledge of statistics go a long way, it was ultimately the cross-team collaboration that ended the contest.

• It is easy to overlook the fact that many teams were actually committees of experts by themselves.

• “The Ensemble” team, appropriately named for the technique they used to merge their results, consists of over 30 people.

• Likewise, the winning team is a collaborative effort of several distinct groups that merged their results.

Method 2

Ensemble

Technique 1

Bagging (Breiman, 1996)

Overview

• Introduced by Breiman (1996).
• “Bagging” stands for “bootstrap aggregating”.
• It is an ensemble method: a method of combining multiple predictors.
• The intuition is that by using only part of the data and making some data (randomly) have more impact, you get a better variety of models, which reduces overfitting.

Bagging-based sampling of rating matrix

Schematic example: Bagging in action

• Step 1: A random subset of the training set is taken.
• Step 2: Some of the data in this subset is duplicated several times.

From here to a recommendation
• The input set is given to one of the recommendation methods.
• It is repeated until every method has an input set.
• The average result (or the most common one) is picked.
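
The steps above can be sketched as follows. The rating triples are invented, and the base recommender is deliberately trivial (it predicts an item's mean rating) so that the resampling and averaging stay visible; any CF technique could take its place.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set of (user, item, rating) triples.
ratings = np.array([
    (0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 2.0),
    (2, 1, 4.0), (2, 2, 1.0), (3, 0, 5.0), (3, 1, 2.0),
])
n_items = 3

def train_item_mean(sample):
    """Stand-in base recommender: predict each item's mean rating in the sample."""
    global_mean = sample[:, 2].mean()
    means = np.full(n_items, global_mean)   # fallback for items missing from the sample
    for item in range(n_items):
        rows = sample[sample[:, 1] == item]
        if len(rows):
            means[item] = rows[:, 2].mean()
    return means

def bagged_predictions(n_models=25):
    models = []
    for _ in range(n_models):
        # Bootstrap: draw with replacement, so some triples repeat and others drop out.
        idx = rng.integers(0, len(ratings), size=len(ratings))
        models.append(train_item_mean(ratings[idx]))
    return np.mean(models, axis=0)          # aggregate by averaging

print(np.round(bagged_predictions(), 2))
```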

Method 2

Ensemble

Technique 2

AdaBoost (Freund and Schapire, 1996)

Overview

• Introduced by Freund and Schapire (1996).
• “AdaBoost” stands for “Adaptive Boosting”.
• Boosting: to boost a “weak” learning algorithm into a “strong” learning algorithm.
• It is an ensemble method.
• Training samples are weighted differently across the ensemble members.

The Process

• We start by building an initial model.
• Next, that model is improved by modifying the input (training) set to emphasize (for example, by duplicating) the part of the input where the model was less accurate.
• The model is rebuilt and checked for its accuracy.
• The process repeats until the error of the model is lower than some bound.
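
The process can be sketched with the classical two-class form of AdaBoost, where emphasis is expressed through sample weights instead of duplication. Everything below is an assumption for illustration: the like/dislike data are invented, the weak learner is a one-threshold rule, and the loop runs for a fixed number of rounds instead of stopping at an error bound. How the deck adapts boosting to rating prediction is not stated in the slides.

```python
import numpy as np

def best_stump(x, y, w):
    """Weak learner: the threshold rule with the lowest weighted error (x must be sorted)."""
    best = None
    thresholds = np.concatenate(([x.min() - 0.5], (x[:-1] + x[1:]) / 2, [x.max() + 0.5]))
    for thr in thresholds:
        for sign in (1, -1):
            pred = np.where(x < thr, sign, -sign)
            err = w[pred != y].sum()
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(x, y, rounds):
    w = np.full(len(x), 1 / len(x))            # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, thr, sign = best_stump(x, y, w)
        err = max(err, 1e-12)                  # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # vote of this weak learner
        pred = np.where(x < thr, sign, -sign)
        w = w * np.exp(-alpha * y * pred)      # emphasise misclassified samples
        w = w / w.sum()
        ensemble.append((alpha, thr, sign))
    return ensemble

def predict(ensemble, x):
    score = sum(a * np.where(x < t, s, -s) for a, t, s in ensemble)
    return np.sign(score)

# Hypothetical 1-D data: like (+1) / dislike (-1); no single threshold separates it.
x = np.arange(10, dtype=float)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])

one = adaboost(x, y, rounds=1)
three = adaboost(x, y, rounds=3)
print((predict(one, x) != y).sum(), (predict(three, x) != y).sum())
```

A single threshold rule misclassifies part of this data, while the weighted vote of three such rules fits it, which is the "weak learners combined into a strong one" idea from the overview.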

Schematic example

[Figure: training data flowing through Steps 1 to 3 into a combined recommender]
• Step 1: We start with building an initial model.
• Step 2: Next, that model is improved by modifying the input set to emphasize the part of the input where the model was less accurate.
• Step 3: The model is rebuilt and checked for its accuracy.
• Final step: The process repeats until the error of the model is lower than some bound.

Method 2

Ensemble

Technique 3

Random Parameter Manipulation

Overview

 The idea is to have multiple variations of the same
recommendation technique

 The variations are formed by changing the input parameters
systematically

 The Ensemble is achieved by combining the modified
recommenders in order to produce a unified prediction

Schematic example

Example: Averaging multiple SVD matrices based on different values of F
• Variations of SVD: different F values, 3 to 5.
• Ensemble: combined recommenders.
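
A minimal sketch of this example, assuming F denotes the number of latent features kept in a truncated SVD and that the members are combined by a plain average. The rating matrix is random and dense purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical dense rating matrix (8 users x 6 items) with ratings in 1..5.
R = rng.integers(1, 6, size=(8, 6)).astype(float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)

def rank_f_prediction(f):
    """Rating predictions from an SVD truncated to f latent features."""
    return (U[:, :f] * s[:f]) @ Vt[:f, :]

# One recommender per value of F, as in the example (F = 3, 4, 5).
members = [rank_f_prediction(f) for f in (3, 4, 5)]

# The ensemble prediction is the plain average of the members.
ensemble = np.mean(members, axis=0)
print(np.round(ensemble, 2))
```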

Method 2

Ensemble

Testing coverage

• Each of the three CF techniques will be tested with each of the three ensemble techniques.
• There are 9 possible combinations of techniques.
• The diagram is color coded for convenience.

Method 3

Context-Based recommendation

Description
• Adapting the recommendations to the specific user context.
• “Tell me the music that I want to listen to NOW”.

Selected Techniques
• Item Split
• Linear Models

Context-Based Recommender Systems

Overview

 The recommender system uses additional data about the
context of an item consumption.

 For example, in the case of a restaurant the time or the
location may be used to improve the recommendation
compared to what could be performed without this
additional source of information.

 A restaurant recommendation for a Saturday evening when
you go with your spouse should be different than a restaurant
recommendation on a workday afternoon when you go with
co-workers

Motivation

Motivating Examples

 Recommend a vacation
 Winter vs. summer

 Recommend a purchase (e-retailer)
 Gift vs. for yourself

 Recommend a movie
 To a student who wants to see it on Saturday
night with his girlfriend in a movie theater.

• Recommend music
  - The music that we like to hear is greatly affected by context, which can be thought of as a mixture of our feelings (mood) and the situation or location (the theme) we associate it with.
  - Listening to Bruce Springsteen's “Born in the U.S.A.” while driving along the 101.
  - Listening to Mozart's Magic Flute while walking in Salzburg.

Information Discovery: Example
“Tell me the music that I want to listen to NOW”

Musicovery.com
• An interactive, personalized WebRadio.
• A mood matrix proposes a relationship between music and mood.
• 20 genres and time periods, a popularity scale (hits, less known songs/discovery).
• Covers all musical genres, rap to funk via electro, rock, disco… or classical.

Details
• Ethnographic studies have shown that people choose music pieces according to their mood or mood change expectation.
• Musicovery relied on this principle to build an effective relationship between music and emotion.

Context vs. others

What simple recommendation techniques ignore?

 What is the user when asking for a recommendation?
 Where (and when) the user is ?
 What does the user (e.g., improve his knowledge
or really buy a product)?
 Is the user or with other ?
 Are there products to choose or only ?
 Is the word economy or ?

Plain recommendation technologies forget to take into account the user context.

Foundations

Contextual Computing
• Contextual computing refers to the enhancement of a user’s interactions by understanding the user, the context, and the applications and information being used, typically across a wide set of user goals.

• Actively adapting the computational environment, for each and every user, at each point of computation.

• The contextual computing approach focuses on understanding the information consumption patterns of each user.

• Contextual computing focuses on the process, not only on the output, of the search process [Pitkow et al., 2002].

Major obstacles

Major obstacles for contextual computing
• Obtaining sufficient and reliable data describing the user context.

• Selecting the right information, i.e., the information relevant to a particular personalization task.

• Understanding the impact of contextual dimensions on the personalization process.

• Computationally modeling the contextual dimensions within a more classical recommendation technology.
  - For instance: how to extend Collaborative Filtering to include contextual dimensions?

Method 3

Item Split

Item Split approach

Item Split - Intuition and Approach
 The same item in different contextual conditions may produce
a different user experience
 We consider the same item in different contexts as distinct
items

 Research goal: Provide better music recommendations. Improve
Collaborative Filtering accuracy when the user context is known.

Context in Collaborative Filtering
• “Context is any information that can be used to characterize the situation of an entity” [A.K. Dey, 2001].

• In the Item Splitting approach, similarly to [Adomavicius et al., 2005], we model the context with a set of dynamic features of the rating, representing conditions that can rapidly change their state.

• When a user evaluates an item, the rating is recorded together with the current state of the contextual variables.

• CF does not provide a direct method to integrate additional information into the recommendation process.

Reduction-Based Approach

• Reduce the problem of multi-dimensional recommendation to the traditional two-dimensional User x Item problem.
• For each “value” of the contextual dimension(s), estimate the missing ratings with a traditional method.

Example
• $R: U \times I \times T \to [0,1] \cup \{?\}$, where U, I, and T stand for User, Item, and Time.
• $R^{D}(u, i, t) = R^{D[T=t]}(u, i)$
• The context-dependent estimation for (u, i, t) is computed using a traditional approach, in a two-dimensional setting, but using only the ratings that have T=t.
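
The reduction can be sketched in a few lines. The rating log and the names are invented, and the item-mean predictor merely stands in for "a traditional two-dimensional method"; any CF technique could be plugged in at that point.

```python
from collections import defaultdict

# Hypothetical rating log: (user, item, time_context, rating in [0, 1]).
ratings = [
    ("ann", "film_a", "weekend", 1.0),
    ("bob", "film_a", "weekend", 0.8),
    ("ann", "film_a", "weekday", 0.2),
    ("cat", "film_a", "weekday", 0.4),
    ("bob", "film_b", "weekend", 0.6),
]

def reduce_to_2d(data, t):
    """Keep only the ratings given in context T = t, dropping the context column."""
    return [(u, i, r) for (u, i, ctx, r) in data if ctx == t]

def item_mean_predictor(data_2d):
    """Stand-in for any traditional two-dimensional method: the item's mean rating."""
    by_item = defaultdict(list)
    for _, item, r in data_2d:
        by_item[item].append(r)
    return {item: sum(rs) / len(rs) for item, rs in by_item.items()}

def predict(data, item, t):
    """Context-dependent estimate, using only the slice T = t."""
    return item_mean_predictor(reduce_to_2d(data, t)).get(item)

print(predict(ratings, "film_a", "weekend"))
print(predict(ratings, "film_a", "weekday"))
```

The sketch also shows the cost of the approach: each slice sees only part of the data, so an item with no ratings in a given context gets no estimate at all (`None` here).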

[Figure: the Multidimensional Model (user, item, and context dimensions, with user features, product features, and ratings) is reduced to a Bi-dimensional Model (user x item); we use only the slice for T=t]
• The idea is to reduce the problem into a manageable model.

Reduction-Based vs. Item splitting

Reduction Based
• Uses cross-validation as goodness of segmentation. Expensive.
• Segments are the same for all the items.
• Prediction is made using only the relevant segment.

Item splitting
• Uses external impurity measures (i.e. IG). Heuristic based.
• Each item is tested for a split separately.
• Prediction is made using all the information, including split items.

Bottom Line
• The best known method (Reduction Based) is difficult to apply (need to search in a huge space of contextual sectors).
• We are proposing a more adaptive, and computationally efficient approach.

Item Split technique

Item Split - Intuition and Approach
 Each item in the data base ( ) is a candidate for splitting
 Context defines ( ) all possible splits of an item ratings vector
 We test all the possible splits – we do not have many contextual
features
 We choose one split (using a single contextual feature) that maximizes
an impurity measure and whose impurity is higher than a threshold
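
The selection rule can be sketched as follows, under loud assumptions: the ratings are invented, each candidate split is "one value of one contextual feature versus all other values", and the absolute difference between the mean ratings of the two halves stands in for the impurity measure, which the slides do not define.

```python
from statistics import mean

# Hypothetical ratings of ONE item: (rating, {contextual feature: value}).
item_ratings = [
    (5, {"weather": "sun", "company": "alone"}),
    (4, {"weather": "sun", "company": "friends"}),
    (5, {"weather": "sun", "company": "friends"}),
    (2, {"weather": "rain", "company": "alone"}),
    (1, {"weather": "rain", "company": "friends"}),
    (2, {"weather": "rain", "company": "alone"}),
]

def candidate_splits(data):
    """Every (feature, value) pair defines a split: value vs. all other values."""
    return sorted({(f, v) for _, ctx in data for f, v in ctx.items()})

def split_score(data, feature, value):
    """Stand-in criterion: absolute difference between the two halves' mean ratings."""
    inside = [r for r, ctx in data if ctx[feature] == value]
    outside = [r for r, ctx in data if ctx[feature] != value]
    if not inside or not outside:
        return 0.0
    return abs(mean(inside) - mean(outside))

def choose_split(data, threshold):
    """Return the best single-feature split, or None if it does not beat the threshold."""
    best = max(candidate_splits(data), key=lambda fv: split_score(data, *fv))
    return best if split_score(data, *best) > threshold else None

print(choose_split(item_ratings, threshold=1.0))
```

When a split is chosen, the item would be replaced by two artificial items (one per half of the split) before running ordinary CF; that step is not shown here.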

Method 3

Linear Models

Contextual Modelling approach

Overview

• In these approaches, the context data are explicitly used in the prediction model.

• There are several possibilities for using the contextual data.

• For instance, the context can be used to extend the definition of the distance function in nearest-neighbours approaches.

• The distance function must then include a “context distance” component in addition to the user distance (CF) or item distance (CB).

Linear Models approach

Overview

 Presents an extension of the Matrix Factorization (MF) rating
prediction technique that incorporates contextual
information to adapt the recommendation to the user target
context.

 In this approach one model parameter was introduced for
each contextual factor and music track genre pair.

 This allowed learning how the context affects the ratings and
how they deviate from the classical personalized prediction.

30.07.2012


Methods Summary
Model Analysis
CF Ensemble Context

abcd
Example

 standard rating prediction for a user u and item i that can be
computed by a standard matrix factorization method for
collaborative filtering, this is the simple predicted rating for
this user and item pair, namely 4.24.

30.07.2012


Methods Summary
Model Analysis
CF Ensemble Context

abcd
Example

 In addition, the model estimates
context-aware predictions, i.e., predictions where a context is
specified.
 In the figure there are two contexts, c1 and c2 (sun and
rain).

Example
 The model makes these two context-aware rating predictions
(4.94 and 3.84) by estimating from the available data two
additional parameters that model the influence of the
context on the item, b_ic1 and b_ic2.

 These two parameters describe the modifications to be made
to the non-context-aware prediction to take the context into
account. In the first case the predicted rating must be
increased by 0.7 and in the second case decreased by 0.4.
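
The arithmetic of this example can be reproduced with a short sketch. It assumes the context-aware prediction is simply the matrix factorization prediction plus one additive item-context parameter per active context; the function name predict and the dictionary layout are hypothetical, while the numbers 4.24, 0.7 and 0.4 come from the example above.

```python
def predict(mf_prediction, item_context_bias, item, contexts=()):
    """Add the learned item-context parameters to the context-free prediction.

    mf_prediction: rating predicted by standard matrix factorization.
    item_context_bias: dict mapping (item, context) to its parameter b_ic.
    contexts: the contexts that are active for this prediction.
    """
    return mf_prediction + sum(
        item_context_bias.get((item, c), 0.0) for c in contexts
    )


# Values from the example: b_ic1 = +0.7 (sun), b_ic2 = -0.4 (rain).
bias = {("i", "c1"): 0.7, ("i", "c2"): -0.4}
base = 4.24

no_context = predict(base, bias, "i")                # context-free prediction
sunny = round(predict(base, bias, "i", ["c1"]), 2)   # 4.24 + 0.7
rainy = round(predict(base, bias, "i", ["c2"]), 2)   # 4.24 - 0.4
```

A context without a learned parameter contributes nothing, so the prediction falls back to the context-free value.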

Predictive Model

 Context Aware Collaborative Filtering

Performance comparison: Mean Absolute Error

 The largest improvement over the non-personalized model based on
the item average is achieved, as expected, by personalizing the recommendations
("MF CF"). This gives an improvement of 5%.
 The personalized model can be further improved by contextualization ("MF CF +
Context"), producing an improvement of 7% with respect to the item average
prediction, and a 3% improvement over the personalized model.
 Taking contextual information into account, through both the modeling
approach and the rating acquisition process, can substantially improve
rating prediction accuracy.
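
For reference, the metric behind this comparison can be sketched as below. The ratings and MAE values in the usage lines are made up for illustration and are not the experimental data; the function names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between true and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Fraction by which the model lowers the baseline's MAE."""
    return (baseline_mae - model_mae) / baseline_mae


# Hypothetical ratings and predictions.
mae = mean_absolute_error([4, 2, 5], [3, 2, 3])
# Hypothetical MAE values for a baseline and an improved model.
gain = relative_improvement(1.0, 0.93)
```

Lower MAE is better, so a positive relative improvement means the model beats the baseline.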

Cross Domain

4 Cross Domain

Description
 Cross-domain recommenders can recommend products and services of
several domains that share resources (e.g., users, items, ratings,
features, latent patterns).
 Knowledge from one or several domains might be utilized in another
domain to improve recommendations.

Selected Techniques
 User-model mediation and aggregation

Overview

 The majority of recommender systems (RS) work in a single
domain, such as movies, books, tourism etc.

 However, human preferences may span across multiple
domains.

 Knowledge of a user’s behavior in different domains might
improve prediction in a specific domain.

 A company might hold knowledge about a user in one or more
domains other than the target recommendation domain and would
like to use it.

Motivation
 Sparsity and cold-start problems: cross-domain algorithms may
enrich the training data with data from other domains to reduce
sparsity.

 User-friendly systems: by reusing data that was collected for
one domain in other domains, systems can avoid interrupting users
to ask for feedback.

 Availability of cross-domain data: many e-commerce systems and
social networks contain information on users' preferences in several
domains. Thus, cross-domain information is available, which
motivates the search for effective algorithms that can use this
data to improve recommender system performance (e.g., x-loads
domains).

 Marketing – cross-selling of new products: marketing studies have found
that it is effective to promote products from different domains
to a user if they fit her buying patterns across domains.

State of the art techniques
 User-model mediation and aggregation
 This technique was suggested by Berkovsky et al. (2006, 2007, 2008).
 It addresses the sparsity challenge of recommender systems by
enriching the user model (UM) with data from a remote system.
 Requires overlap of users between domains.
 Evaluation was performed for sub-domains of the same domain.

 Content-based unified user model
 Ghani and Fano (2002) proposed generating a content-based user
model that can be used across domains.
 It extracts semantic features that might be relevant for many
domains and are predefined by domain experts (e.g., trendiness
vs. individualism).
 Not implemented or evaluated.

State of the art techniques
 Transfer learning (TL)
 A relatively young research area (since 1995) in machine learning.
 Aims at extracting knowledge that was learned for one task in a
domain and using it for a target task in a different domain.

 TL has recently been gaining attention for applications where
datasets are available only for specific domains.

Method 4

Cross Domain recommendation

User-model mediation and aggregation
Intuition and Approach

 This technique was suggested by Berkovsky et al. (2006, 2007,
2008) and addresses the sparsity challenge of recommender
systems by enriching the UM with data from a remote (source)
system.

 The suggested technique was demonstrated for the
collaborative filtering approach and is based on mediating
user model data from other domains to enrich the user's
model.

 A similar approach was presented by Gonzales et al. (2006),
who generate a unified UM that aggregates features
from different domains and maps the aggregated features
to the relevant domains.

Intuition and Approach
 Applying the mediation suggested above by Berkovsky et
al. requires:

 Overlapping users: mediation enriches the data about a specific
user with data about the same user from another domain (for
other items, and possibly also in another context).

 Same prediction task: user model data was mediated
from systems that implemented the same
prediction function (collaborative filtering), thus employing the
same UM (users' ratings on items).

 Similarity between domains: a method to identify such similarity
is needed, and the similarity should be integrated into the recommender
algorithm.

UM Aggregation approaches

Domain 1: Source
Domain 2: Target

Type 1
 K nearest neighbors are computed in the source domain.
 These neighbors are utilized to generate recommendations in the
target domain.
 This method is usable for a user who is new in the target domain
and has history in the source domain.

Type 2
 K nearest neighbors are computed in the source domain (Ks).
 K nearest neighbors are also computed in the target domain (Kt).
 The most similar K neighbors are selected from U(Ks, Kt).

Combine recommendation
 Consider the two domains as one integrated domain.
 As in Type 1, the set of K from domain 1 provides the nearest
neighbors.
 In this case, however, it is aggregated with the set of K nearest
neighbors within domain 2.
 From the aggregation results, the K users with the maximum cosine
similarity values are selected, and the prediction is made with
respect to those K nearest neighbors.
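
The Type 1 and Type 2 neighbor selection can be sketched as follows. This is an illustrative reading, not the authors' implementation: profiles are assumed to be dicts of item ratings, similarity is cosine similarity, U(Ks, Kt) is read as the union of the two neighbor sets, and a user found in both sets keeps the higher of the two similarities. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating profiles."""
    common = a.keys() & b.keys()
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(user, profiles, k):
    """K most similar users in one domain as (similarity, user) pairs."""
    if user not in profiles:
        return []  # e.g. a user who is new in this domain
    scored = [(cosine(profiles[user], p), other)
              for other, p in profiles.items() if other != user]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return scored[:k]


def type1_neighbours(user, source, k):
    """Type 1: neighbors come from the source domain only."""
    return [u for _, u in top_k(user, source, k)]


def type2_neighbours(user, source, target, k):
    """Type 2: best K out of the union of source and target neighbors."""
    best = {}
    for sim, other in top_k(user, source, k) + top_k(user, target, k):
        # Assumption: a user found in both sets keeps the higher similarity.
        best[other] = max(sim, best.get(other, 0.0))
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [u for u, _ in ranked[:k]]


# Hypothetical rating profiles in a source and a target domain.
source = {"u": {"a": 5, "b": 1}, "v": {"a": 4, "b": 2}, "w": {"a": 1, "b": 5}}
target = {"u": {"x": 4}, "w": {"x": 4}, "z": {"x": 1, "y": 5}}

t1 = type1_neighbours("u", source, k=1)
t2 = type2_neighbours("u", source, target, k=2)
```

Because Type 1 never looks at the target domain, it still returns neighbors for a user with no target-domain history, which is the cold-start case mentioned above.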

Method 4

Cross Domain recommendation


CBT – Codebook Transfer

Editor's Notes