2. About Me
Prof. Lior Rokach
Department of Information Systems Engineering
Faculty of Engineering Sciences
Head of the Machine Learning Lab
Ben-Gurion University of the Negev
Email: liorrk@bgu.ac.il
http://www.ise.bgu.ac.il/faculty/liorr/
PhD (2004) from Tel Aviv University
3. Are You Being Served?
What are you looking for?
Demographic – Age, Gender, etc.
Context-
Casual/Event
Season
Gift
Purchase History
Loyal Customer
What is the customer currently wearing?
Style
Color
Social
Friends and Family
Companion
4. Recommender Systems
A recommender system (RS) helps people who lack
sufficient personal experience or competence to
evaluate the potentially overwhelming number of
alternatives offered by a Web site.
In their simplest form RSs recommend to their users
personalized and ranked lists of items
Provide consumers with information to help them
decide which items to purchase
7. What movie should I watch?
• The Internet Movie Database (IMDb)
provides information about
actors, films, television shows, television
stars, video games and production crew
personnel.
• Owned by Amazon.com since 1998
• 796,328 titles and 2,127,371 people
• More than 50M users per month.
8. The Netflix Prize story
In October 2006, Netflix announced it would give $1 million to
whoever created a movie-recommending algorithm 10% better than its
own.
Within two weeks, the DVD rental company had received 169
submissions, including three that were slightly superior to
Cinematch, Netflix's recommendation software
After a month, more than a thousand programs had been entered, and
the top scorers were almost halfway to the goal
But what started out looking simple suddenly got hard. The rate of
improvement began to slow. The same three or four teams clogged
the top of the leader-board.
Progress was almost imperceptible, and people began to say a 10
percent improvement might not be possible.
Three years later, on 21st of September 2009, Netflix announced the
winner.
30.07.2012
10. Where should I spend my vacation?
Tripadvisor.com
I would like to escape from this ugly and tedious work life and
relax for two weeks in a sunny place. I am fed up with
these crowded and noisy places … just the sand and the
sea … and some “adventure”.
I would like to bring my wife and my children on a
holiday … it should not be too expensive. I prefer
mountainous places… not too far from home.
Children parks, easy paths and good cuisine are a
must.
I want to experience the contact with a completely different
culture. I would like to be fascinated by the people and
learn to look at my life in a totally different way.
12. Usage in the market/products Recommendation
State-of-the-art solutions
Examined Solutions: Jinni, Taste Kid, Nanocrowd, Clerkdogs, Criticker, IMDb, Flixster, Movielens, Netflix, Shazam, Pandora, LastFM, YooChoose, Think Analytics, iTunes, Amazon
Method Commonness (one v per examined solution that uses the method)
Collaborative Filtering: v v v v v v v v v v v v
Content-Based Techniques: v v v v v v v v v v v
Knowledge-Based Techniques: v v v v v v v
Stereotype-Based Recommender Systems: v v v v v v v
Ontologies and Semantic Web Technologies for Recommender Systems: v v v
Hybrid Techniques: v v v v v v v
Ensemble Techniques for Improving Recommendation: v future
Context Dependent Recommender Systems: v v v v v v
Conversational/Critiquing Recommender Systems: v v
Community Based Recommender Systems and Recommender Systems 2.0: v v v v v
14. Recom Next Steps.
Presenting the Three selected methods
1 Collaborative Filtering: “Customers who bought this Item also bought…”
2 Ensemble: “The wisdom of crowds”
3 Context Based: “Tell me the music that I want to listen to NOW”
15. Recom Next Steps.
Presenting the Three selected methods
4 Cross Domain: “Can movies and books collaborate?”
5 Community: “Tell me who your friends are, and I will tell you who you are.”
6 Group: “Can you recommend a movie for me and my friends?”
17. Method 1
Collaborative Filtering
Description
The method of making automatic predictions (filtering) about the
interests of a user by collecting taste information from many
users (collaborating). The underlying assumption of the CF
approach is that those who agreed in the past tend to agree
again in the future.
Selected Techniques
kNN - Nearest Neighbor
SVD – Matrix Factorization
Similarity Weights Optimization (SWO)
18. Collaborative Filtering
Overview
The Idea
Predict the opinion the user will have of the different items,
and recommend the “best” items to each user, based on the
user’s previous likings and the opinions of other like-minded
users
[Figure: a user's negative and positive ratings, with an unknown rating (?) to be predicted]
19. Collaborative Filtering
How does it work?
“People who liked this also liked…”
User-to-User
Recommendations are made by finding users with similar tastes.
Jane and Tim both liked Item 2 and disliked Item 3; it seems they
might have similar taste, which suggests that in general Jane
agrees with Tim. This makes Item 1 a good recommendation for Tim.
This approach does not scale well for millions of users.
Item-to-Item
Recommendations are made by finding items that have similar
appeal to many users.
Tom and Sandra are two users who liked both Item 1 and Item 4.
That suggests that, in general, people who liked Item 4 will also
like Item 1, so Item 1 will be recommended to Tim. This approach
is scalable to millions of users and millions of items.
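The two views can be sketched on the slide's own story. This is a minimal illustration, not the method of any particular product: the like/dislike values (+1/-1), the assumption that Tim liked Item 4, and the two similarity functions are all chosen here only to mirror the example.

```python
# Likes (+1) and dislikes (-1) loosely following the slide's story.
# The exact values are illustrative assumptions.
ratings = {
    "Jane":   {"Item 1": 1, "Item 2": 1, "Item 3": -1},
    "Tim":    {"Item 2": 1, "Item 3": -1, "Item 4": 1},
    "Tom":    {"Item 1": 1, "Item 4": 1},
    "Sandra": {"Item 1": 1, "Item 4": 1},
}

def user_similarity(a, b):
    """+1 for every co-rated item the two users agree on, -1 otherwise."""
    common = set(ratings[a]) & set(ratings[b])
    return sum(1 if ratings[a][i] == ratings[b][i] else -1 for i in common)

def item_similarity(i, j):
    """Number of users who liked both items."""
    return sum(1 for r in ratings.values() if r.get(i) == 1 and r.get(j) == 1)

def recommend_user_to_user(target):
    """Find the most similar user, then suggest what they liked."""
    others = [u for u in ratings if u != target]
    neighbor = max(others, key=lambda u: user_similarity(target, u))
    liked = [i for i, v in ratings[neighbor].items()
             if v == 1 and i not in ratings[target]]
    return neighbor, liked

def recommend_item_to_item(target):
    """Score unseen items by their similarity to items the target liked."""
    liked = [i for i, v in ratings[target].items() if v == 1]
    all_items = {i for r in ratings.values() for i in r}
    unseen = all_items - set(ratings[target])
    scores = {c: sum(item_similarity(c, l) for l in liked) for c in unseen}
    return max(scores, key=scores.get)

print(recommend_user_to_user("Tim"))   # neighbor and suggested items
print(recommend_item_to_item("Tim"))   # best-scoring unseen item
```

Both routes arrive at Item 1 for Tim. The item-to-item similarities depend only on items, so they can be precomputed offline, which is one reason that variant is usually described as the more scalable one.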
20. Collaborative Filtering
Rating Matrix
Sample of a matrix
The ratings of users and items are represented in a matrix
All CF methods are based on such a rating matrix
[Figure: rating matrix. One axis holds the users in the system, the other the items in the system; each item may have a rating from each user]
21. Collaborative Filtering
What is new?
Few words about the techniques
Collaborative filtering is one of the most common
recommendation methods in the market today.
Up until two years ago, the kNN (“k” Nearest Neighbor)
technique was the norm. SVD (Singular Value Decomposition),
which proved successful in the Netflix recommendation
competition, became common in the last year. SWO is also a
newer technique, which seeks to enhance the veteran kNN.
In the following slides the three techniques will be
presented. It is important to get acquainted with the
techniques as they will be employed by the Ensemble.
24. kNN - Nearest Neighbor
High level explanation
k-nearest neighbors algorithm
A method for classifying objects based on closest
training examples in the feature space.
It is assumed that similar samples are grouped together
“k” is the number of neighbors considered; the neighbors are
chosen with a proximity measure
Recommendation example
Finding the most relevant song by comparing to a set of
already heard ones.
25. kNN - Nearest Neighbor
Schematic example
Current User
User Model = interaction history, from the 1st item rate to the 14th item rate (1 = Like, 0 = Dislike, ? = Unknown rating).
This user did not rate the item. We will try to predict a rating according to his Nearest Neighbors.
Other Users
There are other users who rated the same item. We are interested in the Nearest Neighbors.
Hamming Distance
The Hamming distance is named after Richard Hamming. In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
Nearest Neighbor
We are looking for the Nearest Neighbor: the one with the lowest Hamming distance.
Hamming distances of the other users: 5 6 6 5 4 8
Prediction
The prediction was made based on the nearest neighbor.
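The nearest-neighbor prediction can be sketched in a few lines. The like/dislike vectors below are made-up toy data, not the values from the slide's figure, and the helper names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def predict_with_nearest_neighbor(current, others, target):
    """Predict current[target] from the user whose remaining ratings
    are closest to the current user's in Hamming distance."""
    known = [i for i in range(len(current)) if i != target]

    def dist(other):
        return hamming([current[i] for i in known],
                       [other[i] for i in known])

    nearest = min(others, key=dist)
    return nearest[target], dist(nearest)

# 1 = Like, 0 = Dislike, None = the unknown rating to predict.
current = [1, 0, None, 1, 0, 1]
others = [
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
]
prediction, distance = predict_with_nearest_neighbor(current, others, target=2)
print(prediction, distance)
```

With k greater than 1, the same idea would take a vote or an average over the k users with the lowest distances instead of copying a single neighbor's rating.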
27. SVD - Singular Value Decomposition
Matrix factorization technique
SVD sample matrix
SVD is extraordinarily useful and
has many applications such as data
analysis, signal processing, pattern
recognition, image compression,
weather prediction, and Latent
Semantic Analysis or LSA
Probably most popular model
among Netflix contestants.
Has become the Collaborative
Filtering standard
The Singular Value Decomposition
(SVD) is a widely used technique to
decompose a matrix into several
component matrices, exposing
many of the useful and interesting
properties of the original matrix.
28. SVD - Singular Value Decomposition
Matrix factorization technique
SVD sample matrix
In the Recommendation Systems
field, SVD models users and items
as vectors of latent features
whose inner (dot) product produces
the rating of the item by the user
With SVD a matrix is factored into
a series of linear approximations
that expose the underlying
structure of the matrix.
The goal is to uncover latent
features that explain observed
ratings
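As a rough sketch of the latent-feature idea, the code below learns user and item vectors by stochastic gradient descent, in the style of the matrix factorization popularized during the Netflix Prize, rather than computing an exact algebraic SVD. The toy ratings, the number of factors, the learning rate and the regularization constant are all assumptions made for illustration.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating = inner product of user and item factor vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root mean squared error over the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

def train_mf(ratings, n_users, n_items, n_factors=2,
             epochs=500, lr=0.02, reg=0.02, seed=0):
    """Learn latent factors that explain the observed ratings."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Move both vectors to reduce the error, with L2 shrinkage.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# (user index, item index, rating) triples; unobserved cells are simply absent.
ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4), (1, 1, 5), (1, 3, 1),
           (2, 2, 5), (2, 3, 4), (3, 0, 1), (3, 2, 4), (3, 3, 5)]

P0, Q0 = train_mf(ratings, 4, 4, epochs=0)   # untrained factors
P, Q = train_mf(ratings, 4, 4)
before, after = rmse(P0, Q0, ratings), rmse(P, Q, ratings)
print(before, after)
print(predict(P, Q, 0, 3))   # estimate for a cell that was never rated
```

Only observed ratings drive the updates, so the missing cells of the rating matrix never have to be filled in before training.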
29. Latent Factor Models
Schematic example
[Figure: Users & Ratings (user ratings) pass through the SVD Process to Latent Concepts or Factors]
Hidden Concept: SVD reveals hidden connections and their strength
Revealed Concept: males who like watching serious movies
30. Latent Factor Models
Schematic example
[Figure: Users & Ratings and Latent Concepts or Factors]
Recommendation: SVD revealed a movie this user might like!
31. Latent Factor Models
Concept space
33. Similarity Weights Optimization
SWO vs. Nearest Neighbor
SWO
The similarity function (Pearson, Cosine) is used to determine the neighbors.
The weights for the weighted average are found via an optimization process which minimizes the total prediction error.
kNN
The similarity function (Pearson, Cosine) is used for both:
Determining the nearest neighbors.
Determining the weights in the weighted average of the prediction.
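The contrast can be made concrete with a small sketch. Here the kNN weights are normalized cosine similarities, while the SWO-style weights are fitted by plain gradient descent on the squared prediction error. The toy (mean-centered) ratings, the step size and the choice of gradient descent are assumptions; the slides do not specify the optimizer.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def sse(weights, neighbors, target):
    """Total squared error of the weighted-sum prediction on known ratings."""
    total = 0.0
    for i, t in enumerate(target):
        pred = sum(w * n[i] for w, n in zip(weights, neighbors))
        total += (t - pred) ** 2
    return total

def knn_weights(neighbors, target):
    """kNN: the similarity values themselves (normalized) act as weights."""
    sims = [cosine(n, target) for n in neighbors]
    norm = sum(abs(s) for s in sims)
    return [s / norm for s in sims]

def swo_weights(neighbors, target, lr=0.05, steps=300):
    """SWO-style: start from the kNN weights, then minimize the error."""
    w = knn_weights(neighbors, target)
    for _ in range(steps):
        grads = [0.0] * len(w)
        for i, t in enumerate(target):
            err = t - sum(wj * n[i] for wj, n in zip(w, neighbors))
            for j, n in enumerate(neighbors):
                grads[j] += -2.0 * err * n[i]
        w = [wj - lr * g for wj, g in zip(w, grads)]
    return w

# Mean-centered ratings of the target user and two neighbors on four items.
target = [1.0, -1.0, 0.5, -0.5]
neighbors = [[1.0, -1.0, 1.0, -1.0],
             [0.5, -0.5, -0.5, 0.5]]

w_knn = knn_weights(neighbors, target)
w_swo = swo_weights(neighbors, target)
print(w_knn, sse(w_knn, neighbors, target))
print(w_swo, sse(w_swo, neighbors, target))
```

In this toy case the similarity-based weights leave a residual error, while the optimized weights reproduce the target's ratings almost exactly; both variants still use the similarity function to pick the neighbors.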
34. Similarity Weights Optimization
Data Normalization
Need to identify relations and mix ratings across items/users
However, user- and item-specific variability masks the
fundamental relationships
Examples:
Some items are systematically rated higher
Some items were rated by users that tend to rate
low
Ratings change over time
Normalization is critical to the success of a kNN
approach
35. Similarity Weights Optimization
Data Normalization
Remove data characteristics that are unlikely to be
explained by kNN
Common practice is to use centering: Remove user- and
item-means
A more comprehensive approach eliminates additional
interfering variability such as time effects
Here, we normalize by removing the baseline estimates
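A minimal sketch of this normalization, assuming the common baseline form b_ui = mu + b_u + b_i (global mean plus a user offset plus an item offset). The slides do not spell out the baseline, so this form, the simple averaging used to estimate the offsets, and the toy ratings are assumptions.

```python
from statistics import mean

# (user, item, rating) triples; toy data for illustration only.
ratings = [("u1", "A", 5.0), ("u1", "B", 1.0),
           ("u2", "A", 3.0), ("u2", "B", 3.0)]

mu = mean(r for _, _, r in ratings)   # global mean rating

# Item offset: how far the item's ratings sit from the global mean.
item_bias = {
    item: mean(r - mu for _, i, r in ratings if i == item)
    for item in {i for _, i, _ in ratings}
}

# User offset: what remains, on average, after removing mu and the item offset.
user_bias = {
    user: mean(r - mu - item_bias[i] for u, i, r in ratings if u == user)
    for user in {u for u, _, _ in ratings}
}

# Residuals are what the neighborhood model is left to explain.
residuals = {
    (u, i): r - (mu + user_bias[u] + item_bias[i])
    for u, i, r in ratings
}
print(mu, item_bias, user_bias)
print(residuals)
```

Here item A is rated one point above average and item B one point below; once those effects are removed, the residuals show that u1 and u2 deviate in opposite directions, which is the signal a kNN model can work with.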
36. Similarity Weights Optimization
Neighborhood modeling through global optimization
A basic model
38. Method 2
Ensemble
Description
Ensemble methodology imitates the human nature to seek advice
before making any crucial decision.
“Two heads are better than one”.
Selected Techniques
Bagging (Breiman, 1996)
AdaBoost (Freund and Schapire, 1996)
Random Parameter Manipulation
The innovation is adopting the Ensemble concept from the
general machine learning field to the Recommender System domain.
39. Ensemble at 30,000 feet
Overview
When important decisions have to be made, society often
places its trust in groups of people. We have parliaments,
juries, committees, and boards of directors, whom we are
happy to have make decisions for us.
Ensemble imitates the human nature to seek advice before
making any crucial decision. It is achieved by weighing the
individual opinions, and combining them before reaching a final
decision, hence the names “The Wisdom of Crowds” and
“Committee of Experts”.
We can ensure that the ensemble will produce
results that are, in the worst case, no worse than
those of the worst classifier in the ensemble.
40. Ensemble
Overview
What is it?
If you think about it, Ensemble is not a question to be
answered.
So what is it then?
Ensemble is the answer.
So what is the question?
How to improve results!
41. Ensemble
Improving result…
Why do we care? Because...
Having improved
results will prevent
cases like this.
42. Ensemble
A short story
Francis Galton
Galton promoted statistics and invented the concept of
correlation.
In 1906 Galton visited a livestock fair and stumbled upon an
intriguing contest.
An ox was on display, and the villagers were invited to guess
the animal's weight.
Nearly 800 gave it a go and, not surprisingly, not one hit the
exact mark: 1,198 pounds.
Astonishingly, however, the average of those 800 guesses came
close - very close indeed. It was 1,197 pounds.
43. Ensemble
Does it always work?
Does Ensemble always work? No
Not all crowds
(groups) are wise.
For example, crazed
investors in a stock
market bubble.
44. Ensemble
Schematic Example
[Figure: Recommender 1, Recommender 2 and Recommender 3 feed a Combined Recommender]
Weak Learners
And they all may be just weak learners.
Problem Example
Linear recommenders cannot solve non-linearly separable
problems; however, their combination can.
45. Ensemble
Why use Ensembles?
Statistical Reasons, Risk reduction
Out of many recommender models with similar training / test errors,
which one shall we pick? If we just pick one at random, we risk the
possibility of choosing a really poor one.
Combining / averaging them may prevent us from making one such
unfortunate decision.
Computational Reasons
Every time we run a recommendation algorithm, we may find different
local optima.
Combining their outputs may allow us to find a solution that is closer
to the global minimum.
Too little data / too much data
Generating multiple recommenders with the re-sampling of the
available data / mutually exclusive subsets of the available data.
Representational Reasons
The recommender space may not contain the solution to a given
particular problem. However, an ensemble of such recommenders may.
46. Ensemble
The Diversity Paradox
Diversity vs. Accuracy
On one hand we expect the ensemble members to be as good as
possible, so they all target the same goal.
On the other hand they have to be independent, which means
different, hence lowering the accuracy.
There’s no real Paradox…
Ideally, all committee members would be right about everything!
If not, they should be wrong about different things.
47. Ensemble
Single–model Ensemble RS
Example configuration
Step 1: Users & Items ratings input (Rating Matrix)
Step 2: Generate different variations of the same input (Rating Matrix 1 … Rating Matrix M)
Step 3: The actual CF Method & Technique (Inducer, Training)
Step 4: Produce several recommendations (RS 1 … RS M)
Step 5: Combine the different recommendations (Ensemble RS)
Step 6: Generates more accurate predictions than each individual RS (ratings)
48. Netflix Prize
The Competition
The Netflix Prize story
In October 2006, Netflix announced it would give $1 million to
whoever created a movie-recommending algorithm 10% better than its
own.
Within two weeks, the DVD rental company had received 169
submissions, including three that were slightly superior to Cinematch,
Netflix's recommendation software
After a month, more than a thousand programs had been entered, and
the top scorers were almost halfway to the goal
But what started out looking simple suddenly got hard. The rate of
improvement began to slow. The same three or four teams clogged
the top of the leader-board.
Progress was almost imperceptible, and people began to say a 10
percent improvement might not be possible.
Three years later, on 21st of September 2009, Netflix announced the
winner.
49. Netflix Prize
The winning team used an Ensemble
FACT
Actually, the top 100 solutions were Ensemble based
50. Netflix Prize
And the winner is…
We have a winner! So why bother?
You may ask yourself,
why do we need to
further research &
develop the Ensemble?
Because it was solved in a
manually tailored way,
combining a set of
predefined methods.
There is plenty of room
for improvements.
51. Netflix Prize
The real winner
The real winner is the method!
One could say that the Ensemble techniques and methods helped tip the
scales.
While the algorithms and good knowledge of statistics go a long
way, it was ultimately the cross-team collaboration that ended the
contest.
It is easy to overlook the fact that many teams were actually
committees of experts by themselves.
"The Ensemble" team, appropriately named for the technique they used
to merge their results, consists of over 30 people.
Likewise, the winning team is a collaborative effort of several distinct
groups that merged their results.
54. Bagging
Overview
Introduced by Breiman (1996)
“Bagging” stands for “bootstrap aggregating”.
It is an ensemble method: a method of combining multiple predictors.
The intuition is that by using only part of the data and making
some data (randomly) have more impact, you get a better
variety of models that will reduce overfitting
55. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
Step 1
A random subset of the training set is taken.
56. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
57. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
Step 2
Some of the data in this subset is duplicated several times.
58. Bagging-based sampling of rating matrix
Schematic example
Bagging in action
From here to a recommendation
The input set is given to one
of the recommendation
methods.
It is repeated until every
method has an input set.
The average result (or most
common one) is picked.
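The bagging steps above can be sketched as follows. Sampling with replacement yields both effects described earlier: each input set contains only part of the original ratings, and some ratings appear several times. The base recommender here is a deliberately simple item-mean predictor, a hypothetical stand-in for kNN, SVD or SWO, and the toy ratings are made up.

```python
import random
from statistics import mean

def item_mean_model(sample):
    """Hypothetical weak recommender: predicts each item's mean rating,
    falling back to the global mean for items missing from the sample."""
    global_mean = mean(r for _, _, r in sample)
    by_item = {}
    for _, item, r in sample:
        by_item.setdefault(item, []).append(r)
    item_means = {item: mean(rs) for item, rs in by_item.items()}
    return lambda item: item_means.get(item, global_mean)

def bagged_predict(ratings, item, n_models=10, seed=0):
    """Train one model per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw len(ratings) ratings with replacement.
        sample = rng.choices(ratings, k=len(ratings))
        model = item_mean_model(sample)
        predictions.append(model(item))
    return mean(predictions)

ratings = [("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 2),
           ("u1", "B", 1), ("u2", "B", 2), ("u3", "B", 3)]
p = bagged_predict(ratings, "A")
print(p)
```

For like/dislike predictions, the averaging step would be replaced by a majority vote, which is the "most common one" option mentioned above.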
60. AdaBoost
Overview
Introduced by Freund and Schapire, 1996
“AdaBoost” stands for “Adaptive Boosting”.
Boosting - To boost a “weak” learning algorithm into a
“strong” learning algorithm
It is an ensemble method
Training samples are weighted differently across the
ensemble members
61. AdaBoost
Overview
The Process
We start with
building an initial
model.
Next that model is
improved, by
modifying the input
(training) set to
emphasize (for
example by
duplicating) the
part of the input
where the model
was less accurate.
The model is
rebuilt and checked
for its accuracy.
The process repeats
until the error of
the model is lower
than some bound.
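The process above can be sketched with the classic discrete AdaBoost for a like/dislike decision. This is an illustration under assumptions: the weak learner is a one-threshold decision stump on a single made-up feature, emphasis is implemented by re-weighting the samples rather than duplicating them, and the loop stops once the training error of the combined model reaches the given bound.

```python
import math

def stump_predict(stump, x):
    """A stump says `sign` below the threshold and `-sign` above it."""
    threshold, sign = stump
    return sign if x < threshold else -sign

def best_stump(xs, ys, weights):
    """Pick the stump with the lowest weighted error."""
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(set(xs))]
    best, best_err = None, float("inf")
    for threshold in thresholds:
        for sign in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict((threshold, sign), x) != y)
            if err < best_err:
                best, best_err = (threshold, sign), err
    return best, best_err

def ensemble_predict(model, x):
    """Weighted vote of all stumps built so far."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1 if score >= 0 else -1

def adaboost(xs, ys, max_rounds=10, error_bound=0.0):
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(max_rounds):
        stump, err = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)    # vote of this stump
        model.append((alpha, stump))
        # Emphasize the samples this stump got wrong.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
        train_err = sum(ensemble_predict(model, x) != y
                        for x, y in zip(xs, ys)) / n
        if train_err <= error_bound:
            break
    return model

# +1 = like, -1 = dislike; no single threshold separates these labels.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys)
print(len(model), [ensemble_predict(model, x) for x in xs])
```

On this toy data no single stump is right on more than four of the six points, yet the weighted vote of three stumps classifies all of them correctly, so the loop stops after three rounds.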
62. AdaBoost
Schematic example
Step 1: We start with building an initial model.
Step 2: Next that model is improved, by modifying the input set to emphasize the part of the input where the model was less accurate.
Step 3: The model is rebuilt and checked for its accuracy.
Final Step: The process repeats until the error of the model is lower than some bound.
[Figure: Training leads to the Combined recommender]
64. Random Parameter Manipulation
Overview
The idea is to have multiple variations of the same
recommendation technique
The variations are formed by changing the input parameters
systematically
The Ensemble is achieved by combining the modified
recommenders in order to produce a unified prediction
65. Random Parameter Manipulation
Schematic example
Example: Averaging multiple SVD models based on different values of F
Variations of SVD: different F values, 3 to 5
Ensemble: combined recommenders
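A small sketch of this example, assuming F denotes the number of latent factors (the slide does not define it). Each variation is a gradient-descent matrix factorization trained with a different F, and the ensemble prediction is the plain average of the individual predictions. The toy ratings and hyperparameters are illustrative.

```python
import random
from statistics import mean

def train_mf(ratings, n_users, n_items, n_factors,
             epochs=300, lr=0.02, reg=0.02, seed=0):
    """Gradient-descent matrix factorization with n_factors latent features."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(model, u, i):
    P, Q = model
    return sum(p * q for p, q in zip(P[u], Q[i]))

def rpm_ensemble(ratings, n_users, n_items, factor_values=(3, 4, 5)):
    """One recommender per parameter value: same technique, different F."""
    return [train_mf(ratings, n_users, n_items, n_factors=f)
            for f in factor_values]

def ensemble_predict(models, u, i):
    """Unified prediction: the average over all variations."""
    return mean(predict(m, u, i) for m in models)

ratings = [(0, 0, 5), (0, 1, 4), (0, 2, 1), (1, 0, 4), (1, 1, 5), (1, 3, 1),
           (2, 2, 5), (2, 3, 4), (3, 0, 1), (3, 2, 4), (3, 3, 5)]
models = rpm_ensemble(ratings, n_users=4, n_items=4)
print([predict(m, 0, 3) for m in models], ensemble_predict(models, 0, 3))
```

Other parameters, such as the learning rate or the regularization constant, could be varied in the same way; a weighted average is an equally plausible way of combining the variations.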
67. Ensemble
Testing coverage
Details
Each of the three CF
techniques will be tested
with an ensemble technique
There are 9 possible
combinations of techniques.
The diagram is color coded
for convenience.
69. Method 3
Context-Based
Description
Adapting the recommendations to the specific user context.
“Tell me the music that I want to listen to NOW”.
Selected Techniques
Item Split
Linear Models
70. Context-Based Recommender Systems
Overview
The recommender system uses additional data about the
context of an item consumption.
For example, in the case of a restaurant the time or the
location may be used to improve the recommendation
compared to what could be performed without this
additional source of information.
A restaurant recommendation for a Saturday evening when
you go with your spouse should be different than a restaurant
recommendation on a workday afternoon when you go with
co-workers
71. Context-Based Recommender Systems
Motivation
Motivating Examples
Recommend a vacation
Winter vs. summer
Recommend a purchase (e-retailer)
Gift vs. for yourself
Recommend a movie
To a student who wants to see it on Saturday
night with his girlfriend in a movie theater.
72. Context-Based Recommender Systems
Motivation
Motivating Examples
Recommend music
The music that we like to hear is greatly affected by a
context, which can be thought of as a mixture of our
feelings (mood) and the situation or location (the theme)
we associate it with.
Listening to Bruce Springsteen's "Born in the U.S.A." while driving
along the 101.
Listening to Mozart's Magic Flute while walking in
Salzburg.
73. Information Discovery: Example
“Tell me the music that I want to listen to NOW”
Musicovery.com
An Interactive
personalized WebRadio
A mood matrix proposes a relationship between music and mood.
20 genres and time periods, a popularity scale (hits, less known songs/discovery).
Covers all musical genres, rap to funk via electro, rock, disco… or classical.
Ethnographic studies have shown that people choose music pieces
according to their mood or mood change expectation.
Musicovery relied on
this principle to build
an effective
relationship between
music and emotion.
75. Context-Based Recommender Systems
Context vs. others
What do simple recommendation techniques ignore?
What is the user when asking for a recommendation?
Where (and when) the user is ?
What does the user (e.g., improve his knowledge
or really buy a product)?
Is the user or with other ?
Are there products to choose or only ?
Is the word economy or ?
Plain recommendation technologies forget to take into account
the user context.
76. Context-Based Recommender Systems
Foundations
Contextual Computing
Contextual computing refers to the enhancement of a user’s
interactions by understanding the user, the context, and the
applications and information being used, typically across a
wide set of user goals
Actively adapting the computational environment - for each
and every user - at each point of computation
Contextual computing approach focuses on understanding the
information consumption patterns of each user
Contextual computing focuses on the process, not only on the
output of the search process [Pitkow et al., 2002].
77. Context-Based Recommender Systems
Major obstacles
Major obstacles for contextual computing
Obtaining sufficient and reliable data describing the user context
Selecting the right information, i.e., what is relevant in a
particular personalization task
Understanding the impact of contextual dimensions on the
personalization process
Computationally modeling the contextual dimension in a more
classical recommendation technology
For instance: how to extend Collaborative Filtering to
include contextual dimensions?
79. Context-Based Recommender Systems
Item Split approach
Item Split - Intuition and Approach
The same item in different contextual conditions may produce
a different user experience
We consider the same item in different contexts as distinct
items
Research goal: Provide better music recommendations. Improve
Collaborative Filtering accuracy when the user context is known.
80. Context-Based Recommender Systems
Collaborative Filtering
Context in Collaborative Filtering
“Context is any information that can be used to characterize
the situation of an entity” [A.K.Dey, 2001]
In the Item Splitting approach - similarly to [Adomavicius et al.,
2005] - we model the context with a set of dynamic features
of the rating – representing conditions that can rapidly change
their state
When a user evaluates an item, the rating is recorded together
with the current state of the contextual variables
CF does not provide a direct method to integrate additional
information into the recommendation process
81. Context-Based Recommender Systems
Reduction-Based Approach
Reduce the problem of multi-dimensional recommendation to the
traditional two-dimensional User x Item
For each “value” of the contextual dimension(s) estimate the missing
ratings with a traditional method
Example
$R : U \times I \times T \to [0,1] \cup \{?\}$, where U, I and T are the User, Item and Time dimensions
$R^{D}(u, i, t) = R^{D[T=t]}(u, i)$
The context-dependent estimation for (u, i, t) is computed using a
traditional approach, in a two-dimensional setting, but using only the
ratings that have T=t.
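A minimal sketch of the reduction, with made-up ratings in the [0,1] range and a weekday/weekend time context. The two-dimensional estimator is a plain item mean, used here only as a stand-in for whatever traditional User x Item method would be plugged in; all names are hypothetical.

```python
def reduce_to_2d(ratings, t):
    """D[T=t]: keep only the ratings given in context t and drop T."""
    return [(u, i, r) for (u, i, ctx, r) in ratings if ctx == t]

def item_mean_estimate(ratings_2d, item):
    """Stand-in for a traditional 2D recommender (User x Item)."""
    values = [r for (_, i, r) in ratings_2d if i == item]
    return sum(values) / len(values) if values else None

def estimate(ratings, user, item, t):
    """R^D(u, i, t) = R^{D[T=t]}(u, i).
    The item-mean stand-in ignores `user`; a real CF method would not."""
    return item_mean_estimate(reduce_to_2d(ratings, t), item)

# (user, item, time context, rating)
ratings = [
    ("ann", "cafe", "weekend", 1.0),
    ("bob", "cafe", "weekend", 0.5),
    ("ann", "cafe", "weekday", 0.0),
    ("bob", "cafe", "weekday", 0.5),
]
print(estimate(ratings, "carl", "cafe", "weekend"))
print(estimate(ratings, "carl", "cafe", "weekday"))
```

The same item receives different estimates in the two contexts because each estimate sees only its own slice of the data. The cost is that every slice holds fewer ratings than the full matrix.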
82. Context-Based Recommender Systems
Reduction-Based Approach
[Figure: from the Multidimensional Model (user, item, ratings, user features, product features) to the Bi-dimensional Model; we use only the slice for T=t]
The idea is to reduce the problem into a manageable model.
83. Context-Based Recommender Systems
Reduction-Based vs. Item splitting
Reduction Based
Uses cross-validation as goodness of segmentation – Expensive
Segments are the same for all the items
Prediction is made using only the relevant segment
Item splitting
Uses external impurity measures (e.g., IG) - Heuristic based
Each item is tested for a split separately
Prediction is made using all the information, including split items
Bottom Line
The best known method (Reduction Based) is difficult to apply
(need to search in a huge space of contextual sectors).
We are proposing a more adaptive, and computationally
efficient approach.
84. Context-Based Recommender Systems
Item Split technique
Item Split - Intuition and Approach
Each item in the data base ( ) is a candidate for splitting
Context defines ( ) all possible splits of an item ratings vector
We test all the possible splits – we do not have many contextual
features
We choose one split (using a single contextual feature) that maximizes
an impurity measure and whose impurity is higher than a threshold
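The split selection can be sketched as below. The impurity measure here is simply the absolute difference between the mean rating inside and outside a contextual condition, a deliberately simple stand-in for measures such as information gain; the data, the threshold and all function names are illustrative assumptions.

```python
from statistics import mean

def best_split(item_ratings, threshold):
    """item_ratings: list of (rating, context dict) for ONE item.
    Returns the (feature, value) split with the highest impurity score,
    or None if no split scores above the threshold."""
    best, best_score = None, threshold
    conditions = {(f, v) for _, ctx in item_ratings for f, v in ctx.items()}
    for feature, value in sorted(conditions):
        inside = [r for r, ctx in item_ratings if ctx.get(feature) == value]
        outside = [r for r, ctx in item_ratings if ctx.get(feature) != value]
        if not inside or not outside:
            continue
        score = abs(mean(inside) - mean(outside))
        if score > best_score:
            best, best_score = (feature, value), score
    return best

def split_item(item_id, item_ratings, split):
    """Replace one item by two artificial items, one per side of the split."""
    feature, value = split
    return [
        (f"{item_id}[{feature}={value}]" if ctx.get(feature) == value
         else f"{item_id}[{feature}!={value}]", r)
        for r, ctx in item_ratings
    ]

track = [
    (5, {"weather": "sun", "time": "day"}),
    (4, {"weather": "sun", "time": "night"}),
    (2, {"weather": "rain", "time": "day"}),
    (1, {"weather": "rain", "time": "night"}),
]
split = best_split(track, threshold=2.0)
print(split)
print(split_item("track42", track, split))
```

After the split, the two artificial items are fed to an ordinary collaborative filtering method as if they were different items, so no change to the CF algorithm itself is needed.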
86. Context-Based Recommender Systems
Contextual Modelling approach
Overview
In these approaches the context data are explicitly used in the
prediction model.
There are several possibilities for using the contextual data.
For instance the context can be used to extend the definition
of the distance function in nearest neighbours approaches
The distance function must now also include a "context distance"
aspect, in addition to the user distance (CF) or item distance (CB).
87. Context-Based Recommender Systems
Linear Models approach
Overview
Presents an extension of the Matrix Factorization (MF) rating
prediction technique that incorporates contextual
information to adapt the recommendation to the user target
context.
In this approach one model parameter was introduced for
each contextual factor and music track genre pair.
This allowed learning how the context affects the ratings and
how they deviate from the classical personalized prediction.
88. Context-Based Recommender Systems: Linear Models approach
Example
The standard rating prediction for a user u and item i can be
computed by a standard matrix factorization method for
collaborative filtering. This is the plain predicted rating for
this user and item pair, namely 4.24.
89. Context-Based Recommender Systems: Linear Models approach
Example
In addition, the model that we used estimates
context-aware predictions, i.e., predictions where a context is
specified.
In the figure we have two contexts, c1 and c2 (sun and
rain).
90. Context-Based Recommender Systems: Linear Models approach
Example
The model makes these two context-aware rating predictions
(4.94 and 3.84) by estimating from the available data two
additional parameters, bic1 and bic2, that model the influence
of the context on the item.
These two parameters describe the modifications to be made
to the non-context-aware prediction to take the context into
account. In the first case the predicted rating must be
increased by 0.7 and in the second case decreased by 0.4.
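The arithmetic of this example can be written out as a short sketch. The numbers (4.24, +0.7, -0.4) are the ones quoted on the slides; the function name and dictionary keys are hypothetical, and the additive form is the simplest reading of "increased by" and "decreased by".

```python
def context_aware_prediction(base_prediction, context_bias):
    """Shift the non-context-aware prediction by the learned item-context bias."""
    return base_prediction + context_bias


base = 4.24  # standard MF prediction for user u and item i (from the slide)
item_context_bias = {"c1_sun": 0.7, "c2_rain": -0.4}  # bic1 and bic2 (from the slide)

predictions = {
    context: round(context_aware_prediction(base, bias), 2)
    for context, bias in item_context_bias.items()
}
# predictions == {'c1_sun': 4.94, 'c2_rain': 3.84}
```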
91. Context-Based Recommender Systems: Linear Models approach
Predictive Model
Context Aware Collaborative Filtering
92. Context-Based Recommender Systems: Linear Models approach
Performance comparison by Mean Absolute Error (MAE)
The largest improvement with respect to the non-personalized model based on
the item average is achieved, as expected, by personalizing the recommendations
("MF CF"). This gives an improvement of 5%.
The personalized model can be further improved by contextualization ("MF CF +
Context"), producing an improvement of 7% with respect to the item average
prediction, and a 3% improvement over the personalized model.
When the contextual information is taken into account, the modeling approach
and the rating acquisition process can substantially improve the rating
prediction accuracy.
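For readers unfamiliar with the metric, the sketch below shows how MAE and a relative improvement over a baseline are typically computed. The ratings and predictions are made-up toy values chosen for easy arithmetic; they are not the data behind the percentages reported above, and the function names are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Percentage reduction in MAE relative to the baseline."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Hypothetical toy data, NOT the experiment from the slides.
actual = [4.0, 2.0, 5.0, 3.0]
item_average = [3.0, 3.0, 3.0, 3.0]   # non-personalized baseline
personalized = [3.5, 2.5, 4.0, 3.0]   # some personalized model

baseline_mae = mean_absolute_error(actual, item_average)   # 1.0
model_mae = mean_absolute_error(actual, personalized)      # 0.5
gain = relative_improvement(baseline_mae, model_mae)       # 50.0
```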
94. Method 4: Cross Domain
Description
Cross-domain recommenders can recommend products and services of
several domains that share resources (e.g., users, items, ratings,
features, latent patterns).
Knowledge from one or several domains might be utilized in another
domain to improve recommendations.
Selected Techniques
User-model mediation and aggregation
95. Cross-Domain: Overview
The majority of recommender systems (RS) work in a single
domain, such as movies, books, tourism etc.
However, human preferences may span across multiple
domains.
Knowledge of a user’s behavior in different domains might
improve prediction in a specific domain.
A company might hold knowledge about a user in one or more
domains other than the target recommendation domain, and would
like to use it.
96. Cross-Domain: Overview
Motivation
Sparsity and cold-start problems: cross-domain algorithms may
enrich the training data with data from other domains to reduce
sparsity.
User-friendly systems: by reusing data that was collected for
one domain in other domains, systems can avoid interrupting users
to ask for feedback.
Availability of cross-domain data: many e-commerce systems and
social networks contain information about users' preferences in several
domains. Thus, cross-domain information is available, which
motivates the search for effective algorithms that can use this
data to improve recommender system performance (e.g., x-loads
domains).
Marketing – cross-selling of new products: marketing studies have found
that it is effective to promote products from different domains
to a user if they fit her buying patterns across domains.
97. Cross-Domain: Overview
State of the art techniques
User-model mediation and aggregation
This technique was suggested by Berkovsky et al. (2006, 2007, 2008).
Addresses the sparsity challenge of recommender systems by
enriching the user model (UM) with data from a remote system.
Requires overlap of users between domains
Evaluation was performed for sub-domains of the same domain
Content-based unified user-model
Ghani and Fano (2002) proposed generating a content-based user
model that can be used across domains.
Extracts semantic features that might be relevant for many
domains and are predefined by domain experts (e.g., trendiness
vs. individualism)
Not implemented or evaluated
98. Cross-Domain: Overview
State of the art techniques
Transfer learning (TL)
A relatively young research area (since 1995) in machine learning
Aims at extracting knowledge that was learned for one task in a
domain and using it for a target task in a different domain.
TL has recently been gaining attention for applications where
datasets are available only for specific domains
100. Cross-Domain: User-model Mediation and Aggregation
Intuition and Approach
This technique was suggested by Berkovsky et al. (2006, 2007,
2008) and addresses the sparsity challenge of recommender
systems by enriching the UM with data from a remote (source)
system.
The technique was demonstrated for the
collaborative filtering approach and is based on mediating
user model data from other domains to enrich the user's
model.
A similar approach was presented by Gonzales et al. (2006),
who generate a unified UM that aggregates features
from different domains and maps the aggregated features
to the relevant domains
101. Cross-Domain: User-model Mediation and Aggregation
Intuition and Approach
Applying the mediation suggested above by Berkovsky et
al. requires:
Overlapping users – mediation enriches the data about a specific
user with data about the same user from another domain (for
other items, and possibly also in another context)
Same prediction task – the mediated user-model data came
from systems that implemented the same
prediction function (collaborative filtering), and thus employed the
same UM (users' ratings on items).
Similarity between domains – a method to identify such similarity
is needed, and the similarity should be integrated in the recommender
algorithm.
102. Cross-Domain: User-model Mediation and Aggregation
UM Aggregation approaches
Domain 1 (Source), Domain 2 (Target)
Type 1
K nearest neighbors are computed in the source domain.
These neighbors are utilized to generate the recommendation in the target domain.
This method is usable for a user that is new in the target domain and has history in the source domain.
Type 2
K nearest neighbors are computed in the source domain (Ks).
K nearest neighbors are also computed in the target domain (Kt).
The most similar K neighbors are selected from U(Ks, Kt).
Combine recommendation
Consider the two domains as one integrated domain.
As in Type 1, the set of K from domain 1 provides the nearest neighbors.
But in this case it is aggregated with the set of K nearest neighbors within domain 2.
From the aggregation results, the K users with the maximum cosine similarity values were selected, and the prediction was made with respect to those K nearest neighbors.
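The three aggregation approaches can be sketched as below. This is one possible reading of the slide, not the published algorithm: it assumes the same users appear in both domains (rows aligned), that 0 means "not rated", that U(Ks, Kt) is the union of the two neighbour sets, and that the combined variant computes similarity on the concatenated rating matrix. All function names and the toy matrices are hypothetical.

```python
import numpy as np


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b) / denom


def knn(matrix, user, k):
    """Top-k (similarity, neighbour) pairs for `user`, by cosine similarity."""
    sims = [(cosine(matrix[user], matrix[v]), v)
            for v in range(len(matrix)) if v != user]
    return sorted(sims, reverse=True)[:k]


def type1_neighbours(source, target, user, k):
    """Type 1: neighbours come from the source domain only."""
    return knn(source, user, k)


def type2_neighbours(source, target, user, k):
    """Type 2: take Ks and Kt, then keep the k most similar of their union."""
    best = {}
    for sim, v in knn(source, user, k) + knn(target, user, k):
        best[v] = max(sim, best.get(v, 0.0))
    return sorted(((s, v) for v, s in best.items()), reverse=True)[:k]


def combined_neighbours(source, target, user, k):
    """Combined: treat both domains as one integrated rating matrix."""
    return knn(np.hstack([source, target]), user, k)


def predict(target, neighbours, item):
    """Similarity-weighted average of the neighbours' target-domain ratings."""
    rated = [(s, v) for s, v in neighbours if target[v, item] > 0 and s > 0]
    if not rated:
        return None
    return sum(s * target[v, item] for s, v in rated) / sum(s for s, _ in rated)


# Toy data: user 0 has history in the source domain but is new in the target.
source = np.array([[5, 4, 0], [5, 5, 0], [1, 0, 5], [0, 1, 4]], dtype=float)
target = np.array([[0, 0], [4, 0], [1, 5], [2, 4]], dtype=float)

n1 = type1_neighbours(source, target, user=0, k=2)
n2 = type2_neighbours(source, target, user=0, k=2)
n3 = combined_neighbours(source, target, user=0, k=2)
prediction = predict(target, n1, item=0)
```

For the new user 0, all three variants rank user 1 first, because only the source-domain ratings carry any similarity signal; the Type 1 prediction for target item 0 is therefore pulled towards user 1's rating.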
Similarity Weights Optimization (SWO): also known by the name "Neighborhood modeling through global optimization". In nearest-neighbour (NN) CF, the similarity function (Pearson, Cosine) is used both to determine the nearest neighbours and to set the weights in the weighted average of the prediction. In SWO the similarity function is used only to determine the neighbours. The weights for the weighted average are found via an optimization process that minimizes the total prediction error, with the weights as the optimized parameters of the error function. This technique requires data normalization.
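A minimal sketch of the SWO idea follows, under several simplifying assumptions that are not in the notes: a dense rating matrix, user-based neighbours, mean-centring as the normalization step, and a ridge-regularized least-squares fit as the optimization. The function names, the ridge parameter and the toy matrix are hypothetical.

```python
import numpy as np


def select_neighbours(centred, user, k):
    """Similarity (cosine on centred ratings) is used ONLY to pick neighbours."""
    norms = np.linalg.norm(centred, axis=1)
    sims = centred @ centred[user] / np.maximum(norms * norms[user], 1e-12)
    sims[user] = -np.inf  # never pick the user as their own neighbour
    return np.argsort(sims)[::-1][:k]


def swo_predict(ratings, user, item, k=2, ridge=0.1):
    """Predict ratings[user, item], treating that entry as held out."""
    train_items = [j for j in range(ratings.shape[1]) if j != item]
    # Normalization: subtract each user's mean over the training items.
    means = ratings[:, train_items].mean(axis=1)
    centred = ratings - means[:, None]
    neighbours = select_neighbours(centred[:, train_items], user, k)
    # Weights are fitted to minimise the squared prediction error on the
    # training items, instead of being copied from the similarity values.
    A = centred[np.ix_(neighbours, train_items)].T  # items x neighbours
    y = centred[user, train_items]
    weights = np.linalg.solve(A.T @ A + ridge * np.eye(k), A.T @ y)
    return float(means[user] + weights @ centred[neighbours, item])


# Hypothetical dense toy matrix: rows are users, columns are items.
ratings = np.array([
    [5, 4, 1, 2, 5],
    [5, 5, 1, 1, 5],
    [4, 4, 2, 2, 4],
    [1, 2, 5, 4, 1],
], dtype=float)
pred = swo_predict(ratings, user=0, item=4)
```

In plain NN CF the two selected neighbours would be weighted by their similarity values; here the weights come out of the least-squares fit, which is the distinction the note draws.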
In some situations the system can be asked for a recommendation tailored for a group of people. For example, if a family is sitting together watching TV, the system needs to recommend something that suits the family as a whole. A sports show might be more interesting for the father, but would leave some other members of the family unsatisfied. In some systems the group is dynamic, and the members of the group change over time, which requires constant adjustments on the system's part. The satisfaction of individuals may be a complex matter: for example, if the TV show makes the children happy, then the mother may also be (indirectly) happy just because her children are happy. In some cases multiple items are recommended to the group. For example, in a trip recommender there is time to visit 4 different places within a day's trip, and different members prefer to visit different locations [1,2,3].