SlideShare a Scribd company logo
1 of 47
1©MapR Technologies 2013-
Multi-input Recommendations
2©MapR Technologies 2013-
Me, Us
 Ted Dunning, Chief Application Architect, MapR
Committer PMC member, Mahout, Zookeeper, Drill
Bought the beer at the first HUG
 MapR
Distributes more open source components for Hadoop
Adds major technology for performance, HA, industry standard API’s
 Info
Hash tag - #mapr
See also - @ApacheMahout @ApacheDrill
@ted_dunning and @mapR
3©MapR Technologies 2013-
Topic For Today
 What is recommendation?
 What makes it different?
 What is multi-model recommendation?
 How can I build it using common household items?
4©MapR Technologies 2013-
Oh … Also This
 Detailed break-down of a live machine learning system running
with Mahout on MapR
 With code examples
5©MapR Technologies 2013-
I may have to
summarize
6©MapR Technologies 2013-
I may have to
summarize
just a bit
7©MapR Technologies 2013-
Part 1:
5 minutes of background
8©MapR Technologies 2013-
Part 2:
10 minutes: I want a pony
9©MapR Technologies 2013-
Part 3:
Implementation Examples
10©MapR Technologies 2013-
11©MapR Technologies 2013-
Part 1:
5 minutes of background
12©MapR Technologies 2013-
What Does Machine Learning Look Like?
13©MapR Technologies 2013-
Recommendations as Machine Learning
 Recommendation:
– Involves observation of interactions between people taking action (users)
and items for input data to the recommender model
– Goal is to suggest additional appropriate or desirable interactions
– Applications include: movie, music or map-based restaurant choices;
suggesting sale items for e-stores or via cash-register receipts
14©MapR Technologies 2013-
15©MapR Technologies 2013-
16©MapR Technologies 2013-
Part 2:
How recommenders work
(I still want a pony)
17©MapR Technologies 2013-
Recommendations
Recap:
Behavior of a crowd helps us
understand what individuals will do
18©MapR Technologies 2013-
Recommendations
Alice got an apple and a
puppy
Charles got a bicycle
Alice
Charles
19©MapR Technologies 2013-
Recommendations
Alice got an apple and a
puppy
Charles got a bicycle
Bob got an apple
Alice
Bob
Charles
20©MapR Technologies 2013-
Recommendations
What else would Bob like??
Alice
Bob
Charles
21©MapR Technologies 2013-
Recommendations
A puppy, of course!
Alice
Bob
Charles
22©MapR Technologies 2013-
You get the idea of how
recommenders work…
(By the way, like me, Bob
also wants a pony)
23©MapR Technologies 2013-
Recommendations
What if everybody gets a
pony?
?
Alice
Bob
Charles
Amelia What else would you
recommend for Amelia?
24©MapR Technologies 2013-
Recommendations
?
Alice
Bob
Charles
Amelia
If everybody gets a pony, it’s
not a very good indicator of
what to else predict...
25©MapR Technologies 2013-
Problems with Raw Co-occurrence
 Very popular items co-occur with everything (or why it’s not
very helpful to know that everybody wants a pony…)
– Examples: Welcome document; Elevator music
 Very widespread occurrence is not interesting as a way to
generate indicators
– Unless you want to offer an item that is constantly desired, such as razor
blades (or ponies)
 What we want is anomalous co-occurrence
– This is the source of interesting indicators of preference on which to
base recommendation
26©MapR Technologies 2013-
Get Useful Indicators from Behaviors
1. Use log files to build history matrix of users x items
– Remember: this history of interactions will be sparse compared to all
potential combinations
2. Transform to a co-occurrence matrix of items x items
3. Look for useful co-occurrence by looking for anomalous co-
occurrences to make an indicator matrix
– Log Likelihood Ratio (LLR) can be helpful to judge which co-occurrences
can with confidence be used as indicators of preference
– RowSimilarityJob in Apache Mahout uses LLR
27©MapR Technologies 2013-
Log Files
Alice
Bob
Charles
Alice
Bob
Charles
Alice
28©MapR Technologies 2013-
History Matrix: Users by Items
Alice
Bob
Charles
✔ ✔ ✔
✔ ✔
✔ ✔
29©MapR Technologies 2013-
Co-occurrence Matrix: Items by Items
-
1 2
1 1
1
1
2 1
How do you tell which co-occurrences are useful?.
0
0
0 0
30©MapR Technologies 2013-
Co-occurrence Binary Matrix
1
1not
not
1
31©MapR Technologies 2013-
Indicator Matrix: Anomalous Co-Occurrence
✔
Result: The marked row will be added to the indicator
field in the item document…
32©MapR Technologies 2013-
Indicator Matrix
✔
id: t4
title: puppy
desc: The sweetest little puppy ever.
keywords: puppy, dog, pet
indicators: (t1)
That one row from indicator matrix becomes the indicator field in the Solr
document used to deploy the recommendation engine.
Note: data for the indicator field is added directly to meta-data for a document in
Solr index. You don’t need to create a separate index for the indicators.
33©MapR Technologies 2013-
Internals of the Recommender Engine
33
34©MapR Technologies 2013-
Internals of the Recommender Engine
34
35©MapR Technologies 2013-
A Quick Simplification
 Users who do h (a vector of things a user has done)
 Also do r
Ah
AT
Ah( )
AT
A( )h
User-centric recommendations
(transpose translates back to things)
Item-centric recommendations
(change the order of operations)
A translates things into users
36©MapR Technologies 2013-
Nice. But we
can do better?
37©MapR Technologies 2013-
Ausers
things
38©MapR Technologies 2013-
A1 A2
é
ë
ù
û
users
thing
type 1
thing
type 2
39©MapR Technologies 2013-
A1 A2
é
ë
ù
û
T
A1 A2
é
ë
ù
û=
A1
T
A2
T
é
ë
ê
ê
ù
û
ú
ú
A1 A2
é
ë
ù
û
=
A1
T
A1 A1
T
A2
AT
2A1 AT
2A2
é
ë
ê
ê
ù
û
ú
ú
r1
r2
é
ë
ê
ê
ù
û
ú
ú
=
A1
T
A1 A1
T
A2
AT
2A1 AT
2A2
é
ë
ê
ê
ù
û
ú
ú
h1
h2
é
ë
ê
ê
ù
û
ú
ú
r1 = A1
T
A1 A1
T
A2
é
ëê
ù
ûú
h1
h2
é
ë
ê
ê
ù
û
ú
ú
40©MapR Technologies 2013-
Now again, without
the scary math
41©MapR Technologies 2013-
Part 3:
Implementation Examples
42©MapR Technologies 2013-
For example
 Users enter queries (A)
– (actor = user, item=query)
 Users view videos (B)
– (actor = user, item=video)
 ATA gives query recommendation
– “did you mean to ask for”
 BTB gives video recommendation
– “you might like these videos”
43©MapR Technologies 2013-
The punch-line
 BTA recommends videos in response to a query
– (isn’t that a search engine?)
– (not quite, it doesn’t look at content or meta-data)
44©MapR Technologies 2013-
Real-life example
 Query: “Paco de Lucia”
 Conventional meta-data search results:
– “hombres del paco” times 400
– not much else
 Recommendation based search:
– Flamenco guitar and dancers
– Spanish and classical guitar
– Van Halen doing a classical/flamenco riff
45©MapR Technologies 2013-
Real-life example
46©MapR Technologies 2013-
Search-based Recommendations
 Sample document
– Merchant Id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant id’s
– Indicator industry (SIC) id’s
– Indicator offers
– Indicator text
– Local top40
 Sample query
– Current location
– Recent merchant descriptions
– Recent merchant id’s
– Recent SIC codes
– Recent accepted offers
– Local top40
Original data
and meta-data
Derived from cooccurrence
and cross-occurrence
analysis
Recommendation
query
47©MapR Technologies 2013-
Me, Us
 Ted Dunning, Chief Application Architect, MapR
Committer PMC member, Mahout, Zookeeper, Drill
Bought the beer at the first HUG
 MapR
Distributes more open source components for Hadoop
Adds major technology for performance, HA, industry standard API’s
 Info
Hash tag - #mapr
See also - @ApacheMahout @ApacheDrill
@ted_dunning and @mapR

More Related Content

What's hot

Doing-the-impossible
Doing-the-impossibleDoing-the-impossible
Doing-the-impossibleTed Dunning
 
What is the past future tense of data?
What is the past future tense of data?What is the past future tense of data?
What is the past future tense of data?Ted Dunning
 
Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Ted Dunning
 
Which Algorithms Really Matter
Which Algorithms Really MatterWhich Algorithms Really Matter
Which Algorithms Really MatterTed Dunning
 
Strata 2014 Anomaly Detection
Strata 2014 Anomaly DetectionStrata 2014 Anomaly Detection
Strata 2014 Anomaly DetectionTed Dunning
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015Ted Dunning
 
Cognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approachesCognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approachesTed Dunning
 
Polyvalent recommendations
Polyvalent recommendationsPolyvalent recommendations
Polyvalent recommendationsTed Dunning
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningTed Dunning
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoopTed Dunning
 
Finding Changes in Real Data
Finding Changes in Real DataFinding Changes in Real Data
Finding Changes in Real DataTed Dunning
 
Buzz words-dunning-real-time-learning
Buzz words-dunning-real-time-learningBuzz words-dunning-real-time-learning
Buzz words-dunning-real-time-learningTed Dunning
 
Possible Visions for Mahout 1.0
Possible Visions for Mahout 1.0Possible Visions for Mahout 1.0
Possible Visions for Mahout 1.0Ted Dunning
 
Mathematical bridges From Old to New
Mathematical bridges From Old to NewMathematical bridges From Old to New
Mathematical bridges From Old to NewMapR Technologies
 
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterHow to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterDataWorks Summit
 
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseR + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseAllen Day, PhD
 

What's hot (20)

Doing-the-impossible
Doing-the-impossibleDoing-the-impossible
Doing-the-impossible
 
What is the past future tense of data?
What is the past future tense of data?What is the past future tense of data?
What is the past future tense of data?
 
Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015
 
Which Algorithms Really Matter
Which Algorithms Really MatterWhich Algorithms Really Matter
Which Algorithms Really Matter
 
Strata 2014 Anomaly Detection
Strata 2014 Anomaly DetectionStrata 2014 Anomaly Detection
Strata 2014 Anomaly Detection
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015
 
Cognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approachesCognitive computing with big data, high tech and low tech approaches
Cognitive computing with big data, high tech and low tech approaches
 
Polyvalent recommendations
Polyvalent recommendationsPolyvalent recommendations
Polyvalent recommendations
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine Learning
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoop
 
Finding Changes in Real Data
Finding Changes in Real DataFinding Changes in Real Data
Finding Changes in Real Data
 
Dunning ml-conf-2014
Dunning ml-conf-2014Dunning ml-conf-2014
Dunning ml-conf-2014
 
Buzz words-dunning-real-time-learning
Buzz words-dunning-real-time-learningBuzz words-dunning-real-time-learning
Buzz words-dunning-real-time-learning
 
T digest-update
T digest-updateT digest-update
T digest-update
 
GoTo Amsterdam 2013 Skinned
GoTo Amsterdam 2013 SkinnedGoTo Amsterdam 2013 Skinned
GoTo Amsterdam 2013 Skinned
 
Possible Visions for Mahout 1.0
Possible Visions for Mahout 1.0Possible Visions for Mahout 1.0
Possible Visions for Mahout 1.0
 
Mathematical bridges From Old to New
Mathematical bridges From Old to NewMathematical bridges From Old to New
Mathematical bridges From Old to New
 
How to Determine which Algorithms Really Matter
How to Determine which Algorithms Really MatterHow to Determine which Algorithms Really Matter
How to Determine which Algorithms Really Matter
 
Polyvalent Recommendations
Polyvalent RecommendationsPolyvalent Recommendations
Polyvalent Recommendations
 
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San JoseR + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
R + Storm Moneyball - Realtime Advanced Statistics - Hadoop Summit - San Jose
 

Viewers also liked

Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Cataldo Musto
 
Tutorial Mahout - Recommendation
Tutorial Mahout - RecommendationTutorial Mahout - Recommendation
Tutorial Mahout - RecommendationCataldo Musto
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineNYC Predictive Analytics
 
Buzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationBuzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationTed Dunning
 
Interaction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender SystemsInteraction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender SystemsUniversity of Bergen
 
Example: movielens data with mahout
Example: movielens data with mahoutExample: movielens data with mahout
Example: movielens data with mahoutGregg Barrett
 
Personalized news recommendation engine
Personalized news recommendation enginePersonalized news recommendation engine
Personalized news recommendation enginePrateek Sachdev
 
Your own recommendation engine with neo4j and reco4php - DPC16
Your own recommendation engine with neo4j and reco4php - DPC16Your own recommendation engine with neo4j and reco4php - DPC16
Your own recommendation engine with neo4j and reco4php - DPC16Christophe Willemsen
 
Recommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshRecommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshBigDataCloud
 
Co-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and SparkCo-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and Sparksscdotopen
 
A recommendation engine for your php application
A recommendation engine for your php applicationA recommendation engine for your php application
A recommendation engine for your php applicationMichele Orselli
 
Build Your Own Recommendation Engine
Build Your Own Recommendation EngineBuild Your Own Recommendation Engine
Build Your Own Recommendation EngineSri Ambati
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filteringsscdotopen
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsLei Guo
 
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)Jee Vang, Ph.D.
 
Developing a Movie recommendation Engine with Spark
Developing a Movie recommendation Engine with SparkDeveloping a Movie recommendation Engine with Spark
Developing a Movie recommendation Engine with SparkEdureka!
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engineKeeyong Han
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 
How to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on SparkHow to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on SparkCaserta
 

Viewers also liked (20)

Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014
 
Tutorial Mahout - Recommendation
Tutorial Mahout - RecommendationTutorial Mahout - Recommendation
Tutorial Mahout - Recommendation
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Buzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendationBuzz words-dunning-multi-modal-recommendation
Buzz words-dunning-multi-modal-recommendation
 
Interaction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender SystemsInteraction Design Patterns in Recommender Systems
Interaction Design Patterns in Recommender Systems
 
Example: movielens data with mahout
Example: movielens data with mahoutExample: movielens data with mahout
Example: movielens data with mahout
 
Personalized news recommendation engine
Personalized news recommendation enginePersonalized news recommendation engine
Personalized news recommendation engine
 
Recommendation engine
Recommendation engineRecommendation engine
Recommendation engine
 
Your own recommendation engine with neo4j and reco4php - DPC16
Your own recommendation engine with neo4j and reco4php - DPC16Your own recommendation engine with neo4j and reco4php - DPC16
Your own recommendation engine with neo4j and reco4php - DPC16
 
Recommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab GhoshRecommendation Engine Powered by Hadoop - Pranab Ghosh
Recommendation Engine Powered by Hadoop - Pranab Ghosh
 
Co-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and SparkCo-occurrence Based Recommendations with Mahout, Scala and Spark
Co-occurrence Based Recommendations with Mahout, Scala and Spark
 
A recommendation engine for your php application
A recommendation engine for your php applicationA recommendation engine for your php application
A recommendation engine for your php application
 
Build Your Own Recommendation Engine
Build Your Own Recommendation EngineBuild Your Own Recommendation Engine
Build Your Own Recommendation Engine
 
Latent factor models for Collaborative Filtering
Latent factor models for Collaborative FilteringLatent factor models for Collaborative Filtering
Latent factor models for Collaborative Filtering
 
Matrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender SystemsMatrix Factorization Techniques For Recommender Systems
Matrix Factorization Techniques For Recommender Systems
 
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)
A Quick Tutorial on Mahout’s Recommendation Engine (v 0.4)
 
Developing a Movie recommendation Engine with Spark
Developing a Movie recommendation Engine with SparkDeveloping a Movie recommendation Engine with Spark
Developing a Movie recommendation Engine with Spark
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engine
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 
How to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on SparkHow to Build a Recommendation Engine on Spark
How to Build a Recommendation Engine on Spark
 

Similar to Using Mahout and a Search Engine for Recommendation

DFW Big Data talk on Mahout Recommenders
DFW Big Data talk on Mahout RecommendersDFW Big Data talk on Mahout Recommenders
DFW Big Data talk on Mahout RecommendersTed Dunning
 
Buzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsBuzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsMapR Technologies
 
Predictive Analytics with Hadoop
Predictive Analytics with HadoopPredictive Analytics with Hadoop
Predictive Analytics with HadoopDataWorks Summit
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutTed Dunning
 
What's Right and Wrong with Apache Mahout
What's Right and Wrong with Apache MahoutWhat's Right and Wrong with Apache Mahout
What's Right and Wrong with Apache MahoutMapR Technologies
 
Goto amsterdam-2013-skinned
Goto amsterdam-2013-skinnedGoto amsterdam-2013-skinned
Goto amsterdam-2013-skinnedTed Dunning
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation WorkshopMapR Technologies
 
Crowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoopCrowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadooplucenerevolution
 
Recommendation as Search: Reflections on Symmetry
Recommendation as Search: Reflections on SymmetryRecommendation as Search: Reflections on Symmetry
Recommendation as Search: Reflections on SymmetryMapR Technologies
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFMLconf
 
Introduction to Mahout
Introduction to MahoutIntroduction to Mahout
Introduction to MahoutTed Dunning
 
Introduction to Mahout given at Twin Cities HUG
Introduction to Mahout given at Twin Cities HUGIntroduction to Mahout given at Twin Cities HUG
Introduction to Mahout given at Twin Cities HUGMapR Technologies
 
Big Data Analysis Patterns with Hadoop, Mahout and Solr
Big Data Analysis Patterns with Hadoop, Mahout and SolrBig Data Analysis Patterns with Hadoop, Mahout and Solr
Big Data Analysis Patterns with Hadoop, Mahout and Solrboorad
 
The power of hadoop in business
The power of hadoop in businessThe power of hadoop in business
The power of hadoop in businessMapR Technologies
 
Predictive Analytics San Diego
Predictive Analytics San DiegoPredictive Analytics San Diego
Predictive Analytics San DiegoMapR Technologies
 
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15MLconf
 
The Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsThe Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsNeo4j
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningMapR Technologies
 

Similar to Using Mahout and a Search Engine for Recommendation (20)

DFW Big Data talk on Mahout Recommenders
DFW Big Data talk on Mahout RecommendersDFW Big Data talk on Mahout Recommenders
DFW Big Data talk on Mahout Recommenders
 
Buzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal RecommendationsBuzz Words Dunning Multi Modal Recommendations
Buzz Words Dunning Multi Modal Recommendations
 
Predictive Analytics with Hadoop
Predictive Analytics with HadoopPredictive Analytics with Hadoop
Predictive Analytics with Hadoop
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache Mahout
 
What's Right and Wrong with Apache Mahout
What's Right and Wrong with Apache MahoutWhat's Right and Wrong with Apache Mahout
What's Right and Wrong with Apache Mahout
 
Goto amsterdam-2013-skinned
Goto amsterdam-2013-skinnedGoto amsterdam-2013-skinned
Goto amsterdam-2013-skinned
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
 
Crowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoopCrowd sourced intelligence built into search over hadoop
Crowd sourced intelligence built into search over hadoop
 
Recommendation as Search: Reflections on Symmetry
Recommendation as Search: Reflections on SymmetryRecommendation as Search: Reflections on Symmetry
Recommendation as Search: Reflections on Symmetry
 
Ted Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SFTed Dunning, Chief Application Architect, MapR at MLconf SF
Ted Dunning, Chief Application Architect, MapR at MLconf SF
 
Introduction to Mahout
Introduction to MahoutIntroduction to Mahout
Introduction to Mahout
 
Introduction to Mahout given at Twin Cities HUG
Introduction to Mahout given at Twin Cities HUGIntroduction to Mahout given at Twin Cities HUG
Introduction to Mahout given at Twin Cities HUG
 
Big Data Analysis Patterns with Hadoop, Mahout and Solr
Big Data Analysis Patterns with Hadoop, Mahout and SolrBig Data Analysis Patterns with Hadoop, Mahout and Solr
Big Data Analysis Patterns with Hadoop, Mahout and Solr
 
Big Data Paris
Big Data ParisBig Data Paris
Big Data Paris
 
Big Data Paris
Big Data ParisBig Data Paris
Big Data Paris
 
The power of hadoop in business
The power of hadoop in businessThe power of hadoop in business
The power of hadoop in business
 
Predictive Analytics San Diego
Predictive Analytics San DiegoPredictive Analytics San Diego
Predictive Analytics San Diego
 
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15
Ted Dunning, Chief Application Architect, MapR at MLconf ATL - 9/18/15
 
The Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsThe Case for Graphs in Supply Chains
The Case for Graphs in Supply Chains
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap Learning
 

More from Ted Dunning

Dunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxDunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxTed Dunning
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with KubernetesTed Dunning
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in KubernetesTed Dunning
 
Anomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forAnomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forTed Dunning
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningTed Dunning
 
Machine Learning Logistics
Machine Learning LogisticsMachine Learning Logistics
Machine Learning LogisticsTed Dunning
 
Tensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTed Dunning
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logisticsTed Dunning
 
How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownTed Dunning
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopTed Dunning
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7Ted Dunning
 

More from Ted Dunning (11)

Dunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxDunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptx
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with Kubernetes
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in Kubernetes
 
Anomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forAnomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look for
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine Learning
 
Machine Learning Logistics
Machine Learning LogisticsMachine Learning Logistics
Machine Learning Logistics
 
Tensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworks
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logistics
 
How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside Down
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on Hadoop
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 

Recently uploaded

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Using Mahout and a Search Engine for Recommendation

Editor's Notes

  1. Note to speaker: Move quickly through 1st two slides just to set the tone of familiar use cases but somewhat complicated under-the-covers math and algorithms… You don’t need to explain or discuss these examples at this point… just mention one or twoTalk track: Machine learning shows up in many familiar everyday examples, from product recommendations to listing news topics to filtering out that nasty spam from email….
  2. Note to trainers: the next series of slides start with a cartoon example just to set the pattern of how to find co-occurrence and use it to find indicators of what to recommend. Of course, real examples require a LOT of data of user-item interaction history to actually work, so this is just an analogy to get the idea across…
  3. * A history of what everybody has done. Obviously this is just a cartoon because large numbers of users and interactions with items would be required to build a recommender* Next step will be to predict what a new user might like…
  4. *Bob is the “new user” and getting apple is his history
  5. *Here is where the recommendation engine needs to go to work…Note to trainer: you might see if audience calls out the answer before revealing next slide…
  6. Now you see the idea of co-occurrence as a basis for recommendation…
  7. *Now we have a new user, Amelia. Like everybody else, she gets a pony… what should the recommender offer her based on her history?
  8. * Pony not interesting because it is so widespread that it does not differentiate a pattern
  9. Note to trainer: This is the situation similar to that in which we started, with three users in our history. The difference is that now everybody got a pony. Bob has apple and pony but not a puppy…yet
  10. *Binary matrix is stored sparsely
  11. *Convert by MapReduce into a binary matrixNote to trainer: Whether consider apple to have occurred with self is open question
  12. Old joke: all the world can be divided into 2 categories: Scotch tape and non-Scotch tape… This is a way to think about the co-occurrence
  13. Only important co-occurrence is puppy follows apple
  14. *Take that row of matrix and combine with all the meta data we might have…*Important thing to get from the co-occurrence matrix is this indicator..Cool thing: analogous to what a lot of recommendation engines do*This row forms the indicator field in a Solr document containing meta-data (you do NOT have to build a separate index for the indicators)Find the useful co-occurrence and get rid of the rest. Sparsify and get the anomalous co-occurrence
  15. Note to trainer: take a little time to explore this here and on the next couple of slides. Details enlarged on next slide
  16. *This indicator field is where the output of the Mahout recommendation engine are stored (the row from the indicator matrix that identified significant or interesting co-occurrence. *Keep in mind that this recommendation indicator data is added to the same original document in the Solr index that contains meta data for the item in question
  17. Note to trainer: you could ask the class to consider which data is related… for example, the first 3 bullets of the query relate to meta data for the item, not to data produced by the recommendation algorithm. The last 3 bullets refer to data in the sample query related to data in the indicator field(s) that were produced by the Mahout recommendation engine.