Kostiantyn Kudriavtsev
April 2014
Creating Impersonal
Recommendation
system in the BigData
era
Agenda
1. Recommendation system overview
2. Different approaches to building a recommendation system
3. Impersonal recommendation system in theory
4. Impersonal recommendation system in practice
Recommendation system
The goal of a recommendation system is to predict the degree
to which a user will like or dislike a set of items, such as
goods or services.
Recommendation systems have become extremely common in
recent years and are applied in a variety of applications and
fields. The most popular ones are goods, movies, music,
news, books, research articles, search queries, social tags,
restaurants, financial services, life insurance and people
(social networks and online dating).
Examples of using
recommenders
Amazon uses a recommendation system to
increase sales by 35% and suggests goods
based on the user’s previous experience and
frequently bought goods
Netflix suggests movies based on the behaviour
of similar users and the user’s previous ratings
(result: 2 of 3 movies are watched after a
recommendation)
Pandora Radio suggests music based on
the user’s previous experience
Examples of using
recommenders
In 2012, Target predicted a woman’s
pregnancy before a medical test, based
on changes in her shopping
behaviour
Possible approaches
There are several totally different approaches to building a
recommendation system:
❖ Collaborative filtering – based on user interactions (likes,
views, purchases); extremely popular in online services, shops, etc.
❖ Knowledge-based – relies on knowledge about items and user needs;
commonly used for impersonal recommendations
❖ Content-based – similarity of items results in suggestions;
commonly used to suggest text articles and songs
❖ Hybrid – combines the other approaches
Collaborative filtering
❖ Collaborative filtering systems, also known as social-filtering
systems, aggregate data about customers’ preferences or purchasing
habits. Then they give recommendations based on similarity between
users or similarity in overall behaviour patterns.
❖ For example, Netflix uses a tuned collaborative filtering
algorithm to suggest movies. If user U1 likes movie M1,
and user U2 likes movies M1 and M2, then movie M2
will be recommended to user U1
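The U1/U2 example above can be sketched as a toy user-based collaborative filter. This is only an illustration under simple assumptions (Jaccard similarity over sets of liked movies, hypothetical function names); it is not the algorithm Netflix actually uses.

```python
# Toy user-based collaborative filtering (illustrative assumptions only).
likes = {
    "U1": {"M1"},
    "U2": {"M1", "M2"},
}

def jaccard(a, b):
    """Similarity of two sets of liked items (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, likes):
    """Score items the user has not liked yet by the similarity of the users who liked them."""
    scores = {}
    for other, items in likes.items():
        if other == user:
            continue
        sim = jaccard(likes[user], items)
        if sim == 0:
            continue  # users with nothing in common contribute nothing
        for item in items - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by item name for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))

print(recommend("U1", likes))
```

U1 and U2 share M1, so M2 is suggested to U1, while U2 gets no suggestions because U1 likes nothing new.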
Collaborative filtering
❖ The users’ behaviour history (views, clicks, purchases) is required to
implement a collaborative filtering recommender. The idea is to
find users with similar preferences and give them
recommendations based on the preferences of similar users.
❖ In practice, this approach requires access to user profiles and the
capability to save each action (both technical and legal). After
that, analysis can be run to get a list of preferences for each user.
❖ There is a cold start problem: it is not possible to produce
recommendations for a new user, because similar users are
not known yet
Knowledge based
recommenders
❖ Suggest products/services based on inferences about a
user’s preferences and needs. There are several different
types of these systems: some of them use prebuilt/already
known rules, the others build these rules dynamically.
❖ Unlike collaborative filtering, this approach doesn’t require
user profiles. Recommendations may be given based on
some predefined or dynamically created rules.
❖ This approach may be used not only for online applications,
but also for different offline use cases such as retail
Knowledge based
recommenders
For example, there is a recommender built by Yhat that
suggests a new sort of beer to try based on knowledge about
beer (e.g. a user who likes a light lager with a known aroma,
palate, etc. will like a similar beer XXX).
http://jeweell.com/ct/food/1133467-beer.html
Content-based
recommenders
❖ Content-based recommenders are based on machine learning
research (particularly, clustering and classification). They are
commonly used by news aggregators to suggest new stories the
user might like to read and to cluster them into different groups.
❖ For example, Google News recommendations for the article:
Hybrid approach
❖ Combines the previously described methods to reach the best
performance.
❖ There is a well-known cold start problem, when the algorithm
doesn’t have data to give recommendations for a new
user/product. It can be solved by using different
approaches to give recommendations for new and well-known
users or products. For example, goods might be
recommended by collaborative filtering, but a knowledge-based
recommender will be used for new users/products (for which we
don’t have history yet)
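The cold start fallback described above can be sketched as a simple dispatcher. The two recommender functions and the min_events threshold are hypothetical placeholders, not part of the original talk.

```python
# Minimal hybrid dispatcher (all names and the threshold are assumptions).

def collaborative_recommender(history):
    """Placeholder for a collaborative filtering model that needs user history."""
    return ["item-from-similar-users"]

def knowledge_based_recommender(history):
    """Placeholder for a rule-based recommender that works without history."""
    return ["item-from-rules"]

def hybrid_recommend(history, min_events=5):
    """Use collaborative filtering only when enough history is available."""
    if len(history) >= min_events:
        return collaborative_recommender(history)
    return knowledge_based_recommender(history)

print(hybrid_recommend([]))                           # new user: rules are used
print(hybrid_recommend(["a1", "a2", "a3", "a4", "a5"]))  # known user: collaborative filtering
```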
Which approach to choose?
❖ In fact, a thorough analysis is required to choose the
correct approach for each use case.
❖ Several approaches may be used to solve the same problem,
and the correct one is not easy to choose, because a lot
of factors influence the performance of recommendations
and different goals may result in different approaches.
Which approach to choose?
Let’s imagine a user living in Lviv with some café preferences. He is making a short
weekend trip to London. What could be recommended to him in London?
All previously mentioned
approaches are applicable to
answer this question:
•Collaborative filtering
•Knowledge-based
•Content-based
Impersonal recommender
The idea of an impersonal recommender is to give
recommendations not for a particular user, but in general.
For instance, it may be used in retail to find complementary
goods. There are some really non-obvious cases:
Wal-Mart discovered that diapers are sold together with
expensive beer on Friday evenings. Placing them together
led to geometric growth in sales.
Impersonal recommender
Applicable in different areas:
• retail, by employees, to increase
revenue/sales
• e-commerce, as a low-budget
approach to making
recommendations on a website
• interactive navigation kiosks
http://smartcity.prom.ua/g2763766-interaktivnyj-sensornyj-kiosk
Data Science way of getting
things done
http://www.tomatosphere.org/teacher-resources/teachers-guide/principal-investigation/scientific-method.cfm
The problem
There is a history of customers’ actions:
{a1, a2, a3}
{a2, a3, a5, a6}
{a4, a2}
{a1, a5, a6, a3}
{a3, a5, a2}
…
What should we suggest to a customer who has already
performed {a2, a5} (let’s assume that order doesn’t matter)?
Naive approach: frequent
item-sets
Affinity analysis is used to build Frequent Item-Sets, which
are widely used in Market Basket Analysis.
Several algorithms were created to perform affinity
analysis (Apriori, FP-Growth).
Unfortunately, it doesn’t work here. Frequent Item-Sets don’t
filter out already-purchased goods and “cannibals”.
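To make the naive approach concrete, here is a brute-force count of item-sets over the five example baskets from "The problem" slide. It is a sketch only: real implementations such as Apriori or FP-Growth avoid enumerating every combination, and the min_count and max_size parameters are assumptions chosen for this toy data.

```python
from collections import Counter
from itertools import combinations

# The five example baskets from "The problem" slide.
transactions = [
    {"a1", "a2", "a3"},
    {"a2", "a3", "a5", "a6"},
    {"a4", "a2"},
    {"a1", "a5", "a6", "a3"},
    {"a3", "a5", "a2"},
]

def frequent_itemsets(transactions, min_count, max_size=3):
    """Count every item-set up to max_size and keep those seen at least min_count times."""
    counts = Counter()
    for basket in transactions:
        for size in range(1, max_size + 1):
            # sorted() gives each item-set one canonical tuple form.
            for combo in combinations(sorted(basket), size):
                counts[combo] += 1
    return {items: n for items, n in counts.items() if n >= min_count}

for items, n in sorted(frequent_itemsets(transactions, min_count=3).items()):
    print(items, n)
```

With min_count=3 the frequent item-sets include ("a2", "a3"), which contains a2. For a customer who already has {a2, a5}, an item-set alone does not say which item is the new suggestion, which is the limitation noted above.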
Next step: association rules
Association rules are actively used in Market Basket Analysis and may be
effectively used for creating recommendations.
A general association rule looks like:
A => B,
meaning that a purchase of A usually leads to a purchase of B (the rule is user-independent).
A rule has several statistical characteristics (support, confidence, lift) that
show the strength of the rule and may be used for high-quality recommendations
Rules for recommendation
It’s not enough to build rules; they must also be correctly interpreted later. The
most important properties of each rule are support (shows how
important/frequent the rule is), confidence (how much you can rely on the rule) and lift.
Lift is derived from Bayes’ theorem and shows the positive/negative correlation
between the head and tail of a rule:
head => tail
Thresholds for all these parameters must be chosen for each particular case. In general,
• lift < 1 means negative correlation (the rule works, but has a negative effect)
• lift ~ 1 means no correlation (the rule doesn’t work)
• lift > 1 means positive correlation
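A minimal sketch of how these three measures could be computed for the five example baskets from "The problem" slide, using the usual definitions: support(head => tail) = P(head and tail), confidence = P(tail | head), lift = confidence / P(tail). The function names and the min_lift threshold are illustrative assumptions. For brevity the sketch scores only rules whose head is the customer's whole basket, rather than mining all rules first.

```python
# The five example baskets from "The problem" slide.
transactions = [
    {"a1", "a2", "a3"},
    {"a2", "a3", "a5", "a6"},
    {"a4", "a2"},
    {"a1", "a5", "a6", "a3"},
    {"a3", "a5", "a2"},
]

def count(itemset, transactions):
    """Number of baskets that contain every item of itemset."""
    return sum(1 for basket in transactions if itemset <= basket)

def rule_metrics(head, tail, transactions):
    """Return (support, confidence, lift) of the rule head => tail."""
    n = len(transactions)
    n_head = count(head, transactions)
    n_tail = count(tail, transactions)
    n_both = count(head | tail, transactions)
    support = n_both / n
    confidence = n_both / n_head if n_head else 0.0
    # lift = P(head and tail) / (P(head) * P(tail)), written with raw counts.
    lift = (n_both * n) / (n_head * n_tail) if n_head and n_tail else 0.0
    return support, confidence, lift

def recommend(basket, transactions, min_lift=1.0):
    """Suggest items not yet in the basket, ranked by confidence, then lift."""
    candidates = set().union(*transactions) - basket
    scored = []
    for item in candidates:
        support, confidence, lift = rule_metrics(basket, {item}, transactions)
        if support > 0 and lift > min_lift:
            scored.append((item, support, confidence, lift))
    return sorted(scored, key=lambda r: (-r[2], -r[3], r[0]))

for item, support, confidence, lift in recommend({"a2", "a5"}, transactions):
    print(item, support, confidence, lift)
```

For the basket {a2, a5} this ranks a3 first (it appears in both baskets that contain a2 and a5), followed by a6. Already-purchased items are excluded by construction, which addresses the weakness of plain frequent item-sets.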
Online recommender
evaluation
Of course, a recommender does not
end with generating rules.
The remaining task is to evaluate the
quality of the generated rules. This
makes it possible not only to
compare different models (using
A/B testing), but also to use the
clickstream to improve the rules.
http://www.sitedoublers.com/blog/multivariate-test-victorias-secret
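One common way to compare two models in an A/B test is a two-proportion z-test on click-through rates. The sketch below is an assumption about how such a comparison could be done, not the method used in the talk, and the click counts are made-up illustration values.

```python
from math import erf, sqrt

def two_proportion_z_test(clicks_a, shown_a, clicks_b, shown_b):
    """Return (z, two-sided p-value) for the difference in click-through rate B - A."""
    rate_a = clicks_a / shown_a
    rate_b = clicks_b / shown_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical numbers: model A got 100 clicks on 1000 recommendations, model B got 130.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(round(z, 2), round(p_value, 3))
```

A small p-value suggests the difference between the two rule sets is unlikely to be noise; the significance threshold itself has to be chosen for each case.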
Online recommender
improvement
❖ Users’ reactions to recommendations may be used to
improve the quality of the recommender. For example, it’s
possible to save successful/ignored recommendations
and use this information to improve newly generated
recommendations.
❖ This is quite important, because user preferences are not
stable and change over time.
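A minimal sketch of such a feedback loop: each rule keeps decayed counters of shown and accepted recommendations, so that recent reactions weigh more than old ones. The class name, the decay factor and the use of an acceptance rate are assumptions for illustration; the talk does not specify how feedback is applied.

```python
from collections import defaultdict

class RuleFeedback:
    """Tracks how often each rule's recommendation was accepted, favouring recent events."""

    def __init__(self, decay=0.9):
        self.decay = decay                  # weight kept by older observations
        self.shown = defaultdict(float)
        self.accepted = defaultdict(float)

    def record(self, rule, accepted):
        """Register one shown recommendation and whether the user followed it."""
        self.shown[rule] = self.shown[rule] * self.decay + 1
        self.accepted[rule] = self.accepted[rule] * self.decay + (1 if accepted else 0)

    def acceptance_rate(self, rule):
        """Decayed share of accepted recommendations; 0.0 for rules never shown."""
        shown = self.shown[rule]
        return self.accepted[rule] / shown if shown else 0.0

feedback = RuleFeedback(decay=0.5)
rule = ("a2,a5", "a3")          # hypothetical rule key: (head, tail)
feedback.record(rule, accepted=True)
feedback.record(rule, accepted=False)
print(feedback.acceptance_rate(rule))
```

The resulting rate could be combined with confidence and lift when ranking rules, so that rules users keep ignoring gradually drop out of the suggestions.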
Technology stack
❖ There are a lot of already implemented solutions for building
association rules, made by Oracle, SAS, Microsoft, etc.
❖ However, in the new world of unstructured/semi-structured data
and growing data volumes, this is not enough. The quality of a recommender
depends on the amount of data used to train it. The more
data is available for analysis, the better.
❖ The trend in EDW (enterprise data warehouse) is to use Hadoop as the
main storage and processing system
❖ Here comes a Hadoop-centric solution…
Overall architecture
Apache Hadoop
Hadoop is designed to store and process petabytes of data and is an
ideal choice to build a recommendation system on top of.
Hadoop provides a wide range of tools for efficient data processing as
well as a specialised library for data science needs (Mahout), e.g. for
building a recommender
Apache Hadoop is an open-source software framework for storage
and large-scale processing of data-sets on clusters of commodity
hardware.
ElasticSearch
Elasticsearch is a search server based on Lucene. It provides a
distributed, multitenant-capable full-text search engine with a
RESTful web interface and schema-free JSON documents. It
provides scalable search, has near real-time search, and supports
multitenancy.
ElasticSearch is used by GitHub, Foursquare, Etsy, SoundCloud,
Xing and Wikimedia and can handle several TB of data.
ElasticSearch will be used for storing rules and serving requests
Architecture
Questions?
Thank you for your attention

Impersonal Recommendation system on top of Hadoop

  • 1. Kostiantyn Kudriavtsev April 2014 Creating Impersonal Recommendation system in the BigData era
  • 2. Agenda 1. Recommendation system overview 2. Different approaches to build recommendation system 3. Impersonal recommendation system in theory 4. Impersonal recommendation system in practice
  • 3. Recommendation system The goal of a recommendation system is to predict the degree to which an user will like or dislike a set of items, such as goods or services. Recommendation systems have become extremely common in recent years, and are applied in a variety of applications and fields. The most popular ones are goods, movies, music, news, books, research articles, search queries, social tags, restaurants, financial services, live insurances and people (social networks and online dating).
  • 4. Examples of using recommenders Amazon uses recommendation system to increase sales by 35% and suggests goods based on previous user’s experience and the frequenters bought goods Netflix suggests movies based on behaviour of similar users and previous user’s rating (result: 2 of 3 movies are watched after recommendation)
  • 5. Pandora radio suggest music base on previous user’s experience Examples of using recommenders In 2012, Target predicted woman pregnant before medical test based on changes in her shopping behaviour
  • 6. Possible approaches There are several total different approaches to build recommendation system: ❖ Collaborative filtering – based on users interaction (likes, views, buys); extremely popular on online services, shops, etc ❖ Knowledge base – pursue knowledge-based approach; common used for impersonal recommendations ❖ Content based – similarity of items results in suggestions; common used to suggest text articles, songs ❖ Hybrid – combine the others approaches
  • 7. Collaborative filtering ❖ Also known as social-filtering systems, aggregate data about customer’s preferences or purchasing habits. Then they give recommendations based on similarity between users or similarity in overall behaviour patterns. ❖ For example, Netflix uses tuned collaborative filtering algorithm to suggest movies. If user U1 likes movie M1, and user U2 likes movies M1 and M2 then movie M2 will be recommended for user M1
  • 8. Collaborative filtering ❖ The history of user behaviour (views, clicks, purchases) is required to implement a collaborative filtering recommender. The idea is to find users with similar preferences and give them recommendations based on the preferences of those similar users. ❖ In practice, this approach requires access to user profiles and the capability (both technical and legal) to save each action. After that, analysis can be run to get a list of preferences for each user. ❖ There is a cold start problem: it is not possible to get recommendations for a new user, because similar users are not known yet.
  • 9. Knowledge based recommenders ❖ Suggest products/services based on inferences about a user’s preferences and needs. There are several different types of these systems: some of them use prebuilt/already known rules, while others build these rules dynamically. ❖ Unlike collaborative filtering, this approach doesn’t require user profiles. Recommendations may be given based on predefined or dynamically created rules. ❖ This approach may be used not only for online applications, but also for various offline use cases such as retail.
  • 10. Knowledge based recommenders For example, there is a recommender built by Yhat that suggests a new kind of beer to try based on knowledge about beer (i.e. a user who likes a light lager with known aroma, palate, etc. will like a similar beer XXX). http://jeweell.com/ct/food/1133467-beer.html
  • 11. Content-based recommenders ❖ Content-based recommenders are based on machine learning research (particularly clustering and classification). They are commonly used by news aggregators to suggest new stories the user might like to read and to cluster them into different groups. ❖ For example, Google News recommendations for the article:
  • 12. Hybrid approach ❖ Combines the previously described methods to reach the best performance. ❖ There is the well-known cold start problem, when the algorithm doesn’t have data to give recommendations for a new user/product. It can be solved by using different approaches for new and for well-known users or products. For example, goods might be recommended by collaborative filtering, but a knowledge-based recommender will be used for new users/products (for which we don’t have history yet).
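A minimal sketch of that fallback logic, assuming a co-occurrence count stands in for collaborative filtering and a fixed list stands in for the knowledge-based rules. The function name `hybrid_recommend`, the `min_history` threshold and the sample data are all hypothetical.

```python
def hybrid_recommend(user, likes, fallback_items, min_history=1):
    """Use collaborative filtering when history exists, else predefined rules.

    likes          -- dict: user -> set of liked items (hypothetical data)
    fallback_items -- items produced by a knowledge-based recommender
                      (here simply a fixed list, as an assumption)
    """
    history = likes.get(user, set())
    if len(history) < min_history:
        # Cold start: nothing is known about this user yet.
        return list(fallback_items)

    scores = {}
    for other, items in likes.items():
        if other != user and history & items:
            # Each user with shared preferences votes for their other items.
            for item in items - history:
                scores[item] = scores.get(item, 0) + 1
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    # If no similar user offers anything new, fall back as well.
    return ranked or list(fallback_items)


likes = {"U1": {"M1"}, "U2": {"M1", "M2"}}
print(hybrid_recommend("U1", likes, ["M9"]))  # known user
print(hybrid_recommend("U3", likes, ["M9"]))  # new user, cold start
```

The same switch can be applied per product instead of per user; the sketch only covers the new-user case.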
  • 13. Which approach to choose? ❖ A thorough analysis is required to choose the correct approach for each use case. ❖ Several approaches may be used to solve the same problem, and the correct one is not easy to choose, because many factors influence the performance of recommendations and different goals may lead to different approaches.
  • 14. Which approach to choose? Let’s imagine a user living in Lviv with some café preferences. He is making a short weekend trip to London. What could be recommended to him in London? All of the previously mentioned approaches are applicable to answer this question: • Collaborative filtering • Knowledge-based • Content-based
  • 15. Impersonal recommender The idea of an impersonal recommender is to give recommendations not for a particular user, but in general. For instance, it may be used in retail to find complementary goods. Some cases are far from obvious: Wal-Mart discovered that diapers are sold together with expensive beer on Friday evenings. Placing them together led to geometric growth of sales.
  • 16. Impersonal recommender Applicable in different areas: • retail, by employees to increase revenue/sales • e-commerce, as a low-budget approach to making recommendations on a web site • interactive navigator kiosks http://smartcity.prom.ua/g2763766-interaktivnyj-sensornyj-kiosk
  • 17. Data Science way of getting things done http://www.tomatosphere.org/teacher-resources/teachers-guide/principal-investigation/scientific-method.cfm
  • 18. The problem There is a history of customers’ actions: {a1, a2, a3} {a2, a3, a5, a6} {a4, a2} {a1, a5, a6, a3} {a3, a5, a2} … What should we suggest to a customer who has already performed {a2, a5} (let’s assume that order doesn’t matter)?
  • 19. Naive approach: frequent item-sets Affinity analysis is used to build frequent item-sets, which are widely used in market basket analysis. Several algorithms were created to perform affinity analysis (Apriori, FP-Growth). Unfortunately, this alone doesn’t work: frequent item-sets don’t filter out already-purchased goods and “cannibals”.
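To make the naive approach concrete, here is a brute-force count of frequent pairs over the five sample transactions from the problem slide. It is not Apriori or FP-Growth, only plain counting; the `min_count` threshold of 2 and the function name `frequent_pairs` are assumptions made for the sketch.

```python
from collections import Counter
from itertools import combinations

# The five sample transactions from the problem statement (order is ignored).
history = [
    {"a1", "a2", "a3"},
    {"a2", "a3", "a5", "a6"},
    {"a4", "a2"},
    {"a1", "a5", "a6", "a3"},
    {"a3", "a5", "a2"},
]


def frequent_pairs(transactions, min_count):
    """Count every 2-item set and keep those seen at least `min_count` times."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each pair one canonical form, e.g. ("a2", "a3").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


pairs = frequent_pairs(history, min_count=2)
for pair, n in sorted(pairs.items(), key=lambda kv: (-kv[1], kv[0])):
    print(pair, n)
```

Note that the pair (a2, a5) is itself frequent, which says nothing useful to a customer who already has both items. This is the "already-purchased goods" weakness mentioned on the slide.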
  • 20. Next step: association rules Association rules are actively used in market basket analysis and may be used effectively for creating recommendations. A general association rule looks like A => B: a purchase of A usually leads to a purchase of B (the rule is user independent). A rule has several statistical characteristics (support, confidence, lift) that show the strength of the rule and may be used for high-quality recommendations.
  • 21. Rules for recommendation It’s not enough to build rules; they must also be interpreted correctly later. The most important properties of each rule are support (shows how frequent/important the rule is), confidence (shows how much you can rely on the rule) and lift. Lift is derived from Bayes’ theorem and shows positive/negative correlation between the head and the tail of a rule: head => tail. Thresholds for all these parameters must be chosen for each particular case. In general, • lift < 1 means negative correlation (the rule works, but has a negative effect) • lift ~ 1 means no correlation (the rule doesn’t work) • lift > 1 means positive correlation
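The three metrics can be computed directly from their standard definitions: support(X) is the share of transactions containing X, confidence(head => tail) = support(head ∪ tail) / support(head), and lift = confidence / support(tail). The sketch below applies them to the five sample transactions from the problem slide; the function names are hypothetical and the tiny data set is for illustration only.

```python
# The five sample transactions from the problem statement.
history = [
    {"a1", "a2", "a3"},
    {"a2", "a3", "a5", "a6"},
    {"a4", "a2"},
    {"a1", "a5", "a6", "a3"},
    {"a3", "a5", "a2"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item of `itemset`."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def rule_metrics(head, tail, transactions):
    """Return (support, confidence, lift) for the rule head => tail."""
    s_head = support(head, transactions)
    s_tail = support(tail, transactions)
    if s_head == 0 or s_tail == 0:
        raise ValueError("head and tail must each occur at least once")
    s_both = support(head | tail, transactions)
    confidence = s_both / s_head   # estimate of P(tail | head)
    lift = confidence / s_tail     # P(tail | head) / P(tail)
    return s_both, confidence, lift


for head, tail in [({"a2", "a5"}, {"a3"}), ({"a2"}, {"a3"})]:
    s, c, l = rule_metrics(head, tail, history)
    print(sorted(head), "=>", sorted(tail), round(s, 4), round(c, 4), round(l, 4))
```

On this sample, {a2, a5} => {a3} has confidence 1.0 and lift 1.25, so a3 is a reasonable suggestion for the customer with {a2, a5}. The rule {a2} => {a3} has lift 0.9375, slightly below 1, even though (a2, a3) is one of the most frequent pairs; this is why frequency alone is a poor guide.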
  • 22. Online recommender evaluation Of course, the recommender’s work does not end with generating rules. The remaining task is to evaluate the quality of the generated rules. This makes it possible not only to compare different models (using A/B testing), but also to use the clickstream to improve the rules. http://www.sitedoublers.com/blog/multivariate-test-victorias-secret
  • 23. Online recommender improvement ❖ Users’ reactions to recommendations may be used to improve the quality of the recommender. For example, it’s possible to save successful/ignored recommendations and use this information to improve newly generated recommendations. ❖ This is quite important, because user preferences are not stable and change over time.
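One possible way to use such feedback, shown only as a sketch: count how often each rule's recommendation was shown and clicked, and rank rules by a smoothed click rate. The `RuleFeedback` class, the rule identifiers and the smoothing priors are assumptions, not part of the original deck.

```python
from collections import defaultdict


class RuleFeedback:
    """Tracks shown/clicked counts per rule and scores rules by click rate."""

    def __init__(self, prior_clicks=1.0, prior_shows=2.0):
        # The priors act as pseudo-counts so unseen rules start at 0.5
        # instead of dividing by zero (an assumption of this sketch).
        self.prior_clicks = prior_clicks
        self.prior_shows = prior_shows
        self.shows = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, rule_id, clicked):
        """Save one outcome: the recommendation was shown, maybe clicked."""
        self.shows[rule_id] += 1
        if clicked:
            self.clicks[rule_id] += 1

    def score(self, rule_id):
        """Smoothed click-through rate of the rule."""
        return (self.clicks[rule_id] + self.prior_clicks) / (
            self.shows[rule_id] + self.prior_shows
        )

    def rank(self, rule_ids):
        """Order candidate rules from most to least successful."""
        return sorted(rule_ids, key=lambda r: (-self.score(r), r))


feedback = RuleFeedback()
feedback.record("r1", clicked=True)
feedback.record("r1", clicked=True)
feedback.record("r2", clicked=False)
print(feedback.rank(["r1", "r2", "r3"]))
```

Because preferences drift, a real system would probably also decay or window the old counts; that part is left out here.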
  • 24. Technology stack ❖ There are a lot of existing solutions for building association rules, made by Oracle, SAS, Microsoft, etc. ❖ However, in the new world of unstructured/semi-structured data and growing data volumes, that’s not enough. The quality of a recommender depends on the amount of data used to train it: the more data is available for analysis, the better. ❖ EDWs tend to adopt Hadoop as the main storage and processing system ❖ Here comes a Hadoop-centric solution…
  • 26. Apache Hadoop Hadoop is designed to store and process petabytes of data and is an ideal choice to build a recommendation system on top of. Hadoop provides a wide range of tools for efficient data processing, as well as a specialised library for data science needs (Mahout), e.g. for building a recommender. Apache Hadoop is an open-source software framework for storage and large-scale processing of data-sets on clusters of commodity hardware.
  • 27. ElasticSearch Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. It provides scalable, near real-time search. ElasticSearch is used by GitHub, Foursquare, Etsy, SoundCloud, Xing and Wikimedia and can handle several TB of data. ElasticSearch will be used for storing rules and serving requests.