MovieLens data with Mahout
MovieLens data sets are collected by the GroupLens Research Project at the University of Minnesota and are
available from http://grouplens.org/datasets/movielens/
This data set consists of:
Users and items are numbered consecutively from 1.
The data is randomly ordered.
Each line is a tab-separated record of user id, item id, rating and timestamp.
The timestamps are Unix seconds since 1/1/1970 UTC.
Example:
1 272 3 887431647
2 1 4 888550871
2 10 2 888551853
Line 1:
1 (user id) 272 (item id) 3 (rating) 887431647 (timestamp)
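To make the record layout concrete, the following is a minimal Python sketch of reading one such line. The names Rating and parse_rating are hypothetical, introduced here for illustration only, and the sketch assumes the fields are separated by single tab characters as described above.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Rating(NamedTuple):
    user_id: int
    item_id: int
    rating: int
    rated_at: datetime  # timezone-aware, UTC


def parse_rating(line: str) -> Rating:
    """Parse one tab-separated line: user id, item id, rating, timestamp."""
    user_id, item_id, rating, timestamp = line.rstrip("\n").split("\t")
    return Rating(
        int(user_id),
        int(item_id),
        int(rating),
        # The timestamp is Unix seconds, so convert it explicitly in UTC.
        datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
    )


first = parse_rating("1\t272\t3\t887431647")
print(first.user_id, first.item_id, first.rating, first.rated_at.isoformat())
```

Converting with an explicit UTC time zone avoids results that depend on the local time zone of the machine running the script.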
The objective:
The objective is to implement a collaborative filtering framework that uses historical data, in this instance movie
ratings by 943 users, to provide item-based recommendations. Three item-based recommendations will be
provided for each user.
In order to make these recommendations it is necessary to calculate the similarity between items. Items
usually don't change much, so the similarities can often be computed offline. This item-based approach has been
popularized by Amazon and others.
The example provided uses Euclidean distance as the measure of similarity; other measures are
available, including:
- Pearson correlation
- Spearman correlation
- Tanimoto coefficient
- LogLikelihood similarity
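As an illustration of a Euclidean-distance-based item similarity, here is a small Python sketch. It is not Mahout's implementation: the function name euclidean_similarity, the sample ratings, and the mapping of distance d to a similarity via 1 / (1 + d) are all assumptions made for this example, and Mahout's own scaling may differ.

```python
from __future__ import annotations

import math


def euclidean_similarity(ratings_a: dict[int, float], ratings_b: dict[int, float]) -> float:
    """Similarity between two items, each given as {user_id: rating}.

    Only users who rated both items are compared. The distance d is mapped
    to the range (0, 1] with 1 / (1 + d); this mapping is an assumption for
    illustration and may differ from Mahout's own scaling.
    """
    common_users = ratings_a.keys() & ratings_b.keys()
    if not common_users:
        return 0.0  # no overlapping raters: treat the items as unrelated
    distance = math.sqrt(
        sum((ratings_a[u] - ratings_b[u]) ** 2 for u in common_users)
    )
    return 1.0 / (1.0 + distance)


# Hypothetical ratings, not taken from the MovieLens file.
item_x = {1: 5, 2: 3, 3: 4}
item_y = {1: 5, 2: 3, 4: 2}
item_z = {1: 1, 2: 5}

print(euclidean_similarity(item_x, item_y))
print(euclidean_similarity(item_x, item_z))
```

Items rated identically by their shared users get the highest similarity, and the score falls toward zero as the ratings diverge.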
The code:
mahout recommenditembased \
--input /user/cloudera/ua.base \
--tempDir /user/cloudera/run1 \
--similarityClassname SIMILARITY_EUCLIDEAN_DISTANCE \
--output /user/cloudera/run1/results \
--numRecommendations 3
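Conceptually, an item-based recommender scores each item a user has not yet rated from that user's own ratings, weighted by item similarity, and keeps the top few. The Python sketch below shows that idea only; it is not the code Mahout runs. The function recommend, the similarity table and the ratings are hypothetical, and the sketch assumes non-negative similarity scores.

```python
from __future__ import annotations

import heapq


def recommend(
    user_ratings: dict[int, float],
    similarity: dict[tuple[int, int], float],
    candidates: list[int],
    n: int = 3,
) -> list[tuple[int, float]]:
    """Return up to n (item_id, predicted_rating) pairs for one user.

    similarity maps an (item, item) pair, smaller id first, to a
    non-negative score; missing pairs count as 0.
    """
    scored = []
    for item in candidates:
        if item in user_ratings:
            continue  # the user has already rated this item
        weighted_sum = 0.0
        weight_total = 0.0
        for rated_item, rating in user_ratings.items():
            pair = (min(item, rated_item), max(item, rated_item))
            sim = similarity.get(pair, 0.0)
            weighted_sum += sim * rating
            weight_total += sim
        if weight_total > 0:
            # Similarity-weighted average of the user's own ratings.
            scored.append((item, weighted_sum / weight_total))
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Hypothetical data for one user and four other items.
ratings = {10: 5, 20: 2}
sims = {(10, 30): 0.9, (20, 30): 0.1, (10, 40): 0.2, (20, 40): 0.8}
top = recommend(ratings, sims, candidates=[10, 20, 30, 40, 50])
print(top)
```

Item 50 has no similarity to anything the user rated, so it receives no prediction and is left out of the result.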
The output:
Item recommendations for the first 33 users:
Item recommendations for the last 33 users:
Explanation of output:
Line 1:
1 (user id) 407 (item id) 5.0 (rating) 132 (item id) 5.0 (rating) 1323 (item id) 5.0 (rating)
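The annotated line above can be read programmatically. The Python sketch below assumes the raw result file stores each user as a user id, a tab, and a bracketed list of item:score pairs, for example `1<TAB>[407:5.0,132:5.0,1323:5.0]`. That raw layout is an assumption here, since only the annotated form is shown above, and parse_recommendations is a hypothetical helper name.

```python
from __future__ import annotations


def parse_recommendations(line: str) -> tuple[int, list[tuple[int, float]]]:
    """Split 'user<TAB>[item:score,item:score,...]' into typed values."""
    user_id, items = line.strip().split("\t")
    pairs = []
    for entry in items.strip("[]").split(","):
        if not entry:
            continue  # an empty list "[]" yields one empty entry
        item_id, score = entry.split(":")
        pairs.append((int(item_id), float(score)))
    return int(user_id), pairs


# Assumed raw form of the first output line.
user, recs = parse_recommendations("1\t[407:5.0,132:5.0,1323:5.0]")
print(user, recs)
```

Each score is the rating the recommender estimates for that user and item, so the three pairs are the three recommendations requested with --numRecommendations 3.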

More Related Content

What's hot

Are we really including all relevant evidence
Are we really including all relevant evidence Are we really including all relevant evidence
Are we really including all relevant evidence cheweb1
 
Survey: Biological Inspired Computing in the Network Security
Survey: Biological Inspired Computing in the Network SecuritySurvey: Biological Inspired Computing in the Network Security
Survey: Biological Inspired Computing in the Network SecurityEswar Publications
 
Time series anomaly detection using cnn coupled with data augmentation using ...
Time series anomaly detection using cnn coupled with data augmentation using ...Time series anomaly detection using cnn coupled with data augmentation using ...
Time series anomaly detection using cnn coupled with data augmentation using ...Prasenjeet Acharjee
 
Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Behrang Mehrparvar
 
Crowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomesCrowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomesJPINFOTECH JAYAPRAKASH
 
Collaborative Metric Learning (WWW'17)
Collaborative Metric Learning (WWW'17)Collaborative Metric Learning (WWW'17)
Collaborative Metric Learning (WWW'17)承剛 謝
 
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...John McDonald
 
IRJET- An Intuitive Sky-High View of Recommendation Systems
IRJET- An Intuitive Sky-High View of Recommendation SystemsIRJET- An Intuitive Sky-High View of Recommendation Systems
IRJET- An Intuitive Sky-High View of Recommendation SystemsIRJET Journal
 
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...IEEEFINALYEARSTUDENTPROJECTS
 
Matrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsMatrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsAladejubelo Oluwashina
 
Active Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyActive Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyUniversity of Bergen
 
Movie recommendation project
Movie recommendation projectMovie recommendation project
Movie recommendation projectAbhishek Jaisingh
 
Fuzzy weblog extraction
Fuzzy weblog extractionFuzzy weblog extraction
Fuzzy weblog extractionEdy Portmann
 
Query Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain ObjectsQuery Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain Objects1crore projects
 
How popular are your tweets?
How popular are your tweets?How popular are your tweets?
How popular are your tweets?avijit_saha
 
Novel Algorithms for Ranking and Suggesting True Popular Items
Novel Algorithms for Ranking and Suggesting True Popular ItemsNovel Algorithms for Ranking and Suggesting True Popular Items
Novel Algorithms for Ranking and Suggesting True Popular ItemsIJMER
 

What's hot (20)

Are we really including all relevant evidence
Are we really including all relevant evidence Are we really including all relevant evidence
Are we really including all relevant evidence
 
Survey: Biological Inspired Computing in the Network Security
Survey: Biological Inspired Computing in the Network SecuritySurvey: Biological Inspired Computing in the Network Security
Survey: Biological Inspired Computing in the Network Security
 
Tweets Classifier
Tweets ClassifierTweets Classifier
Tweets Classifier
 
Combined queries
Combined queriesCombined queries
Combined queries
 
Time series anomaly detection using cnn coupled with data augmentation using ...
Time series anomaly detection using cnn coupled with data augmentation using ...Time series anomaly detection using cnn coupled with data augmentation using ...
Time series anomaly detection using cnn coupled with data augmentation using ...
 
Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)Community Analysis of Deep Networks (poster)
Community Analysis of Deep Networks (poster)
 
Crowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomesCrowdsourcing predictors of behavioral outcomes
Crowdsourcing predictors of behavioral outcomes
 
Collaborative Metric Learning (WWW'17)
Collaborative Metric Learning (WWW'17)Collaborative Metric Learning (WWW'17)
Collaborative Metric Learning (WWW'17)
 
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
Usage Statistics & Information Behaviors: understanding User Behavior with Qu...
 
IRJET- An Intuitive Sky-High View of Recommendation Systems
IRJET- An Intuitive Sky-High View of Recommendation SystemsIRJET- An Intuitive Sky-High View of Recommendation Systems
IRJET- An Intuitive Sky-High View of Recommendation Systems
 
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...
IEEE 2014 JAVA DATA MINING PROJECTS Active learning of constraints for semi s...
 
Mahout part1
Mahout part1Mahout part1
Mahout part1
 
Temporal based Recommendation System
Temporal based Recommendation SystemTemporal based Recommendation System
Temporal based Recommendation System
 
Matrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsMatrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender Systems
 
Active Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyActive Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a Survey
 
Movie recommendation project
Movie recommendation projectMovie recommendation project
Movie recommendation project
 
Fuzzy weblog extraction
Fuzzy weblog extractionFuzzy weblog extraction
Fuzzy weblog extraction
 
Query Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain ObjectsQuery Aware Determinization of Uncertain Objects
Query Aware Determinization of Uncertain Objects
 
How popular are your tweets?
How popular are your tweets?How popular are your tweets?
How popular are your tweets?
 
Novel Algorithms for Ranking and Suggesting True Popular Items
Novel Algorithms for Ranking and Suggesting True Popular ItemsNovel Algorithms for Ranking and Suggesting True Popular Items
Novel Algorithms for Ranking and Suggesting True Popular Items
 

Viewers also liked

Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Cataldo Musto
 
Building multi-modal recommendation engines using search engines
Building multi-modal recommendation engines using search enginesBuilding multi-modal recommendation engines using search engines
Building multi-modal recommendation engines using search enginesTed Dunning
 
Des maths et des recommandations - Devoxx 2014
Des maths et des recommandations - Devoxx 2014Des maths et des recommandations - Devoxx 2014
Des maths et des recommandations - Devoxx 2014Loïc Knuchel
 
Le temps réel au coeur de toutes les stratégies digitales
Le temps réel au coeur de toutes les stratégies digitales Le temps réel au coeur de toutes les stratégies digitales
Le temps réel au coeur de toutes les stratégies digitales Netwave
 
The good the bad and the ugly - final
The good the bad and the ugly - finalThe good the bad and the ugly - final
The good the bad and the ugly - finalAndre Verschelling
 
Using Mahout and a Search Engine for Recommendation
Using Mahout and a Search Engine for RecommendationUsing Mahout and a Search Engine for Recommendation
Using Mahout and a Search Engine for RecommendationTed Dunning
 
Slope one recommender on hadoop
Slope one recommender on hadoopSlope one recommender on hadoop
Slope one recommender on hadoopYONG ZHENG
 
Recommendation Engine using Apache Mahout
Recommendation Engine using Apache MahoutRecommendation Engine using Apache Mahout
Recommendation Engine using Apache MahoutAmbarish Hazarnis
 
Mahout Workshop on Google Cloud Platform
Mahout Workshop on Google Cloud PlatformMahout Workshop on Google Cloud Platform
Mahout Workshop on Google Cloud PlatformIMC Institute
 
Big Data Analytics using Mahout
Big Data Analytics using MahoutBig Data Analytics using Mahout
Big Data Analytics using MahoutIMC Institute
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Alan Said
 
The Good, Bad and Ugly of Serverless
The Good, Bad and Ugly of ServerlessThe Good, Bad and Ugly of Serverless
The Good, Bad and Ugly of ServerlessPipedrive
 
Tutorial Mahout - Recommendation
Tutorial Mahout - RecommendationTutorial Mahout - Recommendation
Tutorial Mahout - RecommendationCataldo Musto
 
Yrecommender, machine learning sur Hybris
Yrecommender, machine learning sur HybrisYrecommender, machine learning sur Hybris
Yrecommender, machine learning sur HybrisGuillaume Kpotufe
 
Ranking (par IBRAHIM Sirine et TANIOS Dany)
Ranking (par IBRAHIM Sirine et TANIOS	 Dany)Ranking (par IBRAHIM Sirine et TANIOS	 Dany)
Ranking (par IBRAHIM Sirine et TANIOS Dany)rchbeir
 

Viewers also liked (15)

Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014
 
Building multi-modal recommendation engines using search engines
Building multi-modal recommendation engines using search enginesBuilding multi-modal recommendation engines using search engines
Building multi-modal recommendation engines using search engines
 
Des maths et des recommandations - Devoxx 2014
Des maths et des recommandations - Devoxx 2014Des maths et des recommandations - Devoxx 2014
Des maths et des recommandations - Devoxx 2014
 
Le temps réel au coeur de toutes les stratégies digitales
Le temps réel au coeur de toutes les stratégies digitales Le temps réel au coeur de toutes les stratégies digitales
Le temps réel au coeur de toutes les stratégies digitales
 
The good the bad and the ugly - final
The good the bad and the ugly - finalThe good the bad and the ugly - final
The good the bad and the ugly - final
 
Using Mahout and a Search Engine for Recommendation
Using Mahout and a Search Engine for RecommendationUsing Mahout and a Search Engine for Recommendation
Using Mahout and a Search Engine for Recommendation
 
Slope one recommender on hadoop
Slope one recommender on hadoopSlope one recommender on hadoop
Slope one recommender on hadoop
 
Recommendation Engine using Apache Mahout
Recommendation Engine using Apache MahoutRecommendation Engine using Apache Mahout
Recommendation Engine using Apache Mahout
 
Mahout Workshop on Google Cloud Platform
Mahout Workshop on Google Cloud PlatformMahout Workshop on Google Cloud Platform
Mahout Workshop on Google Cloud Platform
 
Big Data Analytics using Mahout
Big Data Analytics using MahoutBig Data Analytics using Mahout
Big Data Analytics using Mahout
 
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame...
 
The Good, Bad and Ugly of Serverless
The Good, Bad and Ugly of ServerlessThe Good, Bad and Ugly of Serverless
The Good, Bad and Ugly of Serverless
 
Tutorial Mahout - Recommendation
Tutorial Mahout - RecommendationTutorial Mahout - Recommendation
Tutorial Mahout - Recommendation
 
Yrecommender, machine learning sur Hybris
Yrecommender, machine learning sur HybrisYrecommender, machine learning sur Hybris
Yrecommender, machine learning sur Hybris
 
Ranking (par IBRAHIM Sirine et TANIOS Dany)
Ranking (par IBRAHIM Sirine et TANIOS	 Dany)Ranking (par IBRAHIM Sirine et TANIOS	 Dany)
Ranking (par IBRAHIM Sirine et TANIOS Dany)
 

Similar to Example: movielens data with mahout

Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...ijeei-iaes
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender Systemtheijes
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender Systemtheijes
 
Using content features to enhance the
Using content features to enhance theUsing content features to enhance the
Using content features to enhance theijaia
 
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Rajasekar Nonburaj
 
Recommendation system based on association rules applied to consistent behavi...
Recommendation system based on association rules applied to consistent behavi...Recommendation system based on association rules applied to consistent behavi...
Recommendation system based on association rules applied to consistent behavi...IAEME Publication
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaEditor IJCATR
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaEditor IJCATR
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaEditor IJCATR
 
powerpoint presentation on movie recommender system.
powerpoint presentation on movie recommender system.powerpoint presentation on movie recommender system.
powerpoint presentation on movie recommender system.amanpandey7656
 
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNINGENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNINGIRJET Journal
 
Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectorsManish Kumar
 
A hybrid recommender system user profiling from keywords and ratings
A hybrid recommender system user profiling from keywords and ratingsA hybrid recommender system user profiling from keywords and ratings
A hybrid recommender system user profiling from keywords and ratingsAravindharamanan S
 
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISFEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISmlaij
 
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...mathsjournal
 
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...Malim Siregar
 
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...Ijripublishers Ijri
 

Similar to Example: movielens data with mahout (20)

At4102337341
At4102337341At4102337341
At4102337341
 
Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...Music Recommendation System with User-based and Item-based Collaborative Filt...
Music Recommendation System with User-based and Item-based Collaborative Filt...
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender System
 
A Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender SystemA Study of Neural Network Learning-Based Recommender System
A Study of Neural Network Learning-Based Recommender System
 
Using content features to enhance the
Using content features to enhance theUsing content features to enhance the
Using content features to enhance the
 
Recommender systems
Recommender systemsRecommender systems
Recommender systems
 
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
Building a Recommender systems by Vivek Murugesan - Technical Architect at Cr...
 
Recommendation system based on association rules applied to consistent behavi...
Recommendation system based on association rules applied to consistent behavi...Recommendation system based on association rules applied to consistent behavi...
Recommendation system based on association rules applied to consistent behavi...
 
20320140501009 2
20320140501009 220320140501009 2
20320140501009 2
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social Media
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social Media
 
A Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social MediaA Survey on Decision Support Systems in Social Media
A Survey on Decision Support Systems in Social Media
 
powerpoint presentation on movie recommender system.
powerpoint presentation on movie recommender system.powerpoint presentation on movie recommender system.
powerpoint presentation on movie recommender system.
 
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNINGENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
ENTERTAINMENT CONTENT RECOMMENDATION SYSTEM USING MACHINE LEARNING
 
Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectors
 
A hybrid recommender system user profiling from keywords and ratings
A hybrid recommender system user profiling from keywords and ratingsA hybrid recommender system user profiling from keywords and ratings
A hybrid recommender system user profiling from keywords and ratings
 
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISFEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSIS
 
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...
 
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...
Improving-Movie-Recommendation-Systems-Filtering-by-Exploiting-UserBased-Revi...
 
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...
Ijricit 01-008 confidentiality strategy deduction of user-uploaded pictures o...
 

More from Gregg Barrett

Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018Gregg Barrett
 
Cirrus: Africa's AI initiative
Cirrus: Africa's AI initiativeCirrus: Africa's AI initiative
Cirrus: Africa's AI initiativeGregg Barrett
 
Applied machine learning: Insurance
Applied machine learning: InsuranceApplied machine learning: Insurance
Applied machine learning: InsuranceGregg Barrett
 
Road and Track Vehicle - Project Document
Road and Track Vehicle - Project DocumentRoad and Track Vehicle - Project Document
Road and Track Vehicle - Project DocumentGregg Barrett
 
Modelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boostingModelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boostingGregg Barrett
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Gregg Barrett
 
Revenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla MotorsRevenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla MotorsGregg Barrett
 
Data science unit introduction
Data science unit introductionData science unit introduction
Data science unit introductionGregg Barrett
 
Social networking brings power
Social networking brings powerSocial networking brings power
Social networking brings powerGregg Barrett
 
Procurement can be exciting
Procurement can be excitingProcurement can be exciting
Procurement can be excitingGregg Barrett
 
Machine Learning Approaches to Brewing Beer
Machine Learning Approaches to Brewing BeerMachine Learning Approaches to Brewing Beer
Machine Learning Approaches to Brewing BeerGregg Barrett
 
A note to Data Science and Machine Learning managers
A note to Data Science and Machine Learning managersA note to Data Science and Machine Learning managers
A note to Data Science and Machine Learning managersGregg Barrett
 
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...
Quick Introduction: To run a SQL query on the Chicago Employee Data, using Cl...Gregg Barrett
 
Efficient equity portfolios using mean variance optimisation in R
Efficient equity portfolios using mean variance optimisation in REfficient equity portfolios using mean variance optimisation in R
Efficient equity portfolios using mean variance optimisation in RGregg Barrett
 
Variable selection for classification and regression using R
Variable selection for classification and regression using RVariable selection for classification and regression using R
Variable selection for classification and regression using RGregg Barrett
 
Diabetes data - model assessment using R
Diabetes data - model assessment using RDiabetes data - model assessment using R
Diabetes data - model assessment using RGregg Barrett
 
Introduction to Microsoft R Services
Introduction to Microsoft R ServicesIntroduction to Microsoft R Services
Introduction to Microsoft R ServicesGregg Barrett
 
Insurance metrics overview
Insurance metrics overviewInsurance metrics overview
Insurance metrics overviewGregg Barrett
 
Review of mit sloan management review case study on analytics at Intermountain
Review of mit sloan management review case study on analytics at IntermountainReview of mit sloan management review case study on analytics at Intermountain
Review of mit sloan management review case study on analytics at IntermountainGregg Barrett
 

More from Gregg Barrett (20)

Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018Cirrus: Africa's AI initiative, Proposal 2018
Cirrus: Africa's AI initiative, Proposal 2018
 
Cirrus: Africa's AI initiative
Cirrus: Africa's AI initiativeCirrus: Africa's AI initiative
Cirrus: Africa's AI initiative
 
Applied machine learning: Insurance
Applied machine learning: InsuranceApplied machine learning: Insurance
Applied machine learning: Insurance
 
Road and Track Vehicle - Project Document
Road and Track Vehicle - Project DocumentRoad and Track Vehicle - Project Document
Road and Track Vehicle - Project Document
 
Modelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boostingModelling the expected loss of bodily injury claims using gradient boosting
Modelling the expected loss of bodily injury claims using gradient boosting
 
Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?Data Science Introduction - Data Science: What Art Thou?
Data Science Introduction - Data Science: What Art Thou?
 
Revenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla MotorsRevenue Generation Ideas for Tesla Motors
Revenue Generation Ideas for Tesla Motors
 
Data science unit introduction
Example: MovieLens data with Mahout

  • 1. MovieLens data with Mahout
MovieLens data sets are collected by the GroupLens Research Project at the University of Minnesota and are
available from http://grouplens.org/datasets/movielens/
This data set consists of:
Users and items are numbered consecutively from 1.
The data is randomly ordered.
This is a tab-separated list of user id | item id | rating | timestamp.
The timestamps are Unix seconds since 1/1/1970 UTC.
Example:
1 272 3 887431647
2 1 4 888550871
2 10 2 888551853
Line 1:
1 (user id) 272 (item id) 3 (rating) 887431647 (timestamp)
The objective:
The objective is to implement a Collaborative Filtering framework using historical data, in this instance movie
ratings by 943 users, to provide item-based recommendations. Three item-based recommendations will be
provided for each user.
To make these recommendations it is necessary to calculate the similarity between items. Items
usually don't change much, so item similarities can often be computed offline; this approach has been
popularized by Amazon and others.
In the example provided the measure of similarity used is Euclidean distance; however, other measures are
available, including:
- Pearson correlation
- Spearman correlation
- Tanimoto coefficient
- LogLikelihood similarity
The code:
mahout recommenditembased \
  --input /user/cloudera/ua.base \
  --tempDir /user/cloudera/run1 \
  --similarityClassname SIMILARITY_EUCLIDEAN_DISTANCE \
  --output /user/cloudera/run1/results \
  --numRecommendations 3
The output:
Item recommendations for the first 33 users:
  • 2. Item recommendations for the last 33 users:
  • 3. Explanation of output:
Line 1:
1 (user id) 407 (item id) 5.0 (rating) 132 (item id) 5.0 (rating) 1323 (item id) 5.0 (rating)
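To make the item similarity step more concrete, here is a minimal Python sketch that parses the three example rating lines and scores two items with a Euclidean-distance-based similarity. It is an illustration only, not Mahout's implementation: the function names `parse_ratings` and `euclidean_similarity`, the `1 / (1 + d)` mapping from distance to similarity, and the choice to return 0.0 when two items share no raters are all assumptions made for this example.

```python
import math
from collections import defaultdict

# The three example lines from the data description (tab-separated):
# user id, item id, rating, timestamp.
SAMPLE = "1\t272\t3\t887431647\n2\t1\t4\t888550871\n2\t10\t2\t888551853\n"


def parse_ratings(text):
    """Return {item_id: {user_id: rating}} from tab-separated rating lines."""
    by_item = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        user, item, rating, _timestamp = line.split("\t")
        by_item[int(item)][int(user)] = float(rating)
    return dict(by_item)


def euclidean_similarity(ratings_a, ratings_b):
    """Similarity in (0, 1] computed as 1 / (1 + Euclidean distance).

    The distance is taken only over users who rated both items.
    Assumption: items with no shared raters get a similarity of 0.0.
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return 0.0
    distance = math.sqrt(sum((ratings_a[u] - ratings_b[u]) ** 2 for u in shared))
    return 1.0 / (1.0 + distance)


items = parse_ratings(SAMPLE)
# Items 1 and 10 were both rated by user 2 (ratings 4 and 2).
print(euclidean_similarity(items[1], items[10]))
# Items 272 and 1 have no raters in common in this tiny sample.
print(euclidean_similarity(items[272], items[1]))
```

In the sample, user 2 rated item 1 with 4 and item 10 with 2, so the distance is 2 and the similarity is 1/3. A real run over ua.base would compare every pair of items that share raters, which is the part Mahout distributes across the cluster.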