SlideShare a Scribd company logo
1 of 68
Download to read offline
Apache Mahout - Tutorial
                     Cataldo Musto, Ph.D.



 Corso di Accesso Intelligente all’Informazione ed Elaborazione del Linguaggio Naturale
Università degli Studi di Bari – Dipartimento di Informatica – A.A. 2012/2013

                                    11/01/2013
Outline
• What is Mahout ?
  – Overview
• How to use Mahout ?
  – Java snippets
Part 1



What is Mahout?
Goal
• Mahout is a Java library
  – Implementing Machine Learning techniques
Goal
• Mahout is a Java library
  – Implementing Machine Learning techniques
     •   Clustering
     •   Classification
     •   Recommendation
     •   Frequent ItemSet
Goal
• Mahout is a Java library
  – Implementing Machine Learning technique
     •   Clustering
     •   Classification
     •   Recommendation
     •   Frequent ItemSet
  – Scalable
What can we do?
• Currently Mahout supports mainly four use cases:
   – Recommendation - takes users' behavior and from that
     tries to find items users might like.
   – Clustering - takes e.g. text documents and groups them
     into groups of topically related documents.
   – Classification - learns from exisiting categorized
     documents what documents of a specific category look like
     and is able to assign unlabelled documents to the
     (hopefully) correct category.
   – Frequent itemset mining - takes a set of item groups
     (terms in a query session, shopping cart content) and
     identifies, which individual items usually appear together.
Algorithms
• Recommendation
  – User-based Collaborative Filtering
  – Item-based Collaborative Filering
  – SlopeOne Recommenders
  – Singular Value Decomposition
Algorithms
• Clustering
  – Canopy
  – K-Means
  – Fuzzy K-Means
  – Hierarchical Clustering
  – Minhash Clustering

  (and much more…)
Algorithms
• Classification
   –   Logistic Regression
   –   Bayes
   –   Support Vector Machines
   –   Perceptrons
   –   Neural Networks
   –   Restricted Boltzmann Machines
   –   Hidden Markov Models

   (and much more…)
Mahout in the
Apache Software Foundation
Focus
• In this tutorial we will focus just on:
   – Recommendation
Recommendation
• Mahout implements a Collaborative Filtering
  framework
   – Popularized by Amazon and others
   – Uses hystorical data (ratings, clicks, and purchases) to
     provide recommendations
      • User-based: Recommend items by finding similar users. This is
        often harder to scale because of the dynamic nature of users.
      • Item-based: Calculate similarity between items and make
        recommendations. Items usually don't change much, so this often
        can be computed offline.
      • Slope-One: A very fast and simple item-based recommendation
        approach applicable when users have given ratings (and not just
        boolean preferences).
Recommendation - Architecture
Recommendation - Architecture




                 Physical Storage
Recommendation - Architecture




                 Data Model




                 Physical Storage
Recommendation - Architecture


                 Recommender




                 Data Model




                 Physical Storage
Recommendation - Architecture
                 External Application



                 Recommender




                 Data Model




                 Physical Storage
Recommendation in Mahout
• Input: raw data (user preferences)
• Output: preferences estimation

• Step 1
  – Mapping raw data into a DataModel Mahout-
    compliant
• Step 2
  – Tuning recommender components
     • Similarity measure, neighborhood, etc.
Recommendation Components
• Mahout implements interfaces to these key
  abstractions:
   – DataModel
      • Methods for mapping raw data to a Mahout-compliant form
   – UserSimilarity
      • Methods to calculate the degree of correlation between two users
   – ItemSimilarity
      • Methods to calculate the degree of correlation between two items
   – UserNeighborhood
      • Methods to define the concept of ‘neighborhood’
   – Recommender
      • Methods to implement the recommendation step itself
Components: DataModel
• A DataModel is the interface to draw information
  about user preferences.

• Which sources is it possible to draw?
  – Database
     • MySQLJDBCDataModel
  – External Files
     • FileDataModel
  – Generic (preferences directly feed through Java code)
     • GenericDataModel
Components: DataModel
• GenericDataModel
  – Feed through Java calls
• FileDataModel
  – CSV (Comma Separated Values)
• JDBCDataModel
  – JDBC Driver
  – Standard database structure
FileDataModel – CSV input
Components: DataModel
• Regardless the source, they all share a
  common implementation.
• Basic object: Preference
  – Preference is a triple (user,item,score)
  – Stored in UserPreferenceArray
Components: DataModel
• Basic object: Preference
  – Preference is a triple (user,item,score)
  – Stored in UserPreferenceArray


• Two implementations
  – GenericUserPreferenceArray
     • It stores numerical preference, as well.
  – BooleanUserPreferenceArray
     • It skips numerical preference values.
Components: UserSimilarity
• UserSimilarity defines a notion of similarity between two
  Users.
   – (respectively) ItemSimilarity defines a notion of similarity
     between two Items.

• Which definition of similarity are available?
   –   Pearson Correlation
   –   Spearman Correlation
   –   Euclidean Distance
   –   Tanimoto Coefficient
   –   LogLikelihood Similarity

   – Already implemented!
Pearson’s vs. Euclidean distance
Components: UserNeighborhood
• UserSimilarity defines a notion of similarity between two
  Users.
   – (respectively) ItemSimilarity defines a notion of similarity
     between two Items.

• Which definition of neighborhood are available?
   – Nearest N users
       • The first N users with the highest similarity are labeled as ‘neighbors’
   – Tresholds
       • Users whose similarity is above a threshold are labeled as ‘neighbors’

   – Already implemented!
Components: Recommender
• Given a DataModel, a definition of similarity between
  users (items) and a definition of neighborhood, a
  recommender produces as output an estimation of
  relevance for each unseen item

• Which recommendation algorithms are implemented?
   –   User-based CF
   –   Item-based CF
   –   SVD-based CF
   –   SlopeOne CF

   (and much more…)
Recommendation Engines
Recap
• Hundreds of possible implementations of a CF-based
  recommender!
   – 6 different recommendation algorithms
   – 2 different neighborhood definitions
   – 5 different similarity definitions

• Evaluation fo the different implementations is actually very
  time-consuming
   – The strength of Mahout lies in that it is possible to save time in
     the evaluation of the different combinations of the parameters!
   – Standard interface for the evaluation of a Recommender
     System
Evaluation
• Mahout provides classes for the evaluation of a
  recommender system
  – Prediction-based measures
     • Mean Average Error
     • RMSE (Root Mean Square Error)
  – IR-based measures
     •   Precision
     •   Recall
     •   F1-measure
     •   F1@n
     •   NDCG (ranking measure)
Evaluation
• Prediction-based Measures
  – Class: AverageAbsoluteDifferenceEvaluator
  – Method: evaluate()
  – Parameters:
     •   Recommender implementation
     •   DataModel implementation
     •   TrainingSet size (e.g. 70%)
     •   % of the data to use in the evaluation (smaller % for
         fast prototyping)
Evaluation
• IR-based Measures
  – Class: GenericRecommenderIRStatsEvaluator
  – Method: evaluate()
  – Parameters:
    •   Recommender implementation
    •   DataModel implementation
    •   Relevance Threshold (mean+standard deviation)
    •   % of the data to use in the evaluation (smaller % for
        fast prototyping)
Part 2



How to use Mahout?
Download Mahout
• Official Release
  – The latest Mahout release is 0.7
  – Available at:
    http://www.apache.org/dyn/closer.cgi/mahout


• Java 1.6.x or greater.
• Hadoop is not mandatory!
Example 1: preferences
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;

class CreatePreferenceArray {

    private CreatePreferenceArray() {
    }

    public static void main(String[] args) {
      PreferenceArray user1Prefs = new GenericUserPreferenceArray(2);

        user1Prefs.setUserID(0, 1L);

        user1Prefs.setItemID(0, 101L);
        user1Prefs.setValue(0, 2.0f);

        user1Prefs.setItemID(1, 102L);
        user1Prefs.setValue(1, 3.0f);

        Preference pref = user1Prefs.get(1);
        System.out.println(pref);
    }
}
Example 1: preferences
import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import org.apache.mahout.cf.taste.model.Preference;
import org.apache.mahout.cf.taste.model.PreferenceArray;

class CreatePreferenceArray {

    private CreatePreferenceArray() {
    }

    public static void main(String[] args) {
      PreferenceArray user1Prefs = new GenericUserPreferenceArray(2);

        user1Prefs.setUserID(0, 1L);

        user1Prefs.setItemID(0, 101L);
        user1Prefs.setValue(0, 2.0f);              Score 2 for Item 101

        user1Prefs.setItemID(1, 102L);
        user1Prefs.setValue(1, 3.0f);

        Preference pref = user1Prefs.get(1);
        System.out.println(pref);
    }
}
Example 2: data model

• PreferenceArray stores the preferences of a
  single user

• Where do the preferences of all the users are
  stored?
  – An HashMap? No.
  – Mahout introduces data structures optimized for
    recommendation tasks
     • HashMap are replaced by FastByIDMap
Example 2: data model
import     org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import     org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import     org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray;
import     org.apache.mahout.cf.taste.model.DataModel;
import     org.apache.mahout.cf.taste.model.PreferenceArray;

class CreateGenericDataModel {

    private CreateGenericDataModel() {
    }

    public static void main(String[] args) {
      FastByIDMap<PreferenceArray> preferences = new FastByIDMap<PreferenceArray>();
      PreferenceArray prefsForUser1 = new GenericUserPreferenceArray(10);
      prefsForUser1.setUserID(0, 1L);
      prefsForUser1.setItemID(0, 101L);
      prefsForUser1.setValue(0, 3.0f);
      prefsForUser1.setItemID(1, 102L);
      prefsForUser1.setValue(1, 4.5f);

        preferences.put(1L, prefsForUser1);

        DataModel model = new GenericDataModel(preferences);
        System.out.println(model);
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));         FileDataModel
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);
                                                                               2 neighbours
        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 3: First Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("intro.csv"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 1);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }                                                                  Top-1 Recommendation
}                                                                      for User 1
Example 4: MovieLens Recommender

• Download the GroupLens dataset (100k)
  – Its format is already Mahout compliant


• Now we can run the recommendation
  framework against a state-of-the-art dataset
Example 4: MovieLens Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("ua.base"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(1, 20);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }
}
Example 4: MovieLens Recommender
import    org.apache.mahout.cf.taste.impl.model.file.*;
import    org.apache.mahout.cf.taste.impl.neighborhood.*;
import    org.apache.mahout.cf.taste.impl.recommender.*;
import    org.apache.mahout.cf.taste.impl.similarity.*;
import    org.apache.mahout.cf.taste.model.*;
import    org.apache.mahout.cf.taste.neighborhood.*;
import    org.apache.mahout.cf.taste.recommender.*;
import    org.apache.mahout.cf.taste.similarity.*;

class RecommenderIntro {

    private RecommenderIntro() {
    }

    public static void main(String[] args) throws Exception {

        DataModel model = new FileDataModel(new File("ua.base"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);

        Recommender recommender = new GenericUserBasedRecommender(
            model, neighborhood, similarity);

        List<RecommendedItem> recommendations = recommender.recommend(10, 50);

        for (RecommendedItem recommendation : recommendations) {
          System.out.println(recommendation);
        }
    }                                                                       We can play with
}                                                                            parameters!
Example 5: evaluation
class EvaluatorIntro {
                                              Ensures the consistency between
    private EvaluatorIntro() {
    }                                            different evaluation runs.
    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                        70%training
                                                  (evaluation on the whole
                                                          dataset)
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                           Recommendation Engine
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
        RecommenderEvaluator rmse = new RMSEEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        System.out.println(score);
    }
}
                                                          We can add more measures
Example 5: evaluation
class EvaluatorIntro {

    private EvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
        RecommenderEvaluator rmse = new RMSEEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);
        double rmse = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0);

        System.out.println(score);
        System.out.println(rmse);
    }
}
Example 6: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                      Precision@5 , Recall@5, etc.
}
Example 6: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                      Precision@5 , Recall@5, etc.
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Neighborhood Size
Example 6b: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(200, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                        Set Neighborhood to 200
}
Example 6b: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                        Set Neighborhood to 500
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Similarity Measure (e.g. Euclidean One)
Example 6c: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             UserSimilarity similarity = new EuclideanDistanceSimilarity(model);
             UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model);
             return new GenericUserBasedRecommender(model, neighborhood, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                                            Set Euclidean Distance
}
Mahout Strengths
• Fast-prototyping and evaluation
  – To evaluate a different configuration of the same
    algorithm we just need to update a parameter and
    run again.

  – Example
     • Different Recommendation Engine (e.g. SlopeOne)
Example 6d: IR-based evaluation
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
          @Override
          public Recommender buildRecommender(DataModel model) throws TasteException {
                  return new SlopeOneRecommender(model);
             }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
                                                    Change Recommendation
}                                                          Algorithm
Example 7: item-based recommender
• Mahout provides Java classes for building an
  item-based recommender system
  – Amazon-like
  – Recommendations are based on similarities
    among items (generally pre-computed offline)
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
                                                                                 ItemSimilarity
Example 7: item-based recommender
class IREvaluatorIntro {

    private IREvaluatorIntro() {
    }

    public static void main(String[] args) throws Exception {
      RandomUtils.useTestSeed();
      DataModel model = new FileDataModel(new File("ua.base"));

        RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();

        // Build the same recommender for testing that we did last time:
        RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
           @Override
           public Recommender buildRecommender(DataModel model) throws TasteException {
             ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);
             return new GenericItemBasedRecommender(model, similarity);
           }
        };

        IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5,
                            GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0);

        System.out.println(stats.getPrecision());
        System.out.println(stats.getRecall());
        System.out.println(stats.getF1());
    }
}
                                                             No Neighborhood definition for item-
                                                                    based recommenders
End. Do you want more?
Do you want more?
• Recommendation
  – Deploy of a Mahout-based Web Recommender
  – Integration with Hadoop
  – Integration of content-based information
  – Custom similarities, Custom recommenders, Re-
    scoring functions


• Classification, Clustering and Pattern Mining

More Related Content

What's hot

Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildSujit Pal
 
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...Edureka!
 
Get Intelligent with Metabase
Get Intelligent with MetabaseGet Intelligent with Metabase
Get Intelligent with MetabaseAnant Corporation
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes ClassifierArunabha Saha
 
Learning sets of rules, Sequential Learning Algorithm,FOIL
Learning sets of rules, Sequential Learning Algorithm,FOILLearning sets of rules, Sequential Learning Algorithm,FOIL
Learning sets of rules, Sequential Learning Algorithm,FOILPavithra Thippanaik
 
Lecture 1- Introduction to Operating Systems.pdf
Lecture 1- Introduction to Operating Systems.pdfLecture 1- Introduction to Operating Systems.pdf
Lecture 1- Introduction to Operating Systems.pdfAmanuelmergia
 
Lecture 1 graphical models
Lecture 1  graphical modelsLecture 1  graphical models
Lecture 1 graphical modelsDuy Tung Pham
 
Collaborative Filtering - MF, NCF, NGCF
Collaborative Filtering - MF, NCF, NGCFCollaborative Filtering - MF, NCF, NGCF
Collaborative Filtering - MF, NCF, NGCFPark JunPyo
 
Statistics vs machine learning
Statistics vs machine learningStatistics vs machine learning
Statistics vs machine learningTom Dierickx
 
Information Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesInformation Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesDaniel Valcarce
 
Introduction to Model-Based Machine Learning
Introduction to Model-Based Machine LearningIntroduction to Model-Based Machine Learning
Introduction to Model-Based Machine LearningDaniel Emaasit
 
How to build a Recommender System
How to build a Recommender SystemHow to build a Recommender System
How to build a Recommender SystemVõ Duy Tuấn
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERKnoldus Inc.
 
Python Crash Course
Python Crash CoursePython Crash Course
Python Crash CourseHaim Michael
 

What's hot (20)

Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search Guild
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
Stemming And Lemmatization Tutorial | Natural Language Processing (NLP) With ...
 
Text Classification
Text ClassificationText Classification
Text Classification
 
Python numbers
Python numbersPython numbers
Python numbers
 
Get Intelligent with Metabase
Get Intelligent with MetabaseGet Intelligent with Metabase
Get Intelligent with Metabase
 
Naive Bayes Classifier
Naive Bayes ClassifierNaive Bayes Classifier
Naive Bayes Classifier
 
Learning sets of rules, Sequential Learning Algorithm,FOIL
Learning sets of rules, Sequential Learning Algorithm,FOILLearning sets of rules, Sequential Learning Algorithm,FOIL
Learning sets of rules, Sequential Learning Algorithm,FOIL
 
Lecture 1- Introduction to Operating Systems.pdf
Lecture 1- Introduction to Operating Systems.pdfLecture 1- Introduction to Operating Systems.pdf
Lecture 1- Introduction to Operating Systems.pdf
 
Lecture 1 graphical models
Lecture 1  graphical modelsLecture 1  graphical models
Lecture 1 graphical models
 
Collaborative Filtering - MF, NCF, NGCF
Collaborative Filtering - MF, NCF, NGCFCollaborative Filtering - MF, NCF, NGCF
Collaborative Filtering - MF, NCF, NGCF
 
Statistics vs machine learning
Statistics vs machine learningStatistics vs machine learning
Statistics vs machine learning
 
Parallelformers
ParallelformersParallelformers
Parallelformers
 
Information Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesInformation Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slides
 
Introduction to Model-Based Machine Learning
Introduction to Model-Based Machine LearningIntroduction to Model-Based Machine Learning
Introduction to Model-Based Machine Learning
 
How to build a Recommender System
How to build a Recommender SystemHow to build a Recommender System
How to build a Recommender System
 
NAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIERNAIVE BAYES CLASSIFIER
NAIVE BAYES CLASSIFIER
 
Ensemble methods
Ensemble methodsEnsemble methods
Ensemble methods
 
Bleu vs rouge
Bleu vs rougeBleu vs rouge
Bleu vs rouge
 
Python Crash Course
Python Crash CoursePython Crash Course
Python Crash Course
 

Viewers also liked

Introduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache MahoutIntroduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache Mahoutsscdotopen
 
SDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whySDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whyKorea Sdec
 
Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Cataldo Musto
 
Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Cataldo Musto
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutTed Dunning
 
Machine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionMachine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionVarad Meru
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningVarad Meru
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle CompetitionsDataRobot
 

Viewers also liked (9)

Introduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache MahoutIntroduction to Collaborative Filtering with Apache Mahout
Introduction to Collaborative Filtering with Apache Mahout
 
SDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the whySDEC2011 Mahout - the what, the how and the why
SDEC2011 Mahout - the what, the how and the why
 
Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)Mahout Tutorial and Hands-on (version 2015)
Mahout Tutorial and Hands-on (version 2015)
 
Intro to Apache Mahout
Intro to Apache MahoutIntro to Apache Mahout
Intro to Apache Mahout
 
Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014 Apache Mahout Tutorial - Recommendation - 2013/2014
Apache Mahout Tutorial - Recommendation - 2013/2014
 
Whats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache MahoutWhats Right and Wrong with Apache Mahout
Whats Right and Wrong with Apache Mahout
 
Machine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An IntroductionMachine Learning and Apache Mahout : An Introduction
Machine Learning and Apache Mahout : An Introduction
 
Introduction to Mahout and Machine Learning
Introduction to Mahout and Machine LearningIntroduction to Mahout and Machine Learning
Introduction to Mahout and Machine Learning
 
10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions10 R Packages to Win Kaggle Competitions
10 R Packages to Win Kaggle Competitions
 

Similar to Tutorial Mahout - Recommendation

Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender systemKaren Li
 
Sparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSpark Summit
 
Sparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsSparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsMaya Hristakeva
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engineKeeyong Han
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
Beyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignBeyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignData Science London
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
 
Teacher training material
Teacher training materialTeacher training material
Teacher training materialVikram Parmar
 
Apache Mahout
Apache MahoutApache Mahout
Apache MahoutAjit Koti
 
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Sonya Liberman
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsCollaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsNavisro Analytics
 
IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.ASHISH JAGTAP
 
Running with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightRunning with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightChris Price
 
SDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutSDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutKorea Sdec
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Lucidworks
 

Similar to Tutorial Mahout - Recommendation (20)

Tag based recommender system
Tag based recommender systemTag based recommender system
Tag based recommender system
 
Sparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya HristakevaSparking Science up with Research Recommendations by Maya Hristakeva
Sparking Science up with Research Recommendations by Maya Hristakeva
 
Sparking Science up with Research Recommendations
Sparking Science up with Research RecommendationsSparking Science up with Research Recommendations
Sparking Science up with Research Recommendations
 
Buidling large scale recommendation engine
Buidling large scale recommendation engineBuidling large scale recommendation engine
Buidling large scale recommendation engine
 
Machine Learning & Apache Mahout
Machine Learning & Apache MahoutMachine Learning & Apache Mahout
Machine Learning & Apache Mahout
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
Beyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems DesignBeyond Accuracy: Goal-Driven Recommender Systems Design
Beyond Accuracy: Goal-Driven Recommender Systems Design
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in Mendeley
 
Teacher training material
Teacher training materialTeacher training material
Teacher training material
 
Apache Mahout
Apache MahoutApache Mahout
Apache Mahout
 
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ...
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
Collaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro AnalyticsCollaborative Filtering and Recommender Systems By Navisro Analytics
Collaborative Filtering and Recommender Systems By Navisro Analytics
 
IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.IOTA 2016 Social Recomender System Presentation.
IOTA 2016 Social Recomender System Presentation.
 
Running with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsightRunning with Elephants: Predictive Analytics with HDInsight
Running with Elephants: Predictive Analytics with HDInsight
 
SDEC2011 Essentials of Mahout
SDEC2011 Essentials of MahoutSDEC2011 Essentials of Mahout
SDEC2011 Essentials of Mahout
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 

More from Cataldo Musto

MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...Cataldo Musto
 
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationFairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationCataldo Musto
 
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Cataldo Musto
 
Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Cataldo Musto
 
Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Cataldo Musto
 
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Cataldo Musto
 
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Cataldo Musto
 
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsHybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsCataldo Musto
 
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Cataldo Musto
 
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeL'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeCataldo Musto
 
Explanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemExplanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemCataldo Musto
 
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Cataldo Musto
 
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...Cataldo Musto
 
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfMyrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfCataldo Musto
 
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Cataldo Musto
 
Holistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesHolistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesCataldo Musto
 
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsA Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsCataldo Musto
 
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?Cataldo Musto
 
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Cataldo Musto
 
Il Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkIl Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkCataldo Musto
 

More from Cataldo Musto (20)

MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...MyrrorBot: a Digital Assistant Based on Holistic User Models forPersonalize...
MyrrorBot: a Digital Assistant Based on Holistic User Models for Personalize...
 
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical EvaluationFairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
Fairness and Popularity Bias in Recommender Systems: an Empirical Evaluation
 
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
Intelligenza Artificiale e Social Media - Monitoraggio della Farnesina e La M...
 
Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...Exploring the Effects of Natural Language Justifications in Food Recommender ...
Exploring the Effects of Natural Language Justifications in Food Recommender ...
 
Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...Exploiting Distributional Semantics Models for Natural Language Context-aware...
Exploiting Distributional Semantics Models for Natural Language Context-aware...
 
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
Towards a Knowledge-aware Food Recommender System Exploiting Holistic User Mo...
 
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
Towards Queryable User Profiles: Introducing Conversational Agents in a Platf...
 
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph EmbeddingsHybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
Hybrid Semantics aware Recommendations Exploiting Knowledge Graph Embeddings
 
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...Natural Language Justifications for Recommender Systems Exploiting Text Summa...
Natural Language Justifications for Recommender Systems Exploiting Text Summa...
 
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA RispondeL'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
L'IA per l'Empowerment del Cittadino: Hate Map, Myrror, PA Risponde
 
Explanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender SystemExplanation Strategies - Advances in Content-based Recommender System
Explanation Strategies - Advances in Content-based Recommender System
 
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
Justifying Recommendations through Aspect-based Sentiment Analysis of Users R...
 
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...ExpLOD: un framework per la generazione di spiegazioni per recommender system...
ExpLOD: un framework per la generazione di spiegazioni per recommender system...
 
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified SelfMyrror: una piattaforma per Holistic User Modeling e Quantified Self
Myrror: una piattaforma per Holistic User Modeling e Quantified Self
 
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...Semantic Holistic User Modeling for Personalized Access to Digital Content an...
Semantic Holistic User Modeling for Personalized Access to Digital Content an...
 
Holistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart CitiesHolistic User Modeling for Personalized Services in Smart Cities
Holistic User Modeling for Personalized Services in Smart Cities
 
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital FootprintsA Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints
 
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
eHealth, mHealth in Otorinolaringoiatria: innovazioni dirompenti o disastrose?
 
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
Semantics-aware Recommender Systems Exploiting Linked Open Data and Graph-bas...
 
Il Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social NetworkIl Linguaggio dell'Odio sui Social Network
Il Linguaggio dell'Odio sui Social Network
 

Recently uploaded

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

Tutorial Mahout - Recommendation

  • 1. Apache Mahout - Tutorial Cataldo Musto, Ph.D. Corso di Accesso Intelligente all’Informazione ed Elaborazione del Linguaggio Naturale Università degli Studi di Bari – Dipartimento di Informatica – A.A. 2012/2013 11/01/2013
  • 2. Outline • What is Mahout ? – Overview • How to use Mahout ? – Java snippets
  • 3. Part 1 What is Mahout?
  • 4. Goal • Mahout is a Java library – Implementing Machine Learning techniques
  • 5. Goal • Mahout is a Java library – Implementing Machine Learning techniques • Clustering • Classification • Recommendation • Frequent ItemSet
  • 6. Goal • Mahout is a Java library – Implementing Machine Learning technique • Clustering • Classification • Recommendation • Frequent ItemSet – Scalable
  • 7. What can we do? • Currently Mahout supports mainly four use cases: – Recommendation - takes users' behavior and from that tries to find items users might like. – Clustering - takes e.g. text documents and groups them into groups of topically related documents. – Classification - learns from exisiting categorized documents what documents of a specific category look like and is able to assign unlabelled documents to the (hopefully) correct category. – Frequent itemset mining - takes a set of item groups (terms in a query session, shopping cart content) and identifies, which individual items usually appear together.
  • 8. Algorithms • Recommendation – User-based Collaborative Filtering – Item-based Collaborative Filering – SlopeOne Recommenders – Singular Value Decomposition
  • 9. Algorithms • Clustering – Canopy – K-Means – Fuzzy K-Means – Hierarchical Clustering – Minhash Clustering (and much more…)
  • 10. Algorithms • Classification – Logistic Regression – Bayes – Support Vector Machines – Perceptrons – Neural Networks – Restricted Boltzmann Machines – Hidden Markov Models (and much more…)
  • 11. Mahout in the Apache Software Foundation
  • 12. Focus • In this tutorial we will focus just on: – Recommendation
  • 13. Recommendation • Mahout implements a Collaborative Filtering framework – Popularized by Amazon and others – Uses hystorical data (ratings, clicks, and purchases) to provide recommendations • User-based: Recommend items by finding similar users. This is often harder to scale because of the dynamic nature of users. • Item-based: Calculate similarity between items and make recommendations. Items usually don't change much, so this often can be computed offline. • Slope-One: A very fast and simple item-based recommendation approach applicable when users have given ratings (and not just boolean preferences).
  • 15. Recommendation - Architecture Physical Storage
  • 16. Recommendation - Architecture Data Model Physical Storage
  • 17. Recommendation - Architecture Recommender Data Model Physical Storage
  • 18. Recommendation - Architecture External Application Recommender Data Model Physical Storage
  • 19. Recommendation in Mahout • Input: raw data (user preferences) • Output: preferences estimation • Step 1 – Mapping raw data into a DataModel Mahout- compliant • Step 2 – Tuning recommender components • Similarity measure, neighborhood, etc.
  • 20. Recommendation Components • Mahout implements interfaces to these key abstractions: – DataModel • Methods for mapping raw data to a Mahout-compliant form – UserSimilarity • Methods to calculate the degree of correlation between two users – ItemSimilarity • Methods to calculate the degree of correlation between two items – UserNeighborhood • Methods to define the concept of ‘neighborhood’ – Recommender • Methods to implement the recommendation step itself
  • 21. Components: DataModel • A DataModel is the interface to draw information about user preferences. • Which sources is it possible to draw? – Database • MySQLJDBCDataModel – External Files • FileDataModel – Generic (preferences directly feed through Java code) • GenericDataModel
  • 22. Components: DataModel • GenericDataModel – Feed through Java calls • FileDataModel – CSV (Comma Separated Values) • JDBCDataModel – JDBC Driver – Standard database structure
  • 24. Components: DataModel • Regardless the source, they all share a common implementation. • Basic object: Preference – Preference is a triple (user,item,score) – Stored in UserPreferenceArray
  • 25. Components: DataModel • Basic object: Preference – Preference is a triple (user,item,score) – Stored in UserPreferenceArray • Two implementations – GenericUserPreferenceArray • It stores numerical preference, as well. – BooleanUserPreferenceArray • It skips numerical preference values.
  • 26. Components: UserSimilarity • UserSimilarity defines a notion of similarity between two Users. – (respectively) ItemSimilarity defines a notion of similarity between two Items. • Which definition of similarity are available? – Pearson Correlation – Spearman Correlation – Euclidean Distance – Tanimoto Coefficient – LogLikelihood Similarity – Already implemented!
  • 28. Components: UserNeighborhood • UserSimilarity defines a notion of similarity between two Users. – (respectively) ItemSimilarity defines a notion of similarity between two Items. • Which definition of neighborhood are available? – Nearest N users • The first N users with the highest similarity are labeled as ‘neighbors’ – Tresholds • Users whose similarity is above a threshold are labeled as ‘neighbors’ – Already implemented!
  • 29. Components: Recommender • Given a DataModel, a definition of similarity between users (items) and a definition of neighborhood, a recommender produces as output an estimation of relevance for each unseen item • Which recommendation algorithms are implemented? – User-based CF – Item-based CF – SVD-based CF – SlopeOne CF (and much more…)
  • 31. Recap • Hundreds of possible implementations of a CF-based recommender! – 6 different recommendation algorithms – 2 different neighborhood definitions – 5 different similarity definitions • Evaluation fo the different implementations is actually very time-consuming – The strength of Mahout lies in that it is possible to save time in the evaluation of the different combinations of the parameters! – Standard interface for the evaluation of a Recommender System
  • 32. Evaluation • Mahout provides classes for the evaluation of a recommender system – Prediction-based measures • Mean Average Error • RMSE (Root Mean Square Error) – IR-based measures • Precision • Recall • F1-measure • F1@n • NDCG (ranking measure)
  • 33. Evaluation • Prediction-based Measures – Class: AverageAbsoluteDifferenceEvaluator – Method: evaluate() – Parameters: • Recommender implementation • DataModel implementation • TrainingSet size (e.g. 70%) • % of the data to use in the evaluation (smaller % for fast prototyping)
  • 34. Evaluation • IR-based Measures – Class: GenericRecommenderIRStatsEvaluator – Method: evaluate() – Parameters: • Recommender implementation • DataModel implementation • Relevance Threshold (mean+standard deviation) • % of the data to use in the evaluation (smaller % for fast prototyping)
  • 35. Part 2 How to use Mahout?
  • 36. Download Mahout • Official Release – The latest Mahout release is 0.7 – Available at: http://www.apache.org/dyn/closer.cgi/mahout • Java 1.6.x or greater. • Hadoop is not mandatory!
  • 37. Example 1: preferences import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.Preference; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreatePreferenceArray { private CreatePreferenceArray() { } public static void main(String[] args) { PreferenceArray user1Prefs = new GenericUserPreferenceArray(2); user1Prefs.setUserID(0, 1L); user1Prefs.setItemID(0, 101L); user1Prefs.setValue(0, 2.0f); user1Prefs.setItemID(1, 102L); user1Prefs.setValue(1, 3.0f); Preference pref = user1Prefs.get(1); System.out.println(pref); } }
  • 38. Example 1: preferences import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.Preference; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreatePreferenceArray { private CreatePreferenceArray() { } public static void main(String[] args) { PreferenceArray user1Prefs = new GenericUserPreferenceArray(2); user1Prefs.setUserID(0, 1L); user1Prefs.setItemID(0, 101L); user1Prefs.setValue(0, 2.0f); Score 2 for Item 101 user1Prefs.setItemID(1, 102L); user1Prefs.setValue(1, 3.0f); Preference pref = user1Prefs.get(1); System.out.println(pref); } }
  • 39. Example 2: data model • PreferenceArray stores the preferences of a single user • Where do the preferences of all the users are stored? – An HashMap? No. – Mahout introduces data structures optimized for recommendation tasks • HashMap are replaced by FastByIDMap
  • 40. Example 2: data model import org.apache.mahout.cf.taste.impl.common.FastByIDMap; import org.apache.mahout.cf.taste.impl.model.GenericDataModel; import org.apache.mahout.cf.taste.impl.model.GenericUserPreferenceArray; import org.apache.mahout.cf.taste.model.DataModel; import org.apache.mahout.cf.taste.model.PreferenceArray; class CreateGenericDataModel { private CreateGenericDataModel() { } public static void main(String[] args) { FastByIDMap<PreferenceArray> preferences = new FastByIDMap<PreferenceArray>(); PreferenceArray prefsForUser1 = new GenericUserPreferenceArray(10); prefsForUser1.setUserID(0, 1L); prefsForUser1.setItemID(0, 101L); prefsForUser1.setValue(0, 3.0f); prefsForUser1.setItemID(1, 102L); prefsForUser1.setValue(1, 4.5f); preferences.put(1L, prefsForUser1); DataModel model = new GenericDataModel(preferences); System.out.println(model); } }
  • 41. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 42. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); FileDataModel UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 43. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); 2 neighbours List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 44. Example 3: First Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("intro.csv")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(2, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 1); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } Top-1 Recommendation } for User 1
  • 45. Example 4: MovieLens Recommender • Download the GroupLens dataset (100k) – Its format is already Mahout compliant • Now we can run the recommendation framework against a state-of-the-art dataset
  • 46. Example 4: MovieLens Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("ua.base")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(1, 20); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } }
  • 47. Example 4: MovieLens Recommender import org.apache.mahout.cf.taste.impl.model.file.*; import org.apache.mahout.cf.taste.impl.neighborhood.*; import org.apache.mahout.cf.taste.impl.recommender.*; import org.apache.mahout.cf.taste.impl.similarity.*; import org.apache.mahout.cf.taste.model.*; import org.apache.mahout.cf.taste.neighborhood.*; import org.apache.mahout.cf.taste.recommender.*; import org.apache.mahout.cf.taste.similarity.*; class RecommenderIntro { private RecommenderIntro() { } public static void main(String[] args) throws Exception { DataModel model = new FileDataModel(new File("ua.base")); UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); Recommender recommender = new GenericUserBasedRecommender( model, neighborhood, similarity); List<RecommendedItem> recommendations = recommender.recommend(10, 50); for (RecommendedItem recommendation : recommendations) { System.out.println(recommendation); } } We can play with } parameters!
  • 48. Example 5: evaluation class EvaluatorIntro { Ensures the consistency between private EvaluatorIntro() { } different evaluation runs. public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } }
  • 49. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } }
  • 50. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } 70%training (evaluation on the whole dataset)
  • 51. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } Recommendation Engine
  • 52. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); RecommenderEvaluator rmse = new RMSEEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); } } We can add more measures
  • 53. Example 5: evaluation class EvaluatorIntro { private EvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator(); RecommenderEvaluator rmse = new RMSEEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; double score = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); double rmse = evaluator.evaluate(recommenderBuilder, null, model, 0.7, 1.0); System.out.println(score); System.out.println(rmse); } }
  • 54. Example 6: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Precision@5 , Recall@5, etc. }
  • 55. Example 6: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(100, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Precision@5 , Recall@5, etc. }
  • 56. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Neighborhood Size
  • 57. Example 6b: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(200, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Neighborhood to 200 }
  • 58. Example 6b: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new PearsonCorrelationSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Neighborhood to 500 }
  • 59. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Similarity Measure (e.g. Euclidean One)
  • 60. Example 6c: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { UserSimilarity similarity = new EuclideanDistanceSimilarity(model); UserNeighborhood neighborhood = new NearestNUserNeighborhood(500, similarity, model); return new GenericUserBasedRecommender(model, neighborhood, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Set Euclidean Distance }
  • 61. Mahout Strengths • Fast-prototyping and evaluation – To evaluate a different configuration of the same algorithm we just need to update a parameter and run again. – Example • Different Recommendation Engine (e.g. SlopeOne)
  • 62. Example 6d: IR-based evaluation class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { return new SlopeOneRecommender(model); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } Change Recommendation } Algorithm
  • 63. Example 7: item-based recommender • Mahout provides Java classes for building an item-based recommender system – Amazon-like – Recommendations are based on similarities among items (generally pre-computed offline)
  • 64. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } }
  • 65. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } } ItemSimilarity
  • 66. Example 7: item-based recommender class IREvaluatorIntro { private IREvaluatorIntro() { } public static void main(String[] args) throws Exception { RandomUtils.useTestSeed(); DataModel model = new FileDataModel(new File("ua.base")); RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator(); // Build the same recommender for testing that we did last time: RecommenderBuilder recommenderBuilder = new RecommenderBuilder() { @Override public Recommender buildRecommender(DataModel model) throws TasteException { ItemSimilarity similarity = new PearsonCorrelationSimilarity(model); return new GenericItemBasedRecommender(model, similarity); } }; IRStatistics stats = evaluator.evaluate(recommenderBuilder, null, model, null, 5, GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1,0); System.out.println(stats.getPrecision()); System.out.println(stats.getRecall()); System.out.println(stats.getF1()); } } No Neighborhood definition for item- based recommenders
  • 67. End. Do you want more?
  • 68. Do you want more? • Recommendation – Deploy of a Mahout-based Web Recommender – Integration with Hadoop – Integration of content-based information – Custom similarities, Custom recommenders, Re- scoring functions • Classification, Clustering and Pattern Mining