Discovering Users’ Topics of Interest
in Recommender Systems
Gabriel Moreira - @gspmoreira
Gilmar Souza - @gilmarsouza
Track: Data Science
2016
Agenda
● Recommender Systems
● Topic Modeling
● Case: Smart Canvas - Corporate Collaboration
● Wrap up
Recommender Systems
An introduction
Life is too short!
Social recommendations
Recommendations by interaction
What should I
watch today?
What do you
like, my son?
"A lot of times, people don’t know what they want
until you show it to them."
Steve Jobs
"We are leaving the Information Age and entering the
Recommendation Age."
Chris Anderson, "The Long Tail"
Recommendations are responsible for...
● 38% of sales
● 2/3 of movie rentals
● 38% of top news views
What else may I recommend?
What can a Recommender System do?
Prediction
Given an item, what is its relevance for
each user?
Recommendation
Given a user, produce an ordered list matching the
user needs
How it works
Content-Based Filtering
Similar content (e.g. actor)
Likes
Recommends
Advantages
● Does not depend upon other users
● May recommend new and unpopular items
● Recommendations can be easily explained
Drawbacks
● Overspecialization
● May not recommend to new users
● May be difficult to extract attributes from audio, movies or images
Content-Based Filtering
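The idea of recommending items whose attributes resemble what a user already likes can be sketched in a few lines. This is only an illustration under stated assumptions: the item names, the binary attribute vectors, and the helper names `cosine` and `recommend_by_content` are hypothetical and do not come from the talk.

```python
from math import sqrt

# Hypothetical item attribute vectors (1 = item has the attribute).
# Attribute order: [action, comedy, actor_x, actor_y]
ITEMS = {
    "movie_a": [1, 0, 1, 0],
    "movie_b": [1, 0, 1, 1],
    "movie_c": [0, 1, 0, 1],
    "movie_d": [0, 1, 0, 0],
}

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_by_content(liked, items, top_n=2):
    """Rank items the user has not liked by similarity to the user's profile."""
    # The profile is the element-wise sum of the liked items' attribute vectors.
    profile = [sum(col) for col in zip(*(items[i] for i in liked))]
    candidates = [(cosine(profile, vec), name)
                  for name, vec in items.items() if name not in liked]
    return [name for score, name in sorted(candidates, reverse=True)[:top_n]]

recommendations = recommend_by_content(["movie_a"], ITEMS)
```

Because only the user's own likes and the item attributes are used, a brand-new item can be recommended as soon as its attributes are known, while a user with no likes gets an empty profile.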
User-Based Collaborative Filtering
Similar interests
Likes
Recommends
Item-Based Collaborative Filtering
Likes Recommends
Who likes A also likes B
Likes
Likes
Collaborative Filtering
Advantages
● Works for any kind of item (ignores attributes)
Drawbacks
● Cannot recommend items not already
rated/consumed
● Usually recommends more popular items
● Needs a minimum amount of users to match similar
users (cold start)
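"Who likes A also likes B" can be approximated by counting how often items are liked together. The sketch below assumes hypothetical users and items and a simple co-occurrence score; production libraries typically use normalized similarity measures instead.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical "likes" data: user -> set of liked items.
LIKES = {
    "ana": {"A", "B"},
    "bob": {"A", "B", "C"},
    "carla": {"A", "C"},
    "dan": {"B"},
}

def co_occurrence(likes):
    """Count, for every ordered item pair (x, y), how many users like both."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in likes.values():
        for x, y in permutations(items, 2):
            counts[x][y] += 1
    return counts

def also_liked(user, likes, top_n=3):
    """Score unseen items by how often they co-occur with the user's liked items."""
    counts = co_occurrence(likes)
    scores = defaultdict(int)
    for liked in likes[user]:
        for other, n in counts[liked].items():
            if other not in likes[user]:
                scores[other] += n
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

suggestions = also_liked("dan", LIKES)
```

An item that nobody has liked yet never appears in the counts, which is the first drawback listed above.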
Hybrid Recommender Systems
Composite
Iterates through a chain of algorithms, aggregating their
recommendations.
Weighted
Each algorithm has a weight, and the final
recommendations are defined by weighted averages.
Some approaches
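A weighted hybrid can be sketched as a weighted average of per-algorithm scores. The function name `weighted_hybrid`, the score dictionaries, and the weights are hypothetical, and treating a missing score as 0 is one possible convention rather than a rule from the talk.

```python
def weighted_hybrid(scores_by_algorithm, weights, top_n=3):
    """Combine per-algorithm item scores into one ranking by weighted average.

    scores_by_algorithm: {algorithm_name: {item: score in [0, 1]}}
    weights:             {algorithm_name: weight}
    Items missing from an algorithm's output count as score 0 for that algorithm.
    """
    total_weight = sum(weights.values())
    items = {item for scores in scores_by_algorithm.values() for item in scores}
    combined = {
        item: sum(weights[name] * scores.get(item, 0.0)
                  for name, scores in scores_by_algorithm.items()) / total_weight
        for item in items
    }
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical scores from two recommenders.
scores = {
    "content_based": {"post_1": 0.9, "post_2": 0.2},
    "collaborative": {"post_2": 0.8, "post_3": 0.5},
}
ranking = weighted_hybrid(scores, {"content_based": 1.0, "collaborative": 3.0})
```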
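UBCF Example (Java / Mahout)
import java.io.File;
import java.util.List;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

// Loads user-item ratings
DataModel model = new FileDataModel(new File("input.csv"));
// Defines a similarity metric to compare users (Pearson's correlation coefficient)
UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
// Minimum similarity for two users to be considered neighbors
UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, similarity, model);
// Creates a User-Based Collaborative Filtering recommender
UserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
// Returns the top 3 recommendations for userId=2
List<RecommendedItem> recommendations = recommender.recommend(2, 3);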
input.csv (columns: User, Item, Rating)
1,15,4.0
1,16,5.0
1,17,1.0
1,18,5.0
2,10,1.0
2,11,2.0
2,15,5.0
2,16,4.5
2,17,1.0
2,18,5.0
3,11,2.5
User-Based Collaborative Filtering example (Mahout)
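To make the steps of the Mahout example concrete, here is a rough pure-Python sketch of the same idea: Pearson similarity over co-rated items, a similarity threshold to pick neighbors, and a similarity-weighted average to predict ratings. It is not Mahout's implementation; the function names and the rule of requiring at least two co-rated items are assumptions made for illustration.

```python
from math import sqrt

# The ratings from input.csv: {user: {item: rating}}
RATINGS = {
    1: {15: 4.0, 16: 5.0, 17: 1.0, 18: 5.0},
    2: {10: 1.0, 11: 2.0, 15: 5.0, 16: 4.5, 17: 1.0, 18: 5.0},
    3: {11: 2.5},
}

def pearson(a, b):
    """Pearson correlation over the items both users rated (None if undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return None
    xs, ys = [a[i] for i in common], [b[i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var) if var else None

def recommend(user, ratings, how_many=3, threshold=0.1):
    """Predict ratings for unseen items from neighbors whose similarity >= threshold."""
    neighbors = {}
    for other in ratings:
        if other != user:
            sim = pearson(ratings[user], ratings[other])
            if sim is not None and sim >= threshold:
                neighbors[other] = sim
    weighted, weights = {}, {}
    for other, sim in neighbors.items():
        for item, rating in ratings[other].items():
            if item not in ratings[user]:
                weighted[item] = weighted.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predictions = {item: weighted[item] / weights[item] for item in weighted}
    return sorted(predictions.items(), key=lambda kv: (-kv[1], kv[0]))[:how_many]

recs_user_1 = recommend(1, RATINGS)
recs_user_2 = recommend(2, RATINGS)
```

With only the rows shown in input.csv, user 2 has already rated every item rated by the other users, so this sketch returns an empty list for user 2; user 1 receives items 11 and 10, predicted from user 2's ratings.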
Frameworks - Recommender Systems
Python
Python / Scala
Java
.NET
Java
Books
Topic Modeling
An introduction
Topic Modeling
A simple way to analyze topics of large text collections (corpus).
What is this data about?
What are the main topics?
Topic
A Cluster of words that generally occur together and are related.
Example of a 2D topics visualization with pyLDAvis (topics are the circles on the left; the main terms of the selected topic are shown on the right)
Topics evolution
Visualization of how topics evolve over time
Topic Modeling applications
● Detect market trends and customer challenges of an industry from media
and social networks.
● Marketing SEO: optimization of search keywords to improve page ranking in
search engines
● Identify users' preferences to recommend relevant content, products, jobs, …
● Documents are composed of many topics
● Topics are composed of many words (tokens)
Topic Modeling
Documents (observable) -> Topics (latent) -> Words / tokens (observable)
Unsupervised learning technique that does not require much pre-processing
effort (usually just text vectorization). In general, the only parameter is the
number of topics (k).
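The "documents are mixtures of topics, topics are mixtures of words" view used by probabilistic topic models can be written as P(word | document) = sum over topics of P(topic | document) * P(word | topic). The sketch below only evaluates that formula; the two topics and all probabilities are hypothetical, and no model is being fitted.

```python
# Hypothetical topic-word distributions: P(word | topic).
TOPIC_WORDS = {
    "sports": {"game": 0.5, "team": 0.4, "vote": 0.1},
    "politics": {"vote": 0.6, "team": 0.1, "game": 0.3},
}

def word_distribution(doc_topics, topic_words):
    """P(word | document) = sum over topics of P(topic | document) * P(word | topic)."""
    mixture = {}
    for topic, p_topic in doc_topics.items():
        for word, p_word in topic_words[topic].items():
            mixture[word] = mixture.get(word, 0.0) + p_topic * p_word
    return mixture

# A hypothetical document that is 75% sports and 25% politics.
doc_words = word_distribution({"sports": 0.75, "politics": 0.25}, TOPIC_WORDS)
```

Topic modeling works in the opposite direction: only the words in each document are observed, and the topic mixtures and topic-word distributions are estimated from them.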
Topics in a document
Topics example from NYT
Topic analysis (LDA) of 1.8 million New York Times articles
How it works
Text vectorization
Represent each document as a feature vector in the vector space, where each
position represents a word (token) and the contained value is its relevance in the
document.
● BoW (Bag of words)
● TF-IDF (Term Frequency - Inverse Document Frequency)
Document Term Matrix - Bag of Words
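A document-term matrix and a textbook TF-IDF weighting can be built by hand to see what the vectorizers compute. The toy corpus is hypothetical, and the weighting used here (term count times log(N / df)) is the plain textbook variant; scikit-learn's TfidfVectorizer applies smoothing and normalization by default, so its numbers differ.

```python
from collections import Counter
from math import log

# Hypothetical toy corpus.
DOCS = [
    "the cat sleeps",
    "the dog sleeps",
    "the cat chases the dog",
]

tokenized = [doc.split() for doc in DOCS]
vocabulary = sorted({token for doc in tokenized for token in doc})

# Bag of words: one row per document, one column per vocabulary token.
bow = []
for doc in tokenized:
    counts = Counter(doc)
    bow.append([counts[token] for token in vocabulary])

# Document frequency: in how many documents each token appears.
df = {token: sum(token in doc for doc in tokenized) for token in vocabulary}

# Textbook TF-IDF: term count times log(N / df).
n_docs = len(tokenized)
tfidf = [[count * log(n_docs / df[token]) for token, count in zip(vocabulary, row)]
         for row in bow]
```

A token that appears in every document ("the") gets weight 0, while a token unique to one document ("chases") gets the highest weight per occurrence.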
Text vectorization
TF-IDF example with scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
# text_corpus: list of raw document strings
vectorizer = TfidfVectorizer(max_df=0.5, max_features=1000,
                             min_df=2, stop_words='english')
tfidf_corpus = vectorizer.fit_transform(text_corpus)
TF-IDF sparse matrix example: rows are documents (D1, D2, ...), columns are tokens indexed 0-9 (face, person, guide, lock, cat, dog, sleep, micro, pool, gym); only the non-zero weights are stored (e.g. 0.05 and 0.25 for D1; 0.02, 0.32 and 0.45 for D2)
Text vectorization
“Did you ever wonder how great it would be if you could write your jmeter tests in ruby ? This projects aims to
do so. If you use it on your project just let me now. On the Architecture Academy you can read how jmeter can
be used to validate your Architecture. definition | architecture validation | academia de arquitetura”
Example
Tokens (unigrams and bigrams) Weight
jmeter 0.466
architecture 0.380
validate 0.243
validation 0.242
definition 0.239
write 0.225
academia arquitetura 0.218
academy 0.216
ruby 0.213
tests 0.209
Relevant keywords (TF-IDF)
Visualization of the average TF-IDF vectors of posts at Google+ for Work (CI&T)
Text vectorization
See more details about this social and text analytics at http://bit.ly/python4ds_nb
Main techniques
● Latent Dirichlet Allocation (LDA) -> Probabilistic
● Latent Semantic Indexing / Analysis (LSI / LSA) -> Matrix Factorization
● Non-Negative Matrix Factorization (NMF) -> Matrix Factorization
Topic Modeling example (Python / gensim)
from gensim import corpora, models, similarities
documents = ["Human machine interface for lab abc computer applications",
"A survey of user opinion of computer system response time",
"The EPS user interface management system", ...]
stoplist = set('for a of the and to in'.split())
texts = [[word for word in document.lower().split() if word not in stoplist] for document in documents]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
tfidf = models.TfidfModel(corpus)
corpus_tfidf = tfidf[corpus]
lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=2)
corpus_lsi = lsi[corpus_tfidf]
lsi.print_topics(2)
topic #0(1.594): 0.703*"trees" + 0.538*"graph" + 0.402*"minors" + 0.187*"survey" + ...
topic #1(1.476): 0.460*"system" + 0.373*"user" + 0.332*"eps" + 0.328*"interface" + 0.320*"response" + ...
Example of topic modeling using LSI technique
Example of the 2 topics discovered in the corpus
from pyspark.mllib.clustering import LDA, LDAModel
from pyspark.mllib.linalg import Vectors
# Load and parse the data
data = sc.textFile("data/mllib/sample_lda_data.txt")
parsedData = data.map(lambda line: Vectors.dense([float(x) for x in line.strip().split(' ')]))
# Index documents with unique IDs
corpus = parsedData.zipWithIndex().map(lambda x: [x[1], x[0]]).cache()
# Cluster the documents into three topics using LDA
ldaModel = LDA.train(corpus, k=3)
Topic Modeling example (PySpark MLlib LDA)
Sometimes simpler is better!
Scikit-learn topic models worked better for us than Spark MLlib LDA distributed
implementation because we have thousands of users with hundreds of posts,
and running scikit-learn on workers for each person was much quicker than
running distributed Spark LDA for each person.
Cosine similarity
The similarity between two vectors is measured by the cosine of the angle between them
from sklearn.metrics.pairwise import cosine_similarity
cosine_similarity(tfidf_matrix[0:1], tfidf_matrix)
Example with scikit-learn
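For people similarity, the same cosine measure can be applied to each person's keyword weights, and the shared keywords can serve as the explanation. The sketch below assumes hypothetical people and weights stored as sparse dictionaries; `sparse_cosine` and `similar_people` are illustrative names, not part of scikit-learn.

```python
from math import sqrt

# Hypothetical keyword weights (e.g. TF-IDF) per person, stored sparsely.
PEOPLE = {
    "alice": {"python": 0.8, "spark": 0.5, "recsys": 0.3},
    "bruno": {"python": 0.6, "recsys": 0.7},
    "carol": {"design": 0.9, "ux": 0.4},
}

def sparse_cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {key: weight}."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similar_people(person, people):
    """Return (other, similarity, shared keywords) sorted by decreasing similarity."""
    me = people[person]
    rows = [(other, sparse_cosine(me, vec), sorted(set(me) & set(vec)))
            for other, vec in people.items() if other != person]
    return sorted(rows, key=lambda row: (-row[1], row[0]))

matches = similar_people("alice", PEOPLE)
```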
Example of relevant keywords for a person and people with similar interests
People similarity
Frameworks - Topic Modeling
Python Java
Stanford Topic Modeling Toolbox (Excel plugin)
http://nlp.stanford.edu/software/tmt/tmt-0.4/
Python / Scala
Smart Canvas©
Corporate Collaboration
http://www.smartcanvas.com/
Powered by Recommender Systems and Topic Modeling techniques
Case
Content recommendations
Discover channel - Content recommendations with personalized explanations, based on user’s topics of
interest (discovered from their contributed content and reads)
Person topics of interest
User profile - Users' topics of interest are inferred from the content they contribute and are presented as
tags in their profile.
Searching people interested in topics / experts...
Tags discovered for users are searchable, making it possible to find experts or people with specific interests.
Similar people
People recommendation - Recommends people with similar interests, explaining which topics are shared.
How it works
Collaboration Graph Model
Smart Canvas ©
Graph Model: Dashed lines and bold
labels are the relationships inferred by usage of RecSys
and Topic Modeling techniques
Architecture Overview
Jobs Pipeline (six numbered jobs): Content Vectorization, People Topic Modeling, Boards Topic Modeling, Contents Recommendation, Boards Recommendation, Similar Content
● Jobs load content and touchpoints
● Jobs write pickles to Google Cloud Storage, and later jobs load them
● Topics and recommendations are stored in the graph
People Topic Modeling and Content Recommendations algorithm
1. Group all touchpoints (user interactions) by person
2. For each person
   a. Select the contents the user has interacted with
   b. Cluster the contents' TF-IDF vectors (e.g. LDA, NMF, K-means) to model the person's topics of interest,
      varying k over a range (1-5) and selecting the model whose clusters best describe
      cohesive topics, penalizing large k
   c. Weight each cluster's relevance to the user by summing the strength of its touchpoints (view, like,
      bookmark, …), with a time decay on interaction age (older interactions are less
      relevant to the person's current interests)
   d. Return the highly relevant topic vectors for the person, each labeled by its top three keywords
3. For each of the person's topics
   a. Calculate the cosine similarity (optionally applying Pivoted Unique Normalization)
      between the topic vector and the content TF-IDF vectors
   b. Recommend the contents most similar to the person's topics, which the user has not yet
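Step 2c, weighting topics by touchpoint strength with a time decay, can be sketched as follows. The strengths per interaction type, the 30-day half-life, and the exponential form of the decay are all assumptions chosen for illustration; the talk does not state the actual values or decay function.

```python
from collections import defaultdict

# Hypothetical strengths per interaction type and a hypothetical half-life.
TOUCHPOINT_STRENGTH = {"view": 1.0, "like": 2.0, "bookmark": 3.0}
HALF_LIFE_DAYS = 30.0

def time_decay(age_days, half_life=HALF_LIFE_DAYS):
    """Exponential decay: an interaction loses half its weight every half_life days."""
    return 0.5 ** (age_days / half_life)

def topic_relevance(touchpoints, topic_of_content):
    """Sum decayed touchpoint strengths per topic (cluster) for one person.

    touchpoints:      list of (content_id, interaction_type, age_days)
    topic_of_content: {content_id: topic_id}, e.g. the output of a clustering step
    """
    relevance = defaultdict(float)
    for content_id, kind, age_days in touchpoints:
        topic = topic_of_content[content_id]
        relevance[topic] += TOUCHPOINT_STRENGTH[kind] * time_decay(age_days)
    return dict(relevance)

touchpoints = [
    ("post_1", "like", 0),       # recent like
    ("post_2", "view", 30),      # one half-life old
    ("post_3", "bookmark", 60),  # two half-lives old
]
topics = {"post_1": "topic_a", "post_2": "topic_a", "post_3": "topic_b"}
relevance = topic_relevance(touchpoints, topics)
```

In this toy data, a two-month-old bookmark ends up weighing less than a fresh like plus a month-old view, so topic_a ranks above topic_b.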
Bloom Filter
● A space-efficient Probabilistic Data Structure used to test whether an element is a
member of a set.
● False positive matches are possible, but false negatives are not.
● An empty Bloom filter is a bit array of m bits, all set to 0.
● Each inserted key is mapped to bits (which are all set to 1) by k different hash functions.
The set {x, y, z} was inserted into the Bloom filter. w is not in the
set, because it hashes to a bit equal to 0
Bloom Filter - Example Using cross_bloomfilter - Pure Python and Java compatible implementation
GitHub: http://bit.ly/cross_bf
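The mechanics described above (a bit array of m bits and k hash positions per key) fit in a short class. This is a minimal sketch, not the cross_bloomfilter implementation: the class name, the use of salted SHA-256 to derive positions, and the values of m and k are assumptions for illustration.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: m bits, k hash positions per key."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)  # all bits start at 0

    def _positions(self, key):
        # Derive k positions by salting SHA-256 with the hash index
        # (an illustrative choice, not the scheme used by cross_bloomfilter).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bf = BloomFilter(m=1024, k=3)
for key in ("x", "y", "z"):
    bf.add(key)
```

After this, `"x" in bf` is always true, while a key that was never added is reported as absent unless all of its k bits happen to have been set by other keys.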
Bloom Filter - Complexity Analysis
Ordered List
Space Complexity: O(n)
where n is the number of items in the set
Check for key presence
Time Complexity (binary search): O(log(n))
assuming an ordered list
Example: For a list of 100,000 string keys, with an average length of 18 characters
Key list size: 1,788,900 bytes
Bloom filter size: 137,919 bytes (false positive rate = 0.5%)
Bloom filter size: 18,043 bytes (false positive rate = 5%)
(and you can gzip it!)
Bloom Filter
Space Complexity: O(t * ln(1/p))
where t is the maximum number of items to be
inserted and p is the accepted false positive rate
Check for key presence
Time Complexity: O(k)
where k is a constant representing the number of
hash functions to be applied to a string key
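The standard sizing formulas for a Bloom filter make the space figures easy to reproduce: m = -t * ln(p) / (ln 2)^2 bits and k = (m / t) * ln 2 hash functions. The helper names below are hypothetical, and the sketch assumes the optimal number of hash functions is used.

```python
from math import ceil, log

def bloom_size_bits(t, p):
    """Bits needed for t items at false positive rate p, with an optimal number of hashes."""
    return ceil(-t * log(p) / (log(2) ** 2))

def optimal_hash_count(m, t):
    """Number of hash functions that minimizes the false positive rate."""
    return max(1, round(m / t * log(2)))

bits = bloom_size_bits(100_000, 0.005)
size_bytes = ceil(bits / 8)
hashes = optimal_hash_count(bits, 100_000)
```

For 100,000 keys at a 0.5% false positive rate this gives roughly 138,000 bytes, close to the 137,919 bytes quoted in the example, and the size does not depend on the length of the keys.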
● Distributed Graph Database
● Elastic and linear scalability for a growing data and user base
● Support for ACID and eventual consistency
● Backed by Cassandra or HBase
● Support for global graph data analytics, reporting, and ETL through integration
with Spark, Hadoop, Giraph
● Support for geo and full text search (ElasticSearch, Lucene, Solr)
● Native integration with TinkerPop (Gremlin query language, Gremlin Server, ...)
TinkerPop
Gremlin traversals
g.V().has('CONTENT','id', 'Post:123')
.outE('IS SIMILAR TO')
.has('strength', gte(0.1)).as('e').inV().as('v')
.select('e','v')
.map{['id': it.get().v.values('id').next(),
'strength': it.get().e.values('strength').next()]}
.limit(10)
Querying similar contents (Content -[IS SIMILAR TO]-> Content)
Example output:
[{'id': 'Post:456', 'strength': 0.672},
 {'id': 'Post:789', 'strength': 0.453},
 {'id': 'Post:333', 'strength': 0.235},
 ...]
Gremlin traversals
g.V().has('PERSON','id','Person:gabrielpm@ciandt.com')
.out('IS ABOUT').has('number', 1)
.inE('IS RECOMMENDED TO').filter{ it.get().value('strength') >= 0.1}
.order().by('strength',decr).as('tc').outV().as('c')
.select('tc','c')
.map{ ['contentId': it.get().c.value('id'),
'recommendationStrength': it.get().tc.value('strength') *
getTimeDecayFactor(it.get().c.value('updatedOn')) ] }
Querying recommended contents for a person (Person -[IS ABOUT]-> Topic <-[IS RECOMMENDED TO]- Content)
Example output:
[{'contentId': 'Post:456', 'recommendationStrength': 0.672},
 {'contentId': 'Post:789', 'recommendationStrength': 0.453},
 {'contentId': 'Post:333', 'recommendationStrength': 0.235},
 ...]
Wrap up
Recommender Systems and Topic Modeling
● Improvement of user experience for greater engagement
● Users' interests segmentation
● Datasets summarization
● Personalized recommendations
http://www.smartcanvas.com/
Thanks!
Gabriel Moreira - @gspmoreira
Gilmar Souza - @gilmarsouza
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Gabriel Moreira
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsGabriel Moreira
 
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...Gabriel Moreira
 
Developing GeoGames for Education with Kinect and Android for ArcGIS Runtime
Developing GeoGames for Education with Kinect and Android for ArcGIS RuntimeDeveloping GeoGames for Education with Kinect and Android for ArcGIS Runtime
Developing GeoGames for Education with Kinect and Android for ArcGIS RuntimeGabriel Moreira
 
Dojo Imagem de Android - 19/06/2012
Dojo Imagem de Android - 19/06/2012Dojo Imagem de Android - 19/06/2012
Dojo Imagem de Android - 19/06/2012Gabriel Moreira
 
Agile Testing e outros amendoins
Agile Testing e outros amendoinsAgile Testing e outros amendoins
Agile Testing e outros amendoinsGabriel Moreira
 
ArcGIS Runtime For Android
ArcGIS Runtime For AndroidArcGIS Runtime For Android
ArcGIS Runtime For AndroidGabriel Moreira
 
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...Gabriel Moreira
 
Continuous Inspection - An effective approch towards Software Quality Product...
Continuous Inspection - An effective approch towards Software Quality Product...Continuous Inspection - An effective approch towards Software Quality Product...
Continuous Inspection - An effective approch towards Software Quality Product...Gabriel Moreira
 
An Investigation Of EXtreme Programming Practices
An Investigation Of EXtreme Programming PracticesAn Investigation Of EXtreme Programming Practices
An Investigation Of EXtreme Programming PracticesGabriel Moreira
 
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...METACOM – Uma análise de correlação entre métricas de produto e propensão à m...
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...Gabriel Moreira
 
Software Product Measurement and Analysis in a Continuous Integration Environ...
Software Product Measurement and Analysis in a Continuous Integration Environ...Software Product Measurement and Analysis in a Continuous Integration Environ...
Software Product Measurement and Analysis in a Continuous Integration Environ...Gabriel Moreira
 

More from Gabriel Moreira (19)

[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
[Phd Thesis Defense] CHAMELEON: A Deep Learning Meta-Architecture for News Re...
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019Deep Learning for Recommender Systems @ TDC SP 2019
Deep Learning for Recommender Systems @ TDC SP 2019
 
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
PAPIs LATAM 2019 - Training and deploying ML models with Kubeflow and TensorF...
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018Deep Recommender Systems - PAPIs.io LATAM 2018
Deep Recommender Systems - PAPIs.io LATAM 2018
 
CI&T Tech Summit 2017 - Machine Learning para Sistemas de Recomendação
CI&T Tech Summit 2017 - Machine Learning para Sistemas de RecomendaçãoCI&T Tech Summit 2017 - Machine Learning para Sistemas de Recomendação
CI&T Tech Summit 2017 - Machine Learning para Sistemas de Recomendação
 
Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017Feature Engineering - Getting most out of data for predictive models - TDC 2017
Feature Engineering - Getting most out of data for predictive models - TDC 2017
 
Feature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive modelsFeature Engineering - Getting most out of data for predictive models
Feature Engineering - Getting most out of data for predictive models
 
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...
Using Neural Networks and 3D sensors data to model LIBRAS gestures recognitio...
 
Developing GeoGames for Education with Kinect and Android for ArcGIS Runtime
Developing GeoGames for Education with Kinect and Android for ArcGIS RuntimeDeveloping GeoGames for Education with Kinect and Android for ArcGIS Runtime
Developing GeoGames for Education with Kinect and Android for ArcGIS Runtime
 
Dojo Imagem de Android - 19/06/2012
Dojo Imagem de Android - 19/06/2012Dojo Imagem de Android - 19/06/2012
Dojo Imagem de Android - 19/06/2012
 
Agile Testing e outros amendoins
Agile Testing e outros amendoinsAgile Testing e outros amendoins
Agile Testing e outros amendoins
 
ArcGIS Runtime For Android
ArcGIS Runtime For AndroidArcGIS Runtime For Android
ArcGIS Runtime For Android
 
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...
EARLY-FIX: Um Framework para Predição de Manutenção Corretiva de Software uti...
 
Continuous Inspection - An effective approch towards Software Quality Product...
Continuous Inspection - An effective approch towards Software Quality Product...Continuous Inspection - An effective approch towards Software Quality Product...
Continuous Inspection - An effective approch towards Software Quality Product...
 
An Investigation Of EXtreme Programming Practices
An Investigation Of EXtreme Programming PracticesAn Investigation Of EXtreme Programming Practices
An Investigation Of EXtreme Programming Practices
 
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...METACOM – Uma análise de correlação entre métricas de produto e propensão à m...
METACOM – Uma análise de correlação entre métricas de produto e propensão à m...
 
Software Product Measurement and Analysis in a Continuous Integration Environ...
Software Product Measurement and Analysis in a Continuous Integration Environ...Software Product Measurement and Analysis in a Continuous Integration Environ...
Software Product Measurement and Analysis in a Continuous Integration Environ...
 

Recently uploaded

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 

Discovering Users’ Topics of Interest in Recommender Systems

  • 1. Discovering Users’ Topics of Interest in Recommender Systems Gabriel Moreira - @gspmoreira Gilmar Souza - @gilmarsouza Track: Data Science 2016
  • 2. Agenda ● Recommender Systems ● Topic Modeling ● Case: Smart Canvas - Corporative Collaboration ● Wrap up
  • 4. Life is too short!
  • 6. Recommendations by interaction What should I watch today? What do you like, my son?
  • 7. "A lot of times, people don’t know what they want until you show it to them." Steve Jobs "We are leaving the Information Age and entering the Recommendation Age." Chris Anderson, "The Long Tail"
  • 8. Recommendations are responsible for... 38% of sales, 2/3 of movie rentals, 38% of top news views
  • 9. What else may I recommend?
  • 10. What can a Recommender System do? Prediction: given an item, what is its relevance for each user? Recommendation: given a user, produce an ordered list matching the user's needs.
  • 12. Content-Based Filtering Similar content (e.g. actor) Likes Recommends
  • 13. Content-Based Filtering Advantages ● Does not depend upon other users ● May recommend new and unpopular items ● Recommendations can be easily explained Drawbacks ● Overspecialization ● May not recommend to new users ● May be difficult to extract attributes from audio, movies or images
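The content-based idea above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the catalog, the attribute strings and the choice of Jaccard overlap as the similarity measure are all assumptions made for the example.

```python
def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def recommend_content_based(liked, catalog, top_n=2):
    """Rank unseen items by their best attribute overlap with any liked item."""
    scores = {}
    for item, attrs in catalog.items():
        if item in liked:
            continue
        # A user with no likes gets a score of 0.0 for every item.
        scores[item] = max((jaccard(attrs, catalog[l]) for l in liked), default=0.0)
    return sorted(scores, key=lambda i: scores[i], reverse=True)[:top_n]


# Hypothetical catalog: each movie is described by a set of attributes.
catalog = {
    "movie_a": {"actor:x", "genre:action"},
    "movie_b": {"actor:x", "genre:action", "director:y"},
    "movie_c": {"actor:z", "genre:drama"},
}
recommendations = recommend_content_based({"movie_a"}, catalog)
```

Only the user's own likes are needed, which is why the approach does not depend on other users. It also shows the drawbacks: a user with no likes gets no meaningful ranking, and results stay close to what was already liked.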
  • 14. User-Based Collaborative Filtering Similar interests Likes Recommends
  • 15. Item-Based Collaborative Filtering Likes Recommends Who likes A also likes B Likes Likes
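A rough sketch of the "who likes A also likes B" idea, assuming plain co-occurrence counts as the item-to-item similarity. Real systems usually normalize these counts; the user and item names are made up for the example.

```python
from collections import Counter
from itertools import permutations


def co_occurrence(likes_by_user):
    """Count, for each ordered item pair (a, b), how many users liked both."""
    counts = Counter()
    for items in likes_by_user.values():
        counts.update(permutations(sorted(items), 2))
    return counts


def also_liked(item, counts, top_n=3):
    """Items most often liked together with `item`, most frequent first."""
    related = {b: n for (a, b), n in counts.items() if a == item}
    return sorted(related, key=lambda b: (-related[b], b))[:top_n]


# Hypothetical likes: user -> set of liked items.
likes_by_user = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "B"},
    "u4": {"B"},
}
counts = co_occurrence(likes_by_user)
suggestions = also_liked("A", counts)
```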
  • 16. Collaborative Filtering Advantages ● Works for any kind of item (ignores attributes) Drawbacks ● Cannot recommend items not already rated/consumed ● Usually recommends more popular items ● Needs a minimum number of users to match similar users (cold start)
  • 17. Hybrid Recommender Systems: some approaches. Composite: iterates through a chain of algorithms, aggregating recommendations. Weighted: each algorithm has a weight, and the final recommendations are defined by weighted averages.
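The weighted approach might look like the following sketch. The algorithm names, scores and weights are hypothetical, and treating an item that an algorithm did not score as a score of zero is an assumption of this example.

```python
def weighted_hybrid(score_lists, weights, top_n=3):
    """Combine per-algorithm item scores into one ranking by weighted average."""
    total = sum(weights.values())
    combined = {}
    for name, scores in score_lists.items():
        for item, score in scores.items():
            # Items missing from an algorithm implicitly contribute 0.
            combined[item] = combined.get(item, 0.0) + weights[name] * score / total
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical scores produced by two independent recommenders.
scores = {
    "content_based": {"post_1": 0.9, "post_2": 0.2},
    "collaborative": {"post_2": 0.8, "post_3": 0.6},
}
weights = {"content_based": 0.3, "collaborative": 0.7}
ranking = weighted_hybrid(scores, weights)
```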
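  • 18. UBCF Example (Java / Mahout): User-Based Collaborative Filtering example (Mahout)
      import java.io.File;
      import java.util.List;
      import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
      import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
      import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
      import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
      import org.apache.mahout.cf.taste.model.DataModel;
      import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
      import org.apache.mahout.cf.taste.recommender.RecommendedItem;
      import org.apache.mahout.cf.taste.recommender.UserBasedRecommender;
      import org.apache.mahout.cf.taste.similarity.UserSimilarity;
      // Loads user-item ratings
      DataModel model = new FileDataModel(new File("input.csv"));
      // Defines a similarity metric to compare users (Pearson's correlation coefficient)
      UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
      // Sets the minimum similarity threshold for two users to be considered similar
      UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.1, similarity, model);
      // Creates a User-Based Collaborative Filtering recommender
      UserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
      // Returns the top 3 recommendations for userId=2
      List<RecommendedItem> recommendations = recommender.recommend(2, 3);
    input.csv
      User,Item,Rating
      1,15,4.0
      1,16,5.0
      1,17,1.0
      1,18,5.0
      2,10,1.0
      2,11,2.0
      2,15,5.0
      2,16,4.5
      2,17,1.0
      2,18,5.0
      3,11,2.5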
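To make the steps of user-based collaborative filtering explicit, here is a small pure-Python sketch of the same pipeline (Pearson similarity, a similarity threshold, then a similarity-weighted average of neighbours' ratings). It is a simplified illustration and is not guaranteed to reproduce Mahout's exact scores. The ratings are the rows of input.csv shown above. In that sample user 2 has already rated every item rated by the others, so the sketch queries user 1 instead.

```python
from math import sqrt

ratings = {  # user -> {item: rating}, copied from the input.csv sample
    1: {15: 4.0, 16: 5.0, 17: 1.0, 18: 5.0},
    2: {10: 1.0, 11: 2.0, 15: 5.0, 16: 4.5, 17: 1.0, 18: 5.0},
    3: {11: 2.5},
}


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    return cov / sqrt(var_a * var_b) if var_a and var_b else 0.0


def recommend(user, how_many, threshold=0.1):
    """Similarity-weighted average of neighbours' ratings for unseen items."""
    sums, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim < threshold:
            continue  # not similar enough to count as a neighbour
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                sums[item] = sums.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: sums[item] / weights[item] for item in sums}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)[:how_many]


top_for_user_1 = recommend(1, 3)
```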
  • 19. Frameworks - Recommender Systems: Python, Python / Scala, Java, .NET, Java
  • 20. Books
  • 22. Topic Modeling: a simple way to analyze the topics of large text collections (corpora). What is this data about? What are the main topics?
  • 23. Topic: a cluster of words that generally occur together and are related. Example: 2D topics visualization with pyLDAvis (topics are the circles on the left; the main terms of the selected topic are shown on the right)
  • 24. Topics evolution: visualization of how topics evolve over time
  • 25. Topic Modeling applications ● Detect market trends and customer challenges of an industry from media and social networks. ● Marketing SEO: optimization of search keywords to improve page ranking in search engines ● Identify users' preferences to recommend relevant content, products, jobs, …
  • 26. Topic Modeling ● Documents are composed of many topics ● Topics are composed of many words (tokens) Documents (observable) → Topics (latent) → Words / tokens (observable). An unsupervised learning technique that does not require much pre-processing effort (usually just text vectorization). In general, the only parameter is the number of topics (k).
  • 27. Topics in a document
  • 28. Topics example from NYT: topic analysis (LDA) of 1.8 million New York Times articles
  • 30. Text vectorization Represent each document as a feature vector in the vector space, where each position represents a word (token) and the contained value is its relevance in the document. ● BoW (Bag of words) ● TF-IDF (Term Frequency - Inverse Document Frequency) Document Term Matrix - Bag of Words
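As a minimal sketch of the two vectorization schemes, the snippet below builds a bag-of-words document-term matrix and TF-IDF weights by hand. The three documents are made up, tokenization is a plain whitespace split, and the TF-IDF uses the textbook form tf * log(N / df). Libraries such as scikit-learn apply smoothing and normalization by default, so their numbers will differ.

```python
from collections import Counter
from math import log

# Hypothetical mini-corpus.
docs = [
    "the cat sat with the dog",
    "the dog chased the cat",
    "gym and pool guide",
]
tokenized = [d.split() for d in docs]
vocabulary = sorted({tok for doc in tokenized for tok in doc})

# Bag of words: one row per document, one column per vocabulary token.
bow = [[Counter(doc)[tok] for tok in vocabulary] for doc in tokenized]


def tfidf_row(doc):
    """TF-IDF weights using the plain textbook form tf * log(N / df)."""
    counts = Counter(doc)
    row = []
    for tok in vocabulary:
        df = sum(1 for other in tokenized if tok in other)  # document frequency
        row.append(counts[tok] / len(doc) * log(len(docs) / df))
    return row


tfidf = [tfidf_row(doc) for doc in tokenized]
```

Tokens that appear in many documents (here "the") receive a low inverse document frequency, while tokens specific to one document (here "gym") are weighted up.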
  • 32. TF-IDF example with scikit-learn
      from sklearn.feature_extraction.text import TfidfVectorizer
      # text_corpus: an iterable of document strings
      vectorizer = TfidfVectorizer(max_df=0.5, max_features=1000, min_df=2, stop_words='english')
      tfidf_corpus = vectorizer.fit_transform(text_corpus)
    TF-IDF sparse matrix example: rows are documents (D1, D2, ...), columns are tokens indexed 0 to 9 (face, person, guide, lock, cat, dog, sleep, micro, pool, gym), and only the non-zero weights are stored (D1: 0.05, 0.25; D2: 0.02, 0.32, 0.45; ...)
  • 33. Text vectorization. Example: “Did you ever wonder how great it would be if you could write your jmeter tests in ruby ? This projects aims to do so. If you use it on your project just let me now. On the Architecture Academy you can read how jmeter can be used to validate your Architecture. definition | architecture validation | academia de arquitetura”
    Relevant keywords (TF-IDF); tokens are unigrams and bigrams:
      Token                  Weight
      jmeter                 0.466
      architecture           0.380
      validate               0.243
      validation             0.242
      definition             0.239
      write                  0.225
      academia arquitetura   0.218
      academy                0.216
      ruby                   0.213
      tests                  0.209
  • 34. Text vectorization: visualization of the average TF-IDF vectors of posts at Google+ for Work (CI&T). See more details about this social and text analytics at http://bit.ly/python4ds_nb
  • 35. Main techniques ● Latent Dirichlet Allocation (LDA) -> Probabilistic ● Latent Semantic Indexing / Analysis (LSI / LSA) -> Matrix Factorization ● Non-Negative Matrix Factorization (NMF) -> Matrix Factorization
  • 36. Topic Modeling example (Python / gensim): example of topic modeling using the LSI technique
      from gensim import corpora, models, similarities
      documents = ["Human machine interface for lab abc computer applications",
                   "A survey of user opinion of computer system response time",
                   "The EPS user interface management system", ...]
      stoplist = set('for a of the and to in'.split())
      texts = [[word for word in document.lower().split() if word not in stoplist]
               for document in documents]
      dictionary = corpora.Dictionary(texts)
      corpus = [dictionary.doc2bow(text) for text in texts]
      tfidf = models.TfidfModel(corpus)
      corpus_tfidf = tfidf[corpus]
      lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=2)
      corpus_lsi = lsi[corpus_tfidf]
      lsi.print_topics(2)
    Example of the 2 topics discovered in the corpus:
      topic #0(1.594): 0.703*"trees" + 0.538*"graph" + 0.402*"minors" + 0.187*"survey" + ...
      topic #1(1.476): 0.460*"system" + 0.373*"user" + 0.332*"eps" + 0.328*"interface" + 0.320*"response" + ...
  • 37. Topic Modeling example (PySpark MLlib LDA)
      from pyspark.mllib.clustering import LDA, LDAModel
      from pyspark.mllib.linalg import Vectors
      # Load and parse the data
      data = sc.textFile("data/mllib/sample_lda_data.txt")
      parsedData = data.map(lambda line: Vectors.dense([float(x) for x in line.strip().split(' ')]))
      # Index documents with unique IDs
      corpus = parsedData.zipWithIndex().map(lambda x: [x[1], x[0]]).cache()
      # Cluster the documents into three topics using LDA
      ldaModel = LDA.train(corpus, k=3)
    Sometimes simpler is better! Scikit-learn topic models worked better for us than the distributed Spark MLlib LDA implementation, because we have thousands of users with hundreds of posts, and running scikit-learn on workers for each person was much quicker than running distributed Spark LDA for each person.
  • 38. Cosine similarity: the similarity between two vectors is measured by the cosine of the angle between them. Example with scikit-learn:
      from sklearn.metrics.pairwise import cosine_similarity
      cosine_similarity(tfidf_matrix[0:1], tfidf_matrix)
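For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. A dependency-free sketch follows; the three small vectors are invented for the example, and returning 0.0 for an all-zero vector is a convention chosen here.

```python
from math import sqrt


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical TF-IDF vectors over a three-token vocabulary.
doc_a = [0.05, 0.25, 0.0]
doc_b = [0.10, 0.50, 0.0]   # same direction as doc_a, twice the magnitude
doc_c = [0.0, 0.0, 0.45]    # shares no tokens with doc_a

same_direction = cosine(doc_a, doc_b)
no_overlap = cosine(doc_a, doc_c)
```

Because only the angle matters, doc_a and doc_b are treated as equally "about" the same tokens despite their different magnitudes, which is convenient when documents vary in length.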
  • 39. Example of relevant keywords for a person and people with similar interests People similarity
  • 40. Frameworks - Topic Modeling Python Java Stanford Topic Modeling Toolbox (Excel plugin) http://nlp.stanford.edu/software/tmt/tmt-0.4/ Python / Scala
  • 41. Case: Smart Canvas© Corporate Collaboration http://www.smartcanvas.com/ Powered by Recommender Systems and Topic Modeling techniques
  • 42. Content recommendations Discover channel - Content recommendations with personalized explanations, based on user’s topics of interest (discovered from their contributed content and reads)
  • 43. Person topics of interest: User profile - Users' topics of interest are derived from the content they contribute and are presented as tags in their profiles.
  • 44. Searching people interested in topics / experts... Tags discovered for users are searchable, making it possible to find experts or people with specific interests.
  • 45. Similar people People recommendation - Recommends people with similar interests, explaining which topics are shared.
  • 47. Collaboration Graph Model Smart Canvas © Graph Model: Dashed lines and bold labels are the relationships inferred by usage of RecSys and Topic Modeling techniques
  • 49. Jobs Pipeline: (1) Content Vectorization, (2) People Topic Modeling, (3) Boards Topic Modeling, (4) Contents Recommendation, (5) Boards Recommendation, (6) Similar Content. The jobs load content and touchpoints, pass intermediate results to each other as pickles in Google Cloud Storage, and store topics and recommendations in the graph.
  • 50. People Topic Modeling and Content Recommendations algorithm
    1. Groups all touchpoints (user interactions) by person
    2. For each person
       a. Selects the contents the user has interacted with
       b. Clusters the contents' TF-IDF vectors (e.g. LDA, NMF, K-means) to model the person's topics of interest, varying k over a range (1-5) and selecting the model whose clusters best describe cohesive topics, penalizing large k
       c. Weights each cluster's relevance to the user by summing the touchpoint strengths (view, like, bookmark, …) of the cluster, with a time decay on interaction age (older interactions are less relevant to the person's current interests)
       d. Returns the highly relevant topic vectors for the person, each labeled by its three top keywords
    3. For each person's topic
       a. Calculates the cosine similarity (optionally applying Pivoted Unique Normalization) between the topic vector and the content TF-IDF vectors
       b. Recommends the contents most similar to the person's topics, which the user has not yet …
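Step 2c (weighting clusters by touchpoint strength with a time decay) could be sketched as below. The talk does not give the actual strengths or decay function, so the strength values, the 30-day half-life and the exponential decay are all assumptions of this example, as are the content and topic names.

```python
from collections import defaultdict

# Hypothetical strengths per interaction type and a hypothetical half-life.
STRENGTH = {"view": 1.0, "like": 2.0, "bookmark": 3.0}
HALF_LIFE_DAYS = 30.0


def time_decay(age_days):
    """Exponential decay: an interaction loses half its weight every half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def cluster_relevance(touchpoints, cluster_of):
    """Sum decayed touchpoint strengths per topic cluster, normalized to sum to 1."""
    totals = defaultdict(float)
    for content_id, kind, age_days in touchpoints:
        totals[cluster_of[content_id]] += STRENGTH[kind] * time_decay(age_days)
    grand_total = sum(totals.values())
    return {cluster: total / grand_total for cluster, total in totals.items()}


# One person's touchpoints: (content id, interaction type, age in days).
cluster_of = {"post_1": "topic_a", "post_2": "topic_a", "post_3": "topic_b"}
touchpoints = [
    ("post_1", "view", 0),
    ("post_2", "like", 30),
    ("post_3", "bookmark", 90),
]
relevance = cluster_relevance(touchpoints, cluster_of)
```

In this example the old bookmark counts for less than the two recent, weaker interactions, so topic_a ends up more relevant than topic_b.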
  • 51. Bloom Filter ● A space-efficient probabilistic data structure used to test whether an element is a member of a set. ● False positive matches are possible, but false negatives are not. ● An empty Bloom filter is a bit array of m bits, all set to 0. ● Each inserted key is mapped by k different hash functions to k bit positions, which are all set to 1. Figure: the set {x,y,z} was inserted into the Bloom filter; w is not in the set, because it hashes to at least one bit equal to 0
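A minimal Bloom filter following the description above, as a sketch only. It is not the cross_bloomfilter library mentioned in the talk. Deriving the k positions by salting SHA-256 with an index is an assumption chosen for simplicity, and the sizes m=1024 and k=3 are arbitrary.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits; each key sets (or checks) k hashed positions."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # all bits start at 0

    def _positions(self, key):
        # Simulate k hash functions by salting one hash with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


bf = BloomFilter(m=1024, k=3)
for key in ("x", "y", "z"):
    bf.add(key)
```

A lookup such as `"x" in bf` always returns True for inserted keys. For a key that was never inserted it returns False unless all k of its positions happen to be set already, which is the false positive case.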
  • 52. Bloom Filter - Example: using cross_bloomfilter, a pure Python and Java-compatible implementation. GitHub: http://bit.ly/cross_bf
  • 53. Bloom Filter - Complexity Analysis
    Ordered List
      Space Complexity: O(n), where n is the number of items in the set
      Check for key presence, Time Complexity (binary search): O(log(n)), assuming an ordered list
    Bloom Filter
      Space Complexity: O(t * ln(1/p)), where t is the maximum number of items to be inserted and p is the accepted false positive rate
      Check for key presence, Time Complexity: O(k), where k is a constant representing the number of hash functions to be applied to a string key
    Example: for a list of 100,000 string keys, with an average length of 18 characters
      Key list size: 1,788,900 bytes
      Bloom filter size: 137,919 bytes (false positive rate = 0.5%)
      Bloom filter size: 18,043 bytes (false positive rate = 5%)
      (and you can gzip it!)
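The space bound can be made concrete with the standard textbook sizing formulas for a Bloom filter, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The sketch below assumes those formulas; the library used in the talk may size its filters slightly differently.

```python
from math import ceil, log


def bloom_parameters(n_items, fp_rate):
    """Textbook optimum: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes."""
    m_bits = ceil(-n_items * log(fp_rate) / (log(2) ** 2))
    k_hashes = max(1, round(m_bits / n_items * log(2)))
    return m_bits, k_hashes


# 100,000 keys at a 0.5% accepted false positive rate.
m_bits, k_hashes = bloom_parameters(100_000, 0.005)
size_bytes = ceil(m_bits / 8)
```

For these inputs the formula gives a filter of roughly 138 KB, close to the 0.5% figure quoted above, and the size does not depend on how long the keys are.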
  • 54. ● Distributed Graph Database ● Elastic and linear scalability for a growing data and user base ● Support for ACID and eventual consistency ● Backed by Cassandra or HBase ● Support for global graph data analytics, reporting, and ETL through integration with Spark, Hadoop, Giraph ● Support for geo and full text search (ElasticSearch, Lucene, Solr) ● Native integration with TinkerPop (Gremlin query language, Gremlin Server, ...) TinkerPop
  • 55. Gremlin traversals: querying similar contents (graph pattern: Content -IS SIMILAR TO-> Content)
      g.V().has('CONTENT','id', 'Post:123')
        .outE('IS SIMILAR TO')
        .has('strength', gte(0.1)).as('e').inV().as('v')
        .select('e','v')
        .map{['id': it.get().v.values('id').next(),
              'strength': it.get().e.values('strength').next()]}
        .limit(10)
    Example output:
      [{'id': 'Post:456', 'strength': 0.672}, {'id': 'Post:789', 'strength': 0.453}, {'id': 'Post:333', 'strength': 0.235}, ...]
  • 56. Gremlin traversals: querying recommended contents for a person (graph pattern: Person -IS ABOUT-> Topic <-IS RECOMMENDED TO- Content)
      g.V().has('PERSON','id','Person:gabrielpm@ciandt.com')
        .out('IS ABOUT').has('number', 1)
        .inE('IS RECOMMENDED TO').filter{ it.get().value('strength') >= 0.1}
        .order().by('strength',decr).as('tc').outV().as('c')
        .select('tc','c')
        .map{ ['contentId': it.get().c.value('id'),
               'recommendationStrength': it.get().tc.value('strength') *
                                         getTimeDecayFactor(it.get().c.value('updatedOn')) ] }
    Example output:
      [{'contentId': 'Post:456', 'recommendationStrength': 0.672}, {'contentId': 'Post:789', 'recommendationStrength': 0.453}, {'contentId': 'Post:333', 'recommendationStrength': 0.235}, ...]
  • 57. Wrap up: Recommender Systems and Topic Modeling ● Improvement of user experience for greater engagement ● Users’ interests segmentation ● Datasets summarization ● Personalized recommendations
  • 59. Thanks! Gabriel Moreira - @gspmoreira Gilmar Souza - @gilmarsouza Discovering Users’ Topics of Interest in Recommender Systems