Location Embeddings for Next Trip
Recommendation
Amine Dadoun, Raphael Troncy,
Riccardo Petitti, Olivier Ratier
LocWeb19,
13 May 2019
LocWeb 2019 … Why?
Location, User
Web, Social Media Recommendation, Travel
2
Travel … A great source of inspiration
John Doe
“I do not
know where
to go”
“Try this”
3
Use Case Description
Given a traveler, his demographics, his historical bookings and the
contextual data related to these bookings, we recommend him a
ranked list of destinations he would like to go to.
Traveler's Demographic Data
43 years old, Malaysian, Male, Nature, Museums
Time
Contextual Data
14/09/2016, Wednesday, 2 Days, Alone, etc.
21/12/2016, Friday, 14 Days, 4 persons in party, etc.
07/06/2017, Saturday, 10 Days, 2 persons in party, etc.
15/01/2017, Sunday, 5 Days, Alone, etc.
09/09/2018, Sunday, 4 Days, Alone, etc.
?
+
4
Scientific Problems
Given historical purchases made by a user (or user-item past interactions), plus the
context where the interaction was made, how can we accurately predict what will
be the next item the user is going to interact with?
Research Questions
1. What item to recommend to the user?
2. Can we integrate external data to improve the accuracy of a predictive model?
3. How can we evaluate the recommendation made to this user?
5
DKFM (our approach):
It combines Factorization Machines, which represent contextual information, with the Wide & Deep Learning (WDL) recommender system, which captures the user-item interactions and the content information. The combination of these two models is represented in a deep neural network (DNN).
6
State of the Art
Recommender System
• Collaborative Filtering [1, 2, 3]: Implicit MF, Bayesian Personalized MF, Neural Collaborative Filtering
• Content-based Filtering [4]: Item KNN
• Hybrid Method [5]: Wide & Deep Learning
• Context-aware Recommender System [6, 7]: Factorization Machines, Neural Factorization Machines
• Knowledge-aware Recommender System [8]: Deep Knowledge Factorization Machines
Collaborative Filtering:
These are matrix factorization (MF) methods based only on the user-item interactions. They differ either in the loss used for training or in the interaction function that computes the recommendation probability.
Content-based Filtering:
Item KNN is a neighborhood-based collaborative filtering method that computes the k nearest neighbors of each item.
Hybrid Method:
Wide & Deep Learning (WDL) is a DNN model that computes the probability of a user-item pair from both the user-item interactions and the content of the item.
Context-aware Recommender System:
These two methods are based on the factorization machines algorithm, which takes into account the context of the recommendation in addition to the user-item interactions.
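As a rough illustration of the factorization machine model that the context-aware methods build on (Rendle, 2010), the sketch below scores one feature vector as a bias, a linear term and pairwise interactions weighted by dot products of latent factors. The function names, the toy dimensions and the random parameters are assumptions made for illustration; they are not taken from the DKFM implementation.

```python
import numpy as np

def fm_score(x, w0, w, V):
    """Second-order factorization machine score for one feature vector x.

    x: (n,) features, w0: scalar bias, w: (n,) linear weights,
    V: (n, k) latent factors, one k-dimensional vector per feature.
    """
    linear = w0 + w @ x
    # Pairwise term sum_{i<j} <V_i, V_j> x_i x_j, computed in O(n k) as
    # 0.5 * sum_f [ (sum_i V_if x_i)^2 - sum_i V_if^2 x_i^2 ]
    xv = x @ V
    pairwise = 0.5 * float(np.sum(xv ** 2 - (x ** 2) @ (V ** 2)))
    return float(linear) + pairwise

def fm_score_naive(x, w0, w, V):
    """Same score with an explicit double loop, used only as a check."""
    n = len(x)
    score = w0 + float(w @ x)
    for i in range(n):
        for j in range(i + 1, n):
            score += float(V[i] @ V[j]) * x[i] * x[j]
    return score

rng = np.random.default_rng(0)
n_features, k = 6, 3  # toy sizes (assumed)
# Hypothetical encoding: one-hot user and item followed by two context features.
x = np.array([1.0, 0.0, 1.0, 0.0, 0.5, 2.0])
w0 = 0.1
w = rng.normal(size=n_features)
V = rng.normal(size=(n_features, k))

fast = fm_score(x, w0, w, V)
slow = fm_score_naive(x, w0, w, V)
```

Both functions return the same value; the first avoids the quadratic loop over feature pairs, which matters when user, item and context are one-hot encoded into a long sparse vector.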
Our Model
SOTA & baselines
Recommender
Systems
7
Data integration to enrich the representation of destination
User-Item Interactions: user $u_1$; items $i_1$, $i_2$, $i_3$, ...
User’s Demographics: Age, Nationality, Gender, etc.
Interaction Information: Date, Session behavior, etc.
Content Information (item description):
• Text
• Knowledge Graph
• Etc.
8
Contribution: Deep Knowledge Factorization Machines (DKFM)
Deep Neural Network:
• Collaborative information
• Content information
• Contextual information
User-Item Interactions: user $u_1$; items $i_1$, $i_2$, $i_3$, ...
Content Information (item description):
• Text
• Knowledge Graph
• Etc.
User’s Demographics: Age, Nationality, Gender, etc.
Interaction Information: Date, Session behavior, etc.
9
Back to our problem … Next Trip Destination
Traveler's Demographic Data
43 years old, Malaysian, Male, Nature, Museums
Historical Bookings with contextual information:
• 14/09/2016, Wednesday, 2 Days, Alone
• 21/12/2016, Friday, 14 Days, 4 persons in party
• 07/06/2017, Saturday, 10 Days, 2 persons in party
• 09/09/2018, Sunday, 4 Days, Alone
Next Trip Recommendation: ?
10
Traveller's Profiles Data
• Real Traveler’s Data
• Number of Profiles: ~20M
• Number of Trips: ~15M
• Trip Type: One-way, Round-Trip, Multiple Journeys Trip
• Time range: February 2013 - October 2019
• Number of Destinations: 1146
Trip
• Booking Creation Date
• Stay Duration
• Origin Airport
• Origin City
• Origin Country
• Origin Region
• Destination Airport
• Destination City
• Destination Country
• Destination Region
• Departure Date
• Departure Day of the Week
• Arrival Date
• Advanced Purchase
• Advanced Check-in
• Trip Number in Party
Customer (Traveller)
• Age
• Customer Value
• Days to Next Bday
• Days to Next Flight
• Nationality
• Gender
• Last Booking Date
• Last Flown Date
Services
• Type of Services
• Service Code
Data Pre-processing Pipeline
• Trips
• Traveler
demographics
Remove Travelers with fewer than 5 Trips
• Remove Travelers with fewer than 5 different Trips
• Remove Destinations visited fewer than 20 times
Only 32% of the trips left
Only 4% of the trips left
Business / Leisure
Only 2% of the trips left
Number of Travelers 26K/20M (0.13%)
Number of Trips 300K/15M (2.1%)
Number of Destinations 119/1146 (10%)
Travelers Segmentation
11
12
Data Pre-processing: Data Filtering for Recommendation
• Remove Travelers with fewer than 5 Trips (Different Destinations)
• Remove Destinations that are visited fewer than 20 Times
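The two filtering rules can be sketched as follows. This is a minimal illustration, assuming trips are given as (traveler, destination) pairs and that the filters are applied once, travelers first and destinations second; the slides do not state the order or whether the filters are repeated until stable. The function name `filter_trips` and the toy data are hypothetical.

```python
from collections import Counter

def filter_trips(trips, min_destinations=5, min_visits=20):
    """trips: list of (traveler_id, destination) pairs, one per booking."""
    # Keep travelers who visited at least `min_destinations` distinct destinations.
    distinct = {}
    for traveler, dest in trips:
        distinct.setdefault(traveler, set()).add(dest)
    kept = [(u, d) for u, d in trips if len(distinct[u]) >= min_destinations]
    # Keep destinations visited at least `min_visits` times in what remains.
    visits = Counter(dest for _, dest in kept)
    return [(u, d) for u, d in kept if visits[d] >= min_visits]

# Toy data with lowered thresholds so the effect is visible.
toy_trips = [
    ("t1", "KUL"), ("t1", "SYD"),
    ("t2", "KUL"), ("t2", "SYD"), ("t2", "LON"),
    ("t3", "KUL"), ("t3", "KUL"),
]
filtered = filter_trips(toy_trips, min_destinations=2, min_visits=2)
```

In the toy run, traveler `t3` is dropped (a single distinct destination) and `LON` is dropped (a single remaining visit).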
R =
            Kuala Lumpur  Sydney  London  New York  Paris
Traveler 1  8             2       1       0         0
Traveler 2  4             0       1       0         1
Traveler 3  2             2       2       1         0
Traveler 4  4             0       0       0         2
Traveler 5  1             0       2       0         3
• Number of Trips: ~4.8M bookings
• Number of Travelers: 814 919
• Number of Destinations: 763
• Sparsity is defined as follows: $\rho(R) = 1 - \frac{\#\text{Interactions}}{\#\text{Users} \times \#\text{Items}}$
#Feedbacks  #Interactions  #Cities  #Travelers  Sparsity
610 515     361 412        135      31 205      92%
• 𝜌𝜌(Leisure_Trips) = 99.8%: Too sparse to build a Recommender System
• More than 65% of travelers have traveled only 2 times
• Interaction Matrix: $R \in \mathbb{N}^{\#\text{Travelers} \times \#\text{Destinations}}$, with $r_{ui} = \#\text{Travels to Destination } i \text{ from Traveler } u$
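To make the definitions concrete, the sketch below builds the toy 5 x 5 interaction matrix shown on this slide and computes its sparsity. It assumes that "#Feedbacks" counts all trips and "#Interactions" counts distinct traveler-destination pairs (non-zero cells); the slides do not define these two columns explicitly.

```python
import numpy as np

destinations = ["Kuala Lumpur", "Sydney", "London", "New York", "Paris"]
# Rows are Traveler 1..5, columns follow `destinations`.
R = np.array([
    [8, 2, 1, 0, 0],
    [4, 0, 1, 0, 1],
    [2, 2, 2, 1, 0],
    [4, 0, 0, 0, 2],
    [1, 0, 2, 0, 3],
])

def sparsity(R):
    """1 - (#non-zero interactions) / (#users * #items)."""
    n_users, n_items = R.shape
    return 1.0 - np.count_nonzero(R) / (n_users * n_items)

n_feedbacks = int(R.sum())                 # total number of trips (assumed meaning)
n_interactions = int(np.count_nonzero(R))  # distinct traveler-destination pairs
rho = sparsity(R)
```

The toy matrix is far denser than the real data: 15 of its 25 cells are non-zero, so its sparsity is 0.4.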
13
Data Pre-processing: Customer Segmentation
CEM Trips
Business Leisure
Historical Trips already
labeled B/L
Training
B/L Classifier
Prediction
Trips Data: 122 242 trips
Features used:
• Number of Passengers, Stay Duration, Saturday Stay, Purchase Anticipation, Age, Gender
Time Range:
• Feb 2014 - Feb 2017
Distribution:
• 40% / 60% Business / Leisure
Training
Random Forest Classifier
Grid Search on Training Data
5-fold cross-validation for evaluation, with a 75% / 25% training / test split
Accuracy = 0.87, Precision = 0.87, Recall = 0.91
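The three reported scores follow the standard definitions for a binary classifier. A minimal sketch, assuming Leisure ("L") is treated as the positive class (the slides do not say which class is positive) and using made-up labels rather than the real trips:

```python
def binary_metrics(y_true, y_pred, positive="L"):
    """Accuracy, precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical labels: B = business trip, L = leisure trip.
y_true = ["B", "B", "L", "L", "L", "L"]
y_pred = ["B", "L", "L", "L", "L", "B"]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
```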
Features Importance
#Feedbacks  #Interactions  #Cities  #Travelers  Sparsity
304 019     152 547        119      26 019      95%
14
Data Enrichment using Word Embeddings
Phuket
Adelaide
London
Etc.
Cities
…
Wikipedia Cities Content
1. Compute the TF-IDF of each word
the
a
Etc.
Pre-trained
Word Vectors [8]
2. London Textual Embedding: weighted sum of word vectors, where the weight of each word vector is the term frequency-inverse document frequency (TF-IDF) of the word
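The two steps can be sketched as below. The TF-IDF variant (term frequency normalised by document length, inverse document frequency as log(N / df)), the tiny documents and the 2-dimensional word vectors are all assumptions for illustration; the talk uses Wikipedia pages and pre-trained word vectors [8].

```python
import math
import numpy as np

def tfidf_weights(docs):
    """docs: {name: list of tokens}. Returns {name: {word: tf-idf weight}}."""
    n_docs = len(docs)
    df = {}
    for tokens in docs.values():
        for word in set(tokens):
            df[word] = df.get(word, 0) + 1
    weights = {}
    for name, tokens in docs.items():
        counts = {}
        for word in tokens:
            counts[word] = counts.get(word, 0) + 1
        weights[name] = {
            word: (c / len(tokens)) * math.log(n_docs / df[word])
            for word, c in counts.items()
        }
    return weights

def text_embedding(word_weights, word_vectors):
    """Weighted sum of word vectors; words without a vector are skipped."""
    dim = len(next(iter(word_vectors.values())))
    emb = np.zeros(dim)
    for word, weight in word_weights.items():
        if word in word_vectors:
            emb += weight * np.asarray(word_vectors[word])
    return emb

# Made-up documents and word vectors.
docs = {
    "London": ["the", "capital", "museum", "the", "river"],
    "Phuket": ["the", "beach", "island", "the", "beach"],
}
word_vectors = {
    "the": [1.0, 1.0], "capital": [1.0, 0.0], "museum": [0.8, 0.2],
    "river": [0.5, 0.5], "beach": [0.0, 1.0], "island": [0.1, 0.9],
}
weights = tfidf_weights(docs)
london = text_embedding(weights["London"], word_vectors)
```

Words that occur in every document, such as "the", get a zero weight and therefore do not influence the city embedding.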
15
Data Enrichment using Knowledge Graph Embeddings
Knowledge Graph Embeddings (KGE)
Phuket
Adelaide
London
Etc.
Cities
TransE Model [9]:
Given a triple (h, r, t) in the graph, the idea is to minimize the distance between h + r and t, where h, r and t are the embeddings of the head entity, the relation and the tail entity
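In TransE [9], a triple (h, r, t) is scored by the distance between h + r and t, and training pushes true triples to a smaller distance than corrupted ones by a margin. The sketch below shows that score and the margin loss on made-up 2-dimensional embeddings; the relation name `located_in` and all vector values are hypothetical and not taken from the Semantic Trails graph.

```python
import numpy as np

def transe_distance(h, r, t):
    """TransE dissimilarity ||h + r - t||_2: small when t is close to h + r."""
    return float(np.linalg.norm(h + r - t))

def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss for one true triple and one corrupted triple."""
    return max(0.0, margin + transe_distance(*pos) - transe_distance(*neg))

# Toy embeddings (made up).
venue = np.array([0.2, 0.1])
located_in = np.array([0.5, 0.4])
phuket = np.array([0.7, 0.5])
adelaide = np.array([-0.6, 0.9])

d_true = transe_distance(venue, located_in, phuket)       # true tail
d_corrupt = transe_distance(venue, located_in, adelaide)  # corrupted tail
loss = margin_loss((venue, located_in, phuket), (venue, located_in, adelaide))
```

Here the true triple is already separated from the corrupted one by more than the margin, so the loss is zero and no update would be needed.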
KGE_Phuket
KGE_Adelaide
KGE_London
Etc.
KGE Cities
Knowledge Graph
Embedding of Phuket
Semantic Trails Knowledge Graph:
The knowledge graph represents user-venue interactions through the property ’visiting’, as well as the relations between the venue and other entities, namely category, schema and city
https://arxiv.org/abs/1812.04367
16
Deep Knowledge Factorization Machines
Deep Neural Network:
• Collaborative information
• Content information
• Contextual information
Semantic Trails Knowledge Graph
• What characterizes a city the most?
• An embedding of each city is constructed based on the TransE model
• TransE Model: Given a triple (h, r, t) in the graph, the idea is to minimize the distance between h + r and t, where h, r and t are the embeddings of the head entity, the relation and the tail entity
Wikipedia
• Representation of cities based on their textual description
in Wikipedia
• Each Wikipedia Document is encoded as a weighted sum of
word vectors
• We used pre-trained word vectors from fastText (character n-gram model)
• The n-gram model is similar to the Skip-gram model, but instead of learning a single vector per word, it learns a vector for each character n-gram; the vector of a word is the sum of the vectors of its n-grams.
• Weights of the word vectors are their TF-IDF scores
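To illustrate the character n-gram idea behind fastText, the sketch below lists the n-grams of a word after wrapping it in boundary markers. The helper name `char_ngrams` is hypothetical and the default range of 3 to 6 characters is an assumption; the pre-trained vectors used in the talk may have been built with other settings.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams

trigrams = char_ngrams("where", n_min=3, n_max=3)
# trigrams == ["<wh", "whe", "her", "ere", "re>"]
```

Because a word vector is assembled from its n-gram vectors, rare or unseen city names still receive a representation from the n-grams they share with other words.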
Travelers' Profiles & Trips
External Data
Leave-one-out protocol: for each user, we hold out the last destination they went to and use it as the test set
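A minimal sketch of such a split, assuming each trip carries a sortable date and that travelers with a single trip are skipped (an assumption, since nothing would remain for training). The function name and the date-destination pairs are made up.

```python
def leave_one_out(trips_by_user):
    """trips_by_user: {user: [(iso_date, destination), ...]}.

    Returns (train, test): train maps each user to their earlier
    destinations in time order, test maps each user to the last one.
    """
    train, test = {}, {}
    for user, trips in trips_by_user.items():
        ordered = sorted(trips)  # ISO dates sort chronologically
        if len(ordered) < 2:
            continue  # nothing left to train on
        train[user] = [dest for _, dest in ordered[:-1]]
        test[user] = ordered[-1][1]
    return train, test

trips_by_user = {
    "u1": [("2018-09-09", "Osaka"), ("2016-09-14", "Phuket"), ("2017-06-07", "Adelaide")],
    "u2": [("2017-01-15", "Brunei")],
}
train, test = leave_one_out(trips_by_user)
```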
17
Training Procedure and Evaluation Protocol
Time
Training Data
Test Data
Recommender
System
Non Existing
Traveler-Destination
pair
Recommender
System trained
Ranked list
of Destinations
Prediction
…
Example destinations in the ranked list: Adelaide, Osaka, Phuket, Brunei
Evaluation metrics:
• Hitrate@K [3]
• MRR@K [7]
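The two ranking metrics can be sketched as follows, using their usual definitions: Hitrate@K is the fraction of test users whose held-out destination appears in the top K, and MRR@K averages the reciprocal rank of that destination (zero when it is outside the top K). The ranked lists and targets below are made up for illustration.

```python
def hit_rate_at_k(ranked_lists, targets, k):
    """Share of users whose held-out item is within the top k."""
    hits = sum(target in ranked[:k] for ranked, target in zip(ranked_lists, targets))
    return hits / len(targets)

def mrr_at_k(ranked_lists, targets, k):
    """Mean reciprocal rank of the held-out item, 0 if outside the top k."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top = ranked[:k]
        if target in top:
            total += 1.0 / (top.index(target) + 1)
    return total / len(targets)

ranked_lists = [
    ["Adelaide", "Osaka", "Phuket", "Brunei"],  # user 1: held-out Phuket is at rank 3
    ["Osaka", "Brunei", "Adelaide", "Phuket"],  # user 2: held-out Osaka is at rank 1
]
targets = ["Phuket", "Osaka"]
hr_at_2 = hit_rate_at_k(ranked_lists, targets, k=2)
mrr_at_4 = mrr_at_k(ranked_lists, targets, k=4)
```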
18
Results: DKFM vs Baselines
Our Model
19
DKFM: what is the contribution of each input data?
Better
Deep Neural Network + Data Enrichment => Best results
Input Contribution (inputs: Demographics Data, Textual Embedding, Knowledge Graph Embedding)
HR@10  MRR@10
0.72   0.34
0.79   0.37
0.80   0.38
0.82   0.38
0.84   0.41
0.85   0.42
0.88   0.44
20
Conclusion and Future Work
Future Work
• Enrich cities’ characteristics using visual embeddings
• Explore other loss functions such as pairwise loss
• Explore the use of similarity measure inside the DNN such as cosine similarity
Conclusions
• Combining different types of input remarkably improves recommendation results
• DKFM model outperforms state-of-the-art collaborative filtering methods
Open Science
• DKFM implementation available at
https://gitlab.eurecom.fr/amadeus/DKFM-recommendation
21
References
[1] Badrul Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms.
[2] Y. Hu, Y. Koren, and C. Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets.
[3] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback.
[4] Steffen Rendle. 2010. Factorization Machines.
[5] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. 2016. Wide & Deep Learning for Recommender Systems.
[6] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering.
[7] Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction.
[8] Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, and Armand Joulin. 2018. Advances in Pre-Training Distributed Word Representations.
[9] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data.

AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

Location Embeddings for Next Trip Recommendation

  • 1. Location Embeddings for Next Trip Recommendation Amine Dadoun, Raphael Troncy, Riccardo Petitti, Olivier Ratier LocWeb19, 13 May 2019
  • 2. LocWeb 2019 … Why? Location, User. Web, Social Media, Recommendation, Travel.
  • 3. Travel … A great source of inspiration. John Doe: “I do not know where to go.” “Try this.”
  • 4. Use Case Description: Given a traveler, his demographics, his historical bookings, and the contextual data related to these bookings, we recommend a ranked list of destinations he would like to visit. Traveler's Demographic Data: 43 years old, Malaysian, Male, Nature, Museums. Contextual Data (over time): 14/09/2016, Wednesday, 2 Days, Alone, etc.; 21/12/2016, Friday, 14 Days, 4 persons in party, etc.; 07/06/2017, Saturday, 10 Days, 2 persons in party, etc.; 15/01/2017, Sunday, 5 Days, Alone, etc.; 09/09/2018, Sunday, 4 Days, Alone, etc.; next trip: ?
  • 5. Scientific Problems: Given historical purchases made by a user (or past user-item interactions), plus the context in which each interaction was made, how can we accurately predict the next item the user is going to interact with? Research Questions: (1) What item to recommend to the user? (2) Can we integrate external data to improve the accuracy of a predictive model? (3) How can we evaluate the recommendation made to this user?
  • 6. State of the Art Recommender Systems (SOTA & baselines, and our model)
      • Collaborative Filtering [1, 2, 3]: Implicit MF, Bayesian Personalized MF, Neural Collaborative Filtering. These are matrix factorization methods based only on user-item interactions. They differ either in the loss used for training or in the interaction function that computes the recommendation probability.
      • Content-based Filtering [4]: Item KNN. Item KNN is a neighborhood-based collaborative filtering method that computes the k nearest neighbors of each item.
      • Hybrid Method [5]: Wide & Deep Learning (WDL). WDL is a DNN model that computes the probability of a user-item pair based on both the user-item interactions and the content of the item.
      • Context-aware Recommender System [6, 7]: Factorization Machines, Neural Factorization Machines. Both methods are based on the factorization machines algorithm, which takes into account the context of the recommendation in addition to the user-item interaction.
      • Knowledge-aware Recommender System [8]: Deep Knowledge Factorization Machines (DKFM, our approach). It combines Factorization Machines, to represent contextual information, with the WDL recommender system, to capture the user-item interactions and the content information. The combination of these two models is represented in a DNN.
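For reference, the context-aware baselines on slide 6 build on the second-order factorization machine of Rendle (2010). The slides do not show the formula; the block below is the standard formulation from that paper, where $\mathbf{x} \in \mathbb{R}^n$ is the feature vector (user, item, and context features), $w_0$ and $w_i$ are the bias and linear weights, and $\mathbf{v}_i \in \mathbb{R}^k$ is the latent vector of feature $i$. How DKFM embeds this term inside its DNN is not specified here.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
```

The pairwise term is what lets a factorization machine model interactions between, for example, a destination and a departure weekday, even when that exact pair is rare in the training data.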
  • 7. Data integration to enrich the representation of destinations. User-Item Interactions: user $u_1$ with items $i_1, i_2, i_3, \ldots$ User’s Demographics: age, nationality, gender, etc. Interaction Information: date, session behavior, etc. Content Information (item description): text, knowledge graph, etc.
  • 8. Contribution: Deep Knowledge Factorization Machines (DKFM). A Deep Neural Network that combines collaborative information, content information, and contextual information. User-Item Interactions: user $u_1$ with items $i_1, i_2, i_3, \ldots$ Content Information (item description): text, knowledge graph, etc. User’s Demographics: age, nationality, gender, etc. Interaction Information: date, session behavior, etc.
  • 9. Back to our problem … Next Trip Destination. Traveler's Demographic Data: 43 years old, Malaysian, Male, Nature, Museums. Historical Bookings with contextual information: 14/09/2016, Wednesday, 2 Days, Alone; 21/12/2016, Friday, 14 Days, 4 persons in party; 07/06/2017, Saturday, 10 Days, 2 persons in party; 09/09/2018, Sunday, 4 Days, Alone; next trip: ? Next Trip Recommendation.
  • 10. Traveller's Profiles Data
      • Real traveler data
      • Number of Profiles: ~20M
      • Number of Trips: ~15M
      • Trip Type: One-way, Round-Trip, Multiple Journeys
      • Time range: February 2013 - October 2019
      • Number of Destinations: 1146
      • Trip: Booking Creation Date, Stay Duration, Origin Airport, Origin City, Origin Country, Origin Region, Destination Airport, Destination City, Destination Country, Destination Region, Departure Date, Departure Day of the Week, Arrival Date, Advanced Purchase, Advanced Check-in, Trip Number in Party
      • Customer: Age, Customer Value, Days to Next Bday, Days to Next Flight, Nationality, Gender, Last Booking Date, Last Flown Date
      • Trip Services: Type of Services, Service Code
  • 11. Data Pre-processing Pipeline
      • Inputs: Trips, Traveler demographics
      • Steps: remove travelers with fewer than 5 trips; travelers segmentation (Business / Leisure); remove travelers with fewer than 5 different trips; remove destinations visited fewer than 20 times
      • Share of trips remaining along the pipeline: only 32% of the trips left, then only 4%, then only 2%
      • Number of Travelers: 26K/20M (0.13%)
      • Number of Trips: 300K/15M (2.1%)
      • Number of Destinations: 119/1146 (10%)
  • 12. Data Pre-processing: Data Filtering for Recommendation
      • Interaction Matrix: $R \in \mathbb{N}^{\#\text{Travelers} \times \#\text{Destinations}}$, with $r_{ui} = \#\text{Trips to destination } i \text{ from traveler } u$. Example:

                     Kuala Lumpur   Sydney   London   New York   Paris
        Traveler 1        8           2        1         0         0
        Traveler 2        4           0        1         0         1
        Traveler 3        2           2        2         1         0
        Traveler 4        4           0        0         0         2
        Traveler 5        1           0        2         0         3

      • Sparsity is defined as follows: $\rho(R) = 1 - \dfrac{\#\text{Interactions}}{\#\text{Users} \times \#\text{Items}}$
      • Number of Trips: ~4.8 M bookings; Number of Travelers: 814 919; Number of Destinations: 763
      • $\rho(\text{Leisure\_Trips}) = 99.8\%$: too sparse to build a Recommender System
      • More than 65% of travelers have traveled only 2 times
      • Remove travelers with fewer than 5 trips (different destinations)
      • Remove destinations that are visited fewer than 20 times

        #Feedbacks   #Interactions   #Cities   #Travelers   Sparsity
        610 515      361 412         135       31 205       92%
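The sparsity definition on slide 12 can be checked by hand on the toy matrix R from the same slide. The sketch below is a minimal illustration, assuming that "#Interactions" counts the non-zero traveler-destination cells and "#Feedbacks" sums the trip counts; the slides do not spell out either definition, and the function names are hypothetical.

```python
def sparsity(R):
    """Return 1 - (#non-zero cells) / (#rows * #columns) for a dense count matrix."""
    n_users = len(R)
    n_items = len(R[0])
    # Assumption: an "interaction" is any traveler-destination cell with at least one trip.
    n_interactions = sum(1 for row in R for r in row if r > 0)
    return 1 - n_interactions / (n_users * n_items)


def feedbacks(R):
    """Total number of trips recorded in the matrix (assumed meaning of #Feedbacks)."""
    return sum(sum(row) for row in R)


# Toy matrix from slide 12: rows are Travelers 1-5,
# columns are Kuala Lumpur, Sydney, London, New York, Paris.
R = [
    [8, 2, 1, 0, 0],
    [4, 0, 1, 0, 1],
    [2, 2, 2, 1, 0],
    [4, 0, 0, 0, 2],
    [1, 0, 2, 0, 3],
]

rho = sparsity(R)          # 15 non-zero cells out of 25
total_trips = feedbacks(R)  # sum of all trip counts
```

On this toy matrix 15 of the 25 cells are non-zero, so the sparsity is 0.4, far denser than the real data reported on the slide.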
  • 13. Data Pre-processing: Customer Segmentation
      • Diagram: historical trips already labeled Business/Leisure (B/L) → training → B/L classifier → prediction on CEM trips (Business / Leisure)
      • Trips Data: 122 242 trips
      • Features used: Number of Passengers, Stay Duration, Saturday Stay, Purchase Anticipation, Age, Gender
      • Time Range: Feb 2014 - Feb 2017
      • Distribution: 40-60% B/L
      • Training: Random Forest Classifier; grid search on training data; 5-fold cross-validation for evaluation with a 75-25% training/test split
      • Accuracy = 0.87, Precision = 0.87, Recall = 0.91
      • Features Importance (figure)

        #Feedbacks   #Interactions   #Cities   #Travelers   Sparsity
        304 019      152 547         119       26 019       95%
  • 14. Data Enrichment using Word Embeddings. Cities: Phuket, Adelaide, London, etc. Source: Wikipedia Cities Content. (1) Compute the TF-IDF of each word, and look up its pre-trained word vector [8] (“the”, “a”, etc.). (2) London Textual Embedding: the weighted sum of word vectors, where the weight of each word vector is the term frequency-inverse document frequency (TF-IDF) of the word.
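The two steps on slide 14 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the TF-IDF variant (relative term frequency times log(N/df)) is one common choice that the slides do not specify, the tokenized "documents" and the 2-dimensional word vectors are made-up toy data standing in for Wikipedia text and pre-trained vectors, and all function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """docs maps a city name to its list of tokens; returns city -> {word: tf-idf}."""
    n_docs = len(docs)
    # Document frequency: in how many city documents each word appears.
    df = Counter(w for tokens in docs.values() for w in set(tokens))
    weights = {}
    for city, tokens in docs.items():
        tf = Counter(tokens)
        weights[city] = {
            w: (count / len(tokens)) * math.log(n_docs / df[w])
            for w, count in tf.items()
        }
    return weights


def text_embedding(word_weights, word_vectors, dim):
    """Weighted sum of word vectors; words without a vector are skipped."""
    emb = [0.0] * dim
    for word, weight in word_weights.items():
        vec = word_vectors.get(word)
        if vec is None:
            continue
        for k in range(dim):
            emb[k] += weight * vec[k]
    return emb


# Toy data (hypothetical): two tiny "Wikipedia" documents and 2-d word vectors.
docs = {
    "London": ["the", "thames", "museum", "the"],
    "Phuket": ["the", "beach", "island"],
}
word_vectors = {
    "the": [1.0, 1.0],
    "thames": [1.0, 0.0],
    "museum": [0.0, 1.0],
    "beach": [-1.0, 0.0],
    "island": [0.0, -1.0],
}

weights = tfidf_weights(docs)
london = text_embedding(weights["London"], word_vectors, dim=2)
phuket = text_embedding(weights["Phuket"], word_vectors, dim=2)
```

Because “the” occurs in every document, its inverse document frequency is log(1) = 0 and it contributes nothing to either city embedding, which is the reason for weighting by TF-IDF rather than taking a plain average.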
  • 15. Data Enrichment using Knowledge Graph Embeddings (KGE). Cities: Phuket, Adelaide, London, etc. TransE Model [9]: given a triple (h, r, t) in the graph, the idea is to minimize the distance between h + r and t, where h, r, and t denote the embeddings of the head entity, the relation, and the tail entity. KGE Cities: KGE_Phuket, KGE_Adelaide, KGE_London, etc. (for example, the Knowledge Graph Embedding of Phuket). Semantic Trails Knowledge Graph: the knowledge graph represents the user-venue interaction through the property ’visiting’, as well as the relations between the venue and the other entities, namely category, schema, and city. https://arxiv.org/abs/1812.04367
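To make the TransE idea from slide 15 concrete, the sketch below scores a triple by the Euclidean distance between h + r and t and computes the margin-based ranking loss from Bordes et al. [9] for one true triple and one corrupted triple. The 2-dimensional vectors are toy values chosen by hand, not embeddings learned from the Semantic Trails graph, and the function names are hypothetical.

```python
import math


def transe_distance(h, r, t):
    """Euclidean distance between the translated head (h + r) and the tail t."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def margin_loss(positive, negative, margin=1.0):
    """Hinge loss that pushes a true triple to score lower than a corrupted one."""
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))


# Toy embeddings (hypothetical values): a venue, a relation, and two candidate cities.
venue = [0.0, 0.0]
located_in = [1.0, 0.0]
true_city = [1.0, 0.0]    # venue + located_in lands exactly on this city
wrong_city = [0.0, 1.0]   # corrupted tail used as a negative example

positive = (venue, located_in, true_city)
negative = (venue, located_in, wrong_city)

d_pos = transe_distance(*positive)
d_neg = transe_distance(*negative)
loss_default = margin_loss(positive, negative)            # margin of 1.0
loss_wide = margin_loss(positive, negative, margin=2.0)   # stricter margin
```

With a margin of 1.0 the loss is already zero because the corrupted triple is farther away than the margin requires; with a margin of 2.0 the loss becomes positive, so training would keep separating the two triples.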
  • 16. Deep Knowledge Factorization Machines. Deep Neural Network inputs: collaborative information, content information, contextual information. Data sources: Travelers' Profiles & Trips, plus External Data. Semantic Trails Knowledge Graph: What characterizes a city the most? An embedding of each city is constructed based on the TransE model. TransE Model: given a triple (h, r, t) in the graph, the idea is to minimize the distance between h + r and t, where h, r, and t denote the embeddings of the head entity, the relation, and the tail entity. Wikipedia: cities are represented based on their textual description in Wikipedia. Each Wikipedia document is encoded as a weighted sum of word vectors. We used pre-trained word vectors from fastText (n-gram model). The n-gram model is similar to the Skip-gram model, but instead of learning a single vector per word, it learns a representation for each character n-gram, and a word vector is the sum of its n-gram vectors. The weights of the word vectors are their TF-IDF scores.
  • 17. Training Procedure and Evaluation Protocol. Leave-one-out protocol: for each user, we remove the last destination he went to and use it as the test set. Figure: a timeline split into Training Data and Test Data; (1) the recommender system is trained, (2) it is given non-existing traveler-destination pairs, (3) it outputs a prediction as a ranked list of destinations (e.g., Adelaide, Osaka, Phuket, Brunei), (4) the list is evaluated with Hitrate@K [3] and MRR@K [7].
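The leave-one-out protocol and the two metrics from slide 17 can be sketched as follows. This is a minimal illustration, assuming each traveler's history is already sorted chronologically and that the recommender returns a ranked list of candidate destinations per traveler; the traveler IDs, histories, and ranked lists are invented toy data, and the function names are hypothetical.

```python
def leave_one_out(history):
    """Hold out each traveler's last destination as the test item."""
    train = {u: trips[:-1] for u, trips in history.items()}
    test = {u: trips[-1] for u, trips in history.items()}
    return train, test


def hit_rate_at_k(ranked, test, k):
    """Share of travelers whose held-out destination appears in their top-k list."""
    hits = sum(1 for u, target in test.items() if target in ranked[u][:k])
    return hits / len(test)


def mrr_at_k(ranked, test, k):
    """Mean reciprocal rank of the held-out destination, counting 0 if outside the top k."""
    total = 0.0
    for u, target in test.items():
        top_k = ranked[u][:k]
        if target in top_k:
            total += 1.0 / (top_k.index(target) + 1)
    return total / len(test)


# Toy data (hypothetical): chronological booking histories for two travelers.
history = {
    "traveler_1": ["Kuala Lumpur", "Sydney", "Phuket"],
    "traveler_2": ["London", "Paris", "Osaka"],
}
train, test = leave_one_out(history)

# Ranked lists a trained recommender might return for not-yet-visited destinations.
ranked = {
    "traveler_1": ["Adelaide", "Osaka", "Phuket", "Brunei"],
    "traveler_2": ["Adelaide", "Brunei", "Phuket", "Osaka"],
}

hr3 = hit_rate_at_k(ranked, test, k=3)
mrr3 = mrr_at_k(ranked, test, k=3)
hr4 = hit_rate_at_k(ranked, test, k=4)
mrr4 = mrr_at_k(ranked, test, k=4)
```

Hit rate only asks whether the held-out destination made the top K, while MRR also rewards placing it near the top, which is why the slides report both.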
  • 18. Results: DKFM vs Baselines (figure; DKFM is labeled “Our Model”)
  • 19. DKFM: what is the contribution of each input data?
      • Input Contribution table (columns: Demographics Data, Textual Embedding, Knowledge Graph Embedding, HR@10, MRR@10). Reported scores, row by row:

        HR@10   MRR@10
        0.72    0.34
        0.79    0.37
        0.80    0.38
        0.82    0.38
        0.84    0.41
        0.85    0.42
        0.88    0.44

      • Better Deep Neural Network + Data Enrichment => Best results
  • 20. Conclusion and Future Work. Conclusions: combining different types of input remarkably improves recommendation results; the DKFM model outperforms state-of-the-art collaborative filtering methods. Future Work: enrich cities’ characteristics using visual embeddings; explore other loss functions such as a pairwise loss; explore the use of a similarity measure inside the DNN, such as cosine similarity. Open Science: DKFM implementation available at https://gitlab.eurecom.fr/amadeus/DKFM-recommendation
  • 21. References
      [1] Badrul Sarwar, George Karypis, Joseph A. Konstan, and John Riedl. 2001. Item-based collaborative filtering recommendation algorithms.
      [2] Y. Hu, Y. Koren, and C. Volinsky. 2008. Collaborative Filtering for Implicit Feedback Datasets.
      [3] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback.
      [4] Steffen Rendle. 2010. Factorization Machines.
      [5] Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, and Hemal Shah. 2016. Wide & Deep Learning for Recommender Systems.
      [6] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural Collaborative Filtering.
      [7] Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction.
      [8] Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, and Armand Joulin. 2018. Advances in Pre-Training Distributed Word Representations.
      [9] Antoine Bordes, Nicolas Usunier, Alberto Garcia-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data.