3cixty - A New Platform for City Exploration
1. 3cixty: a New Platform for City Exploration
Raphaël Troncy <raphael.troncy@eurecom.fr>
Multimedia Semantics, EURECOM
@rtroncy
… and many others who need to be credited:
Giuseppe Rizzo, Houda Khrouf, Julien Plu, Ahmad Assaf,
Oscar Corcho, Juan Carlos Ballesteros,
José Luis Redondo García, Ghislain Atemezing, Vuk Milicic,
several student projects, etc.
4. An Evening in Milan
07/07/2015 - BMW Summer School - Lake Tegernsee - 4
5. Expo and the City
“I’d like to take a break from Expo and visit Milan. What’s the best time for a break, and what things in the city could I go to then?”
6. Our Solution: Apps Powered by 3cixty
Showcase app: ExplorMI 360 https://www.3cixty.com/
7. From Technology Maturation To Business
First showcase in Milan (2014-2015)
Next showcases planned in London, Nice,
Bologna, Madrid (2015-2016)
17. wikidata:Q210022 dbpedia:Expo_2015
18. Linked Data Principles
Tim Berners-Lee [2006] (Design Issues)
1. Use URIs to identify things
(anything, not just documents);
2. Use HTTP URIs
– globally unique names, distributed ownership –
so that people can look up those names;
3. Provide useful information in RDF –
when someone looks up a URI;
4. Include RDF links to other URIs –
to enable discovery of related information
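As a minimal sketch of the four principles, the snippet below writes two RDF statements in N-Triples syntax using only the Python standard library. It assumes that the two identifiers shown on slide 17 (wikidata:Q210022 and dbpedia:Expo_2015) denote the same thing; the owl:sameAs link, the label, and the helper names iri, literal and triple are illustrative assumptions, not part of the 3cixty code base.

```python
# Hypothetical illustration: the owl:sameAs link and the label are assumptions.
WIKIDATA = "http://www.wikidata.org/entity/"
DBPEDIA = "http://dbpedia.org/resource/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value):
    """Wrap an HTTP URI in N-Triples angle brackets (principles 1 and 2)."""
    if not value.startswith(("http://", "https://")):
        raise ValueError(f"not an HTTP URI: {value}")
    return f"<{value}>"


def literal(text, lang=None):
    """Serialize a plain string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def triple(subject, predicate, obj):
    """Return one N-Triples statement (principle 3: information in RDF)."""
    return f"{subject} {predicate} {obj} ."


expo = iri(DBPEDIA + "Expo_2015")
statements = [
    triple(expo, iri(RDFS_LABEL), literal("Expo 2015", "en")),
    # Principle 4: link to another URI so related data can be discovered.
    triple(expo, iri(OWL_SAME_AS), iri(WIKIDATA + "Q210022")),
]
print("\n".join(statements))
```

A client that looks up either URI could then follow the link to reach the description published by the other source.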
19. 3cixty Architecture
Heterogeneous data sources
Data Crawling
Data Streams
RDF Conversion
RSS Update
21. Knowledge Graphs for Transportation
22. Knowledge Graphs for Places / Events
23. On the Importance of Having Good Maps
24. On the Importance of Having Good Maps
25. On the Importance of Having Good Maps
26. 3cixty: a smart city and big data project
27. Handling Highly Dynamic Data
Data streams
Live hotel room availability through the EAN network (Expedia)
Live position of the city buses
Live state of bike sharing stations
Complex Event Processing: T-Rex / SPARQL Streams
(Diagram: E015 services feed an E015 pull-push adapter; publishers send primitive events to T-Rex, which applies TESLA rules; subscribers receive primitive or composite events)
28. CEP with T-Rex
T-Rex is a high-performance CEP engine, providing an ad hoc rule language (TESLA)
Examples of TESLA rules:
Define GrowingDelay(t_id: string, delay: int)
From TrainDelay(t_id => $t, delay => $d) as T1 and
last TrainDelay(t_id=$t, $d>delay) as T2 within 5min from T1
Where GrowingDelay.t_id := T1.t_id, GrowingDelay.delay:=T1.delay
Define LowBikeAvail(bike_stat: string, availNow: int, availAvg: int)
From BikeMiStat(avail_bikes<5) as S
Where bike_stat := S.stat_id, availNow := S.avail_bikes,
availAvg := AVG(BikeMiStat.avail_bikes) within 60min from S
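To make the semantics of the LowBikeAvail rule concrete, here is a rough Python sketch of the same idea: keep a 60-minute sliding window of station reports and emit a composite event whenever a report shows fewer than 5 bikes. This is not how T-Rex evaluates TESLA rules; the class names, the use of seconds for timestamps, the inclusion of the triggering event S in the average, and the assumption that events arrive in timestamp order are all illustrative choices.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BikeMiStat:
    """One primitive event: a bike-sharing station status report."""
    timestamp: float  # seconds since some reference point (assumed unit)
    stat_id: str
    avail_bikes: int


class LowBikeAvailDetector:
    """Hypothetical stand-in for the LowBikeAvail rule, not T-Rex itself."""

    def __init__(self, threshold=5, window_seconds=60 * 60):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._recent = deque()  # events inside the window, oldest first

    def on_event(self, event):
        """Consume one event; return a composite event dict or None."""
        self._recent.append(event)
        # Drop events older than the window, measured back from this event.
        while self._recent[0].timestamp < event.timestamp - self.window_seconds:
            self._recent.popleft()
        if event.avail_bikes >= self.threshold:
            return None
        # Average over all reports in the window (assumed to include S).
        avg = sum(e.avail_bikes for e in self._recent) / len(self._recent)
        return {
            "bike_stat": event.stat_id,
            "availNow": event.avail_bikes,
            "availAvg": avg,
        }
```

Feeding the detector one BikeMiStat at a time mirrors the publish side of the engine; each non-None return value plays the role of a composite event delivered to subscribers. Note that the rule declares availAvg as an int, while this sketch keeps the unrounded average.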
29. 3cixty Architecture
Real-time Reconciliation
- Category mapping
- Instance matching
Heterogeneous data sources
Data Crawling
Data Streams
RDF Conversion
RSS Update
30. Are those two venues the same?
Google Places: name, address, geoLoc
Yelp: name, address, geoLoc
➔ Slightly different POI name
➔ Different tel number
➔ Different locality (Milan or Pero?)
➔ Different region (Lombardia or MI?)
➔ Distance using the Haversine formula: 2,428 m!
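For reference, the great-circle distance between two geolocations can be computed with the haversine formula as sketched below. The function name and the Earth radius constant (the commonly used mean radius of 6,371 km) are assumptions; the slide does not say which radius or implementation was used.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

Calling haversine_m with the coordinates reported by two sources for the same venue gives the kind of discrepancy quoted on the slide; a threshold on this distance is one possible ingredient of instance matching.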
31. The similarity of two events is a mutual agreement of their factual properties
Based on top-k dependencies between properties
Data reconciliation (learning to align)
p1            p2       dependency
title1        title2   0.30
place1        place2   0.28
title1        agent2   0.26
agent1        agent2   0.21
description1  title2   0.16
1st level: minimal conditions to fetch similar events using SPARQL
2nd level: refine the results
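One way to read the dependency table is as a set of weights for comparing property pairs across two event descriptions. The sketch below combines per-pair string similarities into a single score in that spirit. Using the dependency values directly as weights, difflib's SequenceMatcher as the string measure, and the function names are all assumptions made for illustration; the slide does not describe the actual scoring function.

```python
from difflib import SequenceMatcher

# (property of event 1, property of event 2, dependency) taken from the table.
DEPENDENCIES = [
    ("title", "title", 0.30),
    ("place", "place", 0.28),
    ("title", "agent", 0.26),
    ("agent", "agent", 0.21),
    ("description", "title", 0.16),
]


def string_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def event_similarity(event1, event2, dependencies=DEPENDENCIES):
    """Weighted average of property-pair similarities; missing values are skipped."""
    total = 0.0
    weight_sum = 0.0
    for p1, p2, weight in dependencies:
        v1, v2 = event1.get(p1), event2.get(p2)
        if not v1 or not v2:
            continue
        total += weight * string_similarity(v1, v2)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```

In a two-level setup like the one on the slide, a cheap query would first fetch candidate pairs, and a score of this kind could then be used to refine them.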
32. 3cixty Architecture
Real-time Reconciliation
- Category mapping
- Instance matching
Heterogeneous data sources
Data Crawling
Data Streams
RDF Conversion
RSS Update
Web Applications
SPARQL
REST API (Elda)
33. ExplorMI 360 Web App
34. ExplorMI 360 Mobile Guide (Android / iOS)
36. Brief Descriptions of Services (1)
A Service for Executing Mixed-Domain Queries
Execute queries in your app that combine diverse types of information
A “Parallel Exploration” Graphical UI
Enable users to construct and save trees of interrelated queries that enable them to explore several aspects of the city simultaneously
A “Wish List” Service
Allow users to indicate where they may want to go and (optionally) when
Store this information in the cloud so that it can be accessed from any 3cixty app when the user is logged in
37. Brief Descriptions of Services (2)
A Mobility Profiling service
Track users’ movements within the city, including their
use of modes of transportation
Enable users to make queries with restrictions like “in a
location that I’ve never been to before”
Display data from the user’s mobility profile along with
other 3cixty information
Generic Crowdsourcing Platform
Efficiently write effective interfaces that enable users to contribute information about aspects of the city and to access such information provided by others
A Social Network Mining Service
Access what a user’s friends have done or said with
regard to particular things in the city
39. Data Mining in the Knowledge Graph
40. Extracting Patterns in the Knowledge Graph
Geosummly:
http://geosummly.eurecom.fr/
Zooming in “Shop & Service”
41. Grid Sampling on Foursquare
(Figure: grid of cells numbered 1, 2, ..., n)
cell  centroid     features
1     lat*  lng*   f(1,1)  f(1,2)  ...
2
3
...
n                  ...  f(n,n)
lat*, lng* = latitude and longitude of the centroid of cell i. Using the centroid reduces the observation noise of the single venues and the sparsity of the data set.
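The grid sampling step can be sketched as follows: venues are binned into square cells, and each cell is summarized by its centroid and the frequency of each venue category inside it. The function name, the tuple layout of a venue, the square cells in degrees, and the use of the geometric cell centre as centroid are assumptions for illustration; no Foursquare API call is involved.

```python
from collections import Counter, defaultdict


def grid_sample(venues, lat_min, lng_min, cell_size):
    """Group venues into square cells and count categories per cell.

    venues: iterable of (lat, lng, category) tuples (hypothetical layout).
    Returns {(row, col): {"lat*": ..., "lng*": ..., "freq": Counter}}.
    """
    counts = defaultdict(Counter)
    for lat, lng, category in venues:
        row = int((lat - lat_min) // cell_size)
        col = int((lng - lng_min) // cell_size)
        counts[(row, col)][category] += 1

    grid = {}
    for (row, col), freq in counts.items():
        grid[(row, col)] = {
            # Centroid taken as the geometric centre of the cell (assumption).
            "lat*": lat_min + (row + 0.5) * cell_size,
            "lng*": lng_min + (col + 0.5) * cell_size,
            "freq": freq,
        }
    return grid
```

Each entry of the returned dictionary corresponds to one row of the matrix above: the centroid coordinates followed by one frequency per category. Normalizing the frequencies is left out of this sketch.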
42. Parameter Estimation
(Figure: example points plotted by lng and lat, linked with arrows)
Definitions:
eps: reachable distance. We use the Euclidean distance (points linked with arrows)
minPts: minimum number of points needed to form a cluster (in the example, it can be 1...8)
Automatic parameter estimation:
eps: we compute the mean Euclidean distance of the feature values in the grid
minPts: considering the features independent of each other, we set minPts to the mean value observed over the entire grid, reduced by a small quantity based on the law of large numbers
43. Clustering Algorithm
We propose GeoSubClu, a density based clustering
algorithm inspired by SubClu
Inputs:
eps, minPts
O = {o1, …, om}: set of geographic objects located at the spatial coordinates fx and fy. Each object corresponds to the centroid of a cell in the sampled grid
F = {f1, …, fn}: set of features. Each feature corresponds to the observed frequency of a given category in the area, normalized intra-feature and per surface of the cell
Outputs:
S = {s1, …, sk}: set of k-dimensional subspaces sk. Each subspace corresponds to a cluster that has k prominent different features
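SubClu builds on DBSCAN-style density clustering applied to subspaces of the feature set. The sketch below is plain DBSCAN run on a chosen projection of the feature vectors, to show how eps and minPts are used; it is not GeoSubClu, and the helper names project, region_query and dbscan are illustrative.

```python
import math

NOISE = -1


def project(points, dims):
    """Keep only the selected feature dimensions of each point (a subspace)."""
    return [tuple(p[d] for d in dims) for p in points]


def region_query(points, index, eps):
    """Indices of all points within eps of points[index], itself included."""
    center = points[index]
    return [j for j, p in enumerate(points) if math.dist(center, p) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one cluster label per point, NOISE (-1) for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels
```

A subspace search in the style of SubClu would call dbscan(project(points, dims), eps, min_pts) for growing sets of dimensions and keep the subspaces in which clusters survive.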
44. Extracting and Linking Entities
A hybrid approach that combines the strength of a linguistic method with the high annotation coverage obtained by using a large knowledge base
45. Extracting More Information from Reviews
Extracting sentiments: feature engineering
Pre-processing, emotion dictionary (DAL), POS,
capitalization, time of the day, day of the week, weather,
social graph
SVM + kNN classifiers used jointly
Extracting sub-categories
Japanese restaurant: Ramen or Teppanyaki?
LDA approach
BoW with unigrams
46. Semantic and Machine Learning
Lise Getoor - Combining Statistics and
Semantics to Turn Data into Knowledge
ESWC 2015 Keynote
47. Big Data is not Flat
It is multi-modal, multi-relational, spatio-temporal, multimedia
Machine learning needs knowledge graphs
Knowledge graphs need machine learning
Deep learning vs feature engineering
Key idea: Statistical Relational Learning (SRL)
Entity Resolution: determine which nodes refer to the same underlying real-world object
Link Prediction: infer the existence of new edges in the
graph
Classification: infer labels in a graph