7. Data requirements
Confidential
› Points of interest (POIs) within a city
• Latitude, Longitude, Address, Opening Hours, Name, Category
• Use Yahoo! Travel APIs to gather information about POIs
› Time spent at POI
• Average time that must be spent at this location
• Use Flickr photos to determine average time spent at POIs
› Distances between POIs
• Driving and Walking distances between locations
• Use Yahoo! Geo APIs to compute these distances
› Algorithm to compute the itinerary
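The data items listed above could be held in a structure like the following. This is a minimal sketch, not the authors' schema: the class names `POI` and `CityData`, the field names, the kilometre unit, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class POI:
    """One point of interest, with the attributes named on the slide."""
    name: str
    category: str
    latitude: float
    longitude: float
    address: str
    # Hypothetical layout: weekday -> (opening time, closing time)
    opening_hours: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    # Filled in later, once the average time spent has been computed
    avg_visit_minutes: Optional[float] = None


@dataclass
class CityData:
    """All POIs of a city plus pairwise distances keyed by (from, to)."""
    pois: Dict[str, POI] = field(default_factory=dict)
    driving_km: Dict[Tuple[str, str], float] = field(default_factory=dict)
    walking_km: Dict[Tuple[str, str], float] = field(default_factory=dict)


# Made-up sample values, for illustration only
city = CityData()
city.pois["museum"] = POI(
    name="Example Museum", category="museum",
    latitude=12.97, longitude=77.59, address="1 Example Road",
    opening_hours={"mon": ("09:00", "17:00")},
)
city.pois["park"] = POI(
    name="Example Park", category="park",
    latitude=12.98, longitude=77.60, address="2 Example Road",
)
city.driving_km[("museum", "park")] = 3.2
city.walking_km[("museum", "park")] = 2.5
```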
8. Design and Architecture
Itinerary generation is done in two phases
› Phase 1: Computing time spent at POIs
• Inputs: Flickr data (user streams), Yahoo! Travel (POI data), Yahoo! Maps
• Output: POI graph for the city
› Phase 2: Generating a path between POIs
• Inputs: start location, end location, time constraint, POI graph (from Phase 1)
• Output: most profitable path
9. Phase 1 – Flickr Data Mining
§ Steps to compute time spent at POIs within a city
› Extract all geo-tagged Flickr images for a given POI
› Process the images ordered by click-time and author
› Deduce the time spent by the users at POIs using first & last timestamps
› Compute the mean of time spent by various users at a POI
§ Use Yahoo! Geo APIs to compute travel time between POIs
§ Output: weighted POI graph for the city
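The time-spent steps above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: photos are assumed to be already matched to a POI and supplied as `(poi_id, user_id, taken_at)` tuples, each user is assumed to make a single visit per POI, and users with only one photo at a POI are skipped because they yield no duration. The function name and the sample data are made up.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_time_spent(photos):
    """Return {poi_id: mean minutes spent}, from first and last photo times.

    photos: iterable of (poi_id, user_id, taken_at) with taken_at a datetime.
    """
    # Collect the timestamps of each user at each POI
    stamps = defaultdict(list)
    for poi_id, user_id, taken_at in photos:
        stamps[(poi_id, user_id)].append(taken_at)

    # Time spent by one user = last timestamp minus first timestamp
    per_poi = defaultdict(list)
    for (poi_id, _user_id), times in stamps.items():
        if len(times) < 2:
            continue  # a single photo says nothing about the visit length
        minutes = (max(times) - min(times)).total_seconds() / 60
        per_poi[poi_id].append(minutes)

    # Mean over all users who contributed a duration
    return {poi_id: mean(durations) for poi_id, durations in per_poi.items()}


# Made-up sample: alice spends 90 minutes, bob 30 minutes at the museum
sample_photos = [
    ("museum", "alice", datetime(2010, 5, 1, 10, 0)),
    ("museum", "alice", datetime(2010, 5, 1, 11, 30)),
    ("museum", "bob", datetime(2010, 5, 2, 14, 0)),
    ("museum", "bob", datetime(2010, 5, 2, 14, 30)),
    ("park", "alice", datetime(2010, 5, 1, 13, 0)),
]
time_spent = mean_time_spent(sample_photos)  # {'museum': 60.0}
```

Taking the minimum and maximum timestamp per user is equivalent to sorting by click time and reading off the first and last photo. A production version would likely also split a user's photos into separate visits when the gap between them is large.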
10. Phase 2 – Path Computation
§ Orienteering Problem
› Given an edge-weighted graph G = (V, E, w) and a pair of nodes ‘s’ and ‘t’, find an s-t walk of length at most ‘B’ that maximizes some function ‘f’ on the set of nodes in the walk
• Here ‘V’ is the vertex set, ‘E’ is the edge set, ‘w’ is the weight function, ‘B’ is the path budget, and ‘f’ is the reward function
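Written as an optimization problem, the definition above reads roughly as follows. The walk notation $P = (v_0, \dots, v_k)$ is introduced here for illustration and does not appear on the slide.

```latex
\[
\max_{P = (v_0, v_1, \dots, v_k)} \; f\bigl(\{v_0, v_1, \dots, v_k\}\bigr)
\quad \text{subject to} \quad
v_0 = s, \;\; v_k = t, \;\; (v_{i-1}, v_i) \in E, \;\;
\sum_{i=1}^{k} w(v_{i-1}, v_i) \le B
\]
```

Because $f$ is evaluated on the set of visited nodes, revisiting a node in the walk adds length but no additional reward.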
§ Reducing our problem to Orienteering Problem
• Each node in the city graph is a POI, with cost = time spent and prize = popularity
• Each edge in city graph has weight = travel time between POIs
• ‘B’ denotes the maximum number of POIs allowed in a path
• Reward function ‘f’ is proportional to the number of Flickr users at a POI and to its popularity
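For small graphs the reduction above can be illustrated with an exhaustive search. The sketch below is not the recursive greedy algorithm of Chekuri and Pál cited in the references; it is a plain depth-first enumeration of simple paths, which is only practical for a handful of POIs. It assumes the reward of a path is the sum of the node prizes, interprets the budget as total minutes (visit time plus travel time), and exposes the slide's cap on the number of POIs as the optional `max_pois` argument. All names and sample values are hypothetical.

```python
def best_path(visit_time, reward, travel_time, start, end, budget, max_pois=None):
    """Return (total reward, path) of the best simple path from start to end.

    visit_time:  {poi: minutes spent at the POI}      (node cost)
    reward:      {poi: popularity score}              (node prize)
    travel_time: {(a, b): minutes to travel a -> b}   (edge weight)
    budget:      total minutes available (assumed interpretation)
    max_pois:    optional cap on the number of POIs in the path
    Returns (-inf, None) when no feasible path exists.
    """
    best = (float("-inf"), None)

    def extend(path, used, gained):
        nonlocal best
        node = path[-1]
        if node == end:
            # Complete path: keep it if it beats the best one so far
            if gained > best[0]:
                best = (gained, list(path))
            return
        if max_pois is not None and len(path) >= max_pois:
            return
        for nxt in visit_time:
            if nxt in path or (node, nxt) not in travel_time:
                continue
            cost = used + travel_time[(node, nxt)] + visit_time[nxt]
            if cost <= budget:  # prune paths that exceed the budget
                path.append(nxt)
                extend(path, cost, gained + reward[nxt])
                path.pop()

    if visit_time[start] <= budget:
        extend([start], visit_time[start], reward[start])
    return best


# Made-up sample city: times in minutes, travel times symmetric
visit_time = {"hotel": 0, "museum": 60, "park": 30, "tower": 45, "station": 0}
reward = {"hotel": 0, "museum": 8, "park": 3, "tower": 6, "station": 0}
legs = {
    ("hotel", "museum"): 10, ("hotel", "park"): 15, ("hotel", "tower"): 20,
    ("hotel", "station"): 25, ("museum", "park"): 10, ("museum", "tower"): 15,
    ("museum", "station"): 20, ("park", "tower"): 10, ("park", "station"): 10,
    ("tower", "station"): 5,
}
travel_time = {**legs, **{(b, a): t for (a, b), t in legs.items()}}

result = best_path(visit_time, reward, travel_time, "hotel", "station", budget=150)
# result == (14, ['hotel', 'museum', 'tower', 'station'])
```

In the sample, visiting the museum, the park and the tower together needs at least 170 minutes, so the search settles on the museum and the tower (135 minutes, reward 14).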
§ Results
• The algorithm computes a path between POIs at run time, within 2-3 seconds for graphs with |V| < 30
11. References
§ Chandra Chekuri, Martin Pál. A Recursive Greedy Algorithm for Walks in Directed Graphs. IEEE Symposium on Foundations of Computer Science (FOCS), 2005
§ Munmun De Choudhury et al. Constructing Travel Itineraries from Tagged Geo-Temporal Breadcrumbs. WWW 2010
§ Yahoo Geo Technologies http://developer.yahoo.com/geo/geoplanet/
§ Flickr APIs http://www.flickr.com/services/api/
§ Yahoo! Travel http://travel.yahoo.com