Analysing the digital traces of Social Media users 
Muhammad Adnan, Guy Lansley, Paul Longley 
Consumer Research Data Centre, Department of Geography, University College London 
Web: www.uncertaintyofidentity.com ; www.cdrc.ac.uk 
Twitter: @gisandtech
Introduction 
•Recent years have witnessed rapid growth in the use of online services 
•Online shopping, bank transactions, social networking services 
•Issues related to cyber-crime, identity fraud, and hacking 
•‘Uncertainty of Identity’ project: Combining real and virtual world datasets to better understand the identity of individuals 
•Real world (Census, Demographic Classifications) 
•Virtual world (Email addresses, Social media accounts)
Introduction 
•Geodemographics 
•Census data represent the night time geography 
•Social media datasets can be used to provide day and travel time geographies 
•Spatial and temporal analysis of social media users 
•Activity pattern analysis 
•Tweet content analysis 
•Develop tools for Identity analysis 
•E-mail addresses 
•Social media accounts
Outline 
•Some popular social media services 
•Twitter 
•Introduction 
•Case Study 1: Social Media Geodemographics 
•Case Study 2: Activity pattern analysis 
•Temporal analysis of Twitter activity around different world cities 
•Case Study 3: Twitter Geographic Profiler 
•An Uncertainty of Identity tool
Some popular social media services 
•Facebook 
•2 billion total users 
•1.28 billion active users 
•Google Plus 
•1.6 billion total users 
•540 million active users 
•Twitter 
•More than 1 billion total users 
•255 million active users 
(1) Mediabistro. 2014. Social Media Stats 2014. Retrieved 17th November, 2014 from http://www.mediabistro.com/alltwitter/social-media-statistics-2014_b57746.
Twitter (www.twitter.com) 
•Online social networking and micro-blogging web service 
•Users can send messages of 140 characters or less 
•Approx. 500 million tweets daily 
•78% of Twitter’s active users are on mobile 
•44% of users have never sent a tweet (inactive users) 
•Twitter API: for downloading live tweet data
Data available through the Twitter API 
•User Creation Date 
•Followers 
•Friends 
•User ID 
•Language 
•Location 
•Name 
•Screen Name 
•Time Zone 
•Geo Enabled 
•Latitude 
•Longitude 
•Tweet date and time 
•Tweet text 
•A database of 1.4 billion social media messages 
•September, 2012 – February, 2014 
•Geo-tagged tweets 
•Latitude / Longitude
Case Study 1: Social Media Geodemographics
Social Media Geodemographics 
•Geodemographics 
•“Analysis of people by where they live” (2) 
•Night time characteristics of the population 
•Social Media Geodemographics 
•Moving beyond the night time geography 
•Who: Ethnicity, Gender, and Age of social media users 
•When: What time of day conversations happen 
•Where: Where social media conversations happen 
(2) Sleight, P. (2004). Targeting Customers: How to Use Geodemographic and Lifestyle Data in Your Business.
Twitter data for the case study 
•Approx. 8 million geo-tagged tweets (Jan – Dec, 2013) 
•Sent by 385,050 unique users 
•155,249 users sent 5 or more tweets (7.6 million tweets)
Flows of people and information 
•Entropy is a measure of uncertainty in a random variable 
•Shannon Entropy 
•7.6 million tweets were aggregated to 4,765 LSOAs 
•Entropy was calculated 
•High values indicate high flows of people and information 
$$H(X) = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i)$$
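The entropy calculation can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the slides do not say which distribution was used within each LSOA, so the sketch assumes the share of tweets sent by each user. The function names, area codes and user ids are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(counts, base=2):
    """Shannon entropy H(X) = -sum p(x_i) log_b p(x_i) of a list of counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in probs)


def entropy_by_area(tweets, base=2):
    """tweets: iterable of (area_code, user_id) pairs, one per tweet.

    Assumption: within each area the distribution is the share of tweets
    sent by each user; the slides do not state which variable was used.
    """
    per_area = defaultdict(Counter)
    for area, user in tweets:
        per_area[area][user] += 1
    return {area: shannon_entropy(list(c.values()), base)
            for area, c in per_area.items()}


# Hypothetical sample: four different users in one area, one user in another.
sample = [
    ("LSOA_A", "u1"), ("LSOA_A", "u2"), ("LSOA_A", "u3"), ("LSOA_A", "u4"),
    ("LSOA_B", "u1"), ("LSOA_B", "u1"), ("LSOA_B", "u1"), ("LSOA_B", "u1"),
]
result = entropy_by_area(sample)
```

Under this assumption, an area visited by many different users scores higher than an area where a single user sends all the tweets, which matches the reading that high values indicate high flows of people.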
Flows of people and information
Morning (6am – 11.59am) 
Afternoon (12pm – 5.59pm) 
Flows of people and information
Evening (6pm – 11.59pm) 
Night (12 midnight – 5.59am) 
Flows of people and information
Variables for creating a geo-temporal classification 
1.Residence 
•Where Twitter users live 
2.Ethnicity 
•Probable ethnic origins of Twitter users 
3.Age 
•Probable age of Twitter users 
4.Land Use Category of a Tweet message 
•Residential; Non-domestic building; Park etc. 
5.Temporal Scales 
•Day, Afternoon, Night, Peak travel hours
Residence of Twitter Users 
•A 170 m × 170 m grid was used to find the probable residence of users 
•A probable residence was found for 75,522 users
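A grid-based residence estimate could be sketched as below. This is an illustration only: the slides give the 170 m cell size but not the rule for choosing a cell, so the sketch assumes the user's most frequently used cell and a minimum of five tweets. It also assumes coordinates are already projected to metres (for example British National Grid eastings and northings). All function names are hypothetical.

```python
from collections import Counter, defaultdict

CELL_SIZE_M = 170  # grid resolution given on the slide


def cell_of(easting, northing, size=CELL_SIZE_M):
    """Index of the grid cell containing a projected coordinate (metres)."""
    return (int(easting // size), int(northing // size))


def probable_residence(tweets, min_tweets=5):
    """tweets: iterable of (user_id, easting, northing).

    Returns {user_id: cell} using each user's most frequent cell.
    Assumption: the modal-cell rule and the min_tweets threshold are
    illustrative; the slides do not specify either.
    """
    cells = defaultdict(Counter)
    for user, easting, northing in tweets:
        cells[user][cell_of(easting, northing)] += 1
    homes = {}
    for user, counter in cells.items():
        if sum(counter.values()) >= min_tweets:
            homes[user] = counter.most_common(1)[0][0]
    return homes


# Hypothetical sample: u1 tweets mostly from one cell, u2 has too few tweets.
sample = [
    ("u1", 530010, 180020), ("u1", 530050, 180025), ("u1", 530000, 179900),
    ("u1", 535000, 185000), ("u1", 535000, 185000),
    ("u2", 530010, 180020), ("u2", 530010, 180020),
]
homes = probable_residence(sample)
```

A refinement worth considering, though not stated in the slides, is to count only night-time tweets so that workplaces are less likely to be mistaken for homes.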
Extracting demographic attributes of Twitter users by using their forenames and surnames 
A name is a statement of the bearer’s cultural, ethnic, and linguistic identity (3) 
(3) Mateos P, Longley P A, O’Sullivan D 2011. Ethnicity and population structure in personal naming networks. PloS ONE (Public Library of Science) 6 (9) e22943.
Analysing Names on Twitter 
•Some examples of NAME variations on Twitter 
•Approx. 68% of the accounts have real names 
Fake Names Castor 5. WHAT IS LOVE? MysticMind KIRILL_aka_KID Vanessa Justin Bieber Home 
Real Names Kevin Hodge Andre Alves Jose de Franco Carolina Thomas, Dr. Prof. Martha Del Val Fabíola Sanchez Fernandes
Onomap: Names to Ethnicity classification 
•Onomap was created by clustering names of 1 billion individuals around the world 
•ONOMAP (www.onomap.org) was applied to forename–surname pairs 
Kevin Hodge (English) Pablo Mateos (Spanish) … … … …
Top 10 Ethnic Groups of Twitter Users 
•A total of 67 ethnic groups were identified
Age estimation from ‘forenames’
•Monica dataset provided by CACI Ltd, UK 
•Supplemented with UK birth certificate records 
Age distribution of Twitter users 
Twitter Users vs. 2011 Census (Greater London) 
(4) Longley, P., Adnan, M., Lansley, G. 2013. “The geo-temporal demographics of Twitter usage”. Environment and Planning A. (In Press)
Land-use Categories 
•Every tweet message was assigned a land-use category
Variables for creating a geo-temporal classification 
1. Residence 
•V1: Tweet made near probable London residence 
•V2: Tweeter lives ‘outside the UK’ 
•V3: Tweeter lives in the rest of the UK outside London 
2. Total Number of Tweets 
•V4: Total number of tweets made by the user 
3. Ethnicity 
•V5: West European 
•V6: East European 
•V7: Greek or Turkish 
•V8: South East Asian 
•V9: Other Asian 
•V10: African & Caribbean 
•V11: Jewish 
•V12: Chinese 
•V13: Other minority 
4. Age 
•V14: <=20 
•V15: 21 - 30 
•V16: 31 - 40 
•V17: 41 - 50 
•V18: 50+ 
5. Tweets outside the UK 
•V19: In West Europe (not including UK) 
•V20: In East Europe 
•V21: In North America 
•V22: In Central or South America 
•V23: In Australasia 
•V24: In Africa 
•V25: In Middle East 
•V26: In Asia 
•V27: In Paris
Variables for creating a geo-temporal classification 
6. Number of countries visited 
•V28: Number of countries tweeter has visited 
7. London Land Use Category 
•V29: Residential location 
•V30: Non-domestic buildings 
•V31: Transport links and locations 
•V32: Green-spaces 
•V33: All other land uses 
8. 2011 London Output Area Classification 
•V34: Intermediate Lifestyles 
•V35: High Density and High Rise Flats 
•V36: Settled Asians 
•V37: Urban Elites 
•V38: City Vibe 
•V39: London Life-Cycle 
•V40: Multi-Ethnic Suburbs 
•V41: Ageing-City Fringe 
9. Temporal Scales 
•V42: Morning Peak Hours 
•V43: Week Day 
•V44: Afternoon 
•V45: Week Night 
•V46: Weekend
Computing the geo-temporal classifications 
•Segmentations were created using the K-means clustering algorithm 
•K-means tries to find cluster centroids by minimising 
$$V = \sum_{x=1}^{n} \sum_{y=1}^{n} (z_y - \mu_x)^2$$
•Seven clusters 
•Group A: London Residents 
•Group B: Commuting Professionals 
•Group C: Student Lifestyle 
•Group D: The Daily Grind 
•Group E: Spectators 
•Group F: Visitors 
•Group G: Workplace and tourist activity 
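The K-means step can be illustrated with a small pure-Python version of Lloyd's algorithm. This is a sketch under stated assumptions, not the software used in the study: the initial centroids are sampled at random with a fixed seed, the objective is computed as the sum of squared distances from each point to its assigned centroid, and the six two-dimensional points stand in for the 46-variable user profiles.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels, objective)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = mean_point(members)
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
    objective = sum(
        squared_distance(p, centroids[lab]) for p, lab in zip(points, labels)
    )
    return centroids, labels, objective


# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels, v = kmeans(points, k=2)
```

In practice the input variables would normally be standardised to a common range before clustering, and the algorithm would be run from several starting points, keeping the solution with the lowest objective value.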
Group A: London Residents 
•Tweets made near primary residential locations 
•Tweets made on weeknights or weekends 
[Chart: values of variables V1 to V46 for this group, on a scale from 0 to 1]
Group B: Commuting Professionals 
•Tweets made from 
•Transport locations 
•‘Urban Elites’ LOAC classification 
•Tweets made by individuals of intermediate age (21-30) 
[Chart: values of variables V1 to V46 for this group, on a scale from 0 to 1]
Group F: Visitors 
•Tweeters live outside London 
•Tweets originated from residential land uses 
•Mixed age groups 
[Chart: values of variables V1 to V46 for this group, on a scale from 0 to 1]
Group G: Workplace and tourist activity 
•Tweets sent from non-domestic buildings 
•Full range of Twitter age cohorts 
•Tweets originate from a mix of residents and international visitors 
[Chart: values of variables V1 to V46 for this group, on a scale from 0 to 1]
Social Media Geodemographics 
•Geo-temporal demographic classifications 
•Census (night time geography) 
•Social media data (day and travel time geography) 
•Issues of representation 
•An insight into the residential and travel geographies of individuals 
•An insight into the spatial activity patterns of different kinds of social media users
Case Study 2: Analysis of Twitter activity around world cities 
(5) Muhammad Adnan, Alistair Leak, Paul Longley. “A geocomputational analysis of Twitter activity around different world cities”. Geospatial Information Science.
Activity Pattern Analysis 
•Comparison of the use of Twitter between different cities 
•Weekly patterns of activity 
•Seasonal shifts 
•Data: 19th September, 2012 – 25th September, 2013 
•Point-in-polygon operations were performed to extract data for different cities around the world 
•Approx. 170 million tweets were sent from the top 30 cities
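A point-in-polygon test can be sketched with the ray-casting method shown below. The slides do not say which software or algorithm was used, so this is only one common approach. The rectangle standing in for London is a hypothetical bounding box, not a real city boundary.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # one more crossing to the right
    return inside


# Hypothetical rectangle roughly covering London (illustration only).
london_box = [(-0.51, 51.28), (0.33, 51.28), (0.33, 51.69), (-0.51, 51.69)]

in_london = point_in_polygon(-0.1276, 51.5072, london_box)
in_paris = point_in_polygon(2.3522, 48.8566, london_box)
```

A point is inside when a ray cast to the right crosses the boundary an odd number of times. For millions of tweets, a spatial index or a GIS library would usually be preferred over testing every point against every polygon.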
Top 30 cities on Twitter 
[Chart: number of tweets (millions) for each of the top 30 cities; axis from 0 to 40] 
•Approx. 170 million tweets were sent from the following 30 cities.
Time zone issue 
•By default, the Twitter API returns timestamps in GMT (UTC) 
•Data was converted from GMT to the corresponding local time zones 
Date & Time (GMT) | Date & Time (UTC +1) 
Wed Dec 05 00:04:23 2012 | Wed Dec 05 01:04:23 2012 
Wed Dec 05 00:06:29 2012 | Wed Dec 05 01:06:29 2012 
Wed Dec 05 00:07:35 2012 | Wed Dec 05 01:07:35 2012
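The conversion shown on this slide can be reproduced with Python's standard library. This is a minimal sketch that assumes a fixed UTC offset per city; the function name `to_local` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Layout of the timestamps shown on this slide.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %Y"


def to_local(gmt_string, utc_offset_hours):
    """Parse a GMT timestamp and shift it to a fixed UTC offset."""
    parsed = datetime.strptime(gmt_string, TWITTER_FORMAT)
    aware = parsed.replace(tzinfo=timezone.utc)
    local = aware.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime(TWITTER_FORMAT)


converted = to_local("Wed Dec 05 00:04:23 2012", 1)
```

A fixed offset ignores daylight saving time. For a year-long dataset covering cities that change their clocks, a named zone from the standard-library `zoneinfo` module would be a more suitable choice.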
Temporal Analysis of Twitter Cities 
[Figure panels: Jakarta, Istanbul, Paris, Sao Paulo (Brazil), New York City, London, Riyadh, Tokyo, Madrid, Buenos Aires (Argentina)] 
[Figure panels on the following slides: London; London and Paris; Jakarta; Jakarta and Riyadh; New York City; New York City and Tokyo]
Case Study 3: Twitter Geographic Profiler (a part of Uncertainty of Identity Toolkit)
Introduction 
•Uncertainty of Identity Toolkit is a framework for the identification and profiling of individuals from their 
•Social media accounts 
•E-mail addresses 
•Twitter Geographic Profiler 
•Maps ethno-cultural communities of a person’s friends 
•Extracting identities of Twitter users 
•Mapping them to probable ethnic origins 
•Could have potential applications in targeted marketing
Twitter Geographic Profiler 
•Given an individual’s Twitter Username or ID 
•Extracts information about the individual’s friends 
•Extracts the forename-surname pairs of the friends 
•Maps forename-surname pairs to Onomap 
•Builds an ethno-cultural profile of the person’s friends 
•Maps the geographic distribution
Data available through the Twitter API 
•User ID 
•User Creation Date 
•Followers 
•Friends 
•Language 
•Location 
•Name 
•Screen Name or User Name 
•Time Zone 
•Geo Enabled 
•Latitude 
•Longitude 
•Tweet date and time 
•Tweet text
Twitter: getting the ids and usernames 
•Given a Twitter username of a person, we use the Twitter API to get the list of friends’ ids 
–A max of 15 requests every 15 minutes is allowed 
–Each query can get up to 5000 ids 
–Generally enough to download all the ids 
•Using the ids, we fetch the name associated with each id 
–Limited to 180 requests every 15 min 
–Returns a single string from which we need to extract the name and surname tokens 
–Not necessarily a valid forename + surname! 
•E.g., “University of Birmingham”, “John1965”, “ What is Love”, “Mystic_mind”
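The effect of these rate limits can be estimated with some simple arithmetic. The limits of 15 and 180 requests per 15 minutes and the 5000 ids per query come from the slide. The batch size of 100 names per lookup request is an assumption, since the slide does not state it, and the function names are hypothetical.

```python
import math

IDS_PER_REQUEST = 5000          # from the slide
ID_REQUESTS_PER_WINDOW = 15     # from the slide
NAME_REQUESTS_PER_WINDOW = 180  # from the slide
WINDOW_MINUTES = 15             # from the slide
NAMES_PER_REQUEST = 100         # assumption: batch size of the name lookup


def windows_needed(requests, per_window):
    """Number of 15-minute windows needed to issue `requests` calls."""
    return math.ceil(requests / per_window) if requests else 0


def estimate_minutes(n_friends):
    """Rough lower bound on the waiting time imposed by the rate limits."""
    id_requests = math.ceil(n_friends / IDS_PER_REQUEST)
    name_requests = math.ceil(n_friends / NAMES_PER_REQUEST)
    # Requests in the first window are immediate; each extra window waits.
    id_wait = max(windows_needed(id_requests, ID_REQUESTS_PER_WINDOW) - 1, 0)
    name_wait = max(
        windows_needed(name_requests, NAME_REQUESTS_PER_WINDOW) - 1, 0
    )
    return {
        "id_requests": id_requests,
        "name_requests": name_requests,
        "wait_minutes": (id_wait + name_wait) * WINDOW_MINUTES,
    }


small = estimate_minutes(2000)    # a typical account
large = estimate_minutes(50000)   # an account following very many profiles
```

Under these assumptions, an account with 2,000 friends needs one id request and 20 name requests, so no waiting is required. Only accounts following tens of thousands of profiles would run into the limits.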
Twitter: getting forename-surname pairs 
•The name field was divided into tokens 
•Forenames and surnames were detected by matching the tokens against a database of forename-surname pairs from 26 countries 
•Users were discarded 
–where tokens could not be matched to a valid forename and surname
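The token matching can be sketched as follows. This is an illustration under assumptions: the two small sets are hypothetical stand-ins for the 26-country names database, and the sketch takes the first token as the forename and the last token as the surname.

```python
import re

# Hypothetical lookup sets standing in for the 26-country names database.
KNOWN_FORENAMES = {"kevin", "pablo", "martha"}
KNOWN_SURNAMES = {"hodge", "mateos", "val"}


def extract_name_pair(name_field):
    """Return (forename, surname), or None if the user should be discarded."""
    # Split into runs of letters, dropping digits, underscores and punctuation.
    tokens = re.findall(r"[^\W\d_]+", name_field)
    if len(tokens) < 2:
        return None  # a single token cannot form a forename-surname pair
    # Assumption: first token is the forename, last token is the surname.
    forename, surname = tokens[0].lower(), tokens[-1].lower()
    if forename in KNOWN_FORENAMES and surname in KNOWN_SURNAMES:
        return forename.title(), surname.title()
    return None


examples = ["Kevin Hodge", "University of Birmingham", "John1965", "Mystic_mind"]
pairs = [extract_name_pair(name) for name in examples]
```

Only the first example yields a pair. The other three are discarded because their tokens do not match a known forename and surname.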
Onomap: from names to ethnicity 
•ONOMAP (www.onomap.org) was applied to forename–surname pairs 
Kevin Hodge (English) Pablo Mateos (Spanish) … … … …
Friends’ Ethnicity Histogram 
Once the entire list of friends’ forename + surname pairs has been parsed, we can estimate the distribution of the Twitter user's friends over the set of possible ethno-cultural groups 
With the list of (surname, forename) pairs to hand, we query Onomap to get the ethno-cultural classification associated with each (surname, forename) pair, and the SearchSurnameTopCountries method to get the list of the countries where an instance of a given surname was observed. 
Figure 1: Screenshot of the Twitter Geographic Profiler. The bottom part of the screen shows the histogram of the Twitter user's friends' ethno-cultural groups.
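Building the histogram from the classified friends is a simple counting step, sketched below. The function name and the sample labels are hypothetical, and the labels are assumed to have been obtained already from a names classifier.

```python
from collections import Counter


def ethnicity_histogram(classifications):
    """Relative frequency of each ethno-cultural group among a user's friends.

    classifications: list of group labels, one per friend with a valid
    (forename, surname) pair.
    """
    counts = Counter(classifications)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders groups from most to least frequent.
    return {group: n / total for group, n in counts.most_common()}


# Hypothetical labels for four friends.
friends = ["English", "Spanish", "English", "English"]
histogram = ethnicity_histogram(friends)
```

Friends whose names could not be parsed are simply absent from the input list, so the proportions describe only the classified friends.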
Friends’ Geographic Origins 
Figure 2: Map showing the geographic origin of the Twitter user's friends’ surnames as assigned by our tool. Below the map the user is shown a list of the top 10 countries with the respective frequency. 
In this work we mark as invalid any string that is composed of a single token. If this is the case, we skip the profile of the corresponding friend. If the string contains two or more tokens, we take the first one to be the forename and the last one to be the surname.
Twitter Geographic Profiler 
•Potential applications include 
–Measure the level of segregation/integration of a given individual (community) as the Shannon entropy of the (average) friends’ ethnicity histogram 
–Outlier detection: identify uncommon behaviors, e.g., individuals that stand out in terms of the ethno-cultural groups they bond with 
•Limitations 
–Twitter data is very noisy 
–Request limits
Conclusion
•Social media datasets can be used to create Geo-temporal demographic classifications 
•Day and travel time geographies 
•Activity patterns 
•Temporal analysis can identify some interesting patterns of a geographical area 
•Weekly patterns of activity 
•Seasonal shifts 
•Twitter Geographic Profiler: Identification and profiling of ethno-cultural characteristics of individuals 
•From their Twitter accounts 
Conclusion
•Study of privacy implications on social media services 
•Facebook, FourSquare 
•Future work: Consumer Data Research Centre 
•Use of social media for retail sector 
•Spatial and temporal catchments of the social media users 
Time catchments 
•E.g. Day-time catchment 
1.Identify the unique IDs of users frequently transmitting from a particular location at a given time or date range 
2.Request their other activity through Twitter’s API, filter by time/date 
3.Aggregate 
The Twitter work-day time catchment of Bishopsgate 
Activity at Bishopsgate in 2013
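The three catchment steps can be sketched on an in-memory list of tweets. This is an illustration only: the request to Twitter's API in step 2 is replaced by filtering a local list, the 9am to 5pm window and the threshold of three tweets are assumptions, and every place name other than Bishopsgate is hypothetical sample data.

```python
from collections import Counter
from datetime import datetime


def frequent_users(tweets, place, start_hour, end_hour, min_tweets=3):
    """Step 1: users who often tweet from `place` within the hour range."""
    counts = Counter(
        t["user"] for t in tweets
        if t["place"] == place and start_hour <= t["time"].hour < end_hour
    )
    return {user for user, n in counts.items() if n >= min_tweets}


def catchment(tweets, place, start_hour=9, end_hour=17, min_tweets=3):
    """Steps 2 and 3: gather those users' other tweets and count by place."""
    users = frequent_users(tweets, place, start_hour, end_hour, min_tweets)
    return Counter(
        t["place"] for t in tweets
        if t["user"] in users and t["place"] != place
    )


def _tweet(user, place, hour):
    """Build one hypothetical tweet record for the sample below."""
    return {"user": user, "place": place, "time": datetime(2013, 6, 3, hour, 0)}


sample = (
    [_tweet("u1", "Bishopsgate", h) for h in (9, 12, 15)]
    + [_tweet("u1", "Croydon", 20), _tweet("u1", "Croydon", 7)]
    + [_tweet("u2", "Bishopsgate", 10), _tweet("u2", "Camden", 21)]
)
result = catchment(sample, "Bishopsgate")
```

In this sample only u1 tweets often enough from Bishopsgate during working hours, so the catchment contains that user's two tweets from elsewhere.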
[Figure panels: Waterloo, St Pancras, Victoria, Paddington, London Bridge, Liverpool Street, Kings Cross, Euston, Natural History Museum]
Residential catchment of Twitter users 
•First establish which users have tweeted from inside the building 
•Create a customer catchment by identifying all of these users’ tweets sent from domestic land uses 
•E.g. ASDA in Clapham Junction 
The Twitter residential catchment of ASDA Supermarket at Clapham Junction
Any Questions ? 
Thanks for Listening

More Related Content

Viewers also liked

Multimedia Data Collection using Social Media Analysis
Multimedia Data Collection using Social Media Analysis Multimedia Data Collection using Social Media Analysis
Multimedia Data Collection using Social Media Analysis Benoit HUET
 
Friendship and mobility user movement in location based social networks
Friendship and mobility user movement in location based social networksFriendship and mobility user movement in location based social networks
Friendship and mobility user movement in location based social networksFread Mzee
 
Statistical analytical programming for social media analysis .
Statistical analytical programming for social media analysis .Statistical analytical programming for social media analysis .
Statistical analytical programming for social media analysis .Felicita Florence
 
20140329 modern logging and data analysis pattern on .NET
20140329 modern logging and data analysis pattern on .NET20140329 modern logging and data analysis pattern on .NET
20140329 modern logging and data analysis pattern on .NETTakayoshi Tanaka
 
A guide to realistic social media and measurement
A guide to realistic social media and measurementA guide to realistic social media and measurement
A guide to realistic social media and measurementAdam Vincenzini
 
Usage and consumption pattern of Social Media- Girish.Havale
Usage and consumption pattern of Social Media- Girish.HavaleUsage and consumption pattern of Social Media- Girish.Havale
Usage and consumption pattern of Social Media- Girish.HavaleGirish Havale
 
RSC: Mining and Modeling Temporal Activity in Social Media
RSC: Mining and Modeling Temporal Activity in Social MediaRSC: Mining and Modeling Temporal Activity in Social Media
RSC: Mining and Modeling Temporal Activity in Social MediaAlceu Ferraz Costa
 
Picturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter SchoolPicturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter SchoolFarida Vis
 
Researching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media AnalysisResearching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media AnalysisFarida Vis
 
7 Hot Location-Based Apps You Should Know About
7 Hot Location-Based Apps You Should Know About7 Hot Location-Based Apps You Should Know About
7 Hot Location-Based Apps You Should Know AboutShauna Causey
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyMark Rittman
 
Human mobility,urban structure analysis,and spatial community detection from ...
Human mobility,urban structure analysis,and spatial community detection from ...Human mobility,urban structure analysis,and spatial community detection from ...
Human mobility,urban structure analysis,and spatial community detection from ...Song Gao
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreWael Elrifai
 
Location Based services
Location Based servicesLocation Based services
Location Based servicesFraj Alshahibi
 
Digital, Social & Mobile in China in 2015
Digital, Social & Mobile in China in 2015Digital, Social & Mobile in China in 2015
Digital, Social & Mobile in China in 2015We Are Social Singapore
 
Big Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network ApproachBig Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network ApproachAndry Alamsyah
 
Social Media Analytics Demystified
Social Media Analytics DemystifiedSocial Media Analytics Demystified
Social Media Analytics DemystifiedDebra Askanase
 
Analytics for Social Media
Analytics for Social MediaAnalytics for Social Media
Analytics for Social MediaDavid King
 

Viewers also liked (20)

Multimedia Data Collection using Social Media Analysis
Multimedia Data Collection using Social Media Analysis Multimedia Data Collection using Social Media Analysis
Multimedia Data Collection using Social Media Analysis
 
Friendship and mobility user movement in location based social networks
Friendship and mobility user movement in location based social networksFriendship and mobility user movement in location based social networks
Friendship and mobility user movement in location based social networks
 
Statistical analytical programming for social media analysis .
Statistical analytical programming for social media analysis .Statistical analytical programming for social media analysis .
Statistical analytical programming for social media analysis .
 
20140329 modern logging and data analysis pattern on .NET
20140329 modern logging and data analysis pattern on .NET20140329 modern logging and data analysis pattern on .NET
20140329 modern logging and data analysis pattern on .NET
 
A guide to realistic social media and measurement
A guide to realistic social media and measurementA guide to realistic social media and measurement
A guide to realistic social media and measurement
 
Usage and consumption pattern of Social Media- Girish.Havale
Usage and consumption pattern of Social Media- Girish.HavaleUsage and consumption pattern of Social Media- Girish.Havale
Usage and consumption pattern of Social Media- Girish.Havale
 
RSC: Mining and Modeling Temporal Activity in Social Media
RSC: Mining and Modeling Temporal Activity in Social MediaRSC: Mining and Modeling Temporal Activity in Social Media
RSC: Mining and Modeling Temporal Activity in Social Media
 
Picturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter SchoolPicturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter School
 
Researching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media AnalysisResearching Social Media – Big Data and Social Media Analysis
Researching Social Media – Big Data and Social Media Analysis
 
7 Hot Location-Based Apps You Should Know About
7 Hot Location-Based Apps You Should Know About7 Hot Location-Based Apps You Should Know About
7 Hot Location-Based Apps You Should Know About
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
 
Human mobility,urban structure analysis,and spatial community detection from ...
Human mobility,urban structure analysis,and spatial community detection from ...Human mobility,urban structure analysis,and spatial community detection from ...
Human mobility,urban structure analysis,and spatial community detection from ...
 
Social media with big data analytics
Social media with big data analyticsSocial media with big data analytics
Social media with big data analytics
 
5 Big Data Use Cases for 2013
5 Big Data Use Cases for 20135 Big Data Use Cases for 2013
5 Big Data Use Cases for 2013
 
Social network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and moreSocial network analysis & Big Data - Telecommunications and more
Social network analysis & Big Data - Telecommunications and more
 
Location Based services
Location Based servicesLocation Based services
Location Based services
 
Digital, Social & Mobile in China in 2015
Digital, Social & Mobile in China in 2015Digital, Social & Mobile in China in 2015
Digital, Social & Mobile in China in 2015
 
Big Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network ApproachBig Data Analytics : A Social Network Approach
Big Data Analytics : A Social Network Approach
 
Social Media Analytics Demystified
Social Media Analytics DemystifiedSocial Media Analytics Demystified
Social Media Analytics Demystified
 
Analytics for Social Media
Analytics for Social MediaAnalytics for Social Media
Analytics for Social Media
 

Similar to Analysing the digital traces of Social Media users

Open Data: Analysis and Visualisation
Open Data: Analysis and VisualisationOpen Data: Analysis and Visualisation
Open Data: Analysis and VisualisationDr Muhammad Adnan
 
Phd Colloquium Spatial Analysis
Phd Colloquium Spatial AnalysisPhd Colloquium Spatial Analysis
Phd Colloquium Spatial Analysisalistairleak
 
The language of social media
The language of social mediaThe language of social media
The language of social mediaDiana Maynard
 
Evaluating the Utility of Geo-referenced Twitter Data as a Source of Reliable...
Evaluating the Utility of Geo-referenced Twitter Data as a Source of Reliable...Evaluating the Utility of Geo-referenced Twitter Data as a Source of Reliable...
Evaluating the Utility of Geo-referenced Twitter Data as a Source of Reliable...Guy Lansley
 
Going beyond google 2 philadelphia loss conference
Going beyond google 2 philadelphia loss conferenceGoing beyond google 2 philadelphia loss conference
Going beyond google 2 philadelphia loss conferencemikep007
 
Four Corners of the Big Tent
Four Corners of the Big TentFour Corners of the Big Tent
Four Corners of the Big TentJohn Bradley
 
Social Media News Revolution 2014
Social Media News Revolution 2014Social Media News Revolution 2014
Social Media News Revolution 2014Sue Robinson
 
Using Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherUsing Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherSimon Bishop
 
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017SoBigData - Exploring human mobility and migration with BigData @ NTTS2017
SoBigData - Exploring human mobility and migration with BigData @ NTTS2017Vittorio Romano
 
NGOs responding to Crisis: Using Social Media to Meet New Challenges, The Cas...
NGOs responding to Crisis: Using Social Media to Meet New Challenges, The Cas...NGOs responding to Crisis: Using Social Media to Meet New Challenges, The Cas...
NGOs responding to Crisis: Using Social Media to Meet New Challenges, The Cas...Dlazarow
 
#InternetPart12 Berntzen Day 1
#InternetPart12 Berntzen Day 1#InternetPart12 Berntzen Day 1
#InternetPart12 Berntzen Day 1Daria S
 
Sn@tch CNI Fall 2014
Sn@tch CNI Fall 2014Sn@tch CNI Fall 2014
Sn@tch CNI Fall 2014Martin Klein
 
Using Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherUsing Twitter as a Postgraduate Researcher
Using Twitter as a Postgraduate ResearcherSimon Bishop
 
Beginner's Clinic: Why Social Media
Beginner's Clinic: Why Social MediaBeginner's Clinic: Why Social Media
Beginner's Clinic: Why Social MediaConnectVA
 
Citizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsCitizen Sensor Data Mining, Social Media Analytics and Applications
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsAmit Sheth
 
Essential Online Tools for Historical Societies
Essential Online Tools for Historical SocietiesEssential Online Tools for Historical Societies
Essential Online Tools for Historical Societiesvtrural
 
LocWeb 2016 Workshop at WWW2016
LocWeb 2016 Workshop at WWW2016LocWeb 2016 Workshop at WWW2016
LocWeb 2016 Workshop at WWW2016Dirk Ahlers
 
GIS Analysis Of The Dequindre Cut
GIS Analysis Of The Dequindre CutGIS Analysis Of The Dequindre Cut
GIS Analysis Of The Dequindre CutJustin Lyons
 
Media Ecology Project slides from Open Repositories 2015
Media Ecology Project slides from Open Repositories 2015Media Ecology Project slides from Open Repositories 2015
Media Ecology Project slides from Open Repositories 2015nmdjohn
 

Analysing the digital traces of Social Media users

  • 1. Analysing the digital traces of Social Media users Muhammad Adnan, Guy Lansley, Paul Longley Consumer Research Data Centre, Department of Geography, University College London Web: www.uncertaintyofidentity.com ; www.cdrc.ac.uk Twitter: @gisandtech
  • 2. Introduction •Past years have witnessed a rapid growth of the use of online services •Online shopping, bank transactions, social networking services •Issues related to cyber-crimes, identity frauds, and hacking •‘Uncertainty of Identity’ project: Combining real and virtual world datasets to better understand the identity of individuals •Real world (Census, Demographic Classifications) •Virtual world (Email addresses, Social media accounts)
  • 3. Introduction •Geodemographics •Census data represent the night time geography •Social media datasets can be used to provide day and travel time geographies •Spatial and temporal analysis of social media users •Activity pattern analysis •Tweet content analysis •Develop tools for Identity analysis •E-mail addresses •Social media accounts
  • 4. Outline •Some popular social media services •Twitter •Introduction •Case Study 1: Social Media Geodemographics •Case Study 2: Activity pattern analysis •Temporal analysis of Twitter activity around different world cities •Case Study 3: Twitter Geographic Profiler •An Uncertainty of Identity tool
  • 5.
  • 6. Some popular social media services •Facebook •2 billion total users •1.28 billion active users •Google Plus •1.6 billion total users •540 million active users •Twitter •More than 1 billion total users •255 million active users (1) Mediabistro. 2014. Social Media Stats 2014. Retrieved 17th November, 2014 from http://www.mediabistro.com/alltwitter/social-media-statistics- 2014_b57746.
  • 7. Twitter (www.twitter.com) •Online social networking and micro-blogging web service •Users can send messages of 140 characters or less •Approx. 500 million tweets daily •78% of Twitter’s active users are on mobile •44% of users have never sent a tweet (inactive users) •Twitter API: for downloading live tweet data
  • 8. Data available through the Twitter API •User Creation Date •Followers •Friends •User ID •Language •Location •Name •Screen Name •Time Zone •Geo Enabled •Latitude •Longitude •Tweet date and time •Tweet text •A database of 1.4 billion social media messages •September, 2012 – February, 2014 •Geo-tagged tweets •Latitude / Longitude
  • 9.
  • 10.
  • 11. Case Study 1: Social Media Geodemographics
  • 12. Social Media Geodemographics •Geodemographics •“Analysis of people by where they live” (2) •Night time characteristics of the population •Social Media Geodemographics •Moving beyond the night time geography •Who: Ethnicity, Gender, and Age of social media users •When: What time of day conversations happen •Where: Where social media conversations happen (2) Sleight, P. (2004). Targeting Customers: How to Use Geodemographic and Lifestyle Data in Your Business.
  • 13. Twitter data for the case study •Approx. 8 million geo-tagged tweets (Jan – Dec, 2013) •Sent by 385,050 unique users •155,249 users sent 5 or more tweets (7.6 million tweets)
  • 14. Flows of people and information •Entropy is a measure of uncertainty in a random variable •Shannon Entropy: $H(X) = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i)$ •7.6 million tweets were aggregated to 4,765 LSOAs •Entropy was calculated •High values indicate high flows of people and information
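As a rough illustration of the entropy measure on this slide, the sketch below computes Shannon entropy for one area at a time. It assumes, hypothetically, that p(x_i) is the share of an area's tweets sent by user i; the slides do not state which distribution was used, and the function name and sample data are illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(labels, base=2):
    """Shannon entropy H(X) = -sum p(x_i) * log_b p(x_i) of a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log(p, base)
    return entropy

# Hypothetical sample: the user ID behind each tweet observed in two areas.
area_tweets = {
    "area_A": ["u1", "u1", "u1", "u1"],  # a single user sends every tweet
    "area_B": ["u1", "u2", "u3", "u4"],  # four equally active users
}
entropy_by_area = {area: shannon_entropy(users) for area, users in area_tweets.items()}
```

Under this assumption, an area dominated by one user scores zero, while an area with many equally active users scores highest, which matches the slide's reading of high values as high flows.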
  • 15. Flows of people and information
  • 16. Flows of people and information: Morning (6am – 11.59am), Afternoon (12pm – 5.59pm)
  • 17. Flows of people and information: Evening (6pm – 11.59pm), Night (12 midnight – 6.59am)
  • 18. Variables for creating a geo-temporal classification 1. Residence •Where Twitter users live 2. Ethnicity •Probable ethnic origins of Twitter users 3. Age •Probable age of Twitter users 4. Land Use Category of a Tweet message •Residential; Non-domestic building; Park etc. 5. Temporal Scales •Day, Afternoon, Night, Peak travel hours
  • 19. Residence of Twitter Users •A 170 m × 170 m grid was used to find the probable residence of users •Probable residence was found for 75,522 users
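One way the grid step might be sketched is shown below. The slides do not say how a cell was chosen as the probable residence, so this sketch assumes the most frequently used cell; it also assumes coordinates are already projected to metres (for example eastings and northings). The function names, the `min_tweets` threshold, and the sample coordinates are all hypothetical.

```python
from collections import Counter

CELL_SIZE_M = 170  # grid resolution quoted on the slide

def grid_cell(easting, northing, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a point."""
    return (int(easting // cell_size), int(northing // cell_size))

def probable_residence(points, min_tweets=5):
    """Return the most frequently used cell, or None if there are too few tweets.

    Assumption: 'probable residence' means the modal cell of a user's tweets.
    """
    if len(points) < min_tweets:
        return None
    cells = Counter(grid_cell(e, n) for e, n in points)
    cell, _ = cells.most_common(1)[0]
    return cell

# Hypothetical projected coordinates (metres) for one user's tweets.
tweets = [(530010, 180020), (530050, 180100), (530100, 180090),
          (531900, 182500), (530020, 180030)]
home_cell = probable_residence(tweets)
```

A fuller version would probably restrict the count to night-time or residential land-use tweets, but that refinement is not described in the slides.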
  • 20. Extracting demographic attributes of Twitter users by using their forenames and surnames A name is a statement of the bearer’s cultural, ethnic, and linguistic identity (3) (3) Mateos P, Longley P A, O’Sullivan D 2011. Ethnicity and population structure in personal naming networks. PloS ONE (Public Library of Science) 6 (9) e22943.
  • 21. Analysing Names on Twitter •Some examples of NAME variations on Twitter •Approx. 68% of the accounts have real names Fake Names Castor 5. WHAT IS LOVE? MysticMind KIRILL_aka_KID Vanessa Justin Bieber Home Real Names Kevin Hodge Andre Alves Jose de Franco Carolina Thomas, Dr. Prof. Martha Del Val Fabíola Sanchez Fernandes
  • 22. Onomap: Names to Ethnicity classification •Onomap was created by clustering names of 1 billion individuals around the world •Applied ONOMAP (www.onomap.org) on forename – surname pairs Kevin Hodge (English) Pablo Mateos (Spanish) … … … …
  • 23. Top 10 Ethnic Groups of Twitter Users •A total of 67 ethnic groups were identified
  • 24. Age estimation from ‘forenames’ •Monica dataset provided by CACI Ltd, UK •Supplemented with UK birth certificate records
  • 25. Age distribution of Twitter users Twitter Users vs. 2011 Census (Greater London) (4) Longley, P., Adnan, M., Lansley, G. 2013. “The geo-temporal demographics of Twitter usage”. Environment and Planning A. (In Press)
  • 26. Land-use Categories •Every tweet message was assigned a land-use category
  • 27. Variables for creating a geo-temporal classification 1. Residence V1: Tweet made near probable London residence V2: Tweeter lives ‘outside the UK’ V3: Tweeter lives in the rest of the UK outside London 2. Total Number of Tweets V4: Total number of tweets made by the user 3. Ethnicity V5: West European V6: East European V7: Greek or Turkish V8: South East Asian V9: Other Asian V10: African & Caribbean V11: Jewish V12: Chinese V13: Other minority 4. Age V14: <=20 V15: 21 - 30 V16: 31 - 40 V17: 41 - 50 V18: 50+ 5. Tweets outside the UK V19: In West Europe (not including UK) V20: In East Europe V21: In North America V22: In Central or South America V23: In Australasia V24: In Africa V25: In Middle East V26: In Asia V27: In Paris
  • 28. Variables for creating a geo-temporal classification 6. Number of countries visited V28: Number of countries tweeter has visited 7. London Land Use Category V29: Residential location V30: Non-domestic buildings V31: Transport links and locations V32: Green-spaces V33: All other land uses 8. 2011 London Output Area Classification V34: Intermediate Lifestyles V35: High Density and High Rise Flats V36: Settled Asians V37: Urban Elites V38: City Vibe V39: London Life-Cycle V40: Multi-Ethnic Suburbs V41: Ageing-City Fringe 9. Temporal Scales V42: Morning Peak Hours V43: Week Day V44: Afternoon V45: Week Night V46: Weekend
  • 29. Computing the geo-temporal classifications •Segmentations were created by using the K-means clustering algorithm •K-means tries to find cluster centroids by minimising $V = \sum_{x=1}^{n} \sum_{y=1}^{n} (z_x - \mu_y)^2$ •Seven clusters •Group A: London Residents •Group B: Commuting Professionals •Group C: Student Lifestyle •Group D: The Daily Grind •Group E: Spectators •Group F: Visitors •Group G: Workplace and tourist activity
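For readers unfamiliar with K-means, here is a minimal sketch of the generic algorithm (Lloyd's iteration) in plain Python. It is not the authors' implementation: in the study each point would presumably be a user's vector of the 46 variables with k = 7, whereas the toy data, function names, and seed below are illustrative assumptions.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

def within_cluster_ss(points, centroids, labels):
    """The quantity K-means minimises: summed squared distance to assigned centroids."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))

# Toy two-dimensional data with two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
wss = within_cluster_ss(points, centroids, labels)
```

Because the result depends on the initial centroids, practical runs usually repeat the algorithm from several random starts and keep the solution with the lowest within-cluster sum of squares.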
  • 30. Group A: London Residents •Tweets made near primary residential locations •Tweets made on weeknights or weekends [Chart: cluster profile across variables V1–V46, scale 0–1]
  • 31. Group B: Commuting Professionals •Tweets made from •Transport locations •‘Urban Elites’ LOAC classification •Tweets made by individuals of intermediate age (21-30) [Chart: cluster profile across variables V1–V46, scale 0–1]
  • 32. Group F: Visitors •Tweeters live outside London •Tweets originated from residential land uses •Mixed age groups [Chart: cluster profile across variables V1–V46, scale 0–1]
  • 33. Group G: Workplace and tourist activity •Tweets sent from non-domestic buildings •Full range of Twitter age cohorts •Tweets originate from a mix of residents and international visitors [Chart: cluster profile across variables V1–V46, scale 0–1]
  • 34. Social Media Geodemographics •Geo-temporal demographic classifications •Census (night time geography) •Social media data (day and travel time geography) •Issues of representation •An insight into the residential and travel geographies of individuals •An insight into the spatial activity patterns of different kinds of social media users
  • 35. Case Study 2: Analysis of Twitter activity around world cities (5) Muhammad Adnan, Alistair Leak, Paul Longley. “A geocomputational analysis of Twitter activity around different world cities”. Geospatial Information Science.
  • 36. Activity Pattern Analysis •Comparison of the use of Twitter between different cities •Weekly patterns of activity •Seasonal shifts •Data: 19th September, 2012 – 25th September, 2013 •Point-in-polygon operations were performed to extract data for different cities around the world •Approx. 170 million tweets were sent from the top 30 cities
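The point-in-polygon step can be illustrated with the standard ray-casting test. This is only a sketch: the slides do not say which GIS tool or city boundaries were used, and the rectangle below is a made-up stand-in for a city polygon, not a real boundary.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    `polygon` is a list of (lon, lat) vertices; the last vertex connects to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangle standing in for a city boundary (not a real boundary).
city_polygon = [(-0.51, 51.28), (0.33, 51.28), (0.33, 51.69), (-0.51, 51.69)]

tweets = [(-0.1276, 51.5072), (2.3522, 48.8566)]  # (lon, lat) pairs
in_city = [point_in_polygon(lon, lat, city_polygon) for lon, lat in tweets]
```

For hundreds of millions of tweets, a spatial index or a bounding-box pre-filter would normally precede the exact test.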
  • 37. Top 30 cities on Twitter •Approx. 170 million tweets were sent from the following 30 cities. [Bar chart: Number of Tweets (Millions) per city]
  • 38. Time zone issue •By default, the Twitter API sends timestamps in GMT (UTC) •Data was converted from GMT to the corresponding local time zones •Example, Date & Time (GMT) → Date & Time (UTC +1): Wed Dec 05 00:04:23 2012 → Wed Dec 05 01:04:23 2012; Wed Dec 05 00:06:29 2012 → Wed Dec 05 01:06:29 2012; Wed Dec 05 00:07:35 2012 → Wed Dec 05 01:07:35 2012
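The GMT-to-local conversion in this slide's example can be sketched with Python's standard library. The timestamp format string is an assumption based on the example rows, and the fixed one-hour offset mirrors the UTC +1 column; a fixed offset ignores daylight saving time, so named zones (for example via the standard `zoneinfo` module) would be the safer choice for year-long data.

```python
from datetime import datetime, timedelta, timezone

# Assumed timestamp layout, taken from the example rows on the slide.
FMT = "%a %b %d %H:%M:%S %Y"

def to_local(timestamp_gmt, utc_offset_hours):
    """Convert a GMT timestamp string to a fixed-offset local time string."""
    parsed = datetime.strptime(timestamp_gmt, FMT).replace(tzinfo=timezone.utc)
    local = parsed.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.strftime(FMT)

converted = to_local("Wed Dec 05 00:04:23 2012", 1)
```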
  • 39. Temporal Analysis of Twitter Cities Jakarta Istanbul Paris Sao Paulo, Brazil New York City London
  • 40. Temporal Analysis of Twitter Cities Riyadh Tokyo Madrid Buenos Aires, Argentina
  • 41. Temporal Analysis of Twitter Cities London
  • 42. Temporal Analysis of Twitter Cities London Paris
  • 43. Temporal Analysis of Twitter Cities Jakarta
  • 44. Temporal Analysis of Twitter Cities Jakarta Riyadh
  • 45. Temporal Analysis of Twitter Cities New York City
  • 46. Temporal Analysis of Twitter Cities New York City Tokyo
  • 47. Case Study 3: Twitter Geographic Profiler (a part of Uncertainty of Identity Toolkit)
  • 48. Introduction •Uncertainty of Identity Toolkit is a framework for the identification and profiling of individuals from their •Social media accounts •E-mail addresses •Twitter Geographic Profiler •Maps ethno-cultural communities of a person’s friends •Extracting identities of Twitter users •Mapping them to probable ethnic origins •Could have potential applications in targeted marketing
  • 49. Twitter Geographic Profiler •Given an individual’s Twitter Username or ID •Extracts the information of the individual’s friends •Extracts the forename-surname pairs of the friends •Maps forename-surname pairs to Onomap •Builds an ethno-cultural profile of the person’s friends •Maps the geographic distribution
  • 50. Data available through the Twitter API •User ID •User Creation Date •Followers •Friends •Language •Location •Name •Screen Name or User Name •Time Zone •Geo Enabled •Latitude •Longitude •Tweet date and time •Tweet text
  • 51. Twitter: getting the ids and usernames •Given a Twitter username of a person, we use the Twitter API to get the list of friends’ ids –A max of 15 requests every 15 minutes is allowed –Each query can get up to 5000 ids –Generally enough to download all the ids •Using the ids, we fetch the name associated to each id –Limited to 180 requests every 15 min –Returns a single string from which we need to extract the name and surname tokens –Not necessarily a valid forename + surname! •E.g., “University of Birmingham”, “John1965”, “ What is Love”, “Mystic_mind”
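To see what these limits mean in practice, the following back-of-envelope sketch estimates the waiting time imposed by the rate limits quoted on the slide. The number of names returned per lookup request is not given in the slides, so `names_per_request` is a labelled assumption, and the function names are hypothetical.

```python
import math

WINDOW_MINUTES = 15  # length of one rate-limit window, from the slide

def windows_needed(n_requests, requests_per_window):
    """Number of rate-limit windows required to issue n_requests."""
    return math.ceil(n_requests / requests_per_window) if n_requests else 0

def estimate_wait_minutes(n_friends,
                          ids_per_request=5000,          # from the slide
                          id_requests_per_window=15,     # from the slide
                          names_per_request=100,         # ASSUMPTION, not in the slides
                          name_requests_per_window=180): # from the slide
    """Minutes spent waiting for rate-limit windows to reset (network time excluded)."""
    id_requests = math.ceil(n_friends / ids_per_request)
    name_requests = math.ceil(n_friends / names_per_request)
    # Requests in the first window need no waiting; each further window adds 15 minutes.
    id_wait = max(windows_needed(id_requests, id_requests_per_window) - 1, 0)
    name_wait = max(windows_needed(name_requests, name_requests_per_window) - 1, 0)
    return (id_wait + name_wait) * WINDOW_MINUTES

typical = estimate_wait_minutes(2_000)
large = estimate_wait_minutes(50_000)
one_name_per_call = estimate_wait_minutes(2_000, names_per_request=1)
```

Under these assumptions a user with a couple of thousand friends is processed within a single window, which is consistent with the slide's remark that the limits are generally sufficient.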
  • 52. Twitter: getting forename-surname pairs •Name field was divided into different tokens •Forenames and surnames were detected by matching the string tokens against a database of forename-surname pairs from 26 countries •Users were discarded where tokens could not be matched against a valid forename and surname
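A simplified sketch of the tokenising step follows. It assumes the rule that the first token is the forename and the last token is the surname, with single-token strings discarded; the optional lookup sets are tiny hypothetical stand-ins for the 26-country names database, which is not reproduced here.

```python
def extract_name_pair(name_field, forenames=None, surnames=None):
    """Return (forename, surname) from a profile name, or None if it looks invalid.

    Assumed rule: first token = forename, last token = surname.
    `forenames` / `surnames` are optional sets of known names used for validation.
    """
    tokens = name_field.replace(",", " ").split()
    if len(tokens) < 2:
        return None  # single-token strings such as "John1965" are discarded
    forename, surname = tokens[0].title(), tokens[-1].title()
    if forenames is not None and forename not in forenames:
        return None
    if surnames is not None and surname not in surnames:
        return None
    return forename, surname

# Hypothetical miniature lookup tables.
KNOWN_FORENAMES = {"Kevin", "Pablo"}
KNOWN_SURNAMES = {"Hodge", "Mateos"}

profile_names = ["Kevin Hodge", "John1965", "University of Birmingham", "What is Love"]
pairs = [extract_name_pair(n, KNOWN_FORENAMES, KNOWN_SURNAMES) for n in profile_names]
```

Without the lookup sets, strings such as "University of Birmingham" would pass as a name pair, which is why the validation against known names matters.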
  • 53. Onomap: from names to ethnicity •ONOMAP (www.onomap.org) was applied on forename – surname pairs Kevin Hodge (English) Pablo Mateos (Spanish) … … … …
  • 54. Friends’ Ethnicity Histogram Once the entire list of friends’ name + surname pairs has been parsed, we can easily estimate the distribution over the set of possible ethno-cultural groups of the Twitter user's friends. With the list of (surname, forename) pairs to hand, we query Onomap to get the ethno-cultural classification associated with each (surname, forename) pair, and the SearchSurnameTopCountries method to get the list of the countries where an instance of a given surname was observed. Figure 1: Screenshot of the Twitter Geographic Profiler. The bottom part of the screen shows the histogram of the Twitter user's friends’ ethno-cultural groups.
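The histogram itself amounts to counting classifications and normalising. In the sketch below, `classify` is a hypothetical stand-in for a call to Onomap, whose real interface is not reproduced here; the toy lookup reuses only the two example names from the slides.

```python
from collections import Counter

def ethnicity_histogram(name_pairs, classify):
    """Relative frequency of each ethno-cultural group among classified friends.

    `classify(forename, surname)` returns a group label or None when unclassified.
    """
    counts = Counter()
    for forename, surname in name_pairs:
        group = classify(forename, surname)
        if group is not None:
            counts[group] += 1
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()} if total else {}

# Hypothetical stand-in classifier keyed on surname only.
TOY_LOOKUP = {"Hodge": "English", "Mateos": "Spanish"}

def toy_classify(forename, surname):
    return TOY_LOOKUP.get(surname)

friends = [("Kevin", "Hodge"), ("Pablo", "Mateos"), ("Anna", "Hodge"), ("Sam", "Unknown")]
histogram = ethnicity_histogram(friends, toy_classify)
```

Unclassified friends are left out of the denominator here; whether the tool treats them the same way is not stated in the slides.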
  • 55. Friends’ Geographic Origins Figure 2: Map showing the geographical origin of the Twitter user's friends’ surnames as assigned by our tool. Below the map the user is shown a list of the top 10 countries with the respective frequency. In this work we mark as invalid any string that is composed of a single token. If this is the case, we skip the profile of the corresponding friend. If the string contains two or more tokens, we take the first one to be the forename and the last one to be the surname.
  • 56. Twitter Geographic Profiler •Potential applications include –Measure the level of segregation/integration of a given individual (community) as the Shannon entropy of the (average) friends’ ethnicity histogram –Outlier detection: identify uncommon behaviors, e.g., individuals that stand out in terms of the ethno-cultural groups they bond with •Limitations –Twitter data is very noisy –Request limits
  • 57. Conclusion •Social media datasets can be used to create Geo-temporal demographic classifications •Day and travel time geographies •Activity patterns •Temporal analysis can identify some interesting patterns of a geographical area •Weekly patterns of activity •Seasonal shifts •Twitter Geographic Profiler: Identification and profiling of ethno-cultural characteristics of individuals •From their Twitter accounts
  • 58. Conclusion •Study of privacy implications on social media services •Facebook, FourSquare •Future work: Consumer Data Research Centre •Use of social media for retail sector •Spatial and temporal catchments of the social media users
  • 59. Time catchments •E.g. Day-time catchment 1. Identify the unique ID of users frequently transmitting from a particular location at a given time or date range 2. Request their other activity through Twitter’s API, filter by time/date 3. Aggregate [Figures: The Twitter work-day time catchment of Bishopsgate; Activity at Bishopsgate in 2013]
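The three catchment steps can be sketched on an in-memory list of tweets. This assumes the users' other activity has already been downloaded, so no API call is shown; the record layout, the `min_tweets` threshold for "frequently", and the sample place names are hypothetical.

```python
from collections import Counter

def frequent_users(tweets, location, hours, min_tweets=2):
    """Step 1: users who tweeted from `location` at least `min_tweets` times in `hours`."""
    counts = Counter(t["user"] for t in tweets
                     if t["location"] == location and t["hour"] in hours)
    return {user for user, count in counts.items() if count >= min_tweets}

def catchment(tweets, location, hours, min_tweets=2):
    """Steps 2-3: collect those users' tweets elsewhere and aggregate them by location."""
    users = frequent_users(tweets, location, hours, min_tweets)
    return Counter(t["location"] for t in tweets
                   if t["user"] in users and t["location"] != location)

# Hypothetical sample records: one dict per tweet.
tweets = [
    {"user": "a", "location": "Bishopsgate", "hour": 10},
    {"user": "a", "location": "Bishopsgate", "hour": 14},
    {"user": "a", "location": "Hackney", "hour": 21},
    {"user": "a", "location": "Hackney", "hour": 22},
    {"user": "b", "location": "Bishopsgate", "hour": 11},
    {"user": "b", "location": "Croydon", "hour": 20},
]
work_hours = range(9, 18)
workday_catchment = catchment(tweets, "Bishopsgate", work_hours)
```

In this toy data only user "a" counts as a frequent work-day visitor, so the catchment contains that user's tweets from elsewhere and nothing from user "b".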
  • 60. Waterloo, St Pancras, Victoria, Paddington, London Bridge, Liverpool Street, Kings Cross, Euston, Natural History Museum
  • 61. Residential catchment of Twitter users •First establish which users have tweeted from inside the building •Create a customer catchment by identifying all of these users’ tweets sent from domestic land uses •E.g. ASDA in Clapham Junction [Map: The Twitter residential catchment of ASDA Supermarket at Clapham Junction]
  • 62. Any questions? Thanks for listening