SlideShare a Scribd company logo
1 of 23
Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter March 29, 2011 SoME 2011 (In Conjunction with WWW 2011) Hemant Purohit1, Yiye Ruan2, Amruta Joshi2, Srinivasan Parthasarthy2, Amit Sheth1 1Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA 2Dept. of Computer Science & Engineering Ohio State University, USA
Outline (What) User-Community Engagement (Why) Motivation (How) Problem Formalization Approach Terminology Definition Analysis Framework People-Content-Network Analysis (PCNA) Experiments  Datasets and Event Categorization Features Results Insights Conclusion & Future work 2
User-communityEngagement Multiple topics surrounding events being discussed on social media Each topic constitutes a community of users discussing about it e.g., Japan Earthquake community How do we understand  the phenomenon of user participation (engagement) in topic discussions 3 Image: http://itcilo.wordpress.com
Motivation User Engagement Analysis Business How communities form during the product launch? What factors can attract users to engage in these communities, therefore further spreading the message? Crisis Management Effective communication: How quickly we can disseminate information between resource providers and people in need of resources? 4
Problem formalization: Approach User engagement has been studied in many forms:  Community Formation & Detection, Information Propagation, Link Prediction etc. It involves a three-dimensional dynamic at play:  Content: topic of interest,  People: participants who engages in discussion about the topic, and Network: community structure formed around the topic discussion Rather than limiting to one dimension, we propose multidimensional approach Case Study on Twitter 5
Earlier Approaches 6 Content: Topic of Interest OR People: Participant of the discussion OR Network: Community  around topic Images: tupper-lake.com/.../uploads/Community.jpg                 http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
Our Approach 7 Content: Topic of Interest AND People: Participant of the discussion AND Network: Community  around topic Images: tupper-lake.com/.../uploads/Community.jpg                 http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
Problem formalization: Terminology Event-Oriented Community An implicit group of social network users who have joined discussion (by message posting) on topic about an event. Slice Collection of messages relevant to topic of discussion, posted during a fixed-length time window. Snapshot State of the network at a certain point of time at which user profile and connection information are crawled. Active Window: freshness matters! Active Community 8
Problem formalization: Definition Binary classification problem for user-community link prediction. User Engagement Prediction Problem: Given  1) an event-oriented community C formed around a topic of discussion;  2) a Twitter user U ε C,  Predict whether U will be engaged in C (by composing a new tweet or retweeting an existing tweet which contains keywords or hashtag related to C's underlying event) in a future slice. If so, U is said to be a positive record. Otherwise, it is a negative record. 9
Analysis Framework:People-Content-network Analysis (PCNA) 10
Experiments: Dataset & Event categorization Study on Twitter data Events have various characteristics and we hypothesize the user engagement analysis for them being affected by different variables No standard event categorization is available, so we categorize the events observing data over a time, as follows: Global (G) vs. Local (L)[e.g., Japan Earthquake vs. Iowa State Fair] Deterministic (D) vs. Unexpected (U)[e.g., Emmy Awards vs. Japan Earthquake] Compact (C) vs. Loose (Ls)[e.g., ISWC conference vs. Japan Earthquake] Transient (T) vs. Lasting (Lt)[e.g., President’s Speech vs. Egypt Revolution] 11
ClevelandShowPremiere: Second Season premiere of animated TV series Cleveland Show. September 26. Global, loose, deterministic, transient. DiscoveryBuildingCrisis: Hostage crisis at the head- quarters of Discovery Channel, Maryland. September 1. Local, loose, unexpected, transient. EmmyAwards: 62nd Prime-time Emmy Awards. August 29. Global, loose, deterministic, lasting. GoogleInstantSearch: Launch of Google Instant in United States. September 8. Global, loose, unexpected, transient. HeismanTrophy: Reggie Bush’s announcement to forfeit 2005 Heisman Trophy. September 14. Local, compact, unexpected, lasting. IowaStateFair: Iowa State Fair. August 12-22. Local, loose, deterministic, lasting. JewishNewYear: Jewish New Year 5771. September 8-10. Global, compact, deterministic, transient. LindsayLohanHearing: LindsayLohan’s hearing on probation revocation and verdict. September 24. Local, loose, deterministic, transient. LinuxCon: Annual convention organized by Linux Foundation. August 10-12. Global, compact, deterministic, lasting. LondonTubeStrike: London tube strike. September 6. Local, loose, deterministic, transient. RichCroninDeath: Death of singer and songwriter Rich Cronin. September 8. Local, loose, unexpected, transient. ScottPilgrimRelease: Release of movie Scott Pilgrim vs. the World. Aug 13. Global, loose, deterministic, lasting. SESSanFrancisco: Search Engine Strategies 2010 at San Francisco. August 16-20. Global, compact, deterministic, lasting. StuxnetWorm: Confirmation of Stuxnet worm at- tack on Iranian nuclear program. September 24. Global, loose, unexpected, lasting. 12 Events  (labeled as per our categorization)
Experiments: Features Organized in the PCNA framework: Node/Author features (P), Content features (C), Community features (N) Extracted for each potential community member (U) in each slice, where U belongs to the union of follower lists of each active community member 13 Followee of (U) Whole Topic Community Potential new Member (U) EDGE: Active Community A B If B follows A
Experiments: Features (cont.) ,[object Object],wccSize: size of the weakly-connected component (WCC) which U’s friends belongs to in the active network. wccPercent: ratio of wccSizeto the size of the active network. connectivity: number of active friends (i.e. followees) in the community. communitySize: size of the active community. Author features [Characteristics of friends that U is following]: Only friends in the active community are considered. logFollower: logarithm of follower count logFollowee: logarithm of followee count Klout[1]: a integrated measure of user influence and popularity Other profile information and activity history[2]. [1] http://www.klout.com [2] Future works 14
Experiments: Features (cont.) Content features[Characteristics of tweets posted by active friends of U]: keywords: number of event-relevant keywords hashtags: number of event-relevant hashtags retweet: number of retweets mention: number of mentions url: number of relevancy-adjust hyperlinks Irrelevant hyperlink is given number -1 subjectivity: Subjectivity scores for words and emoticons Linguistic Cues (LIWC1 analysis): Features for the language usage. Top-3 transformed features using Principle Component Analysis (PCA) extracted 15 1http://www.liwc.net
Wait a minute!  Not all contents have been viewed! Novelty and Attention: User is likely to see new or recent content/tweet and then join the community Apply temporal weighting on the features  Dataset imbalance: too many negative records! Alleviated by SMOTE method Over-sampling on positive records and under-sampling on negative ones Not all users are active! Apply weighting on activity level based on last activity[1] 16 [1] Future works
Experiments We run the following experiment groups: allFeatures (All): contains all three feature groups onlyContent (Con.): contains only content feature onlyAuthor (Aut.): contains only author feature onlyCommunity (Com.): contains only community feature SVM classifier LibSVM, RBF Kernel, gamma=8, c=32 17
Experiments: Results 18 Event-Type Summary of Prediction Accuracy (%) Statistical significant results are in bold
Insights Performance of onlyCommunity classifiers is worst The latent nature of network features makes it difficult to be perceived by a user directly. The onlyContent classifiers give the best performance over other single feature groups Some users end up participating in a discussion based on observing the information from the public timeline, and therefore, these ad-hoc users are hard to observe via network analysis only. Content is engaging by its quality and nature (information sharing or call for an action or crowd sourcing). For example, link to an image or video (an evidential content) about Reggie Bush's surrender of Heisman Trophy in September, 2010 is likely to provoke lot more thoughts in a user's mind to engage in the discussion. 19
Insights (Cont.) Comparable performance of onlyAuthor classifiers as onlyContent classifiers for some of the topics Impact of the effective presence of influential people in the discussion group Insufficiency in content features, reflected by low average connectivity, can be compensated by author features (e.g., Rich Cronin Death). Statistical significance testing method shows allFeatures classifiers have better or equivalent performance over any single feature group classifier for 12 out of 14 topics The advantage of using all features is dominant, where degree of randomness in individual dimensions can be really high (e.g., Discovery Building attack). 20
Insights (Cont.) No significant correlation between selection of feature groups and the event types: lasting vs. transient.  Possibility of the shift in the characteristics over time Advantage of allFeatures over other factor groups is generally stronger on the unexpected topics than the deterministic ones. Degree of randomness being high in discussions surrounding unexpected events 21
Conclusion & Future Work Every dimension (People, Content, Network) cannot be expected to perform well in all types of topic discussions, and hence, a strong need can be felt to study dynamics of user engagement by using the PCNA framework. Experiments with a more refined event types taxonomy and user engagement factors, with consideration of shift in the event characteristics over time Semantic Analysis of content to enhance content features Experiment on other social networks: Forums, DBLP 22
Questions? Paper at: http://knoesis.org/library/resource.php?id=1095 More on Social Media @ Kno.e.sis at http://knoesis.org/research/semweb/projects/socialmedia/ 23

More Related Content

What's hot

2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media snaMarc Smith
 
Expectations for Electronic Debate Platforms as a Function of Application Domain
Expectations for Electronic Debate Platforms as a Function of Application DomainExpectations for Electronic Debate Platforms as a Function of Application Domain
Expectations for Electronic Debate Platforms as a Function of Application DomainIJERA Editor
 
Paper at ePart 2011: System Generated Requests for Rewriting Proposals
Paper at ePart 2011: System Generated Requests for Rewriting ProposalsPaper at ePart 2011: System Generated Requests for Rewriting Proposals
Paper at ePart 2011: System Generated Requests for Rewriting ProposalsPietro Speroni di Fenizio
 
Bowling Alone and Trust Decline in Social Network Sites
Bowling Alone and  Trust Decline in  Social Network SitesBowling Alone and  Trust Decline in  Social Network Sites
Bowling Alone and Trust Decline in Social Network SitesPaolo Massa
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...Marc Smith
 
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)WeGov project
 
Principles of New Media - Essay
Principles of New Media - EssayPrinciples of New Media - Essay
Principles of New Media - EssayMaria Gomez
 
A Question of Complexity - Measuring the Maturity of Online Enquiry Communities
A Question of Complexity - Measuring the Maturity of Online Enquiry CommunitiesA Question of Complexity - Measuring the Maturity of Online Enquiry Communities
A Question of Complexity - Measuring the Maturity of Online Enquiry CommunitiesGregoire Burel
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...ACMBangalore
 
Social media for PR Communications - Success measurement plan
Social media for PR Communications - Success measurement planSocial media for PR Communications - Success measurement plan
Social media for PR Communications - Success measurement planJose Sanchez
 
The Rogue in the Lovely Black Dress: Intimacy in World of Warcraft
The Rogue in the Lovely Black Dress: Intimacy in World of WarcraftThe Rogue in the Lovely Black Dress: Intimacy in World of Warcraft
The Rogue in the Lovely Black Dress: Intimacy in World of WarcraftTyler Pace
 
20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...Marc Smith
 
Think Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsThink Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsMarc Smith
 

What's hot (16)

2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna2015 pdf-marc smith-node xl-social media sna
2015 pdf-marc smith-node xl-social media sna
 
Expectations for Electronic Debate Platforms as a Function of Application Domain
Expectations for Electronic Debate Platforms as a Function of Application DomainExpectations for Electronic Debate Platforms as a Function of Application Domain
Expectations for Electronic Debate Platforms as a Function of Application Domain
 
Paper at ePart 2011: System Generated Requests for Rewriting Proposals
Paper at ePart 2011: System Generated Requests for Rewriting ProposalsPaper at ePart 2011: System Generated Requests for Rewriting Proposals
Paper at ePart 2011: System Generated Requests for Rewriting Proposals
 
Bowling Alone and Trust Decline in Social Network Sites
Bowling Alone and  Trust Decline in  Social Network SitesBowling Alone and  Trust Decline in  Social Network Sites
Bowling Alone and Trust Decline in Social Network Sites
 
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
2015 #MMeasure-Marc Smith-NodeXL Mapping social media using social network ma...
 
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)
“What is WeGov” - User Guide for the Phase 2 Evaluation (in English)
 
Principles of New Media - Essay
Principles of New Media - EssayPrinciples of New Media - Essay
Principles of New Media - Essay
 
A Question of Complexity - Measuring the Maturity of Online Enquiry Communities
A Question of Complexity - Measuring the Maturity of Online Enquiry CommunitiesA Question of Complexity - Measuring the Maturity of Online Enquiry Communities
A Question of Complexity - Measuring the Maturity of Online Enquiry Communities
 
Nannobloging pr
Nannobloging prNannobloging pr
Nannobloging pr
 
Group link
Group linkGroup link
Group link
 
Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...Social Network Analysis (SNA) and its implications for knowledge discovery in...
Social Network Analysis (SNA) and its implications for knowledge discovery in...
 
Jf2516311637
Jf2516311637Jf2516311637
Jf2516311637
 
Social media for PR Communications - Success measurement plan
Social media for PR Communications - Success measurement planSocial media for PR Communications - Success measurement plan
Social media for PR Communications - Success measurement plan
 
The Rogue in the Lovely Black Dress: Intimacy in World of Warcraft
The Rogue in the Lovely Black Dress: Intimacy in World of WarcraftThe Rogue in the Lovely Black Dress: Intimacy in World of Warcraft
The Rogue in the Lovely Black Dress: Intimacy in World of Warcraft
 
20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...20121010 marc smith - mapping collections of connections in social media with...
20121010 marc smith - mapping collections of connections in social media with...
 
Think Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming SkillsThink Link: Network Insights with No Programming Skills
Think Link: Network Insights with No Programming Skills
 

Similar to Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter

Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Artificial Intelligence Institute at UofSC
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfBalasundaramSr
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social mediarangesharp
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsMatthew Rowe
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignJonathan Stray
 
Feedback Effects Between Similarity And Social Influence In Online Communities
Feedback Effects Between Similarity And Social Influence In Online CommunitiesFeedback Effects Between Similarity And Social Influence In Online Communities
Feedback Effects Between Similarity And Social Influence In Online CommunitiesPaolo Massa
 
Interactive Innovation Through Social Software And Web 2.0
Interactive Innovation Through Social Software And Web 2.0Interactive Innovation Through Social Software And Web 2.0
Interactive Innovation Through Social Software And Web 2.0Thomas Ryberg
 
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...Rick Vogel
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebFabrizio Orlandi
 
Social Media Boot Camp at PACOM 3
Social Media Boot Camp at PACOM 3Social Media Boot Camp at PACOM 3
Social Media Boot Camp at PACOM 3Eric Schwartzman
 
Ieml social recommendersystems
Ieml social recommendersystemsIeml social recommendersystems
Ieml social recommendersystemsAntonio Medina
 
Identification and Characterization of Events in Social Media
Identification and Characterization of Events in Social MediaIdentification and Characterization of Events in Social Media
Identification and Characterization of Events in Social MediaHila Becker
 
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...Dobusch Leonhard
 
2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network AnalysisMarc Smith
 
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, Wikis
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, WikisSocial Web .20 Class Week 6: Lightweight Authoring, Blogs, Wikis
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, WikisShelly D. Farnham, Ph.D.
 
Trust influence and social media
Trust influence and social mediaTrust influence and social media
Trust influence and social mediaDawn Dawson
 
Can you trust everything?
Can you trust everything?Can you trust everything?
Can you trust everything?Colin Lieu
 

Similar to Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter (20)

Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Pe...
 
SocialCom09-tutorial.pdf
SocialCom09-tutorial.pdfSocialCom09-tutorial.pdf
SocialCom09-tutorial.pdf
 
Data mining for social media
Data mining for social mediaData mining for social media
Data mining for social media
 
Anticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community ForumsAnticipating Discussion Activity on Community Forums
Anticipating Discussion Activity on Community Forums
 
Frontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter DesignFrontiers of Computational Journalism week 3 - Information Filter Design
Frontiers of Computational Journalism week 3 - Information Filter Design
 
Feedback Effects Between Similarity And Social Influence In Online Communities
Feedback Effects Between Similarity And Social Influence In Online CommunitiesFeedback Effects Between Similarity And Social Influence In Online Communities
Feedback Effects Between Similarity And Social Influence In Online Communities
 
CSE509 Lecture 5
CSE509 Lecture 5CSE509 Lecture 5
CSE509 Lecture 5
 
Interactive Innovation Through Social Software And Web 2.0
Interactive Innovation Through Social Software And Web 2.0Interactive Innovation Through Social Software And Web 2.0
Interactive Innovation Through Social Software And Web 2.0
 
Research
ResearchResearch
Research
 
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...
An Empirical Study On IMDb And Its Communities Based On The Network Of Co-Rev...
 
Profiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic WebProfiling User Interests on the Social Semantic Web
Profiling User Interests on the Social Semantic Web
 
Social Media Boot Camp at PACOM 3
Social Media Boot Camp at PACOM 3Social Media Boot Camp at PACOM 3
Social Media Boot Camp at PACOM 3
 
Smbc pacom-2
Smbc pacom-2Smbc pacom-2
Smbc pacom-2
 
Ieml social recommendersystems
Ieml social recommendersystemsIeml social recommendersystems
Ieml social recommendersystems
 
Identification and Characterization of Events in Social Media
Identification and Characterization of Events in Social MediaIdentification and Characterization of Events in Social Media
Identification and Characterization of Events in Social Media
 
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...
Openness as an Organizing Principle: Open for Diversity or Exclusionary Openn...
 
2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis2010 Catalyst Conference - Trends in Social Network Analysis
2010 Catalyst Conference - Trends in Social Network Analysis
 
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, Wikis
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, WikisSocial Web .20 Class Week 6: Lightweight Authoring, Blogs, Wikis
Social Web .20 Class Week 6: Lightweight Authoring, Blogs, Wikis
 
Trust influence and social media
Trust influence and social mediaTrust influence and social media
Trust influence and social media
 
Can you trust everything?
Can you trust everything?Can you trust everything?
Can you trust everything?
 

Recently uploaded

Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 

Recently uploaded (20)

Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 

Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter

  • 1. Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter March 29, 2011 SoME 2011 (In Conjunction with WWW 2011) Hemant Purohit1, Yiye Ruan2, Amruta Joshi2, Srinivasan Parthasarthy2, Amit Sheth1 1Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State University, USA 2Dept. of Computer Science & Engineering Ohio State University, USA
  • 2. Outline (What) User-Community Engagement (Why) Motivation (How) Problem Formalization Approach Terminology Definition Analysis Framework People-Content-Network Analysis (PCNA) Experiments Datasets and Event Categorization Features Results Insights Conclusion & Future work 2
  • 3. User-communityEngagement Multiple topics surrounding events being discussed on social media Each topic constitutes a community of users discussing about it e.g., Japan Earthquake community How do we understand the phenomenon of user participation (engagement) in topic discussions 3 Image: http://itcilo.wordpress.com
  • 4. Motivation User Engagement Analysis Business How communities form during the product launch? What factors can attract users to engage in these communities, therefore further spreading the message? Crisis Management Effective communication: How quickly we can disseminate information between resource providers and people in need of resources? 4
  • 5. Problem formalization: Approach User engagement has been studied in many forms: Community Formation & Detection, Information Propagation, Link Prediction etc. It involves a three-dimensional dynamic at play: Content: topic of interest, People: participants who engages in discussion about the topic, and Network: community structure formed around the topic discussion Rather than limiting to one dimension, we propose multidimensional approach Case Study on Twitter 5
  • 6. Earlier Approaches 6 Content: Topic of Interest OR People: Participant of the discussion OR Network: Community around topic Images: tupper-lake.com/.../uploads/Community.jpg http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
  • 7. Our Approach 7 Content: Topic of Interest AND People: Participant of the discussion AND Network: Community around topic Images: tupper-lake.com/.../uploads/Community.jpg http://www.iconarchive.com/show/people-icons-by-aha-soft/user-icon.html
  • 8. Problem formalization: Terminology Event-Oriented Community An implicit group of social network users who have joined discussion (by message posting) on topic about an event. Slice Collection of messages relevant to topic of discussion, posted during a fixed-length time window. Snapshot State of the network at a certain point of time at which user profile and connection information are crawled. Active Window: freshness matters! Active Community 8
  • 9. Problem formalization: Definition Binary classification problem for user-community link prediction. User Engagement Prediction Problem: Given 1) an event-oriented community C formed around a topic of discussion; 2) a Twitter user U ε C, Predict whether U will be engaged in C (by composing a new tweet or retweeting an existing tweet which contains keywords or hashtag related to C's underlying event) in a future slice. If so, U is said to be a positive record. Otherwise, it is a negative record. 9
  • 11. Experiments: Dataset & Event categorization Study on Twitter data Events have various characteristics and we hypothesize the user engagement analysis for them being affected by different variables No standard event categorization is available, so we categorize the events observing data over a time, as follows: Global (G) vs. Local (L)[e.g., Japan Earthquake vs. Iowa State Fair] Deterministic (D) vs. Unexpected (U)[e.g., Emmy Awards vs. Japan Earthquake] Compact (C) vs. Loose (Ls)[e.g., ISWC conference vs. Japan Earthquake] Transient (T) vs. Lasting (Lt)[e.g., President’s Speech vs. Egypt Revolution] 11
  • 12. ClevelandShowPremiere: Second Season premiere of animated TV series Cleveland Show. September 26. Global, loose, deterministic, transient. DiscoveryBuildingCrisis: Hostage crisis at the head- quarters of Discovery Channel, Maryland. September 1. Local, loose, unexpected, transient. EmmyAwards: 62nd Prime-time Emmy Awards. August 29. Global, loose, deterministic, lasting. GoogleInstantSearch: Launch of Google Instant in United States. September 8. Global, loose, unexpected, transient. HeismanTrophy: Reggie Bush’s announcement to forfeit 2005 Heisman Trophy. September 14. Local, compact, unexpected, lasting. IowaStateFair: Iowa State Fair. August 12-22. Local, loose, deterministic, lasting. JewishNewYear: Jewish New Year 5771. September 8-10. Global, compact, deterministic, transient. LindsayLohanHearing: LindsayLohan’s hearing on probation revocation and verdict. September 24. Local, loose, deterministic, transient. LinuxCon: Annual convention organized by Linux Foundation. August 10-12. Global, compact, deterministic, lasting. LondonTubeStrike: London tube strike. September 6. Local, loose, deterministic, transient. RichCroninDeath: Death of singer and songwriter Rich Cronin. September 8. Local, loose, unexpected, transient. ScottPilgrimRelease: Release of movie Scott Pilgrim vs. the World. Aug 13. Global, loose, deterministic, lasting. SESSanFrancisco: Search Engine Strategies 2010 at San Francisco. August 16-20. Global, compact, deterministic, lasting. StuxnetWorm: Confirmation of Stuxnet worm at- tack on Iranian nuclear program. September 24. Global, loose, unexpected, lasting. 12 Events (labeled as per our categorization)
  • 13. Experiments: Features Organized in the PCNA framework: Node/Author features (P), Content features (C), Community features (N) Extracted for each potential community member (U) in each slice, where U belongs to the union of follower lists of each active community member 13 Followee of (U) Whole Topic Community Potential new Member (U) EDGE: Active Community A B If B follows A
  • 14.
  • 15. Experiments: Features (cont.) Content features[Characteristics of tweets posted by active friends of U]: keywords: number of event-relevant keywords hashtags: number of event-relevant hashtags retweet: number of retweets mention: number of mentions url: number of relevancy-adjust hyperlinks Irrelevant hyperlink is given number -1 subjectivity: Subjectivity scores for words and emoticons Linguistic Cues (LIWC1 analysis): Features for the language usage. Top-3 transformed features using Principle Component Analysis (PCA) extracted 15 1http://www.liwc.net
  • 16. Wait a minute! Not all contents have been viewed! Novelty and Attention: User is likely to see new or recent content/tweet and then join the community Apply temporal weighting on the features Dataset imbalance: too many negative records! Alleviated by SMOTE method Over-sampling on positive records and under-sampling on negative ones Not all users are active! Apply weighting on activity level based on last activity[1] 16 [1] Future works
  • 17. Experiments We run the following experiment groups: allFeatures (All): contains all three feature groups onlyContent (Con.): contains only content feature onlyAuthor (Aut.): contains only author feature onlyCommunity (Com.): contains only community feature SVM classifier LibSVM, RBF Kernel, gamma=8, c=32 17
  • 18. Experiments: Results 18 Event-Type Summary of Prediction Accuracy (%) Statistical significant results are in bold
  • 19. Insights Performance of onlyCommunity classifiers is worst The latent nature of network features makes it difficult to be perceived by a user directly. The onlyContent classifiers give the best performance over other single feature groups Some users end up participating in a discussion based on observing the information from the public timeline, and therefore, these ad-hoc users are hard to observe via network analysis only. Content is engaging by its quality and nature (information sharing or call for an action or crowd sourcing). For example, link to an image or video (an evidential content) about Reggie Bush's surrender of Heisman Trophy in September, 2010 is likely to provoke lot more thoughts in a user's mind to engage in the discussion. 19
  • 20. Insights (Cont.) Comparable performance of onlyAuthor classifiers as onlyContent classifiers for some of the topics Impact of the effective presence of influential people in the discussion group Insufficiency in content features, reflected by low average connectivity, can be compensated by author features (e.g., Rich Cronin Death). Statistical significance testing method shows allFeatures classifiers have better or equivalent performance over any single feature group classifier for 12 out of 14 topics The advantage of using all features is dominant, where degree of randomness in individual dimensions can be really high (e.g., Discovery Building attack). 20
  • 21. Insights (Cont.) No significant correlation between selection of feature groups and the event types: lasting vs. transient. Possibility of the shift in the characteristics over time Advantage of allFeatures over other factor groups is generally stronger on the unexpected topics than the deterministic ones. Degree of randomness being high in discussions surrounding unexpected events 21
  • 22. Conclusion & Future Work Every dimension (People, Content, Network) cannot be expected to perform well in all types of topic discussions, and hence, a strong need can be felt to study dynamics of user engagement by using the PCNA framework. Experiments with a more refined event types taxonomy and user engagement factors, with consideration of shift in the event characteristics over time Semantic Analysis of content to enhance content features Experiment on other social networks: Forums, DBLP 22
  • 23. Questions? Paper at: http://knoesis.org/library/resource.php?id=1095 More on Social Media @ Kno.e.sis at http://knoesis.org/research/semweb/projects/socialmedia/ 23

Editor's Notes

  1. Active window: due to concept of ‘novelty and attention’ in today’s social media systems’ design ---- (reason is information overload for a user to keep up with!!)
  2. Observe the data of active window Find the followers base of the active community members as the samples for the prediction problem Analyze features for them, and classify, whether the follower U is going to join the community C
  3. Global (G) vs. Local (L) – Scale (how many people will it attract)[e.g., Japan Earthquake vs. Iowa State Fair]Deterministic (D) vs. Unexpected (U) -- (expected)[e.g., Emmy Awards vs. Japan Earthquake]Compact (C) vs. Loose (Ls) -- (If people already know each other, so they are connected tightly)[e.g., ISWC conference vs. Japan Earthquake]Transient (T) vs. Lasting (Lt) -- (Event lasts for just a small time vs. long time)[e.g., President’s Speech vs. Egypt Revolution]
  4. Essentially 3 major types for features:For Community surrounding the event size growth rateFor usersUsers that you follow, mention or retweet fromUsers that share a large overlap of friends with youUsers that share similar profilesFor tweet content[only from ‘followees’ of new users joining the community]url, mentions, RT, # Linguistics analysis features
  5. Using style of writing for authorhashtags usage affects -- attention