Monitoring and Analysis of Online
Communities

Harith Alani
Knowledge Media institute,
The Open University, UK


          http://twitter.com/halani
          http://delicious.com/halani
          http://www.linkedin.com/pub/harith-alani/9/739/534



                                                Web Science Summer School
                                                       Galway, 2011         1
Market value of Web Analytics




                                2
Agenda
•  Community monitoring


•  Offline and online social networking


•  Modeling and tracking behaviour


•  Analysing community features


•  Predicting discussion activity


                                          3
Online community monitoring
•  Analysing and understanding activities and dynamics
•  Studying impact of social and technical features
•  Forecast future growth and evolution
•  Tracking behaviour and influence
•  Tracking reputation and buzz
•  Listening to customer opinion
•  Profiling the user base
•  Gauging customer sentiment


                                                         4
Measuring social media




  Deloitte, Beeline Labs, & Society for New Communication Research surveyed 140 companies
  with online communities, 2008
                                                                                            5
Measuring social media




  Deloitte, Beeline Labs, & Society for New Communication Research surveyed 140 companies
  with online communities, 2008
                                                                                            6
Measuring social media




 “B2B Marketing Goes Social: A White Horse Survey Report” – March 2010 – study of 104 companies
                                                                                                  7
Measuring social media




 “Social media usage, attitudes and measurability: What do marketers think?” – KingFishMedia,
 2010                                                                                           8
Tools for monitoring social media




                                    9
•  Analytics:
  –  Mention volume
  –  Sentiment
  –  Discussion clouds
  –  Activity graphs and
     metrics
  –  Language and
     geolocation filtering
  –  Filter by social
     platform
  –  Comparisons



                               10
      http://www.ubervu.com/
•  Analytics:
  –  Influencing users
  –  Sentiment and opinion analysis
  –  Viral content analysis
  –  Detecting sales leads
  –  Filter by geo-location




                                                         11
                         http://www.viralheat.com/home
Monitoring and Analysis of
Online Communities
With a Web Science flavour




                             12
Online vs. Offline social
networking




                            13
Online vs. offline social networking: The Bad News!

•  Digital social networking
   increases physical social
   isolation
•  Causes
    –  Genetic alterations
    –  Weakened immune system
    –  Less resistant to cancer
    –  Higher risk of heart disease
    –  Higher blood pressure
    –  Faster dementia
    –  Narrower arteries


Aric Sigman, “Well Connected? The Biological
Implications of 'Social Networking’”, Biologist, 56
(1), 2009                                             14
Online vs. offline social networking: The Good News!

•  Digital networking increases social interaction
    –  Transforms little boxed societies to networked and networking
       societies
    –  Creates more opportunities to network
    –  New methods to communicate, easily, and widely
    –  Supports and increases F2F contact!
    –  The stronger the offline social tie, the more intense the online
       communication
    –  The stronger the offline social tie, the more diverse online
       communications
    –  F2F is medium of choice in weaker social ties


Keith Hampton and Barry Wellman, Long Distance Community in the Network Society: Contact and
Support Beyond Netville, American Behavioral Scientist 45 (3), November, 2001.

Barry Wellman, The Glocal Village: Internet and Community, Idea’s - The Arts & Science Review,
University of Toronto, 1(1), 2004
Physical online & digital offline




                                    16
Sensor & Social Networks




                           17
Sensor & Social Networks
 www.nabaztag.com




                    The Canine Twitterer




                      “Having my daily workout.
                      Already did 15 leg lifts!”


                                                   18
Location Sensors & Social Networking


  Tag-Along Marketing
  The New York Times,
  November 6, 2010




                “Everything is in place for location-based social
                networking to be the next big thing. Tech
                companies are building the platforms, venture
                capitalists are providing the cash and marketers
                are eager to develop advertising. “

                                                                    19
Monitoring online/offline social activity
              Where is everybody?




                                              20
Monitoring online/offline social activity


•  Generating
   opportunities for
   F2F networking




                                            21
Monitoring online/offline social activity




 “There are more than 250 million active users
 currently accessing Facebook through their mobile
 devices“

 “People that use Facebook on their mobile devices
 are twice as active on Facebook than non-mobile
 users”
                http://www.facebook.com/press/info.php?statistics
                                                                    22
Tracking of F2F contact networks
                            Sociometer, MIT, 2002
                            -    F2F and productivity
                            -    F2F dynamics
                            -    Who are key players?
                            -    F2F and office distance




   TraceEncounters - 2004




                                                           23
SocioPatterns platform




          http://www.sociopatterns.org/
Offline social networks




                          From a small conference
                          at ISI, Turin




                                by Ciro Cattuto
                                                  25
Offline social networks

•  Similarity             students
   features
  –  Country of
     origin
                                     SR
  –  Seniority
  –  .. Age? Role?
     Projects?
     Interests?
•  What other        JR
   info can we
   get to help us                         students

   understand
   these network                     SR
   dynamics?
                                                     26
Offline + online social networking
                                Who should
                   Anyone I     I talk to?   Where have I
                   know here?                met this guy?
    Where
    should I go?




   ESWC2010                                                  27
Live Social Semantics (LSS):
     RFIDs + Social Web + Semantic Web

•    Integration of physical presence and online information
•    Semantic user profile generation
•    Logging of face-to-face contact
•    Social network browsing
•    Analysis of online vs offline social networks

Ontology excerpt shown on the slide:

    <?xml version="1.0"?>
    <rdf:RDF
        xmlns="http://tagora.ecs.soton.ac.uk/schemas/tagging#"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:owl="http://www.w3.org/2002/07/owl#"
      xml:base="http://tagora.ecs.soton.ac.uk/schemas/tagging">
      <owl:Ontology rdf:about=""/>
      <owl:Class rdf:ID="Post"/>
      <owl:Class rdf:ID="TagInfo"/>
      <owl:Class rdf:ID="GlobalCooccurrenceInfo"/>
      <owl:Class rdf:ID="DomainCooccurrenceInfo"/>
      <owl:Class rdf:ID="UserTag"/>
      <owl:Class rdf:ID="UserCooccurrenceInfo"/>
      <owl:Class rdf:ID="Resource"/>
      <owl:Class rdf:ID="GlobalTag"/>
      <owl:Class rdf:ID="Tagger"/>
      <owl:Class rdf:ID="DomainTag"/>
      <owl:ObjectProperty rdf:ID="hasPostTag">
        <rdfs:domain rdf:resource="#TagInfo"/>
      </owl:ObjectProperty>
      <owl:ObjectProperty rdf:ID="hasDomainTag">
        <rdfs:domain rdf:resource="#UserTag"/>
      </owl:ObjectProperty>
      <owl:ObjectProperty rdf:ID="isFilteredTo">
        <rdfs:range rdf:resource="#GlobalTag"/>
        <rdfs:domain rdf:resource="#GlobalTag"/>
      </owl:ObjectProperty>
      <owl:ObjectProperty rdf:ID="hasResource">
        <rdfs:domain rdf:resource="#Post"/>
        <rdfs:range =…
SW sources




                               conference



             chair                     proceedings




                     chair
                                      author

                             CoP




                                               29
Social and information networks




                                  30
Merging social networks




                  FOAF    31
Tag Filtering Service




                        Semantic modeling
                        Semantic analysis
                        Collective intelligence
                        Statistical analysis
                        Syntactical analysis
                                                  32
Tag Filtering Service




                        33
From Tags to Semantics




                         34
Tags to User Interests




                         35
From raw tags and social relations
to Structured Data



                       Collective
                       intelligence


           User raw                   Semantic
           data                       data




                                                 Structured
                                                 data
                       ontologies




                                                       36
RFIDs for tracking social contact




                                    37
Convergence with online social networks




                                          38
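People contact → RFID → RDF Triples

[Diagram: RDF graph of a face-to-face contact. An F2FContact node connects foaf#Person1 and foaf#Person2 through the hasContact and contactWith properties, and carries contactPlace (a Place), contactDate (XMLSchema#date) and contactDuration (XMLSchema#time).]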
Real-time F2F networks with SNS links




                                           42
            http://www.vimeo.com/6590604
Live Social Semantics
 Deployed at:




Data analysis
•  Face-to-face interactions across scientific conferences
•  Networking behaviour of frequent users
•  Correlations between scientific seniority and social networking
•  Comparison of F2F contact network with Twitter and Facebook
•  Social networking with online and offline friends
                                                                     43
Analysis of LSS Results




The New Yorker 2/11/2008

                           44
Characteristics of F2F contact network
  Network              ESWC 2009        HT 2009         ESWC 2010
  characteristics
  Number of users          175             113              158
  Average degree           54               39               55
  Avg. strength (mn)       143             123              130
  Avg. weight (mn)         2.65            3.15             2.35


  Weights ≤ 1 mn           70%             67%              74%


  Weights ≤ 5 mn           90%             89%              93%


  Weights ≤ 10 mn          95%             94%              96%

•  Degree is number of people with whom the person had at least one F2F
   contact
•  Strength is the time spent in a F2F contact
•  Edge weight is total time spent by a pair of users in F2F contact
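
The three definitions above can be made concrete with a small sketch. It assumes contact events arrive as (person_a, person_b, duration_seconds) tuples; that record shape and the function name `contact_network_stats` are illustrative assumptions, not the LSS data format. Strength is read here as a person's total F2F time summed over all of their edges.

```python
from collections import defaultdict

def contact_network_stats(events):
    """Aggregate contact events into edge weights, degrees and strengths.

    events: iterable of (person_a, person_b, duration_seconds) tuples
    (a hypothetical record shape chosen for this sketch).
    """
    # Edge weight: total time spent by an unordered pair of users.
    weight = defaultdict(float)
    for a, b, seconds in events:
        weight[frozenset((a, b))] += seconds

    degree = defaultdict(int)      # number of distinct people contacted
    strength = defaultdict(float)  # total F2F time of each person
    for pair, total in weight.items():
        for person in pair:
            degree[person] += 1
            strength[person] += total
    return dict(weight), dict(degree), dict(strength)

events = [("ann", "bob", 40), ("bob", "ann", 20), ("ann", "cat", 60)]
weight, degree, strength = contact_network_stats(events)
```

Using an unordered pair as the edge key means repeated contacts between the same two people accumulate on one edge regardless of which badge reported the contact.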
                                                                          45
Characteristics of F2F contact events
 Contact              ESWC 2009           HT 2009          ESWC 2010
 characteristics
 Number of                16258             9875               14671
 contact events
 Average contact           46                 42                 42
 length (s)

 Contacts ≤ 1mn           87%                89%                88%

 Contacts ≤ 2mn           94%                96%                95%

 Contacts ≤ 5mn           99%                99%                99%

 Contacts ≤ 10mn          99.8%             99.8%              99.8%


      F2F contact pattern is very similar for all three conferences
F2F contacts of returning users
•  Degree: number of other participants with whom an attendee
   has interacted
•  Total time: total time spent in interaction by an attendee
•  Link weight: total time spent in F2F interaction by a pair of returning
   attendees in 2010, versus the same quantity measured in 2009

[Figure: three log-log scatter plots (Degree, Total interaction time, Links’ weights) comparing ESWC2009 on the x-axis with ESWC2010 on the y-axis]

 ESWC 2009 & ESWC 2010          Pearson Correlation
 Degree                         0.37
 Total F2F interaction time     0.76
 Link weight                    0.75

Time spent on F2F networking by frequent users is stable, even when the list of
people they networked with changed
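
As a rough illustration of how a year-on-year Pearson correlation of this kind can be computed, the sketch below restricts two per-user metric dictionaries to returning users and correlates the paired values. The dictionary layout and the helper names `pearson` and `correlate_returning` are assumptions made for this example; the slides do not describe the actual analysis code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

def correlate_returning(metric_first_year, metric_second_year):
    """Correlate a per-user metric over users present in both years."""
    shared = sorted(set(metric_first_year) & set(metric_second_year))
    return pearson([metric_first_year[u] for u in shared],
                   [metric_second_year[u] for u in shared])

# Toy data: only users a, b and c attended both conferences.
degree_2009 = {"a": 10, "b": 20, "c": 30, "d": 90}
degree_2010 = {"a": 15, "b": 25, "c": 35, "e": 5}
r = correlate_returning(degree_2009, degree_2010)
```

The same helper applies to total interaction time per user, or to link weights if the dictionary keys are pairs of returning users.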
Average seniority of neighbours in F2F networks

•    No clear pattern is observed if the unweighted average over all neighbours in the
     aggregated network is considered
•    A correlation is observed when each neighbour is weighted by the time spent
     with the main person
•    The correlation becomes much stronger when considering for each
     individual only the neighbour with whom the most time was spent

[Figure: average seniority of neighbours (y-axis, 0 to 5) against seniority in number of papers (x-axis, 0 to 10), with three series: sen_nn, the average seniority of the neighbours; sen_nn,w, with weighted averages; sen_nn,max, the seniority of the user with the strongest link]

            Conference attendees tend to network with others of similar
            levels of scientific seniority
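
The three neighbour-seniority measures compared on this slide can be sketched as follows. The input layout (a nested dictionary of contact times and a dictionary of paper counts) and the function name `neighbour_seniority` are assumptions for illustration only.

```python
def neighbour_seniority(person, weights, seniority):
    """Return (sen_nn, sen_nn_w, sen_nn_max) for one attendee.

    weights:   person -> {neighbour: total F2F seconds} (assumed layout)
    seniority: person -> number of papers
    """
    neighbours = weights[person]
    if not neighbours:
        raise ValueError("attendee has no F2F contacts")
    total_time = sum(neighbours.values())

    # Unweighted average over all neighbours.
    sen_nn = sum(seniority[n] for n in neighbours) / len(neighbours)
    # Each neighbour weighted by the time spent with the main person.
    sen_nn_w = sum(seniority[n] * t for n, t in neighbours.items()) / total_time
    # Only the neighbour with whom the most time was spent.
    strongest = max(neighbours, key=neighbours.get)
    sen_nn_max = seniority[strongest]
    return sen_nn, sen_nn_w, sen_nn_max

weights = {"ann": {"bob": 300, "cat": 60, "dan": 40}}
seniority = {"bob": 8, "cat": 2, "dan": 2}
sen_nn, sen_nn_w, sen_nn_max = neighbour_seniority("ann", weights, seniority)
```

In the toy data, two brief contacts with junior attendees pull the unweighted average down, while the weighted and strongest-link measures are dominated by the long contact with the senior neighbour.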
Presence of Attendees HT2009

              Importance of the bar?
              Popularity of sessions? particular talks?
Number of cliques HT2009
Offline networking vs online networking
  Twitterers                          Spearman Correlation (ρ)
  Tweets – F2F Degree                 -0.15
  Tweets – F2F Strength               -0.15
  Twitter Following – F2F Degree      -0.21

[Figure: per-user plot for users with Facebook and Twitter accounts in ESWC 2010]

  •    People who have a large number of friends on Twitter and/or Facebook don’t seem to
       be the most socially active in the offline world in comparison to other SNS users

             No strong correlation between amount of F2F
             contact activity and size of online social networks
Scientific seniority vs Twitter followers
 Twitter users                    Correlation
 H-index – Twitter Followers      0.32
 H-index – Tweets                 -0.13

[Figure: normalised Twitter followers, H-index and Tweets per user; y-axis from 0 to 1.2, x-axis users 1 to 41]

 •    Comparison between people’s scientific seniority and the number of people following
      them on Twitter

 People who have the highest number of Twitter followers are not
 necessarily the most scientifically senior, although they do have high
 visibility and experience
Conference Chairs
                                     all participants   chairs   all participants   chairs
                                     2009               2009     2010               2010
average degree                       55                 77.7     54                 77.6
average strength                     8590               19590    7807               22520
average weight                       159                500      141                674
average number of events per edge    3.44               8        3.37               12

   •  Conf chairs interact with more distinct people (larger average degree)

   •  Conf chairs spend more time in F2F interaction (almost three times as much
      as a random participant)
Networking with online and offline ‘friends’
Characteristics             all users       coauthors        Facebook         Twitter
                                                              friends        followers
average contact                 42               75               63              72
duration (s)
average edge weight            141              4470             830            1010
(s)
average number of              3.37              60               13              14
events per edge
   •  Individuals sharing an online or professional social link meet much more
      often than other individuals
   •  Average number of encounters, and total time spent in interaction, is highest
      for co-authors

  F2F contacts with Facebook & Twitter friends were respectively 50% and
  71% longer, and 286% and 315% more frequent, than with others

  F2F contacts with co-authors were 79% longer on average, and co-authors
  met 1680% more often than non co-authors
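
The percentages quoted here follow from the table above when each group is compared against the "all users" column. The short check below reproduces that arithmetic; the variable names are chosen for this example only.

```python
def percent_increase(value, baseline):
    """Relative increase of value over baseline, in percent."""
    return (value - baseline) / baseline * 100

# Values copied from the table: average contact duration (s)
# and average number of events per edge.
all_users = {"duration": 42, "events": 3.37}
coauthors = {"duration": 75, "events": 60}
facebook = {"duration": 63, "events": 13}
twitter = {"duration": 72, "events": 14}

for name, group in [("coauthors", coauthors),
                    ("facebook", facebook),
                    ("twitter", twitter)]:
    longer = percent_increase(group["duration"], all_users["duration"])
    more_often = percent_increase(group["events"], all_users["events"])
    print(f"{name}: {longer:.0f}% longer, {more_often:.0f}% more frequent")
```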
Twitterers vs Non-Twitterers


•  Time spent in conference rooms
  –  Twitter users spent on average 11.4% more time in the
     conf rooms than non-twitter users (mean is 26% higher)


•  Number of people met F2F during the conference
  –  Twitter users met on average 9% more people F2F
     (mean 8% higher)


•  Duration of F2F contacts
  –  Twitter users spent on average 63% more time in F2F
     contact than non twitter users (mean is 20% higher)


                                                              55
Analysis of behaviour in online
communities




Behaviour of individuals – micro level analysis
[Figure: normalised H-index, F2F degree and F2F strength per user; y-axis from 0 to 1.2, x-axis users 1 to 45. Annotated groups: "good scientific, and social signals", "in LSS team", "healthy scientific & social profiles. Freq chairs/PCs", "outsider, high profile", "shy scientist?", and a group of students and developers labelled "Shows the next star researcher?"]
Why monitor behaviour?
•  Understand impact of behaviour on community evolution
•  Forecast community future
•  Learn when intervention might be needed
•  Learn which behaviour should be encouraged or
   discouraged
•  Find what could trigger certain behaviours
•  What is the best mix of behaviour to increase
   engagement in the community
•  To see which users need more support, which ones
   should be confined, and which ones should be promoted


                                                           58
Behaviour analysis

Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing discussion forums using
common user roles. In Proc. Web Science Conf. (WebSci10), Raleigh, NC: US, 2010


•  Behaviour compositions in Boards.ie:
Ontology
Encoding Rules in Ontologies with SPIN
Approach for inferring User Roles
•  Features: structural, social network, reciprocity, persistence, participation
•  Feature levels change with the dynamics of the community
•  Associate Roles with a collection of feature-to-level Mappings,
   e.g. in-degree -> high, out-degree -> high
•  Run our rules over each user’s features and derive the role composition
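
A minimal sketch of this approach, under stated assumptions: feature values are turned into low/mid/high levels relative to the current community (so the levels shift as the community changes), and a role is assigned when all of its feature-to-level mappings match. The tercile thresholds and the two rules in `ROLE_RULES` are hypothetical; the slides implement the rules with SPIN over an ontology and do not list them here.

```python
def feature_level(value, community_values):
    """Level of a value relative to the community's current distribution."""
    below = sum(1 for v in community_values if v < value)
    n = len(community_values)
    if 3 * below < n:        # bottom third
        return "low"
    if 3 * below < 2 * n:    # middle third
        return "mid"
    return "high"

# Hypothetical feature-to-level mappings, for illustration only.
ROLE_RULES = {
    "popular participant": {"in_degree": "high", "out_degree": "high"},
    "ignored": {"in_degree": "low", "posts_replied": "low"},
}

def infer_roles(user_features, community_features):
    """Return the roles whose mappings all match the user's levels."""
    levels = {name: feature_level(value, community_features[name])
              for name, value in user_features.items()}
    return [role for role, rule in ROLE_RULES.items()
            if all(levels.get(f) == lvl for f, lvl in rule.items())]

community = {
    "in_degree": [0, 1, 2, 3, 4, 5],
    "out_degree": [0, 1, 2, 3, 4, 5],
    "posts_replied": [0.0, 0.1, 0.2, 0.5, 0.8, 0.9],
}
busy = {"in_degree": 5, "out_degree": 5, "posts_replied": 0.9}
quiet = {"in_degree": 0, "out_degree": 3, "posts_replied": 0.0}
```

Counting the users labelled with each role in a time window then gives the role composition of the community for that window.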


                                                                                  62
Data from Boards.ie
•  Forum 246 (Commuting and Transport): Demonstrates a clear increase in
   activity over time.
•  Forum 388 (Rugby): Exhibits periodic increase and decrease in activity and
   hence it provides good examples of healthy/unhealthy evolutions.
•  Forum 411 (Mobile Phones and PDAs): Increase in activity over time with
   some fluctuation - i.e. reduction and increase over various time windows.
•  For the time in 2004-01 to 2006-12
Features

•  In-degree Ratio: The proportion of users U that reply to user υi, thus
   indicating the concentration of users that reply to υi
•  Posts Replied Ratio: Proportion of posts by user υi that yield a reply, used
   to gauge the popularity of the user’s content based on replies
•  Thread Initiation Ratio: Proportion of threads that have been started by υi.
•  Bi-directional Threads Ratio: Proportion of threads where user υi replies to
   a user and receives a reply, thus forming a reciprocal communication
•  Bi-directional Neighbours Ratio: The proportion of neighbours where a
   reciprocal interaction has taken place - e.g. υi replied to υj and υj replied to υi.
•  Average Posts per Thread: The average number of posts made in every
   thread that user υi has participated in
•  Standard Deviation of Posts per Thread: The standard deviation of the
   number of posts in every thread that user υi has participated in. This gauges
   the distribution of the discussion lengths.
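
Several of the features listed above can be computed from a flat list of posts, as in the sketch below. The `Post` record (post id, thread id, author, parent post id, with a parent of None marking a thread starter) is an assumed format, not the Boards.ie schema, and the in-degree ratio is normalised by the number of other users, which is one possible reading of the definition. The Bi-directional Threads Ratio is left out for brevity.

```python
from collections import Counter, namedtuple
from statistics import mean, pstdev

Post = namedtuple("Post", "post_id thread_id author parent_id")

def user_features(user, posts):
    """Compute a subset of the behaviour features for one user."""
    by_id = {p.post_id: p for p in posts}
    users = {p.author for p in posts}
    threads = {p.thread_id for p in posts}
    own_posts = [p for p in posts if p.author == user]
    if not own_posts:
        raise ValueError("user has no posts")

    # Replies to someone else's post (self-replies are ignored).
    replies = [p for p in posts if p.parent_id is not None
               and by_id[p.parent_id].author != p.author]
    edges = {(p.author, by_id[p.parent_id].author) for p in replies}
    repliers = {src for src, dst in edges if dst == user}
    replied_to = {dst for src, dst in edges if src == user}
    neighbours = repliers | replied_to

    answered = {p.parent_id for p in replies}
    started = {p.thread_id for p in own_posts if p.parent_id is None}
    counts = list(Counter(p.thread_id for p in own_posts).values())

    return {
        "in_degree_ratio": len(repliers) / max(len(users) - 1, 1),
        "posts_replied_ratio":
            sum(p.post_id in answered for p in own_posts) / len(own_posts),
        "thread_initiation_ratio": len(started) / len(threads),
        "bidirectional_neighbours_ratio":
            len(repliers & replied_to) / len(neighbours) if neighbours else 0.0,
        "avg_posts_per_thread": mean(counts),
        "std_posts_per_thread": pstdev(counts),
    }

posts = [
    Post(1, "t1", "ann", None),
    Post(2, "t1", "bob", 1),
    Post(3, "t1", "ann", 2),
    Post(4, "t2", "cat", None),
    Post(5, "t2", "ann", 4),
]
features = user_features("ann", posts)
```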
Role Skeleton
Results
Commuting and Transport           Rugby                Mobile Phones and PDAs




•  Correlation of individual features in each of the three forums
Results
(a) Forum 246: Commuting and Transport
(b) Forum 388: Rugby
(c) Forum 411: Mobile Phones and PDAs
•  Variation in behaviour composition & activity
•  Behaviour composition in/stability influences forum activity
Prediction analysis – preliminary results!
•  Predicting rise/fall in post submission numbers
•  Binary classification
•  Features : Community composition, roles and percentages of users
   associated with each
              Forum         P       R       F1       ROC

               246         0.799   0.769   0.780     0.800

               388         0.603   0.615   0.605     0.775

               411         0.765   0.692   0.714     0.617

                All        0.583   0.667   0.607     0.466



 •  Cross-community predictions are less reliable than individual
    community analysis due to the idiosyncratic behaviour observed in
    each individual community
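
For reference, precision (P), recall (R) and F1 for a binary rise/fall prediction can be computed as in the sketch below, which treats one label as the positive class. The labels and example data are invented for illustration, and the figures in the table may have been averaged over both classes, so this is not necessarily how the reported values were obtained.

```python
def precision_recall_f1(actual, predicted, positive="rise"):
    """Precision, recall and F1 for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented example: did post volume rise or fall in each time window?
actual = ["rise", "rise", "fall", "fall", "rise"]
predicted = ["rise", "fall", "fall", "rise", "rise"]
p, r, f1 = precision_recall_f1(actual, predicted)
```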
Observations so far
•  Growing communities contain more elitists and popular participants


•  Shrinking communities contain many taciturns and ignored users


•  A stable composition, with a mix of roles, is associated with
   increased community activity


•  Different communities may require different behaviour compositions
   to increase activity/health
What features make online
communities tick
•  How many do you
   recognise? Use?

•  Which ones still exist?

•  Which are strong and
   healthy?

•  Which are aging and
   withering?

•  What health signs should
   we look for?

•  How can we predict their
   future evolution?



                              71
Rise and fall of social networks




                                   72
Predicting engagement



•  Which posts will receive a reply?
  –  What are the most influential features here?




•  How much discussion will it generate?
  –  What are the key factors of lengthy discussions?




                                                        73
Common online community features

    ... user attributes - describing the reputation of the user - and attributes of a post’s
    content - generally referred to as content features. In Table 1 we define user and
    content features and study their influence on the discussion “continuation”.

                              Table 1. User and Content Features

    User Features
      In Degree:         Number of followers of U                                    #
      Out Degree:        Number of users U follows                                   #
      List Degree:       Number of lists U appears on. Lists group users by topic    #
      Post Count:        Total number of posts the user has ever posted              #
      User Age:          Number of minutes from user join date                       #
      Post Rate:         Posting frequency of the user                               $\frac{PostCount}{UserAge}$

    Content Features
      Post Length:       Length of the post in characters                            #
      Complexity:        Cumulative entropy of the unique words in post p, λ         $\frac{1}{\lambda}\sum_{i \in [1,n]} p_i(\log\lambda - \log p_i)$
                         of total word length n and p_i the frequency of each word
      Uppercase Count:   Number of uppercase words                                   #
      Readability:       Gunning fog index [7] using average sentence length (ASL)   $0.4(ASL + PCW)$
                         and the percentage of complex words (PCW)
      Verb Count:        Number of verbs                                             #
      Noun Count:        Number of nouns                                             #
      Adjective Count:   Number of adjectives                                        #
      Referral Count:    Number of @user                                             #
      Time in the Day:   Normalised time in the day measured in minutes              #
      Informativeness:   Terminological novelty of the post wrt other posts.         $\sum_{t \in p} tfidf(t, p)$
                         The cumulative tfIdf value of each term t in post p
      Polarity:          Cumulation of polar term weights in p (using the            $\frac{Po + Ne}{|terms|}$
                         Sentiwordnet3 lexicon) normalised by polar terms count

    4.2 Experiments
    Experiments are intended to test the performance of different classification
    models in identifying seed posts. Therefore we used four classifiers: discriminative
    classifiers Perceptron and SVM, the generative classifier Naive Bayes and the ...

•  How do all these features influence activity generation in an online
   community?
   –  Such knowledge leads to better use and management of the community
                                                                                      74
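A few of the Table 1 content and user features can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes λ is the total number of words in the post and p_i the raw count of each unique word, and it uses a rough vowel-group heuristic to count syllables for the Gunning fog index. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def post_rate(post_count, user_age_minutes):
    """Post Rate = PostCount / UserAge, with age in minutes as in Table 1."""
    if user_age_minutes <= 0:
        return 0.0
    return post_count / user_age_minutes


def complexity(words):
    """(1/lam) * sum_i p_i * (log lam - log p_i).

    Assumption: lam is the total number of words in the post and p_i is the
    raw count of the i-th unique word, which makes this the Shannon entropy
    (natural log) of the post's word distribution.
    """
    lam = len(words)
    if lam == 0:
        return 0.0
    counts = Counter(words)
    return sum(c * (math.log(lam) - math.log(c)) for c in counts.values()) / lam


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per vowel group."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """0.4 * (ASL + PCW): average sentence length plus % of complex words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    asl = len(words) / len(sentences)
    # Complex words are those with three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    pcw = 100.0 * len(complex_words) / len(words)
    return 0.4 * (asl + pcw)


post = "Help is needed in Haiti. Donations are being coordinated online."
tokens = re.findall(r"[A-Za-z']+", post.lower())
print(complexity(tokens), gunning_fog(post), post_rate(1200, 60 * 24 * 365))
```

A post that repeats a single word has complexity 0, while a post of all-distinct words reaches the maximum, log λ.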
Experiment for identifying seed posts


 •  Twitter data on the Haiti earthquake, and the Union
    Address


     Dataset         Users    Tweets     Seeds   Non-seeds   Replies

     Haiti           44,497   65,022     1,405    60,686      2,931

     Union Address   66,300   80,272     7,228    55,169     17,875




 •  Evaluated a binary classification task
   –  Is this post a seed post or not?


                                                                       75
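The three-way split in the dataset table (seeds, non-seeds, replies) can be sketched as a labelling step. The sketch assumes a seed is a non-reply post that received at least one reply and a non-seed is a non-reply post that received none; the field names `id` and `reply_to` are hypothetical, not taken from the Twitter data.

```python
from collections import Counter


def label_posts(posts):
    """Label each post as 'seed', 'non-seed' or 'reply'.

    posts: list of dicts with an 'id' and a 'reply_to' field
    (reply_to is None when the post is not a reply).
    """
    # Ids of every post that at least one other post replies to.
    replied_to = {p["reply_to"] for p in posts if p["reply_to"] is not None}
    labels = {}
    for p in posts:
        if p["reply_to"] is not None:
            labels[p["id"]] = "reply"
        elif p["id"] in replied_to:
            labels[p["id"]] = "seed"
        else:
            labels[p["id"]] = "non-seed"
    return labels


posts = [
    {"id": 1, "reply_to": None},  # receives a reply below, so it is a seed
    {"id": 2, "reply_to": 1},
    {"id": 3, "reply_to": None},  # never replied to, so it is a non-seed
]
labels = label_posts(posts)
print(Counter(labels.values()))
```

Under this labelling every post falls into exactly one class, which matches the table: seeds, non-seeds and replies sum to the tweet count for both datasets.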
Identifying seeds with different types of features

... first report on the results obtained from our model selection phase, before moving
onto our results from using the best model with the top-k features.

Table 3. Results from the classification of seed posts using varying feature sets and
classification models
              (a) Haiti Dataset                       (b) Union Address Dataset
                     P       R      F1     ROC                   P      R      F1     ROC
       User   Perc 0.794   0.528   0.634   0.727  User    Perc 0.658  0.697  0.677   0.673
              SVM  0.843   0.159   0.267   0.566          SVM  0.510  0.946  0.663   0.512
              NB   0.948   0.269   0.420   0.785          NB   0.844  0.086  0.157   0.707
              J48  0.906   0.679   0.776   0.822          J48  0.851  0.722  0.782   0.830
      Content Perc 0.875   0.077   0.142   0.606  Content Perc 0.467  0.698  0.560   0.457
              SVM  0.552   0.727   0.627   0.589          SVM  0.650  0.589  0.618   0.638
              NB   0.721   0.638   0.677   0.769          NB   0.762  0.212  0.332   0.649
              J48  0.685   0.705   0.695   0.711          J48  0.740  0.533  0.619   0.736
        All   Perc 0.794   0.528   0.634   0.726  All     Perc 0.630  0.762  0.690   0.672
              SVM  0.483   0.996   0.651   0.502          SVM  0.499  0.990  0.664   0.506
              NB   0.962   0.280   0.434   0.852          NB   0.874  0.212  0.341   0.737
              J48  0.824   0.775   0.798   0.836          J48  0.890  0.810  0.848   0.877

4.3     Results
Our findings from Table 3 demonstrate the effectiveness of using solely user
features for identifying seed posts. In both the Haiti and Union Address datasets
training a classification model using user features shows improved performance
over the same models trained using content features. In the case of the Union ...

•  User features are most important in Twitter
•  But combining user & content features gives best results
                                                                                     76
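The P, R and F1 columns of Table 3 relate through the standard definitions, sketched below. The function names are hypothetical; the check values are the J48 rows of Table 3.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from counts for the positive (seed) class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# J48 with user features on Haiti: P = 0.906, R = 0.679 (Table 3a)
print(round(f1_score(0.906, 0.679), 3))
# J48 with all features on Union Address: P = 0.890, R = 0.810 (Table 3b)
print(round(f1_score(0.890, 0.810), 3))
```

Both values reproduce the F1 entries reported for those rows (0.776 and 0.848).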
Impact of different features
•  What features have the highest impact on identification of seed
   posts?
•  Rank features by information gain ratio wrt seed post class label

... which we found to be 0.674 indicating a good correlation between the two lists
and their respective ranks.

Table 4. Features ranked by Information Gain Ratio wrt Seed Post class label. The
feature name is paired with its IG in brackets.

         Rank   Haiti                             Union Address
          1     user-list-degree (0.275)          user-list-degree (0.319)
          2     user-in-degree (0.221)            content-time-in-day (0.152)
          3     content-informativeness (0.154)   user-in-degree (0.133)
          4     user-num-posts (0.111)            user-num-posts (0.104)
          5     content-time-in-day (0.089)       user-post-rate (0.075)
          6     user-post-rate (0.075)            user-out-degree (0.056)
          7     content-polarity (0.064)          content-referral-count (0.030)
          8     user-out-degree (0.040)           user-age (0.015)
          9     content-referral-count (0.038)    content-polarity (0.015)
          10    content-length (0.020)            content-length (0.010)
          11    content-readability (0.018)       content-complexity (0.004)
          12    user-age (0.015)                  content-noun-count (0.002)
          13    content-uppercase-count (0.012)   content-readability (0.001)
          14    content-noun-count (0.010)        content-verb-count (0.001)
          15    content-adj-count (0.005)         content-adj-count (0.0)
          16    content-complexity (0.0)          content-informativeness (0.0)
          17    content-verb-count (0.0)          content-uppercase-count (0.0)
                                                                                   77
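The ranking criterion, information gain ratio with respect to the seed/non-seed label, can be sketched for a discrete feature as follows. This is an illustrative sketch only: continuous features such as in-degree would first need to be discretised (the binning is not described on the slide), and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of the feature about the labels, divided by the
    entropy of the feature itself (the split information)."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        conditional += len(subset) / total * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["seed", "seed", "non-seed", "non-seed"]
list_degree = ["high", "high", "low", "low"]   # separates the classes
uppercase = ["some", "none", "some", "none"]   # carries no information
print(gain_ratio(list_degree, labels), gain_ratio(uppercase, labels))
```

A feature that separates the two classes perfectly scores 1, and a feature independent of the label scores 0, which is how entries such as content-verb-count (0.0) in Table 4 should be read.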
Positive/negative impact of features
•  What is the correlation between seed posts and features?

                  [Figure: plots for Haiti (upper) and Union Address (lower)]

                  Fig. 3. Contributions of top-5 features to identifying Non-seeds (N) and Seeds (S).
                  Upper plots are for the Haiti dataset and the lower plots are for the Union Address
                  dataset.
                                                                                                        78
Identifying Seed Posts
•  Can we identify seed posts using the top-k features?

  –  Stability is reached with
     5 features



  –  Classification with 5
     features is sufficient for
     identifying posts that
     generate responses




                                                          79
Predicting Discussion Activity
•  Reply rates:
  –  Haiti 1-74 responses, Union Address 1-75 responses
•  Compare rankings
  –  Ground truth vs predicted
•  Experiments
  –  Using Haiti and Union Address datasets
  –  Evaluate predicted rank k where k={1,5,10,20,50,100}
  –  Support Vector Regression with user, content, user+content
     features

         Dataset         Training size   Test size   Test Vol Mean   Test Vol SD

         Haiti                980           210          1.664          3.017

         Union Address       5,067         1,161         1.761          2.342
                                                                                80
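Comparing the ground-truth ranking with the predicted ranking at a cut-off k can be sketched as below. The slide does not name the ranking measure, so normalised discounted cumulative gain (nDCG@k) is used here as an assumption; the predicted scores stand in for the output of a regression model, and the function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(true_volumes, predicted_scores, k):
    """Rank posts by predicted score and compare against the ideal ranking
    by true discussion volume, truncated at rank k."""
    order = sorted(range(len(true_volumes)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_volumes[i] for i in order]
    ideal = sorted(true_volumes, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


true_volumes = [3, 1, 2]         # replies actually received by three seed posts
good_scores = [0.9, 0.1, 0.5]    # predicted scores that preserve the true order
bad_scores = [0.1, 0.9, 0.5]     # predicted scores that put the quietest post first

for k in (1, 2, 3):
    print(k, ndcg_at_k(true_volumes, good_scores, k),
          ndcg_at_k(true_volumes, bad_scores, k))
```

A score of 1 at a given k means the top-k predicted posts are ordered exactly as in the ground truth.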
Predicting Discussion Activity

    Haiti dataset                              Union Address dataset




           •  Content features are key for top ranks
           •  User features more important for higher ranks


                                                                       81
Identifying Seed Posts in Boards.ie

•  Used the same features as before
  –  User features
     •  In-degree, out-degree, post count, user age, post rate
  –  Content features
     •  Post Length, complexity, readability, referral count, time in day,
        informativeness, polarity

•  New features designed to capture user affinity
  –  Forum Entropy
     •  Concentration of forum activity
     •  Higher entropy = large forum spread
  –  Forum Likelihood
     •  Likelihood of forum post given user history
     •  Combines post history with incoming data



                                                                             82
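The two focus features can be sketched from a user's posting history. This is a rough illustration under stated assumptions: forum entropy is taken as the Shannon entropy of the user's posts across forums, and forum likelihood as a Laplace-smoothed relative frequency. The smoothing, the `alpha` parameter and the function names are assumptions and are not taken from the slides.

```python
import math
from collections import Counter


def forum_entropy(forum_history):
    """Entropy (base 2) of the forums a user has posted in.
    0 means all posts are in one forum; higher means a wider spread."""
    total = len(forum_history)
    if total == 0:
        return 0.0
    counts = Counter(forum_history)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def forum_likelihood(forum_history, forum, n_forums, alpha=1.0):
    """Smoothed probability that the user posts in `forum`, given history.
    alpha is a hypothetical Laplace smoothing constant."""
    counts = Counter(forum_history)
    return (counts[forum] + alpha) / (len(forum_history) + alpha * n_forums)


history = ["soccer", "soccer", "politics"]
print(forum_entropy(history))
print(forum_likelihood(history, "soccer", n_forums=3))
```

A user who only ever posts in one forum has entropy 0, and a user split evenly over two forums has entropy 1.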
Experiment for identifying seed posts
•  Used all posts from Boards.ie in 2006
•  Built features using a 6-month window prior to seed post date

         Posts           Seeds    Non-Seeds   Replies     Users

         1,942,030       90,765    21,800     1,829,465   29,908




•  Evaluated a binary classification task
   –  Is this post a seed post or not?
   –  Precision, Recall, F1 and Accuracy
   –  Tested: user, content, focus features, and their combinations




                                                                      83
Identifying seeds with different types of features

                                      TABLE II
                RESULTS FROM THE CLASSIFICATION OF SEED POSTS USING
                  VARYING FEATURE SETS AND CLASSIFICATION MODELS

                                                P       R       F1      ROC
         User              SVM           0.775   0.810   0.774   0.581
                           Naive Bayes   0.691   0.767   0.719   0.540
                           Max Ent       0.776   0.806   0.722   0.556
                           J48           0.778   0.809   0.734   0.582
         Content           SVM           0.739   0.804   0.729   0.511
                           Naive Bayes   0.730   0.794   0.740   0.616
                           Max Ent       0.758   0.806   0.730   0.678
                           J48           0.795   0.822   0.783   0.617
         Focus             SVM           0.649   0.805   0.719   0.500
                           Naive Bayes   0.710   0.737   0.722   0.588
                           Max Ent       0.649   0.805   0.719   0.586
                           J48           0.649   0.805   0.719   0.500
         User + Content    SVM           0.790   0.808   0.727   0.509
                           Naive Bayes   0.712   0.772   0.732   0.593
                           Max Ent       0.767   0.807   0.734   0.671
                           J48           0.795   0.821   0.779   0.675
         User + Focus      SVM           0.776   0.810   0.776   0.583
                           Naive Bayes   0.699   0.778   0.724   0.585
                           Max Ent       0.771   0.806   0.722   0.607
                           J48           0.777   0.810   0.742   0.617
         Content + Focus   SVM           0.750   0.805   0.729   0.511
                           Naive Bayes   0.732   0.787   0.746   0.658
                           Max Ent       0.762   0.807   0.731   0.692
                           J48           0.798   0.823   0.787   0.662
         All               SVM           0.791   0.808   0.727   0.510
                           Naive Bayes   0.724   0.780   0.740   0.637
                           Max Ent       0.768   0.808   0.733   0.688
                           J48           0.798   0.824   0.792   0.692
                                                                              84
Positive/negative impact of features on Boards.ie

•  What are the most important features for predicting seed posts?

                                 TABLE III
              REDUCTION IN F1 LEVELS AS INDIVIDUAL FEATURES ARE
                       DROPPED FROM THE J48 CLASSIFIER

                   Feature Dropped        F1
                   -                      0.815
                   Post Count             0.815
                   In-Degree              0.811*
                   Out-Degree             0.811*
                   User Age               0.807***
                   Post Rate              0.815
                   Forum Entropy          0.815
                   Forum Likelihood       0.798***
                   Post Length            0.810**
                   Complexity             0.811**
                   Readability            0.802***
                   Referral Count         0.793***
                   Time in Day            0.810**
                   Informativeness        0.801***
                   Polarity               0.808***
                   Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 .

•  Correlations:
  –  Referral counts (non-seeds)
  –  Forum likelihood (seeds)
  –  Informativeness (non-seeds)
  –  Readability (seeds)
  –  User age (non-seeds)

     ... hyperlinks (e.g., ads and spams). This contrasts with work in
     Twitter which found that tweets containing many links were ...
                                                                              85
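The procedure behind Table III, dropping one feature at a time and measuring the fall in F1, can be sketched as a generic loop. The `evaluate` callable is hypothetical: it stands for training the classifier on a feature subset and returning its F1. The toy scorer below exists only to make the sketch runnable and is not derived from the Boards.ie data.

```python
def ablation(features, evaluate):
    """Return the baseline score and, per feature, the drop in score
    observed when that single feature is removed."""
    baseline = evaluate(features)
    drops = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        drops[feature] = baseline - evaluate(reduced)
    return baseline, drops


def toy_evaluate(feature_subset):
    """Stand-in scorer (an assumption): only 'referral_count' matters."""
    return 0.5 + (0.1 if "referral_count" in feature_subset else 0.0)


baseline, drops = ablation(["referral_count", "post_rate"], toy_evaluate)
print(baseline, drops)
```

Read this way, dropping Referral Count in Table III lowers F1 from 0.815 to 0.793, a reduction of 0.022, the largest in the table.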
Predicting Discussion Activity in Boards.ie

•  Can we predict the level of
   discussion activity?




                                              86
Predicting Discussion Activity in Boards.ie

•  What impact do features have on discussion length?
  –  Assessed Linear Regression model with focus and content
     features

  –  Forum Likelihood (pos)
  –  Content Length (+/neutral)
  –  Complexity (pos)
  –  Readability (+/neutral)
  –  Referral Count (neg)
  –  Time in Day (+/neutral)
  –  Informativeness (-/neutral)
  –  Polarity (neg)




                                                               87
Stay tuned
•  More communities
  –  SAP, IBM, StackOverflow, Reddit
  –  Compare impact of features on their dynamics


•  Better behaviour analysis
  –  Fewer features, more forums/communities, more graphs!
  –  Healthy? posts, reciprocation, discussions, sentiment mixture


•  Churn analysis
  –  Correlation of features/behaviour to ‘bounce rate’


•  Intervention!
  –  Opportunities and mechanisms to influence behaviour             88
Upcoming events

             Social Object Networks
              IEEE Social Computing, 2011
                October 9-10, Boston, USA

  http://ir.ii.uam.es/socialobjects2011/
                Deadline: August 5, 2011



  Intelligent Web Services Meet Social Computing
             AAAI Spring Symposium 2012,
             March 26-28, Stanford, California

    http://vitvar.com/events/aaai-ss12
                Deadline: October 7, 2011

                                                   89
Questionnaire on user needs


http://socsem.open.ac.uk/limesurvey/index.php?sid=55487


The questionnaire aims to identify the needs that users have within online
communities and to learn the factors and issues that influence those needs.




                                                                                 90
Thanks to
    My social semantics team
      Sofia Angeletou, Research Associate
      Matthew Rowe, Research Associate

    Live Social Semantics team
      Ciro Cattuto, ISI, Turin
      Wouter van Den Broeck, ISI, Turin
      Alain Barrat, CPT Marseille & ISI
      Martin Szomszor, CeRC, City University, UK

Acknowledgements
      Gianluca Correndo, Uni Southampton
      Ivan Cantador, UAM, Madrid
      STI International
      ESWC09/10 & HT09 chairs and organisers
      All LSS participants
                                                                                               91
 
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slidesMining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slidesMichael Mathioudakis
 
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...Fabien Gandon
 
Information Skills in a Global 2.0 World
Information Skills in a Global 2.0 WorldInformation Skills in a Global 2.0 World
Information Skills in a Global 2.0 WorldKelly Lambert
 
STM Master Class Presentation: The Evolving Journal
STM Master Class Presentation: The Evolving JournalSTM Master Class Presentation: The Evolving Journal
STM Master Class Presentation: The Evolving JournalAnn Michael
 
Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting...Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting...
Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting...learjk
 
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods: Extracting So...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting So...Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting So...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods: Extracting So...Shalin Hai-Jew
 
Hr roundtable slides
Hr roundtable slidesHr roundtable slides
Hr roundtable slidesdaniherro
 
Social Media By Apurv
Social Media By ApurvSocial Media By Apurv
Social Media By ApurvApurv MODI
 
The Business Value of Social Media
The Business Value of Social MediaThe Business Value of Social Media
The Business Value of Social MediaNikhil Jagtiani
 
Enhancing the Web Experience
Enhancing the Web ExperienceEnhancing the Web Experience
Enhancing the Web ExperienceJohn Breslin
 
Kurt voelker let's make an impact with the web
Kurt voelker   let's make an impact with the webKurt voelker   let's make an impact with the web
Kurt voelker let's make an impact with the webForum One
 

Similar to Monitoring and Analysis of Online Communities (20)

Harith Alani's presentation at SSSW 2011
Harith Alani's presentation at SSSW 2011Harith Alani's presentation at SSSW 2011
Harith Alani's presentation at SSSW 2011
 
Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Using Behaviour Analysis to Detect Cultural Aspects in Social Web SystemsUsing Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
 
2009 - Connected Action - Marc Smith - Social Media Network Analysis
2009 - Connected Action - Marc Smith - Social Media Network Analysis2009 - Connected Action - Marc Smith - Social Media Network Analysis
2009 - Connected Action - Marc Smith - Social Media Network Analysis
 
Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)
 
Emerging Trends in Crisis Informatics
Emerging Trends in Crisis InformaticsEmerging Trends in Crisis Informatics
Emerging Trends in Crisis Informatics
 
Knowledge Extraction from Social Media
Knowledge Extraction from Social MediaKnowledge Extraction from Social Media
Knowledge Extraction from Social Media
 
How Communities Learn
How Communities LearnHow Communities Learn
How Communities Learn
 
The State of Social Media (and How to Use It and Not Lose Your Job)
The State of Social Media (and How to Use It and Not Lose Your Job)The State of Social Media (and How to Use It and Not Lose Your Job)
The State of Social Media (and How to Use It and Not Lose Your Job)
 
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slidesMining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
Mining the Social Web - Lecture 1 - T61.6020 lecture-01-slides
 
Lee Rainie
Lee Rainie Lee Rainie
Lee Rainie
 
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
One Web of pages, One Web of peoples, One Web of Services, One Web of Data, O...
 
Information Skills in a Global 2.0 World
Information Skills in a Global 2.0 WorldInformation Skills in a Global 2.0 World
Information Skills in a Global 2.0 World
 
STM Master Class Presentation: The Evolving Journal
STM Master Class Presentation: The Evolving JournalSTM Master Class Presentation: The Evolving Journal
STM Master Class Presentation: The Evolving Journal
 
Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting...Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting...
Hashtag Conversations, Eventgraphs, and User Ego Neighborhoods: Extracting...
 
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods: Extracting So...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting So...Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods:  Extracting So...
Hashtag Conversations,Eventgraphs, and User Ego Neighborhoods: Extracting So...
 
Hr roundtable slides
Hr roundtable slidesHr roundtable slides
Hr roundtable slides
 
Social Media By Apurv
Social Media By ApurvSocial Media By Apurv
Social Media By Apurv
 
The Business Value of Social Media
The Business Value of Social MediaThe Business Value of Social Media
The Business Value of Social Media
 
Enhancing the Web Experience
Enhancing the Web ExperienceEnhancing the Web Experience
Enhancing the Web Experience
 
Kurt voelker let's make an impact with the web
Kurt voelker   let's make an impact with the webKurt voelker   let's make an impact with the web
Kurt voelker let's make an impact with the web
 

More from The Open University

Misinformation vs Fact-Checks: The Ongoing Battle
Misinformation vs Fact-Checks: The Ongoing BattleMisinformation vs Fact-Checks: The Ongoing Battle
Misinformation vs Fact-Checks: The Ongoing BattleThe Open University
 
Co-Creating Misinformation Resilient Societies
Co-Creating Misinformation Resilient Societies Co-Creating Misinformation Resilient Societies
Co-Creating Misinformation Resilient Societies The Open University
 
SASIG Workshop on “Improving the digital landscape for our children”
SASIG Workshop on “Improving the digital landscape for our children”SASIG Workshop on “Improving the digital landscape for our children”
SASIG Workshop on “Improving the digital landscape for our children”The Open University
 
Co-Inform (Co-Creating Misinformation Resilient Societies)
Co-Inform (Co-Creating Misinformation Resilient Societies)Co-Inform (Co-Creating Misinformation Resilient Societies)
Co-Inform (Co-Creating Misinformation Resilient Societies)The Open University
 
Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.The Open University
 
H2020 COMRADES project introduction
H2020 COMRADES project introduction H2020 COMRADES project introduction
H2020 COMRADES project introduction The Open University
 
Radicalisation detection on social media
Radicalisation detection on social mediaRadicalisation detection on social media
Radicalisation detection on social mediaThe Open University
 
Analysing the dark side of Social Media
Analysing the dark side of Social MediaAnalysing the dark side of Social Media
Analysing the dark side of Social MediaThe Open University
 
Detecting online grooming and radicalisation
Detecting online grooming and radicalisationDetecting online grooming and radicalisation
Detecting online grooming and radicalisationThe Open University
 
Detecting Grooming Behaviour on Social Media
Detecting Grooming Behaviour on Social MediaDetecting Grooming Behaviour on Social Media
Detecting Grooming Behaviour on Social MediaThe Open University
 
Semantics, Sensors, and the Social Web
Semantics, Sensors, and the Social WebSemantics, Sensors, and the Social Web
Semantics, Sensors, and the Social WebThe Open University
 

More from The Open University (15)

Misinformation vs Fact-Checks: The Ongoing Battle
Misinformation vs Fact-Checks: The Ongoing BattleMisinformation vs Fact-Checks: The Ongoing Battle
Misinformation vs Fact-Checks: The Ongoing Battle
 
knod22-Alani.pdf
knod22-Alani.pdfknod22-Alani.pdf
knod22-Alani.pdf
 
Co-Creating Misinformation Resilient Societies
Co-Creating Misinformation Resilient Societies Co-Creating Misinformation Resilient Societies
Co-Creating Misinformation Resilient Societies
 
SASIG Workshop on “Improving the digital landscape for our children”
SASIG Workshop on “Improving the digital landscape for our children”SASIG Workshop on “Improving the digital landscape for our children”
SASIG Workshop on “Improving the digital landscape for our children”
 
COMRADES summary
COMRADES summaryCOMRADES summary
COMRADES summary
 
COMRADES project introduction
COMRADES project introduction COMRADES project introduction
COMRADES project introduction
 
Co-Inform (Co-Creating Misinformation Resilient Societies)
Co-Inform (Co-Creating Misinformation Resilient Societies)Co-Inform (Co-Creating Misinformation Resilient Societies)
Co-Inform (Co-Creating Misinformation Resilient Societies)
 
COMRADES ICT2018
COMRADES ICT2018COMRADES ICT2018
COMRADES ICT2018
 
Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.Crisis Information Processing - with the power of A.I.
Crisis Information Processing - with the power of A.I.
 
H2020 COMRADES project introduction
H2020 COMRADES project introduction H2020 COMRADES project introduction
H2020 COMRADES project introduction
 
Radicalisation detection on social media
Radicalisation detection on social mediaRadicalisation detection on social media
Radicalisation detection on social media
 
Analysing the dark side of Social Media
Analysing the dark side of Social MediaAnalysing the dark side of Social Media
Analysing the dark side of Social Media
 
Detecting online grooming and radicalisation
Detecting online grooming and radicalisationDetecting online grooming and radicalisation
Detecting online grooming and radicalisation
 
Detecting Grooming Behaviour on Social Media
Detecting Grooming Behaviour on Social MediaDetecting Grooming Behaviour on Social Media
Detecting Grooming Behaviour on Social Media
 
Semantics, Sensors, and the Social Web
Semantics, Sensors, and the Social WebSemantics, Sensors, and the Social Web
Semantics, Sensors, and the Social Web
 

Recently uploaded

Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxAmita Gupta
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxcallscotland1987
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 

Recently uploaded (20)

Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Third Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptxThird Battle of Panipat detailed notes.pptx
Third Battle of Panipat detailed notes.pptx
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Asian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptxAsian American Pacific Islander Month DDSD 2024.pptx
Asian American Pacific Islander Month DDSD 2024.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Dyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptxDyslexia AI Workshop for Slideshare.pptx
Dyslexia AI Workshop for Slideshare.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 

Monitoring and Analysis of Online Communities

  • 1. Monitoring and Analysis of Online Communities. Harith Alani, Knowledge Media Institute, The Open University, UK. http://twitter.com/halani http://delicious.com/halani http://www.linkedin.com/pub/harith-alani/9/739/534 Web Science Summer School, Galway, 2011
  • 2. Market value of Web Analytics
  • 3. Agenda
      •  Community monitoring
      •  Offline and online social networking
      •  Modeling and tracking behaviour
      •  Analysing community features
      •  Predicting discussion activity
  • 4. Online community monitoring
      •  Analysing and understanding activities and dynamics
      •  Studying impact of social and technical features
      •  Forecasting future growth and evolution
      •  Tracking behaviour and influence
      •  Tracking reputation and buzz
      •  Listening to customer opinion
      •  Profiling the user base
      •  Gauging customer sentiment
  • 5. Measuring social media (Deloitte, Beeline Labs, & Society for New Communication Research surveyed 140 companies with online communities, 2008)
  • 6. Measuring social media (Deloitte, Beeline Labs, & Society for New Communication Research surveyed 140 companies with online communities, 2008)
  • 7. Measuring social media (“B2B Marketing Goes Social: A White Horse Survey Report”, March 2010, study of 104 companies)
  • 8. Measuring social media (“Social media usage, attitudes and measurability: What do marketers think?”, KingFishMedia, 2010)
  • 9. Tools for monitoring social media
  • 10. Analytics (http://www.ubervu.com/):
      –  Mention volume
      –  Sentiment
      –  Discussion clouds
      –  Activity graphs and metrics
      –  Language and geolocation filtering
      –  Filter by social platform
      –  Comparisons
  • 11. Analytics (http://www.viralheat.com/home):
      –  Influencing users
      –  Sentiment and opinion analysis
      –  Viral content analysis
      –  Detecting sales leads
      –  Filter by geo-location
  • 12. Monitoring and Analysis of Online Communities, with a Web Science flavour
  • 13. Online vs. Offline social networking
  • 14. Online vs. offline social networking: The Bad News!
      •  Digital social networking increases physical social isolation
      •  Causes
          –  Genetic alterations
          –  Weakened immune system
          –  Less resistance to cancer
          –  Higher risk of heart disease
          –  Higher blood pressure
          –  Faster dementia
          –  Narrower arteries
      Aric Sigman, “Well Connected? The Biological Implications of 'Social Networking'”, Biologist, 56 (1), 2009
  • 15. Online vs. offline social networking: The Good News!
      •  Digital networking increases social interaction
          –  Transforms little boxed societies to networked and networking societies
          –  Creates more opportunities to network
          –  New methods to communicate, easily, and widely
          –  Supports and increases F2F contact!
          –  The stronger the offline social tie, the more intense the online communication
          –  The stronger the offline social tie, the more diverse the online communications
          –  F2F is the medium of choice in weaker social ties
      Keith Hampton and Barry Wellman, Long Distance Community in the Network Society: Contact and Support Beyond Netville, American Behavioral Scientist 45 (3), November, 2001.
      Barry Wellman, The Glocal Village: Internet and Community, Idea's - The Arts & Science Review, University of Toronto, 1(1), 2004
  • 16. Physical online & digital offline
  • 17. Sensor & Social Networks
  • 18. Sensor & Social Networks: www.nabaztag.com; The Canine Twitterer: “Having my daily workout. Already did 15 leg lifts!”
  • 19. Location Sensors & Social Networking. Tag-Along Marketing, The New York Times, November 6, 2010: “Everything is in place for location-based social networking to be the next big thing. Tech companies are building the platforms, venture capitalists are providing the cash and marketers are eager to develop advertising.”
  • 20. Monitoring online/offline social activity: Where is everybody?
  • 21. Monitoring online/offline social activity
      •  Generating opportunities for F2F networking
  • 22. Monitoring online/offline social activity: “There are more than 250 million active users currently accessing Facebook through their mobile devices” “People that use Facebook on their mobile devices are twice as active on Facebook than non-mobile users” http://www.facebook.com/press/info.php?statistics
  • 23. Tracking of F2F contact networks
      Sociometer, MIT, 2002
          –  F2F and productivity
          –  F2F dynamics
          –  Who are key players?
          –  F2F and office distance
      TraceEncounters, 2004
  • 24. SocioPatterns platform http://www.sociopatterns.org/
  • 25. Offline social networks: from a small conference at ISI, Turin, by Ciro Cattuto
  • 26. Offline social networks
      •  Similarity features
          –  Country of origin
          –  Seniority
          –  .. Age? Role? Projects? Interests?
      •  What other info can we get to help us understand these network dynamics?
      [Figure labels: SR students, JR students]
  • 27. Offline + online social networking (ESWC2010): “Who should I talk to?” “Anyone I know here?” “Where have I met this guy?” “Where should I go?”
  • 28. Live Social Semantics (LSS): RFIDs + Social Web + Semantic Web
      •  Integration of physical presence and online information
      •  Semantic user profile generation
      •  Logging of face-to-face contact
      •  Social network browsing
      •  Analysis of online vs offline social networks
      Ontology excerpt shown on the slide (truncated in the original):
        <?xml version="1.0"?>
        <rdf:RDF
          xmlns="http://tagora.ecs.soton.ac.uk/schemas/tagging#"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
          xmlns:owl="http://www.w3.org/2002/07/owl#"
          xml:base="http://tagora.ecs.soton.ac.uk/schemas/tagging">
          <owl:Ontology rdf:about=""/>
          <owl:Class rdf:ID="Post"/>
          <owl:Class rdf:ID="TagInfo"/>
          <owl:Class rdf:ID="GlobalCooccurrenceInfo"/>
          <owl:Class rdf:ID="DomainCooccurrenceInfo"/>
          <owl:Class rdf:ID="UserTag"/>
          <owl:Class rdf:ID="UserCooccurrenceInfo"/>
          <owl:Class rdf:ID="Resource"/>
          <owl:Class rdf:ID="GlobalTag"/>
          <owl:Class rdf:ID="Tagger"/>
          <owl:Class rdf:ID="DomainTag"/>
          <owl:ObjectProperty rdf:ID="hasPostTag">
            <rdfs:domain rdf:resource="#TagInfo"/>
          </owl:ObjectProperty>
          <owl:ObjectProperty rdf:ID="hasDomainTag">
            <rdfs:domain rdf:resource="#UserTag"/>
          </owl:ObjectProperty>
          <owl:ObjectProperty rdf:ID="isFilteredTo">
            <rdfs:range rdf:resource="#GlobalTag"/>
            <rdfs:domain rdf:resource="#GlobalTag"/>
          </owl:ObjectProperty>
          <owl:ObjectProperty rdf:ID="hasResource">
            <rdfs:domain rdf:resource="#Post"/>
            <rdfs:range =…
  • 29. SW sources [figure labels: conference chair, proceedings chair, author, CoP]
  • 30. Social and information networks
  • 32. Tag Filtering Service [figure labels: Semantic modeling, Semantic analysis, Collective intelligence, Statistical analysis, Syntactical analysis]
  • 34. From Tags to Semantics
  • 35. Tags to User Interests
  • 36. From raw tags and social relations to Structured Data [figure labels: Collective intelligence, User raw data, Semantic data, Structured data, ontologies]
  • 37. RFIDs for tracking social contact
  • 38. Convergence with online social networks
  • 39. People contact → RFID → RDF Triples
      Diagram nodes: foaf#Person1, foaf#Person2, F2FContact, Place, XMLSchema#date, XMLSchema#time
      Diagram properties: hasContact, contactWith, contactPlace, contactDate, contactDuration
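As a rough illustration of the step from a logged RFID contact to RDF triples, the sketch below emits N-Triples lines for one contact. The slide only names the nodes and properties, so the way they are wired together here (a person `hasContact` a contact node, which then carries `contactWith`, `contactPlace`, `contactDate` and `contactDuration`), the `example.org` namespaces, the function name and the sample values are all assumptions made for illustration, not the LSS implementation.

```python
from datetime import date, timedelta

# Hypothetical namespaces: the slide does not give the real vocabulary URIs.
LSS = "http://example.org/lss#"
FOAF_BASE = "http://example.org/foaf#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def contact_triples(contact_id, person1, person2, place, day, duration):
    """Return N-Triples lines describing one face-to-face contact.

    The edge layout is an assumption; only the node and property
    names come from the slide.
    """
    contact = f"<{LSS}{contact_id}>"
    p1 = f"<{FOAF_BASE}{person1}>"
    p2 = f"<{FOAF_BASE}{person2}>"
    # Render the duration as hh:mm:ss, typed as XMLSchema#time as on the slide.
    total = int(duration.total_seconds())
    hh, rest = divmod(total, 3600)
    mm, ss = divmod(rest, 60)
    return [
        f"{p1} <{LSS}hasContact> {contact} .",
        f"{contact} <{LSS}contactWith> {p2} .",
        f"{contact} <{LSS}contactPlace> <{LSS}{place}> .",
        f'{contact} <{LSS}contactDate> "{day.isoformat()}"^^<{XSD}date> .',
        f'{contact} <{LSS}contactDuration> "{hh:02d}:{mm:02d}:{ss:02d}"^^<{XSD}time> .',
    ]


# Illustrative values only.
triples = contact_triples("contact1", "Person1", "Person2", "MainHall",
                          date(2010, 6, 1), timedelta(seconds=46))
print("\n".join(triples))
```

Plain strings keep the sketch free of third-party dependencies; an RDF library would normally handle escaping and serialisation.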
  • 42. Real-time F2F networks with SNS links http://www.vimeo.com/6590604
  • 43. Live Social Semantics
      Deployed at: [venue logos]
      Data analysis
      •  Face-to-face interactions across scientific conferences
      •  Networking behaviour of frequent users
      •  Correlations between scientific seniority and social networking
      •  Comparison of F2F contact network with Twitter and Facebook
      •  Social networking with online and offline friends
  • 44. Analysis of LSS Results [image: The New Yorker, 2/11/2008]
  • 45. Characteristics of F2F contact network
      Network characteristics   ESWC 2009   HT 2009   ESWC 2010
      Number of users                 175       113         158
      Average degree                   54        39          55
      Avg. strength (mn)              143       123         130
      Avg. weight (mn)               2.65      3.15        2.35
      Weights ≤ 1 mn                  70%       67%         74%
      Weights ≤ 5 mn                  90%       89%         93%
      Weights ≤ 10 mn                 95%       94%         96%
      •  Degree is the number of people with whom the person had at least one F2F contact
      •  Strength is the total time the person spent in F2F contact
      •  Edge weight is the total time spent by a pair of users in F2F contact
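The three definitions above can be sketched in a few lines. The snippet assumes contact events arrive as `(person_a, person_b, duration_seconds)` tuples; that input shape, the function name and the sample data are illustrative assumptions, not the format used by the SocioPatterns or LSS platforms.

```python
from collections import defaultdict


def network_stats(events):
    """Compute edge weights, degrees and strengths from contact events.

    events: iterable of (person_a, person_b, duration_seconds).
    """
    # Edge weight: total time spent by an unordered pair in F2F contact.
    weight = defaultdict(float)
    for a, b, seconds in events:
        weight[frozenset((a, b))] += seconds

    degree = defaultdict(int)      # number of distinct people contacted
    strength = defaultdict(float)  # total F2F time per person
    for pair, w in weight.items():
        for person in pair:
            degree[person] += 1
            strength[person] += w
    return dict(weight), dict(degree), dict(strength)


# Made-up events: ann and bob meet twice, ann and cat once.
events = [("ann", "bob", 40), ("bob", "ann", 20), ("ann", "cat", 60)]
weight, degree, strength = network_stats(events)
print(degree["ann"], strength["ann"], weight[frozenset(("ann", "bob"))])
```

Using a `frozenset` as the edge key makes the pair unordered, so repeated contacts between the same two people accumulate on one edge regardless of who is listed first.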
  • 46. Characteristics of F2F contact events
      Contact characteristics      ESWC 2009   HT 2009   ESWC 2010
      Number of contact events         16258      9875       14671
      Average contact length (s)          46        42          42
      Contacts ≤ 1 mn                    87%       89%         88%
      Contacts ≤ 2 mn                    94%       96%         95%
      Contacts ≤ 5 mn                    99%       99%         99%
      Contacts ≤ 10 mn                 99.8%     99.8%       99.8%
      F2F contact pattern is very similar for all three conferences
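Rows such as "Contacts ≤ 1 mn" are cumulative shares of the contact-length distribution. A minimal sketch of that calculation follows, assuming a flat list of contact durations in seconds; the function name and the sample durations are made up and do not reproduce the conference data.

```python
def share_at_most(durations, thresholds):
    """Fraction of contact events whose duration (s) is <= each threshold (s)."""
    if not durations:
        raise ValueError("need at least one contact event")
    n = len(durations)
    return {t: sum(d <= t for d in durations) / n for t in thresholds}


# Made-up contact lengths in seconds.
durations = [20, 20, 40, 40, 60, 60, 100, 120, 300, 700]
shares = share_at_most(durations, thresholds=[60, 120, 300, 600])
mean_length = sum(durations) / len(durations)
print(shares, mean_length)
```

Comparing these shares across events is one simple way to check whether the contact pattern looks alike from one conference to the next.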
  • 47. F2F contacts of returning users
      •  Degree: number of other participants with whom an attendee has interacted
      •  Total time: total time spent in interaction by an attendee
      •  Link weight: total time spent in F2F interaction by a pair of returning attendees in 2010, versus the same quantity measured in 2009
      [Figure: three log-scale scatter plots (Degree, Total interaction time, Links' weights); axes: ESWC2009 and ESWC2010]
      Pearson correlation, ESWC 2009 & ESWC 2010
      Degree                       0.37
      Total F2F interaction time   0.76
      Link weight                  0.75
      Time spent on F2F networking by frequent users is stable, even when the list of people they networked with changed
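A comparison of this kind can be sketched as a Pearson correlation over the attendees present in both years. The per-year dictionaries, user ids and values below are invented for illustration, and the slide does not say whether the correlation was computed on raw or log-transformed values; raw values are assumed here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)  # raises ZeroDivisionError if a series is constant


# Made-up degrees per attendee in each year.
degree_2009 = {"u1": 10, "u2": 20, "u3": 30, "u4": 5}
degree_2010 = {"u1": 12, "u2": 18, "u3": 33, "u5": 7}

# Only returning users contribute to the comparison.
returning = sorted(set(degree_2009) & set(degree_2010))
r = pearson([degree_2009[u] for u in returning],
            [degree_2010[u] for u in returning])
print(returning, round(r, 3))
```

The same function applies unchanged to total interaction time per attendee or to link weights per returning pair.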
  • 48. Average seniority of neighbours in F2F networks
      •  No clear pattern is observed if the unweighted average over all neighbours in the aggregated network is considered
      •  A correlation is observed when each neighbour is weighted by the time spent with the main person
      •  The correlation becomes much stronger when considering for each individual only the neighbour with whom the most time was spent
      [Figure: average seniority of neighbours vs seniority (number of papers); series: sen_nn (average seniority of the neighbours), sen_nn,w (with weighted averages), sen_nn,max (seniority of the user with the strongest link)]
      Conference attendees tend to network with others of similar levels of scientific seniority
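The three neighbour-seniority measures compared above can be sketched for a single attendee as follows. Seniority is taken as a number of papers, as on the plot's axis; the dictionary layout, the function name and the sample values are assumptions for illustration.

```python
def neighbour_seniority(neighbours, seniority):
    """Return (unweighted, time-weighted, strongest-link) neighbour seniority.

    neighbours: {neighbour_id: seconds spent in F2F contact with this person}
    seniority:  {person_id: number of papers}
    """
    if not neighbours:
        raise ValueError("person has no neighbours")
    total_time = sum(neighbours.values())
    # sen_nn: plain average over all neighbours.
    unweighted = sum(seniority[n] for n in neighbours) / len(neighbours)
    # sen_nn,w: each neighbour weighted by the time spent together.
    weighted = sum(seniority[n] * t for n, t in neighbours.items()) / total_time
    # sen_nn,max: seniority of the neighbour with the strongest link.
    strongest = max(neighbours, key=neighbours.get)
    return unweighted, weighted, seniority[strongest]


# Made-up data: most of the time is spent with the most senior neighbour.
seniority = {"a": 1, "b": 4, "c": 10}
neighbours = {"a": 30, "b": 60, "c": 510}
sen_nn, sen_nn_w, sen_nn_max = neighbour_seniority(neighbours, seniority)
print(sen_nn, sen_nn_w, sen_nn_max)
```

In this toy case the plain average hides the preference for the senior neighbour, while the weighted and strongest-link measures expose it, which mirrors the qualitative point of the slide.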
  • 49. Presence of Attendees HT2009: Importance of the bar? Popularity of sessions? Particular talks?
  • 50. Number of cliques HT2009
  • 51. Offline networking vs online networking
      Twitterers                        Spearman correlation (ρ)
      Tweets – F2F Degree               -0.15
      Tweets – F2F Strength             -0.15
      Twitter Following – F2F Degree    -0.21
      [Figure: users with Facebook and Twitter accounts in ESWC 2010]
      •  People who have a large number of friends on Twitter and/or Facebook don't seem to be the most socially active in the offline world in comparison to other SNS users
      No strong correlation between the amount of F2F contact activity and the size of online social networks
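For readers unfamiliar with the Spearman coefficient reported on this slide, the sketch below computes it as the Pearson correlation of average ranks. It is a plain-Python illustration with made-up function names and data, not the analysis code behind the slide; it assumes neither input is constant.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-user values: tweet counts and F2F degree.
tweets = [120, 5, 40, 300]
f2f_degree = [30, 55, 42, 20]
print(spearman(tweets, f2f_degree))
```

A value near zero, like those in the table, means that ordering users by one quantity says little about their order by the other.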
  • 52. Scientific seniority vs Twitter followers
      Twitter users                  Correlation
      H-index – Twitter Followers    0.32
      H-index – Tweets               -0.13
      [Figure: Twitter Followers, H-index and Tweets plotted per user]
      •  Comparison between people's scientific seniority and the number of people following them on Twitter
      People who have the highest number of Twitter followers are not necessarily the most scientifically senior, although they do have high visibility and experience
  • 53. Conference Chairs
                                          all participants 2009    chairs 2009    all participants 2010    chairs 2010
      average degree                      55                       77.7           54                       77.6
      average strength (s)                8590                     19590          7807                     22520
      average weight (s)                  159                      500            141                      674
      average number of events per edge   3.44                     8              3.37                     12
      •  Conference chairs interact with more distinct people (larger average degree)
      •  Conference chairs spend more time in F2F interaction (almost three times as much as a random participant)
  • 54. Networking with online and offline ‘friends’
      Characteristics                     all users    coauthors    Facebook friends    Twitter followers
      average contact duration (s)        42           75           63                  72
      average edge weight (s)             141          4470         830                 1010
      average number of events per edge   3.37         60           13                  14
      •  Individuals sharing an online or professional social link meet much more often than other individuals
      •  The average number of encounters, and the total time spent in interaction, is highest for co-authors
      F2F contacts with Facebook & Twitter friends were respectively 50% and 71% longer, and 286% and 315% more frequent, than with others
      Attendees spent 79% more time in F2F contacts with their co-authors, and met them 1680% more often than they met non-co-authors
  • 55. Twitterers vs Non-Twitterers •  Time spent in conference rooms –  Twitter users spent on average 11.4% more time in the conf rooms than non-twitter users (mean is 26% higher) •  Number of people met F2F during the conference –  Twitter users met on average 9% more people F2F (mean 8% higher) •  Duration of F2F contacts –  Twitter users spent on average 63% more time in F2F contact than non twitter users (mean is 20% higher) 55
  • 56. Analysis of behaviour in online communities
  • 57. Behaviour of individuals – micro level analysis
      [Figure: H-index, F2F Degree and F2F Strength per user, annotated with example profiles such as "good scientific and social signals", "healthy scientific & social profiles", "outsider, shy, high profile scientist?" and "Who's the next star researcher?"]
  • 58. Why monitor behaviour? •  Understand impact of behaviour on community evolution •  Forecast community future •  Learn when intervention might be needed •  Learn which behaviour should be encouraged or discouraged •  Find what could trigger certain behaviours •  What is the best mix of behaviour to increase engagement in the community •  To see which users need more support, which ones should be confined, and which ones should be promoted 58
  • 59. Behaviour analysis Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing discussion forums using common user roles. In Proc. Web Science Conf. (WebSci10), Raleigh, NC: US, 2010 •  Behaviour compositions in Boards.ie:
  • 61. Encoding Rules in Ontologies with SPIN
  • 62. Approach for inferring User Roles
      •  Features: structural, social network, reciprocity, persistence, participation
      •  Feature levels change with the dynamics of the community
      •  Associate Roles with a collection of feature-to-level Mappings, e.g. in-degree -> high, out-degree -> high
      •  Run our rules over each user's features and derive the role composition
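The approach on this slide (bin each feature into levels, match users against feature-to-level rules, then summarise the role composition) can be sketched as follows. This is an illustration in Python, not the SPIN rule encoding used in the talk. The tercile cut points, the two rules in `ROLE_RULES` and their role labels are all hypothetical choices made for the example.

```python
from collections import Counter

# Hypothetical feature-to-level rules; the deck gives only the example
# "in-degree -> high, out-degree -> high" without naming the role.
ROLE_RULES = {
    "popular": {"in_degree": "high", "out_degree": "high"},
    "ignored": {"in_degree": "low"},
}


def feature_levels(values):
    """Map each user's value to 'low', 'mid' or 'high' relative to the
    community, so levels shift as the community's dynamics change.
    Tercile cut points are an assumption of this sketch."""
    ordered = sorted(values.values())
    low_cut = ordered[int((len(ordered) - 1) / 3)]
    high_cut = ordered[int(2 * (len(ordered) - 1) / 3)]
    levels = {}
    for user, value in values.items():
        if value <= low_cut:
            levels[user] = "low"
        elif value > high_cut:
            levels[user] = "high"
        else:
            levels[user] = "mid"
    return levels


def infer_roles(features):
    """features: {feature_name: {user: value}} -> {user: role}."""
    levels = {name: feature_levels(vals) for name, vals in features.items()}
    users = next(iter(features.values())).keys()
    roles = {}
    for user in users:
        roles[user] = "unclassified"
        for role, rule in ROLE_RULES.items():
            if all(levels[f][user] == level for f, level in rule.items()):
                roles[user] = role  # first matching rule wins
                break
    return roles


def role_composition(roles):
    """Fraction of the community associated with each role."""
    counts = Counter(roles.values())
    return {role: n / len(roles) for role, n in counts.items()}


features = {
    "in_degree": {"a": 0, "b": 1, "c": 2, "d": 5, "e": 9, "f": 10},
    "out_degree": {"a": 0, "b": 3, "c": 1, "d": 2, "e": 8, "f": 7},
}
roles = infer_roles(features)
print(role_composition(roles))
```

Recomputing the levels per time window is what lets the same rules track a community whose overall activity is rising or falling.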
  • 63. Data from Boards.ie •  Forum 246 (Commuting and Transport): Demonstrates a clear increase in activity over time. •  Forum 388 (Rugby): Exhibits periodic increase and decrease in activity and hence provides good examples of healthy/unhealthy evolutions. •  Forum 411 (Mobile Phones and PDAs): Increase in activity over time with some fluctuation, i.e. reduction and increase over various time windows. •  The data covers the period from 2004-01 to 2006-12
  • 64. Features •  In-degree Ratio: The proportion of users U that reply to user υi, thus indicating the concentration of users that reply to υi •  Posts Replied Ratio: Proportion of posts by user υi that yield a reply, used to gauge the popularity of the user’s content based on replies •  Thread Initiation Ratio: Proportion of threads that have been started by υi. •  Bi-directional Threads Ratio: Proportion of threads where user υi replies to a user and receives a reply, thus forming a reciprocal communication •  Bi-directional Neighbours Ratio: The proportion of neighbours where a reciprocal interaction has taken place, e.g. υi replied to υj and υj replied to υi. •  Average Posts per Thread: The average number of posts made in every thread that user υi has participated in •  Standard Deviation of Posts per Thread: The standard deviation of the number of posts in every thread that user υi has participated in. This gauges the distribution of the discussion lengths.
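A few of these ratios can be sketched from a flat list of posts. The record layout (`id`, `author`, `thread`, `reply_to`), the choice to ignore self-replies and the exact denominators are assumptions made for this example; the slide does not specify them.

```python
def ratio(numerator, denominator):
    """Safe division: returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def user_features(posts, user):
    """Compute some of the behaviour features for one user.

    Each post is a dict with hypothetical keys 'id', 'author', 'thread'
    and 'reply_to' (id of the parent post, or None for a thread starter).
    """
    by_id = {p["id"]: p for p in posts}
    all_users = {p["author"] for p in posts}
    repliers, replied_to, posts_with_reply = set(), set(), set()
    for post in posts:
        parent = by_id.get(post["reply_to"])
        if parent is None or parent["author"] == post["author"]:
            continue  # thread starter, or a self-reply (ignored here)
        if parent["author"] == user:
            repliers.add(post["author"])
            posts_with_reply.add(parent["id"])
        if post["author"] == user:
            replied_to.add(parent["author"])

    own_posts = [p for p in posts if p["author"] == user]
    all_threads = {p["thread"] for p in posts}
    started = {p["thread"] for p in own_posts if p["reply_to"] is None}
    neighbours = repliers | replied_to
    return {
        "in_degree_ratio": ratio(len(repliers), len(all_users) - 1),
        "posts_replied_ratio": ratio(len(posts_with_reply), len(own_posts)),
        "thread_initiation_ratio": ratio(len(started), len(all_threads)),
        "bidirectional_neighbours_ratio": ratio(
            len(repliers & replied_to), len(neighbours)
        ),
    }


posts = [
    {"id": 1, "author": "u1", "thread": "t1", "reply_to": None},
    {"id": 2, "author": "u2", "thread": "t1", "reply_to": 1},
    {"id": 3, "author": "u1", "thread": "t1", "reply_to": 2},
    {"id": 4, "author": "u3", "thread": "t2", "reply_to": None},
    {"id": 5, "author": "u1", "thread": "t2", "reply_to": 4},
]
u1 = user_features(posts, "u1")
print(u1)
```

In the toy data, u1 and u2 reply to each other (a reciprocal neighbour), while u1 replies to u3 without receiving a reply.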
  • 66. Results •  Correlation of individual features in each of the three forums [Figure panels: Commuting and Transport; Rugby; Mobile Phones and PDAs]
  • 67. Results [Figure panels: (a) Forum 246: Commuting and Transport; (b) Forum 388: Rugby; (c) Forum 411: Mobile Phones and PDAs] •  Variation in behaviour composition & activity •  Behaviour composition (in)stability influences forum activity
  • 68. Prediction analysis – preliminary results!
      •  Predicting rise/fall in post submission numbers
      •  Binary classification
      •  Features: Community composition, roles and percentages of users associated with each
      Forum    P        R        F1       ROC
      246      0.799    0.769    0.780    0.800
      388      0.603    0.615    0.605    0.775
      411      0.765    0.692    0.714    0.617
      All      0.583    0.667    0.607    0.466
      •  Cross-community predictions are less reliable than individual community analysis due to the idiosyncratic behaviour observed in each individual community
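As a reminder of what the P, R and F1 columns measure in a binary rise/fall task, here is a minimal sketch. The labels and the helper name `precision_recall_f1` are illustrative, and the reported table values may be averaged differently (for example, weighted across both classes).

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one class of a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical labels: 1 = activity rises in the next window, 0 = falls.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
p, r, f1 = precision_recall_f1(y_true, y_pred)
print(p, r, f1)
```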
  • 69. Observations so far •  Growing communities contain more elitists and popular participants •  Shrinking communities contain many taciturns and ignored users •  A stable composition, with a mix of roles, is associated with increased community activity •  Different communities may require different behaviour compositions to increase activity/health
  • 70. What features make online communities tick
  • 71. •  How many do you recognise? Use? •  Which ones still exist? •  Which are strong and healthy? •  Which are aging and withering? •  What health signs should we look for? •  How can we predict their future evolution? 71
  • 72. Rise and fall of social networks 72
  • 73. Predicting engagement •  Which posts will receive a reply? –  What are the most influential features here? •  How much discussion will it generate? –  What are the key factors of lengthy discussions? 73
  • 74. Common online community features
      Paper excerpt shown on the slide: user attributes describe the reputation of the user, and attributes of a post's content are generally referred to as content features. Table 1 defines user and content features, whose influence on the discussion "continuation" is studied.
      Table 1. User and Content Features
      User Features
        In Degree: Number of followers of U
        Out Degree: Number of users U follows
        List Degree: Number of lists U appears on. Lists group users by topic
        Post Count: Total number of posts the user has ever posted
        User Age: Number of minutes from user join date
        Post Rate: Posting frequency of the user, $\frac{PostCount}{UserAge}$
      Content Features
        Post length: Length of the post in characters
        Complexity: Cumulative entropy of the unique words in post p, of total word length $\lambda$, with $n$ unique words and $p_i$ the frequency of each word: $\frac{1}{\lambda}\sum_{i \in [1,n]} p_i(\log\lambda - \log p_i)$
        Uppercase count: Number of uppercase words
        Readability: Gunning fog index using average sentence length (ASL) [7] and the percentage of complex words (PCW): $0.4(ASL + PCW)$
        Verb Count: Number of verbs
        Noun Count: Number of nouns
        Adjective Count: Number of adjectives
        Referral Count: Number of @user
        Time in the day: Normalised time in the day measured in minutes
        Informativeness: Terminological novelty of the post wrt other posts. The cumulative tfIdf value of each term t in post p: $\sum_{t \in p} tfidf(t, p)$
        Polarity: Cumulation of polar term weights in p (using the Sentiwordnet3 lexicon) normalised by polar terms count: $\frac{Po + Ne}{|terms|}$
      •  How do all these features influence activity generation in an online community?
         –  Such knowledge leads to better use and management of the community
      Paper excerpt (4.2 Experiments): experiments are intended to test the performance of different classification models in identifying seed posts. Four classifiers were used, among them the discriminative classifiers Perceptron and SVM and the generative classifier Naive Bayes.
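Two of the content features can be made concrete with a short sketch. The complexity function follows one reading of the slide's formula, in which λ is the total number of words in the post and p_i is the count of the i-th unique word; that reading, the whitespace tokenisation and the function names are assumptions of this example. The Gunning fog helper takes ASL and PCW as given inputs rather than estimating them from text.

```python
import math
from collections import Counter


def complexity(post):
    """Word entropy of a post: (1/L) * sum_i p_i * (log L - log p_i),
    with L the total word count and p_i the count of unique word i."""
    words = post.lower().split()  # naive whitespace tokenisation
    total = len(words)
    if total == 0:
        return 0.0
    counts = Counter(words)
    return sum(
        c * (math.log(total) - math.log(c)) for c in counts.values()
    ) / total


def gunning_fog(avg_sentence_length, pct_complex_words):
    """Gunning fog index from ASL and PCW, as given on the slide."""
    return 0.4 * (avg_sentence_length + pct_complex_words)


print(complexity("help needed help needed now"))
print(gunning_fog(avg_sentence_length=10, pct_complex_words=5))
```

Under this reading, a post that repeats a single word has complexity 0, and a post of n distinct words has complexity log n, so the feature grows with lexical variety.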
  • 75. Experiment for identifying seed posts
      •  Twitter data on the Haiti earthquake, and the Union Address
      Dataset          Users     Tweets    Seeds    Non-seeds    Replies
      Haiti            44,497    65,022    1,405    60,686       2,931
      Union Address    66,300    80,272    7,228    55,169       17,875
      •  Evaluated a binary classification task
         –  Is this post a seed post or not?
  • 76. Identifying seeds with different type of features
      Table 3. Results from the classification of seed posts using varying feature sets and classification models
                          (a) Haiti Dataset                  (b) Union Address Dataset
      Features  Model     P        R        F1       ROC      P        R        F1       ROC
      User      Perc      0.794    0.528    0.634    0.727    0.658    0.697    0.677    0.673
                SVM       0.843    0.159    0.267    0.566    0.510    0.946    0.663    0.512
                NB        0.948    0.269    0.420    0.785    0.844    0.086    0.157    0.707
                J48       0.906    0.679    0.776    0.822    0.851    0.722    0.782    0.830
      Content   Perc      0.875    0.077    0.142    0.606    0.467    0.698    0.560    0.457
                SVM       0.552    0.727    0.627    0.589    0.650    0.589    0.618    0.638
                NB        0.721    0.638    0.677    0.769    0.762    0.212    0.332    0.649
                J48       0.685    0.705    0.695    0.711    0.740    0.533    0.619    0.736
      All       Perc      0.794    0.528    0.634    0.726    0.630    0.762    0.690    0.672
                SVM       0.483    0.996    0.651    0.502    0.499    0.990    0.664    0.506
                NB        0.962    0.280    0.434    0.852    0.874    0.212    0.341    0.737
                J48       0.824    0.775    0.798    0.836    0.890    0.810    0.848    0.877
      Paper excerpt (4.3 Results): Our findings from Table 3 demonstrate the effectiveness of using solely user features for identifying seed posts. In both the Haiti and Union Address datasets, training a classification model using user features shows improved performance over the same models trained using content features.
      •  User features are most important in Twitter
      •  But combining user & content features gives best results
  • 77. Impact of different features
      •  What features have the highest impact on identification of seed posts?
      •  Rank features by information gain ratio wrt seed post class label
      Paper excerpt: a value of 0.674 is reported, indicating a good correlation between the two lists and their respective ranks.
      Table 4. Features ranked by Information Gain Ratio wrt Seed Post class label. The feature name is paired with its IG in brackets.
      Rank    Haiti                              Union Address
      1       user-list-degree (0.275)           user-list-degree (0.319)
      2       user-in-degree (0.221)             content-time-in-day (0.152)
      3       content-informativeness (0.154)    user-in-degree (0.133)
      4       user-num-posts (0.111)             user-num-posts (0.104)
      5       content-time-in-day (0.089)        user-post-rate (0.075)
      6       user-post-rate (0.075)             user-out-degree (0.056)
      7       content-polarity (0.064)           content-referral-count (0.030)
      8       user-out-degree (0.040)            user-age (0.015)
      9       content-referral-count (0.038)     content-polarity (0.015)
      10      content-length (0.020)             content-length (0.010)
      11      content-readability (0.018)        content-complexity (0.004)
      12      user-age (0.015)                   content-noun-count (0.002)
      13      content-uppercase-count (0.012)    content-readability (0.001)
      14      content-noun-count (0.010)         content-verb-count (0.001)
      15      content-adj-count (0.005)          content-adj-count (0.0)
      16      content-complexity (0.0)           content-informativeness (0.0)
      17      content-verb-count (0.0)           content-uppercase-count (0.0)
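The ranking criterion on this slide, information gain ratio with respect to the seed/non-seed label, can be sketched for a discretised feature as below. This is a textbook formulation written for illustration; the toy data, the binning into "hi"/"lo" and the function names are assumptions, and numeric features such as list degree would first need discretising.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def gain_ratio(feature_values, labels):
    """Information gain of the feature about the labels, divided by the
    entropy of the feature's own value distribution (split information)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(
        len(group) / total * entropy(group) for group in groups.values()
    )
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: 1 = seed post, 0 = non-seed.
is_seed = [1, 1, 0, 0]
list_degree_bin = ["hi", "hi", "lo", "lo"]   # separates the classes
verb_count_bin = ["hi", "lo", "hi", "lo"]    # carries no information
print(gain_ratio(list_degree_bin, is_seed))
print(gain_ratio(verb_count_bin, is_seed))
```

Features are then sorted by this score, which is how a table like the one above places the most discriminative features first.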
  • 78. Positive/negative impact of features •  What is the correlation between seed posts and features? [Figure panels: Haiti; Union Address] Fig. 3. Contributions of top-5 features to identifying Non-seeds (N) and Seeds (S). Upper plots are for the Haiti dataset and the lower plots are for the Union Address dataset.
  • 79. Identifying Seed Posts •  Can we identify seed posts using the top-k features? –  Stability is reached with 5 features –  Classification with 5 features is sufficient for identifying posts that generate responses 79
  • 80. Predicting Discussion Activity
      •  Reply rates:
         –  Haiti 1-74 responses, Union Address 1-75 responses
      •  Compare rankings
         –  Ground truth vs predicted
      •  Experiments
         –  Using Haiti and Union Address datasets
         –  Evaluate predicted rank k where k = {1, 5, 10, 20, 50, 100}
         –  Support Vector Regression with user, content, user+content features
      Dataset          Training size    Test size    Test Vol Mean    Test Vol SD
      Haiti            980              210          1.664            3.017
      Union Address    5,067            1,161        1.761            2.342
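Comparing a predicted ranking against the ground truth at rank k needs a ranking measure. The slide does not name one, so the sketch below assumes normalised discounted cumulative gain (nDCG@k), a common choice for this kind of comparison; the function names and numbers are illustrative only.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a list of gains in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(true_volumes, predicted_scores, k):
    """nDCG@k: rank posts by predicted score, take each post's true
    reply volume as its gain, and normalise by the ideal ordering."""
    order = sorted(
        range(len(true_volumes)),
        key=lambda i: predicted_scores[i],
        reverse=True,
    )
    predicted_gains = [true_volumes[i] for i in order[:k]]
    ideal_gains = sorted(true_volumes, reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(predicted_gains) / ideal if ideal > 0 else 0.0


# Hypothetical seed posts: true reply counts and regression outputs.
true_volumes = [5, 1, 3]
good_scores = [0.9, 0.1, 0.5]   # same ordering as the ground truth
bad_scores = [0.1, 0.9, 0.5]    # ranks the least-discussed post first
print(ndcg_at_k(true_volumes, good_scores, k=3))
print(ndcg_at_k(true_volumes, bad_scores, k=1))
```

Evaluating at several cut-offs, such as the k values listed on the slide, shows whether a model is better at the very top of the ranking or further down.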
  • 81. Predicting Discussion Activity [Figure panels: Haiti dataset; Union Address dataset] •  Content features are key for top ranks •  User features are more important for higher ranks
  • 82. Identifying Seed Posts in Boards.ie •  Used the same features as before –  User features •  In-degree, out-degree, post count, user age, post rate –  Content features •  Post Length, complexity, readability, referral count, time in day, informativeness, polarity •  New features designed to capture user affinity –  Forum Entropy •  Concentration of forum activity •  Higher entropy = large forum spread –  Forum Likelihood •  Likelihood of forum post given user history •  Combines post history with incoming data 82
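The two focus features can be sketched from a user's posting history. Forum entropy follows the slide directly (higher entropy means activity is spread over more forums). The slide does not give the estimator for forum likelihood, so the additive-smoothing version below, together with the names `forum_likelihood`, `num_forums` and `alpha`, is an assumption of this example.

```python
import math
from collections import Counter


def forum_entropy(forum_history):
    """Entropy (bits) of the distribution of a user's past posts over
    forums. 0 means all posts went to a single forum."""
    total = len(forum_history)
    if total == 0:
        return 0.0
    counts = Counter(forum_history)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def forum_likelihood(forum_history, forum, num_forums, alpha=1.0):
    """Smoothed probability that the user's next post lands in `forum`,
    given their history. Additive smoothing is an illustrative choice."""
    counts = Counter(forum_history)
    return (counts[forum] + alpha) / (len(forum_history) + alpha * num_forums)


focused = ["rugby", "rugby", "rugby", "rugby"]
spread = ["rugby", "commuting", "rugby", "commuting"]
print(forum_entropy(focused), forum_entropy(spread))
print(forum_likelihood(["rugby", "rugby", "commuting"], "rugby", num_forums=3))
```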
  • 83. Experiment for identifying seed posts
      •  Used all posts from Boards.ie in 2006
      •  Built features using a 6-month window prior to seed post date
      Posts        Seeds     Non-Seeds    Replies      Users
      1,942,030    90,765    21,800       1,829,465    29,908
      •  Evaluated a binary classification task
         –  Is this post a seed post or not?
         –  Precision, Recall, F1 and Accuracy
         –  Tested: user, content, focus features, and their combinations
  • 84. Identifying seeds with different type of features
      TABLE II. Results from the classification of seed posts using varying feature sets and classification models
      Features           Model          P        R        F1       ROC
      User               SVM            0.775    0.810    0.774    0.581
                         Naive Bayes    0.691    0.767    0.719    0.540
                         Max Ent        0.776    0.806    0.722    0.556
                         J48            0.778    0.809    0.734    0.582
      Content            SVM            0.739    0.804    0.729    0.511
                         Naive Bayes    0.730    0.794    0.740    0.616
                         Max Ent        0.758    0.806    0.730    0.678
                         J48            0.795    0.822    0.783    0.617
      Focus              SVM            0.649    0.805    0.719    0.500
                         Naive Bayes    0.710    0.737    0.722    0.588
                         Max Ent        0.649    0.805    0.719    0.586
                         J48            0.649    0.805    0.719    0.500
      User + Content     SVM            0.790    0.808    0.727    0.509
                         Naive Bayes    0.712    0.772    0.732    0.593
                         Max Ent        0.767    0.807    0.734    0.671
                         J48            0.795    0.821    0.779    0.675
      User + Focus       SVM            0.776    0.810    0.776    0.583
                         Naive Bayes    0.699    0.778    0.724    0.585
                         Max Ent        0.771    0.806    0.722    0.607
                         J48            0.777    0.810    0.742    0.617
      Content + Focus    SVM            0.750    0.805    0.729    0.511
                         Naive Bayes    0.732    0.787    0.746    0.658
                         Max Ent        0.762    0.807    0.731    0.692
                         J48            0.798    0.823    0.787    0.662
      All                SVM            0.791    0.808    0.727    0.510
                         Naive Bayes    0.724    0.780    0.740    0.637
                         Max Ent        0.768    0.808    0.733    0.688
                         J48            0.798    0.824    0.792    0.692
  • 85. Positive/negative impact of features on Boards.ie
      •  What are the most important features for predicting seed posts?
      TABLE III. Reduction in F1 levels as individual features are dropped from the J48 classifier
      Feature Dropped     F1
      -                   0.815
      Post Count          0.815
      In-Degree           0.811*
      Out-Degree          0.811*
      User Age            0.807***
      Post Rate           0.815
      Forum Entropy       0.815
      Forum Likelihood    0.798***
      Post Length         0.810**
      Complexity          0.811**
      Readability         0.802***
      Referral Count      0.793***
      Time in Day         0.810**
      Informativeness     0.801***
      Polarity            0.808***
      Signif. codes: p-value < 0.001 ***, 0.01 **, 0.05 *, 0.1 .
      •  Correlations:
         –  Referral counts (non-seeds)
         –  Forum likelihood (seeds)
         –  Informativeness (non-seeds)
         –  Readability (seeds)
         –  User age (non-seeds)
  • 86. Predicting Discussion Activity in Boards.ie •  Can we predict the level of discussion activity? 86
  • 87. Predicting Discussion Activity in Boards.ie •  What impact do features have on discussion length? –  Assessed Linear Regression model with focus and content features –  Forum Likelihood (pos) –  Content Length (+/neutral) –  Complexity (pos) –  Readability (+/neutral) –  Referral Count (neg) –  Time in Day (+/neutral) –  Informativeness (-/neutral) –  Polarity (neg) 87
  • 88. Stay tuned •  More communities –  SAP, IBM, StackOverflow, Reddit –  Compare impact of features on their dynamics •  Better behaviour analysis –  Fewer features, more forums/communities, more graphs! –  Healthy? posts, reciprocation, discussions, sentiment mixture •  Churn analysis –  Correlation of features/behaviour to ‘bounce rate’ •  Intervention! –  Opportunities and mechanisms to influence behaviour
  • 89. Upcoming events
      Social Object Networks, IEEE Social Computing 2011, October 9-10, Boston, USA
      http://ir.ii.uam.es/socialobjects2011/
      Deadline: August 5, 2011
      Intelligent Web Services Meet Social Computing, AAAI Spring Symposium 2012, March 26-28, Stanford, California
      http://vitvar.com/events/aaai-ss12
      Deadline: October 7, 2011
  • 90. Questionnaire on user needs http://socsem.open.ac.uk/limesurvey/index.php?sid=55487 Questionnaire is to identify the needs that community users have within online communities and to learn the factors and issues that influence those needs. 90
  • 91. Thanks to
      My social semantics team
        Sofia Angeletou, Research Associate
        Matthew Rowe, Research Associate
      Live Social Semantics team
        Ciro Cattuto, ISI, Turin
        Wouter van Den Broeck, ISI, Turin
        Alain Barrat, CPT Marseille & ISI
        Martin Szomszor, CeRC, City University, UK
      Acknowledgements
        Gianluca Correndo, Uni Southampton
        Ivan Cantador, UAM, Madrid
        STI International
        ESWC09/10 & HT09 chairs and organisers
        All LSS participants