Presentation for a workshop given at the Centre for Interdisciplinary Methodologies at Warwick University on May 9 2013. Focuses on conceptual and historical questions. Comments, references, and explanations are in the notes.

License: CC Attribution License

- 1. Interactive visualization and exploration of network data with gephi
  and some conceptual context
  Bernhard Rieder
  Universiteit van Amsterdam
  Mediastudies Department
- 2. Context
  Terms like "big data", "computational social science", "digital humanities", "digital methods", etc. are receiving a lot of attention.
  They point to a set of practices of knowledge production: data analysis, visualization, modeling, etc.
  Instead of a totalizing search for a "logic" of data analysis, we could inquire into the vocabulary of concepts and analytical gestures that constitute the practice of data analysis.
  A twofold approach to methods:
  ☉ Engagement, development, application => digital methods
  ☉ Conceptual, historical, and political analysis and critique => software studies
- 3. This workshop
  How do we talk about data? How do we analyze them? What is our frame of thought? How do we go further in terms of imagination, expressivity?
  ☉ Introduction
  ☉ A bit of math
  ☉ Two kinds of mathematics
  ☉ Concepts and techniques from graph theory
  ☉ Working with gephi
  Engage the theory of knowledge (epistemology) mobilized in data analysis, but through the actual techniques and not generalizing concepts.
- 4. Basic ideas
  Why?
  Why do network analysis and visualization? Which arguments are put forward?
  ☉ New media: technical and conceptual structures modeled as networks
  ☉ Calculative capacities: powerful techniques and tools
  ☉ Visualization: the network diagram, "visual analytics"
  ☉ Logistics: data and software are available
  ☉ Methodology: dissatisfaction with statistics (SNA)
  ☉ Society: diversification, problems with demographics / statistics / theory
- 5. Platforms like Twitter boost opportunities for connectivity between various types of actors.
- 6. At the same time, they produce detailed data traces that are highly centralized and searchable.
  Much of these data can be analyzed as graphs.
- 7. What styles of reasoning?
  Hacking (1991) building the concept of "style of reasoning" on A. C. Crombie’s (1994) "styles of scientific thinking":
  ☉ postulation and deduction
  ☉ experiment and empirical research
  ☉ reasoning by analogy
  ☉ ordering by comparison and taxonomy
  ☉ statistical analysis of regularities and probabilities
  ☉ genetic development
  What kind of reasoning are we mobilizing in data analysis?
  Is it one type of reasoning or many?
  Are we "positivists" when we do data analysis? Reductionists?
- 8. Quality / quantity
  "One of my favorite fantasies is a dialogue between Mills and Lazarsfeld in which the former reads to the latter the first sentence of The Sociological Imagination: Nowadays men often feel that their private lives are a series of traps. Lazarsfeld immediately replies: How many men, which men, how long have they felt this way, which aspects of their private lives bother them, do their public lives bother them, when do they feel free rather than trapped, what kinds of traps do they experience, etc., etc., etc. If Mills succumbed, the two of them would have to apply to the National Institute of Mental Health for a million-dollar grant to check out and elaborate that first sentence. They would need a staff of hundreds, and when finished they would have written Americans View Their Mental Health rather than The Sociological Imagination, provided that they finished at all, and provided that either of them cared enough at the end to bother writing anything." (Maurice Stein, cit. in Gitlin 1978)
  Theory vs. empiricism, macro vs. micro, qualitative vs. quantitative, inductive vs. deductive, associative vs. formalistic, etc.
  The promise of data analysis tools, applied to exhaustive (and cheap) data, is to bridge the gap, to allow zooming, "quali-quanti" (Latour 2010).
- 9. Two kinds of mathematics
  Can there be data analysis without math? No.
  Does this imply epistemological commitments? Yes.
  But there are choices, e.g. between:
  ☉ Confirmatory data analysis => deductive
  ☉ Exploratory data analysis (Tukey 1962) => inductive
  There is a fast growing variety of formal analytical gestures relying on mathematical modeling and computation.
- 10. Two kinds of mathematics
  Statistics
  Observed: objects and properties
  Inferred: social forces
  Data representation: the table
  Visual representation: quantity charts
  Grouping: "class" (similar properties)
  Graph theory
  Observed: objects and relations
  Inferred: structure
  Data representation: the matrix
  Visual representation: network diagrams
  Grouping: "clique" (dense relations)
- 11. Graph theory
  Leonhard Euler, "Seven Bridges of Königsberg", 1735
  Introducing the "point and line" model
- 12. Graph theory
  Develops over the 20th century, in particular the second half.
  Integrates branches of mathematics (topology, geometry, statistics, etc.).
  Graph theory is "the mathematics of structure" (Harary 1965), "a mathematical model for any system involving a binary relation" (Harary 1969); it makes relational structure calculable.
  "Perhaps even more than to the contact between mankind and nature, graph theory owes to the contact of human beings between each other." (König 1936)
- 13. Basic ideas
  Moreno 1934
  Graph theory developed in exchange with sociometry, small-group research and (later) social exchange theory.
  Starting point: "the sociometric test" (experimental definition of "relation")
- 14. Basic ideas
- 15. Forsythe and Katz, 1946, "adjacency matrix"
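
The adjacency matrix named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a reconstruction of the Forsythe and Katz figure: the names in `edges` and the helper `adjacency_matrix` are hypothetical, and the data are made up to resemble the directed "choices" of a sociometric test.

```python
# Hypothetical toy data: who names whom (directed choices), not taken from the slides.
edges = [("ann", "bob"), ("bob", "ann"), ("bob", "cat"), ("dan", "cat")]


def adjacency_matrix(edges, directed=True):
    """Return (labels, matrix); matrix[i][j] is 1 if a line runs from point i to point j."""
    labels = sorted({node for edge in edges for node in edge})
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # An undirected relation fills both cells, so the matrix is symmetric.
            matrix[index[target]][index[source]] = 1
    return labels, matrix


labels, matrix = adjacency_matrix(edges)
for label, row in zip(labels, matrix):
    print(label, row)
```

In the directed case, each row sum is a node's out-degree and each column sum its in-degree, which is one way the matrix makes relational structure calculable.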
- 16. Harary, Graph Theory, 1969
- 17. Basic ideas
  The late 1990s
  The network "singularity":
  ☉ The network imaginary, a "new science of networks" (Watts 2005)
  ☉ Computational capacities (memory, speed, interfaces, etc.)
  ☉ New platforms and datasets
  ☉ Packaged tools
  Different traditions conflate to form network analysis:
  ☉ Social network analysis and sociometrics
  ☉ Scientometrics / science and technology studies
  ☉ Mathematics / physics / computer science
  ☉ Information and data visualization
  ☉ Digital sociology / new media studies
- 18. Basic ideas
  Adamic and Glance, "Divided They Blog", 2005
- 19. Formalization
  "As we have seen, the basic terms of digraph theory are point and line. Thus, if an appropriate coordination is made so that each entity of an empirical system is identified with a point and each relationship is identified with a line, then for all true statements about structural properties of the obtained digraph there are corresponding true statements about structural properties of the empirical system." (Harary et al. 1965)
  There is always an epistemological commitment!
  => What can "carry" the reductionism and formalization?
  => What types of analytical gestures?
- 20. Facebook Page "ElShaheeed", June 2010 – June 2011 (Poell / Rieder, forthcoming)
  7K posts, 700K users, 3.6M comments, 10M likes (tool: netvizz), work in progress!
- 21. Facebook Page "ElShaheeed", June 2010 – June 2011: comment timescatter, log10 y scale, likes on
- 22. Facebook Page "ElShaheeed", June 2010 – June 2011: scatterplot comments / likes, per post type
- 23. Facebook Page "ElShaheeed"
  700K nodes, 11M connections
  Color: type
- 24. Facebook Page "ElShaheeed"
  700K nodes, 11M connections
  Color: outdegree
- 25. Basic ideas
  What kind of phenomena/data?
  Interactive networks (Watts 2004): link encodes tangible interaction
  ☉ social networks
  ☉ citation networks
  ☉ hypertext networks
  Symbolic networks (Watts 2004): link is conceptual
  ☉ co-presence (Tracker Tracker, IMDB, etc.)
  ☉ co-word
  ☉ any kind of "structure" that can be expressed as point and line
  => do all kinds of analysis (SNA, transportation, text mining, etc.)
  => analyze structure in various ways
- 26. Basic ideas
  What is a graph?
  An abstract representation of nodes connected by links.
  Two ways of dealing with graphs:
  ☉ mathematical analysis (graph statistics, structural measures, etc.)
  ☉ visualization (network diagram, matrix, arc diagram, etc.)
- 27. Three different force-based layouts of my FB profile
  OpenOrd, ForceAtlas, Fruchterman-Reingold
- 28. Non force-based layouts
  Circle diagram, parallel bubble lines, arc diagram
- 29. Network statistics
  betweenness centrality, degree
  Relational elements of graphs can be represented as tables (nodes have properties) and analyzed through statistics.
  Network statistics bridge the gap between individual units and the structural forms they are embedded in.
  This is currently an extremely prolific field of research.
- 30. Basic ideas
- 31. Basic ideas
  What is a graph?
  Vertices and edges!
  Nodes and lines!
  Two main types:
  Directed (e.g. Twitter)
  Undirected (e.g. Facebook)
  Properties of nodes: degree, centrality, etc.
  Properties of edges: weight, direction, etc.
  Properties of the graph: averages, diameter, communities, etc.
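
The node and graph properties listed on this slide can be made concrete with a small sketch. It assumes a directed graph stored as a plain edge list; the edges, the helper `degree_table`, and the variable names are hypothetical and not taken from the workshop data.

```python
from collections import defaultdict

# Hypothetical directed edges (e.g. "follows" relations); illustrative only.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("d", "c")]


def degree_table(edges):
    """Return in-degree, out-degree and total degree for every node."""
    in_degree = defaultdict(int)
    out_degree = defaultdict(int)
    nodes = set()
    for source, target in edges:
        out_degree[source] += 1
        in_degree[target] += 1
        nodes.update((source, target))
    return {
        node: {
            "in": in_degree[node],
            "out": out_degree[node],
            "degree": in_degree[node] + out_degree[node],
        }
        for node in sorted(nodes)
    }


table = degree_table(edges)

# Graph-level properties derived from the node-level ones.
node_count = len(table)
average_degree = sum(row["degree"] for row in table.values()) / node_count
# Density of a directed graph without self-loops: edges / (n * (n - 1)).
density = len(edges) / (node_count * (node_count - 1))

for node, row in table.items():
    print(node, row)
print("average degree:", average_degree, "density:", round(density, 3))
```

The resulting table is the kind of "nodes have properties" representation that can then be handed to ordinary statistics. Note that tools differ in how they define average degree for directed graphs; the total-degree average used here is only one convention.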
- 32. Basic ideas
- 33. Basic ideas
  Wikipedia: Glossary of graph theory
  Tools are easy, concepts are hard
- 34. Basic ideas
  Interactive visual analytics
  Bringing structure to the surface (gephi panel: "layout")
  ☉ different spatializations (force, geometry, etc.)
  Projecting variables into the diagram (gephi panel: "ranking")
  ☉ Size (nodes, edges, labels, etc.)
  ☉ Color (nodes, edges, labels, etc.)
  Deriving measures (gephi panel: "statistics")
  ☉ Properties of nodes, edges, structure => new variables
  Analysis: e.g. correlation between spatial layout and variables?
- 35. Basic ideas
  http://courses.polsys.net/gephi/
- 36. Basic ideas
- 37. Basic ideas
  Twitter #ows dataset, co-hashtag analysis
  Strong topic clustering
- 38. Twitter 1% sample, co-hashtag analysis
  227,029 unique hashtags, 1627 displayed (freq >= 50)
  Size: frequency
  Color: modularity
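
A co-hashtag graph of the kind shown on these slides can be sketched as follows. This is a minimal illustration under stated assumptions: the tweets are invented, the helper `co_hashtag_graph` and its `min_freq` parameter are hypothetical, and the slides do not say how the original graph was built. Hashtags become nodes, two hashtags are linked when they appear in the same tweet, and the edge weight counts how often that happens.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tweets, each reduced to its set of hashtags; not the Twitter 1% sample.
tweets = [
    {"ows", "occupy"},
    {"ows", "occupy", "nypd"},
    {"ows", "nypd"},
    {"jobs"},
    {"occupy", "jobs"},
]


def co_hashtag_graph(tweets, min_freq=1):
    """Return (nodes, edges): hashtag frequencies and weighted co-occurrence pairs."""
    frequency = Counter(tag for tags in tweets for tag in tags)
    # Keep only hashtags that occur often enough (the slides display freq >= 50).
    kept = {tag for tag, count in frequency.items() if count >= min_freq}
    weights = Counter()
    for tags in tweets:
        # Sorting makes ("a", "b") and ("b", "a") the same undirected edge.
        for pair in combinations(sorted(tags & kept), 2):
            weights[pair] += 1
    nodes = {tag: frequency[tag] for tag in kept}
    return nodes, dict(weights)


nodes, edges = co_hashtag_graph(tweets, min_freq=2)
print(nodes)
print(edges)
```

The frequency stored per node is what a "Size: frequency" mapping would use; measures such as modularity or degree would be computed on the resulting weighted graph in a tool like gephi.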
- 39. Size: frequency
  Color: user diversity
  Twitter 1% sample, co-hashtag analysis
  227,029 unique hashtags, 1627 displayed (freq >= 50)
- 40. Size: frequency
  Color: degree
  Twitter 1% sample, co-hashtag analysis
  227,029 unique hashtags, 1627 displayed (freq >= 50)
- 41. Twitter 1% sample
  Co-hashtag analysis
  Degree vs. wordFrequency
- 42. Degree vs. userDiversity
  Twitter 1% sample
  Co-hashtag analysis
- 43. FB group "Islam is dangerous"
  Friendship network, color: betweenness centrality
  2,339 members
  Average degree of 39.69
  81.7% have at least one friend in the group
  55.4% five or more
  37.2% have 20 or more
  founder and admin has 609 friends
- 44. FB group "Islam is dangerous"
  Friendship network, color: interface language
  en_us, de, en_uk, it dominate
- 45. Mapping European Extremism
  Friendship relations of 18 extreme-right groups
- 46. FB page "Educate children about the evils of Islam"
  Links have more comments, photos more likes.
- 47. FB page "Stop the Islamization of the World"
  Number of posts and reactions
- 48. FB page "Stop the Islamization of the World"
- 49. Basic ideas
  Interactive visual analytics
  Bringing structure to the surface (gephi panel: "layout")
  ☉ different spatializations (force, geometry, etc.)
  Projecting variables into the diagram (gephi panel: "ranking")
  ☉ Size (nodes, edges, labels, etc.)
  ☉ Color (nodes, edges, labels, etc.)
  Deriving measures (gephi panel: "statistics")
  ☉ Properties of nodes, edges, structure => new variables
  Analysis: e.g. correlation between spatial layout and variables?
- 50. Basic ideas
  http://courses.polsys.net/gephi/
- 51. Nine measures of centrality (Freeman 1979)
- 52.
  Label  PR α=0.85  PR α=0.7  PR α=0.55  PR α=0.4  In-Degree  Out-Degree  Degree
  n34    0.0944     0.0743    0.0584     0.0460    4          1           5
  n1     0.0867     0.0617    0.0450     0.0345    1          2           3
  n17    0.0668     0.0521    0.0423     0.0355    2          1           3
  n39    0.0663     0.0541    0.0453     0.0388    5          1           6
  n22    0.0619     0.0506    0.0441     0.0393    5          1           6
  n27    0.0591     0.0451    0.0371     0.0318    1          0           1
  n38    0.0522     0.0561    0.0542     0.0486    6          0           6
  n11    0.0492     0.0372    0.0306     0.0274    3          1           4
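
The effect of α (the damping factor) in the PageRank columns above can be explored with a small power-iteration sketch. This is an assumption-laden illustration: the slides do not say which implementation produced the table, the toy graph and the function `pagerank` are hypothetical, and nodes without outgoing links are handled here by spreading their score evenly over all nodes, which is one common convention among several.

```python
def pagerank(edges, alpha=0.85, iterations=100):
    """Plain power iteration on a directed edge list; scores always sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is redistributed uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - alpha) / n + alpha * dangling / n for node in nodes}
        for node in nodes:
            for target in out_links[node]:
                new_rank[target] += alpha * rank[node] / len(out_links[node])
        rank = new_rank
    return rank


# Hypothetical toy graph: a three-node cycle plus one node that only links in.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]

for alpha in (0.85, 0.7, 0.55, 0.4):
    scores = pagerank(edges, alpha=alpha)
    print(alpha, {node: round(score, 4) for node, score in scores.items()})
```

Lowering α gives more weight to the uniform "random jump" term, so scores move closer together. That is consistent with the flattening visible across the PR columns of the table, where link structure matters less as α drops.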
- 53. Basic ideas
  US Airports
- 54. Thank You
  rieder@uva.nl
  https://www.digitalmethods.net
  http://thepoliticsofsystems.net
  "Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (Tukey 1962)