Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer
Computer Science Department
Stanford University

CIDR 2009 – Monterey, CA
5 January 2009
A Tale of Two Visualizations
vizster
Observations
Groups spent more time in front of the
visualization than individuals.

Friends encouraged each other to unearth
relationships, probe community boundaries, and
challenge reported information.

Social play resulted in informal analysis, often
driven by story-telling of group histories.
NameVoyager
The Baby Name Voyager
Social Data Analysis
Visual sensemaking can be social as
well as cognitive.
Analysis of data coupled with social
interpretation and deliberation.

How can user interfaces catalyze and
support collaborative visual analysis?
sense.us
A Web Application for Collaborative
Visualization of Demographic Data
Voyagers and Voyeurs
Complementary faces of analysis
Voyager – focus on visualized data
Active engagement with the data
Serendipitous comment discovery

Voyeur – focus on comment listings
Investigate others’ explorations
Find people and topics of interest
Catalyze new explorations
Out of the Lab, Into the Wild
Wikimapia.org
Spotfire DecisionSite Posters
Tableau Server
Many-Eyes
Social Data Analysis In Action
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming

For each, some thoughts on future directions.
I asked my colleagues: if you could give database
researchers a wish list, what would it be?
Discussion and Debate
Tableau X-Box / Quest Diag?

              “Valley of Death”
Content Analysis of Comments
[Figure: bar charts of comment feature prevalence (Percentage, 0 to 80) for Sense.us and Many-Eyes. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]

Feature prevalence from content analysis (min Cohen’s κ = .74)
High co-occurrence of Observations, Questions, and Hypotheses
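
The agreement figure quoted for the content analysis is presumably Cohen's kappa, computed per comment feature between two coders. As a minimal sketch of how such a value is obtained, the function below compares two coders' binary judgments (feature present or absent) for the same comments. The function name and the sample codes are hypothetical and are not data from the talk.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters chose the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codes: 1 means the coder marked the comment as containing
# a given feature (say, "Hypothesis"), 0 means the coder did not.
coder_1 = [1, 1, 0, 0, 1, 0, 1, 0]
coder_2 = [1, 1, 0, 0, 1, 0, 0, 0]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

Here the coders agree on 7 of 8 comments while chance agreement is 0.5, so kappa is 0.75. A "min" across features would be the smallest such value over all coded categories.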
WANTED: Structured Conversation

Reduce the cost of synthesizing contributions




Wikipedia: Shared Revisions   NASA ClickWorkers: Statistics
Can we represent data, visualizations, and social
activity in a unified data model?
Text is Data, Too
Visualization Popularity
[Figure: bar charts of visualization type usage (axis labeled Percentage, 0.0 to 0.5) for Many-Eyes and Swivel. Types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.]

Over 1/3 of Many-Eyes visualizations use free text
Alberto Gonzales
WANTED: Better Tools for Text

Statistical Analysis of text (with ties to source!)
Entity Extraction
Aggregation and Comparison of texts
  Get a “global” view of documents

We can do better than Tag Clouds (!?)
Use text analysis tools to enable analysis of
structured conversation by the community.
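
One way to read "statistical analysis of text (with ties to source!)" is a term index that never loses track of which comment each term came from. The sketch below is only an illustration under that assumption: the function names and comment ids are hypothetical, and the sample texts are borrowed from comments quoted elsewhere in these slides.

```python
import re
from collections import defaultdict

def build_term_index(comments):
    """Map each lowercase term to the set of comment ids that contain it."""
    index = defaultdict(set)
    for comment_id, text in comments.items():
        for term in re.findall(r"[a-z']+", text.lower()):
            index[term].add(comment_id)
    return index

def top_terms(index, k=3):
    """Rank terms by number of source comments, ties broken alphabetically."""
    return sorted(index, key=lambda t: (-len(index[t]), t))[:k]

# Hypothetical comment ids; digits such as "1910" are skipped by the pattern.
comments = {
    "c1": "No cooks in 1910?",
    "c2": "There may have been cooks then.",
    "c3": "Or just a bug in the data?",
}
index = build_term_index(comments)
for term in top_terms(index, k=2):
    print(term, sorted(index[term]))
```

Because every aggregate count is backed by a set of comment ids, a "global" view of the documents can still link back to the individual comments behind it.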
Data Integrity and Cleaning
No cooks in 1910? … There may have
been cooks then. But maybe not.
The great postmaster scourge of 1910?
Or just a bug in the data?
Content Analysis of Comments
[Figure: bar charts of comment feature prevalence (Percentage, 0 to 80) for Sense.us and Many-Eyes. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.]

16% of sense.us comments and 10% of Many-Eyes comments
reference data quality or integrity.
WANTED: Data Cleaning Tools

Reshape data, reformat rows & columns
Handle missing data: label, repair, interpolate
Entity resolution and de-duplication
Group related values into aggregates
Assist table lookups & data transforms

Provide tools in situ to leverage collective
Transparency requires provenance
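
As a rough sketch of the "label, repair, interpolate" item, the function below fills interior gaps by linear interpolation and returns a parallel list of flags marking which values were repaired, so the repair stays visible rather than silent. The function name and the sample series are hypothetical. Linear interpolation is just one possible repair strategy, not one prescribed by the talk.

```python
def interpolate_missing(values):
    """Fill interior None gaps by linear interpolation.

    Returns (filled, repaired), where repaired[i] is True for every
    position that was filled in. Leading or trailing gaps are left as None.
    """
    filled = list(values)
    repaired = [False] * len(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
            repaired[i] = True
    return filled, repaired

# Hypothetical counts per decade; None marks a missing value.
counts = [100, None, 140, None, None, 200]
filled, repaired = interpolate_missing(counts)
print(filled)
print(repaired)
```

Keeping the `repaired` flags alongside the data is a small step toward provenance: a viewer can render interpolated points differently from reported ones.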
Integrating Data in Context
College Drug Use
Harry Potter is Freaking Popular
WANTED: In-Situ Data Integration

Search for and suggest related data or views
User input for types, schema matching, or data
Apply in context of the current task
 But record mappings for future use
Record provenance: chain of data sources

Examples: Google Web Tables, Pay-As-You-Go,
  Stanford Vispedia, Utah VisTrails
Pointing and Naming
“Look at that spike.”
“Look at the spike for Turkey.”
“Look at the spike in the middle.”
Free-form   Data-aware
Visual Queries
Model selections as declarative queries over
interface elements or underlying data




  (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)
Applicable to dynamic, time-varying data
Retarget selection across visual encodings
Support social navigation and data mining
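
The idea of modeling a selection as a declarative query can be sketched as follows. A rectangular selection is stored as bounds in data space rather than as pixels, so it can be re-evaluated against changing data or applied under a different visual encoding. The class name, the record layout, and the sample records are hypothetical. The bounds are the ones shown in the example query.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RangeSelection:
    """A rectangular selection stored as data-space bounds, not pixels."""
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float

    def matches(self, record):
        """True if the record's lon/lat fall inside the selection."""
        return (self.lon_min <= record["lon"] <= self.lon_max
                and self.lat_min <= record["lat"] <= self.lat_max)

    def to_predicate(self):
        """Render the selection as a textual range predicate."""
        return (f"({self.lon_min} <= lon AND lon <= {self.lon_max}) AND "
                f"({self.lat_min} <= lat AND lat <= {self.lat_max})")

selection = RangeSelection(-118.371, -118.164, 33.915, 34.089)

# Hypothetical records; only the first lies inside the selected region.
records = [
    {"name": "a", "lon": -118.25, "lat": 34.05},
    {"name": "b", "lon": -117.90, "lat": 33.80},
]
selected = [r["name"] for r in records if selection.matches(r)]
print(selection.to_predicate())
print(selected)
```

Because the selection is a query over the underlying data, the same object could be attached to an annotation and re-run later as new records arrive.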
WANTED: Data-Aware Annotation

Meta-queries linking annotations to views
Visually specifying notification triggers
Annotating data aggregates (use lineage?)
Unified model (again!) to facilitate reference
How to make it work at scale?

How else to use machine-readable annotations?
Can annotations be used to steer data mining?
Conclusion
Social Data Analysis
Collective analysis of data supported
by social interaction.
1. Discussion and Debate
2. Text is Data, Too
3. Data Integrity and Cleaning
4. Integrating Data in Context
5. Pointing and Naming
Summary
As visualization becomes common on the web,
opportunities for collaborative analysis abound.
Weave visualizations into the web: data access,
visualization creation, view sharing and pointing.
Support discovery, discussion, and integration
of contributions to leverage the collective.
Improve both processes and technologies for
communication and dissemination.
Parting Thoughts
Visualizations may have a catalytic effect
on social interaction around data.

Encourage participation by minimizing or
offsetting interaction costs.

Provide incentives by fostering the
personal relevance of the data.
Acknowledgements

@ Berkeley: Maneesh Agrawala, Wes Willett,
  danah boyd, Marti Hearst, Joe Hellerstein
@ IBM: Martin Wattenberg, Fernanda Viégas
@ PARC: Stu Card
@ Tableau: Jock Mackinlay, Chris Stolte,
  Christian Chabot
Voyagers and Voyeurs
Supporting Social Data Analysis

Jeffrey Heer Stanford University
jheer@stanford.edu
http://jheer.org
With a collaborative spirit, with a collaborative platform
where people can upload data, explore data, compare
solutions, discuss the results, build consensus, we can
engage passionate people, local communities, media and
this will raise - incredibly - the amount of people who can
understand what is going on.

And this would have fantastic outcomes: the engagement of
people, especially new generations; it would increase
knowledge, unlock statistics, improve transparency and
accountability of public policies, change culture, increase
numeracy, and in the end, improve democracy and welfare.

       Enrico Giovannini, Chief Statistician, OECD. June 2007.

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

CIDR 2009: Jeff Heer Keynote

  • 1. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Computer Science Department, Stanford University. CIDR 2009 – Monterey, CA, 5 January 2009
  • 2. A Tale of Two Visualizations
  • 4. Observations: Groups spent more time in front of the visualization than individuals. Friends encouraged each other to unearth relationships, probe community boundaries, and challenge reported information. Social play resulted in informal analysis, often driven by story-telling of group histories.
  • 10. Social Data Analysis: Visual sensemaking can be social as well as cognitive. Analysis of data coupled with social interpretation and deliberation. How can user interfaces catalyze and support collaborative visual analysis?
  • 11. sense.us: A Web Application for Collaborative Visualization of Demographic Data
  • 13. Voyagers and Voyeurs: Complementary faces of analysis. Voyager – focus on visualized data: active engagement with the data; serendipitous comment discovery. Voyeur – focus on comment listings: investigate others’ explorations; find people and topics of interest; catalyze new explorations.
  • 14. Out of the Lab, Into the Wild
  • 22. Social Data Analysis In Action: 1. Discussion and Debate; 2. Text is Data, Too; 3. Data Integrity and Cleaning; 4. Integrating Data in Context; 5. Pointing and Naming. For each, some thoughts on future directions. I asked my colleagues: if you could give database researchers a wish list, what would it be?
  • 27. Tableau X-Box / Quest Diag? “Valley of Death”
  • 31. Content Analysis of Comments [Chart: percentage of comments in each category, Sense.us vs. Many-Eyes; axes 0 to 80 percent. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.] Feature prevalence from content analysis (min Cohen’s κ = .74). High co-occurrence of Observations, Questions, and Hypotheses.
  • 32. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Wikipedia: Shared Revisions. NASA ClickWorkers: Statistics.
  • 33. WANTED: Structured Conversation. Reduce the cost of synthesizing contributions. Can we represent data, visualizations, and social activity in a unified data model?
  • 35. Visualization Popularity [Chart: share of visualizations by type, Many-Eyes vs. Swivel; axes labeled Percentage, 0.0 to 0.5. Types: Tag Cloud, Bubble Graph, Word Tree, Bar Chart, Maps, Network Diagram, Treemap, Matrix Chart, Line Graph, Scatterplot, Stacked Graph, Pie Chart, Histogram.] Over 1/3 of Many-Eyes visualizations use free text.
  • 38. WANTED: Better Tools for Text. Statistical analysis of text (with ties to source!); entity extraction; aggregation and comparison of texts; get a “global” view of documents. We can do better than Tag Clouds (!?). Use text analysis tools to enable analysis of structured conversation by the community.
  • 39. Data Integrity and Cleaning
  • 40. No cooks in 1910? … There may have been cooks then. But maybe not.
  • 41. The great postmaster scourge of 1910? Or just a bug in the data?
  • 44. Content Analysis of Comments [Chart: percentage of comments in each category, Sense.us vs. Many-Eyes; axes 0 to 80 percent. Categories: Observation, Question, Hypothesis, Data Integrity, Linking, Socializing, System Design, Testing, Tips, To-Do, Affirmation.] 16% of sense.us comments and 10% of Many-Eyes comments reference data quality or integrity.
  • 45. WANTED: Data Cleaning Tools. Reshape data, reformat rows & columns; handle missing data: label, repair, interpolate; entity resolution and de-duplication; group related values into aggregates; assist table lookups & data transforms. Provide tools in situ to leverage collective. Transparency requires provenance.
  • 51. Harry Potter is Freaking Popular
  • 53. WANTED: In-Situ Data Integration. Search for and suggest related data or views; user input for types, schema matching, or data; apply in context of the current task, but record mappings for future use; record provenance: chain of data sources. Examples: Google Web Tables, Pay-As-You-Go, Stanford Vispedia, Utah VisTrails.
  • 55. “Look at that spike.”
  • 56. “Look at the spike for Turkey.”
  • 57. “Look at the spike in the middle.”
  • 58. Free-form Data-aware
  • 59. Visual Queries: Model selections as declarative queries over interface elements or underlying data, e.g. (-118.371 ≤ lon AND lon ≤ -118.164) AND (33.915 ≤ lat AND lat ≤ 34.089)
  • 60. Visual Queries: Model selections as declarative queries over interface elements or underlying data. Applicable to dynamic, time-varying data; retarget selection across visual encodings; support social navigation and data mining.
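
    The idea on the two Visual Queries slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in sense.us or any other system from the talk: the `RangeQuery` class, its methods, and the sample points are hypothetical. Only the longitude and latitude bounds come from the slide. The point is that a selection stored as a predicate, rather than as a list of picked items, can be re-evaluated when the data changes.

    ```python
    from dataclasses import dataclass
    from typing import Any, Dict, List, Tuple

    Row = Dict[str, Any]


    @dataclass
    class RangeQuery:
        """Hypothetical model of a rectangular selection as a declarative query."""

        # field name -> (low, high), both bounds inclusive
        bounds: Dict[str, Tuple[float, float]]

        def matches(self, row: Row) -> bool:
            # A row is selected when every bounded field lies inside its range.
            return all(low <= row[field] <= high
                       for field, (low, high) in self.bounds.items())

        def select(self, rows: List[Row]) -> List[Row]:
            # Evaluate the query against whatever data is current.
            return [row for row in rows if self.matches(row)]

        def to_expr(self) -> str:
            # Render the selection in the textual form shown on the slide.
            return " AND ".join(
                f"({low} <= {field} AND {field} <= {high})"
                for field, (low, high) in self.bounds.items()
            )


    # Bounds taken from the slide; the data points below are made up.
    selection = RangeQuery({"lon": (-118.371, -118.164), "lat": (33.915, 34.089)})

    points = [
        {"name": "a", "lon": -118.25, "lat": 34.05},
        {"name": "b", "lon": -122.42, "lat": 37.77},
    ]
    selected_names = [row["name"] for row in selection.select(points)]

    # New data arrives: the stored query is re-run, with no need to re-brush.
    points.append({"name": "c", "lon": -118.30, "lat": 33.95})
    reselected_names = [row["name"] for row in selection.select(points)]
    ```

    Because the query refers to data fields rather than pixels, the same `selection` object could in principle be applied to a map, a scatterplot, or a table of the same records, which is one reading of "retarget selection across visual encodings".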
  • 61. WANTED: Data-Aware Annotation. Meta-queries linking annotations to views; visually specifying notification triggers; annotating data aggregates (use lineage?); unified model (again!) to facilitate reference. How to make it work at scale? How else to use machine-readable annotations? Can annotations be used to steer data mining?
  • 63. Social Data Analysis: Collective analysis of data supported by social interaction. 1. Discussion and Debate; 2. Text is Data, Too; 3. Data Integrity and Cleaning; 4. Integrating Data in Context; 5. Pointing and Naming.
  • 64. Summary: As visualization becomes common on the web, opportunities for collaborative analysis abound. Weave visualizations into the web: data access, visualization creation, view sharing and pointing. Support discovery, discussion, and integration of contributions to leverage the collective. Improve both processes and technologies for communication and dissemination.
  • 65. Parting Thoughts: Visualizations may have a catalytic effect on social interaction around data. Encourage participation by minimizing or offsetting interaction costs. Provide incentives by fostering the personal relevance of the data.
  • 66. Acknowledgements. @ Berkeley: Maneesh Agrawala, Wes Willett, danah boyd, Marti Hearst, Joe Hellerstein. @ IBM: Martin Wattenberg, Fernanda Viégas. @ PARC: Stu Card. @ Tableau: Jock Mackinlay, Chris Stolte, Christian Chabot.
  • 67. Voyagers and Voyeurs: Supporting Social Data Analysis. Jeffrey Heer, Stanford University, jheer@stanford.edu, http://jheer.org
  • 68. With a collaborative spirit, with a collaborative platform where people can upload data, explore data, compare solutions, discuss the results, build consensus, we can engage passionate people, local communities, media and this will raise - incredibly - the amount of people who can understand what is going on. And this would have fantastic outcomes: the engagement of people, especially new generations; it would increase knowledge, unlock statistics, improve transparency and accountability of public policies, change culture, increase numeracy, and in the end, improve democracy and welfare. Enrico Giovannini, Chief Statistician, OECD. June 2007.