Presentation by Wiebke Loosen, Julius Reimer & Fenja Schmidt held at The Future of Journalism Conference: Risks, Threats and Opportunities, Sep 10th 2015, Cardiff
When Data Become News. A Content Analysis of Data Journalism Pieces.
1. When Data Become News
A content analysis of data journalism pieces
Wiebke Loosen, Julius Reimer & Fenja Schmidt
@wloosen @julius_reimer @Fen_Ja
The Future of Journalism Conference: Risks, Threats and Opportunities | Cardiff | 2015
2. Introduction: ‘Big Data’ and the Data-Driven Society
• Double relevance of ‘big data’ and the data-driven society for journalism:
- Topic worth covering: show related developments and their consequences to make them understandable and publicly debatable
- The ‘computational turn’ affects practices of news production
→ Emergence of a new journalistic sub-field ‘computational/data(-driven) journalism’ (cf. Coddington, 2015; Fink/Anderson, 2015; Lewis, 2015)
Loosen/Reimer/Schmidt
2
3. Literature Review: Research on Data Journalism (#ddj)
A “rapidly growing body” (Lewis, 2015: 322) of studies focusing on:
1. Defining what #ddj is (e.g., Anderson, 2013; Appelgren/Nygren, 2014; Coddington, 2015; Fink/Anderson, 2015; Gray et al., 2012)
Presumed key characteristics:
- (Usually large) sets of quantitative (digital) data
- Visualisation (maps, bar charts, etc.)
- Participation and crowdsourcing
- Open data and open source
2. Researching what actors in the field do and think (Appelgren/Nygren, 2014; De Maeyer et al., 2015; Fink/Anderson, 2015; Parasie, 2014; Parasie/Dagiral, 2013; Karlsen/Stavelin, 2014; Weinacht/Spiller, 2014)
→ No systematically gathered insights regarding data journalism as “an emerging form of storytelling” (Appelgren/Nygren, 2014: 394)
4. Research Objectives
Focus on the output of #ddj to better understand its reporting styles and data sources:
→ Map actual occurrence and classify different types of presumed key characteristics in data-driven pieces:
- Data sets and data processing
- Visualisation elements
- Interactive features
→ Determine topics covered
→ Identify media organisations which are particularly active in the field
5. Methodology: Sample
• Nominees for the Data Journalism Award (issued annually by the Global Editors Network) 2013 and 2014 (following Lanosga, 2014; Wahl-Jorgensen, 2013a, 2013b)
• Particular sample with a ‘double bias’ (special group, self-selected) and a ‘double advantage’ (defined as #ddj by experts in the field, seen as ‘gold standard’ that could influence further development)

| Year  | Submissions | Nominated projects | Projects suited for analysis | Award-winning projects (% of analysed projects) |
|-------|-------------|--------------------|------------------------------|--------------------------------------------------|
| 2013  | >300        | 72                 | 56                           | 6 (10.7)                                         |
| 2014  | 520         | 75                 | 64                           | 9 (14.1)                                         |
| Total | >820        | 147                | 120                          | 15 (12.5)                                        |
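The percentages in the last column of the sample table can be reproduced from the counts of analysed and award-winning projects. The following is a minimal sketch in Python; the variable and function names (`sample`, `award_share`) are illustrative assumptions and not part of the original study.

```python
# Counts taken from the sample table:
# year -> (projects suited for analysis, award-winning projects)
sample = {"2013": (56, 6), "2014": (64, 9)}


def award_share(analysed, awarded):
    """Share of award-winning projects among analysed projects, in percent."""
    return round(100 * awarded / analysed, 1)


# Per-year shares
shares = {year: award_share(*counts) for year, counts in sample.items()}

# Share across both years, computed from the summed counts
total_analysed = sum(analysed for analysed, _ in sample.values())
total_awarded = sum(awarded for _, awarded in sample.values())
shares["Total"] = award_share(total_analysed, total_awarded)

print(shares)
```

The total is derived from the summed counts (15 of 120) rather than by averaging the two yearly percentages, because the two years contribute different numbers of projects.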
6. Methodology: Codebook
• Standardised ‘hand-made’ content analysis (e.g., Krippendorff, 2013; Lombard et al., 2002)

| Dimensions of analysis                       | V No.   | Categories |
|----------------------------------------------|---------|------------|
| Formal characteristics                       | V 1-13  | Medium, topic, language, length & no. of related article(s), no. of people involved, external partners, … |
| Dataset                                      | V 14-22 | Type of data source, access to data, kind of data, geographical & temporal reference, changeability of dataset, unit of analysis, additional info |
| Analysis and journalistic editing of content | V 23-26 | Personalized case example, criticism, visualisation, purpose of analysis |
| Context of use                               | V 27-29 | Interactive functions, online access to the database, opportunities of further interaction/communication |
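For readers who want to work with a codebook of this shape programmatically, the four dimensions and their variable ranges can be held in a small lookup structure. This is only a sketch under assumptions: the names `CODEBOOK` and `dimension_of` are hypothetical, and the authors' actual coding instrument is not documented on the slides beyond the table.

```python
# Hypothetical representation of the codebook dimensions.
# Only the dimension labels and variable ranges come from the slide.
CODEBOOK = {
    "Formal characteristics": range(1, 14),                          # V 1-13
    "Dataset": range(14, 23),                                        # V 14-22
    "Analysis and journalistic editing of content": range(23, 27),   # V 23-26
    "Context of use": range(27, 30),                                 # V 27-29
}


def dimension_of(variable_no):
    """Return the codebook dimension that a variable number belongs to."""
    for dimension, variables in CODEBOOK.items():
        if variable_no in variables:
            return dimension
    raise ValueError(f"V{variable_no} is not part of the codebook")


# Number of variables across all four dimensions
total_variables = sum(len(variables) for variables in CODEBOOK.values())

print(total_variables)
print(dimension_of(25))
```

Keeping the ranges in one place makes it easy to check that the dimensions cover V 1 to V 29 without gaps or overlaps.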
7. Results: Organisations and Staff Involved
• Dominance of newspapers: 42.5 % (of all cases)
• Rise of magazines (7.1 % → 17.2 %) and of investigative journalistic organisations (14.3 % → 25 %)
• Data journalism is mostly a collaborative effort:
- On average five authors/contributors
- Increase from 2013 to 2014
- External partners mentioned in 35 % of all cases
8. Results: Topics Covered and Formal Elements
• Most important topic: politics (48.3 %), often in combination with financial aspects
• Societal issues: 33.3 %; health & science: 21.7 %; business & economy: 20 %
• Mostly combination of visualisation(s) with one (48.3 %) or more (34.2 %) accompanying texts
• Personalised case example as a way to counter abstractness of quantitative data
- In total 40.8 % of the pieces
- Lower rates for economic and education topics (20.8 % and 22.2 %)
12. Example: Personal Data
Your Olympic Athlete Body Match (2013): http://www.bbc.co.uk/news/uk-19050139 (9.9.15)
13. Results: Sources and Access to Data
• Sources: official institutions (67.5 %), other non-commercial organisations (44.2 %), own sources (18.3 %)
• Mostly data that is publicly available (41.7 %), access to data often not indicated (40 %)
15. Example: Connections and Flows
Rede de Escândalos (2013): http://veja.abril.com.br/infograficos/painel_rede_escandalos/network_of_scandals.html (9.9.15)
16. Results: Visualisations & Interactive Features
• Mainly pictures (60.0 %), simple static charts (54.2 %), and maps (49.2 %)
• Rarely animated visualisations (15.8 %), no case without visualisation
• Combination of more than two different kinds of visualisations (74.2 %), often simple static charts with pictures (31.7 %) or a map (27.5 %)
• Interactive functions: mostly zoom and details on demand (55.8 %), filtering (51.7 %)
- 18.3 % of cases have no interactive functions at all
- The average piece contains 1.55 different interactive features
17. Conclusion: The ‘Typical’ #ddj Piece
The ‘typical’ data-driven piece…
• is published by a newspaper,
• covers a political topic,
• relies on public data from official sources,
• builds its story on financial and/or geodata – preferably collected on a national scale,
• is based on a simple unit of analysis such as single persons,
• compares values in order to show differences and similarities between different objects of study (e.g., people of different gender, neighbourhoods),
• combines two types of visualisations – preferably pictures with maps or simple charts,
• allows the user to zoom into a map, request details and/or to filter data.
18. Conclusion: Tendencies of Development
• Data journalism is increasingly personnel intensive – at least as far as our particular sample is concerned
• Significant increase of stories building on data from non-commercial organisations (e.g. universities, NGOs, research institutes) between 2013 and 2014
→ #ddj increasingly discovers new data sources
• Awarded stories are more likely to refer to data on a national level; stories from 2014 are less likely to draw on regional data than those from 2013
→ news value of data
• Awarded stories are less likely to contain no interactive functions
• Results for DJA 2015 will show if we can identify any clearer lines of developments
20. References
Anderson, Chris W. (2013). Towards a sociology of computational and algorithmic journalism. New Media & Society, 15(7), pp. 1005–1021.
Appelgren, Ester; Nygren, Gunnar (2014). Data journalism in Sweden. Introducing new methods and genres of journalism into “old” organizations. Digital Journalism, 2(3), pp. 394–405.
Coddington, Mark (2015). Clarifying journalism’s quantitative turn. A typology for evaluating data journalism, computational journalism, and computer-assisted reporting. Digital Journalism, 3(3), pp. 331–348.
De Maeyer, Juliette; Libert, Manon; Domingo, David; Heinderyckx, François; Le Cam, Florence (2015). Waiting for data journalism. A qualitative assessment of the anecdotal take-up of data journalism in French-speaking Belgium. Digital Journalism, 3(3), pp. 432–446.
Fink, Katherine; Anderson, Christopher W. (2015). Data journalism in the United States. Beyond the “usual suspects”. Journalism Studies, 16(4), pp. 467–481.
Gray, Jonathan; Bounegru, Liliana; Chambers, Lucy (eds.) (2012). The data journalism handbook. How journalists can use data to improve the news. (Early release). Sebastopol: O’Reilly.
Karlsen, Joakim; Stavelin, Eirik (2014). Computational journalism in Norwegian newsrooms. Journalism Practice, 8(1), pp. 34–48.
Krippendorff, Klaus (2013). Content analysis: an introduction to its methodology. Los Angeles: SAGE.
Lanosga, Gerry (2014). New views of investigative reporting in the twentieth century. American Journalism, 31(4), pp. 490–506.
Lewis, Seth C. (2015). Journalism in an era of big data. Digital Journalism, 3(3), pp. 321–330.
Lombard, Matthew; Snyder-Duch, Jennifer; Bracken, Cheryl Campanella (2002). Content Analysis in Mass Communication. Assessment and Reporting of Intercoder Reliability. Human Communication Research, 28(4), pp. 587–604.
Parasie, Sylvain (2014). Data-driven revelation? Epistemological tensions in investigative journalism in the age of “big data”. Digital Journalism, DOI: 10.1080/21670811.2014.976408.
Parasie, Sylvain; Dagiral, Eric (2013). Data-driven journalism and the public good. “Computer-assisted-reporters” and “programmer-journalists” in Chicago. New Media & Society, 15(6), pp. 853–871.
Wahl-Jorgensen, Karin (2013a). Subjectivity and story-telling in journalism. Examining expressions of affect, judgement and appreciation in Pulitzer Prize-winning stories. Journalism Studies, 14(3), pp. 305–20.
Wahl-Jorgensen, Karin (2013b). The strategic ritual of emotionality: a case study of Pulitzer Prize-winning articles. Journalism, 14(1), pp. 129–45.
Weinacht, Stefan; Spiller, Ralf (2014). Datenjournalismus in Deutschland. Eine explorative Untersuchung zu Rollenbildern von Datenjournalisten [Data-journalism in Germany. An exploratory study on the role conceptions of data-journalists]. Publizistik, 59(4), pp. 411–433.