• Introduce paper title
• Ask people to interact, comment, respond to our questions during presentation using #tweetprivacy
• Credits
    o Charlesworth – whose Digital Lives Report was one of the only papers that provided any
      analysis and guidance in the area of social media archiving.
  	
  
	
  
	
  
Interest in social media data is multidisciplinary, resulting in conflicting views regarding the ethical
management of captured datasets. Curators will be required to navigate these conflicting views as they
work to provide appropriate mechanisms for access and reuse of these data.
  	
  
	
  
We hope to encourage researchers and library, archive, or repository staff to engage in a cross-disciplinary
conversation about the privacy issues (as well as the host of other issues) inherent in using social media
as a primary source for research.
  	
  
	
  
We’re going to show you a clip from Laila Sakr’s presentation at the Tech@State Data Visualization
Conference in Washington DC. The clip provides a good example of how researchers are using Twitter and
other social media data.
  
	
  
[Play Clip]
  
	
  
	
  
There are two key things I want to point out:
  
	
  
	
1. Long-term archiving of this data and other curatorial issues like value, authenticity, and significant
properties are absent from this talk, which is not surprising. They were also absent in many of the papers
we read that utilized Twitter data. This demonstrates that there is an overall emphasis by researchers, at
this point, on collection and analysis rather than on preservation.
  	
  
	
  
2. Sakr makes sure to say that she is downloading only the publicly available tweets using the Search API,
and she notes how this could potentially affect her sample and its validity. She’s not talking about it in
terms of privacy issues, which further illustrates that the focus is on analysis rather than privacy or ethics.
  
	
  
We’d like to take an informal poll similar to last night’s poll of the audience’s willingness to have their
genome sequenced. Who among those of you who use Twitter as a communication tool is completely fine
with having your tweets, profile information, images, and location data downloaded, analyzed, archived,
and preserved?
  
	
  
- Of those of you with your hands raised, how many of you have tweeted something of a more personal
nature that you might not want archived?
  
	
  
And who here is actively involved with the collection of Twitter data, or any social media data?
[What do you do with it? Tweet here]
  
	
  
	
  
The reason I ask is that we found, through our work with the HyperCities Egypt Twitter data, that the issue of
whether or not there are privacy concerns with a data source like Twitter is essentially a research ethics
issue, which varies depending on the role and/or subject background of the researcher and how they
view the context of the data creation. (Refer to Confounding Relationships to point out various roles.)
  
	
  
So, our central thesis is that perceptions of privacy in social media platforms are formed by disciplinary
culture, the capabilities and constraints of the platform, and the community norms of the platform itself.
  
	
  
Does analyzing a person's Tweets constitute researching a human subject? Or are Tweets a creative text
which requires proper citation and credit to the authors or tweeters? Or are Tweets part of the open
public record? Social scientists tend to view the data as Human Subject research, while Humanists tend to
view the data as a form of publication. These very different ways of viewing the data require different
methods for dealing with privacy.
  	
  
	
  
We feel it is important to state that social media data are not homogeneous; each platform has its own
unique constraints for the creation/inclusion of content, as well as constraints on how users may engage
in the space and on their expectations and norms of interaction.
  	
  
	
  
Our case study focuses on Twitter, so while we provide a general framework for assessing privacy issues
with social media, it must be understood that, because of the uniqueness of Twitter’s Privacy Policies,
Terms of Service, and Developers Rules of the Road, the analysis and interpretation are not necessarily
generalizable to other platforms, such as Facebook. Like many data curation activities, there will be some
facets which can be generalized, while others may be platform or subject specific. Part of determining
the curation needs of social media data will be to determine these boundaries.
  
	
  
	
  
What can we learn about you from Twitter?
[Show different visualizations, then tweet map, tweet image]
Depending on how the data are visualized, we can learn about you as an individual, about your internet
relations, about you as part of a huge collective, or nothing about you as an individual (R-Shief image).
Some visualizations will enable better anonymization than others.
  
	
  
However, if your account is unprotected, the underlying dataset used to generate the visualizations will
still contain your name, location, photos, and anything else you decide to share in your timeline. So if
you include other personal information, like an email address, we can find it out about you.
  
	
  
But what else can we find out about you? [Show the Alyaa Gad slide, then the Google Search] Thanks to
the power of search engines like Google, we can get a lot more information, which may be collected and
archived as well.
  
	
  
	
  
Our Case Study, or what I like to call “We’ve got tweets, now what?”
  
	
  
Todd Presner, a UCLA faculty member, and two researchers collected a subset of the overall Twitter data
available. He asked the library to archive it. Before we could do anything with it, we had to assess what he
had collected.
  	
  
	
  
The HyperCities team used the Twitter Search API to pull data based on the location parameter (within
200 km of the center of Cairo), time period (January 30, 2011 through February 24, 2011), AND one of
three hashtags (#jan25 OR #egypt OR #tahrir).
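
The selection logic described above (location AND time period AND any one of three hashtags) can be sketched as a filter over already-downloaded records. This is only an illustration under stated assumptions: the Tweet record, its field names, and the coordinates used for the center of Cairo are hypothetical, and it does not reproduce the actual Search API request the HyperCities team made.

```python
from dataclasses import dataclass
from datetime import date, datetime
from math import asin, cos, radians, sin, sqrt

# Assumed coordinates for the center of Cairo; the notes do not give the exact point used.
CAIRO_LAT, CAIRO_LON = 30.0444, 31.2357
RADIUS_KM = 200.0
START, END = date(2011, 1, 30), date(2011, 2, 24)
HASHTAGS = {"#jan25", "#egypt", "#tahrir"}


@dataclass
class Tweet:
    """Hypothetical minimal record; real Twitter field names differ."""
    text: str
    created_at: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def matches_collection(tweet: Tweet) -> bool:
    """True when location AND time period AND (any one hashtag) all hold."""
    near_cairo = haversine_km(CAIRO_LAT, CAIRO_LON, tweet.lat, tweet.lon) <= RADIUS_KM
    in_period = START <= tweet.created_at.date() <= END
    # Lower-case each word and drop trailing punctuation before comparing hashtags.
    words = {w.lower().rstrip(".,!?:;") for w in tweet.text.split()}
    has_tag = bool(words & HASHTAGS)
    return near_cairo and in_period and has_tag
```

Writing the criteria down in this form makes it easier to document, for future users, exactly which tweets could and could not have entered the data set.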
  	
  	
  
	
  
They downloaded approximately 420,000 public Tweets during the initial phase of this analysis and
continue to feed their site with live feeds.
  	
  
	
  
As with Sakr's work, the data capture was motivated by the fact that significant events were taking place
using Twitter, and because Twitter data disappears quickly (10 days), they decided to start downloading
it and make it available to as many people as possible for future reference and study.
  	
  
	
  
There wasn’t necessarily any research question or overarching thesis behind the collection other than to
provide a glimpse back to the Egyptian Revolution Twitterverse. As Dr. Charlesworth pointed out
yesterday morning, legal issues with gathering this type of data won’t be at the forefront of the
researcher’s mind.
  	
  
	
  
Based on the search parameters, the data set captured eight out of approximately forty possible Twitter
data fields, revealing how the method of capture and the search parameters profoundly shape the
resultant data.
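
One way to check which fields a capture actually contains is to tally the populated field names across the records. The sketch below assumes the tweets have already been loaded as a list of flat dictionaries; the function name and the example field names are hypothetical and are not taken from the HyperCities files.

```python
from collections import Counter


def field_coverage(records):
    """Count how many records carry a non-empty value for each field name."""
    counts = Counter()
    for record in records:
        for key, value in record.items():
            # Treat None and empty containers/strings as "not captured".
            if value not in (None, "", [], {}):
                counts[key] += 1
    return counts


# Illustrative records only; these are not real captured tweets.
sample = [
    {"text": "a", "geo": None},
    {"text": "b", "geo": "30.0,31.2", "from_user": "example_user"},
]
coverage = field_coverage(sample)
```

Comparing the resulting field names against the platform's full field list shows at a glance what the chosen capture method left out.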
  	
  
	
  
The data is sitting on Prof. Presner’s personal server as JSON files, but the data will soon be converted
into XML for ease in depositing and managing the data in Islandora.
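
A JSON-to-XML conversion of the kind mentioned above might look like the following minimal sketch, which uses only the Python standard library. The element names ("tweets", "tweet", "field") are assumptions made for illustration; they are not the schema the project or Islandora uses, and nested JSON values are simply stringified here.

```python
import json
import xml.etree.ElementTree as ET


def tweets_json_to_xml(json_text: str) -> str:
    """Convert a JSON array of flat tweet objects into one XML document."""
    records = json.loads(json_text)
    root = ET.Element("tweets")
    for record in records:
        tweet_el = ET.SubElement(root, "tweet")
        for key, value in record.items():
            # Store the JSON key as an attribute so that any key is valid XML.
            field_el = ET.SubElement(tweet_el, "field", name=key)
            field_el.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

Keeping the original key as an attribute, rather than turning it into an element name, avoids silently renaming fields during conversion, which matters when the goal is to preserve the data as captured.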
  
	
  
These facts must be documented in order for future users to have a clear understanding of the data set.
  	
  
	
  
	
  
“But the data are already public…”
So if the general understanding is that your Twitter data is open and public, and that people using these
platforms want to be seen AND heard, why should we be concerned about privacy?
  	
  	
  
	
  
The Privacy Policy of Twitter stipulates that while you “own” your content, anyone, including Twitter or
any third party, is given the right to access your data and re-use it (our reading of the privacy policy).
  	
  

Those who see Twitter data as data that contains potentially identifying information about human subjects may
want to anonymize the data for the authors' protection, and may see displaying user names as unethical. This
runs contrary to Twitter's Rules of the Road, which require the display of a user id to give credit to the person
who tweeted.
	
  
Yet Twitter also acknowledges this public/private tension in their own policies by suggesting that, if making
a user id or other information available raises concerns over privacy or security risks, the individual or
media should get in touch with them.

The debate about the capture, reuse, and display of Twitter data turns on the line between the legality of
collecting this content and the ethics of doing so.
  	
  
	
  
To date, we are not aware of any formal legal challenges to the downloading, use, and archiving of
Twitter data.
  	
  	
  
	
  
Thus ensues a wide-ranging debate by scholars who characterize privacy issues with social media data in
the following ways:
  
	
  
Most researchers take a harm-based view of privacy, in which the goal is to protect users’ information
from negative actors.
           This includes concern for security issues (use by government agencies to track and arrest; use as
evidence).
           It also recognizes that there are loopholes in the data that enable someone to get a lot of information
about an individual, even if all they have is a username; content that has already been captured remains
available even after a user deletes their account or changes it from public to private.
  	
  
 
Finally, there is the “buyer beware” view: those users who have opted to make their accounts public have
no grounds for complaint about the collection and reuse of their content, even if they did not anticipate
reuse by researchers or commercial firms (Thelwall, 2010; Vieweg, 2010).
  
	
  
Danah boyd still asks: Just because we can collect it, should we?
  
	
  
Michael Zimmer, an Internet privacy scholar, argues instead for a dignity-based view of privacy, which
holds that the act of another person taking one’s personal information from the social networking sphere,
amassing it into a database, and making it available for use and scrutiny is an affront to the users’/subjects’
human dignity and their ability to control the flow of their personal information.
  
	
  	
  
	
  
Finally, what are the user’s expectations of how their tweets will be used?
How many here have actually read Twitter’s privacy policy? Facebook’s? Do you understand the
implications of re-use?
  
	
  
Schmidt, Trepte, and Reinecke (2011) observe that users develop shared routines and expectations of
self-disclosure, noting that privacy management is performed for a specific audience.
  	
  
	
  
Facebook, for example, enables users to select privacy settings on a post-by-post basis, choosing who is
able to read, comment, and interact with specific content, and allowing the user fairly granular control
over the flow of their information. Twitter allows only binary control; users can designate their account
as “protected” (i.e. Tweets are only visible to approved followers), or “public” (enabled by default), which
makes a user’s profile and timeline accessible to anyone, even those without a Twitter account.
  	
  
	
  
The ethical jury is going to be out on this for a while, at least until scholarly communities work out
parameters and provide guidance for acceptable use of social media data. In the meantime, what are we
to do? Legal and ethical policy related to privacy and social media data is still in flux and almost always
lags behind the pace of research. Yet libraries and other repositories are pressured to act now to archive
the data.
  
	
  
	
  
Data repositories will be caught in the middle of these divergent viewpoints when trying to determine the
best methods of providing access to the data.
  	
  
	
  
The norms of individual research disciplines often provide guidance for curators, but when researchers
with divergent norms seek access to the same data, it can be difficult to determine how best to serve the
broadest number of users.
  	
  
	
  
	
  
Experience with this data set and the literature review led us to the following recommendations:
Libraries or other data repositories will need to decide if archiving social media data fits with
their overall institutional mission and goals.
  	
  
	
  
	
  
Libraries should determine the overall risks associated with collecting and archiving social media
data and design strategies to mitigate those risks.
Because of the significance scholars are placing on the need to collect and now, in our case, archive Twitter
data, we are convinced that providing for the collection, preservation, and reuse of social media data
requires at the very least conversations among researchers, libraries, archives, institutional review
boards, scholarly societies, and other national and international organizations concerned with the
production and preservation of scholarship.
  	
  
	
  
Part of the discussion will need to include the context or conditions under which the data have been
collected. One important aspect of this, as Dr. Charlesworth mentioned in his talk yesterday, is a way to
gather “legal metadata” so that going forward the archive or repository will have the necessary privacy i’s
dotted and t’s crossed, insofar as it is possible.
  
	
  
	
  
Libraries should engage researchers as early as possible in the research process.
Curators are presented with a golden opportunity to collaborate with researchers as close as possible to
the beginning of the research lifecycle. Through a collaborative process, we can ideally facilitate the
creation of collections that balance openness with privacy concerns and encourage broad reuse.
  	
  
	
  
While that early intervention may not happen, we can employ curatorial strategies on the back end of
the data gathering that will hopefully push the issue (one of which will be discussed in our next
recommendation).
  
	
  
Here we start to address, at least in part, the question that was asked yesterday at Dr. Charlesworth’s
presentation about educating the researchers.
  	
  
	
  
	
  
Libraries choosing to archive social media data should develop clear and easy-to-use collection
and deposit policies, forms, and tools.
It has been our argument since first working with Twitter data that a way to both educate researchers
and create ingestible, reusable data for a repository is to create a workflow that asks the necessary
questions of researchers, which would aid in the creation of a codebook and documentation for the data.
  
	
  
We created a Twitter deposit form that is geared toward raising the privacy issues with this platform,
educating the researcher, and providing a way to record the basic legal and descriptive metadata
necessary for contextualizing the data for re-use.
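
To make the idea of a deposit form concrete, here is a minimal sketch of how such a record might be represented in code. It is not the authors' actual form: every field name and every prompt below is a hypothetical illustration, and only the example values (collection method, hashtags, dates, location filter, approximate count) come from the case study described in these notes.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TwitterDeposit:
    """Hypothetical deposit record; field names are illustrative, not the authors' form."""
    depositor: str
    collection_method: str
    search_terms: list
    date_range: tuple  # (start, end) as ISO date strings
    location_filter: str
    approximate_tweet_count: int
    fields_captured: list = field(default_factory=list)
    includes_user_identifiers: bool = True
    terms_of_service_reviewed: bool = False
    irb_reviewed: bool = False
    access_restrictions: str = "undetermined"

    def open_questions(self):
        """List the privacy and legal items the depositor still needs to address."""
        questions = []
        if not self.terms_of_service_reviewed:
            questions.append("Review the platform terms of service in force at capture time.")
        if not self.irb_reviewed:
            questions.append("Ask whether institutional review applies.")
        if self.includes_user_identifiers and self.access_restrictions == "undetermined":
            questions.append("Decide how user identifiers will be displayed or restricted.")
        return questions


# Example values taken from the case study; the depositor label is an assumption.
example = TwitterDeposit(
    depositor="HyperCities team",
    collection_method="Twitter Search API",
    search_terms=["#jan25", "#egypt", "#tahrir"],
    date_range=("2011-01-30", "2011-02-24"),
    location_filter="within 200 km of the center of Cairo",
    approximate_tweet_count=420000,
)
record_json = json.dumps(asdict(example), indent=2)
```

A structure like this doubles as documentation: the serialized record can travel with the data set, and the list of open questions gives the curator a starting point for the conversation with the researcher.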
  
	
  
Teachable moments for information literacy librarians.
Understand and know the source of information.
  
	
  
Twitter adds language to their privacy policy that more explicitly states use.
(gain consent, and then the data truly become open)
Ideally, Twitter would take a different approach to releasing the data for research, partnering directly with
researchers rather than with a third party like GNIP, which charges for the data and isn’t clear about what
can be done with it once it’s been purchased.
  
	
  
	
  
Lastly, thanks to all who have been tweeting during the session; we wanted to let you know that we’ve
archived your tweets in TwapperKeeper.
  

Collecting Twitter DataCollecting Twitter Data
Collecting Twitter Data
 
F017433947
F017433947F017433947
F017433947
 
s00146-014-0549-4.pdf
s00146-014-0549-4.pdfs00146-014-0549-4.pdf
s00146-014-0549-4.pdf
 
Picturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter SchoolPicturing the Social: Talk for Transforming Digital Methods Winter School
Picturing the Social: Talk for Transforming Digital Methods Winter School
 
Challenges in-archiving-twitter
Challenges in-archiving-twitterChallenges in-archiving-twitter
Challenges in-archiving-twitter
 
Accessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science KnowledgeAccessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science Knowledge
 
Lecture series: Using trace data or subjective data, that is the question dur...
Lecture series: Using trace data or subjective data, that is the question dur...Lecture series: Using trace data or subjective data, that is the question dur...
Lecture series: Using trace data or subjective data, that is the question dur...
 
Blurring the Boundaries? Ethical challenges in using social media for social...
Blurring the Boundaries? Ethical challenges in using social media for social...Blurring the Boundaries? Ethical challenges in using social media for social...
Blurring the Boundaries? Ethical challenges in using social media for social...
 

Recently uploaded

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 

Recently uploaded (20)

How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 

What Your Tweets Tell Us About You, Speaker Notes

• Introduce paper title
• Ask people to interact, comment, and respond to our questions during the presentation using #tweetprivacy
• Credits
    o Charlesworth – whose Digital Lives Report was one of the only papers that provided any analysis and guidance in the area of social media archiving.

Interest in social media data is multidisciplinary, resulting in conflicting views regarding the ethical management of captured datasets. Curators will be required to navigate these conflicting views as they work to provide appropriate mechanisms for access and reuse of these data.

We hope to encourage researchers and library, archive, or repository staff to engage in a cross-disciplinary conversation about the privacy issues (as well as the host of other issues) inherent in using social media as a primary source for research.

We’re going to show you a clip from Laila Sakr’s presentation at the Tech@state Data Visualization Conference in Washington DC. The clip provides a good example of how researchers are using Twitter and other social media data.

[Play Clip]

There are two key things I want to point out:

1. Long-term archiving of this data and other curatorial issues like value, authenticity, and significant properties are absent from this talk, which is not surprising. They were also absent from many of the papers we read that utilized Twitter data. This demonstrates that researchers, at this point, place an overall emphasis on collection and analysis rather than on preservation.

2. Sakr makes sure to say that she is downloading only the publicly available tweets using the search API, and she notes how this could potentially affect her sample and its validity. She’s not talking about it in terms of privacy issues – which further illustrates that the focus is on analysis rather than privacy or ethics.

We’d like to take an informal poll similar to last night’s poll of the audience’s willingness to have their genome sequenced. Who among those of you who use Twitter as a communication tool is completely fine with having your tweets, profile information, images, and location data downloaded, analyzed, archived, and preserved?

- Of those of you with your hands raised, how many of you have tweeted something of a more personal nature that you might not want archived?

And who here is actively involved with the collection of Twitter data? Any social media data?
[What do you do with it – Tweet here]

The reason I ask is that we found, through our work with the HyperCities Egypt Twitter data, that the issue of whether or not there are privacy concerns with a data source like Twitter is essentially a research ethics issue, one which varies depending on the role and/or subject background of the researcher and how they view the context of the data creation. (Refer to Confounding Relationships to point out various roles.)

So, our central thesis is that perceptions of privacy in social media platforms are formed by disciplinary culture, the capabilities and constraints of the platform, and the community norms of the platform itself.

Does analyzing a person's Tweets constitute researching a human subject? Or are Tweets a creative text which requires proper citation and credit to the authors or tweeters? Or are Tweets part of the open public record? Social scientists tend to view the data as human subjects research, while humanists tend to view the data as a form of publication. These very different ways of viewing the data require different methods for dealing with privacy.

We feel it is important to state that social media data are not homogeneous; each platform has its own unique constraints on the creation/inclusion of content, as well as constraints on how users may engage in the space, and its own expectations and norms of interaction.

Our case study focuses on Twitter, so while we provide a general framework for assessing privacy issues with social media, it must be understood that, because of the uniqueness of Twitter’s Privacy Policy, Terms of Service, and Developer Rules of the Road, the analysis and interpretation are not necessarily generalizable to other platforms, such as Facebook. As with many data curation activities, there will be some facets which can be generalized, while others may be platform or subject specific. Part of determining the curation needs of social media data will be to determine these boundaries.

What can we learn about you from Twitter?
[Show different visualizations, then tweet map, tweet image]
Depending on how the data are visualized, we can learn about you as an individual, about your internet relations, about you as part of a huge collective, or nothing about you as an individual (R-Shief image). Some visualizations will enable better anonymization than others.

However, if your account is unprotected, the underlying dataset used to generate the visualizations will still contain your name, location, photos, and anything else you decide to share in your timeline. So if you include other personal info, like an email address, we can find it out about you.

But what else can we find out about you? [Show the Alyaa Gad slide – then the Google Search] Thanks to the power of search engines like Google, we can get a lot more information, which may be collected and archived as well.

Our case study, or what I like to call “We’ve got tweets, now what?”

Todd Presner, a UCLA faculty member, and two researchers collected a subset of the overall Twitter data available. He asked the library to archive it. Before we could do anything with it, we had to assess what he had collected.

The HyperCities team used the Twitter Search API to pull data based on the location parameter (within 200 km of the center of Cairo), time period (January 30, 2011 through February 24, 2011), AND one of three hashtags (#jan25 OR #egypt OR #tahrir).

They downloaded approximately 420,000 public Tweets during the initial phase of this analysis and continue to feed their site with live feeds.

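For anyone documenting a capture like this, the three criteria described for the HyperCities collection (location AND time period AND one of three hashtags) can be written down as a small filter. The following is only a sketch, not the HyperCities team's code or the Twitter Search API itself: the coordinates used for the center of Cairo, the inclusive end of the date range, and the record field names (`lat`, `lon`, `created_at`, `hashtags`) are all assumptions made for illustration.

```python
import math
from datetime import datetime, timezone

# Assumed values: the notes say "within 200 km of the center of Cairo"
# but do not give coordinates; these are illustrative.
CAIRO = (30.0444, 31.2357)
RADIUS_KM = 200.0
# Assumed to be inclusive through the end of February 24, 2011 (UTC).
START = datetime(2011, 1, 30, tzinfo=timezone.utc)
END = datetime(2011, 2, 24, 23, 59, 59, tzinfo=timezone.utc)
HASHTAGS = {"jan25", "egypt", "tahrir"}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def matches(tweet):
    """True if a tweet record meets location AND time AND (any hashtag).

    `tweet` is a hypothetical dict with keys lat, lon, created_at
    (timezone-aware datetime) and hashtags (list of strings without '#').
    """
    in_radius = haversine_km(CAIRO, (tweet["lat"], tweet["lon"])) <= RADIUS_KM
    in_window = START <= tweet["created_at"] <= END
    has_tag = bool({t.lower() for t in tweet["hashtags"]} & HASHTAGS)
    return in_radius and in_window and has_tag
```

Recording the criteria in this explicit form alongside the data is one way to show future users exactly which tweets could and could not have ended up in the set.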
As with Sakr’s work, the data capture was motivated by the fact that significant events were taking place using Twitter, and because Twitter data disappears quickly (10 days), they decided to start downloading it and make it available to as many people as possible for future reference and study.

There wasn’t necessarily any research question or overarching thesis behind the collection other than to provide a glimpse back to the Egyptian Revolution Twitterverse. As Dr. Charlesworth pointed out yesterday morning, legal issues with gathering this type of data won’t be at the forefront of the researcher’s mind.

Based on the search parameters, the data set captured eight out of approximately forty possible Twitter data fields, revealing how the method of capture and the search parameters profoundly shape the resultant data.

The data is sitting on Prof. Presner’s personal server as JSON files, but it will soon be converted into XML for ease in depositing and managing the data in Islandora.

These facts must be documented in order for future users to have a clear understanding of the data set.

“But the data are already public…”
So if the general understanding is that your Twitter data is open and public, and that people using these platforms want to be seen AND heard, why should we be concerned about privacy?

The Privacy Policy of Twitter stipulates that while you “own” your content, anyone, including Twitter or any third party, is given the right to access your data and re-use it. (our reading of the privacy policy)

Those who see Twitter data as data that contains potentially identifying information about human subjects may want to anonymize the data for the authors' protection, and may see displaying user names as unethical. This runs contrary to Twitter's Rules of the Road, which require the display of a user id to give credit to the person who tweeted.

Yet Twitter also acknowledges this public/private tension in its own policies by suggesting that if there is a concern over privacy or security risks from making a user id or other information available, the individual or media should get in touch with them. The debate about the capture, reuse, and display of Twitter data sits on the line between the legality of collecting this content and the ethics of doing so.

To date there haven’t been any formal legal challenges to the downloading, use, and archiving of Twitter data that we are aware of.

Thus ensues a wide-ranging debate among scholars, who characterize privacy issues with social media data in the following ways:

Most researchers take a harm-based view of privacy, in which the goal is to protect users’ information from negative actors.

This includes concern for security issues (use by government agencies to track and arrest; use as evidence).

It also includes recognizing that there are loopholes in the data that enable someone to get a lot of information about an individual even if all they have is a username, and that content already captured will remain available even after a user deletes an account or changes it from public to private.

Finally, (buyer beware) those users who have opted to make their accounts public have no grounds for complaint about the collection and reuse of their content, even if they did not anticipate reuse by researchers or commercial firms (Thelwall, 2010; Vieweg, 2010).

Danah boyd still asks: Just because we can collect it, should we?

Michael Zimmer, an Internet privacy scholar, argues instead for a dignity-based view of privacy, in which the act of another person taking one’s personal information from the social networking sphere, amassing it into a database, and making it available for use and scrutiny is an affront to the users’/subjects’ human dignity and their ability to control the flow of their personal information.

Finally, what are the user’s expectations of how their tweets will be used?

How many here have actually read Twitter’s privacy policy? Facebook’s? Do you understand the implications of re-use?

Schmidt, Trepte, and Reinecke (2011) observe that users develop shared routines and expectations of self-disclosure, noting that privacy management is performed for a specific audience.

Facebook, for example, enables users to select privacy settings on a post-by-post basis, choosing who is able to read, comment, and interact with specific content, and allowing the user fairly granular control over the flow of their information. Twitter allows only binary control; users can designate their account as “protected” (i.e. Tweets are only visible to approved followers), or “public” (enabled by default), which makes a user’s profile and timeline accessible to anyone, even those without a Twitter account.

The ethical jury is going to be out on this for a while, at least until scholarly communities work out parameters and provide guidance for acceptable use of social media data. In the meantime, what are we to do? Legal and ethical policy related to privacy and social media data is still in flux and almost always lags behind the pace of research. Yet libraries and similar institutions are pressured to act now to archive the data.

Data repositories will be caught in the middle of these divergent viewpoints when trying to determine the best methods of providing access to the data.

The norms of individual research disciplines often provide guidance for curators, but when researchers with divergent norms seek access to the same data, it can be difficult to determine how best to serve the broadest number of users.

Experience with this data set and the literature review led us to the following recommendations:

Libraries or other data repositories will need to decide if archiving social media data fits with their overall institutional mission and goals.

Libraries should determine the overall risks associated with collecting and archiving social media data and design strategies to mitigate those risks.

Because of the significance scholars are placing on the need to collect and now, in our case, archive Twitter data, we are convinced that providing for the collection, preservation, and reuse of social media data requires at the very least conversations among researchers, libraries, archives, institutional review boards, scholarly societies, and other national and international organizations concerned with the production and preservation of scholarship.

Part of the discussion will need to include the context or conditions under which the data have been collected. One important aspect of this, as Dr. Charlesworth mentioned in his talk yesterday, is a way to gather “legal metadata” so that going forward the archive or repository will have the necessary privacy i’s dotted and t’s crossed, insofar as that is possible.

Libraries should engage researchers as early as possible in the research process.

Curators are presented with a golden opportunity to collaborate with researchers as close to the beginning of the research lifecycle as possible. Through a collaborative process, we can ideally facilitate the creation of collections that balance openness with privacy concerns and encourage broad reuse.

While that early intervention may not happen, we can employ curatorial strategies on the back end of the data gathering that will hopefully push the issue (one of which will be discussed in our next recommendation).

Here we start to address, somewhat, the question that was asked yesterday at Dr. Charlesworth’s presentation about educating researchers.

Libraries choosing to archive social media data should develop clear and easy-to-use collection and deposit policies, forms, and tools.

It has been our argument since first working with Twitter data that a way to both educate researchers and create ingestible, reusable data for a repository is to create a workflow that asks the necessary questions of researchers, which would aid in the creation of a codebook and documentation for the data.

We created a Twitter deposit form that is geared toward raising the privacy issues with this platform, educating the researcher, and providing a way to record the basic legal and descriptive metadata necessary for contextualizing the data for re-use.

Teachable moments for information literacy librarians.
Understand and know the source of information.

Twitter adds language to their privacy policy that more explicitly states use.
(gain consent – and then the data truly become open)

Ideally, Twitter would take a different approach to releasing the data for research: partner directly with researchers rather than with a third party like GNIP, which charges for the data and isn’t clear about what can be done with it once it’s been purchased.

Lastly, thanks to all who have been tweeting during the session; we wanted to let you know that we’ve archived the tweets in TwapperKeeper.