Investigating Reference Rot in Web-Based Scholarly Communication

Herbert Van de Sompel
Los Alamos National Laboratory
@hvdsomp

Martin Klein
Los Alamos National Laboratory
@mart1nkle1n

http://hiberlink.org #hiberlink
http://mementoweb.org #memento

Hiberlink is funded by the Andrew W. Mellon Foundation
Hiberlink Project Partners
• Los Alamos National Laboratory:
• Research Library: Martin Klein, Robert Sanderson, Herbert Van
de Sompel
• University of Edinburgh:
• Edina: Peter Burnhill, Neil Mayo, Muriel Mewissen, Christine
Rees, Tim Stickland, Richard Wincewicz
• Language Technology Group: Beatrice Alex, Claire Grover,
Richard Tobin, Ke “Adam” Zhou
• Funding: Andrew W. Mellon Foundation

Herbert Van de Sompel, Martin Klein – Hiberlink
CNI Fall 2013, Washington, DC, December 9 2013
Acknowledgments
• Primary datasets: arXiv, Chesapeake Project, Elsevier, PubMed
Central, PLoS, … (many more to come)

• Secondary datasets: Ex Libris, MS Academic, SerialsSolutions
• Technology support: CrossRef Labs, CrossRef Prospect, Elsevier

• Liaisons: archive.is, CrossRef, Internet Archive, Old Dominion
University Web Science & Digital Library Research Group, perma.cc

Reference Rot
Problem Domain
• Web-based scholarly communication links to, or references, web
resources:
• Formal citing of scholarly resources
• Referencing “Web at Large” resources needed or created in
research activities e.g. project websites, software, ontologies,
workflows, online debate, slides, blogs, videos, etc.

Problem Domain
• Links to web resources are subject to Reference Rot:
• Link Rot: Link stops working, e.g. HTTP 404
• Content Decay: Linked content changes over time

References in Web-Based Scholarly Communication

To Scholarly Resources

To Web at Large Resources

Link Rot
Content Decay

an increasingly blurry boundary

References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

To Web at Large Resources

DOI, HTTP version of DOI

Content Decay

References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

To Web at Large Resources

Fixity of content

References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

To Web at Large Resources

Fixity of content
Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

To Web at Large Resources

Fixity of content
Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

There are issues here too, see
David Rosenthal blog post http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
References to Scholarly Resources
• We hope/assume that peer-reviewed scholarly literature has fixity
and is adequately archived

• This, BTW, might not be a correct assumption:
• Dynamic, content rich, landing pages
• No public audit regarding archival status of electronic journal
literature archived in special-purpose infrastructure
• Poor archiving in public web archives, related to protected
content
• Initial information in Keepers Registry indicates spotty archiving
of electronic journal literature
• … Still, this is NOT what Hiberlink investigates
See David Rosenthal blog post http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

To Web at Large Resources

Fixity of content
Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

Hiberlink focus

References to “Web at Large” Resources
• Hiberlink focuses on the wide variety of web resources needed or
created in research activities

• These resources:
• Are not necessarily under the custodianship of a party that cares
about long term integrity, access
• Do not necessarily have the same sense of fixity that e.g.
journal articles have
• Reference Rot makes it impossible to adequately recreate the
temporal context for scholarly discourse

Herbert Van de Sompel, et al. (2004) http://dx.doi.org/10.1045/september2004-vandesompel
[Figure: status annotations on linked resources, in order: !Exist / Archived; Exist / Archived; !Exist / Archived; !Exist / !Archived; Exist / Archived]
Hiberlink: Investigating Reference Rot

• Hiberlink explores references to Web at Large resources:
• Quantifies Reference Rot
• Explores potential solutions to Reference Rot
• Focuses on links in electronic journal articles
• But has the big picture in mind: dynamic, interdependent,
web-based scholarly assets
• See Herbert Van de Sompel, From the Version of
Record to a Version of the Record, CNI Spring 2013
plenary talk - http://www.youtube.com/watch?v=fhrGSQbNVA

References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

To Web at Large Resources

Fixity of content
Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

Is it worth our time to study this?

Articles Increasingly Link to Web Resources

URIs extracted from PubMed papers – links to Web at Large resources
The New York Times Cares

http://www.nytimes.com/2013/09/24/us/politics/
in-supreme-court-opinions-clicks-that-lead-nowhere.html
Reference Rot in Law Journals
Zittrain, J., Albert, K., Lessig, L. (2013) Perma: Scoping and
Addressing the Problem of Link and Reference Rot in Legal
Citations
• Link rot in law journals: ~27%
• Reference rot in law journals: ~70%

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2329161
Not Just in Scholarly Communication
Zittrain, J., Albert, K., Lessig, L. (2013) Perma: Scoping and
Addressing the Problem of Link and Reference Rot in Legal
Citations
Liebler, R., Liebert, J. (2012) Something rotten in the State of Legal
Citation
• Link rot: 29% of links in Supreme Court decisions (study of 1996-2010)
• Reference rot, including link rot: 49.9% of links in Supreme Court
decisions

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2329161
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2188070
Not Just in Scholarly Communication

http://en.wikipedia.org/wiki/Wikipedia_talk:Link_rot
Quantifying Reference Rot
Quantifying Reference Rot
• Reference Rot has been studied before:
• For the web at large
• For scholarly communication
• For government documents
• What is different with Hiberlink?
• Investigates Reference Rot, not just link rot, i.e. it includes
changing content as well as rotting links
• Investigates coverage of referenced resources in web archives
• Operates at a massive scale regarding number of journal
articles, referenced URIs, web archive lookups

STUDY              Year of Publication   # URIs     # URIs looked up
Author (Date)      of Citations                     in web archives
Lawrence (2001)    1993-1999             67,577     -
Casserly (2003)    1999-2000             500        500
Casserly (2007)    1999-2000             500        500
Rumsey (2002)      1997-2001             3,406      -
Davis (2002)       1999-2001             688        -
Wren (2004)        1994-2002             1,630      -
Sellitto (2005)    1995-2003             1,043      -
Goh (2005)         1997-2003             2,516      -
Dimitrova (2007)   2000-2003             1,126      -
McCown (2005)      1995-2004             4,387      -
Wagner (2009)      2002-2004             2,011      2,011
Parker (2007)      2002-2005             1,229      -
Duda (2008)        1997-2005             2,100      -
Falagas (2007)     2003-2006             1,417      -
Russell (2008)     1999-2006             510        -
Wren (2008)        1994-2007             6,154      -
Moghaddam (2010)   1995-2008             1,761      1,761
Sanderson (2011)   1993-2010             162,052    162,052

Sanderson, R., Phillips, M., and Van de Sompel, H. (2011) http://arxiv.org/abs/1105.3459
Quantifying Reference Rot - Methodology

• Various full text corpora
• Articles 01/1997-12/2012
• URI extraction from XML and PDF
• Improvement on URI extraction
techniques used in prior research
• Validation study planned
• Referencing article
• Referencing journal
• Article dates: submission,
acceptance, publication
• URI position: abstract, body,
footnote, references
• Filter DOIs, HTTP version of DOIs
• Filter URIs that should have been
referenced by means of a DOI
• Supported by secondary
datasets
• Filter obvious noise, e.g. localhost,
example.org, foo.bar, licenses, etc.
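The filtering step above might be sketched as follows. This is only an illustration: the noise host list, the DOI resolver hosts, the DOI pattern, and the function name `is_web_at_large` are assumptions rather than the rules Hiberlink applies, and the license filter and the secondary-dataset check are left out.

```python
import re
from urllib.parse import urlsplit

# Hypothetical noise list, taken from the examples named on the slide.
NOISE_HOSTS = {"localhost", "example.org", "foo.bar"}
# Assumed resolver hosts for the "HTTP version of DOIs".
DOI_HOSTS = {"dx.doi.org", "doi.org"}
# Rough pattern for a bare DOI such as doi:10.1045/september2004-vandesompel.
BARE_DOI = re.compile(r"^(doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)


def is_web_at_large(uri: str) -> bool:
    """Return True if the URI survives the DOI and noise filters."""
    uri = uri.strip()
    if BARE_DOI.match(uri):
        return False  # a DOI, not a Web at Large reference
    parts = urlsplit(uri)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if host in DOI_HOSTS or host in NOISE_HOSTS:
        return False
    return bool(host)
```

A real filter would also need the slide's check for URIs that should have been referenced by means of a DOI, which depends on the secondary datasets.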
• HTTP HEAD on referenced URI-R
• Follow redirects up to a maximum
of 50
• Record HTTP transaction chain
• If HTTP transaction chain ends with
2XX status code: Exists
• If HTTP transaction chain does not
end with 2XX: !Exist
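A minimal sketch of the existence check above, assuming a pluggable `head` function so the redirect logic can be exercised without network access. The names `http_head`, `follow_chain` and `exists` are hypothetical, and timeouts or connection errors (which a real crawl would have to classify) are not handled here.

```python
import http.client
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin, urlsplit

MAX_REDIRECTS = 50  # limit stated on the slide

# A fetcher takes a URI and returns (status code, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


def http_head(uri: str) -> Tuple[int, Optional[str]]:
    """Send one HEAD request without following redirects."""
    parts = urlsplit(uri)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def follow_chain(uri: str, head: Fetcher = http_head) -> List[Tuple[str, int]]:
    """Follow redirects and record the HTTP transaction chain."""
    chain: List[Tuple[str, int]] = []
    redirects = 0
    while True:
        status, location = head(uri)
        chain.append((uri, status))
        if 300 <= status < 400 and location and redirects < MAX_REDIRECTS:
            uri = urljoin(uri, location)  # Location may be relative
            redirects += 1
        else:
            return chain


def exists(chain: List[Tuple[str, int]]) -> bool:
    """Exists if the chain ends with a 2XX status code, otherwise !Exist."""
    return bool(chain) and 200 <= chain[-1][1] < 300
```

A chain that is still redirecting after 50 hops ends on a 3XX status and is therefore classified as !Exist.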
• Lookup in web archives via a
Memento Aggregator that covers
among others Internet Archive,
Archive-It, archive.is, British
Library web archive, UK National
Archives web archive, Icelandic
web archive
• Obtain TimeMap per URI
• If TimeMap does not exist:
!Archived
• If TimeMap exists, select
Memento URI-M closest to
article publication date
• HTTP HEAD on URI-M
• Follow archived redirects
up to a maximum of 50
• Record HTTP transaction
chain
• If HTTP transaction chain
ends 2XX: Archived
• If HTTP transaction chain
does not end with 2XX:
!Archived
Data used for analysis
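The Memento selection step could look roughly like the sketch below. It assumes the TimeMap has already been parsed into (datetime, URI-M) pairs, and it treats "archived within N days" as a symmetric window around the publication date, which is an assumption and not something the slide specifies. The follow-up HTTP HEAD on the selected URI-M is omitted.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Memento = Tuple[datetime, str]  # (archival datetime, URI-M)


def closest_memento(mementos: List[Memento],
                    published: datetime) -> Optional[Memento]:
    """Pick the Memento whose datetime is closest to the publication date."""
    if not mementos:
        return None  # no TimeMap entries: !Archived
    return min(mementos, key=lambda m: abs(m[0] - published))


def archived_within(mementos: List[Memento],
                    published: datetime, days: int) -> bool:
    """True if the closest Memento lies within `days` of publication."""
    best = closest_memento(mementos, published)
    return best is not None and abs(best[0] - published) <= timedelta(days=days)
```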
Quantifying Reference Rot – Early Results
[Figures: pie chart with slices 31.2%, 16.8%, 11.3%, 40.7%; per-year chart (1997-2011, percent axis) with series !Exist, Archived, Archived within 30 days, Archived within 14 days, Archived within 7 days, Archived within 1 day; chart with axes "Amount of citations" (log scale, 1 to 200k) and "Weeks" (log scale, 1 to 1000)]
Study: PubMed Central Corpus 01/1997 – 12/2012
• Articles processed: 494,785
• Articles that contain Web at Large URIs: 176,527
• References to Web at Large URIs: 557,432
• Unique referenced Web at Large URIs: 327,782

Percentage Exists & Archived Referenced URIs
[Pie chart. Legend: Exists & Archived; !Exists & Archived; Exists & !Archived; !Exists & !Archived. Slice values (in extraction order, not matched to the legend): 31.2%, 16.8%, 11.3%, 40.7%]
URIs extracted from PubMed papers – links to Web at Large resources
Percentage Exists & Archived in 30 Day Window
[Pie chart. Legend: Exists & Archived; !Exists & Archived; Exists & !Archived; !Exists & !Archived. Slice values (in extraction order, not matched to the legend): 23%, 16.7%, 5.1%, 55.2%]
URIs extracted from PubMed papers – links to Web at Large resources
Percentage Exists & Archived in 15 Day Window
[Pie chart. Legend: Exists & Archived; !Exists & Archived; Exists & !Archived; !Exists & !Archived. Slice values (in extraction order, not matched to the legend): 24.6%, 12.4%, 3.5%, 59.5%]
URIs extracted from PubMed papers – links to Web at Large resources
Percentage Exists & Archived in 07 Day Window
[Pie chart. Legend: Exists & Archived; !Exists & Archived; Exists & !Archived; !Exists & !Archived. Slice values (in extraction order, not matched to the legend): 25.8%, 8.8%, 2.3%, 63.1%]

URIs extracted from PubMed papers – links to Web at Large resources
Percentage Exists & Archived in 01 Day Window
[Pie chart. Legend: Exists & Archived; !Exists & Archived; Exists & !Archived; !Exists & !Archived. Slice values (in extraction order, not matched to the legend): 27.9%, 0.9%, 0.2%, 71%]

URIs extracted from PubMed papers – links to Web at Large resources
Percentage of !Exists per Year
[Chart: percent (0-100) by year, 1997-2011]

URIs extracted from PubMed papers – links to Web at Large resources
Percentage of !Exists, Archived per Year
[Chart: percent (0-100) by year, 1997-2011. Series: !Exist; Archived; Archived within 30 days; Archived within 14 days; Archived within 7 days; Archived within 1 day]

URIs extracted from PubMed papers – links to Web at Large resources
Percentage of !Exists and of Those Archived per Year
[Chart: two percent axes (0-100), "Percentage !Exists URIs" and "Percentage Archived URIs for !Exists URIs", by year, 1997-2011. Series: !Exist; Archived; Archived within 30 days; Archived within 14 days; Archived within 7 days; Archived within 1 day]

URIs extracted from PubMed papers – links to Web at Large resources
Absolute Number of Archived per Year
[Chart: count (log scale, 1 to 30000) by year, 1997-2011. Series: Archived; Archived within 30 days; Archived within 14 days; Archived within 7 days; Archived within 1 day]

URIs extracted from PubMed papers – links to Web at Large resources
Solving Reference Rot
References in Web-Based Scholarly Communication

To Scholarly Resources
Link Rot

DOI, HTTP version of DOI

Content Decay

Fixity of content

To Web at Large Resources

-

Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

Addressing Content Decay
• Aim for a more pro-active approach to collect snapshots of web
resources (likely to be) referenced in scholarly communication
• A system that hosts resources that are likely to be referenced in
scholarly communication can create snapshots of itself by:
o Using CMS, wikis, datawikis with solid versioning
mechanisms
o Subscribing to on-demand self web archiving service
o Using transactional web archives, cf. SiteStory
• Referenced resources can be web archived on-demand:
o By authors during note taking, authoring
o By platforms involved in the publication process, e.g.
archiving linked resources at the time of manuscript
submission
References in Web-Based Scholarly Communication

To Scholarly Resources

To Web at Large Resources

Link Rot

DOI, HTTP version of DOI

Content Decay

Fixity of content

-

Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

Web archiving
Content Versioning Systems
Self archiving

Click link to blog post
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/

Receive page
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/

Search and find Mementos in Internet Archive for
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/

Search and find a Memento in archive.is for
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/

Click perma.cc link to Memento of blog post
http://perma.cc/0Hg62eLdZ3T

Receive Memento from perma.cc
http://perma.cc/0Hg62eLdZ3T

Search and do not find Mementos in Internet Archive for
http://perma.cc/0Hg62eLdZ3T

Search and do not find Mementos in archive.is for
http://perma.cc/0Hg62eLdZ3T

What Happened?
• Good news: The number of archived copies of the blog post was
increased by pro-actively creating a Memento in perma.cc
• Bad news: The possibility of finding Mementos for the blog post
in other web archives was undermined by replacing the Original
URI-R with the Memento URI-M
• The Memento URI-M is a key in only one archive
• The Original URI-R is a key in all web archives
• Using the Memento URI-M in a link requires the permanent
existence/uptime of the archive that issued it
• One link rot problem was replaced by another …

Web Archives Less Permanent than Permanent?

http://webcitation.org
Web Archives Less Permanent than Permanent?

http://ws-dl.blogspot.com/2013/11/2013-11-21-conservative-party-speeches.html
Web Archives Less Permanent than Permanent?

http://richmondsfblog.com/2013/11/06/part-of-internet-archive-building-badly-burned-in-earlymorning-fire/
What To Do?
• Need an approach for referencing archived resources that
supports lookups in many web archives, not just one
• Since the Original URI-R is a key in all web archives, the linking
approach must include it
• Hence, two URIs are required:
• The Original URI-R
• The Memento URI-M, e.g. the perma.cc URI
• But a link in HTML only carries one URI!
• It is understandable that the Memento URI-M is used for the
link: the approach works with existing web infrastructure
• Yet, an approach to address link rot that itself is subject to
link rot is … err… problematic
The Missing Link Proposal

• Extend the link to the Original URI-R with temporal context:
• Memento URI-M in a specific archive
• Dates:
• date of page that contains the link
• date of the link, cf. “accessed at” in citations of web
resources
• Provide the Original URI-R and the temporal context in a
machine-actionable manner so it can be used by user and
machine agents to retrieve Mementos from various web archives

http://mementoweb.org/missing-link/
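As a rough illustration of the proposal, a link carrying the Original URI-R plus temporal context could be generated like this. The attribute names `data-versionurl` and `data-versiondate` are the ones used in the demo slides of this deck; the function `missing_link_anchor` and its parameters are hypothetical.

```python
from html import escape
from typing import Optional


def missing_link_anchor(uri_r: str, text: str,
                        uri_m: Optional[str] = None,
                        version_date: Optional[str] = None) -> str:
    """Render an <a> element that keeps the Original URI-R in href and
    adds the Memento URI-M and link date as data- attributes."""
    attrs = [f'href="{escape(uri_r, quote=True)}"']
    if uri_m:
        attrs.append(f'data-versionurl="{escape(uri_m, quote=True)}"')
    if version_date:
        attrs.append(f'data-versiondate="{escape(version_date, quote=True)}"')
    return f"<a {' '.join(attrs)}>{escape(text)}</a>"
```

Keeping the Original URI-R in `href` means existing browsers still follow the link as before, while the extra attributes are available to tools that understand them.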
The Missing Link Proposal

http://mementoweb.org/missing-link/
How to Make Missing Link Happen?
• The existing approach works out of the box but is problematic
• Missing Link requires infrastructure changes but generally
contributes to increased web persistence:
• HTML
• META for page date: no problem, already in use
• Attributes for <a> to convey URI-M and link date:
• data- extensibility mechanism in HTML5 can be
used but is not intended for cross-site applications
• In 1995, HTML had the URN attribute for <a> as a
means to address web persistence concerns
• Browser, tool support

References in Web-Based Scholarly Communication

To Scholarly Resources

To Web at Large Resources

Link Rot

DOI, HTTP version of DOI

Missing Link proposal

Content Decay

Fixity of content

-

Archiving: CLoCKSS,
LoCKSS, Portico, Keepers
Registry, …

Web archiving
Content Versioning Systems
Self archiving

Demo: Application Using Temporal Context for Links

Application Using Temporal Context for Links
• Memento for Chrome is an application that uses Original URI-R
and dates to access Mementos in various web archives
• Memento around the date selected in user interface
calendar
• Most recently archived Memento
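A client that asks for a Memento around a given date typically does so with Memento datetime negotiation (RFC 7089): it sends the Original URI-R to a TimeGate together with an `Accept-Datetime` header. The sketch below only builds such a request and does not send it. The TimeGate base URI is a placeholder, not a real service, and `timegate_request` is a hypothetical helper, not part of Memento for Chrome.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Placeholder TimeGate base; substitute the aggregator or archive you use.
TIMEGATE_BASE = "http://timegate.example.net/timegate/"


def timegate_request(uri_r: str, when: datetime) -> Request:
    """Build a HEAD request asking a TimeGate for the Memento of `uri_r`
    closest to `when` (which must be a timezone-aware datetime)."""
    when_gmt = when.astimezone(timezone.utc)
    return Request(
        TIMEGATE_BASE + uri_r,
        method="HEAD",
        headers={"Accept-Datetime": format_datetime(when_gmt, usegmt=True)},
    )
```

Omitting the header would, under the same protocol, ask for the most recent Memento.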

Memento Time Travel for Chrome

http://bit.ly/memento-for-chrome
Memento Time Travel for Chrome

http://www.youtube.com/watch?v=0_70lQPOOIg
http://www.youtube.com/watch?v=WtZHKeFwjzk
Application Using Temporal Context for Links
• An experimental version of Memento for Chrome also uses
Missing Link information (Original URI-R, URI-M, and dates) to
access Mementos in various web archives:
• Memento around the date selected in user interface calendar
• Most recently archived Memento
• Memento around the date of the page that contains the link
• Memento around the date of the link
• Memento URI-M in a specific archive
• A Memento client is just one example of an application that can
use temporal context provided for links. Other applications,
including search engines, can use it too
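To make the list above concrete, here is a small sketch of how an application might read the temporal context from a page and derive the available lookups. It uses the standard library `html.parser`. The class `MissingLinkParser` and the function `lookup_options` are hypothetical names for illustration and are not taken from Memento for Chrome.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class MissingLinkParser(HTMLParser):
    """Collect the page date and the temporal context of each link."""

    def __init__(self) -> None:
        super().__init__()
        self.page_date: Optional[str] = None
        self.links: List[Dict[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "meta" and a.get("itemprop") == "datePublished":
            self.page_date = a.get("content")
        elif tag == "a" and "href" in a:
            self.links.append({
                "uri_r": a["href"],
                "uri_m": a.get("data-versionurl"),
                "link_date": a.get("data-versiondate"),
            })


def lookup_options(link: Dict[str, Optional[str]],
                   page_date: Optional[str]) -> List[str]:
    """List the ways a client could resolve this link to a Memento."""
    options = ["Memento around the date selected in the calendar",
               "Most recently archived Memento"]
    if page_date:
        options.append(f"Memento around the page date {page_date}")
    if link["link_date"]:
        options.append(f"Memento around the link date {link['link_date']}")
    if link["uri_m"]:
        options.append(f"Memento {link['uri_m']} in a specific archive")
    return options
```

The first two options need only the Original URI-R; each additional piece of temporal context adds one more option.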

NYT has <META itemprop="datePublished" content="2013-09-23">

Link in NYT was:
<a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/">
Changed to:
<a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/"
data-versionurl="http://perma.cc/0Hg62eLdZ3T">
Right-click link: Get near current time (done on Nov 25 2013)
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
enabler: <a href="URI-R">

Receive Memento from archive.is, Nov 24 2013
http://archive.is/20131124221749/http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/

Right-click link: Get at page date
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-23">

Receive Memento from Internet Archive, Sep 24 2013
http://web.archive.org/web/20130924053315/http://futureoftheinternet/2013/09/22/perma

Right-click link: Get from perma.cc
http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
enabler: <a href="URI-R" data-versionurl="URI-M">

Receive Memento from perma.cc, Oct 2 2013
http://perma.cc/0Hg62eLdZ3T

Link in NYT was:
<a href="http://perma.cc/0Hg62eLdZ3T">
Changed to:
<a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/"
data-versionurl="http://perma.cc/0Hg62eLdZ3T">
All previous options available

Added:
<META itemprop="datePublished" content="2013-09-22">

Click Link (done on November 25 2013)
http://en.wikipedia.org/wiki/Link_rot
enabler: <a href="URI-R">

Receive Page
http://en.wikipedia.org/wiki/Link_rot

Scroll down in page
Shows Perma.cc link, added October 22 2013, a month after the blog post

Right-click link: Get at page date
http://en.wikipedia.org/wiki/Link_rot
enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-22">

Receive Page
http://en.wikipedia.org/w/index.php?title=Link_rot&oldid=571327764

Scroll down in page
Does not show Perma.cc link, added October 22 2013, a month after the blog post

Link in blog was:
<a href="http://librarylab.law.harvard.edu">
Changed (for fun) to:
<a href="http://librarylab.law.harvard.edu" data-versiondate="2010-09-22">

Click Link (done on November 25 2013)
http://librarylab.law.harvard.edu
enabler: <a href="URI-R">

Receive Page
http://librarylab.law.harvard.edu

Right-click link: Get at page date
http://librarylab.law.harvard.edu
enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-22">

Receive Memento from archive.is, Jun 21 2013
http://archive.is/20130621162538/http://librarylab.law.harvard.edu

Right-click link: Get at link date
http://librarylab.law.harvard.edu
enabler: <a href="URI-R" data-versiondate="2010-09-22">

Receive Memento from Internet Archive, Sep 18 2010
http://web.archive.org/web/20100918025331/http://librarylab.law.harvard.edu

Bottom Line: A Link Leads to Many Times and Archives

http://mementoweb.org/missing-link/

More Related Content

What's hot

Interoperability for web based scholarship
Interoperability for web based scholarshipInteroperability for web based scholarship
Interoperability for web based scholarshipHerbert Van de Sompel
 
Persistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DonePersistent Identification: Easier Said than Done
Persistent Identification: Easier Said than DoneHerbert Van de Sompel
 
Discovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDDiscovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDMartin Klein
 
Linked Open Data for Libraries
Linked Open Data for LibrariesLinked Open Data for Libraries
Linked Open Data for LibrariesLukas Koster
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for RepositoriesMartin Klein
 
The web is rotting and what to do about it
The web is rotting and what to do about itThe web is rotting and what to do about it
The web is rotting and what to do about itHerbert Van de Sompel
 
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

Hiberlink: Investigating Reference Rot, December 2013

  • 1. Investigating Reference Rot in Web-Based Scholarly Communication Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp Martin Klein Los Alamos National Laboratory @mart1nkle1n http://hiberlink.org #hiberlink http://mementoweb.org #memento Hiberlink is funded by the Andrew W. Mellon Foundation
  • 2. Hiberlink Project Partners • Los Alamos National Laboratory: • Research Library: Martin Klein, Robert Sanderson, Herbert Van de Sompel • University of Edinburgh: • Edina: Peter Burnhill, Neil Mayo, Muriel Mewissen, Christine Rees, Tim Stickland, Richard Wincewicz • Language Technology Group: Beatrice Alex, Claire Grover, Richard Tobin, Ke “Adam” Zhou • Funding: Andrew W. Mellon Foundation Herbert Van de Sompel, Martin Klein – Hiberlink CNI Fall 2013, Washington, DC, December 9 2013
  • 3. Acknowledgments • Primary datasets: arXiv, Chesapeake Project, Elsevier, PubMed Central, PLoS, … (many more to come) • Secondary datasets: Ex Libris, MS Academic, SerialsSolutions • Technology support: CrossRef Labs, CrossRef Prospect, Elsevier • Liaisons: archive.is, CrossRef, Internet Archive, Old Dominion University Web Science & Digital Library Research Group, perma.cc
  • 4. Reference Rot
  • 5. Problem Domain • Web-based scholarly communication links to and references Web resources: • Formal citing of scholarly resources • Referencing “Web at Large” resources needed or created in research activities, e.g. project websites, software, ontologies, workflows, online debate, slides, blogs, videos, etc.
  • 6. Problem Domain • Links to web resources are subject to Reference Rot: • Link Rot: Link stops working, e.g. HTTP 404 • Content Decay: Linked content changes over time
  • 7. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • an increasingly blurry boundary
  • 8. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI
  • 9. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content
  • 10. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, …
  • 11. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, … • There are issues here too, see David Rosenthal blog post http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
  • 12. References to Scholarly Resources • We hope/assume that peer-reviewed scholarly literature has fixity and is adequately archived • This, BTW, might not be a correct assumption: • Dynamic, content rich, landing pages • No public audit regarding archival status of electronic journal literature archived in special-purpose infrastructure • Poor archiving in public web archives, related to protected content • Initial information in Keepers Registry indicates spotty archiving of electronic journal literature • … Still, this is NOT what Hiberlink investigates • See David Rosenthal blog post http://blog.dshr.org/2013/11/patio-perspectives-at-anadp-ii.html
  • 13. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, … • To Web at Large Resources: Hiberlink focus
  • 14. References to “Web at Large” Resources • Hiberlink focuses on the wide variety of web resources needed or created in research activities • These resources: • Are not necessarily under the custodianship of a party that cares about long term integrity, access • Do not necessarily have the same sense of fixity that e.g. journal articles have • Reference Rot makes it impossible to adequately recreate the temporal context for scholarly discourse
  • 15. Herbert Van de Sompel, et al. (2004) http://dx.doi.org/10.1045/september2004-vandesompel
  • 16. [Figure labels: !Exist Archived, Exist Archived, !Exist Archived, !Exist !Archived, Exist Archived]
  • 17. Hiberlink: Investigating Reference Rot • Hiberlink explores references to Web at Large resources: • Quantifies Reference Rot • Explores potential solutions to Reference Rot • Focuses on links in electronic journal articles • But has the big picture in mind: dynamic, interdependent, web-based scholarly assets • See Herbert Van de Sompel, From the Version of Record to a Version of the Record, CNI Spring 2013 plenary talk - http://www.youtube.com/watch?v=fhrGSQbNVA
  • 18. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, … • Is it worth our time to study this?
  • 19. Articles Increasingly Link to Web Resources • URIs extracted from PubMed papers – links to Web at Large resources
  • 20. The New York Times Cares http://www.nytimes.com/2013/09/24/us/politics/in-supreme-court-opinions-clicks-that-lead-nowhere.html
  • 21. Reference Rot in Law Journals • Zittrain, J., Albert, K., Lessig, L. (2013) Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations • Link rot in law journals: ~27% • Reference rot in law journals: ~70% http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2329161
  • 22. Not Just in Scholarly Communication • Zittrain, J., Albert, K., Lessig, L. (2013) Perma: Scoping and Addressing the Problem of Link and Reference Rot in Legal Citations • Liebler, R., Liebert, J. (2012) Something rotten in the State of Legal Citation • Link rot: 29% of links in Supreme Court decisions (study of 1996-2010) • Reference rot, including link rot: 49.9% of links in Supreme Court decisions http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2329161 http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2188070
  • 23. Not Just in Scholarly Communication http://en.wikipedia.org/wiki/Wikipedia_talk:Link_rot
  • 24. Quantifying Reference Rot
  • 25. Quantifying Reference Rot • Reference Rot has been studied before: • For the web at large • For scholarly communication • For government documents • What is different with Hiberlink? • Investigates Reference Rot not just link rot, i.e. includes the aspect of changing content not just rotting links • Investigates coverage of referenced resources in web archives • Operates at a massive scale regarding number of journal articles, referenced URIs, web archive lookups
  • 26. STUDY, Author (Date) | Year of Publication of Citations | # URIs | # URIs looked up in web archives
      Lawrence (2001) | 1993-1999 | 67,577 |
      Casserly (2003) | 1999-2000 | 500 | 500
      Casserly (2007) | 1999-2000 | 500 | 500
      Rumsey (2002) | 1997-2001 | 3,406 |
      Davis (2002) | 1999-2001 | 688 |
      Wren (2004) | 1994-2002 | 1,630 |
      Sellitto (2005) | 1995-2003 | 1,043 |
      Goh (2005) | 1997-2003 | 2,516 |
      Dimitrova (2007) | 2000-2003 | 1,126 |
      McCown (2005) | 1995-2004 | 4,387 |
      Wagner (2009) | 2002-2004 | 2,011 | 2,011
      Parker (2007) | 2002-2005 | 1,229 |
      Duda (2008) | 1997-2005 | 2,100 |
      Falagas (2007) | 2003-2006 | 1,417 |
      Russell (2008) | 1999-2006 | 510 |
      Wren (2008) | 1994-2007 | 6,154 |
      Moghaddam (2010) | 1995-2008 | 1,761 | 1,761
      Sanderson (2011) | 1993-2010 | 162,052 | 162,052
      Source: Sanderson, R., Phillips, M., and Van de Sompel, H. (2011) http://arxiv.org/abs/1105.3459
  • 27. Quantifying Reference Rot - Methodology
  • 28.
  • 29. • Various full text corpora • Articles 01/1997-12/2012
  • 30. • URI extraction from XML and PDF • Improvement on URI extraction techniques used in prior research • Validation study planned
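The extraction step on slide 30 could be approximated along the following lines. This is a minimal sketch, not the Hiberlink extractor: the regular expression, the trailing-punctuation rule, and the name `extract_uris` are illustrative assumptions, and it works on plain text that has already been pulled out of the XML or PDF.

```python
import re

# Hypothetical pattern: an http(s) URI runs until whitespace, an angle bracket, or a quote.
URI_PATTERN = re.compile(r"""https?://[^\s<>"']+""", re.IGNORECASE)

# Characters that often follow a URI in running text but are rarely part of it.
TRAILING = ".,;:!?)]}"


def extract_uris(text):
    """Return the URIs found in text, in order of first appearance, without duplicates."""
    found = []
    for match in URI_PATTERN.finditer(text):
        uri = match.group(0).rstrip(TRAILING)
        if uri not in found:
            found.append(uri)
    return found
```

Stripping trailing punctuation is a simplification: a URI that legitimately ends in a closing parenthesis would be truncated, which is one reason a validation study of the kind the slide mentions is needed.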
  • 31. • Referencing article • Referencing journal • Article dates: submission, acceptance, publication • URI position: abstract, body, footnote, references
  • 32. • Filter DOIs, HTTP version of DOIs • Filter URIs that should have been referenced by means of a DOI • Supported by secondary datasets • Filter obvious noise, e.g. localhost, example.org, foo.bar, licenses, etc.
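The filtering on slide 32 might look roughly like this. It is a sketch under stated assumptions: the host sets and the name `is_web_at_large` are hypothetical, and it covers only DOI links and the noise hosts named on the slide. Filtering licenses, or URIs that should have been referenced by a DOI, would need the secondary datasets and is not attempted here.

```python
from urllib.parse import urlsplit

# Both sets are illustrative assumptions, not the project's actual lists.
DOI_HOSTS = {"doi.org", "dx.doi.org"}
NOISE_HOSTS = {"localhost", "example.org", "foo.bar"}


def is_web_at_large(uri):
    """Return True when uri is neither a DOI link nor obvious noise."""
    if uri.lower().startswith("doi:"):
        return False
    host = (urlsplit(uri).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host:
        return False
    return host not in DOI_HOSTS and host not in NOISE_HOSTS
```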
  • 33.
  • 34. • HTTP HEAD on referenced URI-R • Follow redirects up to a maximum of 50 • Record HTTP transaction chain • If HTTP transaction chain ends with 2XX status code: Exists • If HTTP transaction chain does not end with 2XX: !Exist
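The existence check on slide 34 can be sketched as below. The limit of 50 redirects and the 2XX rule come from the slide; everything else is an assumption of this sketch, in particular the caller-supplied `head` function, which stands in for whatever HTTP client performs a single HEAD request without following redirects.

```python
from urllib.parse import urljoin

MAX_REDIRECTS = 50  # limit stated on the slide


def follow_head_chain(uri, head, max_redirects=MAX_REDIRECTS):
    """Issue HEAD requests starting at uri and return the recorded chain.

    head is a hypothetical callable: it takes a URI, performs one HEAD
    request, and returns (status_code, location_header_or_None).
    The chain is a list of (uri, status_code) pairs.
    """
    chain = []
    current = uri
    # One initial request plus at most max_redirects followed redirects.
    for _ in range(max_redirects + 1):
        status, location = head(current)
        chain.append((current, status))
        if 300 <= status < 400 and location:
            current = urljoin(current, location)  # resolve relative Location values
            continue
        break
    return chain


def exists(chain):
    """A URI-R counts as Exists only if the chain ends with a 2XX status."""
    return bool(chain) and 200 <= chain[-1][1] < 300
```

Injecting `head` keeps the classification logic testable without network access. A redirect loop simply exhausts the limit and ends on a 3XX status, so it is classified as !Exist.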
  • 35. • Lookup in web archives via a Memento Aggregator that covers among others Internet Archive, Archive-It, archive.is, British Library web archive, UK National Archives web archive, Icelandic web archive
  • 36. • Obtain TimeMap per URI • If TimeMap does not exist: !Archived • If TimeMap exists, select Memento URI-M closest to article publication date • HTTP HEAD on URI-M • Follow archived redirects up to a maximum of 50 • Record HTTP transaction chain • If HTTP transaction chain ends with 2XX: Archived • If HTTP transaction chain does not end with 2XX: !Archived
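Selecting the Memento closest to the publication date, as described on slide 36, can be illustrated with a small parser for a link-format TimeMap. This is a sketch, assuming the TimeMap arrives as link-format text with `rel` and `datetime` attributes; the regular expressions are simplified, the sample TimeMap is constructed by hand from the two URI-Ms shown on slides 95 and 97, and the publication date used is hypothetical.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Simplified link-format parsing: <uri>; attribute list up to the next "<".
ENTRY = re.compile(r'<([^>]+)>\s*;([^<]*)')
REL = re.compile(r'rel="([^"]*)"')
DATETIME = re.compile(r'datetime="([^"]*)"')

# Hand-built sample, not a real archive response.
SAMPLE_TIMEMAP = '''<http://librarylab.law.harvard.edu>; rel="original",
<http://web.archive.org/web/20100918025331/http://librarylab.law.harvard.edu>; rel="first memento"; datetime="Sat, 18 Sep 2010 02:53:31 GMT",
<http://archive.is/20130621162538/http://librarylab.law.harvard.edu>; rel="memento"; datetime="Fri, 21 Jun 2013 16:25:38 GMT"'''


def parse_timemap(link_format_text):
    """Return a list of (archival datetime, URI-M) pairs for the memento entries."""
    mementos = []
    for uri, params in ENTRY.findall(link_format_text):
        rel = REL.search(params)
        when = DATETIME.search(params)
        if rel and when and "memento" in rel.group(1).split():
            mementos.append((parsedate_to_datetime(when.group(1)), uri))
    return mementos


def closest_memento(mementos, publication_date):
    """Return the (datetime, URI-M) pair nearest to publication_date, or None."""
    if not mementos:
        return None
    return min(mementos, key=lambda m: abs(m[0] - publication_date))


# Hypothetical publication date, timezone-aware so it compares with parsed datetimes.
publication = datetime(2010, 9, 22, tzinfo=timezone.utc)
best = closest_memento(parse_timemap(SAMPLE_TIMEMAP), publication)
```

An empty result from `parse_timemap` corresponds to the !Archived case; the HEAD check on the selected URI-M would follow as a separate step.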
  • 37. Data used for analysis
  • 38. Quantifying Reference Rot – Early Results [Figure overview; recoverable labels: shares 31.2%, 16.8%, 11.3%, 40.7%; series !Exist, Archived, Archived within 30 days, 14 days, 7 days, 1 day; axes: years 1997-2011, Weeks, Amount of citations]
  • 39. Study: PubMed Central Corpus 01/1997 – 12/2012 • Articles processed: 494,785 • Articles that contain Web at Large URIs: 176,527 • References to Web at Large URIs: 557,432 • Unique referenced Web at Large URIs: 327,782
  • 40. Percentage Exists & Archived Referenced URIs [Pie chart; categories: Exists & Archived, !Exists & Archived, Exists & !Archived, !Exists & !Archived; shares as extracted, not matched to categories: 31.2%, 16.8%, 11.3%, 40.7%] URIs extracted from PubMed papers – links to Web at Large resources
  • 41. Percentage Exists & Archived in 30 Day Window [Pie chart; categories: Exists & Archived, !Exists & Archived, Exists & !Archived, !Exists & !Archived; shares as extracted, not matched to categories: 23%, 16.7%, 5.1%, 55.2%] URIs extracted from PubMed papers – links to Web at Large resources
  • 42. Percentage Exists & Archived in 15 Day Window [Pie chart; categories: Exists & Archived, !Exists & Archived, Exists & !Archived, !Exists & !Archived; shares as extracted, not matched to categories: 24.6%, 12.4%, 3.5%, 59.5%] URIs extracted from PubMed papers – links to Web at Large resources
  • 43. Percentage Exists & Archived in 07 Day Window [Pie chart; categories: Exists & Archived, !Exists & Archived, Exists & !Archived, !Exists & !Archived; shares as extracted, not matched to categories: 25.8%, 8.8%, 2.3%, 63.1%] URIs extracted from PubMed papers – links to Web at Large resources
  • 44. Percentage Exists & Archived in 01 Day Window [Pie chart; categories: Exists & Archived, !Exists & Archived, Exists & !Archived, !Exists & !Archived; shares as extracted, not matched to categories: 27.9%, 0.9%, 0.2%, 71%] URIs extracted from PubMed papers – links to Web at Large resources
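The windowed variants on slides 41 to 44 ask whether any Memento falls within a given number of days of the article's publication date. A minimal sketch of that test follows; it assumes the window is symmetric around the publication date (the slides do not say whether Mementos before and after both count), and the function names are hypothetical. The window sizes are those in the chart legends.

```python
from datetime import datetime, timedelta, timezone

WINDOWS_DAYS = (30, 14, 7, 1)  # windows named in the chart legends


def archived_within(memento_datetimes, publication_date, days):
    """True if any Memento lies within +/- days of publication_date (assumed symmetric)."""
    window = timedelta(days=days)
    return any(abs(m - publication_date) <= window for m in memento_datetimes)


def window_report(memento_datetimes, publication_date):
    """Map each window size in days to whether the URI counts as archived within it."""
    return {
        days: archived_within(memento_datetimes, publication_date, days)
        for days in WINDOWS_DAYS
    }
```

For a page dated 2013-09-23 with a single Memento captured on 2013-09-24 at 05:33:15 UTC, the report is true for the 30, 14 and 7 day windows and false for the 1 day window, since the capture is a little over 29 hours away.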
  • 45. Percentage of !Exists per Year [Chart; y-axis: Percent, 0-100; x-axis: 1997-2011] URIs extracted from PubMed papers – links to Web at Large resources
  • 46. Percentage of !Exists, Archived per Year [Chart; series: !Exist, Archived, Archived within 30 days, 14 days, 7 days, 1 day; y-axis: 0-100; x-axis: 1997-2011] URIs extracted from PubMed papers – links to Web at Large resources
  • 47. Percentage of !Exists and of Those Archived per Year [Chart; axes: Percentage !Exists URIs and Percentage Archived URIs for !Exists URIs, each 0-100 Percent; series: !Exist, Archived, Archived within 30 days, 14 days, 7 days, 1 day; x-axis: 1997-2011] URIs extracted from PubMed papers – links to Web at Large resources
  • 48. Absolute Number of Archived per Year [Chart; series: Archived, Archived within 30 days, 14 days, 7 days, 1 day; y-axis ticks: 1, 100, 1000, 10000, 30000; x-axis: 1997-2011] URIs extracted from PubMed papers – links to Web at Large resources
  • 49. Solving Reference Rot
  • 50. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, …
  • 51. Addressing Content Decay • Aim for a more pro-active approach to collect snapshots of web resources (likely to be) referenced in scholarly communication • A system that hosts resources that are likely to be referenced in scholarly communication can create snapshots of itself by: o Using CMS, wikis, datawikis with solid versioning mechanisms o Subscribing to on-demand self web archiving service o Using transactional web archives, cf. SiteStory • Referenced resources can be web archived on-demand: o By authors during note taking, authoring o By platforms involved in the publication process, e.g. archiving linked resources at the time of manuscript submission
  • 52. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, … • To Web at Large Resources, Content Decay: Web archiving; Content Versioning Systems; Self archiving
  • 53. Click link to blog post http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
  • 54. Receive page http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
  • 55. Search and find Mementos in Internet Archive for http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
  • 56. Search and find a Memento in archive.is for http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
  • 57. Click perma.cc link to Memento of blog post http://perma.cc/0Hg62eLdZ3T
  • 58. Receive Memento from perma.cc http://perma.cc/0Hg62eLdZ3T
  • 59. Search and do not find Mementos in Internet Archive for http://perma.cc/0Hg62eLdZ3T
  • 60. Search and do not find Mementos in archive.is for http://perma.cc/0Hg62eLdZ3T
  • 61. What Happened? • Good news: The number of archived copies of the blog post was increased by pro-actively creating a Memento in perma.cc • Bad news: The possibility of finding Mementos for the blog post in other web archives was undermined by replacing the Original URI-R with the Memento URI-M • The Memento URI-M is a key in only one archive • The Original URI-R is a key in all web archives • Using the Memento URI-M in a link requires the permanent existence/uptime of the archive that issued it • One link rot problem was replaced by another …
  • 62. Web Archives Less Permanent than Permanent? http://webcitation.org
  • 63. Web Archives Less Permanent than Permanent? http://ws-dl.blogspot.com/2013/11/2013-11-21-conservative-party-speeches.html
  • 64. Web Archives Less Permanent than Permanent? http://richmondsfblog.com/2013/11/06/part-of-internet-archive-building-badly-burned-in-earlymorning-fire/
  • 65. What To Do? • Need an approach for referencing archived resources that supports lookups in many web archives, not just one • Since the Original URI-R is a key in all web archives, the linking approach needs to necessarily include it • Hence, two URIs are required: • The Original URI-R • The Memento URI-M, e.g. the perma.cc URI • But a link in HTML only carries one URI! • It is understandable that the Memento URI-M is used for the link: the approach works with existing web infrastructure • Yet, an approach to address link rot that itself is subject to link rot is … err… problematic
  • 66. The Missing Link Proposal • Extend the link to the Original URI-R with temporal context: • Memento URI-M in a specific archive • Dates: • date of page that contains the link • date of the link, cf. “accessed at” in citations of web resources • Provide the Original URI-R and the temporal context in a machine-actionable manner so it can be used by user and machine agents to retrieve Mementos from various web archives http://mementoweb.org/missing-link/
  • 67. The Missing Link Proposal http://mementoweb.org/missing-link/
  • 68. How to Make Missing Link Happen? • The existing approach works out of the box but is problematic • Missing Link requires infrastructure changes but generally contributes to increased web persistence: • HTML • META for page date: no problem, already in use • Attributes for <a> to convey URI-M and link date: • data- extensibility mechanism in HTML5 can be used but is not intended for cross-site applications • In 1995, HTML had the URN attribute for <a> as a means to address web persistence concerns • Browser, tool support
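How a machine agent might read the temporal context that slides 66 and 68 propose can be sketched with Python's standard `html.parser`. The attribute names `data-versionurl` and `data-versiondate` and the `datePublished` META element are the ones shown in the demo slides; the class name, the dictionary keys, and the sample markup (assembled from the links on those slides) are assumptions of this sketch.

```python
from html.parser import HTMLParser

# Sample markup assembled from the demo slides; not a real page.
SAMPLE_HTML = '''<META itemprop="datePublished" content="2013-09-23">
<a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/" data-versionurl="http://perma.cc/0Hg62eLdZ3T">blog post</a>
<a href="http://librarylab.law.harvard.edu" data-versiondate="2010-09-22">library lab</a>'''


class MissingLinkParser(HTMLParser):
    """Collect the page date and, per link, the URI-R plus optional URI-M and link date."""

    def __init__(self):
        super().__init__()
        self.page_date = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("itemprop") == "datePublished":
            self.page_date = attributes.get("content")
        elif tag == "a" and "href" in attributes:
            self.links.append({
                "uri_r": attributes["href"],
                "uri_m": attributes.get("data-versionurl"),
                "link_date": attributes.get("data-versiondate"),
            })


def parse_missing_link(html):
    """Return (page_date, links) for an HTML string."""
    parser = MissingLinkParser()
    parser.feed(html)
    return parser.page_date, parser.links
```

Because the Original URI-R stays in `href`, a client that ignores the extra attributes still follows the link exactly as before.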
  • 69. References in Web-Based Scholarly Communication (matrix: To Scholarly Resources / To Web at Large Resources by Link Rot / Content Decay) • To Scholarly Resources, Link Rot: DOI, HTTP version of DOI • To Scholarly Resources, Content Decay: Fixity of content; Archiving: CLoCKSS, LoCKSS, Portico, Keepers Registry, … • To Web at Large Resources, Link Rot: Missing Link proposal • To Web at Large Resources, Content Decay: Web archiving; Content Versioning Systems; Self archiving
  • 70. Demo: Application Using Temporal Context for Links
  • 71. Application Using Temporal Context for Links • Memento for Chrome is an application that uses Original URI-R and dates to access Mementos in various web archives • Memento around the date selected in user interface calendar • Most recently archived Memento
  • 72. Memento Time Travel for Chrome http://bit.ly/memento-for-chrome
  • 73. Memento Time Travel for Chrome http://www.youtube.com/watch?v=0_70lQPOOIg http://www.youtube.com/watch?v=WtZHKeFwjzk
  • 74. Application Using Temporal Context for Links • An experimental version of Memento for Chrome also uses Missing Link information (Original URI-R, URI-M, and dates) to access Mementos in various web archives: • Memento around the date selected in user interface calendar • Most recently archived Memento • Memento around the date of the page that contains the link • Memento around the date of the link • Memento URI-M in a specific archive • A Memento client is just one example of an application that can use temporal context provided for links. Other applications, including search engines, can use it too
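The options listed on slide 74 amount to turning a link's temporal context into a set of lookups. The sketch below is not the Memento for Chrome implementation; it assumes dates arrive as `YYYY-MM-DD` strings and that date-based lookups are expressed as an `Accept-Datetime` value, the datetime negotiation header of the Memento protocol. The function names and option labels are hypothetical, and the calendar option is left out because it depends on user input.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime(iso_date):
    """Format a YYYY-MM-DD date as an HTTP-date suitable for Accept-Datetime."""
    moment = datetime.strptime(iso_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return format_datetime(moment, usegmt=True)


def lookup_options(uri_r, page_date=None, link_date=None, uri_m=None):
    """Return (label, target URI, Accept-Datetime or None) for each available option."""
    options = [("most recent Memento", uri_r, None)]
    if page_date:
        options.append(("Memento around page date", uri_r, accept_datetime(page_date)))
    if link_date:
        options.append(("Memento around link date", uri_r, accept_datetime(link_date)))
    if uri_m:
        options.append(("Memento in a specific archive", uri_m, None))
    return options
```

Only the last option depends on a single archive; every other option is keyed on the Original URI-R and can be resolved against any archive that holds a copy.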
  • 75. NYT has <META itemprop="datePublished" content="2013-09-23"> • Link in NYT was: <a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/"> • Changed to: <a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/" data-versionurl="http://perma.cc/0Hg62eLdZ3T">
  • 76. Right Click Link: Get near current time (done on Nov 25 2013) http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/ enabler: <a href="URI-R">
  • 77. Receive Memento from archive.is, Nov 24 2013 http://archive.is/20131124221749/http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/
  • 78. Right Click Link: Get at page date http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/ enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-23">
  • 79. Receive Memento from Internet Archive, Sep 24 2013 http://web.archive.org/web/20130924053315/http://futureoftheinternet/2013/09/22/perma
  • 80. Right Click Link: Get from perma.cc http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/ enabler: <a href="URI-R" data-versionurl="URI-M">
  • 81. Receive Memento from perma.cc, Oct 2 2013 http://perma.cc/0Hg62eLdZ3T
  • 82. Link in NYT was: <a href="http://perma.cc/0Hg62eLdZ3T"> • Changed to: <a href="http://blogs.law.harvard.edu/futureoftheinternet/2013/09/22/perma/" data-versionurl="http://perma.cc/0Hg62eLdZ3T">
  • 83. All previous options available
  • 84. Added: <META itemprop="datePublished" content="2013-09-22">
  • 85. Click Link (done on November 25 2013) http://en.wikipedia.org/wiki/Link_rot enabler: <a href="URI-R">
  • 86. Receive Page http://en.wikipedia.org/wiki/Link_rot
  • 87. Scroll down in page: shows Perma.cc link, added October 22 2013, a month after the blog post
  • 88. Right Click Link: Get at page date http://en.wikipedia.org/Link_rot enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-22">
  • 89. Receive Page http://en.wikipedia.org/w/index.php?title=Link_rot&oldid=571327764
  • 90. Scroll down in page: does not show Perma.cc link, added October 22 2013, a month after the blog post
  • 91. Link in blog was: <a href="http://librarylab.law.harvard.edu"> • Changed (for fun) to: <a href="http://librarylab.law.harvard.edu" data-versiondate="2010-09-22">
  • 92. Click Link (done on November 25 2013) http://librarylab.law.harvard.edu enabler: <a href="URI-R">
  • 93. Receive Page http://librarylab.law.harvard.edu
  • 94. Right Click Link: Get at page date http://librarylab.law.harvard.edu enabler: <a href="URI-R"> & <META itemprop="datePublished" content="2013-09-22">
  • 95. Receive Memento from archive.is, Jun 21 2013 http://archive.is/20130621162538/http://librarylab.law.harvard.edu
  • 96. Right Click Link: Get at link date http://librarylab.law.harvard.edu enabler: <a href="URI-R" data-versiondate="2010-09-22">
  • 97. Receive Memento from Internet Archive, Sep 18 2010 http://web.archive.org/web/20100918025331/http://librarylab.law.harvard.edu
  • 98. Bottom Line: A Link Leads to Many Times and Archives http://mementoweb.org/missing-link/
  • 99. Investigating Reference Rot in Web-Based Scholarly Communication Herbert Van de Sompel Los Alamos National Laboratory @hvdsomp Martin Klein Los Alamos National Laboratory @mart1nkle1n http://hiberlink.org #hiberlink http://mementoweb.org #memento Hiberlink is funded by the Andrew W. Mellon Foundation

Editor's Notes

  1. The basic consideration in the talk is that life used to be simple when scholarly assets were PDFs: single frozen assets
  2. The problem appears in scholarly communication, legal journals, Supreme Court opinions, Wikipedia, … Since the problem is so broad, we need a solution that works for the web at large, not just for scholarly communication.
  3. Quote from Wagner et al.: “Because sites such as Internet Archive and WebCite will remove archived web pages at the owners’ request, authors should not depend on these utilities as the sole archives for web-based information.”