Measure All the (Web Archiving) Things!
Nicholas Taylor
Web Archiving Service Manager
Stanford University Libraries
Archive-It Partner Meeting
August 18, 2015
how many more websites are we archiving?
“Library_01.jpg” by British Library
crawl report list
Archive-It: “Crawls for Account #198”
seeds for individual crawl
Archive-It: “Seeds for Crawl #99435”
download seed list
Archive-It: “Seeds for Crawl #99435”
downloaded seed list
whew, that was easy!
oh, wait a minute…
seed lists are per crawl
well, how many crawls are there?
• 6 accounts
• oldest active since 2007
• 30+ collections
• hundreds of crawls
count and average not enough
• seeds move in and out of crawls
• seeds have different frequencies
• new seeds w/ new URLs for old seeds
• “university website” is many seeds
plus
• non-Archive-It web archiving activity
“Dichotomic Maples” by francoismi under CC BY-NC-SA 2.0
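One way to get past a simple count of per-crawl seed lists is to merge them and count distinct seeds per year. The sketch below is a minimal illustration, not an Archive-It feature or export format: the input shape (a mapping from crawl date to a list of seed URLs), the function names, and the normalization rule are all assumptions made for the example. It only addresses seeds moving in and out of crawls; differing crawl frequencies and seeds whose URLs changed over time would still need separate handling.

```python
from urllib.parse import urlsplit


def normalize_seed(url: str) -> str:
    """Reduce a seed URL to a rough comparison key (hypothetical rule).

    Ignores the scheme, a leading "www.", and a trailing slash so that
    trivially different spellings of one seed are counted once.
    """
    url = url.strip()
    if "://" not in url:
        # Assume bare hosts/paths were meant as http URLs.
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def count_unique_seeds(seed_lists: dict[str, list[str]]) -> dict[str, int]:
    """Count distinct seeds per year.

    seed_lists maps an ISO crawl date ("YYYY-MM-DD") to the seed URLs
    of that crawl; this layout is an assumption for the example.
    """
    seen_by_year: dict[str, set[str]] = {}
    for crawl_date, seeds in seed_lists.items():
        year = crawl_date[:4]
        bucket = seen_by_year.setdefault(year, set())
        for seed in seeds:
            if seed.strip():
                bucket.add(normalize_seed(seed))
    return {year: len(keys) for year, keys in sorted(seen_by_year.items())}


if __name__ == "__main__":
    # Made-up seed lists standing in for downloaded per-crawl files.
    example = {
        "2014-03-01": ["http://www.example.edu/", "http://example.edu/news/"],
        "2014-09-01": ["https://example.edu", "http://library.example.edu/"],
        "2015-02-01": ["http://example.edu/", "http://example.edu/news"],
    }
    print(count_unique_seeds(example))
```

Counting by normalized key rather than by raw line keeps a seed that appears in several crawls from inflating the total, at the cost of occasionally merging URLs that a curator would consider distinct.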
“what gets measured, gets managed”
“Gudauri still life” by Carsten ten Brink under CC BY-NC-ND 2.0
why measure?
• advocacy/outreach
• service modeling
• program assessment
• policy making
• staffing assessment
• grant support
• prioritization
• risk assessment
“Measuring river depth” by epeirogenic under CC BY-NC 2.0
what to measure?
• How to handle the data volume?
• What is the usage of web archives?
• How much does web archiving cost?
• How to assure the quality of archived content?
• How to secure institutional buy-in?
• How much loss have resources suffered?
• What is the impact of policy requirements?
community-valued metrics
[bar chart: percentage of organizations (axis 0% to 60%) for each of Volume, Usage, Cost, Quality, Buy-in, Loss, Policy]
NDSA: “Web Archiving in the United States: a 2013 Survey”
volume
• websites
– captured
– preserved
– described
• data
– captured
– preserved
• objects
– captured
– preserved
“typography jumble” by Bill Dickinson under CC BY-NC 2.0
usage
• web analytics
– visitors
– visits
– referrers
• actual use cases (who + how many?)
– research
– teaching
– institutional legacy
– compliance
“113/365 Days: A page from my heart” by LaughingRhoda under CC BY-NC-ND 2.0
cost
• external
– out-payments for web archiving services
– quota utilization
• internal
– staff time, by activity
– storage
“Largest square from a dollar bill” by origami_madness under CC BY-NC 2.0
performance
• accessioning throughput
• service request turnaround
• collections/websites w/ discovery records
• time to regenerate full-text index
“Lower rack” by Andy Melton under CC BY-SA 2.0
community-valued…metrics?
[bar chart: percentage of organizations (axis 0% to 60%) for each of Volume, Usage, Cost, Quality, Buy-in, Loss, Policy]
NDSA: “Web Archiving in the United States: a 2013 Survey”
“not everything that counts can be counted”
“Ten Floods, Twenty-Five Trees, Nineteen Bubbles...” by Flood G. under CC BY-NC-ND 2.0
quality
• use case-specific?
• benchmark to ideal or to limits of tools?
• quantifiable metrics?
• existing metrics as proxies for quality?
• sampling approach?
• not just missing content but also collected junk
NYARC: “I. Introduction - NYARC Documentation”
buy-in
• unique nominators?
• projects w/ web archiving component?
• budgetary commitments?
• resource commitments?
• charge for service?
• testimonials?
“The Play” by Ryan Hyde under CC BY-SA 2.0
loss
UK Web Archive: “Ten years of the UK Web Archive: What have we saved?”
policy
• first capture under embargo
• opt-out requests
• takedown requests
• external environment
“We apologise for any convenience - Update” by Alan Stanton under CC BY-SA 2.0
better measures, measuring better
“Line Art Project #2 VIS3 UCSD” by Mandy Jouan under CC BY-NC-ND 2.0
