SlideShare a Scribd company logo
1 of 34
Download to read offline
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmarks for
Digital Preservation Tools



Kresimir Duretec, Artur Kulmukhametov, Andreas Rauber and Christoph Becker

Vienna University of Technology and University of Toronto







IPRES 2015 , Chapel Hill, North Carolina, USA
November 3, 2015
1
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Overview
1. Need for evidence in digital
preservation
•Software tools in digital preservation
and evidence around their quality
2. Software benchmarks in digital
preservation
•What, Why, How
•Software quality
3. Benchmark proposals
•Photo migration
•Property extraction from documents
4. Are we ready? Next steps https://goo.gl/ps6x8I
In case you get bored
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Overview
1. Need for evidence in digital
preservation
•Software tools in digital preservation
and evidence around their quality
2. Software benchmarks in digital
preservation
•What, Why, How
•Software quality
3. Benchmark proposals
•Photo migration
•Property extraction from documents
4. Are we ready? Next steps https://goo.gl/ps6x8I
In case you get bored
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Evidence in digital preservation
Important topic !!!
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Evidence in digital preservation
Important topic !!!
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Software tools in digital preservation
•many tools (>400 tools
in COPTR registry
http://coptr.digipres.org)
•categories
•identification,
validation &
characterisation
•migration
•emulation
•….
•high quality required
Droid
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
However !!!
Blog post at
OPF by paul
on 19th Oct
2012.
http://openpreservation.org/blog/2012/10/19/practitioners-have-spoken-we-need-better-characterisation/
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
However !!!
Blog post at
OPF by johan
on 31st Jan
2014.
http://openpreservation.org/blog/2014/01/31/why-cant-we-have-digital-preservation-tools-just-work/
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
However !!!
Panel at IPRES
2014
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Evidence of software quality in digital
preservation
•the need for evaluation is not new !
•testbeds and numerous individual evaluation initiatives
•often take-up is missing
2002 2006 2010
Dutch Digital
Preservation Testbed
DELOS Digital
PreservationTestbed
Planets Testbed
?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Overview
1. Need for evidence in digital
preservation
•Software tools in digital preservation
and evidence around their quality
2. Software benchmarks in digital
preservation
•What, Why, How
•Software quality
3. Benchmark proposals
•Photo migration
•Property extraction from documents
4. Are we ready? Next steps https://goo.gl/ps6x8I
In case you are already
bored
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Software benchmarks
•Proposal : Start creating software benchmarks to extend
the evidence base around software quality of digital
preservation tools
•major characteristics
•collaborative
•open
•public
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Software benchmarks - definition
Source Definition key words
SPEC
glossary
“a "benchmark" is a test, or set of tests, designed to
compare the performance of one computer system against
the performance of others”
test, compare,
performance
IEEE
Glossary
“1. a standard against which measurements or comparisons
can be made. 2. a procedure, problem, or test that can be
used to compare systems or components to each other or
to a standard. 3. a recovery file”
measurements,
comparison,
test
W. Tichy “a benchmark is a task domain sample executed by a
computer or by a human and computer. During execution,
the human or computer records well-defined performance
measurements. A benchmark provides a level playing field
for competing ideas, and (assuming the benchmark is
sufficiently representative) allows repeatable and objective
comparisons”
task domain
sample,
performance
measurements,
repeatable
objective
comparison
Sim et al. “a test or set of tests used to compare the performance of
alternative tools or techniques”
test, compare
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Software benchmarks - definition
Source Definition key words
SPEC
glossary
“a "benchmark" is a test, or set of tests, designed to
compare the performance of one computer system against
the performance of others”
test, compare,
performance
IEEE
Glossary
“1. a standard against which measurements or comparisons
can be made. 2. a procedure, problem, or test that can be
used to compare systems or components to each other or
to a standard. 3. a recovery file”
measurements,
comparison,
test
W. Tichy “a benchmark is a task domain sample executed by a
computer or by a human and computer. During execution,
the human or computer records well-defined performance
measurements. A benchmark provides a level playing field
for competing ideas, and (assuming the benchmark is
sufficiently representative) allows repeatable and objective
comparisons”
task domain
sample,
performance
measurements,
repeatable
objective
comparison
Sim et al. “a test or set of tests used to compare the performance of
alternative tools or techniques”
test, compare
DP Software Benchmark is a set of tests which enable
rigorous comparison of software solutions
for a specific task in digital preservation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmarks in other fields
•Software Engineering
•transactions
•virtualisation
•databases
•…
•Information Retrieval
•text
•video
•music
•…
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmarks in other fields
community
specifications tools & methods that
need benchmarks
yearly meeting
resultsiterative process
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmark components
Software engineering
Motivating comparison
Task sample
Performance measures
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmark components
Information retrieval
Tasks
Datasets
Answersets
Measures
Representation formats
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmark components
Software engineering
Motivating comparison
Task sample
Performance measures
Information retrieval
Tasks
Datasets
Answersets
Measures
Representation formats
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Motivating comparison
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•What is benchmark supposed to compare ?
•Which quality aspects are addressed ?
•What would be the benefits ?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Motivating comparison
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•software quality
•many aspects
•we need a model
•ISO SQUARE
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Function
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•specific task tool is expected to perform
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Dataset
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•set of digital objects
•images
•audio
•video
•relevant
•accurate
•complete
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Ground truth
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•correct answers
•optional element
•machine readable
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Performance measures
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•fitness of the benchmarked tool
•type
•quantitative
•qualitative
•calculated
•by human
•by machine
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Overview
1. Need for evidence in digital
preservation
•Software tools in digital preservation
and evidence around their quality
2. Software benchmarks in digital
preservation
•What, Why, How
•Software quality
3. Benchmark proposals
•Raw Photo migration
•Property extraction from documents
4. Are we ready? Next steps https://goo.gl/ps6x8I
In case you are
bored
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
RAW Photo migration benchmark
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•various RAW file formats
•migrate to DNG
•many tools should be tested
•migrate from RAW to DNG
•collection of photographs in RAW format
•SSIM
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Property values extraction benchmark
Digital preservation
Motivating comparison
Function
Dataset
Ground truth
Performance measures
•different document properties
•coverage of needed properties
•correctness of extracted values
•extract property values from a document
•collection of documents in MS Word 2007
file format
•correct values
•coverage
•accuracy
•exact match ratio
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Overview
1. Need for evidence in digital
preservation
•Software tools in digital preservation
and evidence around their quality
2. Software benchmarks in digital
preservation
•What, Why, How
•Software quality
3. Benchmark proposals
•Raw Photo migration
•Property extraction from documents
4. Are we ready? Next steps https://goo.gl/ps6x8I
In case you are
bored
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Benchmarking - community driven process
•benchmarks emerge from the community consensus
•community participation needs to be enabled
•based on established results where possible
•community needs to incur the costs
•continuous evolution of benchmarks is necessary
•Are we ready ?
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Is the community ready ? Indicators
•projects
•workshops
•own conference
•several other
journals and
conferences
• concerns
•vocabularies
•metrics
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Conclusion
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Next step
create benchmark specifications
Benchmarking forum
workshop
Friday 6th November
9:00 - 17:00
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Questions ?
https://goo.gl/ps6x8I
In case you were not
bored

More Related Content

Viewers also liked

Alexander academy tuition-and-fees-all-programs-2015
Alexander academy tuition-and-fees-all-programs-2015Alexander academy tuition-and-fees-all-programs-2015
Alexander academy tuition-and-fees-all-programs-2015iamprosperous
 
Canadian tourism college hrm diplomas + testimonials
Canadian tourism college hrm diplomas + testimonialsCanadian tourism college hrm diplomas + testimonials
Canadian tourism college hrm diplomas + testimonialsiamprosperous
 
Queens university-handbook
Queens university-handbookQueens university-handbook
Queens university-handbookiamprosperous
 
Glenlyon norfolk school ib
Glenlyon norfolk school ibGlenlyon norfolk school ib
Glenlyon norfolk school ibiamprosperous
 
Pickering College - school and appearance&requirements and expectatins
Pickering College - school and appearance&requirements and expectatinsPickering College - school and appearance&requirements and expectatins
Pickering College - school and appearance&requirements and expectatinsiamprosperous
 
UCW 加西大学 工商管理硕士
UCW 加西大学   工商管理硕士UCW 加西大学   工商管理硕士
UCW 加西大学 工商管理硕士iamprosperous
 
Alexander college general flyer english
Alexander college general flyer englishAlexander college general flyer english
Alexander college general flyer englishiamprosperous
 
York School District unlocking potencial for learning
York School District unlocking potencial for learningYork School District unlocking potencial for learning
York School District unlocking potencial for learningiamprosperous
 

Viewers also liked (15)

Alexander academy tuition-and-fees-all-programs-2015
Alexander academy tuition-and-fees-all-programs-2015Alexander academy tuition-and-fees-all-programs-2015
Alexander academy tuition-and-fees-all-programs-2015
 
Gds
GdsGds
Gds
 
Canadian tourism college hrm diplomas + testimonials
Canadian tourism college hrm diplomas + testimonialsCanadian tourism college hrm diplomas + testimonials
Canadian tourism college hrm diplomas + testimonials
 
Queens university-handbook
Queens university-handbookQueens university-handbook
Queens university-handbook
 
Techniques for Preserving Scientific Software Executions: Preserve the Mess o...
Techniques for Preserving Scientific Software Executions: Preserve the Mess o...Techniques for Preserving Scientific Software Executions: Preserve the Mess o...
Techniques for Preserving Scientific Software Executions: Preserve the Mess o...
 
UCW b comm brochure
UCW b comm brochureUCW b comm brochure
UCW b comm brochure
 
Glenlyon norfolk school ib
Glenlyon norfolk school ibGlenlyon norfolk school ib
Glenlyon norfolk school ib
 
Pickering College - school and appearance&requirements and expectatins
Pickering College - school and appearance&requirements and expectatinsPickering College - school and appearance&requirements and expectatins
Pickering College - school and appearance&requirements and expectatins
 
UCW 加西大学 工商管理硕士
UCW 加西大学   工商管理硕士UCW 加西大学   工商管理硕士
UCW 加西大学 工商管理硕士
 
Alexander college general flyer english
Alexander college general flyer englishAlexander college general flyer english
Alexander college general flyer english
 
York School District unlocking potencial for learning
York School District unlocking potencial for learningYork School District unlocking potencial for learning
York School District unlocking potencial for learning
 
Archiving Deferred Representations Using a Two-Tiered Crawling Approach. Just...
Archiving Deferred Representations Using a Two-Tiered Crawling Approach. Just...Archiving Deferred Representations Using a Two-Tiered Crawling Approach. Just...
Archiving Deferred Representations Using a Two-Tiered Crawling Approach. Just...
 
Preserving the Fruit of Our Labor: Establishing Digital Preservation Policies...
Preserving the Fruit of Our Labor: Establishing Digital Preservation Policies...Preserving the Fruit of Our Labor: Establishing Digital Preservation Policies...
Preserving the Fruit of Our Labor: Establishing Digital Preservation Policies...
 
One Core Preservation System for all your Data. No Exceptions! Marco Klindt a...
One Core Preservation System for all your Data. No Exceptions! Marco Klindt a...One Core Preservation System for all your Data. No Exceptions! Marco Klindt a...
One Core Preservation System for all your Data. No Exceptions! Marco Klindt a...
 
Educational Records of Practice: Preservation and Access Concerns. Elizabeth ...
Educational Records of Practice: Preservation and Access Concerns. Elizabeth ...Educational Records of Practice: Preservation and Access Concerns. Elizabeth ...
Educational Records of Practice: Preservation and Access Concerns. Elizabeth ...
 

Similar to Benchmarks for Digital Preservation tools. Kresimir Duretec, Artur Kulmukhametov, Andreas Rauber and Christoph Becker

Work Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerWork Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerAdel Belasker
 
Abstract contents
Abstract contentsAbstract contents
Abstract contentsloisy28
 
bkremer-report-final
bkremer-report-finalbkremer-report-final
bkremer-report-finalBen Kremer
 
Software architecture for developers
Software architecture for developersSoftware architecture for developers
Software architecture for developersChinh Ngo Nguyen
 
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...Nóra Szepes
 
DBMS_Lab_Manual_&_Solution
DBMS_Lab_Manual_&_SolutionDBMS_Lab_Manual_&_Solution
DBMS_Lab_Manual_&_SolutionSyed Zaid Irshad
 
Windows Internals Part 1_6th Edition.pdf
Windows Internals Part 1_6th Edition.pdfWindows Internals Part 1_6th Edition.pdf
Windows Internals Part 1_6th Edition.pdfLokeshSainathGudivad
 
bonino_thesis_final
bonino_thesis_finalbonino_thesis_final
bonino_thesis_finalDario Bonino
 
Hadoop as an extension of DW
Hadoop as an extension of DWHadoop as an extension of DW
Hadoop as an extension of DWSidi yazid
 
An Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsAn Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsKelly Lipiec
 

Similar to Benchmarks for Digital Preservation tools. Kresimir Duretec, Artur Kulmukhametov, Andreas Rauber and Christoph Becker (20)

Master thesis
Master thesisMaster thesis
Master thesis
 
Moss2007
Moss2007Moss2007
Moss2007
 
Master's Thesis
Master's ThesisMaster's Thesis
Master's Thesis
 
Work Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerWork Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel Belasker
 
Abstract contents
Abstract contentsAbstract contents
Abstract contents
 
bkremer-report-final
bkremer-report-finalbkremer-report-final
bkremer-report-final
 
Aregay_Msc_EEMCS
Aregay_Msc_EEMCSAregay_Msc_EEMCS
Aregay_Msc_EEMCS
 
Predictive Modeling and Analytics select_chapters
Predictive Modeling and Analytics select_chaptersPredictive Modeling and Analytics select_chapters
Predictive Modeling and Analytics select_chapters
 
Vba uk
Vba ukVba uk
Vba uk
 
Systems se
Systems seSystems se
Systems se
 
Software architecture for developers
Software architecture for developersSoftware architecture for developers
Software architecture for developers
 
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
Thesis - Nora Szepes - Design and Implementation of an Educational Support Sy...
 
DBMS_Lab_Manual_&_Solution
DBMS_Lab_Manual_&_SolutionDBMS_Lab_Manual_&_Solution
DBMS_Lab_Manual_&_Solution
 
Windows Internals Part 1_6th Edition.pdf
Windows Internals Part 1_6th Edition.pdfWindows Internals Part 1_6th Edition.pdf
Windows Internals Part 1_6th Edition.pdf
 
bonino_thesis_final
bonino_thesis_finalbonino_thesis_final
bonino_thesis_final
 
Hadoop as an extension of DW
Hadoop as an extension of DWHadoop as an extension of DW
Hadoop as an extension of DW
 
Report
ReportReport
Report
 
Dynamics AX/ X++
Dynamics AX/ X++Dynamics AX/ X++
Dynamics AX/ X++
 
An Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing UnitsAn Optical Character Recognition Engine For Graphical Processing Units
An Optical Character Recognition Engine For Graphical Processing Units
 
Mobile d
Mobile dMobile d
Mobile d
 

More from 12th International Conference on Digital Preservation (iPRES 2015)

More from 12th International Conference on Digital Preservation (iPRES 2015) (9)

Project Chrysalis – Transforming the Digital Business of the National Archive...
Project Chrysalis – Transforming the Digital Business of the National Archive...Project Chrysalis – Transforming the Digital Business of the National Archive...
Project Chrysalis – Transforming the Digital Business of the National Archive...
 
DataNet Federation Consortium Preservation Policy Toolkit. Reagan Moore, Arco...
DataNet Federation Consortium Preservation Policy Toolkit. Reagan Moore, Arco...DataNet Federation Consortium Preservation Policy Toolkit. Reagan Moore, Arco...
DataNet Federation Consortium Preservation Policy Toolkit. Reagan Moore, Arco...
 
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
Characterization of CDROMs for Emulation-based Access. Klaus Rechert, Thomas ...
 
Beyond the Binary: Pre-Ingest Preservation of Metadata. Jessica Moran and Jay...
Beyond the Binary: Pre-Ingest Preservation of Metadata. Jessica Moran and Jay...Beyond the Binary: Pre-Ingest Preservation of Metadata. Jessica Moran and Jay...
Beyond the Binary: Pre-Ingest Preservation of Metadata. Jessica Moran and Jay...
 
Experiment, Document & Decide: A Collaborative Approach to Preservation Plann...
Experiment, Document & Decide: A Collaborative Approach to Preservation Plann...Experiment, Document & Decide: A Collaborative Approach to Preservation Plann...
Experiment, Document & Decide: A Collaborative Approach to Preservation Plann...
 
Functional Access to Forensic Disk Images in a Web Service. Kam Woods, Christ...
Functional Access to Forensic Disk Images in a Web Service. Kam Woods, Christ...Functional Access to Forensic Disk Images in a Web Service. Kam Woods, Christ...
Functional Access to Forensic Disk Images in a Web Service. Kam Woods, Christ...
 
Towards a Common Approach for Access to Digital Archival Records in Europe. A...
Towards a Common Approach for Access to Digital Archival Records in Europe. A...Towards a Common Approach for Access to Digital Archival Records in Europe. A...
Towards a Common Approach for Access to Digital Archival Records in Europe. A...
 
Developing a Framework for File Format Migrations. Joey Heinen and Andrea Goe...
Developing a Framework for File Format Migrations. Joey Heinen and Andrea Goe...Developing a Framework for File Format Migrations. Joey Heinen and Andrea Goe...
Developing a Framework for File Format Migrations. Joey Heinen and Andrea Goe...
 
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
A Foundational Framework for Digital Curation: The Sept Domain Model. Stephen...
 

Recently uploaded

Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardsticksaastr
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxNikitaBankoti2
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...henrik385807
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyPooja Nehwal
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaKayode Fayemi
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMoumonDas2
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxmohammadalnahdi22
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxraffaeleoman
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024eCommerce Institute
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )Pooja Nehwal
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Chameera Dedduwage
 
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Pooja Nehwal
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubssamaasim06
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Vipesco
 

Recently uploaded (20)

Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
CTAC 2024 Valencia - Sven Zoelle - Most Crucial Invest to Digitalisation_slid...
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptx
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 

Benchmarks for Digital Preservation tools. Kresimir Duretec, Artur Kulmukhametov, Andreas Rauber and Christoph Becker

  • 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmarks for Digital Preservation Tools
 
 Kresimir Duretec, Artur Kulmukhametov, Andreas Rauber and Christoph Becker
 Vienna University of Technology and University of Toronto
 
 
 
 IPRES 2015 , Chapel Hill, North Carolina, USA November 3, 2015 1
  • 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Overview 1. Need for evidence in digital preservation •Software tools in digital preservation and evidence around their quality 2. Software benchmarks in digital preservation •What, Why, How •Software quality 3. Benchmark proposals •Photo migration •Property extraction from documents 4. Are we ready? Next steps https://goo.gl/ps6x8I In case you get bored
  • 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Overview 1. Need for evidence in digital preservation •Software tools in digital preservation and evidence around their quality 2. Software benchmarks in digital preservation •What, Why, How •Software quality 3. Benchmark proposals •Photo migration •Property extraction from documents 4. Are we ready? Next steps https://goo.gl/ps6x8I In case you get bored
  • 4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evidence in digital preservation Important topic !!!
  • 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evidence in digital preservation Important topic !!!
  • 6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Software tools in digital preservation •many tools (>400 tools in COPTR registry http://coptr.digipres.org) •categories •identification, validation & characterisation •migration •emulation •…. •high quality required Droid
  • 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . However !!! Blog post at OPF by paul on 19th Oct 2012. http://openpreservation.org/blog/2012/10/19/practitioners-have-spoken-we-need-better-characterisation/
  • 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . However !!! Blog post at OPF by johan on 31st Jan 2014. http://openpreservation.org/blog/2014/01/31/why-cant-we-have-digital-preservation-tools-just-work/
  • 9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . However !!! Panel at IPRES 2014
  • 10. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Evidence of software quality in digital preservation •the need for evaluation is not new ! •testbeds and numerous individual evaluation initiatives •often take-up is missing 2002 2006 2010 Dutch Digital Preservation Testbed DELOS Digital PreservationTestbed Planets Testbed ?
  • 11. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Overview 1. Need for evidence in digital preservation •Software tools in digital preservation and evidence around their quality 2. Software benchmarks in digital preservation •What, Why, How •Software quality 3. Benchmark proposals •Photo migration •Property extraction from documents 4. Are we ready? Next steps https://goo.gl/ps6x8I In case you are already bored
  • 12. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Software benchmarks •Proposal : Start creating software benchmarks to extend the evidence base around software quality of digital preservation tools •major characteristics •collaborative •open •public
  • 13. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Software benchmarks - definition Source Definition key words SPEC glossary “a "benchmark" is a test, or set of tests, designed to compare the performance of one computer system against the performance of others” test, compare, performance IEEE Glossary “1. a standard against which measurements or comparisons can be made. 2. a procedure, problem, or test that can be used to compare systems or components to each other or to a standard. 3. a recovery file” measurements, comparison, test W. Tichy “a benchmark is a task domain sample executed by a computer or by a human and computer. During execution, the human or computer records well-defined performance measurements. A benchmark provides a level playing field for competing ideas, and (assuming the benchmark is sufficiently representative) allows repeatable and objective comparisons” task domain sample, performance measurements, repeatable objective comparison Sim et al. “a test or set of tests used to compare the performance of alternative tools or techniques” test, compare
  • 14. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Software benchmarks - definition Source Definition key words SPEC glossary “a "benchmark" is a test, or set of tests, designed to compare the performance of one computer system against the performance of others” test, compare, performance IEEE Glossary “1. a standard against which measurements or comparisons can be made. 2. a procedure, problem, or test that can be used to compare systems or components to each other or to a standard. 3. a recovery file” measurements, comparison, test W. Tichy “a benchmark is a task domain sample executed by a computer or by a human and computer. During execution, the human or computer records well-defined performance measurements. A benchmark provides a level playing field for competing ideas, and (assuming the benchmark is sufficiently representative) allows repeatable and objective comparisons” task domain sample, performance measurements, repeatable objective comparison Sim et al. “a test or set of tests used to compare the performance of alternative tools or techniques” test, compare DP Software Benchmark is a set of tests which enable rigorous comparison of software solutions for a specific task in digital preservation
  • 15. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmarks in other fields •Software Engineering •transactions •virtualisation •databases •… •Information Retrieval •text •video •music •…
  • 16. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmarks in other fields community specifications tools & methods that need benchmarks yearly meeting resultsiterative process
  • 17. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmark components Software engineering Motivating comparison Task sample Performance measures
  • 18. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmark components Information retrieval Tasks Datasets Answersets Measures Representation formats
  • 19. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmark components Software engineering Motivating comparison Task sample Performance measures Information retrieval Tasks Datasets Answersets Measures Representation formats Digital preservation Motivating comparison Function Dataset Ground truth Performance measures
  • 20. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Motivating comparison Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •What is benchmark supposed to compare ? •Which quality aspects are addressed ? •What would be the benefits ?
  • 21. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Motivating comparison Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •software quality •many aspects •we need a model •ISO SQUARE
  • 22. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Function Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •specific task tool is expected to perform
  • 23. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Dataset Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •set of digital objects •images •audio •video •relevant •accurate •complete
  • 24. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ground truth Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •correct answers •optional element •machine readable
  • 25. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Performance measures Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •fitness of the benchmarked tool •type •quantitative •qualitative •calculated •by human •by machine
  • 26. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Overview 1. Need for evidence in digital preservation •Software tools in digital preservation and evidence around their quality 2. Software benchmarks in digital preservation •What, Why, How •Software quality 3. Benchmark proposals •Raw Photo migration •Property extraction from documents 4. Are we ready? Next steps https://goo.gl/ps6x8I In case you are bored
  • 27. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . RAW Photo migration benchmark Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •various RAW file formats •migrate to DNG •many tools should be tested •migrate from RAW to DNG •collection of photographs in RAW format •SSIM
  • 28. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Property values extraction benchmark Digital preservation Motivating comparison Function Dataset Ground truth Performance measures •different document properties •coverage of needed properties •correctness of extracted values •extract property values from a document •collection of documents in MS Word 2007 file format •correct values •coverage •accuracy •exact match ratio
  • 29. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Overview 1. Need for evidence in digital preservation •Software tools in digital preservation and evidence around their quality 2. Software benchmarks in digital preservation •What, Why, How •Software quality 3. Benchmark proposals •Raw Photo migration •Property extraction from documents 4. Are we ready? Next steps https://goo.gl/ps6x8I In case you are bored
  • 30. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Benchmarking - community driven process •benchmarks emerge from the community consensus •community participation needs to be enabled •based on established results where possible •community needs to incur the costs •continuous evolution of benchmarks is necessary •Are we ready ?
  • 31. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Is the community ready ? Indicators •projects •workshops •own conference •several other journals and conferences • concerns •vocabularies •metrics
  • 32. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Conclusion
  • 33. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Next step create benchmark specifications Benchmarking forum workshop Friday 6th November 9:00 - 17:00
  • 34. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Questions ? https://goo.gl/ps6x8I In case you were not bored