2. You cannot teach a man anything;
you can only help him find it within himself
Galileo Galilei
3. Simplified definition of digitisation
Digitisation is the managed conversion
of analogue material to a digital format
for ongoing access by electronic
devices during the intended life cycle
of the digital object
5. The library needs to use technology effectively in
reaching out to users. In the academy, this
means bringing innovation to our thinking
http://www.llrx.com/node/2177/print
Stuart Basefsky, 16 June 2009
6. “Following benchmarks and best practices that are not a
good fit for your [university] or its culture can be
counterproductive. The most effective way of using
benchmarks and best practices is as a creative
mechanism for raising questions about your own
[situation]. Following what others do is rarely a form of
good leadership.”
Leadership & The Role of Information:
Making The Creatively Informed Questioner
By Stuart Basefsky, Published on October 29, 2008
http://www.llrx.com/features/leadershipandroleofinformation.htm
7. Identify a project
• Know your collections
– what is valuable
– what others need to “see”
– core business of institution
– what is used often
– benefit of such a project (collection as well as
stakeholders)
8. Project planning
As part of digitisation project planning, you’ll
have to decide on the scanning and format
specifications such as the:
• bit depth (bitonal, greyscale or 24-bit colour)
• scanning resolution (400 dpi, etc.)
• image manipulation options (deskewing,
etc.)
• file format (TIFF, etc.)
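These choices drive storage needs, so it can help to estimate file sizes before scanning starts. The sketch below is only an illustration: the `ScanSpec` class and its field names are hypothetical, and the estimate covers uncompressed pixel data only (no file headers or compression).

```python
from dataclasses import dataclass

# Bits per pixel for the three bit depths named on the slide.
BITS_PER_PIXEL = {"bitonal": 1, "greyscale": 8, "colour": 24}


@dataclass(frozen=True)
class ScanSpec:
    """Hypothetical container for the scanning decisions listed above."""
    bit_depth: str        # "bitonal", "greyscale" or "colour"
    resolution_dpi: int   # e.g. 400
    file_format: str      # e.g. "TIFF"

    def uncompressed_bytes(self, width_in: float, height_in: float) -> int:
        """Estimate uncompressed pixel data for a page measured in inches."""
        width_px = round(width_in * self.resolution_dpi)
        height_px = round(height_in * self.resolution_dpi)
        total_bits = width_px * height_px * BITS_PER_PIXEL[self.bit_depth]
        return (total_bits + 7) // 8  # round up to whole bytes


colour_spec = ScanSpec(bit_depth="colour", resolution_dpi=400, file_format="TIFF")
a4_bytes = colour_spec.uncompressed_bytes(8.27, 11.69)  # A4 is roughly 8.27 x 11.69 inches
print(f"A4, 24-bit colour, 400 dpi: about {a4_bytes / 1_000_000:.1f} MB uncompressed")
```

Multiplying the per-page estimate by the number of pages gives a first rough figure for the storage budget.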
9. Cost
• Hard to provide a general price range, because collections and
digitisation requirements vary
• Digitisation projects, services and costs can be
as unique as the collections selected for
digitisation
• Projects have fundamental similarities (dpi selection,
derivative file creation, source material format, etc.),
but other characteristics can make apparently similar
projects completely different
10. Policy making
Institutions should be able to define and defend their
choices related to digitisation in terms of their institutional
mission of teaching and research, and to avoid the
distraction of commercialising their products
11. Think – don’t tumble
• Will digital assets increase access to information that
is hard to obtain otherwise?
• Will digital assets increase the information value of
the physical material?
12. Questions
• Does digitisation fit the organisation’s mission?
• Is there a known potential audience for the materials
that are planned to be digitised?
• Will digitisation increase access, functionality or
intellectual control?
13. Questions
• Will digitising these materials fill a need that is
currently unmet?
• Are the materials in the public domain or can proper
rights be secured?
• Is funding in place for the digitisation program?
15. Workflow
• Identify a project
• Selection criteria
• Copyright
• Basic preservation of physical material
• Scanning
• Manipulation
• Web ready
• Submit or hand over
16. Selection criteria
• know the history and rationale behind selection of
sources
• start with collection items that are often used
• embrittled material
• published within a certain period
• materials have to be Africana
• language limitations
• forming part of a certain collection
• make sure no duplicates are included
17. Copyright
• stay clear of copyright
• try to avoid material still in copyright
• where necessary, start with copyright clearance
first; it may take a long time to sort out
• note every step along the way – keep the evidence
18. Physical preservation
• Basic cleaning of material
– dust
– tears / broken corners
– mould
– remove sellotape / glue / Pritt
– remove staples, gem clips, anything that can
cause rust marks
– store in acid free containers if possible
19. [Workflow diagram, with QA checks between stages]
• Archival server: scan directly to archival server
• Manipulation: copy from AS; deskew / cleaning / derivation / filter; quality control; save web ready
• Send to submitters via email, external hard drive or DVD/CD/flash drive
• UPSpace IR: baseline submission; Reviewer; Metadata Editor
• Unique URI created for object
20. [Workflow diagram: steps and responsible parties]
• Selection criteria of material: Lecturer / Vet library
• Preparation of material: Lecturer / Vet library personnel
• Baseline metadata: Service Unit Staff
• Copyright clearance: Jacob
• Access rights: Lecturer
• Scan material: Digitization office / EI
• Conversion of image + OCR*: Digitization office
• Webready process and storing of master image: Digitization office + Vet library
• Cataloguing on UPSpace (add LCSH subjects): Amelia / Cataloguer
• Link images: Digitization office / Amelia
• UPSpace Administrator
*OCR of books: only Preface/Contents/Index
Amelia Breytenbach (Vet), 13 Apr 2005
21. Scanning
• Start with the easy part
– photo collection
– black and white documents
• Phase it
• Reward yourself when finished
23. Imaging requirements
• Printed text
– Resolution: 400-600 dpi
– Bit depth: bitonal
– Enhancements allowed: sharpening, descreening, cropping, deskewing and despeckling
24. Imaging requirements
• Rare/damaged printed text
– Resolution: 400-600 dpi
– Bit depth: 8-bit grey or 24-bit colour
– Enhancements allowed: contrast stretching, minimal adjustments for tone and colour
25. Imaging requirements
• Book illustrations
– Resolution: 400 dpi - 600 dpi with enhancement
– Bit depth: 8-bit grey or 24-bit colour
– Enhancements allowed: contrast stretching, minimal adjustments for tone and colour
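The imaging requirements on the three slides above can be written down as a small lookup table and used to check a planned scan. This is a minimal sketch: the dictionary keys, field names and the `check_scan` helper are hypothetical, and "8-gray" and "24 colour" are read here as 8-bit grey and 24-bit colour.

```python
# Imaging requirements transcribed from the slides above.
# Keys and field names are illustrative, not a standard schema.
IMAGING_REQUIREMENTS = {
    "printed_text": {
        "resolution_dpi": (400, 600),
        "bit_depths": ["bitonal"],
        "enhancements": ["sharpening", "descreening", "cropping",
                         "deskewing", "despeckling"],
    },
    "rare_or_damaged_printed_text": {
        "resolution_dpi": (400, 600),
        "bit_depths": ["8-bit grey", "24-bit colour"],
        "enhancements": ["contrast stretching",
                         "minimal tone and colour adjustment"],
    },
    "book_illustrations": {
        "resolution_dpi": (400, 600),  # the slide notes 600 dpi "with enhancement"
        "bit_depths": ["8-bit grey", "24-bit colour"],
        "enhancements": ["contrast stretching",
                         "minimal tone and colour adjustment"],
    },
}


def check_scan(material: str, dpi: int, bit_depth: str,
               enhancements: list) -> list:
    """Return a list of problems; an empty list means the scan fits the profile."""
    profile = IMAGING_REQUIREMENTS[material]
    problems = []
    low, high = profile["resolution_dpi"]
    if not low <= dpi <= high:
        problems.append(f"resolution {dpi} dpi is outside {low}-{high} dpi")
    if bit_depth not in profile["bit_depths"]:
        problems.append(f"bit depth '{bit_depth}' is not listed for {material}")
    for step in enhancements:
        if step not in profile["enhancements"]:
            problems.append(f"enhancement '{step}' is not listed for {material}")
    return problems


print(check_scan("printed_text", 300, "bitonal", ["deskewing"]))
```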
26. Image manipulation
• Less is more
– don’t fiddle; make only the necessary amendments
– get it ready for web display
– remember the technical metadata
– note everything
27. Redaction
• Identify material for redaction
– Once redactions have been identified and
agreed upon, decisions need to be recorded
– Do not remove a whole sentence or
paragraph if only one or two words are non-
disclosable
– be consistent throughout the collection
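Where redaction is applied to transcribed or OCR text (an assumption; the slide does not say which form the material takes), the two rules above can be sketched as follows: mask only the agreed terms, and apply the same list to every document. The `redact_terms` function and the `[REDACTED]` mask are hypothetical.

```python
import re


def redact_terms(text: str, terms: list, mask: str = "[REDACTED]") -> str:
    """Mask only the agreed terms, leaving the rest of the sentence intact."""
    if not terms:
        return text
    # Longest terms first, so a longer phrase wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(mask, text)


# The same agreed list is reused for every item, which keeps the collection consistent.
agreed_terms = ["Smith"]
print(redact_terms("Mr Smith met Dr Jones. Smith left early.", agreed_terms))
```

Keeping the agreed terms in one recorded list also documents the redaction decisions.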
28. Storage
• Archival image
– each image needs its own unique identifier
– keep it apart: do not work on the archival image, make
a COPY
– save the copy apart from archival image
– note every step in database
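A minimal sketch of this routine, assuming files on a local disk: the archival image is never opened for editing, a copy is written to a separate folder under a unique identifier, and the step is recorded. The function name, the UUID-based identifier and the in-memory `step_log` list (standing in for the database) are all hypothetical.

```python
import shutil
import uuid
from datetime import datetime, timezone
from pathlib import Path


def make_working_copy(archival_image: Path, working_dir: Path, step_log: list) -> Path:
    """Copy the archival image into a separate folder and record the step."""
    working_dir.mkdir(parents=True, exist_ok=True)
    identifier = uuid.uuid4().hex  # hypothetical identifier scheme
    copy_path = working_dir / f"{identifier}{archival_image.suffix}"
    shutil.copy2(archival_image, copy_path)  # the archival image itself is not modified
    step_log.append({
        "identifier": identifier,
        "action": "working copy created",
        "source": str(archival_image),
        "copy": str(copy_path),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return copy_path
```

All later manipulation is then done on the returned copy, and each further step can be appended to the same log.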
29. Storage
• More is better
– archival image
– at least one TIFF original on DVD/ hard disk /
external hard disk
– at least one derivative copy on DVD/ hard disk/
external hard disk
– store apart, if possible keep a copy in another
building
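Extra copies only help if they stay identical to the original. One common way to check this, not mentioned on the slide, is to compare checksums of the TIFF copies. The sketch below uses SHA-256 from the Python standard library; the function names are hypothetical, and derivative copies are left out because they differ from the TIFF by design.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(reference: Path, copies: list) -> list:
    """Return the copies that are missing or differ from the reference file."""
    expected = sha256_of(reference)
    return [
        copy for copy in copies
        if not copy.is_file() or sha256_of(copy) != expected
    ]
```

Running such a check periodically on each storage location gives early warning of a failing disk or disc.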
30. Codex Sinaiticus is one of the world's outstanding manuscripts. Together with
Codex Vaticanus, it is one of the earliest extant Bibles, containing the oldest
complete New Testament. This treasured codex is indispensable for
understanding the earliest text of the Greek Bible, the transmission of its text, the
establishment of the Christian canon, and the history of the book. Over 400
leaves survive and are held across four institutions
http://www.codexsinaiticus.org/en/project/digitisation.aspx
31. Test image of a Codex Sinaiticus page on a white background
Test image of a Codex Sinaiticus page on a black background
Through testing, the decision was made to opt for a compromise colour. A light
brown background was chosen that was close enough to the colour of the
parchment to give a sense of its warmth, while reducing the show-through to a
point where it rarely makes reading the page difficult.
http://www.codexsinaiticus.org/en/project/digitisation.aspx
42. Optical Character Recognition
MR. GLADSTONE ON FAIR T: AD'.
AND RUNT JUC
Puctios-jTHE nkxt I.IIiKt.AI. LRADKk?
LORD
?AKIINOTON's NEW ATTITUDE AND
WHAT
MR. CHAMBERLAIN THINKS OF IT?
MR.
RI.AINK AND LOUIS KOSSUTH?
AX ANARCHIST CARDINAL
BISMARCK AND BROWNING
??ART AND LITERA?
RY NOT I 8.
fBT CABLR TO THIS TRIBUNE.|
48. Risk analysis for digital objects
• Hard drive failure
• URL error – link broken
• Storage medium failure
• Loss of information/data
• Human error and memory
• Hackers
www.fotosearch.com
49. Preservation
• Preservation strategies should enable subsequent users
to work with digital resources in the same way that they
would be able to continue to work with older, analogue
materials.
• Can we afford to scan at a low resolution, or make other
compromises in the digitisation life-cycle?
50. Digital preservation
• budget for a possible migration strategy
• consider digital formats carefully
• metadata standards (technical and preservation)
• the organisation must be committed to the program
• follow best practices and international standards
• IT must adapt to long-term needs of digital
preservation
• develop a technology infrastructure plan
51. PREMIS MODEL
• Intellectual entity (photo)
– converted to a digital object: TIFF image file
– transformed to JPEG for web display
– preserved for interoperability, access and readability
• Agent:
– the role of the person undertaking the event (name/organization)
– software name and version no.
– OS type
• Object (tells the user what it represents):
– file size
– date created
– file format
– creating application
• Rights:
– license agreement
– exact permissions granted over preservation of the object
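The entities on this slide can be sketched as a simple record kept alongside each image. This is a simplified illustration, not the actual PREMIS schema: the class names, field names and all sample values below are hypothetical, and only the properties named on the slide are modelled.

```python
from dataclasses import asdict, dataclass


@dataclass
class Agent:
    name: str          # person or organisation undertaking the event
    role: str
    software: str      # software name and version
    os_type: str


@dataclass
class DigitalObject:
    identifier: str
    file_format: str
    file_size_bytes: int
    date_created: str  # ISO 8601 date
    creating_application: str


@dataclass
class Rights:
    license_agreement: str
    permissions: list  # exact permissions granted over the object


@dataclass
class PreservationRecord:
    digital_object: DigitalObject
    agent: Agent
    rights: Rights


# All values below are placeholders for illustration only.
record = PreservationRecord(
    digital_object=DigitalObject(
        identifier="photo-0001",
        file_format="TIFF",
        file_size_bytes=46_404_624,
        date_created="2009-01-01",
        creating_application="example scanning software 1.0",
    ),
    agent=Agent(
        name="Digitisation office",
        role="scanner operator",
        software="example scanning software 1.0",
        os_type="example OS",
    ),
    rights=Rights(
        license_agreement="example licence",
        permissions=["migrate to new formats", "create web derivatives"],
    ),
)

record_dict = asdict(record)  # nested dictionaries, ready to store or export
print(record_dict["digital_object"]["file_format"])
```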