This document discusses transferring metadata from the 20th Century Press Archives to Wikidata. It begins by describing the press archives collection. It then explains why Wikidata is a suitable platform: it is sustainable, editable by everyone, and offers linked open data capabilities. The document outlines the process of linking the archive's metadata to existing Wikidata items, creating missing items, and adding metadata to the items. It provides an example of using the linked data to create a map of economists in the collection. Future plans include linking more archive folders to items and creating a page for each folder on the archive's website.
ZBW links historic press archives to Wikidata
1. ZBW is a member of the Leibniz Association
Wikidata as opportunity for special collections:
the 20th Century Press Archives use case
Joachim Neubert
ZBW – Leibniz Information Centre for Economics, Kiel/Hamburg
LIBER 2019, Linked Open Data Working Group
26.06.2019, Dublin (Ireland)
2. Agenda
1. What are we dealing with?
2. Why Wikidata?
3. Transferring metadata
a. Link to existing items
b. Create missing items
c. Add metadata to the items
4. Using the data
5. Future work
4. What are we dealing with?
Historic Press Archives, founded in 1909 (Hamburg) and 1914 (Kiel)
• Some material dating back to 1826
• Collections closed in 2005
Thematic dossiers covering
• Persons
• Companies
• Products
• General subjects and events
5. Current state
Former DFG funded project, resulting in
• Digitized roll films (material before 1949)
• Relational database about dossiers, often with GND ID
• Large file system (containing more than 2 million pages)
• Accessible via
• custom application “Pressemappe 20. Jahrhundert” and
• DFG-Viewer (METS/MODS files, per dossier)
All metadata available under a CC0 license
8. Wikidata basics
• Knowledge base for Wikimedia projects
• All kinds of entities: concepts, places, people, works …
• Editable and extensible by everyone
• Data available under CC0
• http://query.wikidata.org/ (SPARQL)
• JSON API & database dumps
• Sustainable foundation for long-term available data
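As a minimal sketch of how the SPARQL endpoint listed above can be addressed from a script, the following builds a GET request URL that asks for JSON results. The query itself (counting items that carry the PM20 folder ID property, P4293) and the helper name `build_request_url` are illustrative assumptions, not part of the original slides; the sketch only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode

# Public Wikidata query service endpoint (the slide links its web front end).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: count items that have a PM20 folder ID (P4293).
# The wdt: prefix is predefined on the Wikidata query service.
COUNT_QUERY = """
SELECT (COUNT(?item) AS ?count) WHERE {
  ?item wdt:P4293 ?id .
}
"""


def build_request_url(query: str, endpoint: str = WDQS_ENDPOINT) -> str:
    """Return a GET URL that requests the query results as JSON."""
    params = urlencode({"query": query.strip(), "format": "json"})
    return endpoint + "?" + params


if __name__ == "__main__":
    print(build_request_url(COUNT_QUERY))
```

The resulting URL can be fetched with any HTTP client; when doing so, it is good practice to send a descriptive User-Agent header.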
11. Linking mechanism: external identifiers
• Property value: unique IDs from external database
• + URL stub in the property definition (“formatter URL”)
• Almost 4,000 external identifier properties
• Examples:
• GND
• proteins
• African plants
• Swedish cultural heritage objects
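The linking mechanism can be sketched in a few lines: the external ID stored as the property value is substituted into the `$1` placeholder of the property's formatter URL. The function name `resolve_external_id` is hypothetical, and the GND formatter URL and sample ID below are assumptions used for illustration; the authoritative formatter URL is the one recorded on the property page in Wikidata.

```python
def resolve_external_id(formatter_url: str, external_id: str) -> str:
    """Expand a formatter URL by inserting the external ID at the $1 placeholder."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL must contain the $1 placeholder")
    return formatter_url.replace("$1", external_id)


# Assumed formatter URL for the GND ID property; check the property page.
GND_FORMATTER = "https://d-nb.info/gnd/$1"

if __name__ == "__main__":
    # Sample ID for illustration only.
    print(resolve_external_id(GND_FORMATTER, "118540238"))
```

Keeping the URL stub in the property definition rather than in each statement means that a changed target URL only has to be corrected in one place.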
12. Transferring collection metadata to Wikidata
13. Wikidata property P4293 (PM20 folder ID)
• Property proposal and discussion within the community
Additional prerequisite:
• RDF representation of PM20 contents and a SPARQL endpoint,
allowing federated queries with the Wikidata endpoint
14. Link to existing items
• Automatically inserted links derived from GND IDs
• Tool-supported manual linking
• Wikidata's Mix-n-match (great for persons, crowd-sourced)
• custom tools (like this)
• others (OpenRefine, …)
~ 95% of PM20 person folders linked by mid-June 2019!
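The automatic step, deriving links from GND IDs, amounts to a join on the shared identifier. The sketch below is one possible way to do it, not the author's actual tooling: it assumes two already exported lists (PM20 folders with their GND IDs, and Wikidata items with their GND ID, property P227) and sets aside ambiguous matches for the manual tools. All names and sample IDs are made up for illustration.

```python
def derive_links(pm20_folders, wikidata_items):
    """Join PM20 folders to Wikidata items via a shared GND ID.

    pm20_folders:   iterable of (folder_id, gnd_id or None)
    wikidata_items: iterable of (qid, gnd_id), e.g. from a query on P227
    Returns (links, ambiguous): links are (qid, folder_id) pairs that are
    safe to insert; ambiguous holds folders matching several items.
    """
    qids_by_gnd = {}
    for qid, gnd_id in wikidata_items:
        qids_by_gnd.setdefault(gnd_id, []).append(qid)

    links, ambiguous = [], []
    for folder_id, gnd_id in pm20_folders:
        matches = qids_by_gnd.get(gnd_id, []) if gnd_id else []
        if len(matches) == 1:
            links.append((matches[0], folder_id))
        elif len(matches) > 1:
            # Leave these to manual, tool-supported linking.
            ambiguous.append((folder_id, matches))
    return links, ambiguous


if __name__ == "__main__":
    folders = [("pe/000001", "gnd-a"), ("pe/000002", None), ("pe/000003", "gnd-b")]
    items = [("Q1001", "gnd-a"), ("Q1002", "gnd-b"), ("Q1003", "gnd-b")]
    print(derive_links(folders, items))
```

Folders without a GND ID, or whose GND ID is not yet present in Wikidata, simply drop out of the join and remain candidates for manual linking or item creation.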
16. Add missing items to Wikidata - automatically
Recommendations for item creation:
• Pay attention to Wikidata’s notability criteria
• Explain your plan and ask for feedback in the Wikidata project chat
• Apply for a bot account to make mass edits (example)
• Source every statement
Process:
• Transform query results to QuickStatements input file
• Copy & paste into QuickStatements
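The transformation step of this process could look roughly like the following, which turns query result rows into tab-separated QuickStatements (version 1 syntax) commands for new person items. This is a hedged sketch, not the script used in the project: the row keys, the function name, and the `Q_SOURCE_PLACEHOLDER` value are hypothetical, and the placeholder must be replaced by the real item ID of the source before use. Every statement carries a "stated in" (S248) reference, following the recommendation to source every statement.

```python
def quote(value: str) -> str:
    """Wrap a string value in double quotes, as QuickStatements expects."""
    if '"' in value:
        raise ValueError("embedded double quotes are not handled in this sketch")
    return '"' + value + '"'


def to_quickstatements(rows, source_qid):
    """Build QuickStatements V1 commands that create one person item per row.

    rows: iterable of dicts with the (assumed) keys "label" and "folder_id".
    source_qid: item ID used in the "stated in" (S248) reference.
    """
    lines = []
    for row in rows:
        source = ["S248", source_qid]
        lines.append("CREATE")
        # English label of the new item.
        lines.append("\t".join(["LAST", "Len", quote(row["label"])]))
        # instance of (P31): human (Q5).
        lines.append("\t".join(["LAST", "P31", "Q5"] + source))
        # PM20 folder ID (P4293).
        lines.append("\t".join(["LAST", "P4293", quote(row["folder_id"])] + source))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{"label": "Jane Example", "folder_id": "pe/000001"}]
    print(to_quickstatements(sample, "Q_SOURCE_PLACEHOLDER"))
```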
17. QuickStatements input from PM20
• using a federated query to exclude existing Wikidata items
• query output transformed by a script
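A federated query of this kind might be shaped as follows. The sketch assumes it is sent to the PM20 SPARQL endpoint, which then asks the Wikidata endpoint, via a SERVICE clause, whether an item with the same folder ID already exists. The PM20-side triple pattern (`dct:identifier`) is a guess made purely for illustration, since the actual PM20 RDF vocabulary is not shown in the slides; the function name is likewise hypothetical.

```python
from string import Template

# SPARQL template; $-placeholders are filled in by build_missing_items_query().
_QUERY_TEMPLATE = Template("""\
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?folder ?folderId WHERE {
  # ASSUMPTION: how folders expose their ID in the PM20 RDF data.
  ?folder dct:identifier ?folderId .
  # Keep only folders for which Wikidata has no item with this ID.
  FILTER NOT EXISTS {
    SERVICE <$wikidata_endpoint> {
      ?item wdt:$property_id ?folderId .
    }
  }
}
""")


def build_missing_items_query(
    property_id: str = "P4293",
    wikidata_endpoint: str = "https://query.wikidata.org/sparql",
) -> str:
    """Return a federated query listing folders not yet linked from Wikidata."""
    return _QUERY_TEMPLATE.substitute(
        property_id=property_id, wikidata_endpoint=wikidata_endpoint
    )


if __name__ == "__main__":
    print(build_missing_items_query())
```

The rows returned by such a query are the input for the transformation script that produces the QuickStatements file.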
19. Add metadata to Wikidata items
e.g., for all persons in Wikidata that have a PM20 ID and the PM20 “field of activity” “economics” or “business economics”, insert the corresponding occupation into the Wikidata person item (script, query)
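To illustrate this enrichment step, the sketch below maps a PM20 field of activity to an occupation item and emits one statement per person that does not yet have that occupation (P106). It is not the script referenced on the slide. The row keys and names are hypothetical, and the mapping is an assumption: Q188094 is believed to be the Wikidata item for "economist" and should be verified, while "business economics" is deliberately left unmapped here, so such rows are skipped.

```python
# ASSUMPTION: Q188094 is taken to be "economist"; verify before use.
# "business economics" would need its own occupation item, not filled in here.
FIELD_TO_OCCUPATION = {"economics": "Q188094"}


def occupation_statements(rows, field_to_occupation, source_qid):
    """Return tab-separated statements adding occupation (P106) to items.

    rows: iterable of dicts with the (assumed) keys "qid",
          "field_of_activity" and, optionally, "existing_occupations".
    source_qid: item ID used in the "stated in" (S248) reference.
    """
    lines = []
    for row in rows:
        occupation = field_to_occupation.get(row["field_of_activity"])
        if occupation is None:
            continue  # no mapping known for this field of activity
        if occupation in row.get("existing_occupations", ()):
            continue  # item already has this occupation
        lines.append("\t".join([row["qid"], "P106", occupation, "S248", source_qid]))
    return lines


if __name__ == "__main__":
    sample = [
        {"qid": "Q1001", "field_of_activity": "economics"},
        {"qid": "Q1002", "field_of_activity": "economics",
         "existing_occupations": ["Q188094"]},
        {"qid": "Q1003", "field_of_activity": "business economics"},
    ]
    for line in occupation_statements(sample, FIELD_TO_OCCUPATION, "Q_SOURCE_PLACEHOLDER"):
        print(line)
```

Checking for an existing occupation before emitting a statement keeps the batch from adding duplicate values to items that were already edited by hand.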
23. Future work
• Build community support for further extension of the PM20 metadata
• Create an item structure for the subject and ware archives, and link
the folders (~ 12,000)
• Link/create items for company folders (~ 8,000)
• Create a static HTML site with one page per folder (+ additional
navigation pages) on the PM20 web site which hosts the digitized
images (= permanent reference)
• Optionally, create additional Wikidata-based searching/browsing
facilities
• Retire the present ColdFusion application
24. WikiProject 20th Century Press Archives
https://www.wikidata.org/wiki/Wikidata:WikiProject_20th_Century_Press_Archives
25. Thanks for listening!
Joachim Neubert
ZBW – Leibniz Information Centre for Economics
j.neubert@zbw.eu
http://zbw.eu/labs
https://www.wikidata.org/wiki/User:Jneubert