Federico Nurra - Toward a long term data preservation strategy and interoperability at the French National Institute for Preventive Archaeological Research
Presentation by Federico Nurra of INRAP, given at the ARIADNE Winter School, about work to develop a long-term data preservation strategy and a framework for interoperability. DOLIA, INRAP's catalogue of documents (reports), provided the starting point. The data structure has been mapped to the ARIADNE Catalogue Data Model (ACDM), and the PACTOLS vocabulary has been mapped to ARIADNE concepts from the Getty Art & Architecture Thesaurus (AAT).
1. Toward a long term data preservation strategy and interoperability at the French National Institute for Preventive Archaeological Research (Inrap)
Federico Nurra
Service activités internationales,
Direction Scientifique et Technique,
INRAP
ARIADNE Winter School, Prato, 12-15 December 2016 - Legacy datasets and their inclusion in the ARIADNE Registry
2. The French Institute for Preventive Archaeology
Organisation
8 Regional Headquarters
About 50 archaeological centres
About:
2000 archaeologists
2300 operations a year
85% Trial trenching
15% Excavations
~ 45.000 Archaeological Operations
3. Starting point
Dolia:
• Inrap’s catalogue of documentary collections / digital library
• ~ 28.500 reports
4. Mapping Dolia (UNIMARC) - ACDM
Field | DOLIA (UNIMARC) | ACDM
Title | UNIMARC 200 [$a + $e] | dcterms:title
Description | UNIMARC 330 [$a] | dcterms:description
Date | UNIMARC 210 [$d] | dcterms:issued
Keywords | UNIMARC 610 [$a] | dcat:keyword
Language | UNIMARC 101 [$a] | dcterms:language
Chronology | UNIMARC 634 [$5] | acdm:temporal
… | … | …
Scientific lead | UNIMARC 700 [$a] | acdm:scientificResponsible
Subject | UNIMARC 606 [$5] | acdm:nativeSubject
… | … | …
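The field mapping above can be read as a lookup table. The following is a minimal sketch, not the converter actually used at Inrap: the record structure (a dict of UNIMARC tags, each holding a list of subfield dicts), the `map_record` helper and the " : " separator used to join subfields are all assumptions made for illustration. The `dcterms:` prefix is used for the title property.

```python
# Mapping taken from the table: (UNIMARC tag, subfield codes) -> ACDM property.
UNIMARC_TO_ACDM = {
    ("200", ("a", "e")): "dcterms:title",
    ("330", ("a",)): "dcterms:description",
    ("210", ("d",)): "dcterms:issued",
    ("610", ("a",)): "dcat:keyword",
    ("101", ("a",)): "dcterms:language",
    ("634", ("5",)): "acdm:temporal",
    ("700", ("a",)): "acdm:scientificResponsible",
    ("606", ("5",)): "acdm:nativeSubject",
}


def map_record(record):
    """Convert a simplified UNIMARC record into a dict of ACDM properties.

    `record` is a hypothetical structure: {tag: [{subfield_code: value}, ...]},
    with one inner dict per occurrence of the field.
    """
    mapped = {}
    for (tag, codes), prop in UNIMARC_TO_ACDM.items():
        for occurrence in record.get(tag, []):
            parts = [occurrence[code] for code in codes if code in occurrence]
            if parts:
                # Subfields listed together (e.g. 200 $a + $e) become one value.
                mapped.setdefault(prop, []).append(" : ".join(parts))
    return mapped


# Invented sample record, for illustration only.
sample = {
    "200": [{"a": "Example site", "e": "evaluation report"}],
    "101": [{"a": "fre"}],
    "610": [{"a": "pottery"}, {"a": "ditch"}],
}
result = map_record(sample)
```

Repeated fields (here the two 610 keywords) are kept as separate values, and fields missing from the record simply produce no property.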
5. Thesaurus Mapping (Pactols-AAT)
Original Concept (Pactols) → Exact Match / Close Match / Broad Match / Narrow Match → ARIADNE Concept (AAT)
Results:
• 1814 Concepts (out of 5149)
• 1282 Exact (70,7%)
• 132 Close (7,5%)
• 389 Broad (21,4%)
• 7 Narrow (0,4%)
http://pactols.frantiq.fr/opentheso/index.xhtml http://vocab.getty.edu/
[SKOS]
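The four match types correspond to the SKOS mapping properties `skos:exactMatch`, `skos:closeMatch`, `skos:broadMatch` and `skos:narrowMatch`. As a rough sketch of how a single mapping could be written out as an N-Triples line: the `mapping_triple` helper and both concept URIs below are placeholders, not real Pactols or AAT identifiers.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# The four SKOS mapping properties named on the slide.
MATCH_PROPERTIES = {
    "exact": SKOS + "exactMatch",
    "close": SKOS + "closeMatch",
    "broad": SKOS + "broadMatch",
    "narrow": SKOS + "narrowMatch",
}


def mapping_triple(source_uri, match_type, target_uri):
    """Return one N-Triples line linking a source concept to a target concept."""
    predicate = MATCH_PROPERTIES[match_type]  # raises KeyError for an unknown type
    return f"<{source_uri}> <{predicate}> <{target_uri}> ."


# Hypothetical identifiers, for illustration only.
line = mapping_triple(
    "http://example.org/pactols/concept/1",
    "broad",
    "http://example.org/aat/2",
)
```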
7. The French Institute for Preventive Archaeology
153.851 (total)
• 110.824 Exact (72,03%)
• 9.212 Close (5,99%)
• 33.606 Broad (21,84%)
• 209 Narrow (0,14%)
• 5.898 No subject
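The percentages on this slide can be reproduced from the four match counts. The sketch below only checks the arithmetic. The slide does not say what unit the total counts, and the 5.898 "no subject" items are left out because they are not part of the 153.851 total.

```python
# Counts per match type, copied from the slide.
counts = {"exact": 110_824, "close": 9_212, "broad": 33_606, "narrow": 209}

total = sum(counts.values())

# Share of each match type, as a percentage rounded to two decimals.
shares = {kind: round(100 * n / total, 2) for kind, n in counts.items()}
```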
8. Geocoding
121, Rue d’Alésia, 75014 Paris
API Géoportail IGN
(Géocodage) + API BAN
Lng: 2.323169
Lat: 48.829001
Results:
• 28.357
• 3.636 Exact points (12,8%)
• 8.327 ~ Street (29,4%)
• 16.267 ~ Town (57,4%)
• 127 No geolocation (0,4%)
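The precision classes above (exact point, street, town, no geolocation) could be derived from the type of the best geocoder hit. The sketch below makes no network call. It assumes a GeoJSON response in the style of the BAN search API, where each feature carries a `properties.type` such as `housenumber`, `street`, `locality` or `municipality`. The mapping of those types to the slide's classes, the `classify` helper and the sample response are all assumptions.

```python
from collections import Counter

# Assumed mapping from a result `type` to the precision labels used on the slide.
PRECISION_BY_TYPE = {
    "housenumber": "exact point",
    "street": "street",
    "locality": "town",
    "municipality": "town",
}


def classify(feature_collection):
    """Return (lng, lat, precision) for the first hit of a GeoJSON response, or None."""
    features = feature_collection.get("features", [])
    if not features:
        return None  # would be counted as "no geolocation"
    best = features[0]  # assumes hits are ordered by descending score
    lng, lat = best["geometry"]["coordinates"]  # GeoJSON order is longitude, latitude
    precision = PRECISION_BY_TYPE.get(best["properties"].get("type"), "unknown")
    return lng, lat, precision


# Hand-written response in the assumed shape, using the coordinates from the slide.
sample = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [2.323169, 48.829001]},
        "properties": {"type": "housenumber", "score": 0.9},
    }],
}

hits = [classify(sample), classify({"features": []})]
tally = Counter(hit[2] if hit else "no geolocation" for hit in hits)
```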
9. Geocoding
Heatmap Clustermap
10. Chronological Mapping
Period (Pactols) | Earliest start | Latest stop
… | … | …
Néolithique | -6000 | -2201
Néolithique ancien | -6000 | -5301
Néolithique moyen | -5300 | -4501
Néolithique récent | -4500 | -2201
… | … | …
Protohistoire | -2200 | -51
Âge du Bronze | -2200 | -801
Bronze ancien | -2200 | -1601
… | … | …
Results:
• 124 Chronological concepts
• LOD on http://perio.do/
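Once each period label carries an earliest start and a latest stop, records can be filtered by year or checked for overlap. The sketch below uses only the rows shown on the slide. It assumes the values are calendar years with negative numbers meaning BC, and the two helper functions are illustrative names.

```python
# (earliest start, latest stop) per Pactols period label, from the slide.
# Assumption: calendar years, negative values are BC.
PERIODS = {
    "Néolithique": (-6000, -2201),
    "Néolithique ancien": (-6000, -5301),
    "Néolithique moyen": (-5300, -4501),
    "Néolithique récent": (-4500, -2201),
    "Protohistoire": (-2200, -51),
    "Âge du Bronze": (-2200, -801),
    "Bronze ancien": (-2200, -1601),
}


def periods_for_year(year):
    """Return every period label whose range contains the given year."""
    return [label for label, (start, stop) in PERIODS.items() if start <= year <= stop]


def overlaps(a, b):
    """True if the two period labels share at least one year."""
    (a_start, a_stop), (b_start, b_stop) = PERIODS[a], PERIODS[b]
    return a_start <= b_stop and b_start <= a_stop
```

A year matches both a broad period and its sub-period, so `periods_for_year(-5000)` returns the general Neolithic label as well as the middle Neolithic one.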
11. Chronological Mapping
MED ~2.200 reports
NP ~3.200 reports
12. The ARIADNE Portal
URI Ark - Dolia
Subject AAT (LOD)
Geonames (LOD)
PeriodO (LOD)
13. All that glitters is not gold…
National Preservation Infrastructures
14. A case study
15. A case study
16. A case study
17. A case study
18. A case study
• 11,6 km²
• 7 «Prescriptions» of the regional services of the state (Emprises)
• 80 Evaluation trenches (Ouvertures)
• 532 Archaeological remains (Unités d’Observation)
• Archaeological investigations from 2001 to 2015
• 11,6 Gb of legacy data (it’s not a joke!)
19. Questions
• How to treat the legacy data?
• How to manage the 60 different recording systems? What standards to use?
• How to integrate artifacts?
• How to open the process to the public? (define the users)
20. Proposed Solutions
ARIADNE Summer School, Athens, 12-17 June 2016 - Digital curation of archaeological knowledge
• Collect all the legacy data
• Find a formal management system
• Go for basic preservation of digital content and registration
• Reach the researchers’ community
• Follow a multi-structured approach for the next 5-6 years
• Organise awareness workshops
• Use the example of legacy data or the ARIADNE portal to convince people of the necessity
• Develop a paradata journal / blog
21. Pain Point(s)
22. Solution(s)
Unique ID (e.g. Code INSEE + N. Oper.)
1_DOCUMENTATION
ADMINISTRATION
FILE_1.docx
FILE_2.xlsx
FILE_N.xxx
2_GIS_DATA
3_DATA
N_ETC…
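The folder convention above could be created automatically for each operation. A minimal sketch, under stated assumptions: the underscore separator in the ID, the placement of ADMINISTRATION inside 1_DOCUMENTATION, and the two helper functions are hypothetical. The slide only gives the folder names and the idea of combining the INSEE code with the operation number.

```python
from pathlib import Path

# Folders named on the slide. Assumption: ADMINISTRATION sits inside 1_DOCUMENTATION.
LAYOUT = ["1_DOCUMENTATION/ADMINISTRATION", "2_GIS_DATA", "3_DATA"]


def operation_id(insee_code, operation_number):
    """Build a unique ID from the INSEE code and the operation number.

    The underscore separator is a hypothetical choice.
    """
    return f"{insee_code}_{operation_number}"


def create_operation_folder(root, insee_code, operation_number):
    """Create the folder skeleton for one operation and return its path."""
    base = Path(root) / operation_id(insee_code, operation_number)
    for relative in LAYOUT:
        # parents=True also creates the operation folder and 1_DOCUMENTATION.
        (base / relative).mkdir(parents=True, exist_ok=True)
    return base
```

Calling `create_operation_folder("archive", "75114", "OP0001")` would create `archive/75114_OP0001/` with the three numbered folders. Both identifiers here are invented examples.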
24. Needed Activities
• Short Term Strategy (a couple of months)
    • Case Study (7 archaeological operations)
    • ARIADNE Portal
• Medium Term Strategy (one year)
    • Specific training for archaeologists
    • Definition of «Best Practices»
    • Basic curation approach
• Long Term Strategy (5-6 years)
    • Collect all data (almost 40.000 archaeological interventions)
    • Automatic ingest of new data (almost 2300 archaeological interventions / year)
    • Store and publish all data in a specific repository
    • Curation strategy for long term preservation