dans.knaw.nl
DANS is an institute of KNAW and NWO
My data manager is a robot!
Mass ingests and migrations & network integrations
Valentijn Gilissen, MA: Data Manager / Preservation Officer
April 2019, CAA, Krakow
Use-cases
• The SWORD-ingest of Dutch archaeological datasets by the network of governmental
depots into the central DANS hub.
• Mass migrations and transformations of archived data to new standards.
• The promotion and integration of local data from the Portable Antiquities of the
Netherlands (PAN) in an international network, making use of thesauri, data mining
and Linked Open-Data techniques.
“How is humanity saved if it's not
allowed to... evolve?”
--Ultron
Avengers: Age of Ultron.
Directed by Joss Whedon. Marvel Studios, 2015
To support the ingest and validation of
increasing volumes of data, the role of
the data manager will need to adapt.
--Valentron
Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005
First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989
Mission: promote and provide permanent access to digital research resources
https://dans.knaw.nl
Data Archiving and Networked Services
DANS core data services
• NARCIS: Gateway to scholarly information in the Netherlands (https://www.narcis.nl)
• DataverseNL for short- and mid-term data storage (https://dataverse.nl)
• EASY: certified long-term Electronic Archiving System for self-deposit (https://easy.dans.knaw.nl)
DANS additional services
• Background Archive (https://data.mendeley.com/, https://datadryad.org)
• Research Data Journal for the Humanities and Social Sciences (http://www.brill.com/rdj)
• Training & Consultancy (http://datasupport.researchdata.nl/)
• Ingest via SWORD protocol (Simple Web-service Offering Repository Deposit)
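A SWORD deposit is, at its core, an HTTP POST of a packaged dataset to a collection URL. The sketch below only builds such a request and does not send it. The endpoint URL, the BagIt packaging identifier and the hex-encoded checksum are assumptions for illustration; a real repository advertises the packaging formats and checksum encoding it accepts.

```python
import hashlib
import urllib.request

# Hypothetical collection endpoint; the slides do not give the real deposit URL.
COLLECTION_URL = "https://repository.example.org/sword2/collection/1"


def build_sword_deposit(package: bytes, filename: str,
                        in_progress: bool = False) -> urllib.request.Request:
    """Build (but do not send) a SWORD v2 style binary deposit request."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # Assumed packaging identifier; check what the target repository accepts.
        "Packaging": "http://purl.org/net/sword/package/BagIt",
        # Checksum of the payload so the server can detect a corrupted transfer.
        "Content-MD5": hashlib.md5(package).hexdigest(),
        # "false" tells the server the deposit is complete and may be processed.
        "In-Progress": "true" if in_progress else "false",
    }
    return urllib.request.Request(COLLECTION_URL, data=package,
                                  headers=headers, method="POST")


# The 22-byte end-of-central-directory record is a valid empty ZIP archive.
package = b"PK\x05\x06" + b"\x00" * 18
request = build_sword_deposit(package, "dataset-bag.zip")
print(request.get_method(), request.get_full_url())
```

Sending the request would additionally need authentication and handling of the deposit receipt, both of which depend on the repository.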
The e-Depot for Dutch Archaeology
• >40,000 archaeological datasets
• 76% available without restrictions
• Field drawings/GIS, Images, Publications, Data tables, Photographs
Open Archival Information System
What is a ‘Trusted Digital Repository’?
• Mission to provide the designated community
with trustworthy long-term access to curated
digital resources
• Constant monitoring, planning and maintenance
• Knowledge of/measures against: threats and
risks within systems
• Regular checking and/or certification
• Certificates: 3 standards, 3 levels
http://www.trusteddigitalrepository.eu
OAIS (ISO 14721)
Trusted Digital Repositories: Attributes and Responsibilities
TRAC
Audit and Certification of Trustworthy Digital Repositories (ISO 16363)
Bodies Providing Audit And Certification (ISO 16919)
See http://wiki.digitalrepositoryauditandcertification.org and
http://www.alliancepermanentaccess.org/membership/member-resources/audit-and-certification
Standards will be available free from http://www.ccsds.org
… trustworthiness of digital repositories using ISO 16363. It covers principles needed to inspire
confidence that third party certification of the management of the digital repository has been
performed with impartiality, competence, responsibility, openness, confidentiality, and
responsiveness to complaints.
Metrics concerning:
• Organizational Infrastructure
  • e.g. The repository shall have a documented history of the
  changes to its operations, procedures, software, and hardware.
• Digital Object Management
  • e.g. The repository shall have access to necessary tools
  and resources to provide authoritative Representation
  Information for all of the digital objects it contains.
• Infrastructure and Security Risk Management
  • e.g. The repository shall have procedures in place to
  evaluate when changes are needed to current software.
EUROPEAN FRAMEWORK FOR AUDIT AND CERTIFICATION OF DIGITAL REPOSITORIES, to be promoted by the EU
• Basic Certification (Data Seal of Approval): monitored self-audit using DSA metrics
• Extended Certification: monitored self-audit using ISO 16363 (or DIN 31644 in Germany)
• Formal Certification: audit by external auditors
EASY: Electronic Archiving SYstem (screenshot of the home page: Register, Log in, New deposit, Browse, Advanced search)
https://easy.dans.knaw.nl
CoreTrustSeal/ Nestor Seal 2016
Overview
Cite as
Description
Data files (N)
Self-depositing
Qualified Dublin Core metadata: Title, Alternative title, Creator, Contributor, Date created,
Description, Subject, Coverage, Identifier, Relation, Temporal, Spatial, Type, Format, Language,
Access rights, Date available, Remarks, Rights holder, Publisher, Audience, Source, Date
Upload Files
Data-managing
• Check Dublin Core, edit/modify where necessary
• Assign project codes (if required)
• Download files, check for completeness / privacy-sensitive data
• Migrate files to preferred formats (if required/necessary)
• Modify directory structure (if necessary)
• Upload preferred formats
• Check individual file metadata, edit/modify if necessary
• Add individual file metadata
• Publish files (set visibility/accessibility rights)
• Create a ‘Jumpoff’ presentation page
• Check workflow
• Publish dataset
• Relate dataset to related datasets or web pages
• End administration
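The first step of the checklist above, checking the Dublin Core metadata, is a natural candidate for automation. The sketch below flags fields that are missing or blank. Which fields count as mandatory is an assumption made for this example, not the archive's actual deposit rule.

```python
# Fields treated as mandatory here are an assumption for illustration,
# not the archive's actual deposit rules.
REQUIRED_FIELDS = ("Title", "Creator", "Date created", "Description", "Access rights")


def missing_fields(metadata: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank, in a fixed order."""
    return [field for field in REQUIRED_FIELDS
            if not metadata.get(field, "").strip()]


record = {
    "Title": "Excavation report",
    "Creator": "  ",  # whitespace only, so it counts as missing
    "Date created": "2019-04-01",
    "Description": "Finds list and field drawings",
}
print(missing_fields(record))  # → ['Creator', 'Access rights']
```

A check like this only catches empty fields; judging whether a description is meaningful still needs a human data manager.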
Case 1: I, Robot
The SWORD-ingest of Dutch archaeological datasets by the network of
governmental depots into the central DANS hub.
“I’d give you advice, but you wouldn’t listen. No one ever does.”
--Marvin the Paranoid Android
(Adams, Douglas, 1952-2001. The Hitchhiker's Guide to the Galaxy;
New York :Harmony Books, 1980. Print.)
Reality: guidance => monitoring => feedback => effect change
--Valentijn the Preservation Officer
Provincial Depots
Front-office/Back-office model
PDBS: Provinciaal Depot Beheer Systeem (Provincial Depot Management System)
Open Archival Information System
Persistent Identifier Citation
Front-office
Machine to Machine
SWORD
OAI-PMH
REST-API
PRODUCER
CONSUMER
Guides to Good Practice
Before depositing
Metadata
What DANS does
Legal aspects
Quoting data
https://dans.knaw.nl/en
Deposit => Read more about depositing data
File Formats
http://www.parthenos-project.eu/portal/policies_guidelines
Documentation
During depositing
After depositing
Case 2: Transformers!
Mass migrations and transformations of archived data to new standards.
“Upgrading is compulsory.”
--the Cybermen
Doctor Who, BBC Studios, 1963-2019
Reality: guiding => monitoring => migrating where relevant => update documents
--the Archiving staff (Trusted Digital Repositories)
Preferred Formats
Non-preferred format(s)
As a general guideline, DANS considers that the file
formats best suited for long-term preservation and
accessibility are file formats which:
• are commonly used
• have open specifications
• are independent of specific software, developers or
suppliers
Archaeological data deposited in EASY (data type: preferred format)
• Publications: PDF/A
• Reports: PDF/A
• CAD drawings/GIS maps: DXF R12 / MID+MIF
• Field drawings (scans): JPEG + TIFF
• Photographs: JPEG + TIFF
• Vector Images: SVG
• Data tables (databases / spreadsheets): CSV
Word, WordPerfect => PDF/A
Access => CSV
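A preferred-formats policy like the one above can be turned into a simple triage step that decides, per file, whether to keep it, migrate it or hand it to a data manager. The sketch below is illustrative only: the file extensions and the mapping are assumptions, and an extension alone cannot tell, for example, a plain PDF from PDF/A.

```python
from pathlib import PurePath

# Assumed extensions for non-preferred formats and their migration targets.
MIGRATION_TARGETS = {
    ".doc": "PDF/A",  # Word
    ".wpd": "PDF/A",  # WordPerfect
    ".mdb": "CSV",    # Access
}

# Assumed extensions of formats that are kept as deposited.
PREFERRED_EXTENSIONS = {".pdf", ".csv", ".svg", ".tif", ".tiff",
                        ".jpg", ".jpeg", ".dxf", ".mid", ".mif"}


def plan_migration(filenames):
    """Map each file name to 'keep', a migration target, or manual review."""
    plan = {}
    for name in filenames:
        extension = PurePath(name).suffix.lower()
        if extension in PREFERRED_EXTENSIONS:
            plan[name] = "keep"
        elif extension in MIGRATION_TARGETS:
            plan[name] = f"migrate to {MIGRATION_TARGETS[extension]}"
        else:
            plan[name] = "review manually"
    return plan


for name, action in plan_migration(
        ["report.DOC", "finds.mdb", "plan.dxf", "notes.xyz"]).items():
    print(f"{name}: {action}")
```

In practice the decision would rest on proper file identification rather than on the extension, which is why the unknown case falls back to manual review.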
Mass migrations to Preferred Formats
• File identification (mediatype)
• Selection filter: visible files
• Extraction from archive (Python)
• Checksum validation (at four points in the workflow)
• Double conversion (Python)
• Adding provenance metadata to file IDs
• Generating logfiles
• Archival storage
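Two of the steps above, file identification and checksum validation, can be sketched with the Python standard library. This is not the script used at DANS: the choice of SHA-256, the extension-based media type guess and the shape of the log entry are all assumptions for illustration.

```python
import hashlib
import mimetypes
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_describe(path: Path, expected_sha256: str) -> dict:
    """Return a log entry with a guessed media type and the checksum verdict."""
    actual = sha256_of(path)
    media_type, _ = mimetypes.guess_type(path.name)  # guess from the extension only
    return {
        "file": path.name,
        "mediatype": media_type or "application/octet-stream",
        "sha256": actual,
        "checksum_ok": actual == expected_sha256,
    }


with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "finds.csv"
    sample.write_bytes(b"id,material\n1,bronze\n")
    recorded = sha256_of(sample)                     # checksum stored at ingest
    ok_entry = verify_and_describe(sample, recorded)
    sample.write_bytes(b"id,material\n1,iron\n")     # simulate a changed file
    bad_entry = verify_and_describe(sample, recorded)

print(ok_entry["checksum_ok"], bad_entry["checksum_ok"])  # → True False
```

Running the same check before and after each processing step is what makes it possible to tell a faulty conversion from a corrupted source file.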
Case 3: Automatic for the People
The promotion and integration of local data from the Portable Antiquities of
the Netherlands (PAN) in an international network, making use of thesauri,
data mining and Linked Open-Data techniques.
“I am fluent in over six million forms of communication.”
--Protocol droid C3PO
Star Wars: Episode VI - Return of the Jedi. Directed by Richard Marquand. Lucasfilm Ltd. LLC, 1983
Reality: mapping metadata => harvesting => adding sources => enable access
--Protocol-operating data manager V@L3NT1JN
PAN – Portable Antiquities of the Netherlands
CARARE-project:
‘Open Access’ archaeological publications visible in Europeana
http://www.carare.eu/
ARIADNE-portal:
http://portal.ariadne-infrastructure.eu/
Network diagram
Parties: Researchers, Excavators, Depot holders, National Initiatives, International Initiatives, International portals
Flows: depositing, depositing via SWORD, searching & downloading, OAI-PMH harvesting
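OAI-PMH harvesting, as named in the diagram, works by requesting pages of records and following a resumption token until none is returned. The sketch below builds the request URLs and parses one page offline. The base URL and the sample response are made up for illustration; only the `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments and the XML namespace come from the OAI-PMH protocol itself.

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urllib.parse.urlencode(params)}"


def parse_page(xml_text):
    """Return the record identifiers on this page and the next token, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [element.text for element in
                   root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)


# Hypothetical, heavily shortened response used instead of a live request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_page(SAMPLE)
print(identifiers, token)
print(list_records_url("https://repository.example.org/oai", resumption_token=token))
```

A full harvester would loop: fetch the URL, parse the page, and request the next page while a token is present.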
General contact:
Info@DANS.KNAW.NL
Head Data Archive:
Hella.Hollander@DANS.KNAW.NL
Senior Data Steward / Preservation Officer:
Valentijn.Gilissen@DANS.KNAW.NL
Watch our videos on YouTube:
https://www.youtube.com/user/DANSDataArchiving
Thanks for listening!

More Related Content

Similar to 02 2019 caa_krakowvg

TDWI Checklist Report: Active Data Archiving
TDWI Checklist Report:  Active Data ArchivingTDWI Checklist Report:  Active Data Archiving
TDWI Checklist Report: Active Data ArchivingRainStor
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsSlim Baltagi
 
Educause Annual 2007
Educause Annual 2007Educause Annual 2007
Educause Annual 2007Neil Matatall
 
Industrial Data Space Association - New Members, New Insights, New Future Dir...
Industrial Data Space Association - New Members, New Insights, New Future Dir...Industrial Data Space Association - New Members, New Insights, New Future Dir...
Industrial Data Space Association - New Members, New Insights, New Future Dir...Thorsten Huelsmann
 
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptx
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptxLaurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptx
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptxFIWARE
 
Big data in freight transport
Big data in freight transportBig data in freight transport
Big data in freight transportPer Olof Arnäs
 
Meeting the future - Big data in freight transport
Meeting the future - Big data in freight transportMeeting the future - Big data in freight transport
Meeting the future - Big data in freight transportPer Olof Arnäs
 
KELLY_MANOVERV.PDF
KELLY_MANOVERV.PDFKELLY_MANOVERV.PDF
KELLY_MANOVERV.PDFHernanKlint
 
The e-Ciber Superfacility Project
The e-Ciber Superfacility ProjectThe e-Ciber Superfacility Project
The e-Ciber Superfacility ProjectLeandro Ciuffo
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13Kristi Holmes
 
de theory and practice of digital preservation
de theory and practice of digital preservationde theory and practice of digital preservation
de theory and practice of digital preservationFIAT/IFTA
 
170131 tryggve-at ssi-biobanks-ap
170131 tryggve-at ssi-biobanks-ap170131 tryggve-at ssi-biobanks-ap
170131 tryggve-at ssi-biobanks-apanttipursula
 
Powering the Future of Data  
Powering the Future of Data	   Powering the Future of Data	   
Powering the Future of Data  Bilot
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amirydatastack
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
 

Similar to 02 2019 caa_krakowvg (20)

TDWI Checklist Report: Active Data Archiving
TDWI Checklist Report:  Active Data ArchivingTDWI Checklist Report:  Active Data Archiving
TDWI Checklist Report: Active Data Archiving
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
 
Datamining
DataminingDatamining
Datamining
 
Educause Annual 2007
Educause Annual 2007Educause Annual 2007
Educause Annual 2007
 
Industrial Data Space Association - New Members, New Insights, New Future Dir...
Industrial Data Space Association - New Members, New Insights, New Future Dir...Industrial Data Space Association - New Members, New Insights, New Future Dir...
Industrial Data Space Association - New Members, New Insights, New Future Dir...
 
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptx
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptxLaurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptx
Laurent Curnier – Monaco DataPlatform - LaurentCURNIER_.pptx
 
Big data in freight transport
Big data in freight transportBig data in freight transport
Big data in freight transport
 
Wiser2009 Luis Martinez
Wiser2009 Luis MartinezWiser2009 Luis Martinez
Wiser2009 Luis Martinez
 
Meeting the future - Big data in freight transport
Meeting the future - Big data in freight transportMeeting the future - Big data in freight transport
Meeting the future - Big data in freight transport
 
Iot presentation
Iot presentationIot presentation
Iot presentation
 
KELLY_MANOVERV.PDF
KELLY_MANOVERV.PDFKELLY_MANOVERV.PDF
KELLY_MANOVERV.PDF
 
The e-Ciber Superfacility Project
The e-Ciber Superfacility ProjectThe e-Ciber Superfacility Project
The e-Ciber Superfacility Project
 
Information Systems
Information SystemsInformation Systems
Information Systems
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
de theory and practice of digital preservation
de theory and practice of digital preservationde theory and practice of digital preservation
de theory and practice of digital preservation
 
170131 tryggve-at ssi-biobanks-ap
170131 tryggve-at ssi-biobanks-ap170131 tryggve-at ssi-biobanks-ap
170131 tryggve-at ssi-biobanks-ap
 
Powering the Future of Data  
Powering the Future of Data	   Powering the Future of Data	   
Powering the Future of Data  
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 
Burton - Security, Privacy and Trust
Burton - Security, Privacy and TrustBurton - Security, Privacy and Trust
Burton - Security, Privacy and Trust
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
 

More from ariadnenetwork

ARIADNE plus - vms workshop.pdf
ARIADNE plus - vms workshop.pdfARIADNE plus - vms workshop.pdf
ARIADNE plus - vms workshop.pdfariadnenetwork
 
DANS Data Trail Data Management Tools for Archaeologists
DANS Data Trail Data Management Tools for ArchaeologistsDANS Data Trail Data Management Tools for Archaeologists
DANS Data Trail Data Management Tools for Archaeologistsariadnenetwork
 
Eaa2021 476 natália botica - from 2_archis to datarepositorium2
Eaa2021 476 natália botica - from 2_archis to datarepositorium2Eaa2021 476 natália botica - from 2_archis to datarepositorium2
Eaa2021 476 natália botica - from 2_archis to datarepositorium2ariadnenetwork
 
Eaa2021 476 kecheva_nekhrizov_bulgaria
Eaa2021 476 kecheva_nekhrizov_bulgariaEaa2021 476 kecheva_nekhrizov_bulgaria
Eaa2021 476 kecheva_nekhrizov_bulgariaariadnenetwork
 
Eaa2021 476 norwegian_unimus
Eaa2021 476 norwegian_unimusEaa2021 476 norwegian_unimus
Eaa2021 476 norwegian_unimusariadnenetwork
 
Eaa2021 session 476 abstracts
Eaa2021 session 476 abstractsEaa2021 session 476 abstracts
Eaa2021 session 476 abstractsariadnenetwork
 
Eaa2021 476 ways and capacity in archaeological data management in serbia
Eaa2021 476 ways and capacity in archaeological data management in serbiaEaa2021 476 ways and capacity in archaeological data management in serbia
Eaa2021 476 ways and capacity in archaeological data management in serbiaariadnenetwork
 
Eaa2021 476 izeta cattaneo idacordig and suquia
 Eaa2021 476 izeta cattaneo idacordig and suquia Eaa2021 476 izeta cattaneo idacordig and suquia
Eaa2021 476 izeta cattaneo idacordig and suquiaariadnenetwork
 
Eaa2021 476 preserving historic building documentation pakistan
Eaa2021 476 preserving historic building documentation  pakistanEaa2021 476 preserving historic building documentation  pakistan
Eaa2021 476 preserving historic building documentation pakistanariadnenetwork
 
Eaa2021 s476 ariadne-seadda
Eaa2021 s476 ariadne-seaddaEaa2021 s476 ariadne-seadda
Eaa2021 s476 ariadne-seaddaariadnenetwork
 
Preferred Formats = Pre-FAIRed Formats
Preferred Formats = Pre-FAIRed FormatsPreferred Formats = Pre-FAIRed Formats
Preferred Formats = Pre-FAIRed Formatsariadnenetwork
 
Heeren pan-seadda-leiden-17mrt2020
Heeren pan-seadda-leiden-17mrt2020Heeren pan-seadda-leiden-17mrt2020
Heeren pan-seadda-leiden-17mrt2020ariadnenetwork
 
D6.1 initial report-innovation-strategy-and-targeted-activities
D6.1 initial report-innovation-strategy-and-targeted-activitiesD6.1 initial report-innovation-strategy-and-targeted-activities
D6.1 initial report-innovation-strategy-and-targeted-activitiesariadnenetwork
 
ARIADNEplus Community Needs Survey - Key Results
ARIADNEplus Community Needs Survey - Key ResultsARIADNEplus Community Needs Survey - Key Results
ARIADNEplus Community Needs Survey - Key Resultsariadnenetwork
 
ARIADNEplus survey-2019-report
ARIADNEplus survey-2019-reportARIADNEplus survey-2019-report
ARIADNEplus survey-2019-reportariadnenetwork
 
04 ariadn eplus_caa2019_cnrs_open_archaeo_20190424
04 ariadn eplus_caa2019_cnrs_open_archaeo_2019042404 ariadn eplus_caa2019_cnrs_open_archaeo_20190424
04 ariadn eplus_caa2019_cnrs_open_archaeo_20190424ariadnenetwork
 
03 ariadn eplus_caa_2019_inrap
03 ariadn eplus_caa_2019_inrap03 ariadn eplus_caa_2019_inrap
03 ariadn eplus_caa_2019_inrapariadnenetwork
 
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 2019042501 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425ariadnenetwork
 
00 jdr introduction caa_ariadn_eplus_2019
00 jdr introduction caa_ariadn_eplus_201900 jdr introduction caa_ariadn_eplus_2019
00 jdr introduction caa_ariadn_eplus_2019ariadnenetwork
 

More from ariadnenetwork (20)

ARIADNE plus - vms workshop.pdf
ARIADNE plus - vms workshop.pdfARIADNE plus - vms workshop.pdf
ARIADNE plus - vms workshop.pdf
 
DANS Data Trail Data Management Tools for Archaeologists
DANS Data Trail Data Management Tools for ArchaeologistsDANS Data Trail Data Management Tools for Archaeologists
DANS Data Trail Data Management Tools for Archaeologists
 
Eaa2021 476 natália botica - from 2_archis to datarepositorium2
Eaa2021 476 natália botica - from 2_archis to datarepositorium2Eaa2021 476 natália botica - from 2_archis to datarepositorium2
Eaa2021 476 natália botica - from 2_archis to datarepositorium2
 
Eaa2021 476 kecheva_nekhrizov_bulgaria
Eaa2021 476 kecheva_nekhrizov_bulgariaEaa2021 476 kecheva_nekhrizov_bulgaria
Eaa2021 476 kecheva_nekhrizov_bulgaria
 
Eaa2021 476 norwegian_unimus
Eaa2021 476 norwegian_unimusEaa2021 476 norwegian_unimus
Eaa2021 476 norwegian_unimus
 
Eaa2021 session 476 abstracts
Eaa2021 session 476 abstractsEaa2021 session 476 abstracts
Eaa2021 session 476 abstracts
 
Eaa2021 476 ways and capacity in archaeological data management in serbia
Eaa2021 476 ways and capacity in archaeological data management in serbiaEaa2021 476 ways and capacity in archaeological data management in serbia
Eaa2021 476 ways and capacity in archaeological data management in serbia
 
Eaa2021 476 izeta cattaneo idacordig and suquia
 Eaa2021 476 izeta cattaneo idacordig and suquia Eaa2021 476 izeta cattaneo idacordig and suquia
Eaa2021 476 izeta cattaneo idacordig and suquia
 
Eaa2021 476 preserving historic building documentation pakistan
Eaa2021 476 preserving historic building documentation  pakistanEaa2021 476 preserving historic building documentation  pakistan
Eaa2021 476 preserving historic building documentation pakistan
 
Eaa2021 s476 ariadne-seadda
Eaa2021 s476 ariadne-seaddaEaa2021 s476 ariadne-seadda
Eaa2021 s476 ariadne-seadda
 
Preferred Formats = Pre-FAIRed Formats
Preferred Formats = Pre-FAIRed FormatsPreferred Formats = Pre-FAIRed Formats
Preferred Formats = Pre-FAIRed Formats
 
Heeren pan-seadda-leiden-17mrt2020
Heeren pan-seadda-leiden-17mrt2020Heeren pan-seadda-leiden-17mrt2020
Heeren pan-seadda-leiden-17mrt2020
 
D6.1 initial report-innovation-strategy-and-targeted-activities
D6.1 initial report-innovation-strategy-and-targeted-activitiesD6.1 initial report-innovation-strategy-and-targeted-activities
D6.1 initial report-innovation-strategy-and-targeted-activities
 
ARIADNEplus Community Needs Survey - Key Results
ARIADNEplus Community Needs Survey - Key ResultsARIADNEplus Community Needs Survey - Key Results
ARIADNEplus Community Needs Survey - Key Results
 
ARIADNEplus survey-2019-report
ARIADNEplus survey-2019-reportARIADNEplus survey-2019-report
ARIADNEplus survey-2019-report
 
05 caa hasil_novak
05 caa hasil_novak05 caa hasil_novak
05 caa hasil_novak
 
04 ariadn eplus_caa2019_cnrs_open_archaeo_20190424
04 ariadn eplus_caa2019_cnrs_open_archaeo_2019042404 ariadn eplus_caa2019_cnrs_open_archaeo_20190424
04 ariadn eplus_caa2019_cnrs_open_archaeo_20190424
 
03 ariadn eplus_caa_2019_inrap
03 ariadn eplus_caa_2019_inrap03 ariadn eplus_caa_2019_inrap
03 ariadn eplus_caa_2019_inrap
 
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 2019042501 caa2019 ariadn_eplus_snd_uj_krakow 20190425
01 caa2019 ariadn_eplus_snd_uj_krakow 20190425
 
00 jdr introduction caa_ariadn_eplus_2019
00 jdr introduction caa_ariadn_eplus_201900 jdr introduction caa_ariadn_eplus_2019
00 jdr introduction caa_ariadn_eplus_2019
 

Recently uploaded

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in collegessuser7a7cd61
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 

Recently uploaded (20)

Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

02 2019 caa_krakowvg

  • 1. dans.knaw.nl DANS is een instituut van KNAW en NWO My data manager is a robot! Mass ingests and migrations & network integrations Valentijn Gilissen, MA: Data Manager / Preservation Officer April 2019, CAA, Krakow
  • 2. Use-cases • The SWORD-ingest of Dutch archaeological datasets by the network of governmental depots into the central DANS hub. • Mass migrations and transformations of archived data to new standards. • The promotion and integration of local data from the Portable Antiquities of the Netherlands (PAN) in an international network, making use of thesauri, data mining and Linked Open-Data techniques. “How is humanity saved if it's not allowed to... evolve?” --Ultron Avengers: Age of Ultron. Directed by Joss Whedon. Marvel Studios, 2015 To support the ingest and validation of increasing volumes of data, the role of the data manager will need to adapt. --Valentron
  • 3. Institute of Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005 First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989 Mission: promote and provide permanent access to digital research resources https://dans.knaw.nl Data Archiving and Networked Services
  • 4. https://easy.dans.knaw.nl https://dataverse.nl https://www.narcis.nl DANS core data services NARCIS: Gateway to scholarly information in the Netherlands DataverseNL for short- and mid-term data storage EASY: certified long-term Electronic Archiving System for self-deposit
  • 5. http://www.brill.com/rdj https://data.mendeley.com/ https://datadryad.org Background Archive Research Data Journal for the Humanities and Social Sciences Training & Consultancy http://datasupport.researchdata.nl/ DANS additional services Ingest via SWORD protocol (Simple Web-service Offering Repository Deposit)
  • 6. The e-Depot for Dutch Archaeology >40.000 76% Field drawings/GIS Images Publications Data tables Photographs available without restrictions archaeological datasets
  • 8. • Mission to provide the designated community with trustworthy long-term access to curated digital resources • Constant monitoring, planning and maintenance • Knowledge of/measures against: threats and risks within systems • Regular checking and/or certification • Certificates: 3 standards, 3 levels What is a ‘Trusted Digital Repository’? http://www.trusteddigitalrepository.eu OAIS (ISO 14721) Trusted Digital Repositories: Attributes and Responsibilities TRAC Audit and Certification of Trustworthy Digital Repositories (ISO 16363) Bodies Providing Audit And Certification (ISO 16919) Formal Certification See http://wiki.digitalrepositoryauditandcertification.org and http://www.alliancepermanentaccess.org/membership/member-resources/audit-and-certification Standards will be available free from http://www.ccsds.org trustworthiness of digital repositories using ISO 16363. It covers principles needed to inspire confidence that third party certification of the management of the digital repository has been performed with impartiality, competence, responsibility, openness, confidentiality, and responsiveness to complaints Metrics concerning: • Organizational Infrastructure • e.g. The repository shall have a documented history of the changes to its operations, procedures, software, and hardware. • Digital Object Management • e.g. The repository shall have access to necessary tools and resources to provide authoritative Representation Information for all of the digital objects it contains. • Infrastructure and Security Risk Management • e.g. The repository shall have procedures in place to evaluate when changes are needed to current software. Basic Certification Data Seal of Approval Extended Certification EUROPEAN FRAMEWORK FOR AUDIT AND CERTIFICATION OF DIGITAL REPOSITORIES to be promoted by the EU Monitored self-audit using DSA metrics Monitored self-audit using ISO 16363 (or DIN31644 in Germany) Audit by external auditors
  • 9. Electronic Archiving SYstem (EASY) Register Log in New deposit Browse Advanced search Search help Search Disclaimer Legal information Property Rights Statement How to cite data https://easy.dans.knaw.nl CoreTrustSeal / Nestor Seal 2016
  • 10. Overview Cite as Description Data files (N) Electronic Archiving SYstem (EASY)
  • 11. Title Alternative title Creator Contributor Date created Description Subject Coverage Identifier Relation Temporal Spatial Type Format Language Upload Files Qualified Dublin Core metadata Self-depositing Access rights Date available Remarks Rights holder Publisher Audience Source Date
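A minimal sketch of how a few of the Qualified Dublin Core fields listed on this slide could be serialised and checked before deposit. The wrapper element name `metadata`, the choice of required fields, and the sample values are assumptions for illustration only; they are not taken from the EASY deposit form.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# Namespace assumed for each field; simple DC elements vs. qualified DC terms.
FIELD_NS = {
    "title": DC, "creator": DC, "description": DC, "subject": DC,
    "created": DCTERMS, "accessRights": DCTERMS,
    "spatial": DCTERMS, "temporal": DCTERMS,
}

def build_record(fields):
    """Turn a dict of field name -> value (or list of values) into an XML element."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            element = ET.SubElement(root, f"{{{FIELD_NS[name]}}}{name}")
            element.text = item
    return root

def missing_fields(fields, required=("title", "creator", "created", "description")):
    """Return the required field names that are absent or empty (the required set is an assumption)."""
    return [name for name in required if not fields.get(name)]

record = {
    "title": "Example excavation",
    "creator": "Example Depot",
    "created": "2019-04-01",
    "description": "Illustrative record only.",
    "subject": ["archaeology", "excavation"],
    "accessRights": "open access",
}
xml_text = ET.tostring(build_record(record), encoding="unicode")
print(xml_text)
```

A check such as `missing_fields` is the kind of step that could run automatically before a data manager looks at the deposit.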
  • 12. Data-managing • Check Dublin Core, edit/modify where necessary • Assign project codes (if required) • Download files, check for completeness / privacy-sensitive data • Migrate files to preferred formats (if required/necessary) • Modify directory structure (if necessary) • Upload preferred formats • Check individual file metadata, edit/modify if necessary • Add individual file metadata • Publish files (set visibility/accessibility rights) • Create a ‘Jumpoff’ presentation page • Check workflow • Publish dataset • Relate dataset to related datasets or web pages • End administration
  • 13. Case 1: I, Robot The SWORD-ingest of Dutch archaeological datasets by the network of governmental depots into the central DANS hub. “I’d give you advice, but you wouldn’t listen. No one ever does.” --Marvin the Paranoid Android (Adams, Douglas, 1952-2001. The Hitchhiker's Guide to the Galaxy; New York: Harmony Books, 1980. Print.) Reality: guidance => monitoring => feedback => effect change --Valentijn the Preservation Officer
  • 15. Front-office/Back-office model PDBS Provinciaal Depot Beheer Systeem (Provincial Depot Management System)
  • 16. Open Archival Information System Persistent Identifier Citation Front-office Machine to Machine SWORD OAI-PMH REST-API PRODUCER CONSUMER
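As a rough illustration of the machine-to-machine SWORD route on the producer side, the sketch below assembles the HTTP headers that a SWORD v2 binary deposit typically carries. The slides do not state the SWORD version or the package format in use, so the packaging identifier, the filename, and the helper name `sword_deposit_headers` are assumptions; nothing is sent over the network.

```python
import hashlib

# Packaging identifier defined in the SWORD v2 profile. Which package format
# the receiving repository actually expects is an assumption here.
SIMPLE_ZIP = "http://purl.org/net/sword/package/SimpleZip"

def sword_deposit_headers(package, filename, in_progress=False):
    """Build the headers for POSTing a zipped dataset to a SWORD collection."""
    return {
        "Content-Type": "application/zip",
        "Content-Disposition": f"attachment; filename={filename}",
        # The checksum lets the server detect a corrupted transfer.
        "Content-MD5": hashlib.md5(package).hexdigest(),
        "Packaging": SIMPLE_ZIP,
        # "false" signals that the deposit is complete and can be processed.
        "In-Progress": "true" if in_progress else "false",
    }

# Bytes of an empty zip archive, used as a stand-in for a real dataset package.
package_bytes = b"PK\x05\x06" + b"\x00" * 18
headers = sword_deposit_headers(package_bytes, "dataset-0001.zip")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Sending these headers with the package bytes in an authenticated POST to the repository's collection URL would complete the deposit; that URL and the credentials depend on the repository.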
  • 19. Guides to Good Practice Before depositing Metadata What DANS does Legal aspects Quoting data https://dans.knaw.nl/en Deposit => Read more about depositing data File Formats http://www.parthenos-project.eu/portal/policies_guidelines Documentation During depositing After depositing
  • 20. Case 2: Transformers! Mass migrations and transformations of archived data to new standards. “Upgrading is compulsory.” --the Cybermen Doctor Who, BBC Studios, 1963-2019 Reality: guiding => monitoring => migrating where relevant => update documents --the Archiving staff (Trusted Digital Repositories)
  • 22. Preferred Formats Non-preferred format(s) As a general guideline, DANS considers the file formats best suited for long-term preservation and accessibility to be those which: - are commonly used - have open specifications - are independent of specific software, developers or suppliers
  • 23. Archaeological data deposited in EASY Publications CAD drawings/GIS maps Field drawings (scans) Data tables (databases / spreadsheets) Photographs Reports Vector Images JPEG + TIFF JPEG + TIFF SVG CSV PDF/A PDF/A DXF R12 / MID+MIF
  • 25. Mass migrations to Preferred Formats File identification (mediatype) Selection filter: visible files Extraction from archive (Python) Checksum validation Checksum validation Checksum validation Checksum validation Double conversion (Python) Adding provenance metadata to file IDs Generating logfiles Archival storage
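The sketch below illustrates the general shape of such a migration run: filter on visible files, extract a copy, validate its checksum, convert it, and log provenance. It is not the DANS implementation. SHA-256, the dictionary layout of a file entry, and the stand-in converter `demo_convert` are all assumptions, and the exact order and number of checksum checks on the slide are not reproduced.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def sha256_of(path):
    """Compute a file checksum in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def extract_validated(source, workdir, expected_checksum):
    """Copy a file out of storage and verify the copy against the stored checksum."""
    target = Path(workdir) / Path(source).name
    shutil.copyfile(source, target)
    actual = sha256_of(target)
    if actual != expected_checksum:
        raise ValueError(f"checksum mismatch for {source}: {actual}")
    return target

def migrate(files, workdir, convert):
    """Convert every visible file and return provenance log entries."""
    log = []
    for entry in files:
        if not entry["visible"]:  # selection filter: visible files only
            continue
        original = extract_validated(entry["path"], workdir, entry["sha256"])
        converted = convert(original)
        log.append({
            "source": str(entry["path"]),
            "derived": str(converted),
            "source_sha256": entry["sha256"],
            "derived_sha256": sha256_of(converted),
        })
    return log

def demo_convert(path):
    """Stand-in converter: rewrites a semicolon-separated table as CSV."""
    out = path.with_suffix(".csv")
    out.write_bytes(path.read_bytes().replace(b";", b","))
    return out

with tempfile.TemporaryDirectory() as tmp:
    store, work = Path(tmp) / "store", Path(tmp) / "work"
    store.mkdir()
    work.mkdir()
    table = store / "finds.txt"
    table.write_bytes(b"id;type\n1;fibula\n")
    hidden = store / "notes.txt"
    hidden.write_bytes(b"internal")
    files = [
        {"path": table, "visible": True, "sha256": sha256_of(table)},
        {"path": hidden, "visible": False, "sha256": sha256_of(hidden)},
    ]
    migration_log = migrate(files, work, demo_convert)

print(migration_log)
```

Recording both checksums in the log entry ties each derived file to the exact original it came from, which is the purpose of the provenance step.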
  • 26. Case 3: Automatic for the People The promotion and integration of local data from the Portable Antiquities of the Netherlands (PAN) in an international network, making use of thesauri, data mining and Linked Open-Data techniques. “I am fluent in over six million forms of communication.” --Protocol droid C3PO Star Wars: Episode VI - Return of the Jedi. Directed by Richard Marquand. Lucasfilm Ltd. LLC, 1983 Reality: mapping metadata => harvesting => adding sources => enable access --Protocol-operating data manager V@L3NT1JN
  • 27. PAN – Portable Antiquities of the Netherlands
  • 28. CARARE-project: ‘Open Access’ archaeological publications visible in Europeana http://www.carare.eu/
  • 30. Initiatives Researchers Excavators Depot holders National Initiatives International portals International searching & downloading searching & downloading searching & downloading depositing depositing depositing depositing depositing depositing OAI-PMH harvesting Depositing via SWORD
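On the consumer side, OAI-PMH harvesting comes down to requesting `ListRecords` pages and following the resumption token until none is returned. The sketch below builds the request URL and parses one page; the base URL and the sample response are invented for illustration, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"

def parse_page(xml_text):
    """Return the record identifiers on a page and the token for the next page, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)

# Invented sample response; a real harvester would fetch this over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint
identifiers, token = parse_page(SAMPLE)
first_url = list_records_url(BASE_URL)
next_url = list_records_url(BASE_URL, resumption_token=token)
print(identifiers, next_url)
```

A portal that aggregates several archives would repeat this loop per source and map the harvested Dublin Core fields to its own schema.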
  • 31. General contact: Info@DANS.KNAW.NL Head Data Archive: Hella.Hollander@DANS.KNAW.NL Senior Data Steward / Preservation Officer: Valentijn.Gilissen@DANS.KNAW.NL Watch our videos on YouTube: https://www.youtube.com/user/DANSDataArchiving Thanks for listening!