SlideShare a Scribd company logo
1 of 35
Semantic
Interoperability
Courses
Course Module 3
Reference Data
Management
V0.10
June 2014
ISA Programme, Action 1.1
Disclaimer
This training material was prepared for the ISA programme of the European Commission by PwC
EU Services.
The views expressed in this report are purely those of the authors and may not, in any
circumstances, be interpreted as stating an official position of the European Commission.
The European Commission does not guarantee the accuracy of the information included in this
study, nor does it accept any responsibility for any use thereof.
Reference herein to any specific products, specifications, process, or service by trade name,
trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement,
recommendation, or favouring by the European Commission.
All care has been taken by the author to ensure that s/he has obtained, where necessary,
permission to use any parts of manuscripts including illustrations, maps, and graphs, on which
intellectual property rights already exist from the titular holder(s) of such rights or from her/his or
their legal representative.
SEMIC
SEMANTIC
INTEROPERABILITY
COMMUNITY
2
Learning Objectives
By the end of this training you should have an
understanding of:
 What reference data is, its context and purpose and how it
creates value for organisations.
 Why it is important to manage and govern the reference
data lifecycle.
 How to work with reference data using open-source tools.
3
Outline
• Definitions
• Reference data
• Reference data has many names: code list, taxonomy, thesaurus,
mapping, name authority list
• Importance & relevance
1. Introduction: what is reference data?
• What is reference data management
• Why is managing reference data important?
• Design
• Change management
• Documentation
• Harmonisation
2. Why must reference data be properly managed?
4
Definitions
What is reference data?
Reference data is small, discrete sets of values that are
not updated as part of business transactions but are usually
used to impose consistent classification. Reference data
normally has a low update frequency. Reference data is
relevant across more than one business system belonging
to different organisations and sectors.
European Commission – ISA Programme, 2014 (1)
5
Example: Country Code
Named Authority Lists
The table below displays an extract of the “Countries” code list as
published on the Metadata Registry (MDR) of the EU:
Authority Code Short Name Long Name
AND Andorra Principality of Andorra
ALB Albania Republic of Albania
AUT Austria Republic of Austria
BIG Bosnia and Herzegovina Bosnia and Herzegovina
… … …
http://publications.europa.eu/mdr/resource/authority/country/html/countries-eng.html#description
6
Reference data has many names
What is considered reference data?
7
• Code list: Complete set of data element values of a coded simple data
element [ISO 9735-1:2002, 4.14]
• Taxonomy: scheme of categories and subcategories that can be used to
sort and otherwise organize items of knowledge or information [ISO/DIS
25964-2].
• Thesaurus: controlled and structured vocabulary in which concepts are
represented by terms, organized so that relationships between concepts
are made explicit, and preferred terms are accompanied by lead-in entries
for synonyms or quasi-synonyms [ISO 25964-1:2011]
• Mapping: relationship between a concept in one vocabulary and one or
more concepts in another [ISO/DIS 25964-2].
• Name authority list: controlled vocabulary for use in naming particular
entities consistently [ISO/DIS 25964-2]
Source: https://joinup.ec.europa.eu/svn/adms/ADMS_v1.00/ADMS_SKOS_v1.00.html
Importance
Where is reference data used?
Within information systems
• For categorising and identifying data
• E.g. assigning personnel to a department from a list of predefined values
Between Information Systems
• For information sharing
• E.g. using a code list to describing the context of data which is exchanged between systems over
different member states. This ensures that member states are ‘talking’ about the same data.
Reference data… is just data!
8
Relevance of common reference data
Why is common reference data important?
To avoid semantic interoperability conflicts
•By using a common set of values for describing data which is exchanged
between different systems, interoperability conflicts can be avoided.
•Please refer to training module 1 for more information on interoperability
concepts and more specifically semantic interoperability
To avoid the need for mappings
•Mappings between different value sets of reference data are often inaccurate.
By using common value sets of reference data across domains and IT systems,
the need for creating mappings can be avoided.
9
Outline
• Definitions
• Reference data
• Reference data has many names: code list, taxonomy, thesaurus,
mapping, name authority list
• Importance & relevance
1. Introduction: what is reference data?
• What is reference data management
• Why is managing reference data important?
• Design
• Change management
• Documentation
• Harmonisation
2. Why must reference data be properly managed?
10
Definitions
What is reference data management?
Reference data management comprises planning, implementation
& control activities to ensure consistency with “golden version”
of contextual data values.
Reference Data Management is control over defined domain
values (also known as vocabularies), including control over
standardized terms, code values and other unique identifiers,
business definitions for each value, business relationships within
and across domain value lists, and the consistent, shared used of
accurate, timely and relevant reference data values to classify and
categorize data
11
DAMA International, 2009,
http://www.dama.org/i4a/pages/index.cfm?pageid=1
Relevance
Why is metadata management important?
• To ensure the use of a common setting
• To ensure continuity and quality of service
• To take decisions and manage changes in a controlled fashion
• To prevents conflicts between versions (version control)
• To improve data quality
12
Reference data management
Lifecycle
1. Data Design
2. Change Management
3. Documentation
4. Harmonisation
5. Implementation
13
1. Data Design
What
•Develop thesauri,
value sets, code
lists, etc.
•Select and reuse
existing reference
data sets
Why
•Impose consistent
classification of data
•Improve data quality
•Reduce
interoperability
issues
How
•Tools
•VocBench
•PoolParty
•ListPoint
14
1. Reference Data Design | Tools
PoolParty
PoolParty is a tool for creating thesauri, taxonomies and knowledge graphs based
on W3C standards such as SKOS, RDF and SPARQL.
(Semantic Web Company, 2014)
15
1. Reference Data Design | Tools
VocBench
VocBench is a web-based, multilingual, vocabulary editing and workflow tool
(W3C, 2001). It manages thesauri , authority lists and glossaries using SKOS-XL.
(FAO, 2014)
Listpoint
Listpoint is an open reference data platform combined with online tools to find and
combine data standards and code lists. Moreover, it helps users to make datasets
interoperable and kept up-to-date with updates. (Listpoint, 2014).
16
2. Change Management
What
•A combination of
management
processes for
incorporating
changes to value
sets.
Why
•Maintaining control
over the value sets
and the change
process
•Taking into account
the needs of
stakeholders when
adapting reference
data
How
•Defining each step in
the change process
and assigning roles
which are described
in a governance
structure
•Incorporating quality
control measures
17
2. Change Management
In a managed master data environment, specific individuals have the role of a business
data steward. They have the authority to create, update, and retire reference data
values, and to a lesser extent, in some circumstances, master data values,. Business
data stewards work with data professionals to ensure the highest quality reference and
master data. Many organizations define more specific roles and responsibilities, with
individuals often performing more than one role.
Steps in change management are:
1. Create and receive change requests
2. Identify the related stakeholders and understand their interest
3. Identify and evaluate the impacts of the proposed changes
4. Decide to accept or reject the change, or recommend a decision to management or
governance
5. Review and approve or deny the recommendation, if needed
6. Communicate the decision to stakeholders prior to making the change
7. Update the data
8. Inform stakeholders the change has been made
18
3. Documentation
What
•Representing the
reference data value
sets following
international
standards
•Describing the value
sets in a uniform
way
Why
•To avoid
misinterpretation of
the value set
•To ensure machine-
readability
•Description: to
facilitate searching
and retrieving
reference data from
a repository
How
•Representation
•SKOS
•GeneriCode
•XSD
•HTML
•Publication
•Metadata Registry
•Description
•ADMS
19
3. Documentation | XML representation
XML Extensible Markup Language (XML) is a simple, very flexible text format derived
from SGML (ISO 8879). It permits to represent reference data in many different
ways.
XML extract for representing Andalusia in the Countries NAL
<record adm.status="current" date.creation="2010-01-01" IMMC.approval.date="2012-06-27"
IMMC.proposal.date="2011-10-06" pub="false" celex="false" deprecated="false" id="COU0001">
<code-3166-1-alpha-2>AD</code-3166-1-alpha-2>
<code-3166-1-alpha-3>AND</code-3166-1-alpha-3>
<code-3166-1-num>020</code-3166-1-num>
<authority-code>AND</authority-code>
<code-iana>.ad</code-iana>
<code-tir>AND</code-tir>
<name><original.name>
<lg.version lg="cat">Andorra</lg.version>
</original.name></name>
<label>
<lg.version lg="bel" script="Cyrillic">Андора</lg.version>
<lg.version lg="bos">Andora</lg.version>
…
</label>
</record>
20
3. Documentation | SKOS representation
SKOS is an area of work developing specifications and standards to support
the use of knowledge organization systems (KOS).
SKOS extract for representing Andalusia in the Countries NAL
<skos:Concept rdf:about="http://publications.europa.eu/resource/authority/country/AND"
at:deprecated="false">
<skos:inScheme
rdf:resource="http://publications.europa.eu/resource/authority/country"/>
<at:authority-code>AND</at:authority-code>
<at:op-code>AND</at:op-code>
<atold:op-code>AND</atold:op-code>
<dc:identifier>AND</dc:identifier>
<at:start.use>1950-05-09</at:start.use>
<skos:prefLabel xml:lang="be">Андора</skos:prefLabel>
<skos:prefLabel xml:lang="bs">Andora</skos:prefLabel>
<skos:prefLabel xml:lang="bg">Андора</skos:prefLabel>
…
</skos:Concept>
21
3. Documentation | XSD representation
XSD: XML Schemas express shared vocabularies and allow machines to carry out
rules made by people.
XSD extract for representing a country in the Countries NAL
<!--RECORD DEFINITION-->
<xs:element name="record">
<xs:complexType>
<xs:sequence>
<xs:element ref="code-3166-1-alpha-2"/>
<xs:element ref="code-3166-1-alpha-3"/>
<xs:element ref="code-3166-1-num"/>
<xs:element maxOccurs="unbounded" ref="code-3166-3"
minOccurs="0"/>
<xs:element ref="authority-code"/>
<xs:element ref="op-styleguide" minOccurs="0"/>
<xs:element maxOccurs="unbounded" ref="code-iana"
minOccurs="0"/>
<xs:element ref="code-tir" minOccurs="0"/>
<xs:element ref="name"/>
<xs:element ref="label"/>
…
</xs:sequence>
</xs:complexType>
</xs:element >
22
3. Documentation | Genericode representation
GeneriCode
Genericode defines a standard format for defining code lists (also known as
enumerations or controlled vocabularies). It contains:
• a standard model and XML representation for the contents of a code list;
• a standard model and XML representation for data associated with items
in a code list;
• a standard model and XML representation for how new code lists are
derived from existing code lists.
“Genericode not only provides a representation of the items in a code list, it
also provides an audit trail for how that code list is related to previous versions
of the code list, or to other code lists. This simplifies the effort of
understanding how a new code list version differs from the previous version,
and simplifies the effort in calculating the impact of the change on existing
systems and processes.” (Genericode, 2014)
23
3. Documentation | HTML representation
HTML (Hyper Text Markup Language) is used to describe documents
24
3. Documentation | ADMS description
ADMS
The Asset Description Metadata Schema (ADMS) is a common way to describe
semantic interoperability assets making it possible for everyone to search and
discover them.
ADMS allows public administrations, businesses, standardisation bodies and
academia to (European Commission – ISA Programme, 2011):
• “describe semantic assets in a common way so that they can be seamlessly cross-
queried and discovered by ICT developers from a single access point, such as
Joinup;
• search, identify, retrieve, compare semantic assets to be reused avoiding
duplication and expensive design work through a single point of access;
• keep their own system for documenting and storing semantic assets;
• improve indexing and visibility of their own assets;
• link semantic assets to one another in cross-border and cross-sector settings.”
25
https://joinup.ec.europa.eu/asset/adms/description
3. Documentation | Publication
Metadata Registry
A best practice in reference data management is to publish value sets on an
authoritative source. An example of such a source is the Metadata Registry
(MDR) of the EU, which is maintained by the Publications Office. The MDR
registers and maintains metadata used by European Institutions involved in the
legal decision making process.
26
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=35343
4. Harmonisation
What Why
•To foster
interoperability with
reference data value
sets which are
represented using a
different standard
How
•Reference Data
Mappings
•Tool: Silk Workbench
27
• The alignment of
structural metadata
used for information
exchange either
through the creation
of mappings between
terms of two or more
specifications for
structural metadata or
by forging a wide
consensus on the use
of a common
specification.
4. Harmonisation | Example
The table below shows a mapping of the Publications Office
Named Authority List for countries with the ISO 3166 standard.
Authority Code ISO 3166 Short Name Long Name
AND AND Andorra Principality of Andorra
ALB ALB Albania Republic of Albania
AUT AUT Austria Republic of Austria
BIH BIH Bosnia and Herzegovina Bosnia and Herzegovina
… … … …
http://publications.europa.eu/mdr/resource/authority/country/html/countries-eng.html#description
28
4. Harmonisation | Silk Workbench
Tool: Silk Workbench
The Silk framework is a tool for discovering relationships between data
items within different Linked Data sources.
European Commission – ISA Programme, 2014 (2)
29
5. Implementation
What
•Propagating
reference data
changes into the
software
development
lifecycle
•Manage and support
the exchange of
information between
systems
Why
•Coordinated use of
reference data
•Reference data has a
lifecycle and needs
to be updated
•Improving reusability
How
•Manual or automatic
propagation
•In case of automatic
propagation,
changes to reference
data into operational
systems should be
controlled by
governance
processes
30
Implementation of reference data in
information systems
• To manage and support the exchange of information
between systems, the propagation of changes to reference
data is needed
• Can be done automatically or manual
• Propagation of reference data changes needs to be part of
the software development lifecycle in order to ensure
coordinated and timely updates of reference data in all
information systems involved.
31
References
European Commission – ISA Programme. (2011). Asset Description Metadata
Schema (ADMS). Brussels.
European Commission - ISA Programme. (2012). D7.1.3 - Study on persistent
URIs, with identification of best practices and recommendations on the topic for
the MSs and the EC. Brussels.
European Commission - ISA Programme. (2012). Asset Description Metadata
Schema for Software. Brussels.
European Commission – ISA Programme. (2014). D4.1. Metadata management
requirements and existing solutions in EU Institutions and Member States.
Brussels.
European Commission – ISA Programme. (2014). D4.5. Metadata alignment
pilot in the EU institutions and MSs. Brussels.
32
References
W3C. (2001). VocBench. Available at
http://www.w3.org/2001/sw/wiki/VocBench.
DAMA International. (2009). DAMA International. Available at
http://www.dama.org/
FAO. (2014). VocBench. Available at http://aims.fao.org/tools/vocbench-2.
Genericode. (2014). What is ‘genericode’? Available at
http://www.genericode.org/.
Listpoint. (2014). Welcome to Listpoint. Available at
https://www.listpoint.co.uk/.
Mosley, M., Brackett, M., Earley, S., & Henderson, D. (2009). The DAMA Guide
to The Data Management Body of Knowledge (DAMA-DMBOK Guide). New
Jersey: Technics Publications, LLC. 33
References
European Commission - ISA Programme. (2012). ADMS Controlled
Vocabularies. Available at
https://joinup.ec.europa.eu/svn/adms/ADMS_v1.00/ADMS_SKOS_v1.00.html
European Commission - ISA Programme. (2012). SEMIC – 10 Rules for
Persistent URIs. Available at
https://joinup.ec.europa.eu/community/semic/document/10-rules-persistent-
uris
Publications Office of the EU. (2014). Metadata Registry. Available at
http://publications.europa.eu/mdr/resource/authority/country/html/countries-
eng.html#description.
ISO. (2014). ISO/IEC 11179-1:2004 - Information technology - Metadata
registries (MDR) - Part 1: Framework. Available at
http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumb
er=35343
34
Join the SEMIC group on LinkedIn
Follow @SEMICeu on Twitter
Join the SEMIC community on Joinup
Project Officers Vassilios.Peristeras@ec.europa.eu
Suzanne.Wigard@ec.europa.eu
Athanasios.Karalopoulos@ec.europa.eu
Get involvedVisit our initiatives
ADMS.
SW
CORE
VOCABULARY
PUBLIC
SERVICE
REGISTERED
VOCABULARY
ORGANIZATION

More Related Content

What's hot

Service Providers within the UK Access Management Federation
Service Providers within the UK Access Management FederationService Providers within the UK Access Management Federation
Service Providers within the UK Access Management FederationJISC.AM
 
The linked open government data and metadata lifecycle
The linked open government data and metadata lifecycleThe linked open government data and metadata lifecycle
The linked open government data and metadata lifecycleOpen Data Support
 
Promoting the re use of open data through ODIP
Promoting the re use of open data through ODIPPromoting the re use of open data through ODIP
Promoting the re use of open data through ODIPOpen Data Support
 
Research Data Alliance: Current Activities and Expected Impact
Research Data Alliance: Current Activities and Expected ImpactResearch Data Alliance: Current Activities and Expected Impact
Research Data Alliance: Current Activities and Expected ImpactHerman Stehouwer
 
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...European Data Forum
 
Semic 2011 highlights report
Semic 2011 highlights report Semic 2011 highlights report
Semic 2011 highlights report Semic.eu
 
Organisational Identifiers for UK research
Organisational Identifiers for UK researchOrganisational Identifiers for UK research
Organisational Identifiers for UK researchChristopher Brown
 
Online Educa: JISC Access and Identity Management
Online Educa: JISC Access and Identity ManagementOnline Educa: JISC Access and Identity Management
Online Educa: JISC Access and Identity ManagementJISC.AM
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation Research Data Alliance
 
JISC License Workshop
JISC License WorkshopJISC License Workshop
JISC License WorkshopJISC.AM
 
Designing and developing vocabularies in RDF
Designing and developing vocabularies in RDFDesigning and developing vocabularies in RDF
Designing and developing vocabularies in RDFOpen Data Support
 
Six Ways to Simplify Metadata Management
Six Ways to Simplify Metadata ManagementSix Ways to Simplify Metadata Management
Six Ways to Simplify Metadata ManagementEnterprise Knowledge
 
FAIRsharing: how we assist with FAIRness
FAIRsharing: how we assist with FAIRnessFAIRsharing: how we assist with FAIRness
FAIRsharing: how we assist with FAIRnessSusanna-Assunta Sansone
 
Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | EUDAT
 
The PSI Directive and Open Government Data
The PSI Directive and Open Government DataThe PSI Directive and Open Government Data
The PSI Directive and Open Government DataOpen Data Support
 
Horizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilotHorizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilotSarah Jones
 

What's hot (20)

Service Providers within the UK Access Management Federation
Service Providers within the UK Access Management FederationService Providers within the UK Access Management Federation
Service Providers within the UK Access Management Federation
 
The linked open government data and metadata lifecycle
The linked open government data and metadata lifecycleThe linked open government data and metadata lifecycle
The linked open government data and metadata lifecycle
 
Promoting the re use of open data through ODIP
Promoting the re use of open data through ODIPPromoting the re use of open data through ODIP
Promoting the re use of open data through ODIP
 
Research Data Alliance: Current Activities and Expected Impact
Research Data Alliance: Current Activities and Expected ImpactResearch Data Alliance: Current Activities and Expected Impact
Research Data Alliance: Current Activities and Expected Impact
 
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...
EDF2013: Selected Talk Nikolaos Loutas, João Rodrigues Frade: Linked Open Gov...
 
Semic 2011 highlights report
Semic 2011 highlights report Semic 2011 highlights report
Semic 2011 highlights report
 
Organisational Identifiers for UK research
Organisational Identifiers for UK researchOrganisational Identifiers for UK research
Organisational Identifiers for UK research
 
Semantic Technology in Publishing & Finance
Semantic Technology in Publishing & FinanceSemantic Technology in Publishing & Finance
Semantic Technology in Publishing & Finance
 
Online Educa: JISC Access and Identity Management
Online Educa: JISC Access and Identity ManagementOnline Educa: JISC Access and Identity Management
Online Educa: JISC Access and Identity Management
 
Collaborate to Share
Collaborate to ShareCollaborate to Share
Collaborate to Share
 
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation OpenAIRE and Eudat services and tools to support FAIR DMP implementation
OpenAIRE and Eudat services and tools to support FAIR DMP implementation
 
JISC License Workshop
JISC License WorkshopJISC License Workshop
JISC License Workshop
 
Designing and developing vocabularies in RDF
Designing and developing vocabularies in RDFDesigning and developing vocabularies in RDF
Designing and developing vocabularies in RDF
 
Metadata : Concentrating on the data, not on the scheme
Metadata : Concentrating on the data, not on the schemeMetadata : Concentrating on the data, not on the scheme
Metadata : Concentrating on the data, not on the scheme
 
Six Ways to Simplify Metadata Management
Six Ways to Simplify Metadata ManagementSix Ways to Simplify Metadata Management
Six Ways to Simplify Metadata Management
 
What is a DMP
What is a DMPWhat is a DMP
What is a DMP
 
FAIRsharing: how we assist with FAIRness
FAIRsharing: how we assist with FAIRnessFAIRsharing: how we assist with FAIRness
FAIRsharing: how we assist with FAIRness
 
Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu | Research engagement in EUDAT| www.eudat.eu |
Research engagement in EUDAT| www.eudat.eu |
 
The PSI Directive and Open Government Data
The PSI Directive and Open Government DataThe PSI Directive and Open Government Data
The PSI Directive and Open Government Data
 
Horizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilotHorizon 2020 and the open research data pilot
Horizon 2020 and the open research data pilot
 

Similar to Semantic interoperability courses training module 3 - reference data v0.10

chapter8-220725121547-f85998bb.pdf
chapter8-220725121547-f85998bb.pdfchapter8-220725121547-f85998bb.pdf
chapter8-220725121547-f85998bb.pdfMahmoudSOLIMAN380726
 
‏‏Chapter 8: Reference and Master Data Management
‏‏Chapter 8: Reference and Master Data Management ‏‏Chapter 8: Reference and Master Data Management
‏‏Chapter 8: Reference and Master Data Management Ahmed Alorage
 
Credit Suisse: Multi-Domain Enterprise Reference Data
Credit Suisse: Multi-Domain Enterprise Reference DataCredit Suisse: Multi-Domain Enterprise Reference Data
Credit Suisse: Multi-Domain Enterprise Reference DataOrchestra Networks
 
FAIRification is a Team Sport: FAIRsharing and the FAIR Cookbook
FAIRification is a Team Sport: FAIRsharing and the FAIR CookbookFAIRification is a Team Sport: FAIRsharing and the FAIR Cookbook
FAIRification is a Team Sport: FAIRsharing and the FAIR CookbookSusanna-Assunta Sansone
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata managementOpen Data Support
 
Instructions1.Project Assignment OutlineDecision level of t.docx
Instructions1.Project Assignment OutlineDecision level of t.docxInstructions1.Project Assignment OutlineDecision level of t.docx
Instructions1.Project Assignment OutlineDecision level of t.docxnormanibarber20063
 
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATION
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATIONCLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATION
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATIONIRJET Journal
 
Ecm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sampleEcm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sampleChristopher Wynder
 
Data-Ed: Trends in Data Modeling
Data-Ed: Trends in Data ModelingData-Ed: Trends in Data Modeling
Data-Ed: Trends in Data ModelingData Blueprint
 
Data-Ed Online: Trends in Data Modeling
Data-Ed Online: Trends in Data ModelingData-Ed Online: Trends in Data Modeling
Data-Ed Online: Trends in Data ModelingDATAVERSITY
 
Data-Ed: Metadata Strategies
 Data-Ed: Metadata Strategies Data-Ed: Metadata Strategies
Data-Ed: Metadata StrategiesData Blueprint
 
Data Systems Integration & Business Value Pt. 1: Metadata
Data Systems Integration & Business Value Pt. 1: MetadataData Systems Integration & Business Value Pt. 1: Metadata
Data Systems Integration & Business Value Pt. 1: MetadataDATAVERSITY
 
Data-Ed: Data Systems Integration & Business Value PT. 1: Metadata
Data-Ed: Data Systems Integration & Business Value PT. 1: MetadataData-Ed: Data Systems Integration & Business Value PT. 1: Metadata
Data-Ed: Data Systems Integration & Business Value PT. 1: MetadataData Blueprint
 
Credential Transparency Initiative (CTI) Step 1 – Functional Requirements
Credential Transparency Initiative (CTI)Step 1 – Functional RequirementsCredential Transparency Initiative (CTI)Step 1 – Functional Requirements
Credential Transparency Initiative (CTI) Step 1 – Functional RequirementsIllinois workNet
 
FAIR, standards and FAIRsharing - MAQC Society 2019
FAIR, standards and FAIRsharing - MAQC Society 2019FAIR, standards and FAIRsharing - MAQC Society 2019
FAIR, standards and FAIRsharing - MAQC Society 2019Susanna-Assunta Sansone
 

Similar to Semantic interoperability courses training module 3 - reference data v0.10 (20)

chapter8-220725121547-f85998bb.pdf
chapter8-220725121547-f85998bb.pdfchapter8-220725121547-f85998bb.pdf
chapter8-220725121547-f85998bb.pdf
 
‏‏Chapter 8: Reference and Master Data Management
‏‏Chapter 8: Reference and Master Data Management ‏‏Chapter 8: Reference and Master Data Management
‏‏Chapter 8: Reference and Master Data Management
 
Credit Suisse: Multi-Domain Enterprise Reference Data
Credit Suisse: Multi-Domain Enterprise Reference DataCredit Suisse: Multi-Domain Enterprise Reference Data
Credit Suisse: Multi-Domain Enterprise Reference Data
 
FAIRification is a Team Sport: FAIRsharing and the FAIR Cookbook
FAIRification is a Team Sport: FAIRsharing and the FAIR CookbookFAIRification is a Team Sport: FAIRsharing and the FAIR Cookbook
FAIRification is a Team Sport: FAIRsharing and the FAIR Cookbook
 
FAIR: standards and services
FAIR: standards and servicesFAIR: standards and services
FAIR: standards and services
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 
Instructions1.Project Assignment OutlineDecision level of t.docx
Instructions1.Project Assignment OutlineDecision level of t.docxInstructions1.Project Assignment OutlineDecision level of t.docx
Instructions1.Project Assignment OutlineDecision level of t.docx
 
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATION
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATIONCLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATION
CLUSTERING MODELS FOR MUTUAL FUND RECOMMENDATION
 
Dc2010 fanning
Dc2010 fanningDc2010 fanning
Dc2010 fanning
 
Ecm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sampleEcm implementation planning_workshop_hospital_sample
Ecm implementation planning_workshop_hospital_sample
 
Data-Ed: Trends in Data Modeling
Data-Ed: Trends in Data ModelingData-Ed: Trends in Data Modeling
Data-Ed: Trends in Data Modeling
 
Data-Ed Online: Trends in Data Modeling
Data-Ed Online: Trends in Data ModelingData-Ed Online: Trends in Data Modeling
Data-Ed Online: Trends in Data Modeling
 
Data-Ed: Metadata Strategies
 Data-Ed: Metadata Strategies Data-Ed: Metadata Strategies
Data-Ed: Metadata Strategies
 
Knowledge Services
Knowledge ServicesKnowledge Services
Knowledge Services
 
Bi
BiBi
Bi
 
Data Systems Integration & Business Value Pt. 1: Metadata
Data Systems Integration & Business Value Pt. 1: MetadataData Systems Integration & Business Value Pt. 1: Metadata
Data Systems Integration & Business Value Pt. 1: Metadata
 
Data-Ed: Data Systems Integration & Business Value PT. 1: Metadata
Data-Ed: Data Systems Integration & Business Value PT. 1: MetadataData-Ed: Data Systems Integration & Business Value PT. 1: Metadata
Data-Ed: Data Systems Integration & Business Value PT. 1: Metadata
 
Credential Transparency Initiative (CTI) Step 1 – Functional Requirements
Credential Transparency Initiative (CTI)Step 1 – Functional RequirementsCredential Transparency Initiative (CTI)Step 1 – Functional Requirements
Credential Transparency Initiative (CTI) Step 1 – Functional Requirements
 
FAIR, standards and FAIRsharing - MAQC Society 2019
FAIR, standards and FAIRsharing - MAQC Society 2019FAIR, standards and FAIRsharing - MAQC Society 2019
FAIR, standards and FAIRsharing - MAQC Society 2019
 
Unit Ii
Unit IiUnit Ii
Unit Ii
 

More from Semic.eu

Semic 2016 highlight report
Semic 2016 highlight reportSemic 2016 highlight report
Semic 2016 highlight reportSemic.eu
 
Jose Arcos DG COMM- Data Visualisation on ec.europa.eu
Jose Arcos DG COMM- Data Visualisation on ec.europa.euJose Arcos DG COMM- Data Visualisation on ec.europa.eu
Jose Arcos DG COMM- Data Visualisation on ec.europa.euSemic.eu
 
The creation of an international core data model
The creation of an international core data modelThe creation of an international core data model
The creation of an international core data modelSemic.eu
 
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabulary
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabularyIsa case study_how_to_describe_organizations_in_rdf_core_business_vocabulary
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabularySemic.eu
 
Semic 2014 highlights report
Semic 2014 highlights report Semic 2014 highlights report
Semic 2014 highlights report Semic.eu
 
Semic 2013 highlights report
Semic 2013 highlights report Semic 2013 highlights report
Semic 2013 highlights report Semic.eu
 
Semic 2012 highlights report
Semic 2012 highlights report Semic 2012 highlights report
Semic 2012 highlights report Semic.eu
 
SEMIC 2015 Highlights Report
SEMIC 2015 Highlights ReportSEMIC 2015 Highlights Report
SEMIC 2015 Highlights ReportSemic.eu
 
Semantische Standards in der Öffentlichen Verwaltung in Europa
Semantische Standards in der Öffentlichen Verwaltung in EuropaSemantische Standards in der Öffentlichen Verwaltung in Europa
Semantische Standards in der Öffentlichen Verwaltung in EuropaSemic.eu
 
Promoting semantic interoperability between public administrations in Europe
Promoting semantic interoperability between public administrations in EuropePromoting semantic interoperability between public administrations in Europe
Promoting semantic interoperability between public administrations in EuropeSemic.eu
 
Survey on metadata management and governance in Europe
Survey on metadata management and governance in EuropeSurvey on metadata management and governance in Europe
Survey on metadata management and governance in EuropeSemic.eu
 

More from Semic.eu (11)

Semic 2016 highlight report
Semic 2016 highlight reportSemic 2016 highlight report
Semic 2016 highlight report
 
Jose Arcos DG COMM- Data Visualisation on ec.europa.eu
Jose Arcos DG COMM- Data Visualisation on ec.europa.euJose Arcos DG COMM- Data Visualisation on ec.europa.eu
Jose Arcos DG COMM- Data Visualisation on ec.europa.eu
 
The creation of an international core data model
The creation of an international core data modelThe creation of an international core data model
The creation of an international core data model
 
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabulary
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabularyIsa case study_how_to_describe_organizations_in_rdf_core_business_vocabulary
Isa case study_how_to_describe_organizations_in_rdf_core_business_vocabulary
 
Semic 2014 highlights report
Semic 2014 highlights report Semic 2014 highlights report
Semic 2014 highlights report
 
Semic 2013 highlights report
Semic 2013 highlights report Semic 2013 highlights report
Semic 2013 highlights report
 
Semic 2012 highlights report
Semic 2012 highlights report Semic 2012 highlights report
Semic 2012 highlights report
 
SEMIC 2015 Highlights Report
SEMIC 2015 Highlights ReportSEMIC 2015 Highlights Report
SEMIC 2015 Highlights Report
 
Semantische Standards in der Öffentlichen Verwaltung in Europa
Semantische Standards in der Öffentlichen Verwaltung in EuropaSemantische Standards in der Öffentlichen Verwaltung in Europa
Semantische Standards in der Öffentlichen Verwaltung in Europa
 
Promoting semantic interoperability between public administrations in Europe
Promoting semantic interoperability between public administrations in EuropePromoting semantic interoperability between public administrations in Europe
Promoting semantic interoperability between public administrations in Europe
 
Survey on metadata management and governance in Europe
Survey on metadata management and governance in EuropeSurvey on metadata management and governance in Europe
Survey on metadata management and governance in Europe
 

Recently uploaded

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Recently uploaded (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Semantic interoperability courses training module 3 - reference data v0.10

  • 1. Semantic Interoperability Courses Course Module 3 Reference Data Management V0.10 June 2014 ISA Programme, Action 1.1
  • 2. Disclaimer This training material was prepared for the ISA programme of the European Commission by PwC EU Services. The views expressed in this report are purely those of the authors and may not, in any circumstances, be interpreted as stating an official position of the European Commission. The European Commission does not guarantee the accuracy of the information included in this study, nor does it accept any responsibility for any use thereof. Reference herein to any specific products, specifications, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favouring by the European Commission. All care has been taken by the author to ensure that s/he has obtained, where necessary, permission to use any parts of manuscripts including illustrations, maps, and graphs, on which intellectual property rights already exist from the titular holder(s) of such rights or from her/his or their legal representative. SEMIC SEMANTIC INTEROPERABILITY COMMUNITY 2
  • 3. Learning Objectives By the end of this training you should have an understanding of:  What reference data is, its context and purpose and how it creates value for organisations.  Why it is important to manage and govern the reference data lifecycle.  How to work with reference data using open-source tools. 3
  • 4. Outline • Definitions • Reference data • Reference data has many names: code list, taxonomy, thesaurus, mapping, name authority list • Importance & relevance 1. Introduction: what is reference data? • What is reference data management • Why is managing reference data important? • Design • Change management • Documentation • Harmonisation 2. Why must reference data be properly managed? 4
  • 5. Definitions What is reference data? Reference data is small, discrete sets of values that are not updated as part of business transactions but are usually used to impose consistent classification. Reference data normally has a low update frequency. Reference data is relevant across more than one business system belonging to different organisations and sectors. European Commission – ISA Programme, 2014 (1) 5
  • 6. Example: Country Code Named Authority Lists The table below displays an extract of the “Countries” code list as published on the Metadata Registry (MDR) of the EU: Authority Code Short Name Long Name AND Andorra Principality of Andorra ALB Albania Republic of Albania AUT Austria Republic of Austria BIG Bosnia and Herzegovina Bosnia and Herzegovina … … … http://publications.europa.eu/mdr/resource/authority/country/html/countries-eng.html#description 6
  • 7. Reference data has many names What is considered reference data? 7 • Code list: Complete set of data element values of a coded simple data element [ISO 9735-1:2002, 4.14] • Taxonomy: scheme of categories and subcategories that can be used to sort and otherwise organize items of knowledge or information [ISO/DIS 25964-2]. • Thesaurus: controlled and structured vocabulary in which concepts are represented by terms, organized so that relationships between concepts are made explicit, and preferred terms are accompanied by lead-in entries for synonyms or quasi-synonyms [ISO 25964-1:2011] • Mapping: relationship between a concept in one vocabulary and one or more concepts in another [ISO/DIS 25964-2]. • Name authority list: controlled vocabulary for use in naming particular entities consistently [ISO/DIS 25964-2] Source: https://joinup.ec.europa.eu/svn/adms/ADMS_v1.00/ADMS_SKOS_v1.00.html
  • 8. Importance Where is reference data used? Within information systems • For categorising and identifying data • E.g. assigning personnel to a department from a list of predefined values Between Information Systems • For information sharing • E.g. using a code list to describing the context of data which is exchanged between systems over different member states. This ensures that member states are ‘talking’ about the same data. Reference data… is just data! 8
  • 9. Relevance of common reference data Why is common reference data important? To avoid semantic interoperability conflicts •By using a common set of values for describing data which is exchanged between different systems, interoperability conflicts can be avoided. •Please refer to training module 1 for more information on interoperability concepts and more specifically semantic interoperability To avoid the need for mappings •Mappings between different value sets of reference data are often inaccurate. By using common value sets of reference data across domains and IT systems, the need for creating mappings can be avoided. 9
  • 10. Outline • Definitions • Reference data • Reference data has many names: code list, taxonomy, thesaurus, mapping, name authority list • Importance & relevance 1. Introduction: what is reference data? • What is reference data management • Why is managing reference data important? • Design • Change management • Documentation • Harmonisation 2. Why must reference data be properly managed? 10
  • 11. Definitions What is reference data management? Reference data management comprises planning, implementation & control activities to ensure consistency with “golden version” of contextual data values. Reference Data Management is control over defined domain values (also known as vocabularies), including control over standardized terms, code values and other unique identifiers, business definitions for each value, business relationships within and across domain value lists, and the consistent, shared used of accurate, timely and relevant reference data values to classify and categorize data 11 DAMA International, 2009, http://www.dama.org/i4a/pages/index.cfm?pageid=1
  • 12. Relevance Why is metadata management important? • To ensure the use of a common setting • To ensure continuity and quality of service • To take decisions and manage changes in a controlled fashion • To prevents conflicts between versions (version control) • To improve data quality 12
  • 13. Reference data management Lifecycle 1. Data Design 2. Change Management 3. Documentation 4. Harmonisation 5. Implementation 13
  • 14. 1. Data Design What •Develop thesauri, value sets, code lists, etc. •Select and reuse existing reference data sets Why •Impose consistent classification of data •Improve data quality •Reduce interoperability issues How •Tools •VocBench •PoolParty •ListPoint 14
  • 15. 1. Reference Data Design | Tools PoolParty PoolParty is a tool for creating thesauri, taxonomies and knowledge graphs based on W3C standards such as SKOS, RDF and SPARQL. (Semantic Web Company, 2014) 15
  • 16. 1. Reference Data Design | Tools VocBench VocBench is a web-based, multilingual, vocabulary editing and workflow tool (W3C, 2001). It manages thesauri , authority lists and glossaries using SKOS-XL. (FAO, 2014) Listpoint Listpoint is an open reference data platform combined with online tools to find and combine data standards and code lists. Moreover, it helps users to make datasets interoperable and kept up-to-date with updates. (Listpoint, 2014). 16
  • 17. 2. Change Management What •A combination of management processes for incorporating changes to value sets. Why •Maintaining control over the value sets and the change process •Taking into account the needs of stakeholders when adapting reference data How •Defining each step in the change process and assigning roles which are described in a governance structure •Incorporating quality control measures 17
  • 18. 2. Change Management In a managed master data environment, specific individuals have the role of a business data steward. They have the authority to create, update, and retire reference data values, and to a lesser extent, in some circumstances, master data values,. Business data stewards work with data professionals to ensure the highest quality reference and master data. Many organizations define more specific roles and responsibilities, with individuals often performing more than one role. Steps in change management are: 1. Create and receive change requests 2. Identify the related stakeholders and understand their interest 3. Identify and evaluate the impacts of the proposed changes 4. Decide to accept or reject the change, or recommend a decision to management or governance 5. Review and approve or deny the recommendation, if needed 6. Communicate the decision to stakeholders prior to making the change 7. Update the data 8. Inform stakeholders the change has been made 18
  • 19. 3. Documentation What •Representing the reference data value sets following international standards •Describing the value sets in a uniform way Why •To avoid misinterpretation of the value set •To ensure machine- readability •Description: to facilitate searching and retrieving reference data from a repository How •Representation •SKOS •GeneriCode •XSD •HTML •Publication •Metadata Registry •Description •ADMS 19
  • 20. 3. Documentation | XML representation XML Extensible Markup Language (XML) is a simple, very flexible text format derived from SGML (ISO 8879). It permits to represent reference data in many different ways. XML extract for representing Andalusia in the Countries NAL <record adm.status="current" date.creation="2010-01-01" IMMC.approval.date="2012-06-27" IMMC.proposal.date="2011-10-06" pub="false" celex="false" deprecated="false" id="COU0001"> <code-3166-1-alpha-2>AD</code-3166-1-alpha-2> <code-3166-1-alpha-3>AND</code-3166-1-alpha-3> <code-3166-1-num>020</code-3166-1-num> <authority-code>AND</authority-code> <code-iana>.ad</code-iana> <code-tir>AND</code-tir> <name><original.name> <lg.version lg="cat">Andorra</lg.version> </original.name></name> <label> <lg.version lg="bel" script="Cyrillic">Андора</lg.version> <lg.version lg="bos">Andora</lg.version> … </label> </record> 20
  • 21. 3. Documentation | SKOS representation SKOS is an area of work developing specifications and standards to support the use of knowledge organization systems (KOS). SKOS extract for representing Andalusia in the Countries NAL <skos:Concept rdf:about="http://publications.europa.eu/resource/authority/country/AND" at:deprecated="false"> <skos:inScheme rdf:resource="http://publications.europa.eu/resource/authority/country"/> <at:authority-code>AND</at:authority-code> <at:op-code>AND</at:op-code> <atold:op-code>AND</atold:op-code> <dc:identifier>AND</dc:identifier> <at:start.use>1950-05-09</at:start.use> <skos:prefLabel xml:lang="be">Андора</skos:prefLabel> <skos:prefLabel xml:lang="bs">Andora</skos:prefLabel> <skos:prefLabel xml:lang="bg">Андора</skos:prefLabel> … </skos:Concept> 21
  • 22. 3. Documentation | XSD representation XSD: XML Schemas express shared vocabularies and allow machines to carry out rules made by people. XSD extract for representing a country in the Countries NAL <!--RECORD DEFINITION--> <xs:element name="record"> <xs:complexType> <xs:sequence> <xs:element ref="code-3166-1-alpha-2"/> <xs:element ref="code-3166-1-alpha-3"/> <xs:element ref="code-3166-1-num"/> <xs:element maxOccurs="unbounded" ref="code-3166-3" minOccurs="0"/> <xs:element ref="authority-code"/> <xs:element ref="op-styleguide" minOccurs="0"/> <xs:element maxOccurs="unbounded" ref="code-iana" minOccurs="0"/> <xs:element ref="code-tir" minOccurs="0"/> <xs:element ref="name"/> <xs:element ref="label"/> … </xs:sequence> </xs:complexType> </xs:element > 22
  • 23. 3. Documentation | Genericode representation GeneriCode Genericode defines a standard format for defining code lists (also known as enumerations or controlled vocabularies). It contains: • a standard model and XML representation for the contents of a code list; • a standard model and XML representation for data associated with items in a code list; • a standard model and XML representation for how new code lists are derived from existing code lists. “Genericode not only provides a representation of the items in a code list, it also provides an audit trail for how that code list is related to previous versions of the code list, or to other code lists. This simplifies the effort of understanding how a new code list version differs from the previous version, and simplifies the effort in calculating the impact of the change on existing systems and processes.” (Genericode, 2014) 23
  • 24. 3. Documentation | HTML representation HTML (Hyper Text Markup Language) is used to describe documents 24
  • 25. 3. Documentation | ADMS description ADMS The Asset Description Metadata Schema (ADMS) is a common way to describe semantic interoperability assets making it possible for everyone to search and discover them. ADMS allows public administrations, businesses, standardisation bodies and academia to (European Commission – ISA Programme, 2011): • “describe semantic assets in a common way so that they can be seamlessly cross- queried and discovered by ICT developers from a single access point, such as Joinup; • search, identify, retrieve, compare semantic assets to be reused avoiding duplication and expensive design work through a single point of access; • keep their own system for documenting and storing semantic assets; • improve indexing and visibility of their own assets; • link semantic assets to one another in cross-border and cross-sector settings.” 25 https://joinup.ec.europa.eu/asset/adms/description
  • 26. 3. Documentation | Publication Metadata Registry A best practice in reference data management is to publish value sets on an authoritative source. An example of such a source is the Metadata Registry (MDR) of the EU, which is maintained by the Publications Office. The MDR registers and maintains metadata used by European Institutions involved in the legal decision making process. 26 http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=35343
  • 27. 4. Harmonisation What Why •To foster interoperability with reference data value sets which are represented using a different standard How •Reference Data Mappings •Tool: Silk Workbench 27 • The alignment of structural metadata used for information exchange either through the creation of mappings between terms of two or more specifications for structural metadata or by forging a wide consensus on the use of a common specification.
  • 28. 4. Harmonisation | Example The table below shows a mapping of the Publications Office Named Authority List for countries with the ISO 3166 standard. Authority Code ISO 3166 Short Name Long Name AND AND Andorra Principality of Andorra ALB ALB Albania Republic of Albania AUT AUT Austria Republic of Austria BIH BIH Bosnia and Herzegovina Bosnia and Herzegovina … … … … http://publications.europa.eu/mdr/resource/authority/country/html/countries-eng.html#description 28
  • 29. 4. Harmonisation | Silk Workbench Tool: Silk Workbench The Silk framework is a tool for discovering relationships between data items within different Linked Data sources. European Commission – ISA Programme, 2014 (2) 29
  • 30. 5. Implementation What •Propagating reference data changes into the software development lifecycle •Manage and support the exchange of information between systems Why •Coordinated use of reference data •Reference data has a lifecycle and needs to be updated •Improving reusability How •Manual or automatic propagation •In case of automatic propagation, changes to reference data into operational systems should be controlled by governance processes 30
  • 31. Implementation of reference data in information systems • To manage and support the exchange of information between systems, the propagation of changes to reference data is needed • Can be done automatically or manual • Propagation of reference data changes needs to be part of the software development lifecycle in order to ensure coordinated and timely updates of reference data in all information systems involved. 31
  • 32. References European Commission – ISA Programme. (2011). Asset Description Metadata Schema (ADMS). Brussels. European Commission - ISA Programme. (2012). D7.1.3 - Study on persistent URIs, with identification of best practices and recommendations on the topic for the MSs and the EC. Brussels. European Commission - ISA Programme. (2012). Asset Description Metadata Schema for Software. Brussels. European Commission – ISA Programme. (2014). D4.1. Metadata management requirements and existing solutions in EU Institutions and Member States. Brussels. European Commission – ISA Programme. (2014). D4.5. Metadata alignment pilot in the EU institutions and MSs. Brussels. 32
  • 33. References W3C. (2001). VocBench. Available at http://www.w3.org/2001/sw/wiki/VocBench. DAMA International. (2009). DAMA International. Available at http://www.dama.org/ FAO. (2014). VocBench. Available at http://aims.fao.org/tools/vocbench-2. Genericode. (2014). What is ‘genericode’? Available at http://www.genericode.org/. Listpoint. (2014). Welcome to Listpoint. Available at https://www.listpoint.co.uk/. Mosley, M., Brackett, M., Earley, S., & Henderson, D. (2009). The DAMA Guide to The Data Management Body of Knowledge (DAMA-DMBOK Guide). New Jersey: Technics Publications, LLC. 33
  • 34. References European Commission - ISA Programme. (2012). ADMS Controlled Vocabularies. Available at https://joinup.ec.europa.eu/svn/adms/ADMS_v1.00/ADMS_SKOS_v1.00.html European Commission - ISA Programme. (2012). SEMIC – 10 Rules for Persistent URIs. Available at https://joinup.ec.europa.eu/community/semic/document/10-rules-persistent- uris Publications Office of the EU. (2014). Metadata Registry. Available at http://publications.europa.eu/mdr/resource/authority/country/html/countries- eng.html#description. ISO. (2014). ISO/IEC 11179-1:2004 - Information technology - Metadata registries (MDR) - Part 1: Framework. Available at http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumb er=35343 34
  • 35. Join the SEMIC group on LinkedIn Follow @SEMICeu on Twitter Join the SEMIC community on Joinup Project Officers Vassilios.Peristeras@ec.europa.eu Suzanne.Wigard@ec.europa.eu Athanasios.Karalopoulos@ec.europa.eu Get involvedVisit our initiatives ADMS. SW CORE VOCABULARY PUBLIC SERVICE REGISTERED VOCABULARY ORGANIZATION