1. ALIGNED Data Curation Methods and Tools
Rob Brennan, ALIGNED Coordinator
SWIMing VoCamp Workshop,
Dublin, 22 March 2016
2. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644055.
This communication reflects only the author’s view and the Commission is not responsible for any use that may be made of the information it contains.
4. Data Quality and Data Curation in ALIGNED
• Building high quality data-intensive systems requires high quality datasets
• But
– Datasets are now first class citizens with lifecycles that are independent of the consuming apps
– Quality is still problematic
• We observe:
– Rich data models support quality engineering
– Linked Data is entering the enterprise
5. ALIGNED Tools for Data Curation
Productivity, Agility, Quality
• Data Engineering
• Data Quality Validation
• Unified Process Governance
• Data Integrity Assurance
• Data Integration Assurance
• Semi-Supervised Data Curation
• Linked Data Extract, Transform, Load
• Taxonomy Management
• Dataset Release Automation
See:
http://aligned-project.eu/open-source-tools/
https://www.poolparty.biz/
6. ALIGNED Validates in Real-World, Data-Intensive Systems
• Global History Databank
• Legal Information System
• Nucleus for the Web of Data
• Semantic Middleware
7. Example: Seshat Target System
[Diagram labels: Electronic Archives; databases; Community of experts & Volunteers; Data Consumers; Collective Intelligence; Seshat Databank; High Quality Open Data; Feedback]
“improve the extraction of collective intelligence from electronic archives, research communities and data consumers to improve the quality of published data”
8. Global History Databank Pilot Data Curation System
[Architecture diagram; recoverable labels:]
• Roles: Seshat Contributors, Seshat Editor, Seshat Administrator, Seshat Analyst, Seshat Reader
• Components: Seshat Data Web, Wiki, RDF Triple Store, Linked Data Publication, User Management, Schema Management tool, Wiki Data Entry/Validation Tool, Workflow Management, Wiki Generation tool, Candidate Generation/Filtering tools, Data Quality Controls, Data Export Tool, Time Series Analysis, Data Visualisations, Data Transformations
• Data: Seshat Schema Knowledge Model, Seshat Data Knowledge Model, Copy of Seshat Data, Seshat Data Web Pages, Data Dump File (TSV), Links to other Datasets
• External candidate source: DBpedia
• Flow labels: Read/query, Enter Data, Validate, Candidate, Errors, Feedback, View Data, Read Data, generate
9. Dacura: Generic, Quality-Oriented Data Curation Process
The goal is to minimise the work required from expert users (domain expert, architect) and to ensure data quality along different dimensions at different steps in the process.
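The idea of checking different quality dimensions at different steps can be illustrated with a minimal sketch. This is not Dacura's implementation: the step names, quality dimensions, record fields and check functions below are all hypothetical, chosen only to show automated checks reporting errors so that experts need to look only at the records that fail.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# A check takes a candidate record and returns an error message,
# or None if the record passes. (Hypothetical interface.)
Check = Callable[[dict], Optional[str]]


@dataclass
class Step:
    name: str        # hypothetical process step, e.g. "harvest"
    dimension: str   # hypothetical quality dimension checked at this step
    check: Check


@dataclass
class Result:
    record: dict
    errors: list = field(default_factory=list)

    @property
    def accepted(self) -> bool:
        # A record is accepted only if no step reported an error.
        return not self.errors


def check_complete(record: dict) -> Optional[str]:
    # Completeness: every required field must be present and non-empty.
    missing = [k for k in ("polity", "variable", "value") if not record.get(k)]
    return f"missing fields: {', '.join(missing)}" if missing else None


def check_year_range(record: dict) -> Optional[str]:
    # Consistency: a date range must not end before it starts.
    start, end = record.get("start_year"), record.get("end_year")
    if start is not None and end is not None and start > end:
        return f"start_year {start} is after end_year {end}"
    return None


def check_has_source(record: dict) -> Optional[str]:
    # Provenance: each value should cite a source.
    return None if record.get("source") else "no source cited"


# Illustrative ordering only; the real process steps are not given on the slide.
STEPS = [
    Step("harvest", "completeness", check_complete),
    Step("validate", "consistency", check_year_range),
    Step("review", "provenance", check_has_source),
]


def curate(record: dict, steps=STEPS) -> Result:
    """Run every step's check and collect the errors for expert review."""
    result = Result(record)
    for step in steps:
        message = step.check(record)
        if message:
            result.errors.append(f"{step.name}/{step.dimension}: {message}")
    return result
```

Calling `curate(record)` on a made-up record returns a `Result` whose `errors` list names the step and dimension of each failed check, so only rejected records would need expert attention.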
11. Partners
• Knowledge and Data Engineering Group/ADAPT Centre, Trinity College Dublin
• Software Engineering Group, University of Oxford
• Institute of Cognitive and Evolutionary Anthropology, University of Oxford
• Agile Knowledge Engineering and Semantic Web Group, Universität Leipzig
• Semantic Web Company GmbH
• Content Strategy and Architecture Department, Wolters Kluwer Germany, Wolters Kluwer Poland
• Institute of Prehistory, Adam Mickiewicz University at Poznan
12. We want to help you!
The ALIGNED Consultancy Program
• Are you a business?
• Do any of these apply?
– Are you building data-intensive applications?
– Do you want to curate high quality data?
– Do you need help integrating Linked Data and apps?
– Do you want to integrate your software and data engineering teams?
Call on the ALIGNED consultancy program!
http://aligned-project.eu/aligned-consultancy-program-opportunities/