ERIM, REDm-MED and Orbital (II)
1. Managing Engineering Research Data
at the University of Lincoln
Part 2 of 2
Mansur Darlington
20 January 2012
1 Orbital DMP Meeting 20.01.12
2. THE NATURE OF ENGINEERING RESEARCH DATA
3. ERIM
Engineering Research Information
Management
JISC MRD Programme Phase 1
http://www.ukoln.ac.uk/projects/erim/
4. ERIM Project Overview
• Primarily associated with the engineering
research domain.
• To better understand the research data that
are collected, generated and used in
engineering research activities.
• To better understand the context in which the
data are collected, generated and used.
• To inform the way that the data can be
managed so that they are more easily used
or re-used.
• To increase their value to the community.
5. The Aims
• Achieving an understanding of the diversity
and character of research data.
• Devising a means by which research data
can be classified in respect of research data
management.
• Developing models of the research data life-
cycle which characterize the information flow
in the research process and identify critical
points in the management process.
• Providing exemplars of best-practice
research data management strategies.
6. The Objectives
• Identify opportunities for and the benefits of
research data re-use and re-purposing.
• Identify the contextual, technical, legal and
social barriers to the re-use and repurposing
of research data.
• Establish whether and what data might be
used in a raw form, what data would require
reprocessing and how this might be achieved.
• Understand what contextual information is
required for research data to be understood
for the purpose of re-use.
7. Theoretical Elements to the Research
1. Understanding & Defining the ‘Space’
2. Terminology
https://wiki.bath.ac.uk/display/ERIMterminology/ERIM+Terminology+V4
3. Identifying the Objects in the space
4. Understanding the Relationships between
Objects
5. Modelling the Relationships
6. Understanding the Outcomes
8. Understanding & Defining the ‘Space’
[Figure: Raw Research Data feeding Current Research and Future Research through two management spaces]
CURRENT-ACTIVITY MANAGEMENT SPACE (for Current Research)
• PURPOSING: Making data available and fit for the current known purpose. Management for A by A.
FUTURE-ACTIVITY MANAGEMENT SPACE (for Future Research)
• Supporting data RE-USE: Managing data such that it will be available for a future unknown purpose. Management for X by Y, where Y can be X.
• RE-PURPOSING: Making data available and fit for a future known purpose. Management for X by Y.
9. ERIM Project Terminology for
Research Data Management
https://wiki.bath.ac.uk/display/ERIMterminology/ERIM+Terminology+V4
(source definitions for terms found in slides 10-14)
10. 2. The Terminology I
Preparation Activities:
• Data Purposing: Making research data available and fit for the
current research activity.
• Data Re-purposing: Making existing research data available
and fit for a future known research activity.
• Supporting Data Re-use: Managing existing research data
such that it will be available for a future unknown research activity.
Use Activities:
• Data Use: Using research data for the current research
purpose/activity to infer new knowledge about the research subject.
• Data Re-use: Using research data for a research purpose other
than that for which it was intended.
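The three preparation activities differ only in whether the data serve the current activity and, if not, whether the future purpose is known. A minimal Python sketch of that distinction follows; the function name and its two boolean parameters are illustrative assumptions, not ERIM terminology.

```python
from enum import Enum


class Preparation(Enum):
    PURPOSING = "Data Purposing"
    RE_PURPOSING = "Data Re-purposing"
    SUPPORTING_RE_USE = "Supporting Data Re-use"


def classify_preparation(for_current_activity: bool,
                         purpose_known: bool) -> Preparation:
    """Map the two distinctions in the definitions onto a preparation activity.

    Both parameter names are hypothetical, introduced for this sketch only.
    """
    if for_current_activity:
        # Data prepared for the current research activity.
        return Preparation.PURPOSING
    if purpose_known:
        # Future activity whose purpose is already known.
        return Preparation.RE_PURPOSING
    # Future activity whose purpose is not yet known.
    return Preparation.SUPPORTING_RE_USE
```

For example, `classify_preparation(False, False)` returns `Preparation.SUPPORTING_RE_USE`.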
11. 2. The Terminology II
What do we Mean by DATA?
• Data. Reinterpretable representations of information
in a formalized manner suitable for communication,
interpretation or processing.
• Information. Any type of knowledge that can be
exchanged. In an exchange, it is represented by
data.
***********************
• Data Object. Either a physical object or a digital
object containing data.
• Data Record. A data object created, received and
maintained as evidence of an activity.
• Data Case. The set of data records associated with
some discrete research activity (project, task,
experiment, etc.).
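The nesting of these three terms (a data record is a kind of data object, and a data case is a set of data records) can be sketched as a small Python data model. The class and field names below are assumptions made for illustration; the ERIM terminology defines the concepts, not this representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataObject:
    """A physical or digital object containing data."""
    name: str
    digital: bool = True  # False would denote a physical object


@dataclass
class DataRecord(DataObject):
    """A data object created, received and maintained as evidence of an activity."""
    activity: str = ""


@dataclass
class DataCase:
    """The set of data records associated with one discrete research activity."""
    activity: str
    records: List[DataRecord] = field(default_factory=list)

    def add(self, record: DataRecord) -> None:
        self.records.append(record)
```

A case is then built up record by record, for example `case = DataCase("experiment 1")` followed by `case.add(DataRecord("stress_log.csv", activity="experiment 1"))`, where the file name is hypothetical.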
13. 4. Understanding the Relationships
[Table: Data Development activities (rows) against Data Preparation in Purposing, Supporting Re-use and Re-purposing (columns)]
Data Development:
• Association
• Aggregation
• Annotation
• Augmentation
• Collection
• Collation
• Generation
• Derivation
• Refinement
• Migration
14. 6. Understanding the Outcomes
Management and Development Side-effects
• Information Loss
• Information Gain
• Function Loss
• State Loss
15. 5. Modelling the Objects’ Relationships
[Figure: RESEARCH TIME LINE at the data level, showing data records RDR1 to RDR6 and CDR1 linked by Gather, Refine, Derive, Associate and Aggregate steps; for example, Gather RDR1, then Refine to RDR1' and again to RDR1'']
17. Research Data Scoping Survey
• 12 Research Cases
• 46 Data Assets
• 12 questions for each
• Researchers from:
Bath, Lancaster, Leeds,
Salford, Strathclyde,
Heriot-Watt
18. Scoping Survey Targets
1. Airframe Stress Data Reuse
2. Snow Mobile Design Activity Observation
3. Aerospace Cost Forecasting
4. Large-Scale Metrology Shared Resources
5. Form-fill-feed Packaging Modelling
6. CNC Machine Measurement
7. Cryogenic Machining
8. Information Management Tool
9. Knowledge Enhanced Notes
10. Service Design Research
11. Design Activity & Knowledge Capture Research
12. Understanding the Learning Organization
19. Selecting 5 Case Studies using Binary Classification
Classification dimensions:
• Research Generated Data vs Pre-existing Data
• Homogeneous Media vs Heterogeneous Media
• Descriptive vs Prescriptive
• Real vs Simulated
Selected case studies:
1. Costing
2. Company case studies
3. Programming
4. Interview analysis
5. Metrology
20. ERIM Research Summary
• Designed and carried out scoping survey.
• Theory development and revision.
• Designed and carried out audit on 5 cases.
• RAID-modelled the case research activities.
• Completed analysis and characterization of
audit cases.
• Identified barriers to data re-use and
strategies to mitigate.
• Established critical points in Information Flow.
• Developed research data management plan
process ‘cascade’.
21. Key ERIM Research Findings I
• Great diversity of data type and quality.
• Complex and chaotic nature of data
development.
• Outputs not linked to data.
• Supporting documents not situated with the data
files.
• Little use of metadata to support future use.
• Immature understanding of benefits of sharing
and thus need for management.
• Limited understanding of the barriers to or
opportunities for information sharing and re-use.
22. Key ERIM Research Findings II
Poor framework for:
• pre-project considerations of data management.
• data management during the research.
• during-project data management for post-project re-
use.
Poor knowledge of context in which data were
generated:
• engineering research data are very diverse.
• large number of diverse research data records.
• relations between data records complex.
Knowing the context is vital for understanding data.
23. What Needs Managing?
• Research Data:
– Data, Data Objects, Information.
• Their life cycle processes:
– Collection
– Generation
– Development
– Organization
– Disposal, etc.
• The process of data management itself.
25. What is the Purpose of RDM Planning?
• Reduction in duplicated work.
• Inspiration for new/continuation research &
funding.
• Greater transparency of research.
• Improved basis for validation.
• Obviating the need for re-collection and
generation.
• Providing basis for reliable data citation.
• Increasing scholarly output.
Relies upon RE-USING DATA
26. Amenability Criteria & ‘Re-usefulness’
‘To manage research data such that they are
highly amenable to re-use.’
‘What is the nature of these data that makes them more
or less amenable to re-use?’
• Findability
• Readability
• Comprehensibility
• Interpretability
• Admissibility
• Desirability
Data ‘RE-USEFULNESS’
(some data will remain forever re-useless)
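One way to apply the six amenability criteria is as a checklist run against each data record. The sketch below assumes each criterion can be answered yes or no, which is a simplification introduced here; the slide does not say how the criteria are to be assessed, and the function name is hypothetical.

```python
# The six criteria named on the slide, in the order listed.
CRITERIA = (
    "findability",
    "readability",
    "comprehensibility",
    "interpretability",
    "admissibility",
    "desirability",
)


def unmet_criteria(assessment: dict) -> list:
    """Return the criteria that are missing from the assessment or marked False."""
    return [c for c in CRITERIA if not assessment.get(c, False)]
```

A record assessed as `{"findability": True, "readability": True}` would still have four criteria outstanding.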
28. DMP Task Dimensions and Topics
• Access, Data Sharing and Re-use.
• Data Types, Format, Standards and Capture
Methods.
• Ethical and Privacy Concerns.
• Resourcing.
• Short-term Storage and Data Management.
• Deposit and Long-term preservation.
• DMP Adherence and Review.
DCC DMP Checklist
29. Data Management is not just about
STORAGE!
30. Guidance Required to Support Tasks
• Data management planning.
• Data management execution.
• Bid submission.
• Project planning.
• During-project management whilst doing the
research.
• Collaboration with colleagues, industry and
others.
• Supporting data use, re-use, re-purposing.
• Preparation for long-term preservation.
• End-of-life concerns.
31. The Two Stages of RDM
‘Managing research data such that they are
highly amenable to re-use.’
• Good DM planning provides the
potential to increase data re-usefulness.
• The execution of good DMPs promotes
data re-usefulness.
32. Key Building Blocks for Practical Data Management
1. DMP Guidance, Documentation &
Procedures.
2. Storage and security
3. Data Organization
4. Data Documentation
33. 1. DMP Guidance, Documentation & Procedures
• Principles for Engineering Research Data Management (erim6rep101028mjd)
• Thematic Analysis of Data Management Plan Tools and Exemplars (erim6rep100701ab)
• [1] Engineering Research Data Management Plan Requirement Specification (erim6rep100901ab), being a specification for [2]
• [2] REDm-MED: The Draft Model DMP for Mech. Eng. Depts. (erim6rep101015mjd), being an implementation of [1]
• IdMRC Projects Data Management Plan
• RAIDmap Use Cases (erim6rep101125mjd)
• RAID Associative Tool Specification (erim6rep101109mjd)
• Prototype RAIDmap Tool
34. Security: Document/Data Access Levels
Level 1 – In the public domain with an unrestricted
readership and can be distributed at will.
Level 2 – Viewable by any individual who has
password access to the main project web site.
Level 3 – Viewable by any individual who has
password access to the main project file store and is
a project-affiliated member of a university research
team.
Level 4 – For sensitive documents. Must carry a
distribution list identifying for whom it is intended and
be disseminated through a nominated secure
password-protected portal. Distribution controlled by
PI and Collaborator liaison officer.
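The four access levels can be expressed as a simple permission check. The following Python sketch is one possible reading of the rules above, not an implementation used by the project: the `Reader` fields and the `may_view` function are hypothetical names, and the levels are not treated as cumulative because the slide does not say that they are.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet


class AccessLevel(IntEnum):
    PUBLIC = 1
    PROJECT_WEB_SITE = 2
    PROJECT_FILE_STORE = 3
    SENSITIVE = 4


@dataclass(frozen=True)
class Reader:
    name: str
    has_web_site_password: bool = False
    has_file_store_password: bool = False
    is_project_affiliated: bool = False


def may_view(reader: Reader, level: AccessLevel,
             distribution_list: FrozenSet[str] = frozenset()) -> bool:
    """Apply the rule stated for each level, taken independently."""
    if level == AccessLevel.PUBLIC:
        return True
    if level == AccessLevel.PROJECT_WEB_SITE:
        return reader.has_web_site_password
    if level == AccessLevel.PROJECT_FILE_STORE:
        return reader.has_file_store_password and reader.is_project_affiliated
    # Level 4: only readers named on the document's distribution list.
    return reader.name in distribution_list
```

Delivery of Level 4 documents through the nominated portal, and control of the distribution list by the PI and collaborator liaison officer, are procedural matters outside the scope of this check.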
39. File name example: kim12rep05pjw01.doc
• kim: Project name
• 1: Work package number
• 2: Task number
• rep: Document type
• 05: Document rank number
• pjw: Author initials
• 01: Revision number
• .doc: File type
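A file name coded this way can be split back into its fields mechanically. The Python sketch below parses the example kim12rep05pjw01.doc, reading it as project name (kim), work package number (1), task number (2), document type (rep), document rank number (05), author initials (pjw), revision number (01) and file type (doc). The field widths are assumptions inferred from this single example, and `parse_file_name` is a hypothetical helper.

```python
import re

# Field widths are guessed from the one example file name; adjust as needed.
FILE_NAME_PATTERN = re.compile(
    r"(?P<project>[a-z]+)"       # project name, e.g. "kim"
    r"(?P<work_package>\d)"      # work package number (one digit assumed)
    r"(?P<task>\d)"              # task number (one digit assumed)
    r"(?P<doc_type>[a-z]{3})"    # document type, e.g. "rep"
    r"(?P<rank>\d{2})"           # document rank number
    r"(?P<author>[a-z]{2,3})"    # author initials
    r"(?P<revision>\d{2})"       # revision number
    r"\.(?P<file_type>[a-z0-9]+)"  # file type extension
)


def parse_file_name(file_name: str) -> dict:
    """Split a coded file name into its fields, or raise ValueError."""
    match = FILE_NAME_PATTERN.fullmatch(file_name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {file_name!r}")
    return match.groupdict()
```

Calling `parse_file_name("kim12rep05pjw01.doc")` yields a dictionary with one entry per field, so the context carried in the name stays readable by people and by tools.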
40. 3. Data Organization: 3 steps to document happiness
1. Fill in the document properties (use
metadata!).
2. Make them visible by browsing or by search
– in particular rehabilitate the record TITLE.
3. Use a file NAME coding convention that
captures human-readable context
information.
41. 4. Data Documentation
• RAID-mapping: provision of context.
• Project Document Records:
– Existence and location of key Project Documents
• Project Plan
• Project data management plan
• Project document manifest
– Location of Data Records
– Some description of relations between data
records
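A Project Document Record of this kind could be kept as a small machine-readable file alongside the data. The Python sketch below is an assumption about how such a record might be structured: the class names, field names and JSON layout are illustrative and do not come from the ERIM or REDm-MED documents.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List

# The three key project documents named on the slide.
KEY_DOCUMENTS = (
    "project plan",
    "project data management plan",
    "project document manifest",
)


@dataclass
class Relation:
    source: str  # name of one data record
    target: str  # name of the related data record
    kind: str    # free-text description, e.g. "derived from"


@dataclass
class ProjectDocumentRecord:
    project: str
    key_documents: Dict[str, str] = field(default_factory=dict)  # name -> location
    data_records: Dict[str, str] = field(default_factory=dict)   # name -> location
    relations: List[Relation] = field(default_factory=list)

    def missing_key_documents(self) -> List[str]:
        """List key documents whose location has not been recorded."""
        return [d for d in KEY_DOCUMENTS if d not in self.key_documents]

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)
```

Checking `missing_key_documents()` gives a quick test of whether the existence and location of each key document has been recorded.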