Sharing Data Across Memory Institutions
1. Sharing Data
Across Memory Institutions
David Newbury
Software & Data Architect
J. Paul Getty Trust
July 9th, 2019
Sharing Data Across Memory Institutions - David Newbury (@workergnome) 1
2. My job is to bring together:
• an Archive
• a Library
• a Conservation Science lab
• a Publishing house
• a Museum
• a Foundation
...through software and data.
3. Background:
- Software Developer
- Filmmaker
- Advertiser
- Robot-builder
- Underwear modeler
- Provenance data specialist
4. Two parts to my job:
Data
Software
6. What is data?
Data is information,
structured in a way
that enables use.
10. Museums are
interested in
the buildings.
(but only the important buildings.)
11. Librarians are
interested in
the addresses.
(how do you access the buildings?)
12. Archivists are
interested in
the zoning.
(How is the city organized?)
15. On Modeling & Mapping the Real
When we create data,
we're creating an
abstraction of reality.
16. "When you design and build a
computer system, you first
formulate a model of the problem
you want it to solve, and then
construct the computer program
in its terms."
- Brian Cantwell Smith, The Limits of Correctness (1985)
17. On October 5, 1960, the
American Ballistic Missile Early-
Warning System indicated Soviet
missiles headed towards the
United States.
The moon had risen, and was
reflecting radar signals back to
earth. Needless to say, this lunar
reflection hadn't been predicted
by the system's designers.
18. Whose fault was it?
19. "...every act of conceptualization,
analysis, categorization, does a
certain amount of violence to its
subject matter, in order to get at the
underlying regularities that group
things together."
- Brian Cantwell Smith, The Limits of Correctness (1985)
20. On Exactitude in Science
A 1:1 scale map
is not a useful
abstraction.
21. There will never be
a correct data model.
22. There will only be
useful data models.
23. What is a useful Data Model?
As memory institutions,
we structure and record
information about objects.
We do this because objects
are representations of
shared cultural knowledge.
24. Seven ways to look at objects.
25. Objects are Things.
Objects exist in space and time.
They have weight, and size, and are made of materials.
They are conserved, moved, bought, sold, and described.
These objects are related to people, events, and places
through physical, legal, or social interaction.
Example Data Models:
LIDO, CIDOC-CRM, Schema.org, CDWA, MARC
26. Objects exist in Context.
Objects are grouped, ordered, described, & summarized.
These intellectual structures provide context and
meaning to the objects as part of a larger whole.
Example Data Models:
EAD, ISAD(G), Dewey Decimal, RiC
27. Objects can be Reproductions.
Objects have shared heritage with other objects.
The physical or intellectual connections between a
specific instance and an abstract work it reproduces can
be essential to our understanding of that object.
Example Data Models:
FRBR, BIBFRAME
28. Objects have Proxies.
Objects are represented as structured data.
This includes data and digitized representations of an
object. Proxies often have their own metadata
describing the characteristics and context of the proxy.
Example Data Models:
IIIF, METS, Dublin Core, EXIF
29. Objects are Managed.
Policies and rules govern interaction with objects.
These codify how an object should be stored and what
environment it should be kept in, who can access the
object, and what restrictions apply to that access.
Example Data Models:
RightsStatements.org, PREMIS
30. Objects can contain or embody Representations.
Objects often depict or describe referents. These may be
real times, places, objects, and people, or they may be
fictitious.
Example Data Models:
GeoJSON, EAC, Iconclass, TEI
31. Objects are Intellectual Works.
Objects interpret the world around them.
They are made with intent, within intellectual
frameworks and genres, and others assign meaning and
value to them. They can both participate in and
engender argumentation and scholarship.
Example Data Models:
???
32. It's a little overwhelming,
isn't it?
33. If you wish to make an apple pie
from scratch, you must first
invent the universe.
- Carl Sagan
34. When eating an elephant take
one bite at a time.
- Creighton Williams Abrams Jr.
35. Lessons Learned: Art Tracks
How do you design a data model
that represents the history of an object,
and how do you express it using Linked Data?
36. Art Tracks
Funded by the Institute of Museum and Library Services.
ca. 2013-2015
National Endowment for the Humanities,
Kress Foundation,
Paul Mellon Center
ca. 2016-2017
Originally, Art Tracks was
a data visualization project.
Only, we didn't have data.
37. Traditional Provenance
Durand-Ruel, Paris, August 23, 1872 [1];
Catholina Lambert, New Jersey;
Lambert sale, American Art Association, Plaza Hotel, New York, NY,
February 21, 1916 until February 24, 1916, no. 67;
Durand-Ruel, Paris, until at least 1930;
purchased by Simon Bauer, Paris, by June 1936 [2];
anonymous sale, Parke-Bernet Galleries, Inc., February 25, 1970, no. 19 [3];
Sam Salz, Inc., New York, NY;
purchased by Museum, May 1971.
Notes:
[1] bought from the artist.
[2] Listed and illustrated in "List of Property Removed from France
during the War 1939-1945" (no. 7114, as belonging to Simon Bauer).
[3] "Highly Important Impressionist, Post-Impressionist &
Modern Paintings and Drawings", illustrated.
40. Why do we
need Linked Data?
(When modeling our objects)
41. Linked Open Data.
I'm not going to
talk about if we
should share our data.
43. Are we
Publishers?
Yes.
But we do more than publish
information: we generate our own.
44. Are we
Researchers?
Yes.
But we don't generate random
information, we research specific
objects.
45. We are
Collections.
We don't just collect.
We research, collect, and preserve
information about our objects, as
well as the events, people, and
topics that give them context.
46. The Promise of Linked Data:
[The] creation of a common framework that allows data
to be shared and reused across application, enterprise,
and community boundaries, to be processed
automatically by tools as well as manually, including
revealing possible new relationships among pieces of
data.
- W3C Semantic Web Working Group
47. Linked Data:
Where is this
magical future?
48. What doesn't Linked Data do?
Enable
Web Scale AI
49. What doesn't Linked Data do?
Create Easy
Interoperability
50. What doesn't Linked Data do?
Automate
Reconciliation
51. What doesn't Linked Data do?
Reduce
our Workload
52. An awkward
moment goes here.
(This could be a very short talk.)
53. Linked Data is
not a magic bullet.
It's one of many possible abstract
data models, each of which has
trade-offs.
54. Art Tracks, Phase II
Funded by the National Endowment for the Humanities.
ca. 2016-2017
How to express provenance information as:
• Linked Open Data
• JSON data structure
• Standardized text
All three forms must contain
the same information, and
we must be able to convert
between them.
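As a rough illustration of converting between two of these forms, here is a minimal sketch that splits a standardized provenance line into JSON-ready records and writes it back out. The function names and record fields (`description`, `acquisition`, `footnote`) are assumptions made for this example, not the Art Tracks format, and the parser only handles the simple conventions shown here (semicolon-separated periods, a trailing `[n]` footnote, a leading "purchased by").

```python
import json
import re

# Provenance text taken from the Cassatt example in this talk, on one line.
TEXT = (
    "Mary Cassatt [1844-1926], France; "
    "Galeries Durand-Ruel, Paris, France, by August 1892 [1]; "
    "Durand-Ruel Galleries, New York, NY, 1895; "
    "purchased by Department of Fine Arts, Carnegie Institute, "
    "Pittsburgh, PA, October 1922."
)


def parse_provenance(text):
    """Split provenance text into one record per period of ownership."""
    records = []
    for chunk in text.strip().rstrip(".").split(";"):
        chunk = " ".join(chunk.split())  # collapse line breaks and extra spaces
        note = re.search(r"\s*\[(\d+)\]$", chunk)  # trailing footnote marker
        if note:
            chunk = chunk[:note.start()]
        purchased = chunk.startswith("purchased by ")
        if purchased:
            chunk = chunk[len("purchased by "):]
        records.append({
            "description": chunk,
            "acquisition": "purchase" if purchased else None,
            "footnote": int(note.group(1)) if note else None,
        })
    return records


def format_provenance(records):
    """Write the records back out as standardized text."""
    parts = []
    for record in records:
        part = record["description"]
        if record["acquisition"] == "purchase":
            part = "purchased by " + part
        if record["footnote"] is not None:
            part += f" [{record['footnote']}]"
        parts.append(part)
    return "; ".join(parts) + "."


records = parse_provenance(TEXT)
print(json.dumps(records, indent=2))
```

The useful check is the round trip: if `format_provenance(parse_provenance(TEXT))` gives back the original text, the two forms carry the same information for this example.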
55. Content Mgmt. Systems
What we have.
61. The Four Levels
of Provenance
62. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art.
63. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France; Galeries Durand-Ruel,
Paris, France, by August 1892 [1]; Durand-Ruel Galleries,
New York, NY, 1895; purchased by Department of Fine
Arts, Carnegie Institute, Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
65. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France; Galeries Durand-Ruel,
Paris, France, by August 1892 [1]; Durand-Ruel Galleries,
New York, NY, 1895; purchased by Department of Fine
Arts, Carnegie Institute, Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
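The "Authorities" block is regular enough to read by machine. The sketch below, which assumes the `Label: see URL` layout shown on this slide, turns it into a lookup table and reports which labels occur in a provenance line. The function names are made up for this example.

```python
AUTHORITIES = """\
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
"""


def parse_authorities(block):
    """Map each label to the authority URI that follows ': see '."""
    mapping = {}
    for line in block.splitlines():
        label, separator, uri = line.partition(": see ")
        if separator:  # skip lines that do not follow the pattern
            mapping[label.strip()] = uri.strip()
    return mapping


def find_links(text, authorities):
    """Return (label, uri) pairs for every authority label found in the text."""
    return [(label, uri) for label, uri in authorities.items() if label in text]


authorities = parse_authorities(AUTHORITIES)
links = find_links("Galeries Durand-Ruel, Paris, France, by August 1892", authorities)
for label, uri in links:
    print(f"{label} -> {uri}")
```

Plain substring matching is a deliberate simplification: "France" matches inside "Paris, France", so a real implementation would need to decide how overlapping labels are handled.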
66. Reason #1:
Linking to Other
Authorities
and the Local Heroes Problem
67. Authority,
Identity, & Trust.
We're making authoritative
assertions about identity.
We want to be the "source of truth"
for the objects in our collections.
68. Authority isn't free.
Maintaining authority takes
enormous time and resources.
69. The world is vast.
To fully describe everything that
connects to our collection, we
must describe the universe.
70. Budgets are...less vast.
How can we be authoritative
without being encyclopedic?
71. Asserted Authority.
When you want to be
the authority of record
for something or someone.
73. Delegated Authority.
When you want to point to
someone who you trust to be
the authority of record.
74. Getty Vocabularies
Shared Authority files
One source of authority maintained
by a trusted institution.
75. Reluctant Authority.
When you cannot find
an authority you trust.
76. MicroAuthority
A minimalist
CSV-based authority file
Enables small institutions
to connect to their local heroes
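To make the idea of a CSV-based authority file concrete, here is a small sketch. It is not the MicroAuthority file format: the column names (`id`, `name`, `note`), the base URI, and the placeholder entries are all assumptions made for illustration. It prefers a delegated authority when one exists and falls back to the local file otherwise.

```python
import csv
import io

# Hypothetical base URI under which the institution mints its own identifiers.
BASE_URI = "https://example.org/authority/"

# Hypothetical local authority file; the entries are placeholders, not real data.
LOCAL_AUTHORITY_CSV = """\
id,name,note
0001,Example Local Painter,Placeholder entry for a regional artist
0002,Example Local Gallery,Placeholder entry for a regional dealer
"""


def load_authority(csv_text, base_uri=BASE_URI):
    """Read the CSV and map each name to a locally minted URI."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: base_uri + row["id"] for row in reader}


def resolve(name, delegated, local):
    """Prefer a delegated (external) authority; fall back to the local file."""
    if name in delegated:
        return delegated[name]
    return local.get(name)  # None when no authority is known


local = load_authority(LOCAL_AUTHORITY_CSV)
delegated = {"Mary Cassatt": "http://viaf.org/viaf/2478969/"}

print(resolve("Mary Cassatt", delegated, local))
print(resolve("Example Local Painter", delegated, local))
```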
77. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France;
Galeries Durand-Ruel, Paris, France, by August 1892 [1];
Durand-Ruel Galleries, New York, NY, 1895;
purchased by Department of Fine Arts, Carnegie Institute,
Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
78. Reason #2:
Shared Semantics
How do we know we're talking about the same thing?
79. JSON Data
Structure
This is understandable,
if you're me.
But you're not me.
80. Linked Data as
Documentation
When I say "Transfer of Custody",
I mean...
81. JSON-LD &
CIDOC-CRM
This is more complex,
but that complexity
is documented.
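One way to see "complexity that is documented": a JSON-LD context ties each short key to a published CIDOC-CRM term, so "Transfer of Custody" points at a definition anyone can look up. The sketch below is a simplified stand-in for JSON-LD expansion, not the full algorithm. The short key names and the object URI are assumptions, and treating the Paris to New York move as a transfer of custody is only for illustration. The CIDOC-CRM identifiers (E10, P28, P29, P30) are the published ones.

```python
import json

CRM = "http://www.cidoc-crm.org/cidoc-crm/"

# Short, developer-friendly keys mapped to documented CIDOC-CRM terms.
# The short names are hypothetical; the CRM terms are published ones.
CONTEXT = {
    "TransferOfCustody": CRM + "E10_Transfer_of_Custody",
    "surrendered_by": CRM + "P28_custody_surrendered_by",
    "received_by": CRM + "P29_custody_received_by",
    "transferred": CRM + "P30_transferred_custody_of",
}

# Illustrative event only; the object URI is a placeholder.
event = {
    "type": "TransferOfCustody",
    "surrendered_by": "http://viaf.org/viaf/153354503",
    "received_by": "http://viaf.org/viaf/134060200",
    "transferred": "https://example.org/object/young-women-picking-fruit",
}


def expand(doc, context):
    """Replace short keys and type names with the full URIs they stand for."""
    expanded = {}
    for key, value in doc.items():
        if isinstance(value, str):
            value = context.get(value, value)  # expands type names such as TransferOfCustody
        expanded[context.get(key, key)] = value
    return expanded


expanded = expand(event, CONTEXT)
print(json.dumps(expanded, indent=2))
```

The compact form is what a developer reads; the expanded form is what says, unambiguously, which definition of "custody" is meant.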
82. What about
the gaps?
Nothing is comprehensive.
85. Galeries Durand-Ruel, Paris, France, by August 1892 [1];
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Durand-Ruel Galleries: #1 http://viaf.org/viaf/134060200
Paris, France: see http://vocab.getty.edu/tgn/7008038
87. Getty Provenance Index
Remodel Project
Similar project, different goals.
93. To Recap:
1. Shared Authority
2. Shared Understanding
3. Easy Collaboration
4. Planning for the Future
94. Two parts to my job:
Data
Software
95. Nothing we've
talked about yet
needs a computer.
96. Digital data and
software are
utterly
intertwined.
We digitize data so that software
can interact with it.
97. What is Software?
Software automates practice,
allowing us to be more efficient.
98. Automation
We can only automate what we
understand well enough to
explain.
99. Why digital
metadata?
Who are our users?
What do they want?
Adolf von Menzel (German, 1815 - 1905)
Figure Studies, 1872, Carpenter's pencil
The J. Paul Getty Museum, Los Angeles
100. Three user types:
• Catalogers describing collections
• Researchers using digital methods
• Developers enabling access
Workshop of Rembrandt Harmensz. van Rijn
Young Scholar and his Tutor, 1629-1630, Oil on canvas
The J. Paul Getty Museum, Los Angeles
101. - Catalogers interpret and structure the real world.
- Researchers ask novel questions using the dataset.
- Developers consume data and enable users.
These use cases are in conflict.
102. Lessons Learned
- Our tasks are Search, Browse, & Display
- There is no primary entity
- Reconciliation is essential
103. You need to try to use the data.
- Consistent modeling patterns are needed
- Semantic correctness is not sufficient
- Structure is as important as semantics
104. LOUD:
Linked Open Usable Data
Semantically modeled
cultural heritage data
for web developers.
105. http://linked.art
Linked.art is an RDF profile of the CIDOC-CRM
that uses JSON-LD and the Getty Vocabularies
to describe object-based cultural heritage in
an event-based framework for consumption
by software applications.
It uses a subset of classes from the CIDOC-CRM
along with other commonly-used RDF ontologies
to provide interoperable patterns and models
that can be interpreted either as JSON or as RDF.
106. Balancing complexity and usability
107. RDF Graphs as
JSON-LD documents.
Complexity is hidden,
not eliminated.
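To show that the graph is still there underneath the document, this sketch walks a nested, JSON-LD-style record and flattens it into subject-predicate-object triples. The record is only loosely in the style of Linked Art: the key names, the `HumanMadeObject` type, and the object URI are assumptions for illustration, so consult linked.art for the actual model. Real JSON-LD processing would also apply the context, which is skipped here.

```python
from itertools import count

# Illustrative nested record; the keys and the object URI are assumptions.
DOC = {
    "id": "https://example.org/object/1",
    "type": "HumanMadeObject",
    "_label": "Young Women Picking Fruit",
    "produced_by": {
        "type": "Production",
        "carried_out_by": [
            {
                "id": "http://viaf.org/viaf/2478969/",
                "type": "Person",
                "_label": "Mary Cassatt",
            }
        ],
    },
}


def to_triples(node, counter=None, triples=None):
    """Flatten a nested record into (subject, predicate, object) triples."""
    if counter is None:
        counter = count(1)
    if triples is None:
        triples = []
    # Nodes without an "id" become blank nodes: _:b1, _:b2, ...
    subject = node.get("id") or f"_:b{next(counter)}"
    for key, value in node.items():
        if key == "id":
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                child, _ = to_triples(item, counter, triples)
                triples.append((subject, key, child))
            else:
                triples.append((subject, key, item))
    return subject, triples


root, triples = to_triples(DOC)
for triple in triples:
    print(triple)
```

A developer sees one readable document; the seven triples it implies are the part that stays hidden.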
108. Linked Art:
A standardized data model using
CIDOC-CRM that describes the
objectness of objects, designed
to enable software development
against Linked Open Data.
http://linked.art
109. We know how to describe objects.
110. We are learning to describe
relationships.