Webinar, Feb. 20, 2018. David Newbury discusses how data is modeled and presented in memory institutions. He talks about his experiences with Art Tracks, Linked Art, the American Art Collaborative, and other projects, discussing how those experiences helped him better understand data modeling and how we can represent objects.
NDSR Learning Enrichment: Data Models and Linked Data
1. Learning Enrichment Session
David Newbury
Software & Data Architect
J. Paul Getty Trust
February 20th, 2018
Learning Enrichment Session - David Newbury (@workergnome) 1
2. My job is to bring together:
• an Archive
• a Library
• a Conservation science lab
• a Publishing house
• a Museum
• a Foundation
...through software and data.
14. On Modeling & Mapping the Real
When we create data,
we're creating an
abstraction of reality.
15. "When you design and build a
computer system, you first
formulate a model of the problem
you want it to solve, and then
construct the computer program
in its terms."
- Brian Cantwell Smith, The Limits of Correctness (1985)
16. On October 5, 1960, the
American Ballistic Missile Early-
Warning System indicated Soviet
missiles headed towards the
United States.
The moon had risen, and was
reflecting radar signals back to
earth. Needless to say, this lunar
reflection hadn't been predicted
by the system's designers.
17. Whose fault was it?
18. "...every act of conceptualization,
analysis, categorization, does a
certain amount of violence to its
subject matter, in order to get at the
underlying regularities that group
things together."
- Brian Cantwell Smith, The Limits of Correctness (1985)
19. On Exactitude in Science
A 1:1 scale map
is not a useful
abstraction.
20. There will never be
a correct data model.
21. There will only be
useful data models.
22. What is a useful Data Model?
As memory institutions,
we structure and record
information about objects.
We do this because objects
are representations of
shared cultural knowledge.
23. Seven ways to look at objects.
24. Objects are Things.
Objects exist in space and time.
They have weight, and size, and are made of materials.
They are conserved, moved, bought, sold, and described.
These objects are related to people, events, and places
through physical, legal, or social interaction.
Example Data Models:
LIDO, CIDOC-CRM, Schema.org, CDWA, MARC
25. Objects exist in Context.
Objects are grouped, ordered, described, & summarized.
These intellectual structures provide context and
meaning to the objects as part of a larger whole.
Example Data Models:
EAD, ISAD(G), Dewey Decimal, RiC
26. Objects can be Reproductions.
Objects have shared heritage with other objects.
The physical or intellectual connections between a
specific instance and an abstract work it reproduces can
be essential to our understanding of that object.
Example Data Models:
FRBR, Bibframe
27. Objects have Proxies.
Objects are represented as structured data.
This includes data and digitized representations of an
object. Proxies often have their own metadata
describing the characteristics and context of the proxy.
Example Data Models:
IIIF, METS, Dublin Core, EXIF
28. Objects are Managed.
Policies and rules govern interaction with objects.
These codify how an object should be stored and what
environment it should be kept in, who can access the
object, and what restrictions apply to that access.
Example Data Models:
RightsStatements.org, PREMIS
29. Objects can contain or embody Representations.
Objects often depict or describe referents. These may be
real times, places, objects, and people, or they may be
fictitious.
Example Data Models:
GeoJSON, EAC, Iconclass, TEI
30. Objects are Intellectual Works.
Objects interpret the world around them.
They are made with intent, within intellectual
frameworks and genres, and others assign meaning and
value to them. They can both participate in and
engender argumentation and scholarship.
Example Data Models:
???
31. It's a little overwhelming,
isn't it?
32. If you wish to make an apple pie
from scratch, you must first
invent the universe.
- Carl Sagan
33. When eating an elephant take
one bite at a time.
- Creighton Williams Abrams Jr.
34. Lessons Learned: Art Tracks
How do you design a data model
that represents the history of an object,
and how do you express it using Linked Data?
35. Art Tracks
Funded by the Institute of Museum and Library Services.
ca. 2013-2015
National Endowment for the Humanities,
Kress Foundation,
Paul Mellon Center
ca. 2016-2017
Originally, Art Tracks was
a data visualization project.
Only, we didn't have data.
36. Traditional Provenance
Durand-Ruel, Paris, August 23, 1872 [1];
Catholina Lambert, New Jersey;
Lambert sale, American Art Association, Plaza Hotel, New York, NY,
February 21, 1916 until February 24, 1916, no. 67;
Durand-Ruel, Paris, until at least 1930;
purchased by Simon Bauer, Paris, by June 1936 [2];
anonymous sale, Parke-Bernet Galleries, Inc., February 25, 1970, no. 19 [3];
Sam Salz, Inc., New York, NY;
purchased by Museum, May 1971.
Notes:
[1] bought from the artist.
[2] Listed and illustrated in "List of Property Removed from France
during the War 1939-1945" (no. 7114, as belonging to Simon Bauer).
[3] "Highly Important Impressionist, Post-Impressionist &
Modern Paintings and Drawings", illustrated.
42. Are we
Publishers?
Yes.
But we do more than publish
information: we generate our own.
43. Are we
Researchers?
Yes.
But we don't generate random
information, we research specific
objects.
44. We are
Collections.
We don't just collect.
We research, collect, and preserve
information about our objects, as
well as the events, people, and
topics that give them context.
45. The Promise of Linked Data:
[The] creation of a common framework that allows data
to be shared and reused across application, enterprise,
and community boundaries, to be processed
automatically by tools as well as manually, including
revealing possible new relationships among pieces of
data.
- W3C Semantic Web Working Group
46. Linked Data:
Where is this
magical future?
47. What doesn't Linked Data do?
Enable
Web Scale AI
48. What doesn't Linked Data do?
Create Easy
Interoperability
49. What doesn't Linked Data do?
Automate
Reconciliation
50. What doesn't Linked Data do?
Reduce
our Workload
51. An awkward
moment goes here.
(This could be a very short talk.)
52. Linked Data is
not a magic bullet.
It's one of many possible abstract
data models, each of which has
tradeoffs.
53. Art Tracks, Phase II
Funded by the National Endowment for the Humanities.
ca. 2016-2017
How to express provenance information as:
• Linked Open Data
• JSON data structure
• Standardized text
All three forms must contain
the same information, and
we must be able to convert
between them.
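The requirement that every form carry the same information can be illustrated with a small round-trip check. This is only a sketch under stated assumptions: the `Period` class, its field names, and the text layout are hypothetical and are not the Art Tracks format. It covers the JSON and standardized-text forms only; the Linked Open Data form and parsing text back into structure are left out.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass(frozen=True)
class Period:
    """One period of ownership. All field names are hypothetical."""
    owner: str
    place: Optional[str] = None
    date: Optional[str] = None   # kept as display text, e.g. "by August 1892"
    purchased: bool = False


def to_text(periods: List[Period]) -> str:
    """Render periods as a semicolon-separated provenance sentence."""
    parts = []
    for p in periods:
        bits = [("purchased by " if p.purchased else "") + p.owner]
        if p.place:
            bits.append(p.place)
        if p.date:
            bits.append(p.date)
        parts.append(", ".join(bits))
    return "; ".join(parts) + "."


def to_json(periods: List[Period]) -> str:
    """Serialize periods as a JSON array of objects."""
    return json.dumps([asdict(p) for p in periods], indent=2)


def from_json(text: str) -> List[Period]:
    """Rebuild periods from the JSON form."""
    return [Period(**d) for d in json.loads(text)]


periods = [
    Period("Galeries Durand-Ruel", "Paris, France", "by August 1892"),
    Period("Durand-Ruel Galleries", "New York, NY", "1895"),
    Period("Department of Fine Arts, Carnegie Institute",
           "Pittsburgh, PA", "October 1922", purchased=True),
]

# Converting to JSON and back must not lose or change anything.
assert from_json(to_json(periods)) == periods
print(to_text(periods))
```

Keeping one in-memory structure and deriving each form from it is one way to make "same information" testable; a fuller version would also need the reverse direction from text.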
60. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art.
61. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France; Galeries Durand-Ruel,
Paris, France, by August 1892 [1]; Durand-Ruel Galleries,
New York, NY, 1895; purchased by Department of Fine
Arts, Carnegie Institute, Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
63. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France; Galeries Durand-Ruel,
Paris, France, by August 1892 [1]; Durand-Ruel Galleries,
New York, NY, 1895; purchased by Department of Fine
Arts, Carnegie Institute, Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
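The "Authorities" block above is regular enough to be read by software. As a rough sketch (the function name `parse_authorities` and the "Name: see URI" parsing rule are assumptions made for illustration, not part of Art Tracks), it can be turned into a name-to-URI lookup and grouped by the institution that maintains each authority:

```python
from collections import Counter
from typing import Dict
from urllib.parse import urlparse

# Copied from the slide text above.
AUTHORITIES_TEXT = """
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
"""


def parse_authorities(block: str) -> Dict[str, str]:
    """Map each name to its authority URI, assuming 'Name: see URI' lines."""
    result = {}
    for line in block.strip().splitlines():
        name, sep, uri = line.partition(": see ")
        if sep:  # skip lines that do not follow the pattern
            result[name.strip()] = uri.strip()
    return result


authorities = parse_authorities(AUTHORITIES_TEXT)

# Count how many identities are delegated to each external host.
hosts = Counter(urlparse(uri).netloc for uri in authorities.values())
print(hosts["viaf.org"], hosts["vocab.getty.edu"])
```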
64. Reason #1:
Linking to Other
Authorities
and the Local Heroes Problem
65. Authority,
Identity, & Trust.
We're making authoritative
assertions about identity.
We want to be the "source of truth"
for the objects in our collections.
66. Authority isn't free.
Maintaining authority takes
enormous time and resources.
67. The world is vast.
To fully describe everything that
connects to our collection, we
must describe the universe.
68. Budgets are...less vast.
How can we be authoritative
without being encyclopedic?
69. Asserted Authority.
When you want to be
the authority of record
for something or someone.
71. Delegated Authority.
When you want to point to
someone who you trust to be
the authority of record.
72. Getty Vocabularies
Shared Authority files
One source of authority maintained
by a trusted institution.
73. Reluctant Authority.
When you cannot find
an authority you trust.
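One way to picture the three kinds of authority is as a decision about which URI to publish for an entity. The sketch below is an illustration only: the function names, the `https://example.org/id/` base, and the slug rule are all hypothetical, and real institutions will have their own identifier policies.

```python
import re
from typing import Dict, Tuple


def slug(name: str) -> str:
    """Make a URL-safe local identifier from a name (hypothetical rule)."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def authority_for(name: str, is_ours: bool, trusted: Dict[str, str],
                  base: str = "https://example.org/id/") -> Tuple[str, str]:
    """Return (kind of authority, URI to publish) for an entity.

    is_ours: we want to be the authority of record for this entity.
    trusted: names already identified by an external authority we trust.
    base:    hypothetical namespace for identifiers we mint ourselves.
    """
    if is_ours:
        return ("asserted", base + slug(name))
    if name in trusted:
        return ("delegated", trusted[name])
    # No trusted authority exists, so we mint an identifier anyway.
    return ("reluctant", base + slug(name))


trusted = {"Mary Cassatt": "http://viaf.org/viaf/2478969/"}

print(authority_for("Young Women Picking Fruit", True, trusted))
print(authority_for("Mary Cassatt", False, trusted))
print(authority_for("Catholina Lambert", False, trusted))
```

Asserted and reluctant authority both end in a locally minted identifier; the difference is whether that is a choice or a fallback, which matters for how much maintenance the institution is signing up for.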
75. Mary Cassatt, Young Women Picking Fruit.
Carnegie Museum of Art, 1894.
Mary Cassatt [1844-1926], France;
Galeries Durand-Ruel, Paris, France, by August 1892 [1];
Durand-Ruel Galleries, New York, NY, 1895;
purchased by Department of Fine Arts, Carnegie Institute,
Pittsburgh, PA, October 1922.
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Mary Cassatt: see http://viaf.org/viaf/2478969/
Galeries Durand-Ruel: see http://viaf.org/viaf/153354503
Durand-Ruel Galleries: see http://viaf.org/viaf/134060200
Department of Fine Arts, Carnegie Institute: see http://viaf.org/viaf/147742484
France: see http://vocab.getty.edu/tgn/1000070
Paris, France: see http://vocab.getty.edu/tgn/7008038
New York, NY: see http://vocab.getty.edu/tgn/7007567
Pittsburgh, PA: see http://vocab.getty.edu/tgn/7013927
76. Reason #2:
Shared Semantics
How do we know we're talking about the same thing?
77. JSON Data
Structure
This is understandable,
if you're me.
But you're not me.
78. Linked Data as
Documentation
When I say "Transfer of Custody",
I mean...
79. JSON-LD &
CIDOC-CRM
This is more complex,
but that complexity
is documented.
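To make "documented complexity" concrete, here is a minimal sketch of a transfer of custody expressed as JSON-LD with CIDOC-CRM terms. It is not the structure shown on the original slide: the `example.org` identifiers are hypothetical, the pairing of the two Durand-Ruel VIAF records as giver and receiver is purely illustrative, and `expand` is a deliberately simplified stand-in for a real JSON-LD processor.

```python
import json

# Namespace of the CIDOC-CRM RDF vocabulary.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"

transfer = {
    "@context": {"crm": CRM},
    "@id": "https://example.org/event/1",          # hypothetical event URI
    "@type": "crm:E10_Transfer_of_Custody",
    "crm:P30_transferred_custody_of": {
        "@id": "https://example.org/object/1"      # hypothetical object URI
    },
    "crm:P28_custody_surrendered_by": {
        "@id": "http://viaf.org/viaf/153354503"    # illustrative pairing only
    },
    "crm:P29_custody_received_by": {
        "@id": "http://viaf.org/viaf/134060200"
    },
}


def expand(term: str, context: dict) -> str:
    """Expand a 'prefix:local' term to a full URI (simplified, prefixes only)."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


# Every key resolves to a URI whose meaning is defined by the ontology.
print(expand(transfer["@type"], transfer["@context"]))
print(json.dumps(transfer, indent=2))
```

The plain JSON version would be shorter, but a reader would have to ask the author what each key means; here each term points at a published definition.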
83. Galeries Durand-Ruel, Paris, France, by August 1892 [1];
Notes:
[1]. Recorded in stock book in August 1892.
Authorities:
Durand-Ruel Galleries: #1 http://viaf.org/viaf/134060200
Paris, France: see http://vocab.getty.edu/tgn/7008038
91. To Recap:
1. Shared Authority
2. Shared Understanding
3. Easy Collaboration
4. Planning for the Future
92. Two parts to my job:
Data
Software
93. What is Software?
Software automates practice,
allowing us to be more efficient.
94. Digital data and
software are
utterly
intertwined.
We digitize data so that software
can interact with it.
96. Which of these is truer?
- Probably born 1950.
- Likely born 1950.
Humans don't think about "Truer" that way...
but computers do.
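Software can only compare these statements if someone decides, in the data model, what the qualifiers mean. The sketch below shows one possible decision, not a recommendation: the `Certainty` levels, the choice to treat "probably" and "likely" as the same level, and the naive "qualifier ... year" parsing are all assumptions made for illustration.

```python
from enum import IntEnum
from typing import Tuple


class Certainty(IntEnum):
    """A hypothetical controlled vocabulary; higher means more certain."""
    POSSIBLE = 1
    PROBABLE = 2
    CERTAIN = 3


# A modeling decision: free-text qualifiers mapped to controlled values.
QUALIFIERS = {
    "possibly": Certainty.POSSIBLE,
    "probably": Certainty.PROBABLE,
    "likely": Certainty.PROBABLE,
}


def parse_birth(statement: str) -> Tuple[Certainty, int]:
    """Parse statements shaped like '<qualifier> born <year>.' (naive)."""
    words = statement.strip().rstrip(".").split()
    certainty = QUALIFIERS.get(words[0].lower(), Certainty.CERTAIN)
    year = int(words[-1])
    return certainty, year


a = parse_birth("Probably born 1950.")
b = parse_birth("Likely born 1950.")

# Under this model the two statements are indistinguishable to software.
print(a == b)
print(parse_birth("Born 1950.")[0] > a[0])
```

A different institution could reasonably rank the two words differently; the point is that the ordering has to be written down somewhere before software can act on it.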
98. American Art
Collaborative
Modeling the collections of
14 institutions so that software can
interact with them.
99. Linked Art:
A standardized data model using
CIDOC-CRM that describes the
objectness of objects, designed
to enable software development
against Linked Open Data.
http://linked.art
100. Thank you for listening!
Questions?