The document discusses developing a dynamic semantic content model to support continuous acquisition and use of content at Elsevier. It proposes a model with extensible classes for different content and asset types, properties, and relationships, so that new types, properties, and formats can be added over time. The model represents content and assets as graphs that can be combined and extended with additional metadata and analytics. This flexible approach aims to increase content volume, coverage, and utility, and to drive operational efficiency in content management.
2. |
Entity cloud to knowledge graph
Researchers
Institutions
Articles
Journals
Patents
Funding bodies
Grants
Research domains
Countries
Labs
Projects
Research data sets
Publishing cluster
Usage cluster
Editors
Reviewers
Authors
Inventors
Funding cluster
Opportunities
Corporations
Publishers
Conferences
Societies
Res Eval Agencies
Counter
3. |
Implications for an Enterprise Content Model
Increase content volume and quality
• Create new content and asset types
• Track license agreements
• Have multiple editions of quality
Expand content coverage
• Add classes, data types, and relationships for non-journal, non-book materials
• Make content management aware of extended content coverage
• Make new content operational in search, storage, services, and discovery
Increase content utility
• Increase metadata properties for search and discovery
• Extend content objects with features that activate interaction
• Create data-driven workflows
Combine content with analytics and technology
• Add properties as diagnostics evolve and mature
• Expose meaningful properties now locked in content
Drive operational efficiency and effectiveness
• Empower third parties to contribute and collaborate in high-value ways
• On-board content and data suppliers fast
• Enable instant adoption of content supply chains
5. |
Meta-model and content model
• Modelled for variety
• Two-layered model:
  • Ontology describing a content model typology and inter-dependency
  • Asset-level content models for fine-grained, detailed content mark-up
• Realisation through OWL and XML Schema
• Serialisation through JSON-LD under JSON Schema control
Nodes in the graph organized to:
• Support collections of nodes
• Record containment to other nodes
• Be typed for class membership
Edges in the graph organized to:
• Support loosely and tightly coupled nodes
• Express roles
• Capture features
• Be fixed and protected
  • Membership
  • Containment
  • Provenance
• Be extensible
  • Node features
  • Search index
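The idea of typed nodes serialised as JSON-LD under schema control can be sketched as follows. This is a minimal illustration rather than the model described in the slides: the `cm` namespace, the term names, the `urn:example:` identifiers and the required-key check (a stand-in for real JSON Schema validation) are all assumptions.

```python
import json

# Hypothetical namespace and terms; the slides do not name any.
CONTEXT = {
    "cm": "https://example.org/content-model#",
    "title": "cm:title",
    "format": "cm:format",
    "hasAsset": {"@id": "cm:hasAsset", "@type": "@id"},
}

# Stand-in for JSON Schema control: every node must carry an id and a type.
REQUIRED_KEYS = {"@id", "@type"}


def make_node(node_id, node_type, **properties):
    """Build one typed graph node with optional extra properties."""
    node = {"@id": node_id, "@type": node_type}
    node.update(properties)
    return node


def serialise(nodes):
    """Check each node, then wrap all nodes in a single JSON-LD document."""
    for node in nodes:
        missing = REQUIRED_KEYS - set(node)
        if missing:
            raise ValueError(f"node is missing {sorted(missing)}")
    return json.dumps({"@context": CONTEXT, "@graph": nodes}, indent=2)


# A content object (node) that contains one asset (containment edge).
asset = make_node("urn:example:asset:1", "cm:XmlAsset", format="XML")
article = make_node(
    "urn:example:co:1",
    "cm:Article",
    title="An example article",
    hasAsset=[asset["@id"]],
)
document = serialise([article, asset])
print(document)
```

Because extra properties are passed as free keyword arguments, a new property can be attached to a node without changing the code, which is one simple way to read "be extensible".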
[Diagram: Content Types, each with properties and AssetTypes; AssetTypes with Formats and properties; CO-CO properties linking Content Types; labels: MP4, x264, Non RDF, RDF]
6. |
Generations capture: Grouping, Variants, Versions, Dependencies
• Authority type with reference Asset
• Workflow state
• Provenance
• Inter-dependency
• Intra-dependency
• Containment
• Accessibility
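One way to picture a generation is as a record that carries workflow state and provenance and points to its parent generation. The sketch below assumes that reading; the `Generation` class, its field names and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Generation:
    """One generation of a content object (field names are assumptions)."""
    generation_id: str
    content_object_id: str
    workflow_state: str
    provenance: str
    parent_id: Optional[str] = None


def lineage(generations, generation_id):
    """Return generation ids from the given generation back to the root."""
    by_id = {g.generation_id: g for g in generations}
    chain = []
    current = by_id.get(generation_id)
    while current is not None:
        if current.generation_id in chain:
            raise ValueError("cycle in parent links")
        chain.append(current.generation_id)
        current = by_id.get(current.parent_id) if current.parent_id else None
    return chain


generations = [
    Generation("g1", "co-1", "accepted", "supplier feed"),
    Generation("g2", "co-1", "published", "editorial correction", parent_id="g1"),
]
print(lineage(generations, "g2"))
```

Following parent links gives the version history of a content object; grouping and variants would need additional fields that are not modelled here.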
evolution
8. |
Extending Content Model using named graphs
[Diagram: content model extended with named graphs]
Nodes: Content Object, Content Model, Asset Object, Generation, Metadata, AssetMetadata, Message, Service Call, EventNotification, Contributor/Consumer, DataType, Literal Datatype, Controlled Vocabulary, Content Object type, Person, Institute, Corporations
Edges: hasGeneration, hasAssetMetadata, hasAsset, service, event, about, from, to, parent, CO property, Adds-On-To
Identifiers: Content Object ID, Asset Object ID, parentGenerationID, ServiceCallID, aboutContentObjectD, Resource / target
Knowledge graph clusters: Researchers, Funding bodies, Publishing cluster, Funding cluster
9. |
Continuous acquisition: adding value to Content Objects
Article Object
• ADD-ON Object, Type: “KnowledgeGraph”
• ADD-ON Object, Type: “Document Graph”
• ADD-ON Object, Type: “n-gram distance”
• ADD-ON Object, Type: “Mapping”
Assets: XML, PDF, CAR
Content Object types:
- Patent
- Grant
- Concept
- Contract
- Article
- Chapter
- ...
Generation v1
Generation v2
Integrate and deploy into a knowledge graph (Researchers, Funding bodies, Publishing cluster, Funding cluster)
JSON-LD serialisation
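Add-on objects kept in their own named graphs can be combined with the core graph on demand, leaving the core untouched. The sketch below illustrates that idea with plain sets of triples; the graph names, the `cm:` terms and all identifiers are assumptions made for illustration.

```python
# Each named graph is a set of (subject, predicate, object) triples keyed by
# a graph name. All identifiers below are hypothetical.
ARTICLE = "urn:example:co:article-1"

named_graphs = {
    "urn:example:graph:core": {
        (ARTICLE, "rdf:type", "cm:Article"),
        (ARTICLE, "cm:hasAsset", "urn:example:asset:xml-1"),
    },
    "urn:example:graph:addon:knowledge-graph": {
        ("urn:example:addon:kg-1", "cm:addsOnTo", ARTICLE),
        ("urn:example:addon:kg-1", "cm:mentions", "urn:example:person:1"),
    },
    "urn:example:graph:addon:mapping": {
        ("urn:example:addon:map-1", "cm:addsOnTo", ARTICLE),
    },
}


def combine(graphs, names):
    """Union the triples of the selected named graphs into one graph."""
    merged = set()
    for name in names:
        merged |= graphs[name]
    return merged


def add_ons_for(graphs, content_object_id):
    """List the graphs that hold an addsOnTo edge to the content object."""
    return sorted(
        name
        for name, triples in graphs.items()
        if any(p == "cm:addsOnTo" and o == content_object_id for _, p, o in triples)
    )


selected = ["urn:example:graph:core", "urn:example:graph:addon:knowledge-graph"]
knowledge_graph = combine(named_graphs, selected)
print(len(knowledge_graph), add_ons_for(named_graphs, ARTICLE))
```

Keeping each add-on in a separate graph means a consumer can choose which enrichments to integrate, and a new add-on type is just another entry in `named_graphs`.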
10. |
Continuous content acquisition, coverage and utility: use cases
New user of the system: Add name to the Consumer/Contributor Class
New, unknown content: Add Content type to the Content Object Class
New derived asset: Add Asset type to the Asset Class
New format: Add new Format type to the Asset Class
New property: Add property to the Content or Asset Object Class
New datatype: Create a datatype for the DataType Object Class
New concept: Add a concept to the Controlled Vocabulary Class
New service: Add a Service Call to the Service Call Class
New event: Add an event to the EventNotification Class
New relationship to other objects: Add an Add-On object to the Content Object Class
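The use cases above all follow one pattern: each kind of change adds a member to exactly one class of the model. A minimal sketch of that dispatch, with hypothetical class and use-case names standing in for the real ontology:

```python
from collections import defaultdict

# Use case -> class that receives the new member. Names are assumptions
# chosen to mirror the list above, not identifiers from the actual model.
USE_CASE_TO_CLASS = {
    "new user": "ConsumerContributor",
    "new content": "ContentObject",
    "new derived asset": "Asset",
    "new format": "Asset",
    "new datatype": "DataType",
    "new concept": "ControlledVocabulary",
    "new service": "ServiceCall",
    "new event": "EventNotification",
    "new relationship": "ContentObject",
}

# A new property may target either of these two classes.
PROPERTY_TARGETS = {"ContentObject", "Asset"}


def extend(model, use_case, member, target_class=None):
    """Record a new member on the class that the use case maps to."""
    if use_case == "new property":
        if target_class not in PROPERTY_TARGETS:
            raise ValueError("a property must target ContentObject or Asset")
    else:
        # Unknown use cases raise KeyError rather than being guessed at.
        target_class = USE_CASE_TO_CLASS[use_case]
    model[target_class].add((use_case, member))
    return target_class


model = defaultdict(set)
extend(model, "new content", "Patent")
extend(model, "new format", "MP4")
extend(model, "new property", "fundedBy", target_class="ContentObject")
print(sorted(model["ContentObject"]))
```

Every extension is additive, so consumers that do not know a new member can ignore it.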
11. |
Versions of content models
• New objects can be added
• New properties on Content Objects to Content Objects (with ID control)
• New properties on Content Objects to literal/CV
• New Asset types can be added
• New Asset formats can be added
• New properties on Asset objects
• New asset models can be introduced (schema/JSON Schema)
• New Add-On properties can be introduced
• New Property to External Content Object (no ID control)
9/29/2016
[Diagram: Content Object + CO properties; Asset Type + Asset properties; CO properties; Asset Format]
12. |
Questions about versioning
• How do I get the difference between the model that I use and the one that is available?
• How do I know which Content and Asset types have been added?
• How do I know or discover which properties are available?
• How do I know what add-on types have been added?
• How do I understand the impact on produced, integrated graphs?
• Do I need to know the nature of the properties, i.e. which ontology governs them?
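The first few questions amount to comparing two snapshots of the model. A minimal sketch, assuming each model version can be exported as a mapping from section name to a list of members; the section names and sample values are hypothetical.

```python
def diff_models(mine, available):
    """Report members added to or removed from each section of the model."""
    report = {}
    for section in sorted(set(mine) | set(available)):
        current = set(mine.get(section, ()))
        latest = set(available.get(section, ()))
        added = sorted(latest - current)
        removed = sorted(current - latest)
        if added or removed:
            report[section] = {"added": added, "removed": removed}
    return report


# Snapshot in use versus the snapshot currently published (sample data).
mine = {
    "content_types": ["Article", "Chapter"],
    "asset_formats": ["XML", "PDF"],
    "properties": ["title"],
}
available = {
    "content_types": ["Article", "Chapter", "Patent"],
    "asset_formats": ["XML", "PDF", "CAR"],
    "properties": ["title", "fundedBy"],
    "add_on_types": ["KnowledgeGraph"],
}

for section, changes in diff_models(mine, available).items():
    print(section, changes)
```

A set difference like this answers what was added; it does not say which ontology governs a new property or how existing graphs are affected, which remain open questions.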
14. |
Takeaways: key elements of a dynamic, self-servicing content model
• Provide authority for identification of objects and assets
• Understand and model core assets in the context of knowledge graphs
• Manage the organisation of both the content objects and the objects themselves; metadata and properties must be able to travel across content object boundaries
• Devise a version and variant management system; understand implications for addressing
• Be extensible to allow new objects, new schemas and new semantics; ontologies with classes and properties lend themselves well to this role
• Express versioning of content models through feature availability on Content Objects, Asset Types and distributed properties
• Establish the connection between the content model and the workflow model
• Use JSON-LD for serialisation as a lightweight, extensible format while being conscious of namespaces and RDF data