Currently Experience API (xAPI) mostly focuses on providing “structural” interoperability of xAPI statements via JavaScript Object Notation Language (JSON). Structural interoperability defines the syntax of the data exchange and ensures the data exchanged between systems can be interpreted at the data field level. In comparison, semantic interoperability leverages the structural interoperability of the data exchange, but provides a vocabulary so other systems and consumers can also interpret the data. Analytics produced by xAPI statements would benefit from more consistent and semantic approaches to describing domain-specific verbs, activityTypes, attachments, and extensions. The xAPI specification recommends implementers to adopt community-defined vocabularies, but the only current guidance is to provide very basic, human-readable identifier metadata (e.g., literal string name(display), description). The main objective of the Vocabulary and Semantic Interoperability Working Group (WG) is to research machine-readable, semantic technologies (e.g., RDF, JSON-LD) in order to produce guidance for Communities of Practice (CoPs) on creating, publishing, or managing controlled vocabulary datasets (e.g., verbs). In this session, you will see a brief introduction to modern controlled vocabulary practices and how they can be applied to xAPI to add semantic expressiveness of controlled vocabularies. The progress and resources from the Vocabulary WG (started in April 2015) will also be shared.
6. Current Practices
• Initial focus of xAPI has been on “structural”
interoperability
• defined the syntax and ensured data can be
exchanged/processed
– RESTful API principles / style / simplicity
– syntax of data exchange (JSON)
– This was needed to support multiple
devices/platforms and scalability
7. Current Practices
– Spec guidance only mentions human readable
identifier metadata (literal name(display),
description)
– Based on natural language, but…
• No focus on prescribing semantic meaning and
disambiguation for controlled vocabularies
8. Problem Definition
1. The xAPI spec recommends CoPs to create
controlled vocabularies…BUT
- No guidance on creating, publishing, or managing
controlled vocabulary data (e.g., verbs, activityTypes,
attachments, extensions)
2. Natural language presents 2 challenges:
– 2 or more terms can be used to represent a single
concept
– 2 or more words that have the same spelling can
represent different concept
11. Johnny passed life
Actor Verb Object
mbox: johnny@appleseed.com http://adlnet.gov/expapi/verbs/passed http://www.gameoflifeonline.com/play-game-of-life-online
{
"actor": {
"mbox": "mailto:johnny@appleseed.com",
"name": "Johnny",
"objectType": "Agent"
},
"verb": {
"id": "http://adlnet.gov/expapi/verbs/passed",
"display": {
"en-US": "passed"
}
},
"object": {
"id": "http://www.gameoflifeonline.com/play-game-of-life-online",
"definition": {
"name": {
"en-US": "Life"
},
"description": {
"en-US": "The game of life"
}
}
12. Current Guidance - Verbs
“The IRI contained in an id SHOULD contain a
human-readable portion which SHOULD
provide meaning enough for a person reviewing
the raw statement to disambiguate the Verb
from other similar(in syntax) Verbs.”
13. Current Guidance - Activity Definitions
"If an Activity IRI is an IRL, an LRS SHOULD
attempt to GET that IRL, and include in HTTP
headers: "Accept: application/json, /".”
“An Activity with an IRL identifier MAY host
metadata using the Activity Definition JSON
format which is used in Statements, with a
Content-Type of "application/json"
14. Johnny passed life
Actor Verb Object
mbox: johnny@appleseed.com http://adlnet.gov/expapi/verbs/passed http://www.gameoflifeonline.com/play-game-of-life-online
http://activitystrea.ms/schema/1.0/game
activityType
Outcome: Human readable Text/HTML Outcome: Human readable Error Page
21. A few ‘words’ about WordNet
• WordNet is not required for xAPI Controlled
Vocabularies, but…
– Provides an example of linking terms to an
authoritative lexical source / semantic database
– Useful for natural language processing (verbs,
nouns) and disambiguation
– Originally provided as English, but also has
translations in several other languages
22. How can we improve semantics
and understanding for xAPI?
25. Semantic Web Technologies
• Can improve meaning/understanding of xAPI
data for LRS, Systems, Analytics
• Reduce potential for duplication of terms
(e.g., verbs, activityTypes, attachments,
extensions)
• Improve discoverability and reusability of
vocabularies for designers/developers,
repositories, registries
26. Why is Semantic Tech Important?
• Allows a copy of the data to be returned and
understood by different agents (both humans
& machines/apps)
– Exposes the data (linked open data)
– Provides the latest version of the data
– Promotes & facilitates reuse of terms
– Authoring tools could provide a dynamic look-up
of existing Verbs, Activities, Attachments,
Extensions
– Any applications (e.g., not just LRS or LMS) could
obtain additional metadata
– Potential Reasoner engine/ applications
27. Data Modeling & Formats
• Linked Data can be serialized into several
different formats:
– RDF/XML, Turtle, N-Triples, N3, RDFa, JSON-LD
28. From JSON to JSON-LD
• JSON = Human-readable + machine-
interchangeable (processed)
• JSON Linked Data (JSON-LD) = Human-readable +
machine-readable (processed and
understandable)
30. What is Linked Data?
• LD is semantic metadata based on the RDF
data model
• Different from “SCORM metadata aka LOM”
– Not static XML: dynamically linked and naturally
discoverable and open via the web and
applications
• LD is about using the web to connect related
data and resources that weren’t previously
linked
31. 4 Linked Data Principles
1. Use IRIs as names for things
2. Use HTTP IRIs, so that people can look up
those names
3. When someone looks up a IRI, provide more
information/metadata using the standards
(RDF, SPARQL)
4. Include links to other IRIs, so they (humans
and machines) can discover more things
32. RDF Data Model
XAPI = statements are about learning
experiences
RDF= statements are about xAPI resources
subject predicate object
[resource] [relationship, properties] [thing]
39. • Identify requirements /
document use cases
pertaining to xAPI vocabulary
usage (discoverability,
reusability, and
interoperability).
• Determine the vocabulary
metadata properties that are
needed. What do machines
need? What do humans
need?
• Discuss and determine
options for vocabulary
publishing and management
WG Objectives
40. Draft CV Ontology/RDF Schema
Human Readable: http://purl.org/xapi/ontology
Available as dereferenced HTML/RDFa (human + machine readable)
44. Vocabulary Working Group
• WG Charter: http://purl.org/xapi/vocab-wg-charter
• Vocab Use Cases: http://purl.org/xapi/vocab-use-cases
• WG Files: http://purl.org/xapi/vocab-wg-files
• Ontology / RDF Schema: http://purl.org/xapi/ontology
• Controlled Vocabulary Datasets
– ADL Example (HTML/RDFa): http://adlnet.gov/expapi/verbs
– Verb & Activity Type Concepts
– Currently uses Dublin Core, SKOS
• On Github: https://github.com/adlnet/xapi-vocabulary
• IRI Design Guidelines: http://purl.org/xapi/iri-design
Progress (so far)
45. Persistent & Stable IRI Design
• Reasons why an IRI target might change/move
– CoP change in membership or turnover
– Dependency on an organization, application, system, tool,
or server
– Rapid design and prototyping
– Controlled Vocabulary lifecycle maintenance
– Change in structure
• Recommended Practices (for NEW IRIs)
– Follow strong pattern for IRI persistence
– Use dedicated service when minting IRIs
46. Vocab WG Next Steps
• Apply approach to new CoP controlled vocabularies
• Add new Classes to the ontology (Attachments, Extensions,
Profiles, Recipes)
– SKOS Labels for multilingual support
– SKOS for broader/narrower relationships
– PROV for versioning and provenance metadata
• Update the core xAPI specification (IRI metadata)
• Process documentation, refined examples
• Automated tools for xAPI Community, repository (evaluate
open source options)
• Registry support (e.g., TinCan Registry)
47. Summary/Big Picture Opportunities
• Learning analytics require consistent approaches to
describing controlled vocabularies and domain-specific
concepts such as xAPI Verbs
• Concepts in xAPI should be "thing, not a string”
• Machine readability/understanding could result in
opportunities for machine learning, recommendation
algorithms/engine, competency definitions
• If accessible as linked data, we have new opportunities for
dynamic vocabulary look-up in authoring tools
• Improve potential discoverability and reuse of existing xAPI
vocabularies (semantic search) by CoPs
48. Thank You!
48
Jason Haag
ADL Technical Team
jason.haag.ctr@adlnet.gov
Twitter: @mobilejson
Need support in applying this approach? Or Interested in participating?
Contact us: xapi-vocabulary@adlnet.gov
Editor's Notes
Joke slide
Joke slide
Interoperability describes the extent to which systems and devices can exchange data, and interpret that shared data. For two systems to be interoperable, they must be able to exchange data and subsequently present that data such that it can be understood. Currently xAPI mostly focuses on providing “structural” interoperability. Structural interoperability defines the syntax of the data exchange and ensures the data exchanged between systems can be processed at the data field level.
REST standard HTTP methods (e.g., GET, PUT, POST, or DELETE)
Interoperability describes the extent to which systems and devices can exchange data, and interpret that shared data. For two systems to be interoperable, they must be able to exchange data and subsequently present that data such that it can be understood. Currently xAPI mostly focuses on providing “structural” interoperability. Structural interoperability defines the syntax of the data exchange and ensures the data exchanged between systems can be interpreted at the data field level.
REST standard HTTP methods (e.g., GET, PUT, POST, or DELETE)
What might this statement mean? Without more context and semantic definition of these terms it could me mean multiple things. What do you think it means?
Machines don’t know what these natural language words are without IRIs that provide more metadata or links to other things!
So we probably wouldn’t say that somebody passed away but it could be interpreted as such by machines unless the metadata source of the IRI has more meaning.
How many LRS applications will do this or are doing this? All this says is that you should return some JSON string data about the title/display and description. It doesn’t require any semantic links to things.
Why? Because xAPI was intended to be put into place very quickly and while there were little references to metadata and controlled vocabularies, the first authors figured these would be addressed or improved in time. As mentioned earlier, the initial focus was on structural interoperability.
Now humans looking at this statement could probably determine that Johnny passed an online game called life. These are human readable definitions, but an application or machine doesn’t understand this.
In fact, even the human readable definition of game doesn’t provide any meaning at the IRI for Activity Streams. What does the activity type “game” from activity streams mean? According to this it is just an error page.
Veterans of the xapi community might now that game was borrowed from activity streams vocabulary, but someone new to xAPI might not have any idea what this activity type means since the human readable page is an error page and has no other details. And applications trying to interpret the data definitely don’t know what it means since the IRI doesn’t return any metadata.
This is
This is
This is an example of both passed and the life activity type pointing to machine readable data from wordnet, where the hypernym and hyponym relationships can be understood as well as translations in multiple languages.
This is what we want as the results are both human readable and machine readable data.
Passed JSON LD from wordnet.
It provides hyponymy and some language translation. For example, pigeon, crow, eagle and seagull are all hyponyms of bird (their hypernym); which, in turn, is a hyponym of animal.
Any RDF source providing more semantics could be used, but we’ve been looking heavily at wordnet because of it’s synset based ontology and support for multiple languages
How do we improve our readability for applications so they can understand the data while keeping it human readable?
Semantic web technologies started to be developed as far back as 1999 with the initial idea of describing resources on the web, the Resource Description Framework (RDF) was born. Today, we are in a Web 3.0 era that is focused on personalization, knowledge, and semantics. But the semantic web technology stack has only become more mature and updated during the past few years.
RDF/XML is sometimes misleadingly called simply RDF because it was introduced among the other W3C specifications defining RDF and it was historically the first W3C standard RDF serialization format. However, it is important to distinguish the RDF/XML format from the abstract RDF model itself. Although the RDF/XML format is still in use, other RDF serializations are now preferred by many RDF users, both because they are more human-friendly,[18] and because some RDF graphs are not representable in RDF/XML due to restrictions on the syntax of XML QNames.
With a little effort, virtually any arbitrary XML may also be interpreted as RDF using GRDDL (pronounced 'griddle'), Gleaning Resource Descriptions from Dialects of Languages.
RDF triples may be stored in a type of database called a triplestore.
Mention that this is what activity streams is currently adopting (JSON-LD)
FOCUS on providing more richness to a data model that is heavily based on statements. We not only need to accurately describe experiences through these statements, but the systems must be able to interpret them as well.
In comparison to xAPI, RDF provides statements about resources. In our case, the types of resources we want to represent using RDF are verbs, activity types, attachments, and extensions
We might also want to describe and represent profiles and recipes!
We can visualize triples as a connected graph. Graphs consists of nodes and arcs. The subjects and objects of the triples make up the nodes in the graph; the predicates form the arcs. Fig. 1 shows the graph resulting from the sample triples.
Content negotiation requires a technical configuration of the server to send either a human readable version or a machine readable version of the data based on the header request
Putting together the pieces of the puzzle
What do machines (e.g., repositories, registries, LRS applications) need?
What do humans need (e.g., CoPs, learning designers, developers, data scientists)?
Human readable view. The ontology is hosted on a server configured to serve linked data.
Human readable view. The ontology is hosted on a server configured to serve linked data.
This the human readable view of an example of a controlled vocabulary expressed as RDF. It is using RDFa and is embedded in the HTML. The vocabulary working group is determining on how to best update the spec to have xAPI be able to support this by serving up JSON-LD.
This the human readable view of an example of a controlled vocabulary expressed as RDF. It is using RDFa and is embedded in the HTML. The vocabulary working group is determining on how to best update the spec to have xAPI be able to support this by serving up JSON-LD.
Mention earlier example of activity streams vocabulary terms moving!
We need a simple way to express a parent category or child categories to which terms might be grouped under. The standard used for this situation is SKOS, which provides the ability for a vocabulary to become a structured taxonomy.
We need a simple way to express a parent category or child categories to which terms might be grouped under. The standard used for this situation is SKOS, which provides the ability for a vocabulary to become a structured taxonomy.
By publishing the xAPI Controlled Vocabularies as Linked Data in an open environment for anyone to freely use, we are sharing with the world the years of research and meaning