Knowledge Modelling Principles for Grakn Academy

Grakn Academy | Knowledge Modelling Principles
November 11th 2020
Tomás Sabat and Daniel Crowe

Agenda
a. Logistics and intros
b. The modelling philosophy, principles and techniques around Grakn’s
knowledge schema
c. Best practices for designing schema

Logistics
Video On Audio Off
Fist to Five
Check-in

Domain Modelling - best practices
Here we set out the best practice guidelines for creating a Graql schema.
It’s possible to create a successful model that does not conform to the
guidelines.
These guidelines aim to help maximise:
• true-to-domain modelling
• flexibility
• and extensibility

Importance of Modelling Choices
We want to choose our data model carefully, that is, decide what should be an entity, a
relation, or an attribute.
We want this model to closely reflect the domain. In this way we can know that if new data
becomes available in the domain then it will fit into the data model.
Example
With this schema, when later we find
there is information regarding a
person’s employment, how can we
add it?
If the domain is modelled true-to-life,
then extra schema (and therefore
data) is added trivially.
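The schema from the original slide is not reproduced here, so the following is only a minimal sketch in Grakn 1.8-style Graql with assumed type names (person, company, name, employment). It illustrates how a true-to-life model can be extended once employment information turns up, without reworking what is already there.

```graql
# Step 1: the initial schema, people and companies with names.
define

name sub attribute,
  value string;

person sub entity,
  has name;

company sub entity,
  has name;

# Step 2: a later, separate define query, run once employment data is available.
# The existing types only gain a role; nothing already modelled has to change.
define

employment sub relation,
  relates employee,
  relates employer;

person sub entity,
  plays employee;

company sub entity,
  plays employer;
```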

Importance of Naming - naming concepts
Naming and choosing your entities, relations and attributes are two issues that
are tied together.
The naming should closely link to typical domain terminology, since natural
language is the means to describe the domain.
Using the expected domain terminology is important because it lets another domain
expert easily:
• understand your model
• maintain it
• extend it
• adapt it.
Focussing on terminology to determine structure is a typical approach for software
architecture in general, not just for Graql schema.

Importance of Naming - naming conventions
• Naming is all in lower case
• Use hyphens
• Indent on newlines after declaring a type e.g.
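A small sketch of these three conventions, written in Grakn 1.8-style Graql. The type names group-name and social-group are chosen for illustration only.

```graql
define

# Lower case, hyphenated names; each property goes on its own indented line.
group-name sub attribute,
  value string;

social-group sub entity,
  has group-name;
```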

Choosing Type Names
A Type Name should:
• Ideally be a noun (except for attributes)
• Be singular; the presence of more than one instance lends plurality
• Be as context-specific as possible, using the exact word that describes the
concept in context
• If you can’t find a noun specific enough then try concatenating two or more
nouns by hyphenating them:
• these could become too verbose, so try to find the balance between
specificity and verbosity!
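As a rough illustration of hyphenated noun combinations (the type names are assumptions, not taken from the slide), each of the following entity types adds a second noun to a word that would be ambiguous on its own:

```graql
define

# "group" could mean many things; "social-group" names the concept in context.
social-group sub entity;

# "station" alone might be a radio station or a police station.
train-station sub entity;

# "building" narrowed down by its use.
office-building sub entity;
```

Adding a third or fourth noun would be more specific still, at the cost of longer names in every query.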

Choosing Type Names
A Type Name should:
• If a noun can’t be found, then use a present participle or past participle of a verb as an
adjective to capture the context. Combine this with a context non-specific noun.
• e.g. authored-content, since content would be too generic, and authored-book
would be too specific (as it would rule out other types of content)
• Prepositions may also need to be interjected into the naming if multiple words are
required.
• End the name with a noun, not a preposition
• i.e. avoid -to, -from, -by, -for as they tend to add confusion
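One way the authored-content name might be used is as a role in an authorship relation. This is a sketch under that assumption; the relation, the author role and the entity types are hypothetical.

```graql
define

# authored-content: past participle plus a context non-specific noun.
authorship sub relation,
  relates author,
  relates authored-content;

person sub entity,
  plays author;

# Because the role is not called authored-book, other content types fit too.
book sub entity,
  plays authored-content;

article sub entity,
  plays authored-content;
```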

Take a Look at Your Domain
Exercise - 3 minutes - individually:
• Identify one area or piece of your domain - includes at least (1) entity and (1) relation
• Draw this as a diagram, making the connections between concepts
• What naming decisions have you made vs. the existing naming you have - be prepared
to talk about one of these
• Screenshot your diagram and drop it into the #academy channel in discord.

Entities
Choosing things to model as entities
“An entity may be defined as a thing capable of an independent existence that can be
uniquely identified. An entity is an abstraction from the complexities of a domain. When
we speak of an entity, we normally speak of some aspect of the real world that can be
distinguished from other aspects of the real world.”
(https://en.wikipedia.org/wiki/Entity%E2%80%93relationship_model)
Good choices for entities are:
• Any physical thing in the real world should be modelled as an entity (e.g. animal,
person, device, building)
• Anything that exists logically but doesn’t require the involvement of other things in
order to exist, such as groups or collections of things.

Entities
Good choices for entities are:
• Use concrete/proper/common/abstract/collective nouns.
• proper – Homo sapiens (normally the name attribute of an instance)
• common:
• concrete – person, tree, car.
• collective – family, government, team, orchestra, set
• abstract – religion, pain, principle.
• To specialise a general noun, use a combination with another noun – social-group.
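The noun categories above could translate into entity types roughly as follows. This is a sketch in Grakn 1.8-style Graql; the species type and the name attribute are assumptions added to show where a proper noun would live.

```graql
define

name sub attribute,
  value string;

# Concrete nouns.
person sub entity,
  has name;
tree sub entity;
car sub entity;

# Collective nouns.
family sub entity;
orchestra sub entity;

# Abstract noun.
religion sub entity,
  has name;

# General noun specialised with another noun.
social-group sub entity;

# A proper noun such as "Homo sapiens" would be stored as a name value
# on an instance of this (hypothetical) type, not declared as a type itself.
species sub entity,
  has name;
```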

Relations
Choosing things to model as relations
It’s easy to point at certain concepts as definitely entities, e.g. car.
The more conceptual ones are harder: we have to decide whether they are entities or relations.
Relation categories
Binary relations should conform to the mathematical definitions. These definitions say that a
binary relation must either have the property or anti-property for each of the following cases:
Property   | Anti-property   | Property description
Symmetric  | Anti-symmetric  | Relation is the same in both directions
Transitive | Anti-transitive | Relation can be chained
Reflexive  | Anti-reflexive  | Roleplayer can be related to itself through that relation
This means that if a concept doesn’t have a property or anti-property, then it cannot be a binary
relation. Therefore it is likely an entity or a ternary or N-ary relation.

Relations
Example 1
An employment relation, with roles employee and employer, is antisymmetric, antitransitive and
antireflexive.
We see that it is logically consistent to define it in these terms.
Example 2
A religion is neither:
• symmetric nor antisymmetric
• transitive nor antitransitive
• reflexive nor antireflexive
Therefore we can conclude that a religion is not a binary relation.

Relations
In general, relations shouldn’t make sense without their roles. For example, a marriage can’t
logically exist without at least one spouse/husband/wife.
• In language, a relation cannot be referred to without the need to reference something else to
contextualise it. marriage is the go-to example here, since it loses context without referring to the
people that were married.
• Ideally, we are looking for the concept that connects two things, not a direct connection (often
those are role names, like employee)
• For instance, to model owning we don’t use “owns” (the edge you might use in a triple)
anywhere. Notice instead that ownership is the verbal noun of owns.
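A minimal sketch of the ownership idea in Grakn 1.8-style Graql. The role names owner and property and the entity types are assumptions; the slide's own schema is not reproduced here.

```graql
define

# The connecting concept is the noun "ownership"; the verb "owns" never appears.
ownership sub relation,
  relates owner,
  relates property;

person sub entity,
  plays owner;

car sub entity,
  plays property;
```

Where a triple would read "person owns car", this reads "a person is the owner and a car is the property in an ownership".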

Relations
• Gather together domain terminology that sounds similar to the concept you want to model. Then
determine which are candidate relation, role, and entity names (determining and naming
attributes is often not too hard).
• Remember that a role describes how a thing behaves in the scope of a relation. Examples with
roleplayer type, role, relation :
• a car behaves like property in an ownership
• a station behaves as a stop along a train-route
• a person behaves as an employee in an employment
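Taking the train-route example from the list above, the roleplayer type, role and relation could be declared and queried as sketched below. Only the names from the bullet are used; a real route would probably carry more roles and attributes.

```graql
define

# Pattern: <roleplayer type> plays <role> in <relation>.
train-route sub relation,
  relates stop;

station sub entity,
  plays stop;

# Query (run separately): every station that is a stop along some train-route.
match $r (stop: $s) isa train-route; get $s;
```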

Relations
• Relation names could describe membership to a grouping/collection of things (component,
group-membership), an action/ongoing state (marriage, comparison, authorship, participation), or a
description of a direct interrelation between two or more things (friendship, parenthood,
association, drug-protein-interaction)

Relations
• A relation is defined such that an instance should not be able to exist without relating at least
one instance for one of its roles. This is the idea that a relation is dependent upon the existence of
one or more roleplayers.
• A relation should still make sense even if any number of its roleplayers are missing
• The roles and roleplayers should make logical sense to be connected in any combination

Relations
Naming a relation
• You should find that you choose names from these categories:
• abstract nouns,
• transitive verbs that can accept 2 or more arguments – decide, agree, marry.
• their verbal nouns are preferable – decision, agreement, marriage.
• (https://www.grammar-monster.com/lessons/nouns_different_types.htm)

Relations
Naming a relation
Two naming styles are compared: names ending with nouns, and names built from nouns, verbs and prepositions.
How does it look when querying? Compare a query using nouns only with one using nouns, verbs and prepositions.
Noun combinations can be more exact, but are more verbose. It’s your choice!

Relations
Naming a relation
• Wherever possible, relations should be named in such a way that the name doesn’t include a
‘reference’ to one of its roleplayers in particular.
• parenthood is an example of a fairly unavoidable case, where the relation naming refers more
to the role of parent than the role of child.
• An example of the ideal case would be:
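The slide's own example is not available, so the following is only one possible illustration, with assumed role names, contrasting the parenthood case with a relation whose name does not single out one roleplayer:

```graql
define

# Fairly unavoidable: the name leans towards the parent role.
parenthood sub relation,
  relates parent,
  relates child;

# Closer to the ideal: the name does not favour either roleplayer.
marriage sub relation,
  relates spouse;

person sub entity,
  plays parent,
  plays child,
  plays spouse;
```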

Discussion
Let’s give it a try on our own domains
• Go back to your diagram and assess the naming choices for your relation(s)
• Add or update role players
• Be prepared to share what you found and why you made the decisions you did

Attributes
Choosing things to model as attributes
• Usually the easiest to choose, since they are the direct description of a set of values that we want
to model.
Naming an attribute
The name of an attribute should refer to a literal value.
• Make attributes context-specific
• Where necessary by concatenating words, ending with a noun (as for entities)
• Abstract nouns e.g. colour
• Adjectives e.g. friendly
• Intransitive verbs (no direct objects, can’t be followed by “who” or “what”) e.g. is-raining,
graduated.
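The attribute naming categories above might look like this in Grakn 1.8-style Graql. The value types (string, boolean) and the eye-colour name are assumptions made for the sketch.

```graql
define

# Abstract noun, made context-specific by concatenation.
eye-colour sub attribute,
  value string;

# Adjective.
friendly sub attribute,
  value boolean;

# Intransitive verbs.
is-raining sub attribute,
  value boolean;

graduated sub attribute,
  value boolean;
```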

Composition vs. Inheritance
Composition replaces the temptation of multiple inheritance.
“Entity type Y is a subtype (subclass) of an entity type X if and only if every Y is necessarily an
X” (https://en.wikipedia.org/wiki/Enhanced_entity%E2%80%93relationship_model)
Therefore define customer sub person; is a bad idea, since:
• An organisation could be a customer, therefore customer is a behaviour (a role)
• A person who plays the role of a customer could play the role of many other things, e.g. teacher
Mindset
This requires a shift in mindset: instead of subtyping, use roles to compose the behaviours of a concept
Try putting names in a context like this:
A [relation] has a [role] in the form of a [thing]
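A sketch of customer modelled as a role instead of a subtype, in Grakn 1.8-style Graql. The purchase relation and the vendor role are hypothetical names introduced for illustration.

```graql
define

# customer is a behaviour, so it is a role that several types can play.
purchase sub relation,
  relates customer,
  relates vendor;

person sub entity,
  plays customer;

organisation sub entity,
  plays customer,
  plays vendor;
```

Read with the template above: a purchase has a customer in the form of a person, or in the form of an organisation.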

Benefits of Inheritance - Scoping Queries
Wide scope: Get all the posts
Narrow scope: Get all the comments
Intermediate scope: Get all media
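The three scopes can be sketched with a small type hierarchy. The hierarchy below is an assumption based on the slide's wording (posts, comments, media); photo and video are hypothetical subtypes of media.

```graql
define

post sub entity;
comment sub post;
media sub post;
photo sub media;
video sub media;

# Queries, each run separately.

# Wide scope: every post, including comments, photos and videos.
match $p isa post; get;

# Narrow scope: comments only.
match $c isa comment; get;

# Intermediate scope: media and its subtypes.
match $m isa media; get;
```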

Benefits of Inheritance - better constraints
If we use this for companies owning offices:
And we also use this for people owning social groups.
We see a problem. Now a person can own an office, and a company can own a social
group.
With this schema Grakn cannot enforce the constraints we want.
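The situation described here can be sketched as one generic ownership relation shared by both use cases. The role names owner and property are assumptions.

```graql
define

ownership sub relation,
  relates owner,
  relates property;

# Companies owning offices.
company sub entity,
  plays owner;
office sub entity,
  plays property;

# People owning social groups.
person sub entity,
  plays owner;
social-group sub entity,
  plays property;

# Nothing above rules out this unwanted pairing:
# (owner: <a person>, property: <an office>) isa ownership
```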

Benefits of Inheritance - better constraints
We have the constraints we want, and we can still retrieve the subtypes using:
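A sketch of how subtyping the relation and its roles can give these constraints, in Grakn 1.8-style Graql. All type and role names are assumptions, and the closing query is one plausible way of reaching the subtypes, not necessarily the query shown on the slide.

```graql
define

ownership sub relation,
  relates owner,
  relates property;

office-ownership sub ownership,
  relates office-owner as owner,
  relates owned-office as property;

group-ownership sub ownership,
  relates group-owner as owner,
  relates owned-group as property;

# Each type can only play the specialised role meant for it.
company sub entity,
  plays office-owner;
office sub entity,
  plays owned-office;
person sub entity,
  plays group-owner;
social-group sub entity,
  plays owned-group;

# Query (run separately): matching the parent type also returns
# instances of office-ownership and group-ownership.
match $o isa ownership; get;
```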

Ternary and N-ary Relations
To help us see the use of ternary relations, consider someone buying a product.
Start with only binary relations: where do we add value for the sale?
Ternary, since all 3 occur at the same time: this gives us the perfect way to add the value.
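A hedged sketch of the ternary version in Grakn 1.8-style Graql. The slide does not list the three participants, so buyer, seller and sold-product are assumptions, as is the sale-value attribute name.

```graql
define

sale-value sub attribute,
  value double;

# One relation with three roles: all three take part in the same sale,
# and the value belongs to the sale itself.
sale sub relation,
  relates buyer,
  relates seller,
  relates sold-product,
  has sale-value;

person sub entity,
  plays buyer;

company sub entity,
  plays seller;

product sub entity,
  plays sold-product;
```

With only binary relations (buyer to product, seller to product) there is no single concept that the value could be attached to.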

Nested Relations
Now we can refer to the transaction in other relations.
Note that this can be favourable over adding another role to the existing relation.
This is better for:
• Consistency across schema
• Versatility, we can add more information to either of the two relations
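As an illustration of nesting, a relation can itself play a role in another relation. The refund relation and all role names below are hypothetical; only the transaction name comes from the slide.

```graql
define

transaction sub relation,
  relates buyer,
  relates seller,
  relates purchased-product,
  plays refunded-transaction;

# The refund refers to the transaction as a whole,
# instead of adding a further role to transaction.
refund sub relation,
  relates refunded-transaction,
  relates refund-issuer;

person sub entity,
  plays buyer;

company sub entity,
  plays seller,
  plays refund-issuer;

product sub entity,
  plays purchased-product;
```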

Optimisation
Schema design impacts query performance
Use context-specific relation and role names. This allows the query planner to find a
good path (otherwise all data is homogeneous: it all looks the same)

Writing our Domain Schema
Exercise - 10 minutes - individually:
• Using Workbase or a text editor, try building a schema for your domain!

An added incentive to keep improving your schema
We’re going to give away a swag pack (t-shirt, stickers, etc.) to the best
schema posted to Twitter with the tag: @graknlabs + #GraknAcademy
You’ll have 7 days from now to post - Tomás and I will be picking the
winner(s) next Wednesday morning at 9am GMT+1
