Constructing a Thesaurus of Irish Folklore
Using Facet Analysis
The MoTIF Project

LAI CMG Annual Seminar, November 8 2013
Project Aims
Thesaurus Construction Guidelines

MoTIF: Pilot Thesaurus of Irish Folklore

Restricted List of Terms

Consistency, remove ambiguity,
improve precision

Controlled
Vocabularies

Restricted List of Terms

Consistency, remove ambiguity,
improve precision

Browsing, navigating

Taxonomies

Hierarchical relationships
Restricted List of Terms

Consistency, remove ambiguity,
improve precision

Browsing, navigating

Synonyms, antonyms, making
connections, definitions, scope

Thesauri

Hierarchical relationships

Equivalence Relationships,
Associative Relationships,
Scope Notes
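
As a rough illustration of how the three relationship types and scope notes fit together, here is a minimal sketch in Python. It is not the data model of the MoTIF thesaurus or of any thesaurus software; the class and method names are made up for this example, and the links between the sample terms (including the non-preferred term "cattle") are assumptions chosen only to show the mechanics.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One preferred term and its relationships (illustrative model only)."""
    preferred: str
    used_for: set[str] = field(default_factory=set)  # UF: non-preferred terms
    broader: set[str] = field(default_factory=set)   # BT
    narrower: set[str] = field(default_factory=set)  # NT
    related: set[str] = field(default_factory=set)   # RT
    scope_note: str = ""                             # SN


class Thesaurus:
    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}
        self.use: dict[str, str] = {}  # USE: non-preferred -> preferred

    def add(self, preferred: str, scope_note: str = "") -> Concept:
        concept = self.concepts.setdefault(preferred, Concept(preferred))
        if scope_note:
            concept.scope_note = scope_note
        return concept

    def add_equivalence(self, preferred: str, non_preferred: str) -> None:
        # Equivalence: UF on the preferred term, USE on the non-preferred one.
        self.add(preferred).used_for.add(non_preferred)
        self.use[non_preferred] = preferred

    def add_hierarchy(self, broader: str, narrower: str) -> None:
        # Hierarchical: BT and NT are recorded as a reciprocal pair.
        self.add(broader).narrower.add(narrower)
        self.add(narrower).broader.add(broader)

    def add_related(self, a: str, b: str) -> None:
        # Associative: RT is symmetric, so both concepts point at each other.
        self.add(a).related.add(b)
        self.add(b).related.add(a)

    def resolve(self, term: str) -> str:
        """Map a search term to its preferred term, if it has one."""
        return self.use.get(term, term)


# Hypothetical sample data, for illustration only.
t = Thesaurus()
t.add_hierarchy("livestock", "cows")
t.add_equivalence("cows", "cattle")
t.add_related("cows", "milk")
```

With this data, `t.resolve("cattle")` leads a searcher from the non-preferred term to "cows", which is the benefit of equivalence relationships that a plain controlled list lacks.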

Custom vocabularies
Adapted vocabularies
International standards
and best practice

Guidelines
Literature review
Main elements of a
thesaurus
Terms and concepts
Relationships (USE, UF, BT,
NT, RT)
Notes, node labels and
arrays

Facet analysis
Construction process

MoTIF Construction Process
1. Selection of terms
2. Structure
3. Facet Analysis
4. Relationships
5. Alphabetical List
6. Expert Review
7. Documentation
= Thesaurus

Term Selection I
Vocabulary Resources
A Handbook of Irish
Folklore by Seán Ó
Súilleabháin
Béaloideas: Journal of
the Folklore of Ireland
Society

Term Selection II
Form of Entry
Nouns: count nouns (cows, dogs) in the plural,
non-count (livestock, milk) in the singular.
Verbs: gerund or verbal noun, not the infinitive.
Adjectives: avoid unless significant.
Adverbs: avoid.
Articles (the, a): avoid.
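
The form-of-entry rules above lend themselves to a simple automated first pass over a candidate term list. The sketch below is only a heuristic, assuming English terms; the function name and warning strings are invented for this example, and anything it flags still needs a human decision (a word ending in -ly is not always an adverb, and some adjectives are significant enough to keep).

```python
ARTICLES = {"a", "an", "the"}


def form_of_entry_warnings(term: str) -> list[str]:
    """Return heuristic warnings for a candidate term (illustrative only)."""
    words = term.lower().split()
    if not words:
        return ["empty term"]
    warnings = []
    if words[0] in ARTICLES:
        # Articles are to be avoided in entry terms.
        warnings.append("leading article")
    if words[0] == "to" and len(words) > 1:
        # Verbs should appear as gerunds or verbal nouns, not infinitives.
        warnings.append("possible infinitive")
    if len(words) == 1 and words[0].endswith("ly"):
        # Crude adverb test; flags for review rather than rejecting.
        warnings.append("possible adverb")
    return warnings
```

For example, `form_of_entry_warnings("the cows")` reports a leading article, while `form_of_entry_warnings("cows")` returns an empty list.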
Systematic Structure
Hierarchical (facet) or classified (subject) display
Fundamental facets as top concepts (TT).
Easily updated structure.
Good demonstration of the ISO 25964 hierarchy
rules.
Thing/kind
Whole/part
Particular instances of a class
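
The three permitted kinds of hierarchical relationship can be made explicit by typing each broader/narrower link. The sketch below assumes the optional typed tags that, as far as I understand it, ISO 25964-1 allows alongside plain BT/NT (BTG/NTG for generic, BTP/NTP for partitive, BTI/NTI for instance). The function name is hypothetical, and the sample pairs are adapted from the examples given in the notes.

```python
from enum import Enum


class HierarchyType(Enum):
    GENERIC = ("BTG", "NTG")    # thing / kind
    PARTITIVE = ("BTP", "NTP")  # whole / part
    INSTANCE = ("BTI", "NTI")   # class / particular instance


def hierarchy_entries(broader, narrower, kind):
    """Return the reciprocal pair of entries for one hierarchical link."""
    bt_tag, nt_tag = kind.value
    return [
        (narrower, bt_tag, broader),  # entry under the narrower term
        (broader, nt_tag, narrower),  # entry under the broader term
    ]


# Sample links adapted from the examples in the notes (illustrative only).
links = (
    hierarchy_entries("animals", "dogs", HierarchyType.GENERIC)
    + hierarchy_entries("human anatomy", "hands", HierarchyType.PARTITIVE)
    + hierarchy_entries("dogs", "Spot", HierarchyType.INSTANCE)
)
```

A link that fits none of the three types is a sign that the two terms belong in an associative relationship rather than a hierarchical one.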
Facet Analysis
ISO: “grouping of concepts of the same inherent
category”
Objects, materials, people, places, etc.

Ranganathan, 1920s and 1930s
Personality, Matter, Energy, Space, Time

Classification Research Group, 1960s
Thing, kind, part, property, material, process,
operation, agent, patient, product, by-product, space
and time
Time
Place / Space /
Environment
Products
Activities
Processes and
Phenomena
Events

Agents
Objects
Materials
Attributes and
Properties
Parts
Genre
Abstract Entities and
Concepts
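
To show how one of these facets might be divided further, the sketch below renders the Agents facet as an indented hierarchical display, using the sub-facets and characteristics of division described in the notes. The nested-dictionary layout, the convention of writing node labels in parentheses, and the placement of the sample terms are all assumptions made for this example, not the structure of the pilot thesaurus itself.

```python
def is_node_label(label):
    """Node labels organise the display but are not terms (assumed convention)."""
    return label.startswith("(") and label.endswith(")")


def render(tree, depth=0):
    """Return the hierarchy as a list of indented lines."""
    lines = []
    for label, children in tree.items():
        lines.append("  " * depth + label)
        lines.extend(render(children, depth + 1))
    return lines


def terms(tree):
    """Collect every label that is a term rather than a node label."""
    found = []
    for label, children in tree.items():
        if not is_node_label(label):
            found.append(label)
        found.extend(terms(children))
    return found


# Sub-facets follow the notes; the terms under each node label are hypothetical.
agents = {
    "Agents": {
        "people": {},
        "animals": {
            "(animals by function)": {"livestock": {}},
            "(animals by species)": {"cows": {}, "dogs": {}},
        },
        "other living organisms": {},
        "supernatural beings": {},
    }
}

display = render(agents)
```

Printing `"\n".join(display)` gives the indented display, and `terms(agents)` lists the indexable terms with the node labels left out.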
Current and Future Work
Expansion of the pilot thesaurus to approximately
2,000 preferred terms
Feasibility study into representation in SKOS

Potential Future Work
Multilingual thesaurus (Irish and English)
Representation in SKOS
Mapping to other vocabularies
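
For a sense of what a SKOS representation could look like, the sketch below writes one concept as Turtle using plain string formatting. It assumes the usual mapping of thesaurus relationships onto SKOS properties (preferred term to skos:prefLabel, UF to skos:altLabel, BT to skos:broader, RT to skos:related, scope note to skos:scopeNote). The `motif:` namespace URI and the sample terms are placeholders, not the project's identifiers, and labels containing quotation marks would need escaping that is omitted here.

```python
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix motif: <http://example.org/motif/> .\n"  # placeholder namespace
)


def concept_to_turtle(local_id, pref_label, alt_labels=(), broader=(),
                      related=(), scope_note=None, lang="en"):
    """Serialise one concept as a Turtle statement (illustrative only)."""
    pairs = [
        ("a", "skos:Concept"),
        ("skos:prefLabel", f'"{pref_label}"@{lang}'),
    ]
    pairs += [("skos:altLabel", f'"{label}"@{lang}') for label in alt_labels]
    pairs += [("skos:broader", f"motif:{target}") for target in broader]
    pairs += [("skos:related", f"motif:{target}") for target in related]
    if scope_note:
        pairs.append(("skos:scopeNote", f'"{scope_note}"@{lang}'))
    body = " ;\n    ".join(f"{prop} {obj}" for prop, obj in pairs)
    return f"motif:{local_id} {body} ."


# Hypothetical sample concept.
ttl = concept_to_turtle("cows", "cows", alt_labels=["cattle"],
                        broader=["livestock"])
document = PREFIXES + "\n" + ttl + "\n"
```

The `lang` parameter is where a multilingual thesaurus would come in: the same concept could carry one preferred label tagged `en` and another tagged `ga`.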
Thank you!
UF Thanks!
UF Cheers!
UF Ta!

The MoTIF Project: Constructing a Pilot Thesaurus of Irish Folklore Using Facet Analysis - Catherine Ryan

Editor's Notes

  1. MoTIF is a collaborative project undertaken by the Digital Repository of Ireland and the National Library of Ireland. It produced a set of guidelines on thesaurus construction as a resource for librarians, archivists and other information professionals who wish to organise and annotate their content for improved search and retrieval. The guidelines are accompanied by a pilot thesaurus of Irish folklore, which acts as an illustrative example: a visual demonstration of the principles and processes outlined in the guidelines. Both guidelines and pilot have been submitted for review and will be published in December 2013.
  2. Controlled vocabularies are restricted lists of terms used to provide consistency across search, remove ambiguity between terms and improve search precision. They may contain equivalence relationships, such as USE, Used For (UF) and see references, but they don’t have to.
  3. Taxonomies are controlled vocabularies with hierarchical relationships, which can be used for browsing up and down a tree or navigating a website.
  4. Thesauri are controlled vocabularies with hierarchical, associative and equivalence relationships. They offer all the benefits previously described but can also make more connections between terms using associative relationships, allow search over non-preferred terms using equivalence relationships, and clarify the meaning of terms using definitions and scope notes. Now, the definitions of these three types of vocabulary can overlap, but that, broadly speaking, is how they work.
  5. Following the Digital Archiving in Ireland DRI report, an opportunity was identified to produce guidelines which would give professionals the advice they need to improve their own data practices by adhering to international standards and best practices.
  6-9. The guidelines act as a basic introduction to thesauri, including the main elements of a thesaurus, facet analysis, and the construction process. They also briefly cover planning, multilingual thesauri, mapping thesauri and the relationship between thesauri and the Semantic Web.
  10. The construction process in brief
  11. Step 1 involves the selection and recording of terms
  12. Step 2: determining structure and display of the thesaurus, if it will be organised by subject or by facet.
  13. Step 3: the facet analysis itself
  14. Step 4: creating relationships and notes within the now complete structure
  15. Step 5: creating an alphabetical list (if desired)
  16. Step 6: expert review
  17. Step 7: documentation
  18. The result: the thesaurus.
  19. The correct form of these terms was then chosen. ISO 25964-1 sets out guidelines on the form that a term should take when entered into the thesaurus. Nouns are the most common form you will encounter. Verbs are the next most common and take the gerund or verbal noun form (usually ending in -ing). Adjectives, adverbs (usually ending in -ly) and articles are to be avoided. Some adjectives were included in the pilot thesaurus because they did come up as significant in the literature. For example, the significance of wearing red on a particular day might be discussed as part of the lore of a particular area.
  20. The initial list of terms is all over the place and a systematic structure needs to be developed to order them in a logical way.
  21. A systematic structure can be thought of as a hierarchical or classified display with subjects or facets at the top of the hierarchies. At this stage a hierarchical display was chosen, with fundamental facets as the main divisions, or top concepts, because this structure is more easily updated and is a good demonstration of the ISO standard's rules for hierarchical arrangement. Under those rules, broader and narrower terms should stand in one of three types of relationship: a thing/kind relationship (all concepts in the Objects facet will be a kind or type of object); a whole/part relationship (the human anatomy will have hands, heads and so on); or an instance relationship, where the narrower term is a particular instance of the broader term (for example, the class ‘dogs’ would have a narrower term ‘Spot’). It’s important to emphasise that we didn’t structure the thesaurus at this stage; we only made the decision on the design of the scaffolding. More detailed elements of the systematic structure were only determined during the vocabulary analysis.
  22. We used the method of facet analysis, which is the analysis of a subject area into its constituent concepts, which are then grouped into facets. The ISO thesaurus standard defines a facet as a ‘grouping of concepts of the same inherent category’. Objects, materials, people, places and so on are known as fundamental facets. These fundamental categories of facets were first devised by Ranganathan as part of a library classification scheme in the 1920s and 1930s. Ranganathan proposed five categories, Personality, Matter, Energy, Space and Time, or PMEST, which could cover all aspects of a discipline or subject. These were later expanded by Brian Vickery for the Classification Research Group (CRG), based on the Aristotelian fundamental categories: thing, kind, part, property, material, process, operation, agent, patient, product, by-product, space and time. The CRG went on to state that these categories act as guides to analysis and should not be imposed on subjects. Ultimately the choice of facets will depend on the subject matter and what is most practical.
  23. What was most practical for the pilot were the facets listed above. How to organise..
  24. ...a jumble of words into...
  25. ...into lists of basic coherent facets. For example, in the literature, an agent is a person or piece of equipment which carries out transitive actions, i.e. actions that require a direct object. Following this, animals, fish and people were placed under the Agents facet, as these were living creatures which can perform actions and can have an effect on the environment around them. The category also includes supernatural beings and creatures. Other living organisms were originally located under a separate Living Entities facet. In the end, the decision was taken to include all living organisms, from people through to mythical beings and plants, under the Agents facet, as it made more sense to keep all these living entities together. It is also arguable that, in folklore, some plants, trees and other such living entities have the potential to perform actions or have an effect on others. So that made sense in the context of folklore. It may not in another. Rather than confuse people, equipment was then put into the Objects facet. The guidelines go through a few more tricky decisions and they also outline the scope of each fundamental facet as defined for the pilot thesaurus.
  26. Facet analysis II. Once the initial analysis had been completed and all terms grouped, the facets were then split into narrower divisions, using node labels to divide the facets into sub-facets and to organise them according to their principles, or characteristics, of division. In the above example, the Agents facet has the sub-facets people, animals, other living organisms and supernatural beings. The animals sub-facet is then organised according to its characteristics of division, in this case animals by function, by species and so on. This is exactly the kind of division you would see on, say, a fashion website where shirts are organised by size, by colour and so on.
  27. Once the hierarchies were completed and input into the thesaurus management software, associative relationships were added. This is the process recommended by the ISO standard, because the most useful associative relationships usually run across hierarchies and so are easier to create once those hierarchies have been established. These are examples of the most common types of relationship created across hierarchies, so we have agents relating to their activities, materials referring to their products, objects with parts, etc. It should also be noted that these relationships are reciprocal, so the two terms refer to each other.
  28. After that, example scope notes were added to the pilot thesaurus to explain concepts. Like the relationships, the scope notes present in the pilot thesaurus should be considered as illustrative examples as this was as much as the time frame of the project would allow.
  29. Two lists, alphabetical and hierarchical, were then generated within the software and exported. These formed the basis of the print version of the thesaurus. An electronic version of the thesaurus also exists; it contains both hierarchical and alphabetical displays, which can be browsed. It can also be searched by keyword.
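
The alphabetical list mentioned in the last note can be sketched in a few lines. This is not how the thesaurus management software used in the project works, which the notes do not describe; it is a minimal illustration that assumes each preferred term is stored with lists of related terms keyed by the standard tags, and that non-preferred terms are held in a separate USE mapping. The sample records are hypothetical.

```python
TAG_ORDER = ("SN", "UF", "BT", "NT", "RT")


def alphabetical_display(records, use):
    """Build an alphabetical display with one entry per term (illustrative)."""
    entries = {}
    for term, relations in records.items():
        lines = [term]
        for tag in TAG_ORDER:
            for value in sorted(relations.get(tag, [])):
                lines.append(f"  {tag} {value}")
        entries[term] = lines
    for non_preferred, preferred in use.items():
        # Non-preferred terms get their own entry pointing at the preferred term.
        entries[non_preferred] = [non_preferred, f"  USE {preferred}"]
    output = []
    for key in sorted(entries, key=str.casefold):
        output.extend(entries[key])
    return output


# Hypothetical sample records.
records = {
    "cows": {"UF": ["cattle"], "BT": ["livestock"], "RT": ["milk"]},
    "livestock": {"NT": ["cows"]},
    "milk": {"RT": ["cows"]},
}
use = {"cattle": "cows"}

listing = alphabetical_display(records, use)
```

Joining `listing` with newlines gives entries for cattle, cows, livestock and milk in that order, with the non-preferred term "cattle" redirecting to "cows".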