A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
Panos Alexopoulos, John Pavlopoulos
14th Conference of the European Chapter of the Association for Computational
Linguistics
Gothenburg, Sweden, 26–30 April 2014
2. 2
Vagueness
Introduction
●Vagueness is a semantic phenomenon where predicates admit
borderline cases, i.e. cases where it is indeterminate whether the
predicate applies (Shapiro 2006).
●This happens when predicates have blurred boundaries:
● What’s the threshold number of years separating old and not old
films?
● What are the exact criteria that distinguish modern restaurants
from non-modern?
3. 3
Vagueness Consequences
Introduction
●The problem with vague terms in semantic data is the possibility of
disagreements!
●E.g., when we asked domain experts to provide instances of the
concept Critical Business Process, there were certain processes for
which there was a dispute among them about whether they should be
regarded as critical or not.
●The problem was that different experts had different criteria of
process criticality and could not decide which of these were
sufficient to classify a process as critical.
4. 4
Problematic Scenarios
Introduction
1. Structuring Data with a Vague Ontology: Possible
disagreement among experts when defining class and relation
instances.
2. Utilizing Vague Facts in Ontology-Based Systems:
Reasoning results might not meet users’ expectations.
3. Integrating Vague Semantic Information: The merging of
particular vague elements can lead to data that will not be
valid for all its users.
5. 5
Problem Definition & Approach
Automatic Vagueness Detection
Problem Definition
●Can we automatically determine whether an ontology entity (class, relation, etc.)
is vague or not?
● “StrategicClient”, defined as “A client that has a high value for the company”, is
vague!
● “AmericanCompany”, defined as “A company that has legal status in the
United States”, is not!
Approach
●We train a binary classifier to distinguish between vague and non-vague
word senses.
●Training is supervised, using examples from WordNet.
●We use this classifier to determine whether a given ontology element definition
is vague or not.
6. 6
Data
Automatic Vagueness Detection
●2,000 adjective senses from WordNet.
● 1,000 vague
● 1,000 non-vague
●Inter-annotator agreement on the vague/non-vague annotation among 3 human
judges was 0.64 (Cohen’s Kappa)
Vague Senses
• Abnormal: not normal, not typical or usual or regular or conforming to a norm
• Impenitent: impervious to moral persuasion
• Notorious: known widely and usually unfavorably
• Aroused: emotionally aroused
Non Vague Senses
• Compound: composed of more than one part
• Biweekly: occurring every two weeks.
• Irregular: falling below the manufacturer's standard
• Outermost: situated at the farthest possible point from a center.
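The agreement figure above is reported as Cohen's Kappa, a statistic defined for a pair of annotators. The slides do not say how the three judges were combined, so the sketch below assumes one possible reading: compute kappa for each pair of judges and average the results. The function names and the toy labels are hypothetical and are not the annotation data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over all annotator pairs (an assumed scheme)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy annotations for four senses by three judges.
judge_1 = ["vague", "vague", "non-vague", "non-vague"]
judge_2 = ["vague", "non-vague", "non-vague", "non-vague"]
judge_3 = ["vague", "vague", "non-vague", "vague"]

print(cohen_kappa(judge_1, judge_2))
print(mean_pairwise_kappa([judge_1, judge_2, judge_3]))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters here because the dataset is balanced between two labels and chance agreement is therefore high.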
7. 7
Training and Evaluation
Automatic Vagueness Detection
●80% of the data used to train a multinomial Naive Bayes classifier.
●We removed stop words and used a bag-of-words representation
for each instance.
●The remaining 20% of the data was used as a test set.
●Classification accuracy was 84%!
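The training setup on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the stop-word list, the add-one smoothing, the random seed, and all function and class names are assumptions, and the eight examples are only the sample senses shown on the data slide, far too few to reproduce the reported accuracy.

```python
import math
import random
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the slides do not name one).
STOP_WORDS = {"a", "an", "the", "of", "or", "and", "to", "in", "at",
              "from", "than", "every"}


def tokenize(definition):
    """Lowercase a definition and return its words without stop words."""
    words = re.findall(r"[a-z]+", definition.lower())
    return [w for w in words if w not in STOP_WORDS]


class MultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts, add-one smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for doc, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def predict(self, document):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(doc_count / total_docs)  # log prior
            for word in tokenize(document):
                if word in self.vocabulary:  # words never seen are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_test_split(examples, train_fraction=0.8, seed=0):
    """Shuffle with a fixed seed and split into train and test parts."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, examples):
    correct = sum(model.predict(text) == label for text, label in examples)
    return correct / len(examples)


# Sample sense definitions taken from the data slide.
examples = [
    ("not normal, not typical or usual or regular or conforming to a norm", "vague"),
    ("impervious to moral persuasion", "vague"),
    ("known widely and usually unfavorably", "vague"),
    ("emotionally aroused", "vague"),
    ("composed of more than one part", "non-vague"),
    ("occurring every two weeks", "non-vague"),
    ("falling below the manufacturer's standard", "non-vague"),
    ("situated at the farthest possible point from a center", "non-vague"),
]

train, test = train_test_split(examples)
model = MultinomialNB().fit([t for t, _ in train], [l for _, l in train])
print(accuracy(model, test))
```

The same `predict` call could be applied to the textual definition of an ontology element, which is how the classifier is used in the use case that follows.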
8. 8
Comparison with Subjectivity Analyzer
Automatic Vagueness Detection
●We also used a subjective sense classifier to classify our dataset’s
senses as subjective or objective.
●From the 1000 vague senses, only 167 were classified as subjective
while from the 1000 non-vague ones 993.
●This shows that treating vagueness in the same way as
subjectivity is not really effective.
9. 9
Use Case: Detecting Vagueness in CiTO Ontology
Automatic Vagueness Detection
●As an ontology use case we considered CiTO, an ontology that
enables characterization of the nature or type of citations.
●CiTO consists primarily of relations, many of which are vague (e.g.
plagiarizes).
●We selected 44 relations and we had 3 human judges manually
classify them as vague or not.
●Then we applied our WordNet-trained vagueness classifier to the
textual definitions of the same relations.
10. 10
Use Case: Detecting Vagueness in CiTO Ontology
Automatic Vagueness Detection
Vague Relations
• plagiarizes: A property indicating that the author of the citing entity
plagiarizes the cited entity, by including textual or other elements
from the cited entity without formal acknowledgement of their source
• citesAsAuthority: The citing entity cites the cited entity as one that
provides an authoritative description or definition of the subject under
discussion.
Non Vague Relations
• sharesAuthorInstitutionWith: Each entity has at least one author that
shares a common institutional affiliation with an author of the other
entity
• providesDataFor: The cited entity presents data that are used in work
described in the citing entity.
11. 11
Use Case: Detecting Vagueness in CiTO Ontology
Automatic Vagueness Detection
●Classification Results:
● 82% of relations were correctly classified as vague/non-vague
● 94% accuracy for non-vague relations.
● 74% accuracy for vague relations.
●Again, we classified the same relations with the subjectivity classifier:
● 40% of vague/non-vague relations were classified as
subjective/objective respectively.
● 94% of non-vague were classified as objective.
● 7% of vague relations were classified as subjective.
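The results above combine an overall accuracy with an accuracy per class. A minimal sketch of how such figures can be computed from gold and predicted labels follows; the function name and the toy labels are hypothetical and do not reproduce the judgments on the 44 CiTO relations.

```python
from collections import defaultdict


def accuracy_report(gold, predicted):
    """Return overall accuracy and the accuracy within each gold class."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal length")
    totals = defaultdict(int)
    hits = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += int(g == p)
    overall = sum(hits.values()) / len(gold)
    per_class = {label: hits[label] / totals[label] for label in totals}
    return overall, per_class


# Hypothetical toy labels, not the CiTO judgments.
gold = ["vague", "vague", "vague", "vague", "non-vague", "non-vague"]
predicted = ["vague", "vague", "vague", "non-vague", "non-vague", "non-vague"]

overall, per_class = accuracy_report(gold, predicted)
print(overall, per_class)
```

The overall figure is the average of the per-class figures weighted by class size, so it always lies between the lowest and highest per-class accuracy.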
12. 12
Future Work
Vagueness-Aware Semantic Data
●Incorporate the current classifier into an ontology analysis tool
●Improve the classifier by exploring new features
●See whether it is possible to build a vague sense lexicon.
13. 13
Questions?
Thank you!
Dr. Panos Alexopoulos
Semantic Applications Research Manager
palexopoulos@isoco.com
(t) +34 913 349 797