Semantics for visual resources: use cases from e-culture
1. Semantics for visual resources
Use Cases from E-Culture
Guus Schreiber
Free University Amsterdam
schreiber@cs.vu.nl
2. 2
Purpose
Analyze a number of use cases from the
e-culture domain
– Multimedia plays key role
Required technology
– Typically combination of technologies
Relation to state of the art
Acknowledgements: This presentation contains
slides and images provided by Laura Hollink,
Giang Nguyen and Cees Snoek. Also thanks to
the MultimediaN E-Culture team
3. 3
Use case: Asian chairs
User has found an image of an Asian chair
Annotation:
ex:image vra:stylePeriod aat:Guangxu .
How can we find images of Asian chairs from
the same historical period?
5. 5
Importance of time and space
information
Many queries require time/space
knowledge, either absolute or abstracted
For the chair image we can establish
– Country = China (link Chinese => China)
– Period = 1644-1911 (from Qing description)
Technology requirements:
– Thesauri relating time/space concepts
– NLP for unstructured descriptions
– Time/space reasoning techniques
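A minimal sketch of the normalization step for the chair example, assuming small hand-written lookup tables in place of real thesauri. The table names, the `period_interval` and `country_for` helpers, and the Guangxu-to-Qing link are illustrative assumptions; only the Qing interval 1644-1911 and the link Chinese => China come from the slide.

```python
# Hypothetical lookup tables. A real system would derive these from
# thesauri (style periods, place names) rather than hard-code them.
PERIOD_YEARS = {
    "Qing": (1644, 1911),  # interval taken from the slide
}
# Assumed "broader period" link: Guangxu is a reign within the Qing dynasty.
BROADER_PERIOD = {"Guangxu": "Qing"}
NATIONALITY_TO_COUNTRY = {
    "Chinese": "China",  # the link Chinese => China from the slide
}


def period_interval(period):
    """Return (start_year, end_year) for a period, walking up broader periods.

    Returns None when no interval is known anywhere along the chain.
    """
    seen = set()
    while period is not None and period not in seen:
        if period in PERIOD_YEARS:
            return PERIOD_YEARS[period]
        seen.add(period)  # guard against cycles in the broader-period links
        period = BROADER_PERIOD.get(period)
    return None


def country_for(nationality):
    """Map a nationality adjective to a country name, or None if unknown."""
    return NATIONALITY_TO_COUNTRY.get(nationality)


print(period_interval("Guangxu"))
print(country_for("Chinese"))
```

With these tables the Guangxu annotation resolves to the enclosing Qing interval, which is coarser than the reign itself; a finer table would give a narrower match.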
8. 8
Sample place information in TGN
<tgn:AdministrativePlace rdf:about="&tgn;1000111"
tgn:standardLatitude="35"
tgn:standardLongitude="105">
<vp:parentPreferred rdf:resource="&tgn;1000004"/>
……..
</tgn:AdministrativePlace>
9. 9
Issues when searching for
“nearby” Asian chairs
Close in space:
– Other country in (East) Asia
– Latitude/longitude
Close in time:
– Links between style periods
– Match time periods (and
handle incomplete
information)
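The two notions of "nearby" can be sketched as follows, assuming places carry latitude/longitude as in the TGN sample and periods are (start, end) year pairs where an unknown bound is `None`. The function names are illustrative, and treating an unknown bound as "cannot rule out overlap" is one possible policy, not the only one.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def periods_may_overlap(a, b):
    """Return False only when the two periods are provably disjoint.

    Each period is (start_year, end_year); None marks a missing bound.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end is not None and b_start is not None and a_end < b_start:
        return False
    if b_end is not None and a_start is not None and b_end < a_start:
        return False
    return True


# China's standard coordinates from the TGN sample: latitude 35, longitude 105.
print(haversine_km(35, 105, 35, 105))
print(periods_may_overlap((1644, 1911), (1875, None)))
```

A distance threshold for "close in space" and the handling of open-ended periods are application choices that would need tuning against user expectations.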
11. 11
Use case: painting style
Find paintings of a similar style
MATISSE, Henri
Le bonheur de vivre (The Joy of Life)
1905-1906
Oil on canvas, 69 1/8 x 94 7/8 in. (175 x 241 cm)
Barnes Foundation, Merion, PA
12. 12
How can we find this other Fauve
painting?
DERAIN, Andre
The Turning Road, L'Estaque, 1906
Oil on canvas, 51 x 76 3/4 in. (129.5 x 195 cm)
Museum of Fine Arts, Houston, Texas
13. 13
Issues
Parse annotation to find matches with thesaurus
terms
– E.g. match artists to ULAN individuals
Artist-style links
– AAT contains styles; ULAN contains artists, but there
is no link
• Learn link from corpora
• Derive it from other annotations
– Domain-specific rules/reasoning needed
• see example in SWRL doc
• Painters may have painted in multiple styles
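One way to derive artist-style links from other annotations is to count how often each style co-occurs with each artist. The sketch below assumes annotation records with `creator` and `style` fields; the records themselves are hypothetical and only stand in for a real collection. It returns a ranked list per artist because painters may have worked in multiple styles.

```python
from collections import Counter, defaultdict

# Hypothetical annotation records, for illustration only.
ANNOTATIONS = [
    {"creator": "Matisse, Henri", "style": "Fauve"},
    {"creator": "Derain, Andre", "style": "Fauve"},
    {"creator": "Matisse, Henri", "style": None},  # style missing
]


def derive_artist_styles(annotations):
    """Count style occurrences per artist, skipping incomplete records."""
    counts = defaultdict(Counter)
    for record in annotations:
        creator = record.get("creator")
        style = record.get("style")
        if creator and style:
            counts[creator][style] += 1
    return counts


def candidate_styles(counts, artist, min_count=1):
    """Return the artist's styles, most frequent first, above a threshold."""
    per_artist = counts.get(artist, Counter())
    return [s for s, n in per_artist.most_common() if n >= min_count]


counts = derive_artist_styles(ANNOTATIONS)
print(candidate_styles(counts, "Derain, Andre"))
```

Raising `min_count` trades recall for precision; with sparse annotations a single co-occurrence may be all the evidence available.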
17. 17
Issues w.r.t. thesauri
Public availability!
RDF/OWL representation
Learning/specifying term/concept mapping
– owl:equivalentClass, owl:sameAs,
rdf:type, rdfs:subClassOf
– Domain-specific links
Managing the evolution of the thesauri and
the mappings
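Equivalence-style mappings between thesauri (for example owl:sameAs links) are symmetric and transitive, so a set of pairwise mappings can be collapsed into groups of equivalent terms. A minimal sketch using a union-find structure follows; the term identifiers are hypothetical, and subclass or domain-specific links would need separate handling.

```python
def build_equivalence(pairs):
    """Build a lookup that maps each term to a representative of its group.

    `pairs` is an iterable of (term_a, term_b) equivalence mappings.
    Returns a function find(term) -> representative term.
    """
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a
    return find


# Hypothetical term identifiers from three imaginary thesauri.
MAPPINGS = [
    ("thesA:chair", "thesB:chair"),
    ("thesB:chair", "thesC:seat-chair"),
]
find = build_equivalence(MAPPINGS)
print(find("thesA:chair") == find("thesC:seat-chair"))
```

When a thesaurus evolves, the pair list can be edited and the groups rebuilt, which keeps the mapping maintenance separate from the lookup logic.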
18. 18
Use case: find images with the
same subject
Find another painting which portrays dancing
19. 19
Issues
Same subjects can be visually very
different
Subject is often missing from the
annotation
Mismatch: users often search for subjects
of images
20. 20
Conceptual subject descriptions
85% of the user queries:
General: descriptions of generally known items. Only general,
everyday knowledge is necessary. Descriptions are at the
level of the natural categories of E. Rosch (1973), or more
general. E.g. an ape eating a banana.
Specific: descriptions of objects or scenes that can be identified
and named. Specific domain knowledge is necessary to
recognize the objects or scenes. E.g. the old male gorilla
Kumba, born in Cameroon and now living in Artis, Amsterdam.
Abstract: descriptions for which interpretative knowledge is
used. This category is subjective. E.g. an animal threatened
with extinction.
21. 21
Example concepts in image
Specific
– Fall of the Berlin Wall
General
– People walking at night
Abstract
– Fall of the Iron Curtain
22. 22
Use of conceptual categories by
people searching for images
Conceptual level: 83%
[Bar chart: number of elements in % of conceptual elements, per characteristic (event, time, place, relation, scene, object); series: Abstract, Specific, General]
29. 29
Some forms of image content are
well suited to image analysis
Collection of clothes
Abstract painting
30. 30
The semantic gap
The distance between Content-Based Image
Retrieval and semantics:
– Smeulders, Worring, Santini, Gupta, Jain. Content-
based image retrieval at the end of the early years.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 22(12), December 2000.
Direct links between visual features and
semantic concepts become more difficult when
the domain is broader / more general
35. 35
Bridging the semantic gap:
CBIR and ontologies
Visual WordNet (GE paper)
– Adding knowledge about visual characteristics
to WordNet: mobility, color, …
– Build detectors for the visual features
– Use visual data to prune the tree of categories
when analyzing a visual object
37. 37
Experiment: pruning the search
for “conveyance” concepts
6 concepts found
Including taxi cab
12 concepts found
Including passenger train
and commuter train
Three visual features: material, motion, environment
Assumption is that these work perfectly
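The pruning idea can be sketched as a filter over candidate concepts, using the three feature names from the experiment (material, motion, environment). The concept table and its feature values below are hypothetical, and, as on the slide, the sketch assumes the observed feature values are correct. A concept with no value recorded for a feature is kept rather than pruned.

```python
# Hypothetical visual characteristics attached to "conveyance" concepts.
CONCEPTS = {
    "taxi cab": {"material": "metal", "motion": "moving", "environment": "road"},
    "passenger train": {"material": "metal", "motion": "moving", "environment": "rail"},
    "commuter train": {"material": "metal", "motion": "moving", "environment": "rail"},
    "sailboat": {"motion": "moving", "environment": "water"},
}


def prune(concepts, observed):
    """Keep concepts whose known features do not contradict the observation.

    `observed` maps feature names to detected values. A feature missing
    from a concept's entry is treated as unknown and never prunes it.
    """
    kept = []
    for name, features in concepts.items():
        if all(features.get(key, value) == value for key, value in observed.items()):
            kept.append(name)
    return kept


print(prune(CONCEPTS, {"environment": "rail"}))
print(prune(CONCEPTS, {"material": "metal", "environment": "road"}))
```

With imperfect detectors, a hard filter like this would discard correct concepts; a scored ranking would be a more forgiving alternative.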
38. 38
Bridging the semantic gap:
concept detectors
Snoek et al., TRECVID2004
– 185 hours of news video
32 detectors for concepts in news video
– Through machine learning
Similarity detectors based on keywords
and visual analysis
Query interface in which these functions
can be combined
44. 44
Main observation
A combination of many different techniques
is needed to be able to cope with the
complexity of multimedia semantics
– NLP, segmentation, CBIR, visual feature
detectors, visual ontologies, publicly available
thesauri, thesauri mappings, dedicated
reasoning techniques (time, space, default),
personalization, presentation generation
Key role for user studies
Editor's Notes
Criteria:
Indirectly derived from the image
Interpretation and domain knowledge are required