Semantics for visual resources: use cases from e-culture

Semantics for visual resources
Use Cases from E-Culture
Guus Schreiber
Free University Amsterdam
schreiber@cs.vu.nl

2
Purpose
▪ Analyze a number of use cases from the e-culture domain
– Multimedia plays a key role
▪ Required technology
– Typically a combination of technologies
▪ Relation to the state of the art
Acknowledgements: This presentation contains
slides and images provided by Laura Hollink,
Giang Nguyen and Cees Snoek. Also thanks to
the MultimediaN E-Culture team

3
Use case: Asian chairs
User has found an image of an Asian chair
Annotation:
ex:image vra:stylePeriod aat:Guangxu .
How can we find images of Asian chairs from
the same historical period?

5
Importance of time and space
information
▪ Many queries require time/space
knowledge, either absolute or abstracted
▪ For the chair image we can establish
– Country = China (link Chinese => China)
– Period = 1644-1911 (from Qing description)
▪ Technology requirements:
– Thesauri relating time/space concepts
– NLP for unstructured descriptions
– Time/space reasoning techniques
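A minimal sketch of the kind of time reasoning listed above, assuming periods are reduced to year ranges. The `Period` class and `may_overlap` function are hypothetical names, not part of any thesaurus API. The Qing dates come from the slide; the Guangxu reign dates and the undated record are added for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Period:
    label: str
    start: Optional[int] = None  # first year; None when unknown
    end: Optional[int] = None    # last year; None when unknown


def may_overlap(a: Period, b: Period) -> bool:
    """Return False only when the known bounds prove the periods are disjoint."""
    if a.end is not None and b.start is not None and a.end < b.start:
        return False
    if b.end is not None and a.start is not None and b.end < a.start:
        return False
    return True


qing = Period("Qing", 1644, 1911)              # dates from the slide
guangxu = Period("Guangxu", 1875, 1908)        # reign dates, illustrative
undated = Period("undated chair", None, 1700)  # hypothetical incomplete record

print(may_overlap(guangxu, qing))  # True
print(may_overlap(undated, qing))  # True
```

Treating an unknown bound as "cannot rule out" keeps incompletely dated items in the result set, which favours recall; a stricter policy would be an equally valid design choice.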

8
Sample place information in TGN
<tgn:AdministrativePlace rdf:about="&tgn;1000111"
tgn:standardLatitude="35"
tgn:standardLongitude="105">
<vp:parentPreferred rdf:resource="&tgn;1000004"/>
……..
</tgn:AdministrativePlace>

9
Issues when searching for
“nearby” Asian chairs
▪ Close in space:
– Other country in (East) Asia
– Latitude/longitude
▪ Close in time:
– Links between style periods
– Match time periods (and handle incomplete information)
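The two notions of "close in space" above can be sketched as follows, assuming each place record carries a latitude, a longitude and a parent link as in the TGN sample. The `PLACES` table is a hypothetical stand-in: only the first entry mirrors the sample record, and the `ex:` entries and their coordinates are invented for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical place table; only the first entry follows the TGN sample.
PLACES = {
    "tgn:1000111": {"lat": 35.0, "lon": 105.0, "parent": "tgn:1000004"},
    "ex:placeB": {"lat": 36.0, "lon": 138.0, "parent": "tgn:1000004"},
    "ex:placeC": {"lat": 46.0, "lon": 2.0, "parent": "ex:regionX"},
}


def nearby(place_id, max_km):
    """Places that share a parent with place_id or lie within max_km of it."""
    origin = PLACES[place_id]
    hits = []
    for pid, place in PLACES.items():
        if pid == place_id:
            continue
        distance = haversine_km(origin["lat"], origin["lon"], place["lat"], place["lon"])
        if place["parent"] == origin["parent"] or distance <= max_km:
            hits.append(pid)
    return hits
```

Combining the hierarchy test with a distance threshold is one possible design; a single representative coordinate per country is coarse, so the parent link often carries more of the meaning of "nearby".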

11
Use case: painting style
Find paintings of a similar style
MATISSE, Henri
Le bonheur de vivre (The Joy of Life)
1905-1906
Oil on canvas, 69 1/8 x 94 7/8 in. (175 x 241 cm)
Barnes Foundation, Merion, PA

12
How can we find this other Fauve
painting?
DERAIN, Andre
The Turning Road, L'Estaque, 1906
Oil on canvas, 51 x 76 3/4 in. (129.5 x 195 cm)
Museum of Fine Arts, Houston, Texas

13
Issues
▪ Parse annotation to find matches with thesaurus terms
– E.g. match artists to ULAN individuals
▪ Artist-style links
– AAT contains styles; ULAN contains artists, but there is no link
• Learn link from corpora
• Derive it from other annotations
– Domain-specific rules/reasoning needed
• See example in SWRL doc
• Painters may have painted in multiple styles
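One way to "derive it from other annotations" can be sketched as below, assuming some works already carry both an artist and a style annotation. The `ANNOTATIONS` list and both function names are hypothetical; `ex:OtherStyle` is a placeholder that only illustrates a painter working in more than one style.

```python
from collections import defaultdict

# Hypothetical (artist, style) pairs taken from already-annotated works.
ANNOTATIONS = [
    ("Matisse, Henri", "Fauve"),
    ("Derain, Andre", "Fauve"),
    ("Matisse, Henri", "ex:OtherStyle"),  # placeholder second style
]


def styles_by_artist(annotations):
    """Collect the set of styles observed for each artist."""
    styles = defaultdict(set)
    for artist, style in annotations:
        styles[artist].add(style)
    return dict(styles)


def artists_sharing_style(artist, annotations):
    """Other artists who share at least one observed style with the given artist."""
    styles = styles_by_artist(annotations)
    own = styles.get(artist, set())
    return sorted(a for a, s in styles.items() if a != artist and s & own)


print(artists_sharing_style("Matisse, Henri", ANNOTATIONS))  # ['Derain, Andre']
```

Keeping a set of styles per artist, rather than a single value, reflects the point that painters may have painted in multiple styles.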

16
Search: WordNet patterns that increase recall
without sacrificing precision (Hollink)
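The slide does not list the patterns themselves. As a generic illustration only, and not the patterns studied by Hollink, the sketch below expands a query term with its narrower terms up to a fixed depth. The `HYPONYMS` table is a toy stand-in, not actual WordNet data, and `expand_query` is a hypothetical name.

```python
from collections import deque

# Toy narrower-term table; a real system would read these links from WordNet.
HYPONYMS = {
    "chair": ["armchair", "rocking chair"],
    "armchair": ["wing chair"],
}


def expand_query(term, max_depth=1):
    """Return the term followed by its narrower terms, breadth-first, up to max_depth."""
    expanded = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not descend further than max_depth
        for child in HYPONYMS.get(current, []):
            if child not in expanded:
                expanded.append(child)
                queue.append((child, depth + 1))
    return expanded


print(expand_query("chair"))  # ['chair', 'armchair', 'rocking chair']
```

Limiting the depth is one simple way to trade recall against precision: deeper expansion finds more images but drifts further from the original query.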

17
Issues w.r.t. thesauri
▪ Public availability!
▪ RDF/OWL representation
▪ Learning/specifying term/concept mapping
– owl:equivalentClass, owl:sameAs,
rdf:type, rdfs:subClassOf
– Domain-specific links
▪ Managing the evolution of the thesauri and
the mappings
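As a rough sketch of what an equivalence mapping such as owl:sameAs implies for search, the code below groups identifiers that are linked directly or transitively. The identifiers are invented placeholders, `same_as_groups` is a hypothetical helper, and a real system would typically rely on an RDF store or reasoner rather than hand-written code.

```python
def same_as_groups(pairs):
    """Group identifiers connected by equivalence pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Placeholder mappings between terms from different vocabularies.
MAPPINGS = [
    ("wn:chair", "aat:chairs"),
    ("aat:chairs", "ex:stoel"),
    ("wn:table", "aat:tables"),
]

print(same_as_groups(MAPPINGS))
```

A query for any member of a group can then be answered with annotations that use any other member, which is the practical payoff of maintaining the mappings.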

18
Use case: find images with the
same subject
Find another painting which portrays dancing

19
Issues
▪ Same subjects can be visually very
different
▪ Subject is often missing from the
annotation
▪ Mismatch: users often search for subjects
of images

20
Conceptual subject descriptions
85% of the user queries:
General: Descriptions of generally known items. Only general,
everyday knowledge is necessary. Descriptions are at the
level of the natural categories of E. Rosch (1973), or more
general. E.g. an ape eating a banana.
Specific: Descriptions of objects or scenes that can be identified
and named. Specific domain knowledge is necessary to
recognize the objects or scenes. E.g. the old male gorilla
Kumba, born in Cameroon and now living in Artis, Amsterdam.
Abstract: Descriptions for which interpretative knowledge is
used. This category is subjective. E.g. an animal threatened
with extinction.

21
Example concepts in image
▪ Specific
– Fall of the Berlin Wall
▪ General
– People walking at night
▪ Abstract
– Fall of the Iron Curtain

22
Use of conceptual categories by
people searching for images
Conceptual level: 83%
[Bar chart. Horizontal axis: characteristics (event, time, place, relation, scene, object). Vertical axis: number of elements in % of conceptual elements, 0% to 100%. Series: Abstract, Specific, General.]

23
Thesauri for scenes: Iconclass

26
Annotation of image content
▪ Template for subject description
Agent Action Object Recipient
▪ Guidelines for manual annotation
– Annotate as specifically as possible
▪ Default reasoning
▪ CBIR support:
– Object identification
– Spatial relations
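The Agent/Action/Object/Recipient template can be pictured as a small record with optional slots. This is a sketch under assumptions: the `SubjectDescription` class and its `matches` method are hypothetical, and the example values paraphrase the "ape eating a banana" description used earlier in the deck.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SubjectDescription:
    agent: Optional[str] = None
    action: Optional[str] = None
    object: Optional[str] = None
    recipient: Optional[str] = None

    def matches(self, **query: str) -> bool:
        """True when every queried slot holds exactly the requested value."""
        slots = asdict(self)
        return all(slots.get(slot) == value for slot, value in query.items())


description = SubjectDescription(agent="ape", action="eat", object="banana")

print(description.matches(action="eat"))                  # True
print(description.matches(agent="ape", object="banana"))  # True
print(description.matches(recipient="child"))             # False
```

Leaving slots optional lets an annotator fill in only what is visible, while structured queries can still target individual slots.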

29
Some forms of image content are
well suited to image analysis
Collection of clothes
Abstract painting

30
The semantic gap
▪ The distance between Content-Based Image
Retrieval and semantics:
– Smeulders, Worring, Santini, Gupta, Jain. Content-based
image retrieval at the end of the early years.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 22(12), December 2000.
▪ Direct links between visual features and
semantic concepts become more difficult when
the domain is broader / more general

31
Example semantic bridge:
microscopic cell images
mpeg7 : StillRegion(region) ^
mpeg7x : Dense(region) ^
mpeg7 : DominantColor(region, col) ^
swrlb : lessThan(col, 100)
=> mpeg7 : Depicts(region, mesh : MatureGranule)
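Read procedurally, the rule above says that a dense still region whose dominant colour value is below 100 depicts a mature granule. The sketch below restates that in plain code. The `Region` class and `depicted_concepts` function are hypothetical, and representing the dominant colour as one integer is an assumption drawn from the rule's `lessThan(col, 100)` comparison.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    is_still_region: bool  # mpeg7:StillRegion(region)
    is_dense: bool         # mpeg7x:Dense(region)
    dominant_color: int    # mpeg7:DominantColor(region, col), assumed numeric


def depicted_concepts(region: Region, color_threshold: int = 100) -> List[str]:
    """Apply the single rule: all three conditions must hold to infer the concept."""
    if region.is_still_region and region.is_dense and region.dominant_color < color_threshold:
        return ["mesh:MatureGranule"]
    return []


print(depicted_concepts(Region(True, True, 80)))   # ['mesh:MatureGranule']
print(depicted_concepts(Region(True, True, 120)))  # []
```

A rule like this works because the domain is narrow; in a broader domain the same feature values would not single out one concept.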

32
Segmentation often requires
user interaction

33
Automatic detection of concepts can be
difficult even in “easy” cases
What is the color
of this ape?

34
Image analysis useful for
collection navigation

35
Bridging the semantic gap:
CBIR and ontologies
Visual WordNet (GE paper)
– Adding knowledge about visual characteristics
to WordNet: mobility, color, …
– Build detectors for the visual features
– Use visual data to prune the tree of categories
when analyzing a visual object

36
Sample visual features and their
mapping to WordNet

37
Experiment: pruning the search
for “conveyance” concepts
6 concepts found
Including taxi cab
12 concepts found
Including passenger train
and commuter train
Three visual features: material, motion, environment
Assumption is that these work perfectly
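The pruning idea can be sketched as a filter over candidate concepts, assuming each concept has been tagged with values for the three visual features. The concept names echo the slide, but the feature values in `CONCEPTS` are invented for illustration and the `prune` function is hypothetical. As on the slide, the detectors are assumed to be perfect.

```python
# Illustrative feature values; these are assumptions, not data from the experiment.
CONCEPTS = {
    "taxi cab": {"material": "metal", "motion": "wheeled", "environment": "road"},
    "passenger train": {"material": "metal", "motion": "wheeled", "environment": "rail"},
    "commuter train": {"material": "metal", "motion": "wheeled", "environment": "rail"},
    "rowing boat": {"material": "wood", "motion": "floating", "environment": "water"},
}


def prune(observed):
    """Keep the concepts whose features agree with every observed feature value."""
    return [
        name
        for name, features in CONCEPTS.items()
        if all(features.get(key) == value for key, value in observed.items())
    ]


print(prune({"environment": "rail"}))
# ['passenger train', 'commuter train']
print(prune({"material": "metal", "environment": "road"}))
# ['taxi cab']
```

Each extra detected feature narrows the candidate set, so even coarse visual detectors can reduce the number of categories a later classifier or a human annotator has to consider.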

38
Bridging the semantic gap:
concept detectors
▪ Snoek et al., TRECVID2004
– 185 hours of news video
▪ 32 detectors for concepts in news video
– Through machine learning
▪ Similarity detectors based on keywords
and visual analysis
▪ Query interface in which these functions
can be combined

39
“Concepts” for which visual
detectors were built

40
LSCOM lexicon: 229 – Weather
▪ Context-specific (i.e. news broadcast) interpretation:
“Weather forecast”

41
LSCOM lexicon: 110 – Female Anchor
▪ Composite concept
▪ Alignment needed for semantic search,
e.g. with WordNet

42
Natural-lang proc.
automatic annotation
text strings → concepts
Distributed
cultuurwijzer.nl collections
OAI-based access
Reasoning support
time/space reasoning
Web interface
support for web collections
Presentation facilities
semantic presentation
device-specific
Interoperability
XML/RDF/OWL
Scalability
> 10,000,000 triples
Ontologies
WordNet, AAT, TGN
ULAN, Dutch labels
Search strategies
sibling search
semantic distance
Dublin Core
specializations
dumb-down
semantic annotation
DIGITAL HERITAGE
COLLECTIONS
semantic search
BASELINE
ENHANCED FEATURES
NEW FEATURES

44
Main observation
▪ A combination of many different techniques
is needed to be able to cope with the
complexity of multimedia semantics
– NLP, segmentation, CBIR, visual feature
detectors, visual ontologies, publicly available
thesauri, thesauri mappings, dedicated
reasoning techniques (time, space, default),
personalization, presentation generation
▪ Key role for user studies
