An overview of Media Analytics outlining the evolution of image classification and knowledge extraction. The presentation offers insight into Big-Data Analytics for Media Management.
1. 25/07/14 1
Krishna Chandramouli,
Associate Professor,
Media Engineering and Analytics Research Group,
School of Information Technology and Engineering,
VIT University
krishna.c@vit.ac.in
Big-Data Analytics for Media
Management
3. Overview
Media and Internet
Information Access
Subjective vs Objective Indexing
The Semantic Gap
Evolving Strategies
Social Media Analysis
Indexing Large-scale Repositories
Future Research Directions
Take Away Message
Q & A
5. Media and Internet
As of March 2013, Flickr had a total of 87 million
registered members and more than 3.5 million new
images uploaded daily.
Facebook: "There are currently almost 90 billion
photos total on Facebook. This means we are, by far,
the largest photos site on the Internet."
7. Information Access
Traditional ordering of images is achieved
through categorization of information into
logical structures
Creation of albums
Categorizing through date/time
Clustering through location
Image based search engines are gaining
popularity with the increase in power of
indexing schemes
13. Subjective vs Objective Indexing
How can images be uniquely named to make
them distinguishable?
What names can be used to search images?
How many names are needed to make the
images unique?
Will all humans use the same names to
identify the images?
14. Subjective vs Objective Indexing
Humans are culturally influenced
Terms contain different meanings across
boundaries and cultures
Therefore, any tag/word assigned to an image
will be considered subjective
Objective signatures for images are generated
from the characteristics of the images
This need marked the beginning of MPEG-7
standardisation activities.
17. The Semantic Gap
The semantic gap characterizes the difference
between two descriptions of an object by
different linguistic representations, for
instance languages or symbols.
In computer science, the concept is relevant
whenever ordinary human activities,
observations, and tasks are transferred into a
computational representation
21. Evolving Strategies
The problem of image classification and
clustering has been the subject of active research
for the last decade, mainly owing to the
exponential growth of digital content.
The efficiency of clustering and classification
algorithms depends on the efficiency of
the underlying machine learning approaches.
To improve the performance of machine learning
algorithms, different optimisation techniques have
been employed, such as Genetic Algorithms.
22. Evolving Strategies
Recent developments in applied and heuristic
optimisation techniques have been strongly influenced
and inspired by natural and biological systems.
Algorithms developed from such observations include:
Ant Colony Optimisation (ACO) - based on the ability of
an ant colony, compared to an individual ant, to find the
shortest path between the nest and the food source.
Artificial Immune System (AIS) - typically exploits the
immune system's characteristics of learning and memory
to solve a problem.
Particle Swarm Optimisation (PSO) - inspired by the
social behaviour of a flock of birds.
23. Evolving Strategies
In the study of the "Semantic Gap", machine
learning algorithms are the building blocks
of the bottom-up approach.
Some of the applications of efficient machine
learning algorithms are:
Automatic Content Annotation
Knowledge Extraction
Content Retrieval
In the top-down approach, an ontology provides
a partial understanding of human semantics.
25. Particle Swarm Optimisation
In an effort to transform the social interaction of
different species into a computer simulation,
Kennedy and Eberhart developed an optimisation
technique named Particle Swarm Optimisation.
• In theory, the universal behaviour of individuals is
summarised in terms of Evaluate, Compare and
Imitate principles.
26. Particle Swarm Optimisation
Evaluate: The tendency to evaluate stimuli – to rate
them as positive or negative, attractive or repulsive is
perhaps the most ubiquitous behavioural characteristic
of living organisms.
Compare: In almost every aspect of life, humans tend to
compare themselves with others.
Imitate: Human imitation comprises taking the
perspective of the other person, not only imitating a
behaviour but also realising its purpose and executing
the behaviour when it is appropriate.
27. Particle Swarm Optimisation
$$v_{id}(t+1) = v_{id}(t) + c_1 \, (pbest_i(t) - x_{id}(t)) + c_2 \, (gbest_d(t) - x_{id}(t))$$
$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1)$$
where
$v_{id}(t)$ represents the velocity of the particle
$pbest_i(t)$ represents the personal best solution of particle $i$
$gbest_d(t)$ represents the global best solution for the swarm
$x_{id}(t)$ represents the position of the particle
$c_1, c_2$ are parameters governing the cognitive and social values
Equations governing the motion of particles in
PSO.
28. Particle Swarm Optimisation
Pseudo code for the algorithm
Step 1: Random Initialization of Particles
Step 2: Function Evaluation
Step 3: Computation of personal best and global
best
Step 4: Velocity update
Step 5: Position update
Step 6: Loop to step 2, until the stopping criteria
is reached
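The six steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the presented work: the function name `pso_minimise`, the parameter defaults, the velocity clamp, the random factors `r1`, `r2` and the inertia weight `w` are all assumptions taken from the commonly used PSO formulation. Setting `w = 1` and `r1 = r2 = 1` gives the velocity update as written on the preceding slide.

```python
import random


def pso_minimise(objective, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Minimise `objective` with a basic particle swarm (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    v_max = hi - lo  # assumed velocity clamp, not taken from the slides

    # Step 1: random initialisation of particles
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    # Step 6: loop until the stopping criterion (here a fixed iteration count)
    for _ in range(n_iters):
        for i in range(n_particles):
            # Step 2: function evaluation
            val = objective(x[i])
            # Step 3: personal best and global best
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
        for i in range(n_particles):
            for d in range(dim):
                # Step 4: velocity update (w, r1, r2 are assumed extras)
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Step 5: position update
                x[i][d] += v[i][d]
    return gbest, gbest_val
```

For example, `pso_minimise(lambda p: sum(t * t for t in p), dim=2)` searches for the minimum of a two-dimensional sphere function and returns the best position found together with its objective value.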
29. Visual Classification Framework
Self Organising Map
[Figure: SOM lattice. [X] - input feature vector; Class 1 nodes shown in red, untrained nodes in black]
Winner node selected based on L2 norm
$$m_i(t+1) = m_i(t) + h_{ci}(t) \, [x - m_i(t)]$$
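The SOM step on this slide (pick the winner node by L2 norm, then pull each node's weight vector towards the input by the neighbourhood factor) can be sketched as follows. The function name `som_step`, the one-dimensional lattice and the Gaussian neighbourhood are assumptions made for illustration; the slide does not specify the lattice shape or the form of the neighbourhood function.

```python
import math


def som_step(weights, x, learning_rate=0.5, sigma=1.0):
    """Apply one SOM update in place on a 1-D lattice; return the winner index."""
    def l2(m):
        return math.sqrt(sum((xi - mi) ** 2 for xi, mi in zip(x, m)))

    # Winner node c: the node whose weight vector is closest to x in L2 norm.
    c = min(range(len(weights)), key=lambda i: l2(weights[i]))

    for i, m in enumerate(weights):
        # Assumed Gaussian neighbourhood h_ci over the lattice distance |i - c|.
        h = learning_rate * math.exp(-((i - c) ** 2) / (2 * sigma ** 2))
        # m_i(t+1) = m_i(t) + h_ci(t) * (x - m_i(t))
        weights[i] = [mi + h * (xi - mi) for xi, mi in zip(x, m)]
    return c
```

Calling `som_step` repeatedly over the training feature vectors, while shrinking `learning_rate` and `sigma`, is the usual way such a map is trained.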
32. Chaos-Particle Swarm Classifier
The elementary principle of “Chaos” is introduced to
model the behaviour of particle motion.
The theoretical discussion on Chaotic – PSO includes
the notion of “wind speed” and “wind direction”
modelling the biological atmosphere for position
update of the particles.
33. Chaos-Particle Swarm Optimisation
The wind speed and therefore the position update
equation are presented by:
$$v_w(t+1) = v_w(t) + v_{op} \cdot \mathrm{rand}() + v_{su} \cdot \mathrm{rand}()$$
$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1) + v_w(t+1)$$
where
$v_w$ - wind speed
$v_{op} \cdot \mathrm{rand}()$ - opposing effect of atmosphere
$v_{su} \cdot \mathrm{rand}()$ - supporting effect of atmosphere
$v_{id}$ - velocity of particle
$x_{id}$ - position of particle
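The wind-speed and position updates can be written as two small functions. This is a sketch only: the function names are hypothetical, and the default values `v_op = -1.0` and `v_su = 1.0` are assumptions chosen so that the opposing term slows the particle and the supporting term speeds it up; the slide does not state their values.

```python
import random


def wind_update(v_w, rng, v_op=-1.0, v_su=1.0):
    """v_w(t+1) = v_w(t) + v_op * rand() + v_su * rand()."""
    return v_w + v_op * rng.random() + v_su * rng.random()


def position_update(x, v_next, v_w_next):
    """x_id(t+1) = x_id(t) + v_id(t+1) + v_w(t+1), applied per dimension d."""
    return [xd + vd + v_w_next for xd, vd in zip(x, v_next)]
```

A caller would first compute `v_w_next = wind_update(v_w, rng)` with `rng = random.Random(...)`, then pass it to `position_update` along with the particle's new velocity.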
37. Knowledge Assisted Analysis
Experimental Dataset
A set of 500 images, belonging to the general category of
vacation images, was assembled.
The content was mainly obtained from the Flickr online photo
management and sharing application and includes images
that depict cityscape, seaside, mountain and landscape
locations.
Every image was manually annotated, i.e. after the
segmentation algorithm was applied, a single concept was
associated with each resulting image segment
40. Knowledge Assisted Analysis
From the results it can be seen that the combined use
of PSO optimisation technique with SOM results in
better classification accuracy compared to using the
latter alone.
It can be noted that the performance of the PSO classifier
is better than that of the SVM and GA classifiers.
One reason is that SVMs need large amounts of training
data to discriminate accurately between image classes.
44. User Relevance Feedback
The database used in the experiment is generated
from the Corel dataset and consists of seven concepts,
namely building, cloud, car, elephant, grass, lion
and tiger
The test set has been modelled for seven concepts
with a variety of background elements and
overlapping concepts, hence making the test set
complex.
48. Multi-concept framework
• High-level queries
“A tiger resting in the forest and guarding his territory”
• Mid-level features (context independent)
“Tiger”, “Grass”, “Rock”, “Water”,……
49. Multi-concept framework
• Mid-level features:
In a constrained environment with a limited number of
mid-level features, the performance of the classification
algorithm has been found to be satisfactory
• High-level queries:
Open to subjective interpretation of the concepts and also
may involve more than one mid-level feature
Main objective:
• In this multi-concept framework, users are encouraged to
construct high level queries based on their preferences
51. Mid-level feature extraction
• SVM Classifier
• SVM Light toolbox was used to generate semantic
labels
• CLD+EHD
• Multi-feature classifier (MF)
• Employs a mixture of 7 visual features.
• The visual features are merged using Multi-Objective
Learning (MOL)
52. Query space formulation
• Pre-processing stage: mid-level feature concept
detection
• Query formulation: users construct a high-level
semantic information space
61. Experiment and Evaluation
User   | High-level query | Mid-level features                        | Score
User 1 | Landscape        | water, grass                              | 0.58
User 1 | Modern city      | building, cloud                           | 0.8
User 1 | Wild life        | lion, tiger, elephant                     | 0.59
User 1 | Rural garden     | flower, water, grass                      | 0.9
User 2 | Landscape        | water                                     | 0.23
User 2 | Modern city      | building                                  | 0.71
User 2 | Wild life        | lion, rock, grass, tiger, elephant        | 0.87
User 2 | Rural garden     | flower                                    | 0.28
User 3 | Landscape        | water, grass, cloud, car, elephant        | 0.59
User 3 | Modern city      | cloud, building, car                      | 0.91
User 3 | Wild life        | lion, tiger, grass, elephant, rock        | 0.82
User 3 | Rural garden     | flower, water, grass                      | 0.88
63. Social Media Analysis
Social media is the interaction among people
in which they create, share or exchange
information and ideas in virtual communities
and networks.
Andreas Kaplan and Michael Haenlein define
social media as "a group of Internet-based
applications that build on the ideological and
technological foundations of Web 2.0".
64. Social Media Analysis
Social media allows for the creation and
exchange of user-generated content.
Social media differ from traditional or
industrial media in many ways, including
quality, reach, frequency, usability,
immediacy, and permanence.
65. Textual and Visual Analysis
• Images are often accompanied by free-text
annotations, which can be used as
complementary information for content-based
classification
• The challenge is to extract entities from text
and classify them into an arbitrary set of
classes
Example annotations: "Plansarsko lake"; "Shepherd in Bucegi National Park"
69. Use-case scenario
[Figure: image annotated "Church of our Lady Mercy in Buje", labelled Building. Diagram labels: EXIF, binary mask, PSO, largest region size]
70. Semantic Concept Mapping
Map word to WordNet concept
1. noun phrase
2. head noun
3. hypernym for noun phrase (with THD)
4. hypernym for head noun (with THD)
Compute similarity with each of the classes
Experiments carried out with the Lin similarity measure
$$sim_L(c_1, c_2) = \frac{2 \cdot \log p(lso(c_1, c_2))}{\log p(c_1) + \log p(c_2)}$$
where $lso(c_1, c_2)$ is the lowest super-ordinate, i.e. the most specific concept that subsumes both $c_1$ and $c_2$
The probability $p(c)$ of encountering concept $c$
is usually estimated from a large corpus
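As a worked illustration of the Lin measure, the sketch below computes 2 * log p(lso(c1, c2)) / (log p(c1) + log p(c2)) over a tiny hand-made taxonomy. The taxonomy, the concept names and the probabilities are invented for the example and are not taken from WordNet or from any corpus; a real system would use WordNet synsets and corpus-derived counts.

```python
import math

# Hypothetical toy taxonomy (child -> parent) with made-up concept probabilities.
PARENT = {"entity": None, "animal": "entity", "building": "entity",
          "tiger": "animal", "lion": "animal", "church": "building"}
P = {"entity": 1.0, "animal": 0.4, "building": 0.3,
     "tiger": 0.05, "lion": 0.05, "church": 0.1}


def ancestors(c):
    """Return c followed by its ancestors, most specific first."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain


def lso(c1, c2):
    """Lowest super-ordinate: the most specific concept subsuming both."""
    seen = set(ancestors(c1))
    return next(a for a in ancestors(c2) if a in seen)


def lin_similarity(c1, c2):
    """Lin similarity: 2 * log p(lso) / (log p(c1) + log p(c2))."""
    denom = math.log(P[c1]) + math.log(P[c2])
    if denom == 0:  # both concepts are the root, which has probability 1
        return 1.0
    return 2 * math.log(P[lso(c1, c2)]) / denom
```

With these toy numbers, `lin_similarity("tiger", "lion")` is about 0.31 because the two concepts share the fairly general parent "animal", while `lin_similarity("tiger", "church")` is 0 because their only common ancestor is the root.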
71. Classifier Fusion
Content-based analysis (KAA) is restricted to classes
for which the classifier has been trained
For text-based analysis (SCM/THD), the classes have
to be exhaustive - all entities are classified
Mapping from SCM/THD to KAA
Perform intersection between the individual
classifier results
Select the concept occupying the largest area
on the image
[Figure: Image Class. (KAA) and Text Class. blocks]
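One possible reading of the fusion rule (intersect the two classifiers' results, then keep the concept that covers the largest image area) is sketched below. The function name `fuse`, the input shapes and the choice to sum the areas of all segments sharing a concept are assumptions; the slide does not say how segment areas are aggregated.

```python
def fuse(segment_labels, text_classes):
    """Fuse image and text classifier outputs.

    segment_labels: list of (concept, area_in_pixels) pairs from the image classifier.
    text_classes: set of concepts produced by the text classifier.
    Returns the shared concept with the largest total area, or None if none is shared.
    """
    area = {}
    for concept, a in segment_labels:
        area[concept] = area.get(concept, 0) + a

    # Intersection between the individual classifier results.
    common = set(area) & set(text_classes)
    if not common:
        return None

    # Concept occupying the largest area on the image.
    return max(common, key=lambda c: area[c])
```

For instance, if the image classifier reports building, sky and grass segments and the text classifier reports only building and sky, grass is discarded even if it covers the largest area.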
74. Indexing Large-scale Repositories
The textual analysis block aims to generate a
list of named entities extracted from the
textual metadata associated with the input
video
The pre-processing framework classifies the
tags into two general categories
common-tags
named entities
75. Indexing Large-scale Repositories
Common tags correspond to an action or a
country, or are associated with a synset in WordNet
Named-entity tags do not have a WordNet
synset and thus depend on external resources
to contextualise them
The objective of the pre-processing module is
to ensure the named entities are
disambiguated to enable a semantic similarity
search
76. Indexing Large-scale Repositories
Bag of Articles Classifier
The input of a BOA classifier is a set of labelled
instances and a set of unlabelled instances (noun
chunks).
Wikipedia article titles provide an unambiguous
mapping between the labelled instance and a
Wikipedia article
Each article is described by its type (article, page,
disambiguation page, category page and so forth)
77. Indexing Large-scale Repositories
A BOA classifier requires a Wikipedia index
containing the following information about each
article
term vectors with term frequencies
out links and
popularity ranking (for most frequent sense relevance
ranking)
For geo-tagging adaptation, the textual analysis
block searches for geographical named entities in
the queried Wikipedia articles
The location details are extracted with the help of
DBpedia using its SPARQL endpoint
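A location lookup of this kind could be expressed as a SPARQL query against the public DBpedia endpoint. The sketch below only builds the query and the request URL and does not send it; the helper names are hypothetical, and the use of the W3C WGS84 `geo:lat` / `geo:long` properties is an assumption about which properties hold the coordinates, not a description of the presented system.

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint


def coordinates_query(resource_name):
    """Build a SPARQL query for the latitude/longitude of a DBpedia resource.

    resource_name is the last part of the resource URI, e.g. "Buje"
    (spaces in article titles are written as underscores).
    """
    return (
        "PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>\n"
        "SELECT ?lat ?long WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource_name}> "
        "geo:lat ?lat ; geo:long ?long .\n"
        "}"
    )


def request_url(resource_name):
    """Return a GET URL asking the endpoint for JSON results."""
    params = {"query": coordinates_query(resource_name),
              "format": "application/sparql-results+json"}
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)
```

The resulting URL can be fetched with any HTTP client; the JSON response lists the `lat` and `long` bindings when the resource has coordinates.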
80. VIT@MediaEval 2013
Geographical coordinates are an important component and
indicator of where an event has happened.
The event clusters are finalised through the weighted occurrence of
tags among the distribution of media annotations
81. VIT@MediaEval 2013
The system computes the similarity between
the synset representing each tag and each of the
categories.
We use the Lin similarity measure to evaluate the
semantic distance between the synset and the
category.
83. VIT@MediaEval 2013
The globe is divided into grids with a maximum
of 10,000 images per grid: starting from an
initial grid that spans the entire globe,
grids are recursively subdivided into smaller
ones once the threshold is reached.
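The recursive subdivision can be sketched as a quadtree over latitude and longitude. The four-way split at the cell midpoint, the function name `subdivide` and the `min_size` guard (which stops the recursion when many images share identical coordinates) are assumptions; the slide only states the 10,000-image threshold and the recursive refinement from a single global grid.

```python
def subdivide(points, bounds=(-90.0, 90.0, -180.0, 180.0),
              max_per_cell=10_000, min_size=1e-6):
    """Recursively split a lat/lon cell until each leaf holds <= max_per_cell points.

    points: list of (lat, lon) pairs.
    bounds: (lat_min, lat_max, lon_min, lon_max) of the current cell.
    Returns a list of (bounds, points) pairs for the non-empty leaf cells.
    """
    lat0, lat1, lon0, lon1 = bounds
    # Stop when the cell is under the threshold or too small to split further.
    if len(points) <= max_per_cell or (lat1 - lat0) < min_size:
        return [(bounds, points)]

    lat_mid, lon_mid = (lat0 + lat1) / 2, (lon0 + lon1) / 2
    quadrants = {}
    for lat, lon in points:
        key = (lat >= lat_mid, lon >= lon_mid)  # (north half?, east half?)
        quadrants.setdefault(key, []).append((lat, lon))

    cells = []
    for (north, east), pts in quadrants.items():
        sub = (lat_mid if north else lat0, lat1 if north else lat_mid,
               lon_mid if east else lon0, lon1 if east else lon_mid)
        cells.extend(subdivide(pts, sub, max_per_cell, min_size))
    return cells
```

Dense regions therefore end up covered by many small cells while sparse regions stay in a few large ones.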
86. Future Research Directions
MediaEval is a multimedia benchmarking
initiative that offers tasks and datasets to the
research community that emphasize the
human and social aspects of multimedia.
In 2014, MediaEval is offering eight classic
tasks and three Brave New Tasks.
http://www.multimediaeval.org/mediaeval2014/
87. Future Research Directions
ImageCLEF 2014
ImageCLEF organizes four main tasks to
benchmark the challenging task of image
annotation for a wide range of source images and
annotation objectives: general multi-domain
images for object or concept detection, as
well as domain-specific tasks such as visual-depth
images for robot vision and volumetric
medical images for automated structured
reporting.
88. Future Research Directions
The tasks address different aspects of the annotation
problem and are aimed at supporting and promoting
the cutting-edge research addressing the key
challenges in the field, such as multi-modal image
annotation, domain adaptation and ontology driven
image annotation.
http://www.imageclef.org/2014