The MIRROR project received EU funding to develop a method for retrieving migration-related video content using Migration-Related Semantic Concepts (MRSCs). MRSCs were defined based on migration theories and expert input, and organized into a two-level hierarchy of 106 concepts across 5 categories. Videos were analyzed using an attention-based dual encoding network, trained on video-caption pairs, that retrieves shots related to an input MRSC without concept-specific training examples. The method achieved competitive results on the TRECVID SIN evaluation datasets; future work will focus on automatic MRSC augmentation and improved representations.
Migration-related video retrieval
1. The MIRROR project has received funding from the European Union’s Horizon 2020 research and innovation action program under grant agreement № 832921.
Migration-Related Semantic Concepts for the Retrieval of Relevant Video Content
Erick Elejalde, Damianos Galanopoulos, Claudia Niederee, Vasileios Mezaris
Int. Workshop on Artificial Intelligence and Robotics for Law Enforcement Agencies (AIRLEAs) @ 3rd Int. Conf. on Intelligent Technologies and Applications (INTAP 2020), Gjovik, Norway, Sept. 2020.
2. 2
Introduction
Problem
● Migration is a complex process and a critical issue
● A plethora of factors lead to migration decisions
● Social media may be used to manipulate perception and lead to misperceptions
Solutions
● A better understanding of these decisions is critical
● Automatic analysis of migration-related media items
● A novel approach to bridge the gap between the factors that drive migration and their expression in video
3. 3
Approach
Combination of top-down and bottom-up approaches
● Top-down approach
○ Theoretical understanding of migration factors and decisions
○ Domain conceptualization
○ A set of Migration-Related Semantic Concepts (MRSCs) is defined
● Bottom-up approach
○ Visual content interpretation and analysis
○ Video analysis for retrieving related video or images
○ How are MRSCs expressed in videos and images?
4. 4
Migration-Related Semantic Concepts (MRSCs)
What are MRSCs?
● Semantic concepts relevant in the context of migration
How are MRSCs defined?
● In-depth study of migration theories
● Discussions with domain experts
● The collection of semantic concepts expresses the different aspects of migration
● Based on three popular theoretical approaches
○ Neo-classical economic equilibrium
○ Historical-structural approach
○ Migration systems theory
5. 5
MRSC - Migration Theories
Neo-classical economic equilibrium
● Focuses on imbalance conditions between origin country and destination
● People try to maximize their benefits, taking any constraints into consideration
● Criticism: ignores the historical antecedents of movements and the role of the state
6. 6
MRSC - Migration Theories
Historical-structural approach
● Based on the Marxist view of political economy
● Stresses the unequal distribution of economic resources
● Capital recruits cheap labor on a global scale
● Migration helps maintain uneven economic development
● Criticism: no attention to personal motivations
7. 7
MRSC - Migration Theories
Migration systems theory
● A response to the criticism of the other theories
● More holistic analysis of the migration factors
● The migration process is the result of interacting macro-, meso- and
microstructures
8. 8
MRSC - Factors Classification
● Semantic concepts are combined to form meaningful templates
○ For example, “family” and “war” combined as “Families in war”
● Based on such patterns, a hierarchical structure is constructed
● 106 MRSCs are grouped into five categories
○ Economic
○ Social
○ Demographic
○ Environmental
○ Political
● 20 MRSCs on the first level and 86 on the second level beneath them
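The two-level structure described above could be represented roughly as follows. This is only a sketch: the class name MRSC, the helper count_concepts, the first-level concept "War" and the placement of "Families in war" under the Political category are assumptions made for illustration, since the slides do not list the actual hierarchy.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# The five categories named on the slide.
CATEGORIES = ("Economic", "Social", "Demographic", "Environmental", "Political")


@dataclass
class MRSC:
    """One concept in a two-level hierarchy (hypothetical representation)."""
    name: str
    category: str
    children: List["MRSC"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject concepts that fall outside the five categories.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def count_concepts(first_level: List[MRSC]) -> Tuple[int, int, int]:
    """Return (first-level count, second-level count, total)."""
    top = len(first_level)
    sub = sum(len(concept.children) for concept in first_level)
    return top, sub, top + sub


# Illustrative fragment only: the parent concept and the category
# assignment are assumptions, not taken from the actual MRSC list.
war = MRSC("War", "Political", children=[MRSC("Families in war", "Political")])
print(count_concepts([war]))  # (1, 1, 2)
```

Applied to the complete hierarchy, count_concepts would be expected to return (20, 86, 106), matching the counts given on the slide.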
9. 9
MRSC - Factor Classification
106 MRSCs organized into five categories and two levels
10. 10
MRSC-based Video Retrieval
● Ad-Hoc Video Search (AVS) is a similar cross-modal retrieval problem
● Based on a state-of-the-art method for the AVS problem
● Video shots and free text are encoded into a joint feature space
● Retrieve the most relevant video shots by inputting an MRSC
● Attention-based dual encoding network
● Trained with video-caption pairs
● Each MRSC is augmented with a small set of complex sentences
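Retrieval in a joint feature space can be illustrated with a small sketch. It assumes that both encoders have already produced vectors of the same dimension, that similarity is measured with the cosine, and that the scores of the augmented sentences for one MRSC are fused by taking the maximum. The slides do not state the similarity function or the fusion rule, so both are assumptions, and the 3-dimensional vectors are toy values rather than real encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_shots(query_vectors, shot_vectors, top_k=3):
    """Rank shots by their best similarity to any of the query vectors.

    query_vectors: embeddings of the MRSC and its augmented sentences.
    shot_vectors:  mapping from shot id to shot embedding.
    """
    scores = {
        shot_id: max(cosine(q, vec) for q in query_vectors)  # assumed max fusion
        for shot_id, vec in shot_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy embeddings standing in for the outputs of the two encoders.
shots = {
    "shot_a": [0.9, 0.1, 0.0],
    "shot_b": [0.0, 1.0, 0.0],
    "shot_c": [0.1, 0.1, 0.9],
}
queries = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
print(rank_shots(queries, shots, top_k=2))  # ['shot_a', 'shot_b']
```

Because ranking only needs the text side to be embedded at query time, a new MRSC can be searched without collecting positive video examples for it.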
13. 13
Experiments and Results
Experimental set-up
● Training datasets
○ TGIF & MSR-VTT
● Keyframe representation
○ ResNet-152 trained on ImageNet-11K
● Word embeddings
○ Word2Vec
○ BERT
● Evaluation datasets
○ TRECVID SIN 2013 & 2015
● Evaluation metric
○ Mean extended inferred average precision (MXinfAP)
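As background for the metric: extended inferred average precision estimates average precision when only a sample of the shots has been judged. The sketch below is a simplification that computes plain average precision over complete judgments and its mean over concepts; it is not the XinfAP estimator itself. The function names and shot ids are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Plain average precision for one concept with complete judgments."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, shot_id in enumerate(ranked_ids, start=1):
        if shot_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of per-concept AP; runs is a list of (ranked_ids, relevant_ids)."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["s1", "s2", "s3", "s4"], {"s1", "s3"}),  # AP = (1/1 + 2/3) / 2
    (["s2", "s1"], {"s1"}),                    # AP = 1/2
]
print(round(mean_average_precision(runs), 4))  # 0.6667
```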
14. 14
Experiments and Results
Why TRECVID SIN as an evaluation dataset?
● Our problem is highly specific
● No domain-specific video retrieval datasets are available
● The SIN task is very similar to ours
● Abstract concepts have to be correlated with video shots
● Well-known state-of-the-art methods are available for comparison
16. 16
Experiments and Results
SIN concept augmentation improvements
● “Telephones” is described as “speaking on a telephone” and “talking on a
telephone”
○ SIN’13: XinfAP improved from 0.0 to 0.3151
○ SIN’15: XinfAP improved from 0.0 to 0.308
● “Bicycling” is described as “a man riding a bike”, “people riding bicycles”
and “a woman on a bike”
○ SIN’15: XinfAP improved from 0.0569 to 0.3730
17. 17
Experiments and Results
● Conventional concept retrieval methods are used as the baseline
● Baselines use predefined sets of visual concepts and positive exemplars for
every concept
● Competitive results even in the absence of training exemplars
19. 19
Conclusion and Future work
● A novel approach for understanding migration and migration decisions
● Theoretically defined MRSCs provide
○ A better view of the migration topic
○ A common language for analysis
● Video analysis to bridge the gap between MRSCs and video
Future work
● Fully automatic pipeline for MRSC augmentation
● Better encoding and improved visual and text representations
● Domain-specific dataset