Semantics at the multimedia fragment level or how enabling the remixing of online media
1. Semantics at the multimedia fragment level or how enabling the remixing of online media
Raphaël Troncy <raphael.troncy@eurecom.fr>
2. Once upon a time …
09/10/2012 - Semantics at the multimedia fragment level - tele-Task Symposium, Potsdam, Octobre 2012 -2
3. … leading to sharing Media Fragments
Publishing status message containing a Media Fragment URI
Use a ‘#’ !
Highlight a video sequence
Highlight a region to pay attention to
4. What are Media Fragments?
0 20 temporal media fragment 35 t
spatial media fragment
track media fragment
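The three dimensions above can be read straight from the part of the URI after the '#'. A minimal sketch, assuming the W3C Media Fragments URI syntax (t=start,end in seconds, xywh=x,y,w,h with an optional pixel: or percent: prefix, track=name); the function name and the example.org URIs are illustrative only, and clock-time formats such as hh:mm:ss are not handled:

```python
from urllib.parse import urlparse, parse_qsl

def parse_media_fragment(uri):
    """Split the '#' part of a URI into temporal, spatial and track dimensions."""
    fragment = urlparse(uri).fragment
    result = {}
    for name, value in parse_qsl(fragment):
        if name == "t":
            # Temporal dimension: "start,end" in seconds; either bound may be omitted.
            if value.startswith("npt:"):
                value = value[len("npt:"):]
            start, _, end = value.partition(",")
            result["t"] = (float(start) if start else 0.0,
                           float(end) if end else None)
        elif name == "xywh":
            # Spatial dimension: optional unit prefix, then x,y,width,height.
            unit, _, coords = value.rpartition(":")
            x, y, w, h = (int(c) for c in coords.split(","))
            result["xywh"] = {"unit": unit or "pixel", "x": x, "y": y, "w": w, "h": h}
        elif name == "track":
            # Track dimension: a fragment may name several tracks.
            result.setdefault("track", []).append(value)
    return result

temporal = parse_media_fragment("http://example.org/video.webm#t=20,35")
spatial = parse_media_fragment("http://example.org/video.webm#xywh=percent:25,25,50,50")
```

Here `temporal` describes the 20 s to 35 s interval shown on the timeline, and `spatial` a rectangular region expressed in percent of the frame.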
5. Media Fragments (temporal)
Original resource length
Fragment beginning, playback progress, fragment end
6. Media Fragments (spatial) + Demo
highlighted fragment
semi-opaque overlay
7. Media Fragments URIs
Bookmark / Share parts (fragments) of audio/video content
Annotate media fragments
Search for media fragments
Mash-ups
Conserve bandwidth
http://www.w3.org/TR/media-frags-reqs/
http://www.w3.org/TR/media-frags/
8. Video annotation
9. Video interactivity
CONCEPT IN PLAYER
Cubism, Fauvism, Expressionism
FACETS / PROPERTIES OF CONCEPT
CONTENT ENRICHMENT
10. LinkedTV EU Project
Vision
Ubiquitously online cloud of Networked Audio-Visual Content
Decoupled from place, device or source
Aim
provide interactive multimedia service for non-professional end-users
focus television broadcast content as seed videos
12 Excellent Partners
Fraunhofer, Eurecom, STI GMBH, Condat, CERTH, BEELD EN GELUID, UEP, Noterik, UMONS, U. ST GALLEN, CWI, RBB
Web: http://www.linkedtv.eu
11. Video Accessibility
What is required to make video accessible on the Web?
Technologies:
Annotating: automatic (speech transcription) and manual (social collaborative annotation tool)
Addressing: pointing to, retrieving, transmitting only parts of media
Rendering: video visualization for the impaired, Braille output
Benchmarking: Sphinx, HTK, Julius
12. Speech Processing
15. Semantic indexing at the fragment level
Benchmarking: Sphinx, HTK, Julius
NER + full text index with the transcription
Interlinking with the Linked Data Cloud to enable semantic search
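Indexing at the fragment level can be sketched as an inverted index whose postings are temporal Media Fragment URIs rather than whole videos. This is only an illustration: the function name, the example.org URI and the (start, end, text) segment format are assumptions, not the pipeline used in the talk.

```python
import re
from collections import defaultdict

def index_transcript(video_uri, segments):
    """Map each transcript term to the temporal fragments in which it is spoken.

    segments: iterable of (start_seconds, end_seconds, text) tuples,
    e.g. as produced by a speech transcription step (assumed format).
    """
    index = defaultdict(list)
    for start, end, text in segments:
        fragment_uri = f"{video_uri}#t={start},{end}"
        # Index each distinct lower-cased word of the segment once.
        for term in set(re.findall(r"\w+", text.lower())):
            index[term].append(fragment_uri)
    return index

segments = [
    (0, 12, "Welcome to the symposium in Potsdam"),
    (12, 30, "Media fragments address parts of a video"),
]
index = index_transcript("http://example.org/talk.webm", segments)
```

A query for "potsdam" then returns a URI that points at the first 12 seconds only, which a player can use to jump directly to the relevant passage.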
16. NERD: Named Entity Recognition and Disambiguation
Compare performances of Named Entity Recognition tools
Understand strengths and weaknesses of different Web APIs
Adapt NER processing to different context
(Learn how to) Combine NER tools
Participate in the ANR ETAPE benchmark
17. What is a Named Entity recognition task?
A task that aims to locate and classify the name of a person or an organization, a location, a brand, a product, a numeric expression including time, date, money and percent in a textual document
18. NER Tools and Web APIs
Standalone software
GATE
Stanford CoreNLP
Temis
Web APIs
http://nerd.eurecom.fr/
19. What is NERD?
ontology [1], REST API [2], UI [3]
The NERD ontology has been integrated in the NIF project, an EU FP7 project in the context of LOD2: Creating Knowledge out of Interlinked Data
[1] http://nerd.eurecom.fr/ontology
[2] http://nerd.eurecom.fr/api/application.wadl
[3] http://nerd.eurecom.fr
20. Factual comparison of 10 Web NER tools
Tool | Language | Granularity | Entity position | Classification schema | Number of classes | Response format | Quota (calls/day)
AlchemyAPI | EN, FR, GR, IT, PT, RU, SP, SW | OEN | N/A | Alchemy | 324 | JSON, MicroFormat, XML, RDF | 30000
DBpedia Spotlight | EN, GR*, PT*, SP* | OEN | char offset | DBpedia, FreeBase, Schema.org | 320 | HTML, JSON, RDF, XML | unl
Evri | EN, IT | OED | N/A | Evri | 5 | HTML, JSON, RDF | 300
Extractiv | EN | OEN | word offset | DBpedia | 34 | HTML, JSON, RDF, XML | 3000
Lupedia | EN, FR, IT | OEN | range of chars | DBpedia, LinkedMDB | 319 | HTML, JSON, RDFa, XML | unl
OpenCalais | EN, FR, SP | OEN | char offset | OpenCalais | 95 | JSON, MicroFormat | 50000
Saplo | EN, SW | OED | N/A | N/A | 5 | JSON | 1333
Wikimeta | EN, FR, SP | OEN | POS offset | ESTER | 7 | JSON, XML | unl
Yahoo! | EN | OEN | range of chars | Yahoo | 13 | JSON, XML | 5000
Zemanta | EN | OED | N/A | FreeBase | 81 | XML, JSON, RDF | 10000
21. NERD Ontology
Aligned the taxonomies used by the extractors
22. Building the NERD Ontology
NERD type | Occurrence
Person | 10
Organization | 10
Country | 6
Company | 6
Location | 6
Continent | 5
City | 5
RadioStation | 5
Album | 5
Product | 5
... | ...
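One way to arrive at such a table, assuming "Occurrence" means the number of extractor taxonomies in which a type appears after alignment, is to count each aligned type once per extractor. The extractor names and mappings below are hypothetical placeholders, not the real alignments.

```python
from collections import Counter

def type_occurrences(alignments):
    """Count, for each aligned type, how many extractor taxonomies contain it.

    alignments: dict mapping an extractor name to a dict
    {extractor-specific type: aligned NERD type} (hypothetical data).
    """
    counts = Counter()
    for mapping in alignments.values():
        # A type counts once per extractor, however many native types map to it.
        counts.update(set(mapping.values()))
    return counts

alignments = {
    "extractorA": {"Person": "Person", "Company": "Organization", "Corp": "Organization"},
    "extractorB": {"PER": "Person", "LOC": "Location"},
    "extractorC": {"people": "Person", "org": "Organization"},
}
occurrences = type_occurrences(alignments)
```

With this toy input, Person occurs in three taxonomies, Organization in two and Location in one.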
23. NERD REST API
Resources: /document, /user, /annotation/{extractor}, /extraction, /evaluation, ...
Methods: GET, POST, PUT, DELETE
Formats: RDF, JSON
"entities" : [{
  "entity": "Tim Berners-Lee",
  "type": "Person",
  "uri": "http://dbpedia.org/resource/Tim_berners_lee",
  "nerdType": "http://nerd.eurecom.fr/ontology#Person",
  "startChar": 30,
  "endChar": 45,
  "confidence": 1,
  "relevance": 0.5
}]
Rizzo G., Troncy R. (2012), NERD: A Framework for Unifying Named Entity Recognition and Disambiguation Web Extraction Tools. In: European chapter of the Association for Computational Linguistics (EACL'12), Avignon, France.
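A response of this shape can be consumed with a few lines of client code. The sketch below only parses a JSON string; it does not call the NERD service. The enclosing braces around "entities", the function name and the min_confidence parameter are assumptions added for illustration.

```python
import json

# Sample payload modelled on the slide; the outer object is an assumption.
SAMPLE = """
{"entities": [{
    "entity": "Tim Berners-Lee",
    "type": "Person",
    "uri": "http://dbpedia.org/resource/Tim_berners_lee",
    "nerdType": "http://nerd.eurecom.fr/ontology#Person",
    "startChar": 30,
    "endChar": 45,
    "confidence": 1,
    "relevance": 0.5
}]}
"""

def summarize_entities(payload, min_confidence=0.0):
    """Return (entity, NERD class, startChar, endChar) for each extracted entity."""
    data = json.loads(payload)
    rows = []
    for item in data.get("entities", []):
        if item.get("confidence", 0) < min_confidence:
            continue
        # Keep only the local name of the NERD ontology class URI.
        nerd_class = item["nerdType"].rsplit("#", 1)[-1]
        rows.append((item["entity"], nerd_class, item["startChar"], item["endChar"]))
    return rows

rows = summarize_entities(SAMPLE)
```

The character offsets let a client map each entity back to its position in the analysed text, and from there to the media fragment the text was transcribed from.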
24. NERD meets NIF
Model documents through a set of strings dereferenceable on the Web
:offset_23107_23110 a str:String ;
    str:referenceContext :offset_0_26546 .
Map string to entity
:offset_23107_23110 sso:oen dbpedia:W3C .
Classification
dbpedia:W3C rdf:type nerd:Organization .
Rizzo G., Troncy R., Hellmann S. and Bruemmer M. (2012), NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud. In: (LDOW'12) Linked Data on the Web (WWW'12), Lyon, France.
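The three steps (string, string-to-entity link, classification) follow a fixed pattern, so the statements can be generated from an extraction result. A minimal sketch that emits plain Turtle-like lines with the prefixes used on the slide; the function name and its parameters are assumptions, and no RDF library is involved.

```python
def nif_statements(doc_length, start, end, entity, nerd_type):
    """Build offset-based statements for one entity mention.

    doc_length: length of the whole document in characters.
    start, end: character offsets of the mention.
    entity, nerd_type: prefixed names, e.g. "dbpedia:W3C", "nerd:Organization".
    """
    string_id = f":offset_{start}_{end}"
    context_id = f":offset_0_{doc_length}"
    return [
        # 1. The mention is a string anchored in the document context.
        f"{string_id} a str:String ;",
        f"    str:referenceContext {context_id} .",
        # 2. The string is linked to the entity it denotes.
        f"{string_id} sso:oen {entity} .",
        # 3. The entity is classified with a NERD ontology type.
        f"{entity} rdf:type {nerd_type} .",
    ]

statements = nif_statements(26546, 23107, 23110, "dbpedia:W3C", "nerd:Organization")
```

Called with the values from the slide, this yields the same four lines as shown above.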
25. NERD User Interface
26. NERD Dashboard
27. History of NER benchmarks
CoNLL 2003 and CoNLL 2005
schema (4 types): person, organization, location and miscellaneous
language independent task
ACE 2004, ACE 2005 and ACE 2007
schema (7 types): person, organization, location, facility, weapon,
vehicle and geo-political entity
entity recognition, not just name (e.g. description, pronoun)
find relationships among entities extracted
TAC 2009 (Knowledge Base Track)
schema (3 types): person, organization and location
create a knowledge base from the named entities extracted
ETAPE 2012 (Named Entity Task)
schema: Quaero (7 main types, 32 sub-types)
28. ETAPE 2012 challenge
genre | train | dev | test | sources
TV news | 7h 40m | 1h 40m | 1h 40m | BFM Story, Top Questions (LCP)
TV debates | 10h 30m | 5h 10m | 5h 10m | Pile et Face, Ca vous regarde, Entre les lignes (LCP)
TV amusements | - | 1h 05m | 1h 05m | La place du village (TV8)

Item | Train | Dev | Eval
Item length | 26h | 10h 55m | 10h 55m
Nb files | 44 | 15 | 15
Nb words | 290517 | 91656 | 115511
Nb Named Entities | 46763 | 14398 | 13055
Nb unique categories | 33 | 33 | 33
29. Participation at ETAPE (combined strategy)
extraction, cleaning, fusion
Extractor outputs: (eA1,tA1,URIA1,siA1,eiA1), (eA2,tA2,URIA2,siA2,eiA2), (eA3,tA3,URIA3,siA3,eiA3), ...
Fused output: (eN1,tN1,URIN1,siN1,eiN1), (eN2,tN2,URIN2,siN2,eiN2)
When at least 2 extractors classify the same entity with a different type then we apply a preferred selection order (empirically defined): Wikimeta, AlchemyAPI, OpenCalais, Lupedia
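The fusion rule can be sketched as follows. This is an illustration under stated assumptions, not the system submitted to ETAPE: two annotations are taken to denote "the same entity" when they cover the same character span, and the highest-ranked extractor wins for every shared span, which gives the same type as the slide's rule whenever the extractors disagree.

```python
# Preferred selection order from the slide, highest priority first.
PREFERRED_ORDER = ["Wikimeta", "AlchemyAPI", "OpenCalais", "Lupedia"]

def fuse(extractions):
    """Merge per-extractor annotations into one list.

    extractions: dict mapping an extractor name to a list of
    (entity, type, uri, start, end) tuples.
    """
    rank = {name: i for i, name in enumerate(PREFERRED_ORDER)}
    best = {}  # (start, end) -> (rank of the winning extractor, annotation)
    for extractor, annotations in extractions.items():
        r = rank.get(extractor, len(PREFERRED_ORDER))  # unknown extractors rank last
        for annotation in annotations:
            span = (annotation[3], annotation[4])
            if span not in best or r < best[span][0]:
                best[span] = (r, annotation)
    # Return the winners ordered by position in the text.
    return [best[span][1] for span in sorted(best)]

fused = fuse({
    "Lupedia": [("Paris", "City", "uri:lupedia/Paris", 10, 15)],
    "Wikimeta": [("Paris", "Location", "uri:wikimeta/Paris", 10, 15)],
    "OpenCalais": [("LCP", "Organization", "uri:calais/LCP", 40, 43)],
})
```

In this toy input (types and URIs are made up), both Lupedia and Wikimeta annotate the span 10 to 15 with different types, so the Wikimeta annotation is kept; the span seen by only one extractor passes through unchanged.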
30. Participation at ETAPE (combined+ strategy)
ETAPE Train & Dev: learned model, POS tagger
Created static rules, apply rules
Extractor outputs: (eA1,tA1,URIA1,siA1,eA1), (eA2,tA2,URIA2,siA2,eiA2), ...
Own output: (e1,t1,URI1,si1,ei1)
fusion
Conflicts handled by priority selection: own, Wikimeta, AlchemyAPI, OpenCalais, Lupedia
Fused output: (eN1,tN1,URIN1,sN1,eN1), (eN2,tN2,URIN2,sN2,eN2)
31. NERD Global results
Strategy | SLR | Precision | Recall | F-measure | %correct
combined | 86.85% | 35.31% | 17.69% | 23.44% | 17.69%
combined+ | 188.81% | 15.13% | 28.40% | 19.45% | 28.40%
Combined+: the Eval corpus differs substantially from the Train & Dev corpora. The static rules do not fit the Eval corpus well and they introduce classification noise.
33. NERD + Synote: http://linkeddata.synote.org
Synote:
34. WoLE Workshop
WoLE2012 Workshop in conjunction with the ISWC2012 conference
http://wole2012.eurecom.fr
37. Building the data.eurecom.fr
38. Zenaminer
Publish SCORM content in the Web of Data
separating the content from the layout
Introduce the use of media element / fragments
Automatic annotation of user comments using NER tools
hypertext link navigation to key terms and entities
satisfy better the information needs of the learner
See also: http://zenaminer.sourceforge.net/
39. Example application: Link OpenLearn to relevant course/podcasts
Credit: Mathieu D’Aquin
See also: Zablith et al, LinkedLearning 2011
41. Take Home Message
Video is a first class citizen on the Web
Annotations: Ontology and API for Media Resources
Access: Media Fragments URI
NERD platform for extracting key information from learning resources including videos
Linked Universities movement for federating initiatives in exposing educational data as linked data
42. Media Mixer
Vision: adoption of semantic multimedia technologies will foster a European market for media fragment re-purposing and re-selling
EU FP7 CSA: November 2012 - November 2014
43. Credits
Giuseppe Rizzo (Zenaminer, NERD)
Anne Elisabeth Gazet (data.eurecom.fr)
Mathieu D’Aquin (LinkedUniversities, Lucero)