New Life for Old Media
Investigations into Speech Synthesis and Deep Learning-based Colorization for
Audiovisual Archive
Rudy Marsman, Victor de Boer, Themistoklis Karavellas, Johan Oomen
Netherlands Institute for Sound and Vision (NISV)
70% audio-visual heritage material
More than 1,000,000 hours of
TV (public broadcasters)
Radio, music, documentaries, film, commercials, etc.
Photographs, objects, …
CC BY-SA as preferred license
3000 items “Internet Quality”
Polygoon newsreels
Supporting a National and
European Audiovisual Commons
Public outreach by embracing
new technologies and
‘participatory culture’
Openbeelden.nl / openimages.eu
Explore AI techniques to enrich this archival
material to allow for new types of engagement
1. Text-to-speech engine based on a limited corpus from a single narrator
2. Colorization of old black-and-white video footage
Philip Bloemendal
Famous anchorman
Iconic voice
tiny.cc/voiceNL
(not a virus)
Limited Domain Speech Synthesis
Can the current corpus of audio recordings
of Bloemendal be used to construct a TTS
engine?
• What percentage of the Dutch language can be
generated with the current corpus?
• What can we do to improve?
• How recognizable is the text-to-speech output
as Philip Bloemendal?
• How understandable are the constructed
audio files?
Slot-and-filler Text-to-speech
Text: The Dutch football team played Germany
Audio: the.wav dutch.wav football.wav
Spoken Language Elements Repository (35,000 words)
Source: 3,300 newsreels, speech recognition
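The slot-and-filler lookup on this slide can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the `index` dictionary, the clip file names, and the function name `plan_utterance` are assumptions made for the example.

```python
import re

def plan_utterance(sentence, index):
    """Look up a recorded clip for each word of a sentence.

    `index` is a hypothetical mapping from lowercase word to clip path.
    Returns (clips, missing): clip paths in spoken order, and the words
    for which no recording exists.
    """
    words = re.findall(r"\w+", sentence.lower())
    clips, missing = [], []
    for word in words:
        if word in index:
            clips.append(index[word])
        else:
            missing.append(word)
    return clips, missing

# Toy index for illustration; the slide mentions about 35,000 words.
index = {"the": "the.wav", "dutch": "dutch.wav", "football": "football.wav"}
clips, missing = plan_utterance("The Dutch football team played Germany", index)
```

The clips would then be concatenated in order. Words that end up in `missing` have no recording in the repository and cannot be spoken as-is.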
How to expand the coverage of the index?
•Many (contemporary) words have not
been pronounced by Philip Bloemendal
•Multiple strategies
–Change format (Lowercase, diaeresis)
–Numbers
–Finding synonyms
–Decompounding
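The "change format" strategy could look something like the sketch below: try the word as written, then lowercased, then with the diaeresis removed. The function names and the order of the variants are assumptions for illustration; the slides do not specify them.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"

def strip_diaeresis(word):
    """Remove only the diaeresis (as in 'ë'); other accents are kept."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != COMBINING_DIAERESIS)
    return unicodedata.normalize("NFC", kept)

def lookup_variants(word):
    """Yield spellings to try against the index, most faithful first."""
    seen = set()
    for variant in (word, word.lower(), strip_diaeresis(word.lower())):
        if variant not in seen:
            seen.add(variant)
            yield variant
```

A lookup would stop at the first variant that is present in the index, so the original spelling wins whenever it was actually recorded.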
Finding Synonyms
• Open Dutch Wordnet
Dutch lexical semantic database
(Postma et al. 2016)
• Yields synsets
(e.g. Hoofdmeester -> Rector, Schoolhoofd)
• Computationally expensive lookup
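A synonym fallback along these lines might be sketched as below. The synsets are given here as plain Python sets; this is a stand-in for a real Open Dutch Wordnet lookup, whose API is not shown in the slides, and `find_spoken_synonym` is a hypothetical name.

```python
def find_spoken_synonym(word, synsets, index):
    """Return a synonym of `word` that exists in the index, or None.

    `synsets` is a hypothetical list of sets of interchangeable words;
    `index` is the set of words for which a recording exists.
    """
    for synset in synsets:
        if word not in synset:
            continue
        # Sorted so that the result does not depend on set ordering.
        for candidate in sorted(synset):
            if candidate != word and candidate in index:
                return candidate
    return None

synsets = [{"hoofdmeester", "rector", "schoolhoofd"}]
recorded = {"rector", "school"}
substitute = find_spoken_synonym("hoofdmeester", synsets, recorded)
```

Scanning every synset for every missing word is linear in the size of the lexicon, which fits the slide's remark that the lookup is computationally expensive.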
Decompounding
• Dutch allows compound words;
each compound is a distinct
word in the corpus
• Decompounding is
computationally expensive (for
large corpora, long words)
• Constructed Bigrams and Trigrams
School, hoofd -> Schoolhoofd
Regen, water -> regenwater
Staat, hoofd -> StaatShoofd
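One simple way to illustrate the compound examples above is to split an unknown word into two known parts, allowing an optional linking "s" as in "staatShoofd". Note that this sketch splits a compound, whereas the slide describes constructing bigrams and trigrams; it is an assumption-laden illustration, not the authors' method, and `split_compound` and `min_part` are hypothetical names.

```python
def split_compound(word, index, min_part=3):
    """Split `word` into two parts that both occur in `index`.

    Allows a linking 's' between the parts. Returns [head, tail],
    or None when no split is found.
    """
    word = word.lower()
    for i in range(min_part, len(word) - min_part + 1):
        head, tail = word[:i], word[i:]
        if tail not in index:
            continue
        if head in index:
            return [head, tail]
        # Linking 's': "staats" + "hoofd" -> "staat" + "hoofd"
        if head.endswith("s") and len(head) - 1 >= min_part and head[:-1] in index:
            return [head[:-1], tail]
    return None

recorded = {"school", "hoofd", "regen", "water", "staat"}
parts = split_compound("Staatshoofd", recorded)
```

Each candidate split costs a couple of set lookups, so long words and large corpora multiply the work, in line with the cost noted on the slide.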
4 corpora to test against
•News articles (same domain, different time) | 50 articles | 2,743 unique words
•1970s news articles (same domain, same time) | 50 articles | 16,191 words
•E-books (different domain, various times) | 6 books | 2,657 words
•Tweets (different domain, different time) | 1,000 tweets | 27,180 words
• Evaluation
– Number of distinct words
– Number of sentences
Evaluation
Results (words): coverage
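Word-level coverage of a test corpus can be computed as the share of its distinct words that occur in the index. The sketch below assumes a simple regular-expression tokenizer and a set-based index; both are illustrative choices, not details taken from the slides.

```python
import re

def word_coverage(texts, index):
    """Fraction of distinct words in `texts` that are present in `index`."""
    distinct = set()
    for text in texts:
        distinct.update(re.findall(r"\w+", text.lower()))
    if not distinct:
        return 0.0
    covered = sum(1 for word in distinct if word in index)
    return covered / len(distinct)

# Toy data: two of the four distinct words have a recording.
coverage = word_coverage(["De trein rijdt", "de trein stopt"], {"de", "trein"})
```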
• 8 people tested the software
• Philip was recognized (or ‘that news guy’)
• Words with more consonants were easier to recognize
• When users entered their own sentences, recognition was higher
• When sentences were played without subtitles, recognition was lower
• Speed of software / GUI limited testing capabilities
How recognisable are sentences?
The use of Deep Neural Networks in colorizing video
Neural Networks
Recent progress in computational power made implementation
of Deep Neural Nets possible
Neural Networks trained on large training sets can make accurate
predictions on real-world examples
Zhang et al. (2016) trained a neural net
on over a million images for colorization
http://richzhang.github.io/colorization/
Existing Literature
• Extract individual frames from video using FFMPEG
• Colorize each individual frame
• Re-compile video and attach original audio file
Outcome
Extract 200x200 frames at 24 fps (ffmpeg)
Zhang et al. implemented in TensorFlow
Combine into videos (ffmpeg)
Implementation on Video
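The two ffmpeg steps of this pipeline might be driven from Python roughly as shown below. The file names, the six-digit frame pattern, and the choice of the libx264 encoder are assumptions for illustration; the colorization step itself is left out, and the slides do not give the exact commands used.

```python
def extract_frames_cmd(video, frame_dir, size=200, fps=24):
    """Build an ffmpeg command that writes resized frames as PNG files."""
    return [
        "ffmpeg", "-i", video,
        "-vf", f"fps={fps},scale={size}:{size}",
        f"{frame_dir}/%06d.png",
    ]

def rebuild_video_cmd(frame_dir, original_video, output, fps=24):
    """Build an ffmpeg command that joins frames and reuses the original audio."""
    return [
        "ffmpeg",
        "-framerate", str(fps), "-i", f"{frame_dir}/%06d.png",
        "-i", original_video,
        "-map", "0:v:0",   # video from the colorized frames
        "-map", "1:a:0",   # audio from the original file
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "copy", "-shortest",
        output,
    ]

# To run a command: subprocess.run(cmd, check=True), with ffmpeg on the PATH.
extract = extract_frames_cmd("newsreel.mp4", "frames")
rebuild = rebuild_video_cmd("colorized", "newsreel.mp4", "newsreel_color.mp4")
```

Building the commands as lists keeps them easy to inspect before anything is executed and avoids shell quoting problems.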
• Colorized videos are more ‘tangible’ and ‘alive’ than black/white
• Showing colorized Polygoonjournaals can augment TTS engine
• General positive responses on technology may increase attention to NISV collection
Outcome
• Each frame is considered
independent and is colorized as such
--> Artifacts appear between frames
• Slow performance without use of
Nvidia GPU
• Low resolution
• Predicted colors still far from perfect
Challenges
www.openbeelden.nl/tags/ingekleurd
Hosted on Openbeelden
platform
One of the colorized videos
received 61,000+ views, 1,700
likes and was shared 521 times,
illustrating the potential to
engage new audiences.
tiny.cc/colorNL
• Collection-specific TTS systems for audio-enrichments of archive
material or multimedia applications.
• Colorization of old media allows for a new view on existing images
• NISV will continue investigating these emerging technologies to
enable new types of interaction and to further engage new
audiences with archival material in unexpected ways.
– In the media museum
– On its public-facing online channels.
Take home
Thank you
Annex: Results (sentences)
Dataset | Unique sentences | Unique sentences found | After synsets | After decompounding
Contemporary news | 1022 | 106 | 110 | 186
Old news | 2626 | 183 | 190 | 301
Tweets | 8937 | 174 | 181 | 296
Books | 56106 | 9387 | 11385 | 18271
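To compare datasets of very different sizes, the counts in the annex table can be turned into percentages of each dataset's unique sentences. The sketch below only restates the table's numbers; the variable and function names are made up for the example.

```python
# (unique sentences, found, after synsets, after decompounding)
results = {
    "Contemporary news": (1022, 106, 110, 186),
    "Old news": (2626, 183, 190, 301),
    "Tweets": (8937, 174, 181, 296),
    "Books": (56106, 9387, 11385, 18271),
}

def sentence_coverage(row):
    """Return the three 'found' counts as percentages of the unique sentences."""
    total, *found = row
    return [round(100 * count / total, 1) for count in found]

for dataset, row in results.items():
    print(dataset, sentence_coverage(row))
```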
