London Information Retrieval Meetup
19 Feb 2019
Introduction to Music Information Retrieval
Thoughts from a former bass player
Andrea Gazzarini, Software Engineer
19th February 2019
Who I am
▪ Software Engineer (1999-)
▪ “Hermit” Software Engineer (2010-)
▪ Passionate about Java & Information Retrieval
▪ Apache Qpid (past) Committer
▪ Husband & Father
▪ Bass Player
Andrea Gazzarini, “Gazza”
Sease
Search Services
● Open Source Enthusiasts
● Apache Lucene/Solr experts
● Community Contributors
● Active Researchers
● Hot Trends : Learning To Rank, Document Similarity,
Search Quality Evaluation, Relevancy Tuning
✓ Music Information Retrieval (MIR)?
➢ Music Essentials
➢ Audio Processing
➢ Q&A
Agenda
“MIR is concerned with the extraction, analysis and usage of information about any kind of music
entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI
representation of a piece of music, or name of a music artist).”
Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web.
Dissertation, Johannes Kepler University, Wien (2003)
Music information retrieval (MIR) is the interdisciplinary science of retrieving information from
music. MIR is a small but growing field of research with many real-world applications. Those involved in
MIR may have a background in musicology, psychoacoustics, psychology, academic music study,
signal processing, informatics, machine learning, optical music recognition, computational intelligence or
some combination of these.
https://en.wikipedia.org/wiki/Music_information_retrieval
Music Information Retrieval (MIR)
AUDIO IDENTIFICATION
GENRE IDENTIFICATION
TRANSCRIPTION
RECOMMENDATION
COVER SONG DETECTION
SYMBOLIC SIMILARITY
MOOD
SOURCE SEPARATION
INSTRUMENT RECOGNITION
TEMPO ESTIMATION
SCORE ALIGNMENT
SONG STRUCTURE
BEAT TRACKING
KEY DETECTION
QUERY BY HUMMING
Music Information Retrieval (MIR)
Computational Factors
▪ Music Content: the low-level features we can extract from the audio signal (e.g. time,
frequencies, loudness)
▪ Music Context: additional metadata that cannot be extracted from the audio signal (e.g. lyrics,
tags, artists, feedback, posts)
▪ Listener State: the state of the user at a given moment (e.g. mood, musical knowledge, preferences)
▪ Listener Context: the environment the listener is in at a given moment (e.g. political,
geographical, social)
Factors in Music Perception
➢ Music Information Retrieval (MIR)
✓ Music Essentials
‣ Essentials
‣ Score Music Representation
‣ Symbolic Representations
‣ Audio Representation
➢ Audio Processing
➢ Q&A
Agenda
A note denotes a sound, together with its pitch and duration
A sound is the audio signal produced by a vibrating body
Notes are associated with graphical symbols (indicating the pitch and the duration)
Two notes whose fundamental frequencies are in a ratio of an integer power of two (that is, one or more
octaves apart) are perceived as similar. As a consequence, we say they belong to the same pitch class
The term note is also used to denote a pitch class. Traditional Western music theory identifies 12 pitch classes
Notes and pitch classes are associated with mnemonic codes (e.g. C, D, E, F, G, A, B or DO, RE, MI, FA, SOL, LA, SI)
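[Figure: the natural notes ascending (C D E F G A B C) and descending (C B A G F E D C), with the accidentals written as sharps when ascending (C# D# F# G# A#) and as flats when descending (Bb Ab Gb Eb Db)]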
Music Language Essentials
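The octave-equivalence idea behind pitch classes can be sketched in a few lines. This is only an illustration: it assumes 12-tone equal temperament tuned to A4 = 440 Hz (a common convention that the slide does not state), and the name `pitch_class` is hypothetical.

```python
import math

# The 12 pitch classes, written with sharps
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, a4_hz=440.0):
    """Return the pitch class of a frequency (assumes equal temperament, A4 = a4_hz)."""
    # Distance from A4 in semitones, rounded to the nearest semitone
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4_hz))
    # A sits at index 9; modulo 12 folds every octave onto the same class
    return PITCH_CLASSES[(9 + semitones_from_a4) % 12]

# 220 Hz, 440 Hz and 880 Hz differ by powers of two, so they share a pitch class
print(pitch_class(220.0), pitch_class(440.0), pitch_class(880.0))
```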
Text Music
Letter Note
Word
Phrase
Sentence
Chord
Ghost Note
Phrase
Text vs Music
Time Signature
Key Signature
Clef
Tempo
Note
Reference Chord
Chord
Score music representation
Symbolic music representations comprise any
kind of score representation with an explicit
encoding of notes or other musical events.
Piano Roll originally denoted a roll of paper
whose holes controlled the execution of a melody
on a self-playing device; nowadays the term
refers to a digital visualisation that
shows pitches over time.
Musical Instrument Digital Interface (MIDI) is
another widely adopted representation, which
encodes musical events (e.g. pitch, velocity,
duration, intensity)
Piano Roll & MIDI
Symbolic music representation
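A digital piano roll can be thought of as a pitch-by-time matrix. The sketch below builds one with NumPy from a list of note events; the tuple layout `(pitch, start_beat, duration_beats)`, the `steps_per_beat` grid and the sample melody are assumptions made for illustration, not part of any MIDI library.

```python
import numpy as np

def piano_roll(notes, steps_per_beat=4):
    """Build a (128 x T) binary matrix from (pitch, start_beat, duration_beats) tuples."""
    total_beats = max(start + dur for _, start, dur in notes)
    n_steps = int(round(total_beats * steps_per_beat))
    # 128 rows: one per MIDI note number (0..127)
    roll = np.zeros((128, n_steps), dtype=np.uint8)
    for pitch, start, dur in notes:
        begin = int(round(start * steps_per_beat))
        end = int(round((start + dur) * steps_per_beat))
        roll[pitch, begin:end] = 1  # mark the cells where the note sounds
    return roll

# Hypothetical melody: C4, E4, G4 (MIDI 60, 64, 67), one beat each
melody = [(60, 0, 1), (64, 1, 1), (67, 2, 1)]
roll = piano_roll(melody)
print(roll.shape)
```

Rows are pitches and columns are time steps, so plotting the matrix as an image gives the familiar piano-roll view.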
MusicXML [1] is an XML dialect for expressing music.
As you can imagine from the example on the right,
encoding a whole song results in a huge and
verbose textual representation (that’s XML!).
For that reason MusicXML 2.0 introduced a
compressed format with a .mxl suffix
• Widely supported (scorewriting, OCR, sequencer)
• Easy to understand
• Full support of music features
MusicXML
Part
Time
Clef
Note(s)
[1] https://www.musicxml.com
MusicXML
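To give a feel for the Part / Time / Clef / Note structure, here is a minimal sketch that parses a tiny hand-written MusicXML fragment (a single whole note, C4) with Python's standard `xml.etree.ElementTree`. The fragment and the helper name `extract_notes` are illustrative assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hand-written sample: one part, one measure, one whole note (C4)
SCORE = """
<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <key><fifths>0</fifths></key>
        <time><beats>4</beats><beat-type>4</beat-type></time>
        <clef><sign>G</sign><line>2</line></clef>
      </attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <type>whole</type>
      </note>
    </measure>
  </part>
</score-partwise>
"""

def extract_notes(xml_text):
    """Return (step, octave, duration) for every pitched note in the document."""
    root = ET.fromstring(xml_text.strip())
    return [
        (n.findtext("pitch/step"), int(n.findtext("pitch/octave")), int(n.findtext("duration")))
        for n in root.iter("note")
        if n.find("pitch") is not None  # rests carry no <pitch> element
    ]

print(extract_notes(SCORE))
```

Even this single note needs about twenty lines of XML, which illustrates the verbosity mentioned above.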
The Parsons code, formally named the Parsons
code for melodic contours, is a simple notation
used to identify a piece of music through melodic
motion — movements of the pitch up and down.
(https://en.wikipedia.org/wiki/Parsons_code)
The encoding focuses on the pitch relation between
subsequent notes. The main points about this method are:
• Simplicity
• Being a textual encoding, it poses interesting
challenges for text search engines
• Limited: it ignores important features such as
time, intervals, rests and ghost notes
Parsons Code
Symbol   Description
*        First note of a sequence
u, /     “up”: the note is higher than the previous one
d, \     “down”: the note is lower than the previous one
r, -     “repeat”: the note is the same as the previous one
Parsons Code (1/4)
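The encoding rules in the table can be sketched as a short function. It assumes the melody is already available as a list of comparable pitch values (MIDI note numbers in this example); the name `parsons_code` is hypothetical.

```python
def parsons_code(pitches):
    """Encode a pitch sequence as a Parsons code string (*, u, d, r)."""
    if not pitches:
        return ""
    code = ["*"]  # the first note is only a reference point
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            code.append("u")  # up
        elif current < previous:
            code.append("d")  # down
        else:
            code.append("r")  # repeat
    return "".join(code)

# Opening of "Twinkle Twinkle Little Star": C C G G A A G
print(parsons_code([60, 60, 67, 67, 69, 69, 67]))  # *rururd
```

Because the output is plain text, it can be indexed and searched like any other string, which is where the text search engine challenges come in.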
Parsons Code (2/4)
[Figure: score excerpts annotated note by note with their Parsons code symbols (*, u, d, r), including an excerpt from “Money” by Pink Floyd]
Parsons Code (3/4)
Tempo (Time)
Intervals
Rests
Ghost Notes
Parsons Code (4/4)
An audio signal is a representation of sound that describes the fluctuation in air pressure
caused by the vibration as a function of time. Unlike sheet music or symbolic representations,
audio representations encode everything that is necessary to reproduce an acoustic realization
of a piece of music.
Digital computers can only capture this data at discrete moments in time. The rate at which a
computer captures audio data is called the sampling frequency or sampling rate.
Audio Representation: Time Domain
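Sampling can be illustrated by evaluating a continuous tone only at discrete instants. The sketch below assumes a pure 440 Hz sine and a 44100 Hz sampling rate (typical values chosen for the example); `sample_sine` is a hypothetical helper.

```python
import numpy as np

def sample_sine(freq_hz, duration_s, sample_rate):
    """Sample a sine wave of the given frequency at sample_rate samples per second."""
    n_samples = int(round(duration_s * sample_rate))
    # Discrete time instants: one sample every 1 / sample_rate seconds
    t = np.arange(n_samples) / sample_rate
    return t, np.sin(2 * np.pi * freq_hz * t)

# 10 ms of a 440 Hz tone at 44100 samples per second
t, signal = sample_sine(440.0, 0.01, 44100)
print(len(signal))
```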
The Frequency Domain representation
decomposes the audio signal into a number of
waves oscillating at different frequencies.
The FD plot shows the frequencies on the
horizontal axis against their corresponding
magnitude (power) on the vertical axis.
This representation, among other things, can
be used for highlighting the dominant
frequencies of a musical tone.
Frequency Domain
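As a rough sketch of how the dominant frequency of a tone can be read off the frequency domain, the example below applies NumPy's real FFT to a synthetic signal. The test tone (440 Hz plus a weaker 880 Hz component) and the 8000 Hz sampling rate are assumptions chosen for the example.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest component in the signal."""
    magnitudes = np.abs(np.fft.rfft(signal))                   # magnitude per frequency bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)  # bin centre frequencies
    return freqs[np.argmax(magnitudes)]

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate  # one second of audio
tone = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)
print(dominant_frequency(tone, sample_rate))
```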
➢ Music Information Retrieval (MIR)
➢ Music Essentials
✓ Audio Processing
‣ Basic Pipeline
‣ Time Domain Features
‣ Frequency Domain Features
‣ Chroma Features
➢ Q&A
Agenda
Analog Signal → Sampling / Quantization → Framing → Time Domain Features Extraction
Framing → Windowing → FFT → Frequency Domain Features Extraction
Basic Audio Processing Pipeline
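The framing, windowing and FFT stages can be sketched with NumPy as follows. The frame size, hop size and the choice of a Hann window are assumptions (common defaults, not values given in the slides), and both function names are hypothetical.

```python
import numpy as np

def frame_signal(signal, frame_size, hop_size):
    """Split a 1-D signal into overlapping frames (requires len(signal) >= frame_size)."""
    n_frames = 1 + (len(signal) - frame_size) // hop_size
    return np.stack([
        signal[i * hop_size : i * hop_size + frame_size] for i in range(n_frames)
    ])

def magnitude_spectrogram(signal, frame_size=1024, hop_size=512):
    """Framing -> windowing -> FFT, returning one magnitude spectrum per frame."""
    frames = frame_signal(signal, frame_size, hop_size)
    window = np.hanning(frame_size)  # tapers frame edges to reduce spectral leakage
    return np.abs(np.fft.rfft(frames * window, axis=1))

# 4096 samples of a 440 Hz tone sampled at 8000 Hz
signal = np.sin(2 * np.pi * 440 * np.arange(4096) / 8000)
spectrogram = magnitude_spectrogram(signal)
print(spectrogram.shape)  # (number of frames, frame_size // 2 + 1)
```

Time domain features would be computed on the output of `frame_signal`, frequency domain features on the output of `magnitude_spectrogram`.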
Feature
• Amplitude Envelope (AE): max amplitude within a frame
• Root-Mean-Square Energy (RMS): perceived sound intensity
• Zero Crossing Rate (ZCR): number of times the amplitude changes its sign within a frame
Usage
• Loudness Estimation
• Timbre Analysis
• Speech Recognition
• Audio Segmentation
• Onset Detection
Time Domain Features
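A minimal sketch of the three features, each computed on a single frame held in a NumPy array. The function names are hypothetical, and taking the absolute value for the amplitude envelope is an assumption about how "max amplitude" is meant.

```python
import numpy as np

def amplitude_envelope(frame):
    """Largest absolute sample value in the frame (assumption: negative peaks count too)."""
    return float(np.max(np.abs(frame)))

def rms_energy(frame):
    """Root of the mean of the squared samples."""
    return float(np.sqrt(np.mean(frame ** 2)))

def zero_crossing_rate(frame):
    """Number of sign changes between consecutive samples.

    Samples that are exactly zero are not counted as crossings in this simple version.
    """
    signs = np.sign(frame)
    return int(np.sum(signs[:-1] * signs[1:] < 0))

frame = np.array([1.0, -1.0, 1.0, -1.0])
print(amplitude_envelope(frame), rms_energy(frame), zero_crossing_rate(frame))
```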
Feature
• Band Energy Ratio (BER): ratio between the energy in the lower and higher frequency bands
• Spectral Centroid: frequency band where most of the energy is concentrated
• Bandwidth (BW): spectral range of the interesting part of a signal
• Spectral Flux: how much the spectrum changes from one frame to the next
Usage
• Timbre Analysis
• Speech Recognition
• Onset Detection
• Speech/Music Discrimination
Frequency Domain Features
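A minimal sketch of three of these features, computed from the magnitude spectrum of one frame. The 2000 Hz split frequency and the magnitude-weighted standard deviation used for bandwidth are assumptions (one common formulation among several); the function names are hypothetical, and a non-silent frame is assumed so that no division by zero occurs.

```python
import numpy as np

def spectral_centroid(magnitudes, freqs):
    """Magnitude-weighted mean frequency: the 'centre of mass' of the spectrum."""
    return float(np.sum(freqs * magnitudes) / np.sum(magnitudes))

def bandwidth(magnitudes, freqs):
    """Magnitude-weighted spread of the spectrum around its centroid."""
    centroid = spectral_centroid(magnitudes, freqs)
    return float(np.sqrt(np.sum(magnitudes * (freqs - centroid) ** 2) / np.sum(magnitudes)))

def band_energy_ratio(magnitudes, freqs, split_hz=2000.0):
    """Energy below split_hz divided by energy at or above it."""
    power = magnitudes ** 2
    return float(power[freqs < split_hz].sum() / power[freqs >= split_hz].sum())

# Toy spectrum with two equal peaks, at 1000 Hz and 3000 Hz
freqs = np.array([0.0, 1000.0, 2000.0, 3000.0])
magnitudes = np.array([0.0, 1.0, 0.0, 1.0])
print(spectral_centroid(magnitudes, freqs), bandwidth(magnitudes, freqs),
      band_energy_ratio(magnitudes, freqs))
```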
Chroma features are a powerful representation for
music audio in which the entire spectrum is
projected onto 12 bins representing the 12 distinct
semitones (or chroma) of the musical octave.
This kind of analysis bridges low-level
and mid-level features, moving the audio signal
representation toward something more
readable from a functional perspective.
Chroma Features (1/2)
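The projection onto 12 bins can be sketched by assigning every FFT bin to the pitch class of its centre frequency and summing the energy per class. This is a simplified illustration, not how any particular library computes chroma: it assumes equal temperament with A4 = 440 Hz, a Hann window, and ignores bins below 20 Hz; `chroma_vector` is a hypothetical name.

```python
import numpy as np

def chroma_vector(signal, sample_rate, a4_hz=440.0):
    """Fold the power spectrum of a signal onto 12 pitch-class bins (C = 0 ... B = 11)."""
    magnitudes = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    audible = freqs > 20.0  # skip DC and very low bins (also avoids log2(0))
    # MIDI note number of each bin, then its pitch class
    midi = 69 + 12 * np.log2(freqs[audible] / a4_hz)
    classes = np.round(midi).astype(int) % 12
    chroma = np.zeros(12)
    np.add.at(chroma, classes, magnitudes[audible] ** 2)  # accumulate energy per class
    return chroma / chroma.sum()

sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
chroma = chroma_vector(np.sin(2 * np.pi * 440 * t), sample_rate)
print(int(np.argmax(chroma)))  # index 9 corresponds to A
```

Computing one such vector per frame and stacking them over time gives a chromagram.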
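[Figure: chromagram with time on the horizontal axis and the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) on the vertical axis. Note sequence read from the plot: A A A A A A C A, F F F F F F F G, C C C C C C D C, B B B B B B C B; one region of the plot is labelled as noise]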
Chroma Features (2/2)
FALCON: FAst Lucene-based Cover sOng identification | chromaprint (part of AcoustID)
Interesting Projects
➢ Music Information Retrieval (MIR)
➢ Music Essentials
➢ Audio Processing
✓ Q&A
Agenda
Thank you!

Introduction to Music Information Retrieval

  • 1. London Information Retrieval Meetup 19 Feb 2019 Introduction to Music Information Retrieval Thoughts from a former bass player Andrea Gazzarini, Software Engineer 19th February 2019
  • 2. London Information Retrieval Meetup Who I am ▪ Software Engineer (1999-) ▪ “Hermit” Software Engineer (2010-) ▪ Java & Information Retrieval Passionate ▪ Apache Qpid (past) Committer ▪ Husband & Father ▪ Bass Player Andrea Gazzarini, “Gazza”
  • 3. London Information Retrieval Meetup Sease Search Services ● Open Source Enthusiasts ● Apache Lucene/Solr experts ! Community Contributors ● Active Researchers ● Hot Trends : Learning To Rank, Document Similarity, Search Quality Evaluation, Relevancy Tuning
  • 4. London Information Retrieval Meetup ✓Music Information Retrieval (MIR)? ➢ Music Essentials ➢ Audio Processing ➢ Q&A Agenda
  • 5. London Information Retrieval Meetup MIR is concerned with the extraction, analysis and usage of information about any kind of music entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI representation of a piece of music, or name of a music artist).” Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web. Dissertation, Johannes Kepler University, Wien (2003) Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these. https://en.wikipedia.org/wiki/Music_information_retrieval Music Information Retrieval (MIR)
  • 6. Music Information Retrieval (MIR) AUDIO IDENTIFICATION, GENRE IDENTIFICATION, TRANSCRIPTION, RECOMMENDATION, COVER SONG DETECTION, SYMBOLIC SIMILARITY, MOOD, SOURCE SEPARATION, INSTRUMENT RECOGNITION, TEMPO ESTIMATION, SCORE ALIGNMENT, SONG STRUCTURE, BEAT TRACKING, KEY DETECTION, QUERY BY HUMMING
  • 7. Factors in Music Perception Computational factors: Music Content, Music Context, Listener State, Listener Context. Music Content includes the low-level features we can extract from the audio signal (e.g. time, frequencies, loudness). Music Context defines additional metadata that cannot be extracted from the audio signal (e.g. lyrics, tags, artists, feedback, posts). Listener State covers the listener's state at a given moment (e.g. mood, musical knowledge, preferences). Listener Context relates to the environment the listener is in at a given moment (e.g. political, geographical, social).
  • 8. Agenda ➢ Music Information Retrieval (MIR) ✓ Music Essentials ‣ Essentials ‣ Score Music Representation ‣ Symbolic Representations ‣ Audio Representation ➢ Audio Processing ➢ Q&A
  • 9. Music Language Essentials A sound is the audio signal produced by a vibrating body. A note denotes a sound, its pitch and its duration. Notes are associated with graphical symbols that indicate the pitch and the duration. Two notes whose fundamental frequencies are in a ratio of any integer power of two are perceived as similar; as a consequence, we say they belong to the same pitch class. A note name is also used to denote a pitch class: traditional music theory identifies 12 pitch classes. Notes and pitch classes are associated with mnemonic codes (e.g. C, D, E, F, G, A, B or DO, RE, MI, FA, SOL, LA, SI). Natural notes: C D E F G A B. Sharps: C# D# F# G# A#. Flats: Bb Ab Gb Eb Db.
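The octave rule on this slide (frequencies in a power-of-two ratio share a pitch class) can be sketched in a few lines. This is a minimal illustration, not something taken from the talk: it assumes 12-tone equal temperament and the common A4 = 440 Hz tuning reference, and the function names are hypothetical.

```python
import math

# The 12 pitch classes, written with sharps, starting from C.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Name the pitch class of a frequency (assumes equal temperament, A4 = 440 Hz)."""
    # Distance from A4 in semitones, rounded to the nearest semitone.
    semitones_from_a4 = round(12 * math.log2(frequency_hz / a4_hz))
    # A sits at index 9; the modulo folds every octave onto the same 12 names.
    return PITCH_CLASSES[(9 + semitones_from_a4) % 12]


def same_pitch_class(first_hz, second_hz):
    """True when two frequencies are perceived as 'the same note' in different octaves."""
    return pitch_class(first_hz) == pitch_class(second_hz)


print(pitch_class(440.0), pitch_class(880.0), pitch_class(110.0))
print(same_pitch_class(261.63, 523.25))
```

Doubling or halving a frequency shifts it by exactly 12 semitones, which the modulo operation removes; that is why 110 Hz, 440 Hz and 880 Hz all map to A.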
  • 10. Text vs Music Text: Letter, Word, Phrase, Sentence. Music: Note, Chord, Ghost Note, Phrase.
  • 11. Score music representation [Annotated score; labels: Time Signature · Key Signature · Clef · Tempo · Note Reference · Chord]
  • 12. Symbolic music representation: Piano Roll & MIDI Symbolic music representations comprise any kind of score representation with an explicit encoding of notes or other musical events. Piano Roll originally denoted the rolls of perforated paper that controlled the execution of a melody on a self-playing device; today the term refers to a digital visualisation that shows pitches over time. Musical Instrument Digital Interface (MIDI) is another widely adopted representation of music events (e.g. pitch, velocity, duration, intensity).
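A piano roll can be approximated as a grid with one row per pitch and one column per time step. The sketch below is only an illustration under assumptions: the event tuple `(midi_pitch, start_step, duration_steps)` and the function name are hypothetical, and the pitches use MIDI note numbers (60 is middle C).

```python
def piano_roll(notes, steps):
    """Render note events as text rows, highest pitch first.

    Each event is a hypothetical (midi_pitch, start_step, duration_steps) tuple.
    """
    pitches = sorted({pitch for pitch, _, _ in notes}, reverse=True)
    rows = []
    for pitch in pitches:
        cells = ["."] * steps
        for event_pitch, start, duration in notes:
            if event_pitch == pitch:
                # Mark every time step during which the note sounds.
                for step in range(start, min(start + duration, steps)):
                    cells[step] = "#"
        rows.append(f"{pitch:3d} " + "".join(cells))
    return rows


# Illustrative three-note phrase: C4 (60), E4 (64), G4 (67).
notes = [(60, 0, 2), (64, 2, 2), (67, 4, 4)]
roll = piano_roll(notes, steps=8)
print("\n".join(roll))
```

Time runs left to right and pitch bottom to top, which is the same layout most sequencers use for their piano-roll editors.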
  • 13. MusicXML MusicXML [1] is an XML dialect for expressing music. Encoding a whole song results in a huge, verbose textual representation (that’s XML!). For that reason MusicXML 2.0 introduced a compressed format with a .mxl suffix. • Widely supported (scorewriting, OCR, sequencer) • Easy to understand • Full support of music features Elements highlighted in the example: Part, Time, Clef, Note(s) [1] https://www.musicxml.com
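To give a feel for the Part, Time, Clef and Note elements, here is a small sketch that reads a one-measure score with Python's standard `xml.etree.ElementTree`. The embedded document is a minimal example written for this illustration (a single whole-note C4), not the example shown on the slide, and the `summarize` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# A minimal, illustrative MusicXML document: one part, one measure, one note.
SCORE = """
<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <key><fifths>0</fifths></key>
        <time><beats>4</beats><beat-type>4</beat-type></time>
        <clef><sign>G</sign><line>2</line></clef>
      </attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <type>whole</type>
      </note>
    </measure>
  </part>
</score-partwise>
"""


def summarize(score_xml):
    """Extract parts, time signature, clef and pitched notes from a score."""
    root = ET.fromstring(score_xml.strip())
    time = root.find("part/measure/attributes/time")
    return {
        "parts": [part.get("id") for part in root.findall("part")],
        "time": f"{time.findtext('beats')}/{time.findtext('beat-type')}",
        "clef": root.findtext("part/measure/attributes/clef/sign"),
        # Rests have no <pitch> child, so they are skipped here.
        "notes": [
            (n.findtext("pitch/step"), int(n.findtext("pitch/octave")), n.findtext("type"))
            for n in root.iter("note")
            if n.find("pitch") is not None
        ],
    }


summary = summarize(SCORE)
print(summary)
```

Roughly twenty lines of XML for a single note gives a sense of why a compressed container format was added.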
  • 14. Parsons Code (1/4) The Parsons code, formally named the Parsons code for melodic contours, is a simple notation used to identify a piece of music through melodic motion: movements of the pitch up and down. (https://en.wikipedia.org/wiki/Parsons_code) The encoding focuses on the pitch relation between subsequent notes. Main points about this method: • Simplicity • Being a textual encoding, it poses interesting challenges for text search engines • Limited: it ignores important features such as time, intervals, pauses and ghost notes. Symbols: * = first note of a sequence; u or / = “up”, the note is higher than the previous one; d or \ = “down”, the note is lower than the previous one; r or - = “repeat”, the note is the same as the previous one
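The encoding rule above fits in a few lines of code. This is a minimal sketch using the letter symbols (`*`, `u`, `d`, `r`); the function name is hypothetical, and the input is assumed to be a list of comparable pitch values such as MIDI note numbers.

```python
def parsons_code(pitches):
    """Encode a melody as a Parsons code: only the direction of each step is kept."""
    if not pitches:
        return ""
    code = ["*"]  # the first note has no predecessor to compare against
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            code.append("u")
        elif current < previous:
            code.append("d")
        else:
            code.append("r")
    return "".join(code)


# Opening of "Twinkle Twinkle Little Star" as MIDI note numbers:
# C C G G A A G F F E E D D C
twinkle = [60, 60, 67, 67, 69, 69, 67, 65, 65, 64, 64, 62, 62, 60]
print(parsons_code(twinkle))
```

Because the result is a plain string, it can be indexed and queried like any other text field, which is where both the simplicity and the limitations listed on the slide come from: a jump of one semitone and a jump of an octave both become `u`.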
  • 15. Parsons Code (2/4)
  • 16. Parsons Code (3/4) Money, Pink Floyd * * r u u rr u r u r d r d r d r d r u r u r u r u r * u d d d u u uX u d d d u u uXd
  • 17. Parsons Code (4/4) Tempo (Time), Intervals, Rests, Ghost Notes
  • 18. Audio Representation: Time Domain An audio signal is a representation of sound: it describes the fluctuation in air pressure caused by a vibration as a function of time. Unlike sheet music or symbolic representations, audio representations encode everything that is necessary to reproduce an acoustic realization of a piece of music. Digital computers can only capture this signal at discrete moments in time; the rate at which a computer captures audio data is called the sampling frequency or sampling rate.
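Sampling can be illustrated by evaluating a continuous tone at evenly spaced instants. The sketch below is an assumption-laden toy: the "analog" signal is a pure sine wave, the function names are hypothetical, and the 16-bit quantization step is included only to show how samples end up as integers.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, duration_s, amplitude=1.0):
    """Sample a sine tone at sampling_rate_hz samples per second."""
    n_samples = round(sampling_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(n_samples)
    ]


def quantize(samples, bits=16):
    """Map samples in [-1.0, 1.0] to signed integers with the given bit depth."""
    max_int = 2 ** (bits - 1) - 1
    return [round(sample * max_int) for sample in samples]


# 10 ms of a 440 Hz tone at 8 kHz: the computer keeps only 80 numbers.
samples = sample_sine(440.0, 8000, 0.01)
print(len(samples), quantize(samples)[:5])
```

A higher sampling rate stores more points per second and can therefore represent higher frequencies, at the cost of more data.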
  • 19. Frequency Domain The Frequency Domain (FD) representation decomposes the audio signal into a number of waves oscillating at different frequencies. The FD plot shows the frequencies on the horizontal axis against their corresponding magnitude (power) on the vertical axis. Among other things, this representation can be used to highlight the dominant frequencies of a musical tone.
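As a rough illustration of finding a dominant frequency, the sketch below computes a magnitude spectrum with a naive discrete Fourier transform. It is deliberately the slow textbook form rather than an FFT, the function names are hypothetical, and the test signal (a strong 50 Hz tone plus a weaker 120 Hz tone) is made up for the example.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(total))
    return spectrum


def dominant_frequency(samples, sampling_rate_hz):
    """Frequency (Hz) of the bin with the largest magnitude."""
    magnitudes = magnitude_spectrum(samples)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    # Bin k corresponds to k * sampling_rate / N Hz.
    return peak_bin * sampling_rate_hz / len(samples)


rate = 800
signal = [
    math.sin(2 * math.pi * 50 * n / rate) + 0.3 * math.sin(2 * math.pi * 120 * n / rate)
    for n in range(80)
]
print(dominant_frequency(signal, rate))
```

With 80 samples at 800 Hz each bin is 10 Hz wide, so both tones fall exactly on a bin and the stronger one wins.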
  • 20. Agenda ➢ Music Information Retrieval (MIR) ➢ Music Essentials ✓ Audio Processing ‣ Basic Pipeline ‣ Time Domain Features ‣ Frequency Domain Features ‣ Chroma Features ➢ Q&A
  • 21. Basic Audio Processing Pipeline Analog Signal → Sampling / Quantization → Framing → Time Domain Features Extraction; Framing → Windowing → FFT → Frequency Domain Features Extraction
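The framing and windowing stages can be sketched as follows. This is an illustration under assumptions: the slide does not say which window function is used, so a Hann window is chosen here as a common default, and the frame and hop sizes in the example are arbitrary.

```python
import math


def split_frames(samples, frame_size, hop_size):
    """Cut a signal into (possibly overlapping) frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_size]
        for start in range(0, len(samples) - frame_size + 1, hop_size)
    ]


def hann_window(size):
    """Hann window: tapers both ends of a frame towards zero (an assumed choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1)) for n in range(size)]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [sample * weight for sample, weight in zip(frame, window)]


signal = [float(n) for n in range(10)]
frames = split_frames(signal, frame_size=4, hop_size=2)
windowed = [apply_window(frame, hann_window(4)) for frame in frames]
print(len(frames), frames[-1])
```

Time domain features are computed on the raw frames, while the windowed frames are what gets passed to the FFT; tapering the frame edges reduces the spectral leakage that abrupt frame boundaries would otherwise cause.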
  • 22. Time Domain Features Amplitude Envelope (AE): max amplitude within a frame. Root-Mean-Square Energy (RMS): root mean square of the samples in a frame, an indicator of perceived sound intensity. Zero Crossing Rate (ZCR): number of times the amplitude changes its sign within a frame. Example usage: Loudness Estimation, Timbre Analysis, Speech Recognition, Audio Segmentation, Onset Detection
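The three features on this slide are each a one-line computation over a frame of samples. The sketch below follows the slide's definitions; the function names are hypothetical, and treating a sample of exactly zero as non-negative is an assumption made to keep the sign test simple.

```python
import math


def amplitude_envelope(frame):
    """Maximum absolute amplitude within the frame."""
    return max(abs(sample) for sample in frame)


def rms_energy(frame):
    """Root mean square of the samples in the frame."""
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def zero_crossings(frame):
    """Number of sign changes between consecutive samples (zero counts as non-negative)."""
    return sum(
        1 for previous, current in zip(frame, frame[1:])
        if (previous < 0) != (current < 0)
    )


frame = [0.5, -0.5, 0.5, -0.5]
print(amplitude_envelope(frame), rms_energy(frame), zero_crossings(frame))
```

Dividing the crossing count by the frame length gives a rate that is comparable across frames of different sizes.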
  • 23. Frequency Domain Features Band Energy Ratio (BER): ratio between the energy in the lower and in the higher frequency bands. Spectral Centroid: frequency band where most of the energy is concentrated (the centre of mass of the spectrum). Bandwidth (BW): spectral range of the interesting part of a signal. Spectral Flux: how much the spectrum changes from one frame to the next. Example usage: Timbre Analysis, Speech Recognition, Onset Detection, Speech/Music Discrimination
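Three of these features can be sketched directly on a magnitude spectrum, given as a list with one value per frequency bin. The definitions below are common textbook forms and should be read as assumptions rather than as the formulas used in the talk; in particular, the split bin for the band energy ratio and the squared-difference form of spectral flux are choices made for this illustration.

```python
def spectral_centroid(magnitudes, bin_hz):
    """Magnitude-weighted mean frequency of the spectrum, in Hz."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(k * bin_hz * m for k, m in enumerate(magnitudes)) / total


def band_energy_ratio(magnitudes, split_bin):
    """Energy below split_bin divided by energy at or above it."""
    low = sum(m * m for m in magnitudes[:split_bin])
    high = sum(m * m for m in magnitudes[split_bin:])
    return low / high if high > 0 else float("inf")


def spectral_flux(previous, current):
    """Sum of squared magnitude differences between two consecutive frames."""
    return sum((c - p) ** 2 for p, c in zip(previous, current))


# Made-up spectra with 100 Hz wide bins.
frame_a = [0.0, 1.0, 0.0, 1.0]
frame_b = [1.0, 1.0, 0.0, 0.0]
print(spectral_centroid(frame_a, 100.0))
print(band_energy_ratio(frame_a, split_bin=2))
print(spectral_flux(frame_a, frame_b))
```

A sudden rise in spectral flux is one of the usual cues for onset detection, since a new note changes the spectrum abruptly.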
  • 24. Chroma Features (1/2) Chroma features are a powerful representation for music audio in which the entire spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical octave. This kind of analysis bridges low-level and mid-level features, moving the audio signal representation toward something more readable from a functional perspective.
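The projection onto 12 bins can be sketched by assigning each frequency bin to its nearest semitone and folding octaves together. This is a simplified illustration, not how production libraries necessarily do it: it assumes equal temperament with A4 = 440 Hz, uses MIDI note numbers (69 for A4) for the mapping, and simply sums magnitudes per pitch class.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_vector(magnitudes, bin_hz, a4_hz=440.0):
    """Fold a magnitude spectrum into 12 pitch-class bins (index 0 is C)."""
    chroma = [0.0] * 12
    for k, magnitude in enumerate(magnitudes):
        if k == 0:
            continue  # the 0 Hz bin has no pitch
        frequency = k * bin_hz
        # Nearest MIDI note number; note % 12 gives the pitch class (C = 0).
        midi_note = round(69 + 12 * math.log2(frequency / a4_hz))
        chroma[midi_note % 12] += magnitude
    return chroma


# Made-up spectrum with 110 Hz wide bins: energy at 110, 220 and 440 Hz.
spectrum = [0.0, 1.0, 1.0, 0.0, 1.0]
chroma = chroma_vector(spectrum, bin_hz=110.0)
strongest = PITCH_CLASSES[max(range(12), key=chroma.__getitem__)]
print(strongest, chroma)
```

All three peaks are the note A in different octaves, so they accumulate in a single bin; computing one such vector per frame and stacking them over time yields a chromagram.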
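  • 25. Chroma Features (2/2) [Figure: chromagram with time on the horizontal axis and the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) on the vertical axis; note labels over time: A A A A A A C A, F F F F F F F G, C C C C C C D C, B B B B B B C B; one region labelled NOISE]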
  • 26. Interesting Projects FALCON: FAst Lucene-based Cover sOng identification | Chromaprint (part of AcoustID)
  • 27. Agenda ➢ Music Information Retrieval (MIR) ➢ Music Representation ➢ Audio Processing ✓ Q&A
  • 28. Thank you! Introduction to Music Information Retrieval Thoughts from a former bass player Andrea Gazzarini, Software Engineer 19th February 2019