London Information Retrieval Meetup
19 Feb 2019
Introduction to Music Information Retrieval
Thoughts from a former bass player
Andrea Gazzarini, Software Engineer
19th February 2019
Who I am
▪ Software Engineer (1999-)
▪ “Hermit” Software Engineer (2010-)
▪ Passionate about Java & Information Retrieval
▪ Apache Qpid (past) Committer
▪ Husband & Father
▪ Bass Player
Andrea Gazzarini, “Gazza”
Sease
Search Services
● Open Source Enthusiasts
● Apache Lucene/Solr experts
● Community Contributors
● Active Researchers
● Hot Trends: Learning To Rank, Document Similarity,
Search Quality Evaluation, Relevancy Tuning
✓Music Information Retrieval (MIR)?
➢ Music Essentials
➢ Audio Processing
➢ Q&A
Agenda
“MIR is concerned with the extraction, analysis and usage of information about any kind of music
entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI
representation of a piece of music, or name of a music artist).”
Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web.
Dissertation, Johannes Kepler University, Wien (2003)
Music information retrieval (MIR) is the interdisciplinary science of retrieving information from
music. MIR is a small but growing field of research with many real-world applications. Those involved in
MIR may have a background in musicology, psychoacoustics, psychology, academic music study,
signal processing, informatics, machine learning, optical music recognition, computational intelligence or
some combination of these.
https://en.wikipedia.org/wiki/Music_information_retrieval
Music Information Retrieval (MIR)
AUDIO IDENTIFICATION
GENRE IDENTIFICATION
TRANSCRIPTION
RECOMMENDATION
COVER SONG DETECTION
SYMBOLIC SIMILARITY
MOOD
SOURCE SEPARATION
INSTRUMENT RECOGNITION
TEMPO ESTIMATION
SCORE ALIGNMENT
SONG STRUCTURE
BEAT TRACKING
KEY DETECTION
QUERY BY HUMMING
Music Information Retrieval (MIR)
Computational Factors
Music Content: the low-level properties that can be extracted from the audio signal (e.g. time, frequencies, loudness)
Music Context: additional metadata that cannot be extracted from the audio signal (e.g. lyrics, tags, artists, feedback, posts)
Listener State: the state of the listener at a given moment (e.g. mood, musical knowledge, preferences)
Listener Context: the environment the listener is in at a given moment (e.g. political, geographical, social)
Factors in Music Perception
➢ Music Information Retrieval (MIR)
✓Music Essentials
‣ Essentials
‣ Score Music Representation
‣ Symbolic Representations
‣ Audio Representation
➢ Audio Processing
➢ Q&A
Agenda
A note denotes a sound, its pitch and its duration
A sound is the audio signal produced by a vibrating body
Notes are associated with graphical symbols (indicating the pitch and the duration)
Two notes whose fundamental frequencies are in a ratio of any integer power of two are perceived as similar. As a
consequence, we say they belong to the same pitch class
A note is also used for denoting a pitch class. Traditional music theory identifies 12 pitch classes
Notes and pitch classes are associated with mnemonic codes (e.g. C, D, E, F, G, A, B or DO, RE, MI, FA, SOL, LA, SI)
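The octave relationship above can be sketched in a few lines of Python. This is only an illustration: it assumes twelve-tone equal temperament with A4 tuned to 440 Hz, and the names `NOTE_NAMES` and `pitch_class` are hypothetical helpers, not something taken from the slides.

```python
import math

# Mnemonic codes of the 12 pitch classes, starting from C.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Return the pitch class of a fundamental frequency.

    Assumes equal temperament with A4 = a4_hz (MIDI note number 69).
    """
    # Distance from A4 in semitones, rounded to the nearest semitone.
    semitones_from_a4 = round(12 * math.log2(frequency_hz / a4_hz))
    midi_number = 69 + semitones_from_a4
    # Notes one or more octaves apart share the same remainder modulo 12.
    return NOTE_NAMES[midi_number % 12]


# 220 Hz, 440 Hz and 880 Hz differ by powers of two, so they share a pitch class.
print(pitch_class(220.0), pitch_class(440.0), pitch_class(880.0))
```

Doubling or halving a frequency moves the note by exactly 12 semitones, which is why the modulo operation collapses all octaves onto the same class.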
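[Figure: the 12 pitch classes, written ascending with sharps (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) and descending with flats (B, Bb, A, Ab, G, Gb, F, E, Eb, D, Db, C)]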
Music Language Essentials
Text Music
Letter Note
Word
Phrase
Sentence
Chord
Ghost Note
Phrase
Text vs Music
[Figure: score excerpt annotated with its Clef, Key Signature, Time Signature, Tempo, Note, Chord and Reference Chord]
Score music representation
Symbolic music representations comprise any
kind of score representation with an explicit
encoding of notes or other musical events.
Piano Roll originally denoted a roll of
paper with holes that controlled the execution
of a melody on a self-playing device; nowadays
the term refers to a digital visualisation which
shows pitches over time.
Musical Instrument Digital Interface (MIDI) is
another widely adopted representation of
musical events (e.g. pitch, velocity,
duration, intensity)
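As a rough illustration of the piano-roll idea, the sketch below turns a list of note events into a pitch-over-time grid. The event layout `(midi_pitch, start_beat, duration_beats)`, the function name `piano_roll` and the `steps_per_beat` resolution are assumptions made for this example, not a MIDI file parser.

```python
def piano_roll(notes, steps_per_beat=4):
    """Build a piano roll from (midi_pitch, start_beat, duration_beats) events.

    Returns a dict mapping each pitch to a list of 0/1 cells, one per time step.
    """
    # The roll is as long as the latest note end, expressed in time steps.
    total_steps = max(round((start + dur) * steps_per_beat) for _, start, dur in notes)
    roll = {}
    for pitch, start, dur in notes:
        row = roll.setdefault(pitch, [0] * total_steps)
        first = round(start * steps_per_beat)
        last = round((start + dur) * steps_per_beat)
        for step in range(first, last):
            row[step] = 1  # the pitch is sounding during this step
    return roll


# Middle C (60) for one beat, then E (64) for half a beat.
roll = piano_roll([(60, 0, 1), (64, 1, 0.5)], steps_per_beat=2)
for pitch in sorted(roll, reverse=True):
    print(pitch, "".join("#" if cell else "." for cell in roll[pitch]))
```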
Piano Roll & MIDI
Symbolic music representation
MusicXML [1] is an XML dialect for expressing music.
As you can imagine from the example on the right,
encoding a whole song will result in a huge and
verbose textual representation (that’s XML!).
For that reason MusicXML 2.0 introduced a
compressed format with a .mxl suffix
• Widely supported (scorewriting, OCR, sequencer)
• Easy to understand
• Full support of music features
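To give a feel for the format, here is a minimal sketch that reads the notes out of a tiny MusicXML document with Python's standard library. The embedded score is a hand-written single whole note (middle C), and `read_notes` is a hypothetical helper written for this example; real scores carry many more elements than the ones handled here.

```python
import xml.etree.ElementTree as ET

# A one-measure score: treble clef, 4/4 time, one whole note on C4.
SCORE = """<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <key><fifths>0</fifths></key>
        <time><beats>4</beats><beat-type>4</beat-type></time>
        <clef><sign>G</sign><line>2</line></clef>
      </attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <type>whole</type>
      </note>
    </measure>
  </part>
</score-partwise>"""


def read_notes(xml_text):
    """Return (step, octave, duration) for every pitched note in the score."""
    root = ET.fromstring(xml_text)
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests carry no <pitch> element
            continue
        notes.append((
            pitch.findtext("step"),
            int(pitch.findtext("octave")),
            int(note.findtext("duration")),
        ))
    return notes


print(read_notes(SCORE))
```

Even this single note needs more than twenty lines of XML, which illustrates why a compressed variant of the format exists.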
[Figure: MusicXML example annotated with its Part, Time, Clef and Note(s) elements]
[1] https://www.musicxml.com
MusicXML
The Parsons code, formally named the Parsons
code for melodic contours, is a simple notation
used to identify a piece of music through melodic
motion: movements of the pitch up and down.
(https://en.wikipedia.org/wiki/Parsons_code)
The encoding focuses on the pitch relation between
subsequent notes. The main points about this method are:
• Simplicity
• Being a textual encoding, it poses interesting
challenges for text search engines
• Limited: it ignores important features such as
time, intervals, rests and ghost notes
Parsons Code
Symbol | Description
* | First note of a sequence
u, / | “up”: the note is higher than the previous one
d, \ | “down”: the note is lower than the previous one
r, - | “repeat”: the note is the same as the previous one
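The symbols above translate almost directly into code. The following sketch derives a Parsons code from a sequence of pitches; representing pitches as MIDI note numbers and the function name `parsons_code` are assumptions made for illustration.

```python
def parsons_code(pitches):
    """Encode a melody as Parsons code from a sequence of comparable pitches."""
    if not pitches:
        return ""
    symbols = ["*"]  # the first note of a sequence
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            symbols.append("u")  # up
        elif current < previous:
            symbols.append("d")  # down
        else:
            symbols.append("r")  # repeat
    return "".join(symbols)


# Opening of "Twinkle Twinkle Little Star": C C G G A A G
print(parsons_code([60, 60, 67, 67, 69, 69, 67]))  # prints *rururd
```

Because the output is plain text, it can be indexed and queried like any other string field, which is where the text search challenges mentioned above come in.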
Parsons Code (1/4)
Parsons Code (2/4)
[Figure: score excerpts annotated note by note with Parsons code symbols (*, u, d, r), including “Money” by Pink Floyd]
Parsons Code (3/4)
Tempo (Time)
Intervals
Rests
Ghost Notes
Parsons Code (4/4)
An audio signal is a representation of sound: it describes the fluctuation in air pressure
caused by the vibration as a function of time. Unlike sheet music or symbolic representations,
audio representations encode everything that is necessary to reproduce an acoustic realization
of a piece of music.
Digital computers can only capture this data at discrete moments in time. The rate at which a
computer captures audio data is called the sampling frequency or sampling rate.
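To make the notion of a sampling rate concrete, the sketch below samples a pure sine tone at discrete moments in time. The function name `sample_sine` and the chosen values are illustrative assumptions; a real recording would come from an analog-to-digital converter rather than a formula.

```python
import math


def sample_sine(frequency_hz, duration_s, sampling_rate_hz):
    """Return the samples of a sine tone captured sampling_rate_hz times per second."""
    n_samples = round(duration_s * sampling_rate_hz)
    return [
        math.sin(2 * math.pi * frequency_hz * i / sampling_rate_hz)
        for i in range(n_samples)
    ]


# One second of a 2 kHz tone at 8 kHz: four samples per oscillation.
samples = sample_sine(2000.0, 1.0, 8000)
print(len(samples))
```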
Audio Representation: Time Domain
The Frequency Domain representation
decomposes the audio signal into a number of
waves oscillating at different frequencies.
It plots the frequencies on the
horizontal axis against their corresponding
magnitude (power) on the vertical axis.
This representation, among other things, can
be used for highlighting the dominant
frequencies of a musical tone.
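A minimal sketch of that decomposition: the code below computes the magnitude spectrum of a short signal with a naive discrete Fourier transform and picks the dominant frequency. The helper names `magnitude_spectrum` and `dominant_frequency` are hypothetical, and the quadratic-time DFT is used only to keep the example dependency-free; practical pipelines use an FFT implementation.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of the DFT bins from 0 Hz up to half the sampling rate."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(samples, sampling_rate_hz):
    """Frequency (in Hz) of the bin with the largest magnitude."""
    magnitudes = magnitude_spectrum(samples)
    strongest_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    # Bin k corresponds to k * sampling_rate / n Hz.
    return strongest_bin * sampling_rate_hz / len(samples)


# One second of an 8 Hz sine sampled at 64 Hz.
rate = 64
tone = [math.sin(2 * math.pi * 8 * i / rate) for i in range(rate)]
print(dominant_frequency(tone, rate))
```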
Frequency Domain
➢ Music Information Retrieval (MIR)
➢ Music Essentials
✓ Audio Processing
‣ Basic Pipeline
‣ Time Domain Features
‣ Frequency Domain Features
‣ Chroma Features
➢ Q&A
Agenda
Analog Signal → Sampling / Quantization → Framing → Time Domain Features Extraction
Analog Signal → Sampling / Quantization → Framing → Windowing → FFT → Frequency Domain Features Extraction
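The framing and windowing stages of the pipeline can be sketched as follows. The function names, the overlap controlled by `hop_size` and the choice of a Hann window are assumptions for illustration; the slides do not say which window is used.

```python
import math


def frames(samples, frame_size, hop_size):
    """Split a signal into (possibly overlapping) frames of frame_size samples."""
    return [
        samples[start:start + frame_size]
        for start in range(0, len(samples) - frame_size + 1, hop_size)
    ]


def hann_window(frame):
    """Taper a frame (of at least two samples) towards zero at both ends."""
    n = len(frame)
    return [
        x * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


signal = [1.0] * 10
windowed = [hann_window(frame) for frame in frames(signal, frame_size=4, hop_size=2)]
print(len(windowed))
```

Time domain features are computed directly on the frames, while each windowed frame is what gets passed to the FFT for frequency domain features.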
Basic Audio Processing Pipeline
Feature: Description
Amplitude Envelope (AE): maximum amplitude within a frame
Root-Mean-Square Energy (RMS): root mean square of the samples within a frame, an indicator of perceived sound intensity
Zero Crossing Rate (ZCR): number of times the amplitude changes its sign within a frame
Usage: Loudness Estimation, Timbre Analysis, Speech Recognition, Audio Segmentation, Onset Detection
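The three time domain features can each be written in a line or two. This is a minimal sketch operating on a single frame given as a list of floats; the function names are hypothetical and the zero crossing count treats zero as non-negative, which is one of several possible conventions.

```python
import math


def amplitude_envelope(frame):
    """Maximum absolute amplitude within the frame."""
    return max(abs(x) for x in frame)


def rms_energy(frame):
    """Root mean square of the samples in the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


frame = [0.5, -0.5, 0.5, -0.5]
print(amplitude_envelope(frame), rms_energy(frame), zero_crossings(frame))
```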
Time Domain Features
Feature: Description
Band Energy Ratio (BER): ratio between the energy in the lower and in the higher frequency bands
Spectral Centroid: frequency band where most of the energy is concentrated
Bandwidth (BW): spectral range of the interesting part of a signal
Spectral Flux: how quickly the spectrum changes from one frame to the next
Usage: Timbre Analysis, Speech Recognition, Onset Detection, Speech/Music Discrimination
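Two of these features can be sketched from a one-sided magnitude spectrum (bins from 0 Hz to half the sampling rate). The code assumes the usual formulations: the spectral centroid as the magnitude-weighted mean frequency, and the band energy ratio as the energy below a split bin divided by the energy at or above it. The function names and the `split_bin` parameter are hypothetical.

```python
def spectral_centroid(magnitudes, sampling_rate_hz):
    """Magnitude-weighted mean frequency of a one-sided spectrum, in Hz."""
    n_bins = len(magnitudes)
    # The last bin sits at the Nyquist frequency (half the sampling rate).
    bin_width_hz = (sampling_rate_hz / 2) / (n_bins - 1)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(k * bin_width_hz * m for k, m in enumerate(magnitudes)) / total


def band_energy_ratio(magnitudes, split_bin):
    """Energy below split_bin divided by the energy at or above it."""
    low = sum(m * m for m in magnitudes[:split_bin])
    high = sum(m * m for m in magnitudes[split_bin:])
    return low / high if high else float("inf")


spectrum = [0.0, 1.0, 0.0, 1.0, 0.0]  # peaks at 1 Hz and 3 Hz when sampled at 8 Hz
print(spectral_centroid(spectrum, sampling_rate_hz=8))
print(band_energy_ratio(spectrum, split_bin=2))
```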
Frequency Domain Features
Chroma features are a powerful representation for
music audio in which the entire spectrum is
projected onto 12 bins representing the 12 distinct
semitones (or chroma) of the musical octave.
It’s a kind of analysis which bridges low-level
and mid-level features, moving the audio signal
representation toward something which is more
readable from a functional perspective.
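The projection onto 12 bins can be sketched as below. Each spectrum bin is assigned to the nearest equal-tempered semitone and its energy is accumulated in the matching pitch class. The tuning reference (A4 = 440 Hz), the function name `chroma_vector` and its parameters are assumptions; production libraries use smoother filter banks rather than hard assignment.

```python
import math


def chroma_vector(magnitudes, sampling_rate_hz, n_fft, a4_hz=440.0):
    """Fold a magnitude spectrum into 12 pitch-class energies (index 0 = C)."""
    chroma = [0.0] * 12
    for k, magnitude in enumerate(magnitudes):
        if k == 0:
            continue  # the 0 Hz bin has no pitch
        frequency_hz = k * sampling_rate_hz / n_fft
        # Nearest MIDI note number, with A4 = 69.
        midi_number = round(69 + 12 * math.log2(frequency_hz / a4_hz))
        chroma[midi_number % 12] += magnitude * magnitude
    return chroma


# Bins at 440 Hz and 880 Hz are both an A, so their energy lands in one bin.
print(chroma_vector([0.0, 1.0, 1.0], sampling_rate_hz=44000, n_fft=100))
```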
Chroma Features (1/2)
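[Figure: chromagram with Time on the horizontal axis and the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) on the vertical axis, annotated with the note sequences read from it and with a region labelled NOISE]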
Chroma Features (2/2)
FALCON: FAst Lucene-based Cover sOng identification | Chromaprint (part of AcoustID)
Interesting Projects
➢ Music Information Retrieval (MIR)
➢ Music Essentials
➢ Audio Processing
✓ Q&A
Agenda
Thank you!
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Introduction to Music Information Retrieval

  • 1. London Information Retrieval Meetup 19 Feb 2019 Introduction to Music Information Retrieval Thoughts from a former bass player Andrea Gazzarini, Software Engineer 19th February 2019
  • 2. Who I am: Andrea Gazzarini, “Gazza” ▪ Software Engineer (1999-) ▪ “Hermit” Software Engineer (2010-) ▪ Java & Information Retrieval Passionate ▪ Apache Qpid (past) Committer ▪ Husband & Father ▪ Bass Player
  • 3. Sease Search Services ● Open Source Enthusiasts ● Apache Lucene/Solr experts ● Community Contributors ● Active Researchers ● Hot Trends: Learning To Rank, Document Similarity, Search Quality Evaluation, Relevancy Tuning
  • 4. Agenda ✓ Music Information Retrieval (MIR)? ➢ Music Essentials ➢ Audio Processing ➢ Q&A
  • 5. Music Information Retrieval (MIR) “MIR is concerned with the extraction, analysis and usage of information about any kind of music entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI representation of a piece of music, or name of a music artist).” Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web. Dissertation, Johannes Kepler University, Wien (2003) “Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these.” https://en.wikipedia.org/wiki/Music_information_retrieval
  • 6. Music Information Retrieval (MIR): AUDIO IDENTIFICATION, GENRE IDENTIFICATION, TRANSCRIPTION, RECOMMENDATION, COVER SONG DETECTION, SYMBOLIC SIMILARITY, MOOD, SOURCE SEPARATION, INSTRUMENT RECOGNITION, TEMPO ESTIMATION, SCORE ALIGNMENT, SONG STRUCTURE, BEAT TRACKING, KEY DETECTION, QUERY BY HUMMING
  • 7. Factors in Music Perception. Music Content includes all the low-level properties we can extract from the audio signal (e.g. time, frequencies, loudness). Music Context defines additional metadata that cannot be extracted from the audio signal (e.g. lyrics, tags, artists, feedback, posts). Listener State is the state of the user at a given moment (e.g. mood, musical knowledge, preferences). Listener Context relates to the environment the listener is in at a given moment (e.g. political, geographical, social).
  • 8. Agenda ➢ Music Information Retrieval (MIR) ✓ Music Essentials ‣ Essentials ‣ Score Music Representation ‣ Symbolic Representations ‣ Audio Representation ➢ Audio Processing ➢ Q&A
  • 9. Music Language Essentials. A sound is the audio signal produced by a vibrating body. A note denotes a sound, its pitch and its duration. Notes are associated with graphical symbols (indicating the pitch and the duration). Two notes whose fundamental frequencies are in a ratio of an integer power of two are perceived as similar; as a consequence, we say they belong to the same pitch class. A note is also used for denoting a pitch class. Traditional music theory identifies 12 pitch classes. Notes and pitch classes are associated with mnemonic codes (e.g. C, D, E, F, G, A, B or DO, RE, MI, FA, SOL, LA, SI). [Figure: keyboard with the natural notes C D E F G A B and the accidentals C#/Db, D#/Eb, F#/Gb, G#/Ab, A#/Bb]
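The octave relation on slide 9 can be sketched in a few lines. This is only an illustration, not something from the talk: it assumes twelve-tone equal temperament and a 440 Hz reference for A4, and the function name `pitch_class` is made up for the example.

```python
import math

# The 12 pitch classes, written with sharps only (an arbitrary spelling choice).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(frequency_hz, a4_hz=440.0):
    """Return the pitch class closest to a fundamental frequency.

    Assumes equal temperament and A4 = a4_hz (both are assumptions).
    """
    # Distance from A4 in semitones: one octave (a ratio of 2) is 12 semitones.
    semitones_from_a4 = round(12 * math.log2(frequency_hz / a4_hz))
    # "A" sits at index 9 in PITCH_CLASSES; wrap around every 12 semitones.
    return PITCH_CLASSES[(9 + semitones_from_a4) % 12]


if __name__ == "__main__":
    for f in (220.0, 440.0, 880.0, 261.63):
        print(f, pitch_class(f))
```

Frequencies that differ by a power of two (220, 440, 880 Hz) fall on the same index, which is exactly what "same pitch class" means.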
  • 10. Text vs Music. Text: Letter, Word, Phrase, Sentence. Music: Note, Chord, Ghost Note, Phrase.
  • 11. Score music representation [Figure labels: Time Signature Key Signature Clef Tempo Note Reference Chord Chord]
  • 12. Symbolic music representation: Piano Roll & MIDI. Symbolic music representations comprise any kind of score representation with an explicit encoding of notes or other musical events. The term piano roll originally denoted the rolls of perforated paper that controlled the playback of a melody on a self-playing instrument; nowadays it refers to a digital visualisation that shows pitches over time. Musical Instrument Digital Interface (MIDI) is another widely adopted representation of music events (e.g. pitch, velocity, duration, intensity).
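As a rough sketch of the two ideas on slide 12, the snippet below models note events with MIDI-style fields and renders them as a textual piano roll. The `NoteEvent` type and the helper names are hypothetical; the note-number conventions (69 = A4 = 440 Hz, 60 = C4) are common defaults, but some tools label octaves differently.

```python
from collections import namedtuple

# Hypothetical event type: pitch is a MIDI note number, start/duration are in grid steps.
NoteEvent = namedtuple("NoteEvent", "pitch start duration velocity")

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_frequency(note_number, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (assumes 69 = A4)."""
    return a4_hz * 2 ** ((note_number - 69) / 12)


def midi_to_name(note_number):
    """Name such as 'C4' (assumes the convention where note 60 is C4)."""
    return f"{NAMES[note_number % 12]}{note_number // 12 - 1}"


def piano_roll(events, steps):
    """Map each used pitch to a row of '#' (sounding) and '.' (silent) cells."""
    rows = {}
    for pitch in sorted({e.pitch for e in events}, reverse=True):
        cells = ["."] * steps
        for e in events:
            if e.pitch == pitch:
                for t in range(e.start, min(e.start + e.duration, steps)):
                    cells[t] = "#"
        rows[pitch] = "".join(cells)
    return rows


if __name__ == "__main__":
    melody = [NoteEvent(60, 0, 2, 100), NoteEvent(64, 2, 2, 100)]
    for pitch, row in piano_roll(melody, 4).items():
        print(f"{midi_to_name(pitch):>3} {row}")
```

Rows are pitches and columns are time steps, which is the same layout a graphical piano roll uses.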
  • 13. MusicXML [1] is an XML dialect for expressing music. As you can imagine from the example on the right, encoding a whole song results in a huge and verbose textual representation (that’s XML!). For that reason MusicXML 2.0 introduced a compressed format with a .mxl suffix. • Widely supported (scorewriting, OCR, sequencer) • Easy to understand • Full support of music features. Example labels: Part, Time, Clef, Note(s). [1] https://www.musicxml.com
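To give a feel for the verbosity mentioned on slide 13, here is a minimal sketch that reads a hand-written one-note fragment with Python's standard `xml.etree.ElementTree`. The fragment is an illustrative assumption, not the example shown on the slide, and `extract_notes` is a hypothetical helper that ignores accidentals (`<alter>`), ties and everything else a real score contains.

```python
import xml.etree.ElementTree as ET

# A tiny, simplified MusicXML-style fragment: one part, one measure, one whole note.
SNIPPET = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <attributes>
        <divisions>1</divisions>
        <time><beats>4</beats><beat-type>4</beat-type></time>
        <clef><sign>G</sign><line>2</line></clef>
      </attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>4</duration>
        <type>whole</type>
      </note>
    </measure>
  </part>
</score-partwise>"""


def extract_notes(xml_text):
    """Return (step, octave, duration) for every pitched note in the document."""
    root = ET.fromstring(xml_text)
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests carry no <pitch> element
            continue
        notes.append((
            pitch.findtext("step"),
            int(pitch.findtext("octave")),
            int(note.findtext("duration")),
        ))
    return notes


if __name__ == "__main__":
    print(extract_notes(SNIPPET))
```

Roughly fifteen lines of XML for a single note is the overhead that the compressed format is meant to reduce.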
  • 14. Parsons Code (1/4). The Parsons code, formally named the Parsons code for melodic contours, is a simple notation used to identify a piece of music through melodic motion (movements of the pitch up and down). (https://en.wikipedia.org/wiki/Parsons_code) The encoding focuses on the pitch relation between subsequent notes. Main points about this method: • Simplicity • Being a textual encoding, it poses interesting challenges for text search engines • Limited: it ignores important features such as time, intervals, pauses and ghost notes. Symbols: * = first note of a sequence; u or / = “up”, the note is higher than the previous one; d or \ = “down”, the note is lower than the previous one; r or - = “repeat”, the note is the same as the previous one.
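The encoding on slide 14 is simple enough to sketch directly. The function below is a minimal illustration that takes a melody as a list of comparable pitch values (MIDI note numbers here, which is an assumption of the example) and uses the letter symbols `*`, `u`, `d`, `r`.

```python
def parsons_code(pitches):
    """Encode a melody's contour: '*' for the first note, then u/d/r per step."""
    if not pitches:
        return ""
    symbols = ["*"]
    # Compare every note with the one immediately before it.
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            symbols.append("u")
        elif current < previous:
            symbols.append("d")
        else:
            symbols.append("r")
    return "".join(symbols)


if __name__ == "__main__":
    # Opening of "Twinkle, Twinkle, Little Star": C C G G A A G
    print(parsons_code([60, 60, 67, 67, 69, 69, 67]))
```

Because only the direction of each step survives, two melodies with very different intervals and rhythms can share the same code, which is the limitation the slide points out.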
  • 15. Parsons Code (2/4)
  • 16. Parsons Code (3/4): Money, Pink Floyd. * * r u u r r u r u r d r d r d r d r u r u r u r u r * u d d d u u u X u d d d u u u X d
  • 17. Parsons Code (4/4). Not captured by the encoding: Tempo (Time), Intervals, Rests, Ghost Notes
  • 18. Audio Representation: Time Domain. An audio signal is a representation of sound: it describes the fluctuation in air pressure caused by the vibration as a function of time. Unlike sheet music or symbolic representations, audio representations encode everything that is necessary to reproduce an acoustic realization of a piece of music. Digital computers can only capture this data at discrete moments in time. The rate at which a computer captures audio data is called the sampling frequency or sampling rate.
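A small sketch of what "capturing at discrete moments" means in practice: the code samples an ideal sine tone at a fixed sampling rate and then quantizes the values to 16-bit integers. The function names and the 16-bit depth are assumptions chosen for illustration; a real recording would come from an audio interface or a file.

```python
import math


def sample_sine(frequency_hz, sampling_rate_hz, duration_s, amplitude=1.0):
    """Sample an ideal sine wave at evenly spaced instants i / sampling_rate_hz."""
    count = round(sampling_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / sampling_rate_hz)
        for i in range(count)
    ]


def quantize_16bit(samples):
    """Map amplitudes in [-1.0, 1.0] to signed 16-bit integer levels."""
    return [round(max(-1.0, min(1.0, x)) * 32767) for x in samples]


if __name__ == "__main__":
    samples = sample_sine(440.0, 8000, 0.01)  # 10 ms of A4 at 8 kHz
    print(len(samples), quantize_16bit(samples)[:5])
```

Ten milliseconds at 8 kHz yields 80 values; at the common 44.1 kHz rate the same span would need 441.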
  • 19. Frequency Domain. The Frequency Domain (FD) representation decomposes the audio signal into a number of waves oscillating at different frequencies. The FD plot shows the frequencies on the horizontal axis and their corresponding magnitude (power) on the vertical axis. This representation, among other things, can be used for highlighting the dominant frequencies of a musical tone.
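The decomposition on slide 19 can be illustrated with a naive discrete Fourier transform. This is a deliberately slow O(n²) sketch using only the standard library; real systems use an FFT, and the helper names are made up for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each frequency bin from 0 up to the Nyquist bin (naive DFT)."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        # Correlate the signal with a complex sinusoid completing k cycles per frame.
        acc = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def dominant_frequency(samples, sampling_rate_hz):
    """Frequency in Hz of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(samples)
    k = max(range(1, len(magnitudes)), key=magnitudes.__getitem__)
    return k * sampling_rate_hz / len(samples)


if __name__ == "__main__":
    rate = 64
    tone = [math.sin(2 * math.pi * 8 * i / rate) for i in range(rate)]
    print(dominant_frequency(tone, rate))
```

Bin k corresponds to the frequency k × sampling rate / frame length, so a longer frame gives a finer frequency resolution.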
  • 20. Agenda ➢ Music Information Retrieval (MIR) ➢ Music Essentials ✓ Audio Processing ‣ Basic Pipeline ‣ Time Domain Features ‣ Frequency Domain Features ‣ Chroma Features ➢ Q&A
  • 21. Basic Audio Processing Pipeline: Analog Signal → Sampling / Quantization → Framing → Time Domain Features Extraction; Framing → Windowing → FFT → Frequency Domain Features Extraction
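The framing and windowing stages named on slide 21 could look like the sketch below. The frame size, hop size and the choice of a Hann window are assumptions made for the example; the slide does not say which window the author has in mind.

```python
import math


def frame_signal(samples, frame_size, hop_size):
    """Split a signal into (possibly overlapping) frames of equal length."""
    last_start = len(samples) - frame_size
    return [samples[s:s + frame_size] for s in range(0, last_start + 1, hop_size)]


def hann_window(size):
    """Symmetric Hann window: tapers both ends of a frame towards zero."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]


def apply_window(frame, window):
    """Multiply a frame by a window, sample by sample, before the FFT."""
    return [x * w for x, w in zip(frame, window)]


if __name__ == "__main__":
    signal = list(range(10))
    frames = frame_signal(signal, frame_size=4, hop_size=2)
    window = hann_window(4)
    print([apply_window(f, window) for f in frames])
```

Time domain features are computed on the raw frames, while the windowed frames feed the FFT; tapering the frame edges reduces the spectral leakage caused by cutting the signal abruptly.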
  • 22. Time Domain Features. Amplitude Envelope (AE): max amplitude within a frame. Root-Mean-Square Energy (RMS): perceived sound intensity. Zero Crossing Rate (ZCR): number of times the amplitude changes its sign within a frame. Example usages: Loudness Estimation, Timbre Analysis, Speech Recognition, Audio Segmentation, Onset Detection.
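The three features on slide 22 follow directly from their definitions. The sketch below computes them for a single frame given as a list of amplitudes; the function names are illustrative, and treating a zero sample as non-negative when counting sign changes is an assumption.

```python
import math


def amplitude_envelope(frame):
    """Largest absolute amplitude within the frame."""
    return max(abs(x) for x in frame)


def rms_energy(frame):
    """Square root of the mean of the squared amplitudes."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossings(frame):
    """Number of sign changes between consecutive samples (0 counts as positive)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


if __name__ == "__main__":
    frame = [0.5, -0.25, 0.75, 0.5]
    print(amplitude_envelope(frame), rms_energy(frame), zero_crossings(frame))
```

Dividing the crossing count by the frame length (or duration) turns it into a rate that can be compared across frames of different sizes.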
  • 23. Frequency Domain Features. Band Energy Ratio (BER): ratio between the energy of the lower and the higher frequency bands. Spectral Centroid: frequency band where most of the energy is concentrated. Bandwidth (BW): spectral range of the interesting part of a signal. Spectral Flux: how much the spectrum changes between consecutive frames. Example usages: Timbre Analysis, Speech Recognition, Onset Detection, Speech/Music Discrimination.
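A possible sketch of three features from slide 23, computed from a magnitude spectrum (one value per frequency bin). The formulas are common textbook forms rather than anything confirmed by the talk: the centroid is the magnitude-weighted mean frequency, the band energy ratio splits the bins at a caller-chosen index, and spectral flux is taken here in its usual sense of squared change between the spectra of consecutive frames.

```python
def spectral_centroid(magnitudes, frequencies):
    """Magnitude-weighted mean frequency of a spectrum (0.0 for silence)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(frequencies, magnitudes)) / total


def band_energy_ratio(magnitudes, split_bin):
    """Energy below split_bin divided by energy at or above it."""
    low = sum(m * m for m in magnitudes[:split_bin])
    high = sum(m * m for m in magnitudes[split_bin:])
    return low / high if high else float("inf")


def spectral_flux(previous_magnitudes, current_magnitudes):
    """Sum of squared bin-wise differences between two consecutive spectra."""
    return sum((c - p) ** 2 for p, c in zip(previous_magnitudes, current_magnitudes))


if __name__ == "__main__":
    freqs = [0, 100, 200, 300]
    print(spectral_centroid([0, 1, 1, 0], freqs))
    print(band_energy_ratio([2, 2, 1, 1], split_bin=2))
    print(spectral_flux([1, 0], [0, 1]))
```

A high flux between neighbouring frames signals an abrupt spectral change, which is why it is often used as an onset cue.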
  • 24. Chroma Features (1/2). Chroma features are a powerful representation for music audio in which the entire spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical octave. It’s a kind of analysis that bridges low-level and middle-level features, moving the audio signal representation toward something more readable from a functional perspective.
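The projection onto 12 bins described on slide 24 might be sketched as follows. This is a simplified illustration, not how chromaprint or any particular library does it: each spectrum bin is assigned to the nearest equal-tempered semitone (assuming A4 = 440 Hz) and its magnitude is added to that semitone's pitch class, with index 0 standing for C.

```python
import math


def chroma_vector(magnitudes, frequencies, a4_hz=440.0):
    """Fold a magnitude spectrum into 12 pitch-class bins (index 0 = C)."""
    chroma = [0.0] * 12
    for frequency, magnitude in zip(frequencies, magnitudes):
        if frequency <= 0:  # the DC bin has no pitch
            continue
        # Nearest MIDI-style note number, where 69 is A4 and 60 is a C.
        note = round(69 + 12 * math.log2(frequency / a4_hz))
        chroma[note % 12] += magnitude
    return chroma


if __name__ == "__main__":
    spectrum = [1.0, 1.0, 0.5]
    bins_hz = [220.0, 440.0, 261.63]
    print(chroma_vector(spectrum, bins_hz))
```

The two A's an octave apart end up in the same bin, so the vector describes harmonic content independently of register.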
  • 25. Chroma Features (2/2) [Figure: chromagram with time on one axis and the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B) on the other; the dominant pitch class per frame runs through A, F, C and B, with occasional outliers labelled NOISE]
  • 26. Interesting Projects: FALCON: FAst Lucene-based Cover sOng identification | chromaprint (part of AcoustID)
  • 27. Agenda ➢ Music Information Retrieval (MIR) ➢ Music Representation ➢ Audio Processing ✓ Q&A
  • 28. Thank you! Introduction to Music Information Retrieval: Thoughts from a former bass player. Andrea Gazzarini, Software Engineer