SlideShare a Scribd company logo
1 of 42
Download to read offline
London Information Retrieval Meetup
21 OCT 2019
Music Information Retrieval Take 2
Interval Hashing Based Ranking
Andrea Gazzarini, Software Engineer
21th October 2019
London Information Retrieval Meetup
▪ Software Engineer (1999-)
▪ “Hermit” Software Engineer (2010-)
▪ Java & Information Retrieval Passionate
▪ Apache Qpid (past) Committer
▪ Husband & Father
▪ Bass Player
Andrea Gazzarini, “Gazza”
London Information Retrieval Meetup
Sease
Search Services
● Open Source Enthusiasts
● Apache Lucene/Solr experts
! Community Contributors
● Active Researchers
● Hot Trends : Learning To Rank, Document Similarity,
Search Quality Evaluation, Relevancy Tuning
London Information Retrieval Meetup
✓Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
➢ Chroma Feature
➢ Music Language Essentials
➢ Ranked Interval Hashing
➢ Next steps
➢ Q&A
Agenda
London Information Retrieval Meetup
MIR is concerned with the extraction, analysis and usage of information about any kind of music
entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI
representation of a piece of music, or name of a music artist).”
Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web.
Dissertation, Johannes Kepler University, Wien (2003)
Music information retrieval (MIR) is the interdisciplinary science of retrieving information from
music. MIR is a small but growing field of research with many real-world applications. Those involved in
MIR may have a background in in musicology, psychoacoustics, psychology, academic music study,
signal processing, informatics, machine learning, optical music recognition, computational intelligence or
some combination of these.
https://en.wikipedia.org/wiki/Music_information_retrieval
Music Information Retrieval (MIR)
London Information Retrieval Meetup
AUDIO IDENTIFICATION
GENRE IDENTIFICATION
TRANSCRIPTION RECOMMENDATION
SYMBOLIC SIMILARITY
MOOD
SOURCE SEPARATION
INSTRUMENT RECOGNITION
TEMPO ESTIMATION
SCORE ALIGNMENT
BEAT TRACKING
QUERY BY HUMMINGAUDIO IDENTIFICATION
INSTRUMENT RECOGNITION
GENRE IDENTIFICATION
TRANSCRIPTION RECOMMENDATION
TEMPO ESTIMATION
SONG STRUCTURE SCORE ALIGNMENT
COVER SONG DETECTION
SYMBOLIC SIMILARITY
KEY DETECTION
BEAT TRACKING
MOOD
SOURCE SEPARATION
Music Information Retrieval (MIR)
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
✓Query By Humming / Cover Song Detection
➢ Chroma Feature
➢ Music Language Essentials
➢ Ranked Interval Hashing
➢ Next steps
➢ Q&A
Agenda
London Information Retrieval Meetup
Query by humming (QbH) is a music retrieval
system that branches off the original classification
systems of title, artist, composer, and genre.
It normally applies to songs or other music with a
distinct single theme or melody. The system involves
taking a user-hummed melody (input query) and
comparing it to an existing database.
(Wikipedia)
What is it?
Query by Humming/Playing
The main challenge of such query paradigm is the
difference, in terms of audio signal, between the
input query and the stored dataset.
London Information Retrieval Meetup
A cover song is an alternative version/recording of
a previously recorded song.
A cover can differ from the original song in several
features: key, timbre, tempo, structure, arrangement,
language, even melody and lyrics.
The goal of a cover song detection search algorithm
is to identify, for a query audio track, recordings that
belong to the same conceptual work/composition.
The main challenges are:
• find a set of features that capture the common
aspects of the target compositions
• give a rank to each response items (generally not
so important)
What is it?
Cover Song Detection
London Information Retrieval Meetup
QBH + CSD
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
✓Chroma Feature
➢ Music Language Essentials
➢ Ranked Interval Hashing
➢ Next steps
➢ Q&A
Agenda
London Information Retrieval Meetup
Chroma features are a powerful representation for
music audio in which the entire spectrum is
projected onto 12 bins representing the 12 distinct
semitones (or chroma) of the musical octave.
The base assumption is that in music, notes exactly
one octave apart are perceived as “similar”. As
consequence of that, working on the distribution of
chroma without the absolute frequency can give
useful musical information about the audio
It’s a kind of analysis which bridges between low-
level and middle-level features, moving the audio
signal representation towards something which is
more readable, from a functional perspective.
Chroma Feature
Chroma Features (1/5)
Source: Wikipedia
London Information Retrieval Meetup
Chroma Features (2/5)
C
D
D# D
DD
D
D
DC
C
C E
F
F#
A
A#
B
F#
F#
BB
EF
D
C
D
F
London Information Retrieval Meetup
A Chroma Matrix consists of a 12 * tn matrix where
each vector represents the twelve pitches
strengths for a given instant t.
Sometimes the matrix is inverted so the same
information described above is available at row-level.
Chroma features can be computed using different
tools available on the internet including:
• LabROSA audio analysis tools (partially portable
to Octave)
• opensource tools written in several languages
Chroma Matrix
Chroma Features (3/5)
Pitches strengths at instant t1 where you can see the top (i.e. most intense) 3 classes
are A#, B and A
London Information Retrieval Meetup
Time
C
D
E
F
G
A
B
C#
D#
F#
G#
A#
A A A A A A C A F F F F F F FG C C C C C C D C B B B B B B C B
N
O
I
S
E
Chroma Features (4/5)
London Information Retrieval Meetup
Chroma Features (5/5)
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
➢ Chroma Feature
✓Music Language Essentials
➢ Ranked Interval Hashing
➢ Next steps
➢ Q&A
Agenda
London Information Retrieval Meetup
A note is used for denoting a sound, its pitch and duration
A sound is the audio signal produced by a vibrating body
Notes are associated to graphical symbols (for indicating the pitch and the duration)
Two notes with the same fundamental frequency in a ratio of any integer power of two are perceived as similar. As
consequence of that, we say they belong to the same pitch class
A note is also used for denoting a pitch class. The Western Music Theory individuates 12 pitch classes
Notes and Pitch classes are associated to mnemonic codes (e.g. C,D,E,F,G,A,B or DO,RE,MI,FA,SOL,LA,SI)
C D E
F G A B
C
B A
G F E D
C
C#
D# F# G#
A#
Bb Ab Gb Eb Db
Music Language Essentials
The distance between two subsequent notes is called a semitone
London Information Retrieval Meetup
Text Music
Letter Note
Word
Phrase
Sentence
Chord
Ghost Note
Phrase
Text vs Music
London Information Retrieval Meetup
Time Signature
Key Signature
Clef
Tempo
Note
Reference Chord
Chord
Score music representation
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
➢ Chroma Feature
➢ Music Language Essentials
✓Ranked Interval Hashing
➢ Next steps
➢ Q&A
Agenda
London Information Retrieval Meetup
We’ve already seen Chroma Features analysis
normalises the entire spectrogram in classes,
loosing the absolute frequency.
That introduces a normalisation which actually puts
on the same layer notes belonging to the same
class.
This strictly relates to a concept known in music
theory as “Inversion”
“Inversion, in music, is the rearrangement of the top-
to-bottom elements in an interval, a chord, a melody,
or a group of contrapuntal lines of music.”
(Source: Encyclopedia Britannica)
Inversions
Assumption #1: Inversions as “Synonyms”
=
= =
AND =
IF
AND
THEN
~ ~
=~ =~ =~
London Information Retrieval Meetup
In Music Theory, an interval is defined as the
distance between two notes.
In Western Music Theory the minimum distance
between two notes is called semitone. As
consequence of that, an interval measures the
number of semitones that occurs between two
notes.
The intervals concept allows to translate any
sequence of notes.
If we have a melody starting from a given note (e.g.
C) we can translate the whole sequence using a
different starting point (e.g. A): the most important
thing is that intervals must remain the same.
Intervals
Assumption #2: Intervals instead of Notes!
London Information Retrieval Meetup
A simple matrix which defines all the possible
intervals between two notes.
Each interval has a precise name in Western Music
Theory (e.g. 0 = unison), but for our purposes seeing
the number (of semitones) is more important
The interval between two arbitrary notes can be
easily computed by using the following formula:
(defn semitones [x y]
(if (>= y x)
(- y x)
(+ (- 11 (- x 1)) y)))
Intervals Table
Intervals Table
C C# D D# E F F# G G# A A# B
C 0 1 2 3 4 5 6 7 8 9 10 11
C# 11 0 1 2 3 4 5 6 7 8 9 10
D 10 11 0 1 2 3 4 5 6 7 8 9
D# 9 10 11 0 1 2 3 4 5 6 7 8
E 8 9 10 11 0 1 2 3 4 5 6 7
F 7 8 9 10 11 0 1 2 3 4 5 6
F# 6 7 8 9 10 11 0 1 2 3 4 5
G 5 6 7 8 9 10 11 0 1 2 3 4
G
#
4 5 6 7 8 9 10 11 0 1 2 3
A 3 4 5 6 7 8 9 10 11 0 1 2
A# 2 3 4 5 6 7 8 9 10 11 0 1
B 1 2 3 4 5 6 7 8 9 10 11 0
London Information Retrieval Meetup
For each Chroma Vector we keep the top k
most intense pitch energies.
Each class is denoted by its ordinal position
(the examples on the left provides, for
simplicity, the corresponding mnemonic code
as well).
The underlying assumption is that in order
to represent and summarise a chroma
vector, the stronger a class energy is, the
better.
• INPUT => n Chroma vectors
• OUTPUT => n Rank vectors
Top K classes
Step #1: Ranks
0.28956729767718553 0.29918483818230696 0.29750046253718804 0.26970075757159021
0.26570971680808941 0.28762092331555628 0.27230350897098865 0.26479060237734442
0.24563184404410343 0.35945972816613164 0.20234795661509283 0.19691813882136551
0.21751928582300245 0.31682129361484279 0.21327678582822374 0.19345434437896755
0.19067933460430175 0.19899201366683592 0.20626346569405868 0.11096279563650845
0.18864723689615112 0.15732581183385463 0.16582685461399227 0.13454776152251802
0.17659990800974107 0.12781101466846442 0.12998904712393031 0.17361679104251077
0.15805875199726557 0.11070659008562342 0.12210923914829071 0.14658222225322309
0.14652990682934677 0.10462518086147332 0.11147004101461088 0.11962189455960689
0.13012578265891636 0.08802297436725277 0.08795354261988087 0.07726694847307212
0.10615064267867232 0.06285819867921958 0.06280150731135973 0.04137698779765421
0.07978125487239938 0.04066800213125138 0.03900988562456922 0.01849582605529331
0 - C 2 - D 0 - C 0 - C
1 - C# 3 - D# 1 - C# 1 - C#
2 - D 0 - C 3 - D# 2 - D
3 - D# 1 - C# 4 - E 3 - D#
4 - E 4 - E 2 - D 6 - F#
London Information Retrieval Meetup
For each pair of Rank vectors, for each
pair of pitch classes sharing the same
index, we compute the corresponding
interval.
The underlying assumption is that we want
to have a relative model which works
regardless the input key/tonality.
• INPUT => n Rank Vectors
• OUTPUT => n - 1 Interval Vectors
Distances
Step #2: Intervals
0 - C 2 - D 0 - C 0 - C
1 - C# 3 - D# 1 - C# 1 - C#
2 - D 0 - C 3 - D# 2 - D
3 - D# 1 - C# 4 - E 3 - D#
4 - E 4 - E 2 - D 6 - F#
(defn semitones [x y]
(if (>= y x) (- y x) (+ (- 11 (- x 1)) y)))
2 10 0
2 10 0
10 3 11
10 3 11
0 10 4
London Information Retrieval Meetup
Step #3: Hashing
Each Interval Vector is reduced to a
number (hash).
The hash uniquely represents a given
intervals distribution.
• INPUT => n-1 Interval Vectors
• OUTPUT => 1 Hash Vector
Summarisation
2 10 0 …
2 10 0 …
10 3 11 …
10 3 11 …
0 10 4 …
∑xi *12^i
18476 213106 103536 …
London Information Retrieval Meetup
Step #4: Partitions
The hash vector is partitioned using a size and an
overlap parameters (see examples on the left).
An audio input is therefore represented as a list of
partitions derived from the hash vector where
• each partition is a term
• ordering is preserved:
• in the term metadata (e.g. position increment)
• implicitly in the term value
Bags of Bags of Words
39483 4149 1440 3213 33232 44256 6654 4432 4149 7843 …
Partition of 3 hashes without overlap
[39483, 4149, 1440] [3213, 33232, 44256] [6654,4432, 4149], …
Partition of 3 hashes with overlap = 1
[39483, 4149, 1440] [1440, 3213, 33232] [33232, 44256, 6654], …
Partition of 5 hashes without overlap
[39483,4149,1440,3213,33232] [44256,6654,4432,4149,7843] …
(Examples)
London Information Retrieval Meetup
Step #5: Multiple choices for the output
{
"title": "Zombie",
"hashes": [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ]
}
Only one partition type
Only one partition type with different overlaps
{
"title": "Zombie",
“hashes_overlap_0”: [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ],
“hashes_overlap_1”: [“39483,4149,1440”, “1440,3213,33232”, “33232,44256,6654”, … ],
…
}
Multiple partition types
{
"title": "Zombie",
“hashes_3”: [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ],
“hashes_5”: [“39483,4149,1440,3213,33232”, “33232,44256,6654,4432,4149”, … ],
…
}
London Information Retrieval Meetup
Sample test query
Sample Data
Sample test data
Money, Pink Floyd
https://www.youtube.com/watch?v=-0kcet4aPpQ
{Q1 - Main Bass Riff (original tonality)
Q2 - Main Bass Riff (different tonality)
Q3 - Descendent pentatonic scale (different tonality)
Zombie, Cranberries
https://www.youtube.com/watch?v=6Ejga4kJUts
{
Q1 - Main Bass Riff (original tonality)
Q2 - Main Melody
Q3 - Chorus
Stand By Me, John Lennon
https://www.youtube.com/watch?v=YqB8Dm65X18
{Q1 - Main Bass Riff (different tonality)
Q2 - Main Melody
London Information Retrieval Meetup
V1.5
Sample Config
V1.0
<requestHandler name="standard">
<lst name="defaults">
<str name="defType">edismax</str>
<str name="qf">hash</str>
<str name="mm">10%</str>
<str name="fl">*,score</str>
</lst>
</requestHandler><lst name="defaults">
<str name="defType">edismax</str>
<str name="qf">hash_3</str>
<str name="fl">*,score</str>
<str name="pf2">hash_2 hash_3^10 hash_4^100 hash_5^1000</str>
<str name="ps2">3</str>
<str name="pf3">hash_2^100 hash_3^1000 hash_4^10000 hash_5^100000</str>
<str name="ps3">2</str>
<str name="boost">product(query({!lucene df=hash_2 v=$q}), 10)</str>
<str name="boost">product(query({!lucene df=hash_3 v=$q}), 100)</str>
<str name="boost">product(query({!lucene df=hash_4 v=$q}), 1000)</str>
<str name="boost">product(query({!lucene df=hash_5 v=$q}), 10000)</str>
<str name="tie">0.1</str>
London Information Retrieval Meetup
Nice shot #1
Q2 - Money (Pink Floyd): main bass riff (different tonality)
Q1 - Money (Pink Floyd): main bass riff (original tonality)
Same melody (bass riff) in different tonalities yield similar results!
P@1 P@2 P@10 NDCG
London Information Retrieval Meetup
Nice shot #2
Q2 - Zombie (Cranberries): main melody (different tonality)
Q1 - Zombie (Cranberries): main bass riff (original tonality)
Position matters!
P@1 P@2 P@10 NDCG
London Information Retrieval Meetup
Nice shot #3
Q1 - Zombie (Cranberries): main bass riff (original tonality)
Good similarity even if the query pattern is not in result.
P@1
London Information Retrieval Meetup
To be improved
Q1 - Zombie (Cranberries): main bass riff (original tonality)
Same query as before, red items have a very different “shape”
P@1
London Information Retrieval Meetup
To be improved
If we were looking for “Money” these would look like great results…
…but unfortunately the input query here is the “Stand by me” bass line
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
➢ Chroma Feature
➢ Music Language Essentials
➢ Ranked Interval Hashing
✓Next Steps
➢ Q&A
Agenda
London Information Retrieval Meetup
Further areas of investigation
• Different types of Chroma Features
• “Stop” chroma vectors / intervals / energies
• Input data understanding & pre-processing
• Complementary approaches
• Additional audio features
• Query time
• Machine Learning
London Information Retrieval Meetup
Chroma Matrix Clean Up
Original Threshold = 50% Threshold = 70% Threshold = 80% Top frequency
London Information Retrieval Meetup
➢ Music Information Retrieval (MIR)
➢ Query by Humming and Cover Song Detection
➢ Chroma Feature
➢ Music Language Essentials
➢ Ranked Interval Hashing
➢ Next Steps
✓Q&A
Agenda
London Information Retrieval Meetup
Links
Chroma Feature Analysis and Synthesis
https://labrosa.ee.columbia.edu/matlab/chroma-ansyn/
Chroma (Java)
https://github.com/adiyoss/Chroma
CHROMA TOOLBOX: MATLAB IMPLEMENTATIONS FOR
EXTRACTING VARIANTS OF CHROMA-BASED AUDIO FEATURES
https://www.academia.edu/30760846/
CHROMA_TOOLBOX_MATLAB_IMPLEMENTATIONS_FOR_EXTRACTING_VARIANTS_OF_CH
ROMA-BASED_AUDIO_FEATURES
FALCON: FAst Lucene-based Cover sOng identification
https://www.researchgate.net/publication/221571301_FALCON_FAst_Lucene-based_Cover_sOng_identification
Music Theory
https://en.wikipedia.org/wiki/Music_theory
Inversions
https://en.wikipedia.org/wiki/Inversion_(music)
Intervals
https://en.wikipedia.org/wiki/Interval_(music)
Chroma Feature
https://en.wikipedia.org/wiki/Chroma_feature
London Information Retrieval Meetup
21 OCT 2019
Thank you!
Music Information Retrieval Take 2
Interval Hashing Based Ranking
Andrea Gazzarini, Software Engineer
21th October 2019

More Related Content

Similar to Interval Hashing Based Ranking

Introduction to Music Information Retrieval
Introduction to Music Information RetrievalIntroduction to Music Information Retrieval
Introduction to Music Information RetrievalSease
 
The kusc classical music dataset for audio key finding
The kusc classical music dataset for audio key findingThe kusc classical music dataset for audio key finding
The kusc classical music dataset for audio key findingijma
 
Research at MAC Lab, Academia Sincia, in 2017
Research at MAC Lab, Academia Sincia, in 2017Research at MAC Lab, Academia Sincia, in 2017
Research at MAC Lab, Academia Sincia, in 2017Yi-Hsuan Yang
 
Introduction of my research histroy: From instrument recognition to support o...
Introduction of my research histroy: From instrument recognition to support o...Introduction of my research histroy: From instrument recognition to support o...
Introduction of my research histroy: From instrument recognition to support o...kthrlab
 
Information retrieval to recommender systems
Information retrieval to recommender systemsInformation retrieval to recommender systems
Information retrieval to recommender systemsData Science Society
 
Query By humming - Music retrieval technology
Query By humming - Music retrieval technologyQuery By humming - Music retrieval technology
Query By humming - Music retrieval technologyShital Kat
 
Denktank 2010
Denktank 2010Denktank 2010
Denktank 2010ocor203
 
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)Introduction to LC Faceted Vocabularies for Music Resources (August 2018)
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)ALATechSource
 
Mining Melodic Patterns in Large Audio Collections of Indian Art Music
	Mining Melodic Patterns in Large Audio Collections of Indian Art Music	Mining Melodic Patterns in Large Audio Collections of Indian Art Music
Mining Melodic Patterns in Large Audio Collections of Indian Art MusicSankalp Gulati
 
Landmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesLandmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesSankalp Gulati
 
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...sebastianewert
 
RDA for Music: Classical Music Audio Recordings Workshop
RDA for Music: Classical Music Audio Recordings WorkshopRDA for Music: Classical Music Audio Recordings Workshop
RDA for Music: Classical Music Audio Recordings WorkshopALATechSource
 
Music genre detection using hidden markov models
Music genre detection using hidden markov modelsMusic genre detection using hidden markov models
Music genre detection using hidden markov modelsMeghana Kantharaj
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachEnrico Daga
 
20211026 taicca 1 intro to mir
20211026 taicca 1 intro to mir20211026 taicca 1 intro to mir
20211026 taicca 1 intro to mirYi-Hsuan Yang
 
Machine Learning for Creative AI Applications in Music (2018 May)
Machine Learning for Creative AI Applications in Music (2018 May)Machine Learning for Creative AI Applications in Music (2018 May)
Machine Learning for Creative AI Applications in Music (2018 May)Yi-Hsuan Yang
 

Similar to Interval Hashing Based Ranking (20)

Introduction to Music Information Retrieval
Introduction to Music Information RetrievalIntroduction to Music Information Retrieval
Introduction to Music Information Retrieval
 
The kusc classical music dataset for audio key finding
The kusc classical music dataset for audio key findingThe kusc classical music dataset for audio key finding
The kusc classical music dataset for audio key finding
 
Sparse and Low Rank Representations in Music Signal Analysis
 Sparse and Low Rank Representations in Music Signal  Analysis Sparse and Low Rank Representations in Music Signal  Analysis
Sparse and Low Rank Representations in Music Signal Analysis
 
Research at MAC Lab, Academia Sincia, in 2017
Research at MAC Lab, Academia Sincia, in 2017Research at MAC Lab, Academia Sincia, in 2017
Research at MAC Lab, Academia Sincia, in 2017
 
MIR
MIRMIR
MIR
 
Introduction of my research histroy: From instrument recognition to support o...
Introduction of my research histroy: From instrument recognition to support o...Introduction of my research histroy: From instrument recognition to support o...
Introduction of my research histroy: From instrument recognition to support o...
 
Information retrieval to recommender systems
Information retrieval to recommender systemsInformation retrieval to recommender systems
Information retrieval to recommender systems
 
Audio mining
Audio miningAudio mining
Audio mining
 
Query By humming - Music retrieval technology
Query By humming - Music retrieval technologyQuery By humming - Music retrieval technology
Query By humming - Music retrieval technology
 
Denktank 2010
Denktank 2010Denktank 2010
Denktank 2010
 
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)Introduction to LC Faceted Vocabularies for Music Resources (August 2018)
Introduction to LC Faceted Vocabularies for Music Resources (August 2018)
 
Mining Melodic Patterns in Large Audio Collections of Indian Art Music
	Mining Melodic Patterns in Large Audio Collections of Indian Art Music	Mining Melodic Patterns in Large Audio Collections of Indian Art Music
Mining Melodic Patterns in Large Audio Collections of Indian Art Music
 
Landmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music MelodiesLandmark Detection in Hindustani Music Melodies
Landmark Detection in Hindustani Music Melodies
 
auto_playlist
auto_playlistauto_playlist
auto_playlist
 
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...
Semantic Linking of Information, Content and Metadata for Early Music (SLICKM...
 
RDA for Music: Classical Music Audio Recordings Workshop
RDA for Music: Classical Music Audio Recordings WorkshopRDA for Music: Classical Music Audio Recordings Workshop
RDA for Music: Classical Music Audio Recordings Workshop
 
Music genre detection using hidden markov models
Music genre detection using hidden markov modelsMusic genre detection using hidden markov models
Music genre detection using hidden markov models
 
Capturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid ApproachCapturing Themed Evidence, a Hybrid Approach
Capturing Themed Evidence, a Hybrid Approach
 
20211026 taicca 1 intro to mir
20211026 taicca 1 intro to mir20211026 taicca 1 intro to mir
20211026 taicca 1 intro to mir
 
Machine Learning for Creative AI Applications in Music (2018 May)
Machine Learning for Creative AI Applications in Music (2018 May)Machine Learning for Creative AI Applications in Music (2018 May)
Machine Learning for Creative AI Applications in Music (2018 May)
 

Recently uploaded

办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 

Recently uploaded (20)

办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 

Interval Hashing Based Ranking

  • 1. London Information Retrieval Meetup 21 OCT 2019 Music Information Retrieval Take 2 Interval Hashing Based Ranking Andrea Gazzarini, Software Engineer 21th October 2019
  • 2. London Information Retrieval Meetup ▪ Software Engineer (1999-) ▪ “Hermit” Software Engineer (2010-) ▪ Java & Information Retrieval Passionate ▪ Apache Qpid (past) Committer ▪ Husband & Father ▪ Bass Player Andrea Gazzarini, “Gazza”
  • 3. London Information Retrieval Meetup Sease Search Services ● Open Source Enthusiasts ● Apache Lucene/Solr experts ! Community Contributors ● Active Researchers ● Hot Trends : Learning To Rank, Document Similarity, Search Quality Evaluation, Relevancy Tuning
  • 4. London Information Retrieval Meetup ✓Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ➢ Chroma Feature ➢ Music Language Essentials ➢ Ranked Interval Hashing ➢ Next steps ➢ Q&A Agenda
  • 5. London Information Retrieval Meetup MIR is concerned with the extraction, analysis and usage of information about any kind of music entity (e.g. a song or a music artist) on any representation level (for example, audio signal, symbolic MIDI representation of a piece of music, or name of a music artist).” Schedl, M.: Automatically extracting, analyzing and visualizing information on music artists from the world wide web. Dissertation, Johannes Kepler University, Wien (2003) Music information retrieval (MIR) is the interdisciplinary science of retrieving information from music. MIR is a small but growing field of research with many real-world applications. Those involved in MIR may have a background in in musicology, psychoacoustics, psychology, academic music study, signal processing, informatics, machine learning, optical music recognition, computational intelligence or some combination of these. https://en.wikipedia.org/wiki/Music_information_retrieval Music Information Retrieval (MIR)
  • 6. London Information Retrieval Meetup AUDIO IDENTIFICATION GENRE IDENTIFICATION TRANSCRIPTION RECOMMENDATION SYMBOLIC SIMILARITY MOOD SOURCE SEPARATION INSTRUMENT RECOGNITION TEMPO ESTIMATION SCORE ALIGNMENT BEAT TRACKING QUERY BY HUMMINGAUDIO IDENTIFICATION INSTRUMENT RECOGNITION GENRE IDENTIFICATION TRANSCRIPTION RECOMMENDATION TEMPO ESTIMATION SONG STRUCTURE SCORE ALIGNMENT COVER SONG DETECTION SYMBOLIC SIMILARITY KEY DETECTION BEAT TRACKING MOOD SOURCE SEPARATION Music Information Retrieval (MIR)
  • 7. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ✓Query By Humming / Cover Song Detection ➢ Chroma Feature ➢ Music Language Essentials ➢ Ranked Interval Hashing ➢ Next steps ➢ Q&A Agenda
  • 8. London Information Retrieval Meetup Query by humming (QbH) is a music retrieval system that branches off the original classification systems of title, artist, composer, and genre. It normally applies to songs or other music with a distinct single theme or melody. The system involves taking a user-hummed melody (input query) and comparing it to an existing database. (Wikipedia) What is it? Query by Humming/Playing The main challenge of such query paradigm is the difference, in terms of audio signal, between the input query and the stored dataset.
  • 9. London Information Retrieval Meetup A cover song is an alternative version/recording of a previously recorded song. A cover can differ from the original song in several features: key, timbre, tempo, structure, arrangement, language, even melody and lyrics. The goal of a cover song detection search algorithm is to identify, for a query audio track, recordings that belong to the same conceptual work/composition. The main challenges are: • find a set of features that capture the common aspects of the target compositions • give a rank to each response items (generally not so important) What is it? Cover Song Detection
  • 10. London Information Retrieval Meetup QBH + CSD
  • 11. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ✓Chroma Feature ➢ Music Language Essentials ➢ Ranked Interval Hashing ➢ Next steps ➢ Q&A Agenda
  • 12. London Information Retrieval Meetup Chroma features are a powerful representation for music audio in which the entire spectrum is projected onto 12 bins representing the 12 distinct semitones (or chroma) of the musical octave. The base assumption is that in music, notes exactly one octave apart are perceived as “similar”. As consequence of that, working on the distribution of chroma without the absolute frequency can give useful musical information about the audio It’s a kind of analysis which bridges between low- level and middle-level features, moving the audio signal representation towards something which is more readable, from a functional perspective. Chroma Feature Chroma Features (1/5) Source: Wikipedia
  • 13. London Information Retrieval Meetup Chroma Features (2/5) C D D# D DD D D DC C C E F F# A A# B F# F# BB EF D C D F
  • 14. London Information Retrieval Meetup A Chroma Matrix consists of a 12 * tn matrix where each vector represents the twelve pitches strengths for a given instant t. Sometimes the matrix is inverted so the same information described above is available at row-level. Chroma features can be computed using different tools available on the internet including: • LabROSA audio analysis tools (partially portable to Octave) • opensource tools written in several languages Chroma Matrix Chroma Features (3/5) Pitches strengths at instant t1 where you can see the top (i.e. most intense) 3 classes are A#, B and A
  • 15. London Information Retrieval Meetup Time C D E F G A B C# D# F# G# A# A A A A A A C A F F F F F F FG C C C C C C D C B B B B B B C B N O I S E Chroma Features (4/5)
  • 16. London Information Retrieval Meetup Chroma Features (5/5)
  • 17. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ➢ Chroma Feature ✓Music Language Essentials ➢ Ranked Interval Hashing ➢ Next steps ➢ Q&A Agenda
  • 18. London Information Retrieval Meetup A note is used for denoting a sound, its pitch and duration A sound is the audio signal produced by a vibrating body Notes are associated to graphical symbols (for indicating the pitch and the duration) Two notes with the same fundamental frequency in a ratio of any integer power of two are perceived as similar. As consequence of that, we say they belong to the same pitch class A note is also used for denoting a pitch class. The Western Music Theory individuates 12 pitch classes Notes and Pitch classes are associated to mnemonic codes (e.g. C,D,E,F,G,A,B or DO,RE,MI,FA,SOL,LA,SI) C D E F G A B C B A G F E D C C# D# F# G# A# Bb Ab Gb Eb Db Music Language Essentials The distance between two subsequent notes is called a semitone
  • 19. London Information Retrieval Meetup Text Music Letter Note Word Phrase Sentence Chord Ghost Note Phrase Text vs Music
  • 20. London Information Retrieval Meetup Time Signature Key Signature Clef Tempo Note Reference Chord Chord Score music representation
  • 21. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ➢ Chroma Feature ➢ Music Language Essentials ✓Ranked Interval Hashing ➢ Next steps ➢ Q&A Agenda
  • 22. London Information Retrieval Meetup We’ve already seen Chroma Features analysis normalises the entire spectrogram in classes, loosing the absolute frequency. That introduces a normalisation which actually puts on the same layer notes belonging to the same class. This strictly relates to a concept known in music theory as “Inversion” “Inversion, in music, is the rearrangement of the top- to-bottom elements in an interval, a chord, a melody, or a group of contrapuntal lines of music.” (Source: Encyclopedia Britannica) Inversions Assumption #1: Inversions as “Synonyms” = = = AND = IF AND THEN ~ ~ =~ =~ =~
  • 23. London Information Retrieval Meetup In Music Theory, an interval is defined as the distance between two notes. In Western Music Theory the minimum distance between two notes is called semitone. As consequence of that, an interval measures the number of semitones that occurs between two notes. The intervals concept allows to translate any sequence of notes. If we have a melody starting from a given note (e.g. C) we can translate the whole sequence using a different starting point (e.g. A): the most important thing is that intervals must remain the same. Intervals Assumption #2: Intervals instead of Notes!
  • 24. London Information Retrieval Meetup A simple matrix which defines all the possible intervals between two notes. Each interval has a precise name in Western Music Theory (e.g. 0 = unison), but for our purposes seeing the number (of semitones) is more important The interval between two arbitrary notes can be easily computed by using the following formula: (defn semitones [x y] (if (>= y x) (- y x) (+ (- 11 (- x 1)) y))) Intervals Table Intervals Table C C# D D# E F F# G G# A A# B C 0 1 2 3 4 5 6 7 8 9 10 11 C# 11 0 1 2 3 4 5 6 7 8 9 10 D 10 11 0 1 2 3 4 5 6 7 8 9 D# 9 10 11 0 1 2 3 4 5 6 7 8 E 8 9 10 11 0 1 2 3 4 5 6 7 F 7 8 9 10 11 0 1 2 3 4 5 6 F# 6 7 8 9 10 11 0 1 2 3 4 5 G 5 6 7 8 9 10 11 0 1 2 3 4 G # 4 5 6 7 8 9 10 11 0 1 2 3 A 3 4 5 6 7 8 9 10 11 0 1 2 A# 2 3 4 5 6 7 8 9 10 11 0 1 B 1 2 3 4 5 6 7 8 9 10 11 0
  • 25. London Information Retrieval Meetup For each Chroma Vector we keep the top k most intense pitch energies. Each class is denoted by its ordinal position (the examples on the left provides, for simplicity, the corresponding mnemonic code as well). The underlying assumption is that in order to represent and summarise a chroma vector, the stronger a class energy is, the better. • INPUT => n Chroma vectors • OUTPUT => n Rank vectors Top K classes Step #1: Ranks 0.28956729767718553 0.29918483818230696 0.29750046253718804 0.26970075757159021 0.26570971680808941 0.28762092331555628 0.27230350897098865 0.26479060237734442 0.24563184404410343 0.35945972816613164 0.20234795661509283 0.19691813882136551 0.21751928582300245 0.31682129361484279 0.21327678582822374 0.19345434437896755 0.19067933460430175 0.19899201366683592 0.20626346569405868 0.11096279563650845 0.18864723689615112 0.15732581183385463 0.16582685461399227 0.13454776152251802 0.17659990800974107 0.12781101466846442 0.12998904712393031 0.17361679104251077 0.15805875199726557 0.11070659008562342 0.12210923914829071 0.14658222225322309 0.14652990682934677 0.10462518086147332 0.11147004101461088 0.11962189455960689 0.13012578265891636 0.08802297436725277 0.08795354261988087 0.07726694847307212 0.10615064267867232 0.06285819867921958 0.06280150731135973 0.04137698779765421 0.07978125487239938 0.04066800213125138 0.03900988562456922 0.01849582605529331 0 - C 2 - D 0 - C 0 - C 1 - C# 3 - D# 1 - C# 1 - C# 2 - D 0 - C 3 - D# 2 - D 3 - D# 1 - C# 4 - E 3 - D# 4 - E 4 - E 2 - D 6 - F#
  • 26. London Information Retrieval Meetup For each pair of Rank vectors, for each pair of pitch classes sharing the same index, we compute the corresponding interval. The underlying assumption is that we want to have a relative model which works regardless the input key/tonality. • INPUT => n Rank Vectors • OUTPUT => n - 1 Interval Vectors Distances Step #2: Intervals 0 - C 2 - D 0 - C 0 - C 1 - C# 3 - D# 1 - C# 1 - C# 2 - D 0 - C 3 - D# 2 - D 3 - D# 1 - C# 4 - E 3 - D# 4 - E 4 - E 2 - D 6 - F# (defn semitones [x y] (if (>= y x) (- y x) (+ (- 11 (- x 1)) y))) 2 10 0 2 10 0 10 3 11 10 3 11 0 10 4
  • 27. London Information Retrieval Meetup Step #3: Hashing Each Interval Vector is reduced to a number (hash). The hash uniquely represents a given intervals distribution. • INPUT => n-1 Interval Vectors • OUTPUT => 1 Hash Vector Summarisation 2 10 0 … 2 10 0 … 10 3 11 … 10 3 11 … 0 10 4 … ∑xi *12^i 18476 213106 103536 …
  • 28. London Information Retrieval Meetup Step #4: Partitions The hash vector is partitioned using a size and an overlap parameters (see examples on the left). An audio input is therefore represented as a list of partitions derived from the hash vector where • each partition is a term • ordering is preserved: • in the term metadata (e.g. position increment) • implicitly in the term value Bags of Bags of Words 39483 4149 1440 3213 33232 44256 6654 4432 4149 7843 … Partition of 3 hashes without overlap [39483, 4149, 1440] [3213, 33232, 44256] [6654,4432, 4149], … Partition of 3 hashes with overlap = 1 [39483, 4149, 1440] [1440, 3213, 33232] [33232, 44256, 6654], … Partition of 5 hashes without overlap [39483,4149,1440,3213,33232] [44256,6654,4432,4149,7843] … (Examples)
  • 29. London Information Retrieval Meetup Step #5: Multiple choices for the output { "title": "Zombie", "hashes": [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ] } Only one partition type Only one partition type with different overlaps { "title": "Zombie", “hashes_overlap_0”: [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ], “hashes_overlap_1”: [“39483,4149,1440”, “1440,3213,33232”, “33232,44256,6654”, … ], … } Multiple partition types { "title": "Zombie", “hashes_3”: [“39483,4149,1440”, “3213,33232,44256”, “6654,4432, 4149”, … ], “hashes_5”: [“39483,4149,1440,3213,33232”, “33232,44256,6654,4432,4149”, … ], … }
  • 30. London Information Retrieval Meetup Sample test query Sample Data Sample test data Money, Pink Floyd https://www.youtube.com/watch?v=-0kcet4aPpQ {Q1 - Main Bass Riff (original tonality) Q2 - Main Bass Riff (different tonality) Q3 - Descendent pentatonic scale (different tonality) Zombie, Cranberries https://www.youtube.com/watch?v=6Ejga4kJUts { Q1 - Main Bass Riff (original tonality) Q2 - Main Melody Q3 - Chorus Stand By Me, John Lennon https://www.youtube.com/watch?v=YqB8Dm65X18 {Q1 - Main Bass Riff (different tonality) Q2 - Main Melody
  • 31. London Information Retrieval Meetup V1.5 Sample Config V1.0 <requestHandler name="standard"> <lst name="defaults"> <str name="defType">edismax</str> <str name="qf">hash</str> <str name="mm">10%</str> <str name="fl">*,score</str> </lst> </requestHandler><lst name="defaults"> <str name="defType">edismax</str> <str name="qf">hash_3</str> <str name="fl">*,score</str> <str name="pf2">hash_2 hash_3^10 hash_4^100 hash_5^1000</str> <str name="ps2">3</str> <str name="pf3">hash_2^100 hash_3^1000 hash_4^10000 hash_5^100000</str> <str name="ps3">2</str> <str name="boost">product(query({!lucene df=hash_2 v=$q}), 10)</str> <str name="boost">product(query({!lucene df=hash_3 v=$q}), 100)</str> <str name="boost">product(query({!lucene df=hash_4 v=$q}), 1000)</str> <str name="boost">product(query({!lucene df=hash_5 v=$q}), 10000)</str> <str name="tie">0.1</str>
  • 32. London Information Retrieval Meetup Nice shot #1 Q2 - Money (Pink Floyd): main bass riff (different tonality) Q1 - Money (Pink Floyd): main bass riff (original tonality) Same melody (bass riff) in different tonalities yield similar results! P@1 P@2 P@10 NDCG
  • 33. London Information Retrieval Meetup Nice shot #2 Q2 - Zombie (Cranberries): main melody (different tonality) Q1 - Zombie (Cranberries): main bass riff (original tonality) Position matters! P@1 P@2 P@10 NDCG
  • 34. London Information Retrieval Meetup Nice shot #3 Q1 - Zombie (Cranberries): main bass riff (original tonality) Good similarity even if the query pattern is not in result. P@1
  • 35. London Information Retrieval Meetup To be improved Q1 - Zombie (Cranberries): main bass riff (original tonality) Same query as before, red items have a very different “shape” P@1
  • 36. London Information Retrieval Meetup To be improved If we were looking for “Money” these would look like great results… …but unfortunately the input query here is the “Stand by me” bass line
  • 37. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ➢ Chroma Feature ➢ Music Language Essentials ➢ Ranked Interval Hashing ✓Next Steps ➢ Q&A Agenda
  • 38. London Information Retrieval Meetup Further areas of investigation • Different types of Chroma Features • “Stop” chroma vectors / intervals / energies • Input data understanding & pre-processing • Complementary approaches • Additional audio features • Query time • Machine Learning
  • 39. London Information Retrieval Meetup Chroma Matrix Clean Up Original Threshold = 50% Threshold = 70% Threshold = 80% Top frequency
  • 40. London Information Retrieval Meetup ➢ Music Information Retrieval (MIR) ➢ Query by Humming and Cover Song Detection ➢ Chroma Feature ➢ Music Language Essentials ➢ Ranked Interval Hashing ➢ Next Steps ✓Q&A Agenda
  • 41. London Information Retrieval Meetup Links Chroma Feature Analysis and Synthesis https://labrosa.ee.columbia.edu/matlab/chroma-ansyn/ Chroma (Java) https://github.com/adiyoss/Chroma CHROMA TOOLBOX: MATLAB IMPLEMENTATIONS FOR EXTRACTING VARIANTS OF CHROMA-BASED AUDIO FEATURES https://www.academia.edu/30760846/ CHROMA_TOOLBOX_MATLAB_IMPLEMENTATIONS_FOR_EXTRACTING_VARIANTS_OF_CH ROMA-BASED_AUDIO_FEATURES FALCON: FAst Lucene-based Cover sOng identification https://www.researchgate.net/publication/221571301_FALCON_FAst_Lucene-based_Cover_sOng_identification Music Theory https://en.wikipedia.org/wiki/Music_theory Inversions https://en.wikipedia.org/wiki/Inversion_(music) Intervals https://en.wikipedia.org/wiki/Interval_(music) Chroma Feature https://en.wikipedia.org/wiki/Chroma_feature
  • 42. London Information Retrieval Meetup 21 OCT 2019 Thank you! Music Information Retrieval Take 2 Interval Hashing Based Ranking Andrea Gazzarini, Software Engineer 21th October 2019