Audio identification

Tonal representations for music retrieval: from version identification to query-by-humming

Publication Type:

Journal Article

Source:

Int. Journal of Multimedia Information Retrieval, special issue on Hybrid Music Information Retrieval, Springer (In Press)

Abstract:

In this study we compare the use of different music representations for retrieving alternative performances of the same musical piece, a task commonly referred to as version identification. Given the audio signal of a song, we compute descriptors representing its melody, bass line and harmonic progression using state-of-the-art algorithms. These descriptors are then employed to retrieve different versions of the same musical piece using a dynamic programming algorithm based on nonlinear time series analysis. First, we evaluate the accuracy obtained using individual descriptors, and then we examine whether performance can be improved by combining these music representations (i.e. descriptor fusion). Our results show that whilst harmony is the most reliable music representation for version identification, the melody and bass line representations also carry useful information for this task. Furthermore, we show that by combining these tonal representations we can increase version detection accuracy. Finally, we demonstrate how the proposed version identification method can be adapted for the task of query-by-humming. We propose a melody-based retrieval approach, and demonstrate how melody representations extracted from recordings of a cappella singing can be successfully used to retrieve the original song from a collection of polyphonic audio. The current limitations of the proposed approach are discussed in the context of version identification and query-by-humming, and possible solutions and future research directions are proposed.
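The retrieval step described in the abstract (a dynamic programming algorithm based on nonlinear time series analysis) can be illustrated with a much-simplified toy sketch: a Smith-Waterman-style local alignment over per-frame tonal descriptors. This is not the authors' actual algorithm; the function name, the cosine-similarity binarisation, and the threshold/gap values are all illustrative assumptions.

```python
import numpy as np

def local_alignment_score(a, b, threshold=0.75, gap=0.5):
    """Toy local alignment between two descriptor sequences.

    a, b: arrays of shape (n_frames, dims), e.g. frame-wise chroma or
    melody descriptors. Frame pairs whose cosine similarity exceeds
    `threshold` count as matches; everything else is penalised, and the
    score is accumulated Smith-Waterman style (clipped at zero so the
    best locally matching excerpt dominates).
    """
    # Normalise frames so the dot product equals cosine similarity
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-9)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-9)
    sim = a @ b.T
    match = np.where(sim > threshold, 1.0, -gap)

    H = np.zeros((len(a) + 1, len(b) + 1))
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            H[i, j] = max(0.0,
                          H[i - 1, j - 1] + match[i - 1, j - 1],  # (mis)match
                          H[i - 1, j] - gap,                      # gap in b
                          H[i, j - 1] - gap)                      # gap in a
    return H.max()
```

In a version-identification setting, this score would be computed between a query and every candidate in the collection, and candidates ranked by score; the clipping at zero is what lets two versions match on a shared section even when their overall structures differ.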

Structure-based audio fingerprinting for music retrieval

Publication Type:

Conference Paper

Source:

Int. Soc. for Music Information Retrieval Conf. (ISMIR), Porto, Portugal, p.55-60 (2012)

URL:

http://ismir2012.ismir.net/event/papers/055-ismir-2012.pdf

Abstract:

Content-based approaches to music retrieval are of great relevance as they do not require any kind of manually generated annotations. In this paper, we introduce the concept of structure fingerprints, which are compact descriptors of the musical structure of an audio recording. Given a recorded music performance, structure fingerprints facilitate the retrieval of other performances sharing the same underlying structure. Avoiding any explicit determination of musical structure, our fingerprints can be thought of as a probability density function derived from a self-similarity matrix. We show that the proposed fingerprints can be compared using simple Euclidean distances, without the complex warping operations required in previous approaches. Experiments on a collection of Chopin Mazurkas reveal that structure fingerprints facilitate robust and efficient content-based music retrieval. Furthermore, we give a musically informed discussion that also deepens the understanding of the popular Mazurka dataset.
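The core idea above (a self-similarity matrix turned into a fixed-size, density-normalised fingerprint so that plain Euclidean distance applies) can be sketched in a few lines. The resampling scheme and normalisation below are illustrative assumptions, not the paper's actual fingerprint construction:

```python
import numpy as np

def structure_fingerprint(features, size=32):
    """Toy structure fingerprint (hypothetical simplification).

    features: (n_frames, dims) array, e.g. smoothed chroma. Returns a
    size x size self-similarity matrix, clipped to non-negative values
    and normalised to sum to 1 (treated like a density), so that two
    fingerprints can be compared with plain Euclidean distance.
    """
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-9)
    ssm = f @ f.T                        # cosine self-similarity matrix
    n = len(ssm)
    # Crude fixed-size resampling by index mapping (illustrative only)
    idx = np.arange(size) * n // size
    small = np.clip(ssm[np.ix_(idx, idx)], 0, None)
    return small / small.sum()

def fingerprint_distance(fp1, fp2):
    """Plain Euclidean distance between two fingerprints."""
    return np.linalg.norm(fp1 - fp2)
```

Because the fingerprint has a fixed size regardless of the recording's length, retrieval reduces to nearest-neighbour search in Euclidean space, which is what makes the approach efficient compared to pairwise warping.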

Melody, bass line and harmony representations for music version identification

Publication Type:

Conference Paper

Source:

Int. World Wide Web Conf., Workshop on Advances on Music Information Retrieval (AdMIRe), WWW, Lyon, France, p.887-894 (2012)

URL:

http://www2012.wwwconference.org/proceedings/forms/companion.htm#8

Abstract:

In this paper we compare the use of different musical representations for the task of version identification (i.e. retrieving alternative performances of the same musical piece). We automatically compute descriptors representing the melody and bass line using a state-of-the-art melody extraction algorithm, and compare them to a harmony-based descriptor. The similarity of descriptor sequences is computed using a dynamic programming algorithm based on nonlinear time series analysis which has been successfully used for version identification with harmony descriptors. After evaluating the accuracy of individual descriptors, we assess whether performance can be improved by descriptor fusion, for which we apply a classification approach, comparing different classification algorithms. We show that both melody and bass line descriptors carry useful information for version identification, and that combining them increases version detection accuracy. Whilst harmony remains the most reliable musical representation for version identification, we demonstrate how in some cases performance can be improved by combining it with melody and bass line descriptions. Finally, we identify some of the limitations of the proposed descriptor fusion approach, and discuss directions for future research.

Automatic identification of samples in hip hop music

Publication Type:

Conference Paper

Source:

Int. Symp. on Computer Music Modeling and Retrieval (CMMR), London, UK, p.544-551 (2012)

URL:

http://cmmr2012.eecs.qmul.ac.uk/sites/cmmr2012.eecs.qmul.ac.uk/files/pdf/papers/cmmr2012_submission_19.pdf

Keywords:

Music; Information retrieval; Audio identification

Abstract:

Digital sampling can be defined as the use of a fragment of another artist's recording in a new work, and has been common practice in popular music production since the 1980s. Knowledge of the origins of samples holds valuable musicological information, which could in turn be used to organise music collections. Yet the automatic recognition of samples has not been addressed in the music retrieval community. In this paper, we introduce the problem, situate it in the field of content-based music retrieval and present a first strategy to approach it. Evaluation confirms that our modified, optimised fingerprinting approach is indeed a viable strategy.
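Audio fingerprinting systems of the kind adapted here are commonly built on spectral-peak landmark hashing: robust time-frequency peaks are paired, and each pair is hashed into a compact, noise-tolerant token. The sketch below is a toy illustration of that general family, not the paper's modified approach; all parameters and names are assumptions.

```python
import numpy as np

def peak_pair_hashes(spec, n_peaks=30, fan_out=3, max_dt=32):
    """Toy spectral-peak landmark hashes.

    spec: magnitude spectrogram, shape (freq_bins, frames). The
    strongest time-frequency bins are taken as "peaks"; each peak is
    paired with up to `fan_out` later peaks, and every pair becomes a
    hash (start frequency, end frequency, time delta).
    """
    flat = np.argsort(spec, axis=None)[-n_peaks:]      # strongest bins
    freqs, times = np.unravel_index(flat, spec.shape)
    order = np.argsort(times)                          # sort peaks by time
    freqs, times = freqs[order], times[order]
    hashes = set()
    for i in range(len(times)):
        for j in range(i + 1, min(i + 1 + fan_out, len(times))):
            dt = times[j] - times[i]
            if 0 < dt <= max_dt:                       # pairing window
                hashes.add((int(freqs[i]), int(freqs[j]), int(dt)))
    return hashes

def match_score(query_hashes, ref_hashes):
    """Fraction of query hashes found in the reference."""
    return len(query_hashes & ref_hashes) / max(1, len(query_hashes))
```

Sample identification is harder than exact-duplicate matching because samples are typically pitch-shifted, time-stretched and layered under new material, which is why a stock fingerprinter needs modification for this task.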
