Automatically inferring the structural properties of raw multimedia documents is essential in today's digitized society. Given their hierarchical and multi-faceted organization, musical pieces represent a challenge for current computational systems. In this paper, we present a novel approach to music structure annotation based on structure features and time series similarity. The proposed structure features encapsulate both local and global properties of a time series, and allow us to detect boundaries between homogeneous, novel, or repeated segments. After boundary detection, time series similarity is used to identify equivalent segments, corresponding to musically meaningful parts. Extensive tests with a total of five benchmark music collections and seven different human annotations show that the proposed approach is robust to different ground truth choices and parameter settings. Moreover, we see that it outperforms the other approaches from the literature evaluated under the same framework. Our results stress the importance of a robust boundary detection strategy as a first step for structure annotation, an aspect that is often underestimated.
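To make the two-stage idea (boundary detection first, segment labeling second) concrete, the following is a minimal sketch on a toy time series. It does not implement the structure features proposed in the paper, which the abstract does not specify. As a stand-in, it uses a checkerboard-kernel novelty curve over a cosine self-similarity matrix for boundaries, and a greedy comparison of segment means for labeling. All function names, the kernel size, and the thresholds are assumptions chosen for illustration.

```python
import numpy as np

def self_similarity(X):
    # Cosine similarity between every pair of frames in X (n_frames x n_dims).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    return Xn @ Xn.T

def checkerboard_kernel(half):
    # +1 on the two within-segment quadrants, -1 on the two cross quadrants.
    sign = np.ones(2 * half)
    sign[:half] = -1.0
    return np.outer(sign, sign)

def novelty_curve(S, half):
    # Slide the kernel along the main diagonal of S (stand-in technique,
    # not the paper's structure features).
    n = S.shape[0]
    K = checkerboard_kernel(half)
    nov = np.zeros(n)
    for i in range(half, n - half + 1):
        window = S[i - half:i + half, i - half:i + half]
        nov[i] = np.sum(window * K)
    return nov

def detect_boundaries(nov, threshold):
    # Keep local maxima of the novelty curve that exceed the threshold.
    return [i for i in range(1, len(nov) - 1)
            if nov[i] > threshold and nov[i] > nov[i - 1] and nov[i] >= nov[i + 1]]

def label_segments(X, boundaries, min_sim=0.9):
    # Greedy labeling: a segment reuses the label of the most similar earlier
    # prototype if the cosine similarity of their mean frames is high enough.
    # (Hypothetical simplification of "time series similarity".)
    edges = [0] + list(boundaries) + [len(X)]
    labels, prototypes = [], []
    for a, b in zip(edges[:-1], edges[1:]):
        m = X[a:b].mean(axis=0)
        m = m / max(np.linalg.norm(m), 1e-12)
        sims = [float(m @ p) for p in prototypes]
        if sims and max(sims) >= min_sim:
            labels.append(int(np.argmax(sims)))
        else:
            prototypes.append(m)
            labels.append(len(prototypes) - 1)
    return labels

# Toy piece with an A-B-A form: 20 frames of A, 20 of B, 20 of A.
A = np.tile([1.0, 0.0], (20, 1))
B = np.tile([0.0, 1.0], (20, 1))
X = np.vstack([A, B, A])

S = self_similarity(X)
nov = novelty_curve(S, half=4)
boundaries = detect_boundaries(nov, threshold=0.5 * nov.max())
labels = label_segments(X, boundaries)
print(boundaries, labels)
```

On this toy input the sketch finds the two section changes and assigns the first and last segments the same label. Real audio descriptors are far noisier, which is one reason the labeling step depends so heavily on the quality of the detected boundaries.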