
    Identifying Cover Songs Using Information-Theoretic Measures of Similarity

    This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection. We consider an information-theoretic approach, where we compute pairwise measures of predictability between time series. We compare discrete-valued approaches operating on quantized audio features to continuous-valued approaches. In the discrete case, we propose a method for computing the normalized compression distance in which we account for correlation between time series. In the continuous case, we propose to compute information-based measures of similarity as statistics of the prediction error between time series. We evaluate our methods on two cover song identification tasks, using a data set comprising 300 Jazz standards and using the Million Song Dataset. For both data sets, we observe that continuous-valued approaches outperform discrete-valued approaches. We consider approaches to estimating the normalized compression distance (NCD) based on string compression and prediction, where we observe that our proposed normalized compression distance with alignment (NCDA) improves average performance over NCD for sequential compression algorithms. Finally, we demonstrate that continuous-valued distances may be combined to improve performance with respect to baseline approaches. Using a large-scale filter-and-refine approach, we demonstrate state-of-the-art performance for cover song identification on the Million Song Dataset. The work of P. Foster was supported by an Engineering and Physical Sciences Research Council Doctoral Training Account studentship.
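    The baseline NCD on which the paper builds has a simple textbook form: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length of a string. Below is a minimal sketch of this baseline using zlib as the compressor; the toy byte sequences stand in for quantized audio features, and none of this reproduces the authors' alignment-based NCDA variant.

```python
import zlib

def c(data: bytes) -> int:
    """Compressed length of a byte string, using zlib as a stand-in compressor."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Toy usage: a "cover" pair should score lower than an unrelated pair.
a = bytes([1, 2, 3, 4] * 50)  # hypothetical quantized feature sequence
b = bytes([1, 2, 3, 5] * 50)  # near-duplicate of a
d = bytes([7, 0, 2, 9] * 50)  # unrelated sequence
print(ncd(a, b), ncd(a, d))
```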

    Perception of structure in auditory patterns

    The present research utilised five tasks to investigate non-musicians' perception of phrase, rhythm, pitch and beat structure in unaccompanied Gaelic melodies and musical sequences. Perception of phrase structure was examined using: i) a segmentation task in which listeners segmented Gaelic melodies into a series of meaningful units, and ii) a novel click localisation task in which listeners indicated where they perceived a superimposed click in the melody to have occurred. Listeners consistently segmented the melodies into units of 2.4–5.4 seconds. Clicks positioned before and after perceived boundaries (identified by segmentation) were perceptually migrated towards the boundary. These results suggest that listeners perceptually differentiate between phrasal groups in melodies (see Sloboda & Gregory, 1980, and Stoffer, 1985, for similar results with musicians). Short-term memory for rhythmic structure was examined using rhythm recall of computer-generated sequences and Gaelic melodies. Computer-generated rhythms with small tonal pitch intervals (1–4 semitones) were easier to recall than those with large atonal intervals (predominantly greater than 4 semitones). Recall of Gaelic melodies, which contain repetitive rhythmic units, was better than recall of computer sequences. Pitch reversal of the Gaelic melodies did not affect recall. Beat-tapping with three Gaelic melodies revealed that the majority of listeners established the underlying beat 1.5–3 seconds (5–6 notes) after the start of the melodies. Perception of meaning and content in two-note melodic intervals and three Gaelic melodies was examined using an adjective-pair two-alternative forced choice task. Responses to musical intervals showed evidence of perceptual similarity based mainly on interval size. Perceived information content in the melodies increased significantly by the fourth note. The results suggest that the amounts of Gaelic melody i) required to establish an underlying beat, ii) remembered after one hearing, and iii) perceptually grouped into a meaningful unit each include the unit of melody necessary to establish a basic meaning.

    Review of Zohar Eitan. Highpoints: A Study of Melodic Peaks


    Automatic Phrase Continuation from Guitar and Bass Guitar Melodies


    Towards a general computational theory of musical structure

    The General Computational Theory of Musical Structure (GCTMS) is a theory that may be employed to obtain a structural description (or set of descriptions) of a musical surface. The theory is based on general cognitive and logical principles, is independent of any specific musical style or idiom, and can be applied to any musical surface. The musical work is presented to GCTMS as a sequence of discrete, symbolically represented events (e.g. notes) without higher-level structural elements (e.g. articulation marks, time signature), although such information may be used to guide the analytic process. The aim of applying the theory is to reach a structural description of the musical work that may be considered 'plausible' or 'permissible' by a human music analyst. As style-dependent knowledge is not embodied in the general theory, highly sophisticated analyses (similar to those an expert analyst may provide) are not expected. The theory does, however, give higher ratings to descriptions that human analysts may consider more reasonable or acceptable, and lower ratings to descriptions that are less plausible.
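    As a concrete illustration of the input representation described above, a musical surface can be modelled as a flat list of discrete note events with no time signature or other higher-level markup. The field names below are illustrative assumptions, not the theory's own encoding.

```python
from dataclasses import dataclass

@dataclass
class NoteEvent:
    """One discrete, symbolically represented event of the musical surface."""
    onset: float     # onset time (e.g. in beats)
    duration: float  # duration in the same units
    pitch: int       # e.g. a MIDI note number

# A musical surface as GCTMS would receive it: just the event sequence.
surface = [
    NoteEvent(0.0, 1.0, 60),
    NoteEvent(1.0, 0.5, 62),
    NoteEvent(1.5, 0.5, 64),
    NoteEvent(2.0, 2.0, 67),
]
```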

    A statistical MMN reflects the magnitude of transitional probabilities in auditory sequences

    Within the framework of statistical learning, many behavioural studies have investigated the processing of unpredicted events. However, surprisingly few neurophysiological studies are available on this topic, and no statistical learning experiment has investigated electroencephalographic (EEG) correlates of processing events with different transition probabilities. We carried out an EEG study with a novel variant of the established statistical learning paradigm. Timbres were presented in isochronous sequences of triplets. The first two sounds of all triplets were equiprobable, while the third sound occurred with either low (10%), intermediate (30%), or high (60%) probability. Thus, the occurrence probability of the third item of each triplet (given the first two items) was varied. Compared with high-probability triplet endings, endings with low and intermediate probability elicited an early anterior negativity that had an onset around 100 ms and was maximal at around 180 ms. This effect was larger for events with low than with intermediate probability. Our results reveal that, when predictions are based on statistical learning, events that do not match a prediction evoke an early anterior negativity, with the amplitude of this mismatch response being inversely related to the probability of such events. We thus report a statistical mismatch negativity (sMMN) that reflects statistical learning of transitional probability distributions beyond the capabilities of auditory sensory memory.
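    The triplet manipulation can be sketched as a toy stimulus generator. The snippet below is a simplified reading of the design described above: two equiprobable context items followed by an ending drawn with 10%, 30%, or 60% conditional probability. The timbre labels are placeholders, and the published stimulus construction may differ in detail.

```python
import random

# Hypothetical timbre labels; the actual stimuli were timbre sounds.
CONTEXT = ["t1", "t2"]                          # equiprobable first two positions
ENDINGS = {"t3": 0.10, "t4": 0.30, "t5": 0.60}  # transition probabilities of endings

def make_triplet(rng: random.Random) -> list[str]:
    """One triplet: two equiprobable context items, then an ending drawn with
    low (10%), intermediate (30%), or high (60%) conditional probability."""
    first = rng.choice(CONTEXT)
    second = rng.choice(CONTEXT)
    ending = rng.choices(list(ENDINGS), weights=list(ENDINGS.values()), k=1)[0]
    return [first, second, ending]

# A long isochronous stream of triplets, as in an exposure phase.
rng = random.Random(0)
stream = [item for _ in range(200) for item in make_triplet(rng)]
```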

    The Interpersonal Entrainment in Music Performance Data Collection

    The Interpersonal Entrainment in Music Performance Data Collection (IEMPDC) comprises six related corpora of music research materials: Cuban Son & Salsa (CSS), European String Quartet (ESQ), Malian Jembe (MJ), North Indian Raga (NIR), Tunisian Stambeli (TS), and Uruguayan Candombe (UC). The core data for each corpus comprise media files and computationally extracted event onset timing data. Annotations of metrical structure and the code used in preparing the collection are also shared. The collection is unprecedented in size and level of detail and represents a significant new resource for empirical and computational research in music. In this article we introduce the main features of the data collection and the methods used in its preparation. Details of technical validation procedures and notes on data visualization are available as appendices. We also contextualize the collection in relation to developments in Open Science and Open Data, discussing important distinctions between these two related concepts.
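    Given two extracted onset streams of the kind the collection provides, a basic entrainment statistic is the signed asynchrony between matched onsets. The sketch below pairs each onset in one stream with its nearest counterpart in the other, within a tolerance window; the function name, window size, and pairing rule are generic assumptions, not the IEMPDC project's own analysis code.

```python
import numpy as np

def mean_asynchrony(onsets_a: np.ndarray, onsets_b: np.ndarray,
                    window: float = 0.1) -> float:
    """Mean signed asynchrony (seconds) between two players' onset streams.

    Each onset in stream A is paired with the nearest onset in sorted
    stream B; pairs further apart than `window` are dropped as unmatched.
    """
    idx = np.searchsorted(onsets_b, onsets_a)
    idx = np.clip(idx, 1, len(onsets_b) - 1)
    left, right = onsets_b[idx - 1], onsets_b[idx]
    nearest = np.where(onsets_a - left < right - onsets_a, left, right)
    diffs = onsets_a - nearest
    diffs = diffs[np.abs(diffs) <= window]
    return float(diffs.mean()) if diffs.size else float("nan")

# Toy usage with two slightly offset isochronous streams.
a = np.arange(0.0, 10.0, 0.5)
b = a + 0.02  # player B consistently 20 ms behind
print(mean_asynchrony(a, b))  # approximately -0.02
```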

    Convolutional Methods for Music Analysis
