Search CORE

1,274 research outputs found

Treating Inherent Ambiguity in Ground Truth Data: Evaluating a Chord Labelling Algorithm

Author: Lewis David
Müllensiefen Daniel
Rhodes Christophe
Wiggins Geraint
Publication venue
Publication date: 01/09/2007
Field of study

This paper outlines a method for evaluating a new chord-labelling algorithm using symbolic data as input. Excerpts from full-score transcriptions of 40 pop songs are used. The accuracy of the algorithm’s output is compared with that of chord labels from published song books, as assessed by experts in pop music theory. We are interested not only in the accuracy of the two sets of labels but also in the question of potential harmonic ambiguity as reflected the judges’ assessments. We focus, in this short paper, on outlining the general approach of this research project

Goldsmiths Research Online

Using discovered, polyphonic patterns to filter computer-generated music

Author: Collins Tom
Garthwaite Paul
Laney Robin
Willis Alistair
Publication venue
Publication date: 01/01/2010
Field of study

A metric for evaluating the creativity of a music-generating system is presented, the objective being to generate mazurka-style music that inherits salient patterns from an original excerpt by FrÃ©dÃ©ric Chopin. The metric acts as a filter within our overall system, causing rejection of generated passages that do not inherit salient patterns, until a generated passage survives. Over fifty iterations, the mean number of generations required until survival was 12.7, with standard deviation 13.2. In the interests of clarity and replicability, the system is described with reference to specific excerpts of music. Four concepts–Markov modelling for generation, pattern discovery, pattern quantification, and statistical testing–are presented quite distinctly, so that the reader might adopt (or ignore) each concept as they wish

CiteSeerX

Open Research Online (The Open University)

De Montfort University Open Research Archive

Deep Learning for Audio Signal Processing

Author: Chang Shuo-yiin
Li Bo
Purwins Hendrik
Sainath Tara
Schlüter Jan
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2019
Field of study

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.Comment: 15 pages, 2 pdf figure

arXiv.org e-Print Archive

VBN

Recommended from our members

From on-line sketching to 2D and 3D geometry: A fuzzy knowledge based system

Author: Jordanov IN
Qin SF
Wright DK
Publication venue: 'Elsevier BV'
Publication date: 01/01/2000
Field of study

The paper describes the development of a fuzzy knowledge based prototype system for conceptual design. This real time system is designed to infer user’s sketching intentions, to segment sketched input and generate corresponding geometric primitives: straight lines, circles, arcs, ellipses, elliptical arcs, and B-spline curves. Topology information (connectivity, unitary constraints and pairwise constraints) is received dynamically from 2D sketched input and primitives. From the 2D topology information, a more accurate 2D geometry can be built up by applying a 2D geometric constraint solver. Subsequently, 3D geometry can be received feature by feature incrementally. Each feature can be recognised by inference knowledge in terms of matching its 2D primitive configurations and connection relationships. The system accepts not only sketched input, working as an automatic design tools, but also accepts user’s interactive input of both 2D primitives and special positional 3D primitives. This makes it easy and friendly to use. The system has been tested with a number of sketched inputs of 2D and 3D geometry

Brunel University Research Archive

Towards automatic extraction of harmony information from music signals

Author: Harte Christopher
Publication venue
Publication date: 01/01/2010
Field of study

PhDIn this thesis we address the subject of automatic extraction of harmony information from audio recordings. We focus on chord symbol recognition and methods for evaluating algorithms designed to perform that task. We present a novel six-dimensional model for equal tempered pitch space based on concepts from neo-Riemannian music theory. This model is employed as the basis of a harmonic change detection function which we use to improve the performance of a chord recognition algorithm. We develop a machine readable text syntax for chord symbols and present a hand labelled chord transcription collection of 180 Beatles songs annotated using this syntax. This collection has been made publicly available and is already widely used for evaluation purposes in the research community. We also introduce methods for comparing chord symbols which we subsequently use for analysing the statistics of the transcription collection. To ensure that researchers are able to use our transcriptions with confidence, we demonstrate a novel alignment algorithm based on simple audio fingerprints that allows local copies of the Beatles audio files to be accurately aligned to our transcriptions automatically. Evaluation methods for chord symbol recall and segmentation measures are discussed in detail and we use our chord comparison techniques as the basis for a novel dictionary-based chord symbol recall calculation. At the end of the thesis, we evaluate the performance of fifteen chord recognition algorithms (three of our own and twelve entrants to the 2009 MIREX chord detection evaluation) on the Beatles collection. Results are presented for several different evaluation measures using a range of evaluation parameters. The algorithms are compared with each other in terms of performance but we also pay special attention to analysing and discussing the benefits and drawbacks of the different evaluation methods that are used

Queen Mary Research Online

OpenGrey Repository