Treating Inherent Ambiguity in Ground Truth Data: Evaluating a Chord Labelling Algorithm
This paper outlines a method for evaluating a new chord-labelling algorithm using symbolic data as input. Excerpts from full-score transcriptions of 40 pop songs are used. The accuracy of the algorithm’s output is compared with that of chord labels from published song books, as assessed by experts in pop music theory. We are interested not only in the accuracy of the two sets of labels but also in the question of potential harmonic ambiguity as reflected in the judges’ assessments. We focus, in this short paper, on outlining the general approach of this research project.
Using discovered, polyphonic patterns to filter computer-generated music
A metric for evaluating the creativity of a music-generating system is presented, the objective being to generate mazurka-style music that inherits salient patterns from an original excerpt by Frédéric Chopin. The metric acts as a filter within our overall system, causing rejection of generated passages that do not inherit salient patterns, until a generated passage survives. Over fifty iterations, the mean number of generations required until survival was 12.7, with standard deviation 13.2. In the interests of clarity and replicability, the system is described with reference to specific excerpts of music. Four concepts (Markov modelling for generation, pattern discovery, pattern quantification, and statistical testing) are presented quite distinctly, so that the reader might adopt (or ignore) each concept as they wish.
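The generate-and-filter loop described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy pitch sequence, the first-order Markov table, and the function names are hypothetical, and the filter is a plain subsequence check standing in for the paper's pattern discovery and statistical testing, which the abstract does not specify.

```python
import random

def train_markov(sequence):
    """Build a first-order transition table: state -> list of observed successors."""
    table = {}
    for current, nxt in zip(sequence, sequence[1:]):
        table.setdefault(current, []).append(nxt)
    return table

def generate(table, start, length, rng):
    """Random walk through the transition table, beginning at `start`."""
    passage = [start]
    while len(passage) < length:
        successors = table.get(passage[-1])
        if not successors:  # dead end: fall back to the initial state
            successors = [start]
        passage.append(rng.choice(successors))
    return passage

def inherits_pattern(passage, pattern):
    """Toy filter (assumption): does `pattern` occur as a contiguous run?"""
    n = len(pattern)
    return any(passage[i:i + n] == pattern for i in range(len(passage) - n + 1))

def generate_until_survival(table, start, length, pattern, rng, max_tries=10_000):
    """Reject generated passages until one passes the filter; return it with the try count."""
    for attempt in range(1, max_tries + 1):
        passage = generate(table, start, length, rng)
        if inherits_pattern(passage, pattern):
            return passage, attempt
    raise RuntimeError("no passage survived the filter")

# Hypothetical source melody as MIDI pitch numbers, and a pattern to inherit.
source = [60, 62, 64, 62, 60, 64, 65, 67, 65, 64, 62, 60]
pattern = [64, 65, 67]
table = train_markov(source)
passage, attempts = generate_until_survival(table, 60, 12, pattern, random.Random(0))
print(passage, attempts)
```

Repeating the last call with different seeds and averaging `attempts` gives a figure analogous to the "mean number of generations required until survival" reported above, though for this toy filter rather than the authors' metric.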
Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified. Comment: 15 pages, 2 PDF figures
From on-line sketching to 2D and 3D geometry: A fuzzy knowledge based system
The paper describes the development of a fuzzy knowledge based prototype system for conceptual design. This real time system is designed to infer the user’s sketching intentions, to segment sketched input and to generate corresponding geometric primitives: straight lines, circles, arcs, ellipses, elliptical arcs, and B-spline curves. Topology information (connectivity, unitary constraints and pairwise constraints) is received dynamically from 2D sketched input and primitives. From the 2D topology information, a more accurate 2D geometry can be built up by applying a 2D geometric constraint solver. Subsequently, 3D geometry can be received incrementally, feature by feature. Each feature can be recognised by inference knowledge in terms of matching its 2D primitive configurations and connection relationships. The system accepts not only sketched input, working as an automatic design tool, but also the user’s interactive input of both 2D primitives and special positional 3D primitives. This makes it easy and friendly to use. The system has been tested with a number of sketched inputs of 2D and 3D geometry.
Towards automatic extraction of harmony information from music signals
PhD. In this thesis we address the subject of automatic extraction of harmony
information from audio recordings. We focus on chord symbol recognition
and methods for evaluating algorithms designed to perform that task.
We present a novel six-dimensional model for equal tempered pitch
space based on concepts from neo-Riemannian music theory. This model
is employed as the basis of a harmonic change detection function which
we use to improve the performance of a chord recognition algorithm.
We develop a machine readable text syntax for chord symbols and
present a hand labelled chord transcription collection of 180 Beatles songs
annotated using this syntax. This collection has been made publicly available
and is already widely used for evaluation purposes in the research
community. We also introduce methods for comparing chord symbols
which we subsequently use for analysing the statistics of the transcription
collection. To ensure that researchers are able to use our transcriptions
with confidence, we demonstrate a novel alignment algorithm based on
simple audio fingerprints that allows local copies of the Beatles audio files
to be accurately aligned to our transcriptions automatically.
Evaluation methods for chord symbol recall and segmentation measures
are discussed in detail and we use our chord comparison techniques
as the basis for a novel dictionary-based chord symbol recall calculation.
At the end of the thesis, we evaluate the performance of fifteen chord
recognition algorithms (three of our own and twelve entrants to the 2009
MIREX chord detection evaluation) on the Beatles collection. Results
are presented for several different evaluation measures using a range of
evaluation parameters. The algorithms are compared with each other in
terms of performance but we also pay special attention to analysing and
discussing the benefits and drawbacks of the different evaluation methods
that are used.
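As a rough illustration of what a chord symbol recall measure computes, the sketch below scores an estimated transcription against a reference one by the proportion of reference time for which the two labels agree. This is a simplified stand-in, not the thesis's dictionary-based calculation: labels are compared by exact string equality, the segment format `(start, end, label)` and the example labels are assumptions, and the estimated segments are assumed not to overlap one another.

```python
def chord_symbol_recall(reference, estimate):
    """Fraction of reference duration whose label matches the estimate.

    Both arguments are lists of (start_seconds, end_seconds, label) tuples.
    Matching is exact string equality (an assumption made for this sketch).
    """
    total = sum(end - start for start, end, _ in reference)
    matched = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimate:
            # Length of the time interval shared by the two segments.
            overlap = min(r_end, e_end) - max(r_start, e_start)
            if overlap > 0 and r_label == e_label:
                matched += overlap
    return matched / total if total > 0 else 0.0

# Hypothetical hand-labelled reference and algorithm output.
reference = [(0.0, 2.0, "C:maj"), (2.0, 4.0, "A:min"), (4.0, 6.0, "F:maj")]
estimate = [(0.0, 2.5, "C:maj"), (2.5, 4.0, "A:min"), (4.0, 6.0, "G:maj")]
recall = chord_symbol_recall(reference, estimate)
print(round(recall, 3))
```

Here the labels agree for 2.0 s of the first segment and 1.5 s of the second, so the score is 3.5 out of 6.0 seconds. Replacing the string equality with a comparison over a restricted chord dictionary would move the sketch closer to the kind of measure the thesis discusses.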
Evaluating a weighted graph polynomial for graphs of bounded tree-width
We show that for any $k$ there is a polynomial time algorithm to evaluate the weighted graph polynomial of any graph with tree-width at most $k$ at any point. For a graph with $n$ vertices, the algorithm requires a number of arithmetical operations that is polynomial in $n$, with a factor that depends only on $k$.