Broadcasting Automata and Patterns on Z^2
The Broadcasting Automata model draws inspiration from a variety of sources,
such as ad hoc radio networks, cellular automata, neighbourhood sequences and
nature, employing many of the same pattern-forming methods that can be seen in
the superposition of waves and resonance. Algorithms for the broadcasting
automata model are in the same vein as those encountered in distributed
algorithms, using a simple notion of waves (messages passed from automaton to
automaton throughout the topology) to construct computations. The waves generated
by activating processes in a digital environment can be used for designing a
variety of wave algorithms. In this chapter we aim to study the geometrical
shapes of informational waves on the integer grid generated in the broadcasting
automata model, as well as their potential use for metric approximation in a
discrete space. An exploration of the ability to vary the broadcasting radius
of each node leads to results on the categorisation of digital discs: their form,
composition, encodings and generation. Results pertaining to the nodal
patterns generated by arbitrary transmission radii on the plane are explored,
with a connection to broadcasting sequences and the approximation of discrete
metrics. In particular, results are given for the approximation of astroids, a
previously unachievable concave metric, through a novel application of the
aggregation of waves via a number of explored functions.
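As a rough illustration of the "digital disc" idea mentioned in this abstract, the sketch below enumerates the grid points of Z^2 that a node would reach with a given transmission radius. It assumes a Euclidean radius given as a squared integer; the function name `digital_disc` and its parameters are hypothetical and are not taken from the chapter, whose own definitions and encodings may differ.

```python
from math import isqrt


def digital_disc(r_squared, centre=(0, 0)):
    """Return the set of integer grid points whose squared Euclidean
    distance from `centre` is at most `r_squared` (an assumed definition)."""
    cx, cy = centre
    # No offset larger than floor(sqrt(r_squared)) can lie inside the disc.
    bound = isqrt(r_squared)
    return {
        (cx + dx, cy + dy)
        for dx in range(-bound, bound + 1)
        for dy in range(-bound, bound + 1)
        if dx * dx + dy * dy <= r_squared
    }


def render(points, bound):
    """Draw the point set as text rows, '#' for covered grid points."""
    return "\n".join(
        "".join("#" if (x, y) in points else "." for x in range(-bound, bound + 1))
        for y in range(bound, -bound - 1, -1)
    )


if __name__ == "__main__":
    # Different squared radii give differently shaped discs on the grid.
    for r2 in (1, 2, 4, 5):
        print(f"r^2 = {r2}")
        print(render(digital_disc(r2), 3))
```

Printing the discs for a few squared radii shows how the outline changes from a cross to a square to rounder shapes, which is the kind of variation the abstract attributes to changing the broadcasting radius.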
Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified.
Comment: 15 pages, 2 pdf figure
Automatic chord transcription from audio using computational models of musical context
PhD
This thesis is concerned with the automatic transcription of chords from audio, with an emphasis
on modern popular music. Musical context such as the key and the structural segmentation aid
the interpretation of chords in human beings. In this thesis we propose computational models
that integrate such musical context into the automatic chord estimation process.
We present a novel dynamic Bayesian network (DBN) which integrates models of metric
position, key, chord, bass note and two beat-synchronous audio features (bass and treble
chroma) into a single high-level musical context model. We simultaneously infer the most probable
sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several
experiments with real world data show that adding context parameters results in a significant
increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed,
most complex method transcribes chords with a state-of-the-art accuracy of 73% on the song
collection used for the 2009 MIREX Chord Detection tasks. This method is used as a baseline
method for two further enhancements.
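The Viterbi inference mentioned above can be illustrated on a much smaller model. The sketch below is a plain two-state hidden Markov model, not the thesis's dynamic Bayesian network: the states, the observation symbols and all probabilities are invented for illustration, and the function and variable names are hypothetical.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable state sequence for `observations`
    under a discrete HMM given as log-probability tables."""
    # best[t][s]: log-probability of the best path ending in state s at step t.
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model (all values are assumptions): two chords that tend to persist,
# observed through a noisy frame-wise label.
STATES = ["C", "G"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {
    "C": {"C": math.log(0.9), "G": math.log(0.1)},
    "G": {"C": math.log(0.1), "G": math.log(0.9)},
}
LOG_EMIT = {
    "C": {"c": math.log(0.8), "g": math.log(0.2)},
    "G": {"c": math.log(0.2), "g": math.log(0.8)},
}

if __name__ == "__main__":
    frames = ["c", "c", "g", "c", "c"]
    print(viterbi(frames, STATES, LOG_START, LOG_TRANS, LOG_EMIT))
```

Because the toy transition table favours staying in the same chord, a single deviating frame is smoothed over rather than decoded as a chord change. The thesis applies the same kind of joint decoding to a far richer state space (metric position, key, chord and bass note).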
Firstly, we aim to improve chord confusion behaviour by modifying the audio front end
processing. We compare the effect of learning chord profiles as Gaussian mixtures to the effect
of using chromagrams generated from an approximate pitch transcription method. We show
that using chromagrams from approximate transcription results in the most substantial increase
in accuracy. The best method achieves 79% accuracy and significantly outperforms the state of
the art.
Secondly, we propose a method by which chromagram information is shared between
repeated structural segments (such as verses) in a song. This can be done fully automatically
using a novel structural segmentation algorithm tailored to this task. We show that the technique
leads to a significant increase in accuracy and readability. The segmentation algorithm itself
also obtains state-of-the-art results. A method that combines both of the above enhancements
reaches an accuracy of 81%, a statistically significant improvement over the best result (74%)
in the 2009 MIREX Chord Detection tasks.
Engineering and Physical Research Council U
Automatic music transcription: challenges and future directions
Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general-purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.
Towards automatic extraction of harmony information from music signals
PhD
In this thesis we address the subject of automatic extraction of harmony
information from audio recordings. We focus on chord symbol recognition
and methods for evaluating algorithms designed to perform that task.
We present a novel six-dimensional model for equal tempered pitch
space based on concepts from neo-Riemannian music theory. This model
is employed as the basis of a harmonic change detection function which
we use to improve the performance of a chord recognition algorithm.
We develop a machine readable text syntax for chord symbols and
present a hand labelled chord transcription collection of 180 Beatles songs
annotated using this syntax. This collection has been made publicly available
and is already widely used for evaluation purposes in the research
community. We also introduce methods for comparing chord symbols
which we subsequently use for analysing the statistics of the transcription
collection. To ensure that researchers are able to use our transcriptions
with confidence, we demonstrate a novel alignment algorithm based on
simple audio fingerprints that allows local copies of the Beatles audio files
to be accurately aligned to our transcriptions automatically.
Evaluation methods for chord symbol recall and segmentation measures
are discussed in detail and we use our chord comparison techniques
as the basis for a novel dictionary-based chord symbol recall calculation.
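A duration-weighted recall of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: chord labels are compared by exact string equality, whereas the thesis uses its own chord comparison techniques and a dictionary-based calculation, and the function name and the `(start, end, label)` segment format are hypothetical.

```python
def chord_symbol_recall(reference, estimate):
    """Fraction of the annotated duration during which the estimated
    chord label equals the reference label.

    Both arguments are lists of (start, end, label) segments in seconds,
    each list non-overlapping. Exact label equality is an assumption.
    """
    total = sum(end - start for start, end, _ in reference)
    if total <= 0:
        return 0.0
    matched = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimate:
            # Length of the time interval shared by the two segments.
            overlap = min(r_end, e_end) - max(r_start, e_start)
            if overlap > 0 and r_label == e_label:
                matched += overlap
    return matched / total


if __name__ == "__main__":
    annotated = [(0.0, 2.0, "C"), (2.0, 4.0, "G")]
    estimated = [(0.0, 3.0, "C"), (3.0, 4.0, "G")]
    print(chord_symbol_recall(annotated, estimated))
```

Replacing the equality test with a looser comparison (for example, matching only root and chord family) would change the score, which is one reason the choice of comparison method matters when evaluating chord recognition algorithms.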
At the end of the thesis, we evaluate the performance of fifteen chord
recognition algorithms (three of our own and twelve entrants to the 2009
MIREX chord detection evaluation) on the Beatles collection. Results
are presented for several different evaluation measures using a range of
evaluation parameters. The algorithms are compared with each other in
terms of performance but we also pay special attention to analysing and
discussing the benefits and drawbacks of the different evaluation methods
that are used.