4,344 research outputs found
Automatic chord transcription from audio using computational models of musical context
PhD
This thesis is concerned with the automatic transcription of chords from audio, with an emphasis
on modern popular music. Musical context such as the key and the structural segmentation helps
human listeners interpret chords. In this thesis we propose computational models
that integrate such musical context into the automatic chord estimation process.
We present a novel dynamic Bayesian network (DBN) which integrates models of metric
position, key, chord, bass note and two beat-synchronous audio features (bass and treble
chroma) into a single high-level musical context model. We simultaneously infer the most probable
sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several
experiments with real world data show that adding context parameters results in a significant
increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed,
most complex method transcribes chords with a state-of-the-art accuracy of 73% on the song
collection used for the 2009 MIREX Chord Detection tasks. This method is used as a baseline
method for two further enhancements.
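The joint decoding step described above can be illustrated with a much smaller model. The following is a minimal sketch of Viterbi decoding over a plain two-state hidden Markov model, not the thesis's dynamic Bayesian network: the chord labels, the "sticky" transition probabilities and the per-beat likelihoods are all made-up assumptions chosen for illustration.

```python
import math


def viterbi(obs_loglik, log_trans, log_init):
    """Return the most probable state sequence of a simple HMM.

    obs_loglik[t][s]: log-likelihood of frame t under state s
    log_trans[r][s]:  log-probability of moving from state r to state s
    log_init[s]:      log-probability of starting in state s
    """
    n_states = len(log_init)
    delta = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(obs_loglik)):
        new_delta, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at time t.
            best = max(range(n_states), key=lambda r: delta[r] + log_trans[r][s])
            ptr.append(best)
            new_delta.append(delta[best] + log_trans[best][s] + obs_loglik[t][s])
        delta = new_delta
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def logs(rows):
    """Convert a table of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical toy data: two chords and five beats.
chords = ["C", "G"]
likelihoods = logs([[0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.2, 0.8], [0.1, 0.9]])
transitions = logs([[0.9, 0.1], [0.1, 0.9]])  # assumed: chords tend to persist
initial = [math.log(0.5), math.log(0.5)]

path = viterbi(likelihoods, transitions, initial)
print([chords[s] for s in path])
```

In this toy example the second beat is locally more likely to be G, but the transition model keeps the decoded sequence on C until the evidence for G is sustained. That smoothing effect is the kind of contextual influence the abstract attributes to the richer model.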
Firstly, we aim to improve chord confusion behaviour by modifying the audio front end
processing. We compare the effect of learning chord profiles as Gaussian mixtures to the effect
of using chromagrams generated from an approximate pitch transcription method. We show
that using chromagrams from approximate transcription results in the most substantial increase
in accuracy. The best method achieves 79% accuracy and significantly outperforms the state of
the art.
Secondly, we propose a method by which chromagram information is shared between
repeated structural segments (such as verses) in a song. This can be done fully automatically
using a novel structural segmentation algorithm tailored to this task. We show that the technique
leads to a significant increase in accuracy and readability. The segmentation algorithm itself
also obtains state-of-the-art results. A method that combines both of the above enhancements
reaches an accuracy of 81%, a statistically significant improvement over the best result (74%)
in the 2009 MIREX Chord Detection tasks.
Engineering and Physical Research Council U
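The abstract does not say how chromagram information is shared between repeated segments. One simple possibility, shown here purely as an assumption, is to average the beat-synchronous chroma vectors at corresponding beats of each repetition. The function name `share_chroma` and the equal-length requirement are hypothetical.

```python
def share_chroma(chroma, starts, length):
    """Average chroma frames across repetitions of one section.

    chroma: list of per-beat chroma vectors (lists of floats)
    starts: first beat index of each repetition of the section
    length: number of beats in the section (assumed equal for all repetitions)
    """
    shared = [list(frame) for frame in chroma]  # leave the input untouched
    for offset in range(length):
        frames = [chroma[start + offset] for start in starts]
        mean = [sum(values) / len(frames) for values in zip(*frames)]
        for start in starts:
            shared[start + offset] = list(mean)
    return shared


# Toy example with 2-bin "chroma" for brevity; real chroma vectors have 12 bins.
beats = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [0.0, 1.0], [1.0, 0.0]]
smoothed = share_chroma(beats, starts=[0, 3], length=2)
print(smoothed)
```

Because every repetition then carries identical features, a decoder run on the averaged chromagram tends to produce the same chord sequence for each repetition, which is one plausible reading of the improved readability mentioned above.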
Deep Learning in Cardiology
The medical field is creating a large amount of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology, which also apply to medicine in general, while proposing certain
directions as the most viable for clinical use.
Comment: 27 pages, 2 figures, 10 table
Automatic music transcription: challenges and future directions
Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.
Multi-channel approaches for musical audio content analysis
The goal of this research project is to undertake a critical evaluation of signal representations for musical audio content analysis. In particular it will contrast three different means of analysing micro-rhythmic content in Afro-Latin American music, namely through the use of: i) stereo or mono mixed recordings; ii) separated sources obtained via state-of-the-art musical audio source separation techniques; and iii) perfectly separated multi-track stems.
In total the project comprises the following five objectives: i) To compile a dataset of mixed and multi-channel recordings of Brazilian Maracatu musicians; ii) To conceive methods for the analysis of rhythmic micro-variations and for pattern recognition; iii) To explore diverse music source separation approaches that preserve micro-rhythmic content; iv) To evaluate the performance of several automatic onset estimation approaches; and v) To compare the rhythmic analysis obtained from the original multi-channel sources versus the separated ones to evaluate separation quality regarding microtiming identification.
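The project description does not state how onset estimation will be evaluated. A common approach, sketched here as an assumption, is to count an estimated onset as correct when it falls within a tolerance window of an unmatched reference onset, then report the F-measure. The 50 ms default and the greedy matching are illustrative choices; microtiming analysis would likely need a tighter window, and greedy matching is not guaranteed to find the largest possible set of matches.

```python
def onset_f_measure(reference, estimated, tolerance=0.05):
    """F-measure of estimated onset times (seconds) against reference onsets.

    Uses greedy one-to-one matching on sorted onset lists; `tolerance`
    (an assumed 50 ms default) is the maximum allowed time deviation.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = hits = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            hits += 1  # matched pair: consume both onsets
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # spurious detection (false positive)
        else:
            i += 1  # missed reference onset (false negative)
    if hits == 0:
        return 0.0
    precision = hits / len(est)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical onset times in seconds.
reference_onsets = [0.5, 1.0, 1.5, 2.0]
estimated_onsets = [0.52, 1.2, 1.49]
score = onset_f_measure(reference_onsets, estimated_onsets)
print(round(score, 3))
```

Running the same function on onsets detected from mixed recordings, separated sources and multi-track stems would give one comparable number per signal representation.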