Large Vocabulary Automatic Chord Estimation Using Deep Neural Nets: Design Framework, System Variations and Limitations
In this paper, we propose a new system design framework for large vocabulary
automatic chord estimation. Our approach is based on an integration of
traditional sequence segmentation processes and deep learning chord
classification techniques. We systematically explore the design space of the
proposed framework for a range of parameters, namely deep neural nets, network
configurations, input feature representations, segment tiling schemes, and
training data sizes. Experimental results show that among the three proposed
deep neural nets and a baseline model, the recurrent neural network based
system has the best average chord quality accuracy that significantly
outperforms the other considered models. Furthermore, our bias-variance
analysis has identified a glass ceiling as a potential hindrance to future
improvements of large vocabulary automatic chord estimation systems.
A Critical Look at the Applicability of Markov Logic Networks for Music Signal Analysis
In recent years, Markov logic networks (MLNs) have been proposed as a
potentially useful paradigm for music signal analysis. Because all hidden
Markov models can be reformulated as MLNs, the latter can provide an
all-encompassing framework that reuses and extends previous work in the field.
However, the fact that it is theoretically possible to reformulate previous work
as MLNs does not mean that doing so is advantageous. In this paper, we analyse some
proposed examples of MLNs for musical analysis and consider their practical
disadvantages when compared to formulating the same musical dependence
relationships as (dynamic) Bayesian networks. We argue that a number of
practical hurdles such as the lack of support for sequences and for arbitrary
continuous probability distributions make MLNs less than ideal for the proposed
musical applications, both in terms of ease of formulation and computational
requirements due to their required inference algorithms. These conclusions are
not specific to music, but apply to other fields as well, especially when
sequential data with continuous observations is involved. Finally, we show that
the ideas underlying the proposed examples can be expressed perfectly well in
the more commonly used framework of (dynamic) Bayesian networks.
Comment: Accepted for presentation at the Ninth International Workshop on
Statistical Relational AI (StarAI 2020) at the 34th AAAI Conference on
Artificial Intelligence (AAAI) in New York, on February 7th 2020
Artificial Musical Intelligence: A Survey
Computers have been used to analyze and create music since they were first
introduced in the 1950s and 1960s. Beginning in the late 1990s, the rise of the
Internet and large-scale platforms for music recommendation and retrieval has
made music an increasingly prevalent domain of machine learning and artificial
intelligence research. While still nascent, several different approaches have
been employed to tackle what may broadly be referred to as "musical
intelligence." This article provides a definition of musical intelligence,
introduces a taxonomy of its constituent components, and surveys the wide range
of AI methods that can be, and have been, brought to bear in its pursuit, with
a particular emphasis on machine learning methods.
Comment: 99 pages, 5 figures, preprint: currently under review
DECIBEL: Improving Audio Chord Estimation for Popular Music by Alignment and Integration of Crowd-Sourced Symbolic Representations
Automatic Chord Estimation (ACE) is a fundamental task in Music Information
Retrieval (MIR) and has applications in both music performance and MIR
research. The task consists of segmenting a music recording or score and
assigning a chord label to each segment. Although it has been a task in the
annual benchmarking evaluation MIREX for over 10 years, ACE is not yet a solved
problem, since performance has stagnated and modern systems have started to
tune themselves to subjective training data. We propose DECIBEL, a new ACE
system that exploits widely available MIDI and tab representations to improve
ACE from audio only. From an audio file and a set of MIDI and tab files
corresponding to the same popular music song, DECIBEL first estimates chord
sequences. For audio, state-of-the-art audio ACE methods are used. MIDI files
are aligned to the audio, followed by a MIDI chord estimation step. Tab files
are transformed into untimed chord sequences and then aligned to the audio.
Next, DECIBEL uses data fusion to integrate all estimated chord sequences into
one final output sequence. DECIBEL improves all tested state-of-the-art ACE
methods by over 3 percent on average. This result shows that the integration of
musical knowledge from heterogeneous symbolic music representations is a
suitable strategy for addressing challenging MIR tasks such as ACE.
Comment: 81 pages, 47 figures
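
The fusion step described in the DECIBEL abstract can be illustrated with a minimal sketch. The abstract does not say which data fusion method DECIBEL uses, so the per-frame majority vote below is a stand-in chosen for simplicity, not the authors' method. The function name `fuse_by_majority_vote`, the tie-breaking rule, and the example chord labels are all hypothetical, and the sketch assumes every source has already been aligned to the audio and sampled on the same frame grid.

```python
from collections import Counter


def fuse_by_majority_vote(sequences):
    """Fuse frame-aligned chord label sequences by per-frame majority vote.

    Assumption: all sequences have the same length, i.e. they were already
    aligned to the audio on a shared frame grid. Ties go to the label from
    the earliest sequence in the input list (an arbitrary choice).
    """
    if not sequences:
        raise ValueError("need at least one chord sequence")
    n_frames = len(sequences[0])
    if any(len(seq) != n_frames for seq in sequences):
        raise ValueError("all sequences must cover the same number of frames")

    fused = []
    for labels in zip(*sequences):
        counts = Counter(labels)
        best = max(counts.values())
        # Pick the first label, in source order, that reaches the top count.
        fused.append(next(label for label in labels if counts[label] == best))
    return fused


# Hypothetical per-frame estimates from three sources for the same song.
audio = ["C", "C", "G", "G", "Am", "Am"]
midi = ["C", "C", "G", "Em", "Am", "F"]
tab = ["C", "F", "G", "G", "Am", "F"]

fused = fuse_by_majority_vote([audio, midi, tab])
print(fused)
```

Listing the audio estimate first makes it the fallback whenever the sources disagree without a majority, which is one plausible design choice when the audio method is the most trusted source.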