IDENTIFICATION OF COVER SONGS USING INFORMATION THEORETIC MEASURES OF SIMILARITY
13 pages, 5 figures, 4 tables. v3: Accepted version
Large-Scale Cover Song Recognition Using the 2D Fourier Transform Magnitude
Large-scale cover song recognition involves calculating item-to-item similarities that can accommodate differences in timing and tempo, rendering simple Euclidean measures unsuitable. Expensive solutions such as dynamic time warping do not scale to millions of instances, making them inappropriate for commercial-scale applications. In this work, we transform a beat-synchronous chroma matrix with a 2D Fourier transform and show that the resulting representation has properties that fit the cover song recognition task. We can also apply PCA to efficiently scale comparisons. We report the best results to date on the largest available dataset of around 18,000 cover songs amid one million tracks, giving a mean average precision of 3.0%.
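One property that plausibly makes the 2D Fourier transform magnitude attractive here is a general fact about the discrete Fourier transform: circularly shifting the input changes only the phase, not the magnitude. For a chroma patch, a circular shift along the pitch-class axis corresponds to a key transposition. The sketch below checks this on a toy patch. The patch values, the 12 x 8 size, and the helper names `dft2_magnitude` and `transpose_pitch` are illustrative assumptions, not taken from the paper, and the naive DFT is written out for clarity rather than speed.

```python
import cmath

def dft2_magnitude(patch):
    """Magnitude of the 2D discrete Fourier transform (naive O(N^2) version)."""
    rows, cols = len(patch), len(patch[0])
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2 * cmath.pi * (u * r / rows + v * c / cols)
                    total += patch[r][c] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        out.append(out_row)
    return out

def transpose_pitch(patch, semitones):
    """Circularly rotate the pitch-class rows, i.e. transpose the key."""
    k = semitones % len(patch)
    return patch[-k:] + patch[:-k]

# Hypothetical toy chroma patch: 12 pitch classes (rows) x 8 beats (columns).
patch = [[((7 * r + 3 * c) % 5) / 4 for c in range(8)] for r in range(12)]
shifted = transpose_pitch(patch, 3)  # same content, three semitones up

original_mag = dft2_magnitude(patch)
shifted_mag = dft2_magnitude(shifted)

# Largest element-wise difference between the two magnitude matrices.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(original_mag, shifted_mag)
    for a, b in zip(row_a, row_b)
)
```

Here `max_diff` is zero up to floating-point error even though `shifted` differs from `patch`, so two transposed renditions map to the same magnitude representation. In practice a fast FFT routine would replace the quadruple loop.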
Neural networks for musical chords recognition
Peer reviewed. In this paper, we consider the challenging problem of music recognition and present an effective machine learning based method using a feed-forward neural network for chord recognition. The method uses the known feature vector for automatic chord recognition called the Pitch Class Profile (PCP). Although the PCP vector only provides attributes corresponding to 12 semi-tone values, we show that it is adequate for chord recognition.
Part of our work also relates to the design of a database of chords. Our database is primarily designed for chords typical of Western European music. In particular, we have built a large dataset filled with recorded guitar chords under different acquisition conditions (instruments, microphones, etc.), but also with samples obtained with other instruments. Our experiments establish a twofold result: (1) the PCP is well suited for describing chords in a machine learning context, and (2) the algorithm is also capable of recognizing chords played with other instruments, even ones not seen during the training phase.
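To make the 12-value PCP vector concrete, here is a minimal sketch of one common way to build it: each spectral peak is mapped to its nearest equal-tempered pitch class and its energy is accumulated there. The peak list, the 440 Hz reference, and the function names `pitch_class` and `pcp` are assumptions for illustration. The paper's exact PCP computation and its neural network classifier are not reproduced.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, ref_a4=440.0):
    """Nearest equal-tempered pitch class (C = 0, ..., B = 11)."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / ref_a4))
    return (semitones_from_a4 + 9) % 12  # A is pitch class 9 when C is 0

def pcp(peaks):
    """Fold (frequency_hz, magnitude) peaks into a normalized 12-bin profile."""
    profile = [0.0] * 12
    for freq, mag in peaks:
        profile[pitch_class(freq)] += mag ** 2  # accumulate energy per class
    total = sum(profile)
    return [x / total for x in profile] if total > 0 else profile

# Hypothetical spectral peaks for a C major chord: C4, E4, G4, C5.
c_major_peaks = [(261.63, 1.0), (329.63, 0.8), (392.00, 0.9), (523.25, 0.5)]
profile = pcp(c_major_peaks)
active = [NOTE_NAMES[i] for i, x in enumerate(profile) if x > 0]
```

For this input `active` contains only C, E and G, because the two C peaks an octave apart fold into the same bin. A vector like `profile` is the kind of 12-dimensional input a chord classifier would receive.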
Computational Tonality Estimation: Signal Processing and Hidden Markov Models
PhD. This thesis investigates computational musical tonality estimation from an audio signal. We
present a hidden Markov model (HMM) in which relationships between chords and keys are
expressed as probabilities of emitting observable chords from a hidden key sequence. The model
is tested first using symbolic chord annotations as observations, and gives excellent global key
recognition rates on a set of Beatles songs.
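As a rough illustration of the idea of a hidden key sequence emitting observable chords, the following sketch decodes the most likely key path with the standard Viterbi algorithm. Everything numeric here is an assumption made up for the example: the two-key state space, the four-chord vocabulary, and the start, transition, and emission probabilities. The thesis's actual model and parameters are not reproduced.

```python
import math

# Hypothetical two-key model; all probabilities are invented for illustration.
KEYS = ["C major", "G major"]
START = {"C major": 0.5, "G major": 0.5}
TRANS = {
    "C major": {"C major": 0.9, "G major": 0.1},
    "G major": {"C major": 0.1, "G major": 0.9},
}
EMIT = {
    "C major": {"C": 0.40, "F": 0.30, "G": 0.25, "D": 0.05},
    "G major": {"C": 0.30, "F": 0.05, "G": 0.40, "D": 0.25},
}

def viterbi(chords):
    """Most likely hidden key sequence for an observed chord sequence."""
    # Log-probability of the best path ending in each key after the first chord.
    score = {k: math.log(START[k]) + math.log(EMIT[k][chords[0]]) for k in KEYS}
    back = []  # one dict of back-pointers per later time step
    for chord in chords[1:]:
        new_score, pointers = {}, {}
        for k in KEYS:
            prev = max(KEYS, key=lambda p: score[p] + math.log(TRANS[p][k]))
            new_score[k] = (score[prev] + math.log(TRANS[prev][k])
                            + math.log(EMIT[k][chord]))
            pointers[k] = prev
        score = new_score
        back.append(pointers)
    # Trace the best final key back to the start.
    path = [max(KEYS, key=lambda k: score[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

keys_for_c_song = viterbi(["C", "F", "G", "C"])
keys_for_g_song = viterbi(["G", "D", "G", "D"])
```

With these made-up parameters the first progression decodes to C major throughout and the second to G major throughout. A global key estimate could then be read off as the most frequent key on the decoded path.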
The initial model is extended for audio input by using an existing chord recognition algorithm,
which allows it to be tested on a much larger database. We show that a simple model of the
upper partials in the signal improves percentage scores. We also present a variant of the HMM
which has a continuous observation probability density, but show that the discrete version gives
better performance.
Then follows a detailed analysis of the effects on key estimation and computation time of
changing the low level signal processing parameters. We find that much of the high frequency
information can be omitted without loss of accuracy, and significant computational savings can
be made by applying a threshold to the transform kernels. Results show that there is no single
ideal set of parameters for all music, but that tuning the parameters can make a difference to
accuracy.
We discuss methods of evaluating more complex tonal changes than a single global key, and
compare a metric that measures similarity to a ground truth to metrics that are rooted in music
retrieval. We show that the two measures give different results, and so recommend that the choice
of evaluation metric is determined by the intended application.
Finally we draw together our conclusions and use them to suggest areas for continuation of this
research, in the areas of tonality model development, feature extraction, evaluation methodology,
and applications of computational tonality estimation.
Engineering and Physical Sciences Research Council (EPSRC)
Sequential decision making in artificial musical intelligence
Over the past 60 years, artificial intelligence has grown from a largely academic field of research to a ubiquitous array of tools and approaches used in everyday technology. Despite its many recent successes and growing prevalence, certain meaningful facets of computational intelligence have not been as thoroughly explored. Such additional facets cover a wide array of complex mental tasks which humans carry out easily, yet are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over the last decade, many researchers have applied computational tools to carry out tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents, able to mimic (at least partially) the complexity with which humans approach music. One key aspect which hasn't been sufficiently studied is that of sequential decision making in musical intelligence. This thesis strives to answer the following question: Can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? And if so, how? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and human-agent (and more generally agent-agent) interaction in the context of music. The key contributions of this thesis are the design of better music playlist recommendation algorithms; the design of algorithms for tracking user preferences over time; new approaches for modeling people's behavior in situations that involve music; and the design of agents capable of meaningful interaction with humans and other agents in a setting where music plays a role (either directly or indirectly).
Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, this thesis also establishes that insights from music-specific case studies can be applicable in other concrete social domains, such as different types of content recommendation. Showing the generality of insights from musical data in other contexts serves as evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this thesis demonstrates the overall usefulness of taking a sequential decision making approach in settings previously unexplored from this perspective.
Computer Science