Modeling musicological information as trigrams in a system for simultaneous chord and local key extraction
In this paper, we discuss the introduction of a trigram musicological model in a simultaneous chord and local key extraction system. By enlarging the context of the musicological model, we hoped to achieve a higher accuracy that could justify the associated higher complexity and computational load of the search for the optimal solution. Experiments on multiple data sets have demonstrated that the trigram model indeed has greater predictive power (a lower perplexity). This increased predictive power resulted in an improvement in key extraction, but no improvement in chord extraction, compared to a system with a bigram musicological model.
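The link between context length and perplexity mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's model, which works on joint chord and local key labels. It assumes a plain add-alpha smoothed n-gram over chord symbols, and the function name `ngram_perplexity`, its parameters, and the toy chord sequence are all hypothetical.

```python
import math
from collections import Counter


def ngram_perplexity(train, test, n, vocab_size, alpha=1.0):
    """Perplexity of an add-alpha smoothed n-gram model, trained on `train`
    and evaluated on `test` (both are sequences of hashable symbols)."""
    ngram_counts = Counter()
    context_counts = Counter()
    for i in range(n - 1, len(train)):
        context = tuple(train[i - n + 1:i])
        ngram_counts[context + (train[i],)] += 1
        context_counts[context] += 1

    log_prob = 0.0
    steps = 0
    for i in range(n - 1, len(test)):
        context = tuple(test[i - n + 1:i])
        numerator = ngram_counts[context + (test[i],)] + alpha
        denominator = context_counts[context] + alpha * vocab_size
        log_prob += math.log2(numerator / denominator)
        steps += 1
    # Perplexity is 2 to the power of the average negative log2-probability.
    return 2.0 ** (-log_prob / steps)


# Toy chord sequence (hypothetical data): a repeated C-F-G-C progression.
train = ["C", "F", "G", "C"] * 50
test = ["C", "F", "G", "C"] * 5

bi = ngram_perplexity(train, test, n=2, vocab_size=3)
tri = ngram_perplexity(train, test, n=3, vocab_size=3)
print(f"bigram perplexity:  {bi:.3f}")
print(f"trigram perplexity: {tri:.3f}")
```

In this toy sequence the chord that follows "C" depends on what came before it, so the two-symbol context removes an ambiguity that the bigram cannot resolve, and the trigram perplexity comes out lower.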
Towards a style-specific basis for computational beat tracking
Outlined in this paper are a number of sources of evidence, from psychological, ethnomusicological and engineering grounds, to suggest that current approaches to computational beat tracking are incomplete. It is contended that the degree to which cultural knowledge, that is, the specifics of style and associated learnt representational schema, underlie the human faculty of beat tracking has been severely underestimated. Difficulties in building general beat tracking solutions, which can provide both period and phase locking across a large corpus of styles, are highlighted. It is probable that no universal beat tracking model exists which does not utilise a switching model to recognise style and context prior to application.
Beat histogram features for rhythm-based musical genre classification using multiple novelty functions
In this paper we present beat histogram features for multiple level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude, but also tonal and general spectral changes in the signal, aiming to represent as much rhythmic information as possible. The most and least informative features are identified through feature selection methods and are then tested using Support Vector Machines on five genre datasets, comparing classification accuracy against a baseline feature set. Results show that the presented features provide comparable classification accuracy with respect to other genre classification approaches using periodicity histograms and display a performance close to that of much more elaborate up-to-date approaches for rhythm description. The use of bar boundary annotations for the texture frames has provided an improvement for the dance-oriented Ballroom dataset. The comparably small number of descriptors and the possibility of evaluating the influence of specific signal components on the general rhythmic content encourage the further use of the method in rhythm description tasks.
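As a rough illustration of how a beat histogram can be derived from a single novelty function, the sketch below autocorrelates a novelty curve and reads each lag as a tempo in beats per minute. It is a simplified stand-in and not the authors' feature extraction. The function name `beat_histogram`, the tempo range, and the synthetic impulse-train input are all assumptions.

```python
import math


def beat_histogram(novelty, frame_rate, bpm_min=40.0, bpm_max=200.0):
    """Return (bpm, strength) pairs from the autocorrelation of a novelty curve.

    `novelty` is a list of non-negative values sampled at `frame_rate` frames
    per second. Each autocorrelation lag is converted to a tempo in BPM.
    """
    mean = sum(novelty) / len(novelty)
    x = [v - mean for v in novelty]

    # Lags (in frames) that correspond to the requested tempo range.
    lag_min = max(1, math.ceil(60.0 * frame_rate / bpm_max))
    lag_max = min(len(x) - 1, math.floor(60.0 * frame_rate / bpm_min))

    histogram = []
    for lag in range(lag_min, lag_max + 1):
        strength = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        # Negative correlations carry no periodicity evidence; clip them.
        histogram.append((60.0 * frame_rate / lag, max(0.0, strength)))
    return histogram


# Synthetic novelty curve (hypothetical data): one impulse every 0.5 s,
# sampled at 100 frames per second, i.e. a 120 BPM pulse.
frame_rate = 100.0
novelty = [1.0 if i % 50 == 0 else 0.0 for i in range(2000)]

hist = beat_histogram(novelty, frame_rate)
peak_bpm = max(hist, key=lambda pair: pair[1])[0]
print(f"strongest periodicity near {peak_bpm:.1f} BPM")
```

Running the same function on several novelty curves (for example amplitude, tonal, and spectral change) would give one histogram per curve, which is the general idea behind describing rhythm at multiple levels.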
Music content analysis: Key, chord and rhythm tracking in acoustic signals
Master's (Master of Science)
Sparse and structured decomposition of audio signals on hybrid dictionaries using musical priors
This paper investigates the use of musical priors for sparse expansion of audio signals of music, on an overcomplete dual-resolution dictionary taken from the union of two orthonormal bases that can describe both transient and tonal components of a music audio signal. More specifically, chord and metrical structure information are used to build a structured model that takes into account dependencies between coefficients of the decomposition, both for the tonal and for the transient layer. The denoising task application is used to provide a proof of concept of the proposed musical priors. Several configurations of the model are analyzed. Evaluation on monophonic and complex polyphonic excerpts of real music signals shows that the proposed approach provides results whose quality measured by the signal-to-noise ratio is competitive with state-of-the-art approaches, and more coherent with the semantic content of the signal. A detailed analysis of the model in terms of sparsity and in terms of interpretability of the representation is also provided, and shows that the model is capable of giving a relevant and legible representation of Western tonal music audio signals.
Contributions to automatic multiple F0 detection in polyphonic music signals
Multiple fundamental frequency estimation, or multi-pitch estimation (MPE), is a key problem in automatic music transcription (AMT) and many other related audio processing tasks. Applications of AMT are numerous, ranging from musical genre classification to automatic piano tutoring, and these form a significant part of musical information retrieval tasks. Current AMT systems still perform considerably below human experts, and there is a consensus that the development of an automated system for full transcription of polyphonic music regardless of its complexity is still an open problem. The goal of this work is to propose contributions for the automatic detection of multiple fundamental frequencies in polyphonic music signals. A reference MPE method is chosen to be studied and implemented, and a modification is proposed to improve the performance of the system. Lastly, three refinement strategies are proposed to be incorporated into the modified method, in order to increase the quality of the results. Experimental tests reveal that such refinements improve the overall performance of the system, even if each one performs differently according to signal characteristics.
MorpheuS: Generating Structured Music with Constrained Patterns and Tension
Automatic music generation systems have gained in popularity and
sophistication as advances in cloud computing have enabled large-scale complex
computations such as deep models and optimization algorithms on personal
devices. Yet, they still face an important challenge, that of long-term
structure, which is key to conveying a sense of musical coherence. We present
the MorpheuS music generation system designed to tackle this problem. MorpheuS'
novel framework has the ability to generate polyphonic pieces with a given
tension profile and long- and short-term repeated pattern structures. A
mathematical model for tonal tension quantifies the tension profile and
state-of-the-art pattern detection algorithms extract repeated patterns in a
template piece. An efficient optimization metaheuristic, variable neighborhood
search, generates music by assigning pitches that best fit the prescribed
tension profile to the template rhythm while hard constraining long-term
structure through the detected patterns. This ability to generate affective
music with a specific tension profile and long-term structure is particularly
useful in a game or film music context. Music generated by the MorpheuS system
has been performed live in concerts. Comment: IEEE Transactions on Affective Computing. PP(99
Sequential Complexity as a Descriptor for Musical Similarity
We propose string compressibility as a descriptor of temporal structure in
audio, for the purpose of determining musical similarity. Our descriptors are
based on computing track-wise compression rates of quantised audio features,
using multiple temporal resolutions and quantisation granularities. To verify
that our descriptors capture musically relevant information, we incorporate our
descriptors into similarity rating prediction and song year prediction tasks.
We base our evaluation on a dataset of 15500 track excerpts of Western popular
music, for which we obtain 7800 web-sourced pairwise similarity ratings. To
assess the agreement among similarity ratings, we perform an evaluation under
controlled conditions, obtaining a rank correlation of 0.33 between intersected
sets of ratings. Combined with bag-of-features descriptors, we obtain
performance gains of 31.1% and 10.9% for similarity rating prediction and song
year prediction. For both tasks, analysis of selected descriptors reveals that
representing features at multiple time scales benefits prediction accuracy. Comment: 13 pages, 9 figures, 8 tables. Accepted versio
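The idea of using compressibility as a descriptor can be sketched in a few lines. The abstract does not name a compressor or a quantisation scheme, so this example assumes uniform quantisation and the DEFLATE compressor from Python's standard `zlib` module. The function name `compression_rate` and the two synthetic feature sequences are hypothetical.

```python
import random
import zlib


def compression_rate(values, levels=8):
    """Quantise a feature sequence to `levels` symbols (at most 256) and
    return compressed size divided by original size; lower means more
    repetitive temporal structure."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    symbols = bytes(
        min(levels - 1, int((v - lo) / span * levels)) for v in values
    )
    return len(zlib.compress(symbols, 9)) / len(symbols)


# Hypothetical feature sequences: one strictly periodic, one random.
periodic = [float(i % 4) for i in range(4000)]
rng = random.Random(0)
noisy = [rng.random() for _ in range(4000)]

rate_periodic = compression_rate(periodic)
rate_noisy = compression_rate(noisy)
print(f"periodic: {rate_periodic:.3f}, noisy: {rate_noisy:.3f}")

# Varying the quantisation granularity yields one descriptor per setting.
rates_by_level = {k: compression_rate(noisy, levels=k) for k in (4, 8, 16)}
```

Computing the rate at several quantisation granularities, and on versions of the sequence averaged over different window lengths, would give a small vector of descriptors per track in the spirit of the multi-resolution approach the abstract describes.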
09051 Abstracts Collection -- Knowledge representation for intelligent music processing
From the twenty-fifth to the thirtieth of January, 2009, the
Dagstuhl Seminar 09051 on "Knowledge representation for intelligent music
processing" was held in Schloss Dagstuhl - Leibniz Centre for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts
of the presentations and demos given during the seminar as well as
plenary presentations, reports of workshop discussions, results and
ideas are put together in this paper. The first section describes the
seminar topics and goals in general, followed by plenary 'stimulus'
papers, followed by reports and abstracts arranged by workshop,
followed finally by some concluding materials providing views of both
the seminar itself and also forward to the longer-term goals of the
discipline. Links to extended abstracts, full papers and supporting
materials are provided, if available.
The organisers thank David Lewis for editing these proceedings.