169 research outputs found

    Automatic annotation of musical audio for interactive applications

    PhD thesis. As machines become more portable and part of our everyday life, it becomes apparent that developing interactive and ubiquitous systems is an important aspect of new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real-time context. Amongst these annotation techniques, we concentrate on low- and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performance of different algorithms. The first task is to detect onsets and offsets in audio streams within short latencies. The segmentation of audio streams into temporal objects enables various manipulations and analyses of metrical structure. Evaluation of different algorithms and their adaptation to real time are described. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real time and tested on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments; the estimation of higher-level descriptions is approached. Techniques for modelling of note objects and localisation of beats are implemented and discussed. Applications of our framework include live and interactive music installations, and more generally tools for composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, for our research purposes and in view of its integration within other systems.
    Funding: EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents); EPSRC grants GR/R54620 and GR/S75802/01.
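
    As a rough illustration of what low-latency onset detection involves, the sketch below flags an onset whenever the energy of a short frame rises sharply above that of the previous frame. It is a minimal energy-difference detector, not one of the algorithms evaluated in the thesis; the function names, frame size, hop size and threshold are all assumptions chosen for the toy signal.

```python
import math

def frame_energies(signal, frame_size=256, hop=128):
    """Mean squared energy of each overlapping frame."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame) / frame_size)
    return energies

def detect_onsets(signal, sample_rate, frame_size=256, hop=128, threshold=0.01):
    """Return onset times in seconds where frame energy rises sharply.

    The decision for a frame uses only that frame and the previous one,
    so the latency is bounded by one frame (hypothetical parameters).
    """
    onsets = []
    previous = 0.0
    rising = False  # suppress repeated triggers during a single attack
    for i, energy in enumerate(frame_energies(signal, frame_size, hop)):
        increase = energy - previous
        if increase > threshold:
            if not rising:
                onsets.append(i * hop / sample_rate)
            rising = True
        else:
            rising = False
        previous = energy
    return onsets

# Toy input: half a second of silence followed by a 440 Hz tone at 8 kHz.
sample_rate = 8000
silence = [0.0] * 4000
tone = [0.8 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(4000)]
onsets = detect_onsets(silence + tone, sample_rate)
```

    The reported time is the start of the first frame that overlaps the tone, so the resolution is limited by the hop size; a smaller hop lowers latency at a higher computational cost.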

    Extended Nonnegative Tensor Factorisation Models for Musical Sound Source Separation

    Recently, shift-invariant tensor factorisation algorithms have been proposed for the purposes of sound source separation of pitched musical instruments. However, in practice, existing algorithms require the use of log-frequency spectrograms to allow shift invariance in frequency, which causes problems when attempting to resynthesise the separated sources. Further, it is difficult to impose harmonicity constraints on the recovered basis functions. This paper proposes a new additive synthesis-based approach which allows the use of linear-frequency spectrograms as well as imposing strict harmonic constraints, resulting in an improved model. Further, these additional constraints allow the addition of a source-filter model to the factorisation framework, and an extended model which is capable of separating mixtures of pitched and percussive instruments simultaneously.

    Computational Models of Expressive Music Performance: A Comprehensive and Critical Review

    Expressive performance is an indispensable part of music making. When playing a piece, expert performers shape various parameters (tempo, timing, dynamics, intonation, articulation, etc.) in ways that are not prescribed by the notated score, in this way producing an expressive rendition that brings out dramatic, affective, and emotional qualities that may engage and affect the listeners. Given the central importance of this skill for many kinds of music, expressive performance has become an important research topic for disciplines like musicology, music psychology, etc. This paper focuses on a specific thread of research: work on computational music performance models. Computational models are attempts at codifying hypotheses about expressive performance in terms of mathematical formulas or computer programs, so that they can be evaluated in systematic and quantitative ways. Such models can serve at least two purposes: they permit us to systematically study certain hypotheses regarding performance; and they can be used as tools to generate automated or semi-automated performances, in artistic or educational contexts. This article presents an up-to-date overview of the state of the art in this domain. We explore recent trends in the field, such as a strong focus on data-driven (machine learning) approaches; a growing interest in interactive expressive systems, such as conductor simulators and automatic accompaniment systems; and an increased interest in exploring cognitively plausible features and models. We provide an in-depth discussion of several important design choices in such computer models, and discuss a crucial (and still largely unsolved) problem that is hindering systematic progress: the question of how to evaluate such models in scientifically and musically meaningful ways. From all this, we finally derive some research directions that should be pursued with priority, in order to advance the field and our understanding of expressive music performance.

    Data-based melody generation through multi-objective evolutionary computation

    Genetic-based composition algorithms are able to explore an immense space of possibilities, but the main difficulty has always been the implementation of the selection process. In this work, sets of melodies are used to train a machine learning approach that computes fitness based on different metrics. The fitness of a candidate is obtained by combining the metrics, but their values can range over different orders of magnitude and evolve in different ways, which makes it hard to combine these criteria. To solve this problem, a multi-objective fitness approach is proposed, in which the best individuals are those on the Pareto front of the multi-dimensional fitness space. Melodic trees are also proposed as a data structure for the chromosomic representation of melodies, and genetic operators are adapted to them. Some experiments have been carried out using a graphical interface prototype that allows one to explore the creative capabilities of the proposed system. An Online Supplement is provided and can be accessed at http://dx.doi.org/10.1080/17459737.2016.1188171, where the reader can find some technical details, information about the data used, generated melodies, and additional information about the developed prototype and its performance.
    Funding: This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte [FPU fellowship AP2012-0939]; and the Spanish Ministerio de Economía y Competitividad project TIMuL supported by EU FEDER funds [No. TIN2013–48152–C2–1–R].
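
    The selection idea in this abstract, keeping the individuals that no other individual beats on every metric at once, can be sketched in a few lines. This is a generic Pareto-front filter under the assumption that all metrics are to be maximised; the candidate names and fitness values are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives assumed to be maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population):
    """Keep the candidates whose fitness vector no other candidate dominates."""
    return [
        (name, fitness)
        for name, fitness in population
        if not any(dominates(other, fitness) for _, other in population)
    ]

# Hypothetical candidates scored on two metrics with very different scales.
population = [
    ("melody_a", (0.9, 120.0)),
    ("melody_b", (0.4, 300.0)),
    ("melody_c", (0.3, 100.0)),
    ("melody_d", (0.9, 90.0)),
]
front = pareto_front(population)
```

    Because dominance only compares values objective by objective, the metrics never need to be rescaled or summed, which is what makes the approach attractive when their magnitudes differ.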

    Non-Standard Sound Synthesis with Dynamic Models

    Full version unavailable due to 3rd party copyright restrictions.
    This Thesis proposes three main objectives: (i) to provide the concept of a new generalized non-standard synthesis model that would provide the framework for incorporating other non-standard synthesis approaches; (ii) to explore dynamic sound modeling through the application of new non-standard synthesis techniques and procedures; and (iii) to experiment with dynamic sound synthesis for the creation of novel sound objects. In order to achieve these objectives, this Thesis introduces a new paradigm for non-standard synthesis that is based on the algorithmic assemblage of minute wave segments to form sound waveforms. This paradigm is called Extended Waveform Segment Synthesis (EWSS) and incorporates a hierarchy of algorithmic models for the generation of microsound structures. The concepts of EWSS are illustrated with the development and presentation of a novel non-standard synthesis system, the Dynamic Waveform Segment Synthesis (DWSS). DWSS features and combines a variety of algorithmic models for direct synthesis generation: list generation and permutation, tendency masks, trigonometric functions, stochastic functions, chaotic functions and grammars. The core mechanism of DWSS is based on an extended application of Cellular Automata. The potential of the synthetic capabilities of DWSS is explored in a series of Case Studies where a number of sound objects were generated, revealing (i) the capabilities of the system to generate sound morphologies belonging to other non-standard synthesis approaches and (ii) the capabilities of the system to generate novel sound objects with dynamic morphologies. The introduction of EWSS and DWSS is preceded by an extensive and critical overview of the concepts of microsound synthesis, algorithmic composition, the two cultures of computer music, the heretical approach in composition, non-standard synthesis and sonic emergence, along with a thorough examination of algorithmic models and their application in sound synthesis and electroacoustic composition. This Thesis also proposes (i) a new definition for “algorithmic composition”, (ii) the term “totalistic algorithmic composition”, and (iii) four discrete aspects of non-standard synthesis.
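
    To make the idea of assembling a waveform from short, algorithmically generated segments more concrete, the sketch below drives segment generation with an elementary one-dimensional cellular automaton. It is only a toy under stated assumptions and does not reproduce DWSS: the rule number, the mapping from cell density to amplitude, and the linear-ramp segment shape are all invented for illustration.

```python
def step(cells, rule=110):
    """Advance a binary one-dimensional cellular automaton by one generation,
    with wrap-around neighbours. `rule` is the usual 8-bit rule number."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def segment_from_cells(cells, length=8):
    """Hypothetical mapping: a linear ramp from zero towards an amplitude
    given by the density of live cells, scaled to the range [-1, 1]."""
    target = 2.0 * sum(cells) / len(cells) - 1.0
    return [target * (k + 1) / length for k in range(length)]

def synthesise(generations=16, width=16, rule=110):
    """Concatenate one short segment per automaton generation."""
    cells = [0] * width
    cells[width // 2] = 1  # single live cell as the seed
    waveform = []
    for _ in range(generations):
        waveform.extend(segment_from_cells(cells))
        cells = step(cells, rule)
    return waveform

waveform = synthesise()
```

    Changing the rule number or the seed changes how the cell density evolves, and therefore the contour of the resulting waveform.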

    Non-negative mixtures

    This is the author's accepted pre-print of the article, first published as M. D. Plumbley, A. Cichocki and R. Bro. Non-negative mixtures. In P. Comon and C. Jutten (Eds.), Handbook of Blind Source Separation: Independent Component Analysis and Applications. Chapter 13, pp. 515-547. Academic Press, Feb 2010. ISBN 978-0-12-374726-6. DOI: 10.1016/B978-0-12-374726-6.00018-7

    Nonnegative Matrix Factorization with Gaussian Process Priors

    We present a general method for including prior knowledge in a nonnegative matrix factorization (NMF), based on Gaussian process priors. We assume that the nonnegative factors in the NMF are linked by a strictly increasing function to an underlying Gaussian process specified by its covariance function. This allows us to find NMF decompositions that agree with our prior knowledge of the distribution of the factors, such as sparseness, smoothness, and symmetries. The method is demonstrated with an example from chemical shift brain imaging.

    Singing voice resynthesis using concatenative-based techniques

    Doctoral thesis. Informatics Engineering. Faculdade de Engenharia, Universidade do Porto. 201

    MMixte: a software architecture for Live Electronics with acoustic instruments : exemplary application cases

    MMixte is a middleware based on Max for mixed music with live electronics. It enables the programming of a “patcher concerto”, that is, a platform for managing live electronics, in just a few minutes and with extreme simplicity. Aimed at intermediate and expert users, MMixte enables true programming of live electronics in very little time while also making it easy to adapt previously developed modules, depending on the case and its needs. The architecture behind MMixte is based on a variation of the so-called “pipeline architecture”; an analysis of the most widely used software architectures on the market, and of design patterns for programming graphical interfaces, has led to the conception of ways of organizing communication between the various modules, the way they are used, and their graphical appearance. An analysis of other “state of the art” module collections and software programs dedicated to mixed music shows the absence of any other work on software architecture for mixed music. Application of MMixte to some of my personal works demonstrates its flexibility and ease of adaptation. Computer programming for a piece of mixed music requires much that goes beyond the programming of audio signal processing. The present work seeks to provide an example of a solution to such needs.