10 research outputs found

    Improving polyphonic and poly-instrumental music to score alignment

    6 pp. International audience. Music alignment links events in a score to points on the time axis of an audio performance. All parts of a recording can thus be indexed according to score information. The automatic alignment presented in this paper is based on a dynamic time warping method. Local distances are computed from the signal's spectral features through an attack-plus-sustain note model. The method is applied to mixtures of harmonic sustained instruments, excluding percussion for the moment. Good alignment has been obtained for polyphony of up to five instruments. The method is robust to difficulties such as trills, vibratos and fast sequences. It provides an accurate indicator of the position of score interpretation errors and of extra or forgotten notes. Implementation optimizations allow long sound files to be aligned in a relatively short time. Evaluation results have been obtained on piano jazz recordings.
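As a rough illustration of the dynamic time warping step mentioned in this abstract, the following minimal sketch aligns a short score sequence to a longer sequence of audio frames. It is not the paper's implementation: the function name `dtw_path`, the toy one-dimensional features, and the absolute-difference local distance are all assumptions made for the example, standing in for the spectral attack-plus-sustain distances the abstract describes.

```python
import numpy as np


def dtw_path(cost):
    """Return (path, total_cost) for the cheapest monotonic alignment.

    `cost[i, j]` is the local distance between score event i and audio
    frame j (hypothetical input; any non-negative matrix works).
    """
    n, m = cost.shape
    # Accumulated cost with a padded border of +inf so edges need no special case.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev

    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),  # diagonal step
            (acc[i - 1, j], i - 1, j),          # advance in the score only
            (acc[i, j - 1], i, j - 1),          # advance in the audio only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return path[::-1], float(acc[n, m])


# Toy data (assumed): three score events, six audio frames.
score = np.array([0.0, 1.0, 2.0])
audio = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 2.0])
local = np.abs(score[:, None] - audio[None, :])
path, total = dtw_path(local)
```

Each pair in `path` maps a score event index to an audio frame index, which is the kind of score-to-time indexing the abstract refers to. A real system would replace `local` with distances computed from spectral features.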

    Improving Polyphonic and Poly-Instrumental Music to Score Alignment

    Music alignment links events in a score to points on the time axis of an audio performance. All parts of a recording can thus be indexed according to score information. The automatic alignment presented in this paper is based on a dynamic time warping methodology. Local distances are computed from the signal's spectral features through an attack-plus-sustain note model. Good alignment has been obtained for polyphony of up to five instruments. The method is robust to difficulties such as trills, vibratos and fast sequences. It provides an accurate indicator of the position of score interpretation errors and of extra or forgotten notes. Implementation optimizations allow long sound files to be aligned in a relatively short time. Evaluation results have been obtained on piano jazz recordings.

    PIANO SCORE FOLLOWING WITH HIDDEN TIMBRE OR TEMPO USING SWITCHING KALMAN FILTERS

    Thesis (Ph.D.) - Indiana University, University Graduate School/Luddy School of Informatics, Computing, and Engineering, 2020. Score following is an AI technique that enables computer programs to “listen to” music: to track a live musical performance in relation to its written score, even through variations in tempo and amplitude. This ability can be transformative for musical practice, performance, education, and composition. Although score following has been successful on monophonic music (one note at a time), it has difficulty with polyphonic music. One of the greatest challenges is piano music, which is highly polyphonic. This dissertation investigates ways to overcome the challenges of polyphonic music, and casts light on the nature of the problem through empirical experiments. I propose two new approaches inspired by two important aspects of music that humans perceive during a performance: the pitch profile of the sound, and the timing. In the first approach, I account for changing timbre within a chord by tracking harmonic amplitudes to improve matching between the score and the sound. In the second approach, I model tempo in music, allowing it to deviate from the default tempo value within reasonable statistical constraints. For both methods, I develop switching Kalman filter models that are interesting in their own right. I have conducted experiments on 50 excerpts of real piano performances, and analyzed the results both case-by-case and statistically. The results indicate that modeling tempo is essential for piano score following, and the second method significantly outperformed the state-of-the-art baseline. The first method, although it did not show improvement over the baseline, still represents a promising new direction for future research. Taken together, the results contribute to a more nuanced and multifaceted understanding of the score-following problem.

    Signal Processing Methods for Music Synchronization, Audio Matching, and Source Separation

    The field of music information retrieval (MIR) aims at developing techniques and tools for organizing, understanding, and searching multimodal information in large music collections in a robust, efficient and intelligent manner. In this context, this thesis presents novel, content-based methods for music synchronization, audio matching, and source separation. In general, music synchronization denotes a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation. Here, the thesis presents three complementary synchronization approaches, which improve upon previous methods in terms of robustness, reliability, and accuracy. The first approach employs a late-fusion strategy based on multiple, conceptually different alignment techniques to identify those music passages that allow for reliable alignment results. The second approach is based on the idea of employing musical structure analysis methods in the context of synchronization to derive reliable synchronization results even in the presence of structural differences between the versions to be aligned. Finally, the third approach employs several complementary strategies for increasing the accuracy and time resolution of synchronization results. Given a short query audio clip, the goal of audio matching is to automatically retrieve all musically similar excerpts in different versions and arrangements of the same underlying piece of music. In this context, chroma-based audio features are a well-established tool as they possess a high degree of invariance to variations in timbre. This thesis describes a novel procedure for making chroma features even more robust to changes in timbre while keeping their discriminative power. Here, the idea is to identify and discard timbre-related information using techniques inspired by the well-known MFCC features, which are usually employed in speech processing. 
Given a monaural music recording, the goal of source separation is to extract musically meaningful sound sources corresponding, for example, to a melody, an instrument, or a drum track from the recording. To facilitate this complex task, one can exploit additional information provided by a musical score. Based on this idea, this thesis presents two novel, conceptually different approaches to source separation. Using score information provided by a given MIDI file, the first approach employs a parametric model to describe a given audio recording of a piece of music. The resulting model is then used to extract sound sources as specified by the score. As a computationally less demanding and easier-to-implement alternative, the second approach employs the additional score information to guide a decomposition based on non-negative matrix factorization (NMF).
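One common way to let a score guide an NMF decomposition is to zero out activations wherever the score says a pitch is silent, since multiplicative updates keep zeros at zero. The sketch below illustrates that general idea only; it is not taken from the thesis. The function name `score_informed_nmf`, the Euclidean cost, the random toy data, and the binary `score_mask` are assumptions introduced for the example.

```python
import numpy as np


def score_informed_nmf(V, W0, H0, n_iter=200, eps=1e-9):
    """Factor V ~= W @ H with multiplicative updates (Euclidean cost).

    Entries of H0 that are exactly zero stay zero, so a score-derived
    mask applied to H0 constrains when each template may be active.
    """
    W, H = W0.copy(), H0.copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


rng = np.random.default_rng(0)

# Hypothetical score mask: template 0 active in frames 0-3, template 1 in frames 4-7.
score_mask = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                       [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)

# Synthetic non-negative "spectrogram" (6 frequency bins, 8 frames) that obeys the mask.
W_true = rng.random((6, 2))
H_true = rng.random((2, 8)) * score_mask
V = W_true @ H_true

# Random non-negative initialisation, with activations masked by the score.
W0 = rng.random((6, 2)) + 0.1
H0 = (rng.random((2, 8)) + 0.1) * score_mask

W, H = score_informed_nmf(V, W0, H0)
initial_error = np.linalg.norm(V - W0 @ H0)
final_error = np.linalg.norm(V - W @ H)
```

After fitting, the contribution of source `k` can be estimated as `np.outer(W[:, k], H[k])`, which is one simple way to turn the factorization into separated sources.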

    Une approche problèmes inverses pour la reconstruction de données multi-dimensionnelles par méthodes d'optimisation.

    This work presents an "inverse problems" approach for reconstruction in two different fields: digital holography and blind deconvolution. The "inverse problems" approach consists in investigating the causes from their effects, i.e. estimating the parameters describing a system from its observation. In general, the same causes produce the same effects; the same effects can, however, have different causes. To remove ambiguities, it is necessary to introduce a priori information. In this work, the parameters are estimated using optimization methods to minimize a cost function which consists of a likelihood term plus some prior terms. We use this approach to address the problem of blind deconvolution of heterogeneous multidimensional data. Heterogeneous means that the different dimensions have different meanings and units (for instance position and wavelength). For that, we have established a general framework with a separable prior which has been successfully adapted to different applications: deconvolution of multi-spectral data in astronomy, of Bayer color images, and blind deconvolution of bio-medical video sequences (in coronarography, conventional and confocal microscopy). We also applied this framework to digital holography for particle image velocimetry (DH-PIV). Using a model of the hologram formation, we use this "inverse problems" approach to circumvent the artifacts produced by the classical hologram restitution methods (distortions close to the image boundaries, multiple focusing, twin-images).
The proposed algorithm detects micro-particles in a volume 16 times larger than the camera field of view and with a precision improved by a factor of 5 compared with classical techniques.
This work uses the "inverse problems" approach for reconstruction in two different fields: digital holography of micro-particles and blind deconvolution. The "inverse problems" approach consists in searching for causes from their effects, that is, estimating the parameters describing a system from its observation. To do so, a physical model is used that describes the cause-and-effect links between the parameters and the observations. The term "inverse" thus refers to the inversion of this direct model. However, while the same causes generally produce the same effects, a single effect can have different causes, and it is often necessary to introduce priors to restrict the ambiguities of the inversion. In this work, the problem is solved by using optimization methods to estimate the parameters that minimize a cost function combining a term derived from the data formation model and a prior term. We use this approach to address the blind deconvolution of heterogeneous multidimensional data, that is, data whose dimensions have different meanings and units. To this end, we established a general framework with a separable prior term, which we successfully adapted to different applications: deconvolution of multi-spectral data in astronomy, of color images in Bayer imaging, and blind deconvolution of bio-medical video sequences (coronarography, conventional and confocal microscopy). The same approach was used in digital holography for particle image velocimetry (DH-PIV).
A hologram of spherical micro-particles is composed of diffraction patterns containing information on the 3D position and the radius of these particles. Using a physical model of hologram formation, the "inverse problems" approach allowed us to avoid the problems linked to hologram restitution (edge effects, twin images...) and to estimate the 3D positions and radii of the particles with a precision improved by at least a factor of 5 compared with classical restitution-based methods. In addition, this method allowed us to detect particles outside the sensor's field of view, thereby enlarging the volume of interest by a factor of 16.
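The cost function described in this abstract (a likelihood term plus prior terms) is commonly written in the following generic form. The notation is an assumption made for illustration, not the thesis's own: $y$ is the observed data, $x$ the unknown object, $h$ the unknown blur, $\mathcal{R}_x$ and $\mathcal{R}_h$ the priors, and $\mu_x, \mu_h \ge 0$ their weights. The quadratic data term further assumes white Gaussian noise.

```latex
% Generic regularized inversion: likelihood term plus weighted prior
\hat{x} = \arg\min_{x} \; \bigl\{ \mathcal{L}(y \mid x) + \mu_x \, \mathcal{R}_x(x) \bigr\}

% Blind deconvolution variant: object x and blur h are both unknown,
% with data model y = h * x + n (n: noise, * : convolution)
(\hat{x}, \hat{h}) = \arg\min_{x,\, h} \; \bigl\{ \lVert y - h * x \rVert_2^2
    + \mu_x \, \mathcal{R}_x(x) + \mu_h \, \mathcal{R}_h(h) \bigr\}
```

Without the prior terms, many pairs $(x, h)$ explain the same $y$ equally well, which is the ambiguity the abstract says a priori information is needed to remove.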

    Separation of musical sources and structure from single-channel polyphonic recordings

    EThOS - Electronic Theses Online Service, GB, United Kingdom