
    Improving MIDI-audio alignment with acoustic features

    This paper describes a technique to improve the accuracy of dynamic time warping-based MIDI-audio alignment. The technique implements a hidden Markov model that uses aperiodicity and power estimates from the signal as observations and the results of a dynamic time warping alignment as a prior. In addition to improving the overall alignment, this technique also identifies the transient and steady-state sections of each note. This information is important for describing various aspects of a musical performance, including both pitch and rhythm.
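    The dynamic time warping stage that this HMM refines can be sketched in a few lines. This is an illustrative sketch only: the toy power tracks and the absolute-difference local distance are stand-ins, and the paper's HMM refinement stage is not reproduced.

    ```python
    def dtw(a, b):
        """Return the DTW cost and warping path between feature sequences a and b."""
        n, m = len(a), len(b)
        INF = float("inf")
        cost = [[INF] * (m + 1) for _ in range(n + 1)]
        cost[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = abs(a[i - 1] - b[j - 1])              # local distance
                cost[i][j] = d + min(cost[i - 1][j],       # insertion
                                     cost[i][j - 1],       # deletion
                                     cost[i - 1][j - 1])   # match
        # Backtrack to recover the alignment path.
        path, i, j = [], n, m
        while i > 0 and j > 0:
            path.append((i - 1, j - 1))
            _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                          (cost[i - 1][j], i - 1, j),
                          (cost[i][j - 1], i, j - 1))
        return cost[n][m], path[::-1]

    midi_power = [0.1, 0.8, 0.9, 0.3, 0.1]          # toy "score" feature track
    audio_power = [0.1, 0.2, 0.85, 0.9, 0.25, 0.1]  # toy performance track
    total, path = dtw(midi_power, audio_power)
    ```

    The path returned by the backtracking pass is what a refinement model would take as its prior over frame correspondences.
    
    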

    Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips

    This paper discusses real-time alignment of audio signals of a music performance to the corresponding score (a.k.a. score following) that can handle tempo changes, errors, and arbitrary repeats and/or skips (repeats/skips) in performances. This type of score following is particularly useful in automatic accompaniment for practices and rehearsals, where errors and repeats/skips are common. Simple extensions of previously proposed algorithms are not applicable in these situations for scores of practical length because of their large computational complexity. To cope with this problem, we present two hidden Markov models of monophonic performance with errors and arbitrary repeats/skips, and derive efficient score-following algorithms under the assumption that the prior probability distributions of score positions before and after repeats/skips are independent of each other. We confirmed real-time operation of the algorithms with music scores of practical length (around 10,000 notes) on a modern laptop, and their ability to recover the score position within 0.7 s on average after repeats/skips in clarinet performance data. Further improvements and an extension to polyphonic signals are also discussed. Comment: 12 pages, 8 figures; version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing.
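    The computational payoff of the independence assumption described above can be sketched as one step of an online forward filter over score positions: because the jump-destination prior does not depend on the departure position, the jump mass can be pooled once in O(n) rather than evaluated per position pair in O(n^2). The transition probabilities and the Gaussian pitch-observation model here are illustrative placeholders, not the paper's models.

    ```python
    import math

    def forward_step(belief, obs_pitch, score_pitches,
                     p_stay=0.3, p_next=0.65, p_jump=0.05, sigma=0.5):
        n = len(belief)
        # Jump prior independent of the departure position: pool the mass once.
        jump_mass = p_jump * sum(belief) / n
        new = [jump_mass] * n
        for i in range(n):
            new[i] += p_stay * belief[i]           # stay on the same note
            if i > 0:
                new[i] += p_next * belief[i - 1]   # advance to the next note
        # Gaussian likelihood of the observed pitch at each score position.
        for i in range(n):
            new[i] *= math.exp(-0.5 * ((obs_pitch - score_pitches[i]) / sigma) ** 2)
        z = sum(new)
        return [x / z for x in new]

    score = [60, 62, 64, 65, 67]  # toy monophonic score (MIDI note numbers)
    belief = [1.0 / len(score)] * len(score)
    for pitch in [60, 62, 64]:
        belief = forward_step(belief, pitch, score)
    # belief now concentrates on the third score note
    ```
    
    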

    Coherent Time Modeling of semi-Markov Models with Application to Real-Time Audio-to-Score Alignment

    This paper proposes a novel insight into the problem of duration modeling for recognition setups where events are inferred from time signals using a probabilistic framework. When prior knowledge about the duration of events is available, hidden Markov or hidden semi-Markov models allow the setting of individual duration distributions but give no clue about their choice. We propose two criteria of temporal coherency for such applications and prove that they are implied by statistical properties such as infinite divisibility and log-concavity. We conclude by showing practical consequences of these properties in a real-time audio-to-score alignment experiment.
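    The log-concavity property named above is easy to test numerically for a discrete duration distribution: p is log-concave iff p[k]^2 >= p[k-1] * p[k+1] for all interior k. A sketch, contrasting a Poisson duration model (log-concave) with a bimodal two-Poisson mixture (not log-concave); the choice of these two distributions is illustrative, not from the paper.

    ```python
    import math

    def poisson_pmf(lam, k):
        return math.exp(-lam) * lam ** k / math.factorial(k)

    def is_log_concave(p, eps=1e-12):
        """Check p[k]^2 >= p[k-1] * p[k+1] for all interior indices."""
        return all(p[k] ** 2 + eps >= p[k - 1] * p[k + 1]
                   for k in range(1, len(p) - 1))

    poisson = [poisson_pmf(4.0, k) for k in range(20)]
    mixture = [0.5 * poisson_pmf(2.0, k) + 0.5 * poisson_pmf(12.0, k)
               for k in range(20)]
    # poisson passes the test; the bimodal mixture fails it in the valley
    # between its two modes.
    ```
    
    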

    Master of Science

    Multiple Instance Learning (MIL) is a type of supervised learning with missing data. Each example (a.k.a. bag) contains one or more instances. In the training set, labels are available only at the bag level; the task is to label both bags and instances in the test set. In most practical MIL problems, there is a relationship between the instances of a bag, and capturing this relationship may help learn the underlying concept better. We present an algorithm that uses the structure of bags along with the features of instances. The key idea is to allow a structured support vector machine (SVM) to "guess" the true underlying structure, so long as it is consistent with the bag labels. This idea is formalized, and a new cutting-plane algorithm is proposed for the optimization. To verify the idea, we implemented our algorithm for a particular kind of structure: hidden Markov models. We performed experiments on three datasets and found this algorithm to work better than existing MIL algorithms. We present the details of these experiments and the effects of varying the different hyperparameters. A key contribution of our work is a very simple loss function with only one hyperparameter, which can be tuned using a small portion of the training set. The thesis of this work is that it is possible and desirable to exploit the structural relationship between the instances in a bag, even though that structure is not observed at training time (i.e., the correct labels for the instances are unknown). Our work opens a new direction for solving the MIL problem, and we suggest a few ideas to take it further.
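    The "guess a structure consistent with the bag labels" idea can be sketched as constrained latent inference: among all instance labelings compatible with the bag label, keep the highest-scoring one. A real implementation scores labelings with an HMM inside a structured SVM; the per-instance linear score and exhaustive search below are illustrative stand-ins only.

    ```python
    from itertools import product

    def best_consistent_labeling(instance_scores, bag_label):
        """Exhaustively search instance labelings consistent with the bag label."""
        best, best_score = None, float("-inf")
        for labeling in product([0, 1], repeat=len(instance_scores)):
            # Consistency: a positive bag needs at least one positive instance,
            # a negative bag needs all-negative instances.
            if bag_label == 1 and not any(labeling):
                continue
            if bag_label == 0 and any(labeling):
                continue
            score = sum(s if y == 1 else -s
                        for s, y in zip(instance_scores, labeling))
            if score > best_score:
                best, best_score = labeling, score
        return best, best_score

    labeling, score = best_consistent_labeling([-2.0, 0.5, -1.0], bag_label=1)
    ```

    In the cutting-plane setting this inner maximization is what generates the most violated constraint at each iteration; dynamic programming over the HMM structure replaces the exhaustive search.
    
    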

    A Coupled Duration-Focused Architecture for Real-Time Music-to-Score Alignment

    The capacity for real-time synchronization and coordination is a common ability among trained musicians performing a music score, and it presents an interesting challenge for machine intelligence. Compared to speech recognition, which has influenced many music information retrieval systems, music's temporal dynamics and complexity pose challenging problems for common approximations regarding the time modeling of data streams. In this paper, we propose a design for a real-time music-to-score alignment system. Given a live recording of a musician playing a music score, the system is capable of following the musician within the score in real time and of decoding the tempo (or pace) of the performance. The proposed design features two coupled audio and tempo agents within a unique probabilistic inference framework that adaptively updates its parameters based on the real-time context. Online decoding is achieved through the collaboration of the coupled agents in a hidden hybrid Markov/semi-Markov framework, where the prediction feedback of one agent affects the behavior of the other. We evaluate both the real-time alignment and the proposed temporal model. An implementation of the presented system has been widely used in real concert situations worldwide, and readers are encouraged to access the actual system and experiment with the results.
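    The tempo-agent side of such a coupled design can be sketched as a smoothing update: the current tempo estimate is pulled toward the tempo locally implied by the observed duration of the note just completed. This geometric-smoothing rule and its `rate` parameter are illustrative assumptions, not the paper's inference.

    ```python
    def update_tempo(tempo_bpm, score_beats, observed_seconds, rate=0.5):
        """Blend the current tempo with the tempo implied by the last note.

        score_beats: notated duration of the completed note, in beats.
        observed_seconds: how long the performer actually took.
        rate: 0 ignores the observation, 1 adopts the implied tempo outright.
        """
        implied_bpm = score_beats * 60.0 / observed_seconds
        # Geometric interpolation keeps the estimate positive and symmetric
        # with respect to speeding up vs. slowing down.
        return tempo_bpm ** (1.0 - rate) * implied_bpm ** rate

    tempo = 120.0
    tempo = update_tempo(tempo, score_beats=1.0, observed_seconds=1.0)
    # a quarter note held for a full second implies 60 BPM, so the
    # estimate moves partway from 120 BPM toward 60 BPM
    ```
    
    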

    Song following by automatic speech recognition and temporal alignment

    Score following is defined as the computer-based synchronization between a known musical score and the audio signal of a performer playing that score. In the particular case of the singing voice, there is still room to improve existing algorithms, especially for real-time score following. The goal of this project is therefore to build a robust, real-time score-following system that uses the digitized singing-voice signal and the song lyrics. The proposed software uses several features of the sung voice (energy, vowel matching, and the signal's zero-crossing count) and matches them against the musical score in MusicXML format. These features, extracted for each frame, are aligned to the phonetic units of the score. In parallel with this short-term alignment, the system adds a second, more reliable level of position estimation by associating a segmentation of the signal into singing blocks with continuously sung sections of the score. The system's performance is evaluated by presenting offline alignments obtained on 3 song excerpts performed by 2 different singers, a man and a woman, in English and in French.
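    Two of the per-frame features named above (energy and zero-crossing count) are straightforward to compute; this sketch does so on a toy sine signal. The frame length and test tone are arbitrary choices, and the vowel-matching feature and MusicXML parsing are not reproduced.

    ```python
    import math

    def frame_features(signal, frame_len=256):
        """Return (energy, zero-crossing count) for each non-overlapping frame."""
        feats = []
        for start in range(0, len(signal) - frame_len + 1, frame_len):
            frame = signal[start:start + frame_len]
            energy = sum(x * x for x in frame) / frame_len
            zero_crossings = sum(
                1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
            feats.append((energy, zero_crossings))
        return feats

    sr = 8000  # sample rate, Hz
    tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(1024)]
    feats = frame_features(tone)
    # a steady 440 Hz tone yields roughly constant energy (~0.5) and
    # about 2 * 440 * 256 / 8000 ≈ 28 zero crossings per frame
    ```
    
    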