
    HReMAS: Hybrid Real-time Musical Alignment System

    This paper presents a real-time audio-to-score alignment system for musical applications. The aim of such systems is to synchronize a live musical performance with its symbolic representation in a music sheet. We have taken our previous real-time alignment system as a base and enhanced it with a traceback stage, a stage used in offline alignment to improve the accuracy of the aligned notes. This stage introduces some delay, which forces a trade-off between output delay and alignment accuracy that must be considered in the design of this type of hybrid technique. We have also made our former system execute faster in order to minimize this delay. Other improvements, such as the identification of silence frames, have also been incorporated into the proposed system.

    This work has been supported by the "Ministerio de Economía y Competitividad" of Spain and FEDER under Projects TEC2015-67387-C4-{1,2,3}-R.

    Cabañas-Molero, P.; Cortina-Parajón, R.; Combarro, E.F.; Alonso-Jordá, P.; Bris-Peñalver, F.J. (2019). HReMAS: Hybrid Real-time Musical Alignment System. The Journal of Supercomputing 75(3):1001-1013. https://doi.org/10.1007/s11227-018-2265-1
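    The hybrid scheme described above — an online forward pass that accumulates alignment costs, plus an offline-style traceback that later refines the reported path — can be sketched as a toy. This is an illustration of the general technique only, not the HReMAS implementation; the function names, the scalar features, and the unconstrained step rule are all assumptions.

```python
import numpy as np

def accumulate_costs(dist):
    """Forward (online) pass: dist[i, j] is the local cost between
    performance frame i and score frame j; D accumulates the cheapest
    path cost reaching each cell."""
    n, m = dist.shape
    D = np.full((n, m), np.inf)
    D[0, 0] = dist[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                D[i - 1, j] if i > 0 else np.inf,                # performance advances
                D[i, j - 1] if j > 0 else np.inf,                # score advances
                D[i - 1, j - 1] if i > 0 and j > 0 else np.inf,  # both advance
            )
            D[i, j] = dist[i, j] + prev
    return D

def traceback(D):
    """Offline-style traceback: walk from the last cell back to (0, 0),
    always moving to the cheapest predecessor. The refined path only
    becomes available after the covered frames have been heard, which is
    exactly the delay/accuracy trade-off the abstract mentions."""
    i, j = D.shape[0] - 1, D.shape[1] - 1
    path = [(i, j)]
    while i > 0 or j > 0:
        steps = []
        if i > 0 and j > 0:
            steps.append((D[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((D[i - 1, j], i - 1, j))
        if j > 0:
            steps.append((D[i, j - 1], i, j - 1))
        _, i, j = min(steps)
        path.append((i, j))
    return path[::-1]
```

    For instance, with `dist = np.abs(np.subtract.outer(perf, score))` the traceback recovers a monotone frame-to-note path from the accumulated matrix.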

    Parallel Online Time Warping for Real-Time Audio-to-Score Alignment in Multi-core Systems

    The Audio-to-Score framework consists of two separate stages: preprocessing and alignment. The alignment is commonly solved through offline Dynamic Time Warping (DTW), a method that finds the minimum-cost path over the distortion matrix to determine the relation between the performance time and the musical score time. In this work we propose a parallel online DTW solution based on a client-server architecture. The current version of the application has been implemented for multi-core architectures (x86, x64 and ARM), thus covering both powerful systems and mobile devices. Extensive experimentation has been conducted in order to validate the software. The experiments also show that our framework achieves a good score alignment within the real-time window by using parallel computing techniques.

    This work has been partially supported by the Spanish Ministry of Science and Innovation and FEDER under Projects TEC2012-38142-C04-01, TEC2012-38142-C04-03, TEC2012-38142-C04-04, TEC2015-67387-C4-1-R, TEC2015-67387-C4-3-R, TEC2015-67387-C4-4-R, the European Union FEDER (CAPAP-H5 network TIN2014-53522-REDT), and the Generalitat Valenciana under Grant PROMETEOII/2014/003.

    Alonso-Jordá, P.; Cortina, R.; Rodríguez-Serrano, F.; Vera-Candeas, P.; Alonso-González, M.; Ranilla, J. (2017). Parallel Online Time Warping for Real-Time Audio-to-Score Alignment in Multi-core Systems. The Journal of Supercomputing 73(1):126-138. https://doi.org/10.1007/s11227-016-1647-5
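    Online DTW differs from the offline form in that the cost matrix grows one row per incoming performance frame and only a window of score positions around the current estimate is evaluated. The sketch below shows that idea in a much-simplified, single-threaded form with scalar features; the `online_dtw` name, the window rule, and the position estimate are illustrative assumptions, not the paper's parallel client-server implementation.

```python
import numpy as np

def online_dtw(score_feats, perf_stream, window=3):
    """Incrementally align an incoming stream of performance frames
    against a fixed sequence of score features, returning the running
    score-position estimate after each frame."""
    m = len(score_feats)
    prev_row = None
    pos = 0                          # current score-position estimate
    estimates = []
    for frame in perf_stream:        # frames arrive one at a time
        lo = max(0, pos - window)    # only evaluate a band of score
        hi = min(m, pos + window + 1)  # positions near the estimate
        row = np.full(m, np.inf)
        for j in range(lo, hi):
            local = abs(score_feats[j] - frame)
            if prev_row is None:
                row[j] = local if j == 0 else np.inf
            else:
                row[j] = local + min(
                    prev_row[j],                              # stay on score frame
                    prev_row[j - 1] if j > 0 else np.inf,     # advance in score
                    row[j - 1] if j > 0 else np.inf,          # skip within row
                )
        pos = int(np.argmin(row[lo:hi])) + lo
        estimates.append(pos)
        prev_row = row
    return estimates
```

    Restricting each row to the band is what keeps the per-frame work bounded and hence compatible with the real-time window; the paper's contribution is parallelizing this per-frame work across cores.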

    Analysis on Using Synthesized Singing Techniques in Assistive Interfaces for Visually Impaired to Study Music

    Tactile and auditory senses are the basic channels through which visually impaired people sense the world, and their interaction with assistive technologies likewise focuses mainly on tactile and auditory interfaces. This paper discusses the validity of using the most appropriate singing-synthesis techniques as a mediator in assistive technologies built specifically to address the music learning needs of visually impaired users working with music scores and lyrics. Music scores with notations and lyrics are the main mediators in the musical communication channel between a composer and a performer. Visually impaired music lovers have little opportunity to access this mediator, since most scores exist only in visual format. In a music score, the vocal performer's melody is wedded to all the pleasant sound producible in the form of singing. Singing fits best as a format in the temporal domain, compared to a tactile format in the spatial domain. Therefore, conversion of the existing visual format to a singing output is the most appropriate lossless transition, as demonstrated by the initial research on an adaptive music score trainer for the visually impaired [1]. To extend this initial research, the present study surveys existing singing-synthesis techniques and research on auditory interfaces.

    FastDTW is approximate and Generally Slower than the Algorithm it Approximates

    Many time series data mining problems can be solved with repeated use of a distance measure. Examples of such tasks include similarity search, clustering, classification, anomaly detection and segmentation. For over two decades it has been known that the Dynamic Time Warping (DTW) distance measure is the best measure to use for most tasks, in most domains. Because the classic DTW algorithm has quadratic time complexity, many ideas have been introduced to reduce its amortized time or to approximate it quickly. One of the most cited approximate approaches is FastDTW, which has well over a thousand citations and has been explicitly used in several hundred research efforts. In this work, we make a surprising claim: in any realistic data mining application, the approximate FastDTW is much slower than the exact DTW. This fact has clear implications for the community that uses this algorithm, allowing it to address much larger datasets, get exact results, and do so in less time.
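    For reference, the exact quadratic-time algorithm that FastDTW approximates is short: fill an (n+1) x (m+1) accumulated-cost table and read the distance in the corner. This is the textbook form with absolute-difference local cost, O(n*m) in both time and memory; practical implementations add banding and early abandoning.

```python
import numpy as np

def dtw_distance(x, y):
    """Exact DTW distance between two 1-D sequences."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # x advances
                                 D[i, j - 1],      # y advances
                                 D[i - 1, j - 1])  # both advance
    return D[n, m]
```

    Warping absorbs local timing differences: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0.0, although the sequences differ in length.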

    Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models

    This paper addresses the problem of time series forecasting for non-stationary signals and multiple future steps prediction. To handle this challenging task, we introduce DILATE (DIstortion Loss including shApe and TimE), a new objective function for training deep neural networks. DILATE aims at accurately predicting sudden changes, and explicitly incorporates two terms supporting precise shape and temporal change detection. We introduce a differentiable loss function suitable for training deep neural nets, and provide a custom back-prop implementation for speeding up optimization. We also introduce a variant of DILATE, which provides a smooth generalization of temporally-constrained Dynamic Time Warping (DTW). Experiments carried out on various non-stationary datasets reveal the very good behaviour of DILATE compared to models trained with the standard Mean Squared Error (MSE) loss function, and also to DTW and variants. DILATE is also agnostic to the choice of the model, and we highlight its benefit for training fully connected networks as well as specialized recurrent architectures, showing its capacity to improve over state-of-the-art trajectory forecasting approaches.
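    The two ingredients the abstract names can be illustrated with a non-differentiable stand-in: a shape term (the DTW cost between prediction and target) and a temporal term (how far the optimal warping path strays from the diagonal, i.e. how much the timing was distorted to achieve that shape match). This is only an illustration of the decomposition; the actual DILATE loss replaces both terms with smooth relaxations so they can be backpropagated through.

```python
import numpy as np

def shape_and_time_terms(pred, target):
    """Return (shape, temporal): squared-cost DTW between the two
    series, and the mean deviation of the optimal warping path from
    the diagonal (0 means no timing distortion was needed)."""
    n, m = len(pred), len(target)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = (pred[i - 1] - target[j - 1]) ** 2 + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    shape = D[n, m]
    # Trace the optimal path back and measure its off-diagonal drift.
    i, j, deviation, steps = n, m, 0.0, 0
    while i > 1 or j > 1:
        deviation += abs(i / n - j / m)
        steps += 1
        moves = [(D[i - 1, j - 1], i - 1, j - 1),
                 (D[i - 1, j], i - 1, j),
                 (D[i, j - 1], i, j - 1)]
        _, i, j = min(mv for mv in moves if np.isfinite(mv[0]))
    temporal = deviation / steps if steps else 0.0
    return shape, temporal
```

    A training objective in this spirit would combine the two as `alpha * shape + (1 - alpha) * temporal`, penalizing a forecast that reproduces the right shape at the wrong time as well as one with the wrong shape.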

    Signal Processing Methods for Music Synchronization, Audio Matching, and Source Separation

    The field of music information retrieval (MIR) aims at developing techniques and tools for organizing, understanding, and searching multimodal information in large music collections in a robust, efficient and intelligent manner. In this context, this thesis presents novel, content-based methods for music synchronization, audio matching, and source separation. In general, music synchronization denotes a procedure which, for a given position in one representation of a piece of music, determines the corresponding position within another representation. Here, the thesis presents three complementary synchronization approaches, which improve upon previous methods in terms of robustness, reliability, and accuracy. The first approach employs a late-fusion strategy based on multiple, conceptually different alignment techniques to identify those music passages that allow for reliable alignment results. The second approach is based on the idea of employing musical structure analysis methods in the context of synchronization to derive reliable synchronization results even in the presence of structural differences between the versions to be aligned. Finally, the third approach employs several complementary strategies for increasing the accuracy and time resolution of synchronization results. Given a short query audio clip, the goal of audio matching is to automatically retrieve all musically similar excerpts in different versions and arrangements of the same underlying piece of music. In this context, chroma-based audio features are a well-established tool as they possess a high degree of invariance to variations in timbre. This thesis describes a novel procedure for making chroma features even more robust to changes in timbre while keeping their discriminative power. Here, the idea is to identify and discard timbre-related information using techniques inspired by the well-known MFCC features, which are usually employed in speech processing. 
    Given a monaural music recording, the goal of source separation is to extract musically meaningful sound sources from the recording, corresponding, for example, to a melody, an instrument, or a drum track. To facilitate this complex task, one can exploit additional information provided by a musical score. Based on this idea, this thesis presents two novel, conceptually different approaches to source separation. Using score information provided by a given MIDI file, the first approach employs a parametric model to describe a given audio recording of a piece of music. The resulting model is then used to extract sound sources as specified by the score. As a computationally less demanding and easier-to-implement alternative, the second approach employs the additional score information to guide a decomposition based on non-negative matrix factorization (NMF).
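    The second approach can be sketched generically: in score-informed NMF, the score determines which templates may be active at each time, and the multiplicative updates preserve the zeros that the score imposes on the activations. This is a minimal illustration of the technique, not the thesis's implementation; the function name, the Euclidean-cost updates, and the initialization are assumptions.

```python
import numpy as np

def score_informed_nmf(V, W_init, score_mask, n_iter=200, eps=1e-9):
    """V: magnitude spectrogram (freq x time). W_init: one spectral
    template per score voice (freq x voices). score_mask: voices x time,
    1 where the score says a voice sounds, 0 elsewhere. Multiplicative
    updates keep zero activations at zero, so the score constraint
    survives the optimization."""
    W = W_init.copy()
    H = np.random.default_rng(0).random((W.shape[1], V.shape[1])) * score_mask
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ (W @ H) + eps)   # Euclidean NMF update for H
        W *= (V @ H.T) / ((W @ H) @ H.T + eps)   # Euclidean NMF update for W
        H *= score_mask                          # guard against numerical drift
    return W, H
```

    After convergence, source k is reconstructed from its own rank-1 term, `W[:, [k]] @ H[[k], :]`, or via a Wiener-style mask derived from it.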

    Tracking of Human Motion over Time


    Facial Expression Recognition
