
    Analysing multi-person timing in music and movement: event-based methods

    Accurate timing of movement in the hundreds-of-milliseconds range is a hallmark of human activities such as music and dance. Its study requires accurate measurement of the times of events (often called responses) based on the movement or acoustic record. This chapter provides a comprehensive overview of methods developed to capture, process, analyse, and model individual and group timing [...] This chapter is structured in five main sections, as follows. We start with a review of data capture methods, working, in turn, through a low-cost system for researching simple tapping, complex movements, use of video, inertial measurement units, and dedicated sensorimotor synchronisation software. This is followed by a section on music performance, which includes topics on the selection of music materials, sound recording, and system latency. The identification of events in the data stream can be challenging, and this topic is treated in the next section, first for movement and then for music. Finally, we cover methods of analysis, including alignment of the channels, computation of between-channel asynchrony errors, and modelling of the data set.
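
    As a concrete illustration of the between-channel asynchrony computation described in the analysis section, here is a minimal Python sketch (not from the chapter itself); the nearest-neighbour pairing rule, the max_gap threshold, and all names are assumptions made for the example.

```python
import numpy as np

def asynchronies(events_a, events_b, max_gap=0.2):
    """Pair each event in channel A with the nearest event in channel B
    and return the signed asynchronies (A minus B), in seconds.
    events_b must be sorted; pairs farther apart than max_gap are
    discarded as missed responses."""
    events_a = np.asarray(events_a, dtype=float)
    events_b = np.asarray(events_b, dtype=float)
    # Index of the nearest B event on either side of every A event.
    idx = np.clip(np.searchsorted(events_b, events_a), 1, len(events_b) - 1)
    left, right = events_b[idx - 1], events_b[idx]
    nearest = np.where(events_a - left < right - events_a, left, right)
    diffs = events_a - nearest
    return diffs[np.abs(diffs) <= max_gap]

# Example: two tappers at a ~500 ms inter-onset interval with timing jitter.
rng = np.random.default_rng(1)
a = np.arange(0.0, 5.0, 0.5) + rng.normal(0, 0.01, 10)
b = np.arange(0.0, 5.0, 0.5) + rng.normal(0, 0.01, 10)
d = asynchronies(a, b)
print(f"mean asynchrony {d.mean()*1000:.1f} ms, SD {d.std()*1000:.1f} ms")
```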

    Improving MIDI-audio alignment with acoustic features

    This paper describes a technique to improve the accuracy of dynamic time warping-based MIDI-audio alignment. The technique implements a hidden Markov model that uses aperiodicity and power estimates from the signal as observations and the results of a dynamic time warping alignment as a prior. In addition to improving the overall alignment, this technique also identifies the transient and steady-state sections of the note. This information is important for describing various aspects of a musical performance, including both pitch and rhythm.
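
    For orientation, here is a minimal sketch of the dynamic-time-warping stage that the paper uses as its prior, built on librosa and pretty_midi chroma features; the paper's HMM refinement with aperiodicity and power observations is not reproduced, and the file names and parameters are placeholders.

```python
import librosa
import numpy as np
import pretty_midi

# Chroma features for the audio recording.
y, sr = librosa.load("performance.wav")  # placeholder file name
hop = 512
chroma_audio = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=hop)

# Chroma features for the MIDI score, sampled at the same frame rate.
midi = pretty_midi.PrettyMIDI("score.mid")  # placeholder file name
fs = sr / hop
chroma_midi = midi.get_chroma(fs=fs)

# Small epsilon so silent (all-zero) frames don't break the cosine metric.
chroma_audio = chroma_audio + 1e-8
chroma_midi = chroma_midi + 1e-8

# DTW over the cosine distance between the two chroma sequences.
D, wp = librosa.sequence.dtw(X=chroma_midi, Y=chroma_audio, metric="cosine")

# wp is the warping path of (MIDI frame, audio frame) pairs, end to start.
for midi_frame, audio_frame in wp[::-1][:5]:
    print(f"MIDI {midi_frame / fs:.3f}s -> audio {audio_frame * hop / sr:.3f}s")
```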

    Music Information Retrieval Meets Music Education

    This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration into learning software. A general overview of systems that are either commercially available or at the research stage is presented. Furthermore, three well-known MIR methods used in music learning systems, and their state of the art, are described: music transcription, solo and accompaniment track creation, and generation of performance instructions. As a representative example of a music learning system developed within the MIR community, the Songs2See software is outlined. Finally, challenges and directions for future research are described.

    Fusion of Multimodal Information in Music Content Analysis

    Music is often processed through its acoustic realization. This is restrictive in the sense that music is clearly a highly multimodal concept, where various types of heterogeneous information can be associated with a given piece of music (a musical score, musicians' gestures, lyrics, user-generated metadata, etc.). This has recently led researchers to apprehend music through its various facets, giving rise to "multimodal music analysis" studies. This article gives a synthetic overview of methods that have been successfully employed in multimodal signal analysis. In particular, their use in music content processing is discussed in more detail through five case studies that highlight different multimodal integration techniques. The case studies include an example of cross-modal correlation for music video analysis, an audiovisual drum transcription system, a description of the concept of informed source separation, a discussion of multimodal dance-scene analysis, and an example of user-interactive music analysis. In light of these case studies, some perspectives on multimodality in music processing are finally suggested.
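
    The article surveys integration techniques at a high level; as a generic point of reference, the sketch below shows the two most common schemes, early (feature-level) and late (decision-level) fusion. The random placeholder features, dimensions, and classifier choice are assumptions standing in for real audio and video descriptors, not anything from the article.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
audio_feat = rng.normal(size=(n, 20))  # placeholder audio descriptors
video_feat = rng.normal(size=(n, 30))  # placeholder video descriptors
labels = rng.integers(0, 2, size=n)    # placeholder binary labels

# Early fusion: concatenate modalities into one feature vector,
# then train a single classifier on the joint representation.
early = LogisticRegression(max_iter=1000).fit(
    np.hstack([audio_feat, video_feat]), labels)

# Late fusion: train one classifier per modality and
# combine their posterior probabilities (here, by averaging).
clf_a = LogisticRegression(max_iter=1000).fit(audio_feat, labels)
clf_v = LogisticRegression(max_iter=1000).fit(video_feat, labels)
fused = (clf_a.predict_proba(audio_feat) + clf_v.predict_proba(video_feat)) / 2
late_pred = fused.argmax(axis=1)
```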

    Automatic Transcription of Singing Signals

    Doctoral dissertation, Department of Transdisciplinary Studies, Graduate School of Convergence Science and Technology, Seoul National University, August 2017. Advisor: Kyogu Lee. Automatic music transcription refers to the automatic extraction of musical attributes, such as notes, from an audio signal to a symbolic level. The symbolized music data are applicable to various purposes, such as music education and production, by providing higher-level information to both consumers and creators. Although the singing voice is the easiest of the various music signals to listen to and reproduce, traditional transcription methods for musical instruments are not suitable for it due to the acoustic complexity of the human voice. The main goal of this thesis is to develop a fully automatic singing transcription system that exceeds existing methods. We first review typical approaches to pitch tracking and onset detection, the two fundamental tasks of music transcription, and then propose several methods for each task. For pitch tracking, we examine the effect of data sampling on the performance of periodicity analysis of music signals. For onset detection, the local homogeneity in the harmonic structure is exploited through cepstral analysis and unsupervised classification. The final transcription system includes feature extraction, a probabilistic model of the harmonic structure, and note transitions based on a hidden Markov model. It achieved the best performance (an F-measure of 82%) in a note-level evaluation that included state-of-the-art systems.
    Contents:
    Chapter 1 Introduction: motivation; definitions (musical keywords, scientific keywords, representations); problems in singing transcription; topics of interest; outline of the thesis
    Chapter 2 Background: pitch estimation (time-domain and frequency-domain methods); note segmentation (onset and offset detection); singing transcription; evaluation methodology (pitch estimation, note segmentation, dataset)
    Chapter 3 Periodicity Analysis by Sampling in the Time/Frequency Domain for Pitch Tracking: data sampling; sampled ACF/DF in the time domain; sampled ACF/DF in the frequency domain; iterative F0 estimation; experimental setup; results
    Chapter 4 Note Onset Detection Based on Harmonic Cepstrum Regularity: cepstral analysis; harmonic cepstrum regularity (harmonic quefrency selection, sub-harmonic regularity function, adaptive thresholding, picking onsets); experiments
    Chapter 5 Robust Singing Transcription System Using Local Homogeneity in the Harmonic Structure: F0 tracking; feature extraction; mixture model; note detection (transition boundary detection, note boundary selection, note pitch decision); evaluation; results and discussion, including failure analysis
    Chapter 6 Conclusion and Future Work: contributions; future work (precise partial tracking using instantaneous frequency, linguistic model for note segmentation)
    Appendix: derivation of the instantaneous frequency
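
    To make the pitch-tracking starting point concrete, here is a plain time-domain autocorrelation (ACF) F0 estimator of the kind Chapter 3 builds on. It is a generic baseline sketch, not the thesis's sampled ACF/DF method, and all parameters are illustrative.

```python
import numpy as np

def f0_autocorrelation(frame, sr, fmin=80.0, fmax=1000.0):
    """Estimate the F0 of a single frame with the time-domain
    autocorrelation function: the lag of the highest ACF peak inside
    the allowed pitch range is taken as the period."""
    frame = frame - frame.mean()
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(acf) - 1)
    lag = lag_min + np.argmax(acf[lag_min:lag_max])
    return sr / lag

# Example: a 440 Hz tone with a few harmonics, one 40 ms frame.
sr = 16000
t = np.arange(int(0.04 * sr)) / sr
frame = sum(a * np.sin(2 * np.pi * 440 * (k + 1) * t)
            for k, a in enumerate([1.0, 0.5, 0.3]))
print(f"estimated F0: {f0_autocorrelation(frame, sr):.1f} Hz")
```

    The integer-lag peak quantizes the estimate (here roughly 444 Hz for a true 440 Hz); refinements such as parabolic peak interpolation, or the sampling schemes studied in the thesis, reduce this error.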

    Singing information processing: techniques and applications

    The singing voice is an essential component of music in every culture around the world, as it is an extraordinarily natural form of musical expression. Consequently, the automatic processing of the singing voice has a great impact from industrial, cultural, and scientific perspectives. In this context, this thesis contributes a varied set of techniques and applications related to singing-voice processing, together with a review of the associated state of the art in each case. First, several of the best-known pitch estimators are compared for the query-by-humming use case. The results show that Boersma (1993) (with a non-obvious parameter setting) and Mauch (2014) perform very well in that use case, given the smoothness of the pitch contours they extract. In addition, a novel singing transcription system based on a hysteresis process defined over time and frequency is proposed, along with a Matlab tool for singing-voice evaluation. The interest of the proposed method is that it achieves error rates close to the state of the art with a very simple approach. The proposed evaluation tool, in turn, is a useful resource for defining the problem more precisely and for better evaluating the solutions proposed by future researchers. This thesis also presents a method for the automatic assessment of vocal performance. It uses dynamic time warping to align the user's performance with a reference, thereby providing scores for intonation and rhythm accuracy. The system evaluation shows a high correlation between the scores given by the system and those annotated by a group of expert musicians. Furthermore, a method for realistic intensity transformation of the singing voice is presented. This transformation is based on a parametric model of the spectral envelope and substantially improves perceived realism compared with commercial software such as Melodyne or Vocaloid. The drawback of the proposed approach is that it requires manual intervention, but the results obtained yield important conclusions towards automatic intensity modification with realistic results. Finally, a method for correcting dissonances in isolated chords is proposed. It is based on multiple-F0 analysis and a frequency shift of the sinusoidal components. The evaluation, carried out by a group of trained musicians, shows a clear increase in perceived consonance after the proposed transformation.
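
    The performance-assessment component rests on dynamic time warping; below is a minimal pure-NumPy sketch of that idea, aligning a user pitch contour to a reference and reporting the mean absolute pitch error along the optimal path. The contours, cost function, and scoring rule are illustrative assumptions, not the thesis's exact method.

```python
import numpy as np

def dtw_pitch_error(ref, user):
    """Align two pitch contours (in MIDI note numbers) with plain DTW and
    return the mean absolute pitch error, in semitones, along the path."""
    n, m = len(ref), len(user)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(ref[i - 1] - user[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack the optimal path, accumulating per-frame errors.
    i, j, errs = n, m, []
    while i > 0 and j > 0:
        errs.append(abs(ref[i - 1] - user[j - 1]))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(np.mean(errs))

# A user contour that is slightly sharp and slightly slow.
ref = np.repeat([60, 62, 64, 65], 10).astype(float)
user = np.repeat([60.3, 62.2, 64.1, 65.4], 12)
print(f"mean pitch error: {dtw_pitch_error(ref, user):.2f} semitones")
```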