6 research outputs found

    Automatic Drum Transcription and Source Separation

    While research has been carried out on automated polyphonic music transcription, to date the problem of automated polyphonic percussion transcription has not received the same degree of attention. A related problem is that of sound source separation, which attempts to separate a mixture signal into its constituent sources. This thesis focuses on the task of polyphonic percussion transcription and sound source separation of a limited set of drum instruments, namely the drums found in the standard rock/pop drum kit. As there was little previous research on polyphonic percussion transcription, a broad review of music information retrieval methods, including previous polyphonic percussion systems, was also carried out to determine whether any methods were of potential use in the area of polyphonic drum transcription. Following on from this, a review was conducted of general source separation and redundancy reduction techniques, such as Independent Component Analysis and Independent Subspace Analysis, as these techniques have shown potential in separating mixtures of sources. Upon completion of the review it was decided that a combination of the blind separation approach, Independent Subspace Analysis (ISA), with the use of prior knowledge as used in music information retrieval methods, was the best approach to tackling the problem of polyphonic percussion transcription as well as that of sound source separation. A number of new algorithms which combine the use of prior knowledge with the source separation abilities of techniques such as ISA are presented. These include sub-band ISA, Prior Subspace Analysis (PSA), and an automatic modelling and grouping technique which is used in conjunction with PSA to perform polyphonic percussion transcription. These approaches are demonstrated to be effective in the task of polyphonic percussion transcription, and PSA is also demonstrated to be capable of transcribing drums in the presence of pitched instruments.

    Estudio de sistemas de estimación automática de acordes en música digitalizada (Study of automatic chord estimation systems in digitized music)

    Music transcription means writing out the score that gives rise to a musical sound. This task is difficult even for human specialists, and performing it automatically has been an open research area for many years. The most successful systems are those able to use different kinds of auxiliary information to help with the task, among them the harmony of the sound, tempo, metre, dynamics, etc. This work proposes studying and implementing different techniques for automatically analysing these characteristics of the sound in order to assess their influence on transcription.

    Analysis and resynthesis of polyphonic music

    This thesis examines applications of Digital Signal Processing to the analysis, transformation, and resynthesis of musical audio. First I give an overview of the human perception of music. I then examine in detail the requirements for a system that can analyse, transcribe, process, and resynthesise monaural polyphonic music. I then describe and compare the possible hardware and software platforms. After this I describe a prototype hybrid system that attempts to carry out these tasks using a method based on additive synthesis. Next I present results from its application to a variety of musical examples, and critically assess its performance and limitations. I then address these issues in the design of a second system based on Gabor wavelets. I conclude by summarising the research and outlining suggestions for future developments.

    Wavelet analysis for onset detection

    Many of the auditory perception processes which researchers have sought to automate can be decomposed into stages, the first of which involves segmentation of the input audio. In music this stage equates to locating note onsets, and advances in this task should therefore ease further analyses. There are also many direct applications of onset detection, including synchronisation of audio with other media and location of significant time points in graphical editing of audio. It is for these reasons that this work focuses on the task of detecting onsets. An onset is considered as a particular type of change in the time-frequency representation of a sound. The modulus plane derived from a semitone-based harmonic wavelet analysis is first transformed, to account for the varying frequency sensitivity and mapping from amplitude to loudness observed in the human auditory system. Vectors are then derived from adjacent regions of the plane, and compared for change using Minkowski's distance measure. Peaks of distance correspond to significant changes, and commencing partials are sought at peak locations to identify onset peaks. The process of testing the method is considered in some detail, and an experiment is derived in which a test piece is recorded using a wide range of timbres from a MIDI synthesiser. The piece includes a repeated note and a range of intervals, and legato and staccato styles are demonstrated. Separate test cases demonstrate results in the presence of reverberation, dynamic variation, low notes, short notes, vibrato, tremolo and drum sounds (with overlapping cymbals). The main body of tests was conducted using a large number of parameter settings and variations of the analysis method (including different loudness scales and exponents in the distance measure) to achieve optimal results, but the reduction to a single analysis method with one parameter was also considered. 
The use of a novel technique to compensate for slowly rising onsets is also investigated. Although the domain is restricted to monophonic musical audio, many of the test cases contain overlap and the method is shown to have some potential in the analysis of polyphonic examples. The results of this experiment are assessed in the context of error tolerances, derived from consideration of a number of typical applications. It is shown that such assessment is not a straightforward matter and, for example, there may be interaction between the type of timbre and the error tolerance which will apply in a specific application. In summary, the thesis establishes that onset detection can be accomplished by monitoring a distance measure calculated from a harmonic wavelet analysis, and does so through the design and implementation of a comprehensive experiment.
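
The core idea in this abstract (compare vectors from adjacent regions of a time-frequency plane with a Minkowski distance, then look for peaks) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `change_curve` and `pick_peaks`, the use of a plain mean over `width` frames to form each region vector, the fixed threshold, and the synthetic input are all assumptions made for illustration. The harmonic wavelet analysis, the loudness mapping, and the search for commencing partials are omitted.

```python
import numpy as np

def change_curve(plane, width=4, p=2.0):
    """Minkowski distance between the regions just before and just after each frame.

    plane: 2-D array of shape (frames, bands), e.g. a modulus plane.
    width: number of frames averaged into each region vector (assumed).
    p:     exponent of the Minkowski distance (p=2 is Euclidean).
    """
    plane = np.asarray(plane, dtype=float)
    n_frames = plane.shape[0]
    curve = np.zeros(n_frames)  # edges without a full region on both sides stay 0
    for t in range(width, n_frames - width + 1):
        before = plane[t - width:t].mean(axis=0)
        after = plane[t:t + width].mean(axis=0)
        curve[t] = np.sum(np.abs(after - before) ** p) ** (1.0 / p)
    return curve

def pick_peaks(curve, threshold):
    """Return indices of local maxima of the curve that exceed the threshold."""
    peaks = []
    for i in range(1, len(curve) - 1):
        if curve[i] > threshold and curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]:
            peaks.append(i)
    return peaks

# Synthetic example: silence, then a single band switches on at frame 20.
plane = np.zeros((40, 12))
plane[20:, 3] = 1.0
curve = change_curve(plane)
onsets = pick_peaks(curve, threshold=0.5)
print(onsets)  # prints [20]
```

Changing `p` corresponds to the "exponents in the distance measure" that the abstract says were varied during testing; the values used there are not given in this listing.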

    Music-listening systems

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Architecture, 2000. Includes bibliographical references (p. [235]-248). When human listeners are confronted with musical sounds, they rapidly and automatically orient themselves in the music. Even musically untrained listeners have an exceptional ability to make rapid judgments about music from very short examples, such as determining the music's style, performer, beat, complexity, and emotional impact. However, there are presently no theories of music perception that can explain this behavior, and it has proven very difficult to build computer music-analysis tools with similar capabilities. This dissertation examines the psychoacoustic origins of the early stages of music listening in humans, using both experimental and computer-modeling approaches. The results of this research enable the construction of automatic machine-listening systems that can make human-like judgments about short musical stimuli. New models are presented that explain the perception of musical tempo, the perceived segmentation of sound scenes into multiple auditory images, and the extraction of musical features from complex musical sounds. These models are implemented as signal-processing and pattern-recognition computer programs, using the principle of understanding without separation. Two experiments with human listeners study the rapid assignment of high-level judgments to musical stimuli, and it is demonstrated that many of the experimental results can be explained with a multiple-regression model on the extracted musical features. From a theoretical standpoint, the thesis shows how theories of music perception can be grounded in a principled way upon psychoacoustic models in a computational-auditory-scene-analysis framework. Further, the perceptual theory presented is more relevant to everyday listeners and situations than are previous cognitive-structuralist approaches to music perception and cognition.
From a practical standpoint, the various models form a set of computer signal-processing and pattern-recognition tools that can mimic human perceptual abilities on a variety of musical tasks such as tapping along with the beat, parsing music into sections, making semantic judgments about musical examples, and estimating the similarity of two pieces of music. Eric D. Scheirer. Ph.D.