5 research outputs found

    Discriminant analysis for perceptionally comparable classes

    A review of automatic drum transcription

    In Western popular music, drums and percussion are an important means to emphasize and shape the rhythm, often defining the musical style. If computers were able to analyze the drum part in recorded music, it would enable a variety of rhythm-related music processing tasks. Especially the detection and classification of drum sound events by computational methods is considered to be an important and challenging research problem in the broader field of Music Information Retrieval. Over the last two decades, several authors have attempted to tackle this problem under the umbrella term Automatic Drum Transcription (ADT). This paper presents a comprehensive review of ADT research, including a thorough discussion of the task-specific challenges, categorization of existing techniques, and evaluation of several state-of-the-art systems. To provide more insights on the practice of ADT systems, we focus on two families of ADT techniques, namely methods based on Nonnegative Matrix Factorization and Recurrent Neural Networks. We explain the methods' technical details and drum-specific variations and evaluate these approaches on publicly available datasets with a consistent experimental setup. Finally, the open issues and under-explored areas in ADT research are identified and discussed, providing future directions in this field.

    Isochronous rhythmic organization of learned animal vocalizations

    The evolutionary path that led to music as we know it today is difficult to trace. Cross-species comparative research can help us uncover the biological substrates that enabled humans to develop this peculiar behavior. Rhythm, the organization of events in time, is a central component in the structure of all forms of music. Oftentimes musical rhythm gives rise to a perceptionally isochronous beat, or pulse. Learned vocalizations of non-human animals, such as birdsong and the songs of certain bat species, show striking parallels to vocal music (i.e. human song). This thesis investigates these vocalizations for the presence of an isochronous rhythmic structure that could allow a conspecific listener to perceive such a beat. To this end, I have developed a generate-and-test (GAT) method to extract an isochronous pulse from a temporal sequence of events, such as the onsets of notes. This method is compared to a variety of existing analytic techniques for analyzing different aspects of rhythms in vocalizations, movements and other behaviors developing over time. The suitability of the different methods for addressing particular questions is illustrated through various examples. The application of the GAT approach to different types of vocalizations of the greater sac-winged bat (Saccopteryx bilineata) revealed a common temporal regularity that might point towards an interesting relationship between physiologically determined rhythm and the rhythm of learned social vocalizations. In the songs of zebra finches (Taeniopygia guttata) we discovered a hierarchical isochronous structure that is reminiscent of the metrical structure of many types of music. We then report the effect of genetic manipulations on the song learning success of zebra finches. The expression of FoxP2, a gene involved in speech acquisition and birdsong learning, as well as of two related genes, FoxP1 and FoxP4, was experimentally reduced in juvenile birds during their learning period. 
    Among other effects, the adult birds produced song with an impaired isochronous structure. Surprisingly, control animals whose FoxP levels were not reduced showed a similar effect in this regard. I discuss possible interpretations of this result in the light of current knowledge about neural mechanisms and behavioral processes of song learning and production.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 4th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2005, held 29-31 October 2005, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies