13 research outputs found

    Probabilistic Segmentation of Folk Music Recordings

    The paper presents a novel method for automatic segmentation of folk music field recordings. The method is based on a distance measure that uses dynamic time warping to cope with tempo variations and a dynamic programming approach to handle pitch drift, both for finding similarities and for estimating the length of the repeating segments. A probabilistic framework based on hidden Markov models (HMMs) is used to find segment boundaries by searching for an optimal match between the expected segment length, between-segment similarities, and likely locations of segment beginnings. Several current state-of-the-art approaches for segmentation of commercial music are evaluated, and their weaknesses when dealing with folk music, such as intolerance to pitch drift and variable tempo, are exposed. The proposed method is evaluated and its performance analyzed on a collection of 206 folk songs of different ensemble types: solo, two- and three-voiced, choir, instrumental, and instrumental with singing. It outperforms current commercial music segmentation methods for non-instrumental music and is on a par with the best for instrumental recordings. The method is also comparable to a more specialized method for segmentation of solo singing folk music recordings.
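
    As a rough illustration of the dynamic time warping idea mentioned in this abstract, the sketch below computes a textbook DTW cost between two one-dimensional sequences (for example, pitch contours). It is not the paper's distance measure: the function name `dtw_distance`, the absolute-difference local cost, and the unconstrained step pattern are all assumptions made for illustration, and the pitch-drift handling and the HMM boundary model are not covered.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (illustrative only).

    Local cost is the absolute difference; allowed steps are
    insertion, deletion, and match. These choices are assumptions,
    not taken from the paper.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays (b is "slower")
                cost[i][j - 1],      # b advances, a stays (a is "slower")
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

    Because one frame may be matched to several frames of the other sequence, a melody sung at a different tempo can still align at low cost: here `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length.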

    Signal processing methods for beat tracking, music segmentation, and audio retrieval

    The goal of music information retrieval (MIR) is to develop novel strategies and techniques for organizing, exploring, accessing, and understanding music data in an efficient manner. The conversion of waveform-based audio data into semantically meaningful feature representations by the use of digital signal processing techniques is at the center of MIR and constitutes a difficult field of research because of the complexity and diversity of music signals. In this thesis, we introduce novel signal processing methods that allow for extracting musically meaningful information from audio signals. As our main strategy, we exploit musical knowledge about the signals' properties to derive feature representations that show a significant degree of robustness against musical variations but still exhibit a high musical expressiveness. We apply this general strategy to three different areas of MIR: Firstly, we introduce novel techniques for extracting tempo and beat information, where we particularly consider challenging music with changing tempo and soft note onsets. Secondly, we present novel algorithms for the automated segmentation and analysis of folk song field recordings, where one has to cope with significant fluctuations in intonation and tempo as well as recording artifacts. Thirdly, we explore a cross-version approach to content-based music retrieval based on the query-by-example paradigm. In all three areas, we focus on application scenarios where strong musical variations make the extraction of musically meaningful information a challenging task.
    German abstract, translated: The goal of automated music processing is to develop new strategies and techniques for efficiently organizing large music collections. A focus lies on applying digital signal processing methods to convert audio signals into musically meaningful feature representations. Major challenges in this task arise from the complexity and multifaceted nature of music signals. This thesis presents novel methods for extracting musically interpretable information from music signals. A basic strategy is the consistent exploitation of prior musical knowledge to derive feature representations that are highly robust to musical variations while retaining high musical expressiveness. We apply this principle to three different tasks: First, we present novel approaches for extracting tempo and beat information from audio signals, applied in particular to demanding scenarios with changing tempo and soft note onsets. Second, we contribute novel algorithms for the segmentation and analysis of field recordings of folk songs in the presence of large intonation fluctuations. Third, we develop efficient methods for content-based search in large data collections with the aim of detecting different interpretations of a piece of music. In all scenarios considered, we pay particular attention to cases in which considerable musical variations make extracting musically meaningful information a major challenge.

    Visualizing music structure using Spotify data

    Finding the most representative part of vocal folksongs with transcription and segmentation

    The goal of musical segmentation is to develop algorithms that find similar patterns in an audio signal according to a desired aspect (melody, rhythm, timbre) and define the boundaries between the repetitions. The goal of musical transcription is to develop algorithms that extract pitches from the audio signal in every time frame, for either monophonic or polyphonic music. Music segmentation and transcription are two very important parts of the music information retrieval research field. The results can be used in many real-life applications: with music segmentation we can define musical structure and melodic repetitions in music, or search for the most representative part; transcription results can be used in automatic generation of scores, as support in the manual transcription process, or in searching for similar melodies in musical collections. In this dissertation we address specific problems of musical segmentation and transcription of audio recordings: segmentation and transcription of folk music audio recordings. Currently developed methods fail on folk music because of its specifics, such as bad recording conditions and amateur performers, which lead to a high level of noise in recordings, inaccurate singing, pitch drifting throughout the song, and so on. In the introduction we give the motivation for conducting the research and define the problems and goals of the thesis in detail. The first part of the dissertation presents the research in the field of music segmentation, where we present a folk music segmentation method that outperforms current state-of-the-art methods on a collection of folk music. The presented segmentation method is based on a probabilistic model for finding melodically repeating parts in a recording and defining their beginnings. The method was evaluated on a folk music collection of different types: solo singing, two- and three-voiced singing, choir songs, instrumental songs and mixed ensembles. The method was also evaluated for robustness, where its resistance to different degradations was tested. The second part of the dissertation addresses musical transcription, where we present a folk music transcription method. The method uses the segmentation results to find a representative part of a song and transcribes it using all the repetitions within the song. The method takes as input multiple fundamental frequency estimates calculated with an existing method, together with the song segmentation. Using the segmentation results, the method aligns the multiple fundamental frequency estimates in the temporal and frequency domains, removes local inaccuracies and joins the transcriptions of all repeating parts. In the next stage the method calculates notes using a two-level probabilistic model based on explicit-duration hidden Markov models, which are used to model notes, rests and note transitions. The presented method was evaluated on a collection of polyphonic folk music, where it returns better results than current state-of-the-art music transcription methods. In the conclusions we highlight the scientific contributions of the thesis and give directions for possible future improvements and extensions of the method.

    A Cross-Cultural Analysis of Music Structure

    PhD thesis. Music signal analysis is a research field concerning the extraction of meaningful information from musical audio signals. This thesis analyses music signals from the note level to the song level in a bottom-up manner and situates the research in two music information retrieval (MIR) problems: audio onset detection (AOD) and music structural segmentation (MSS). Most MIR tools are developed for and evaluated on Western music with specific musical knowledge encoded. This thesis approaches the investigated tasks from a cross-cultural perspective by developing audio features and algorithms applicable to both Western and non-Western genres. Two Chinese Jingju databases are collected to facilitate the AOD and MSS tasks respectively. New features and algorithms for AOD are presented relying on fusion techniques. We show that fusion can significantly improve the performance of the constituent baseline AOD algorithms. A large-scale parameter analysis is carried out to identify the relations between system configurations and the musical properties of different music types. Novel audio features are developed to summarise music timbre, harmony and rhythm for its structural description. The new features serve as effective alternatives to commonly used ones, showing comparable performance on existing datasets and surpassing them on the Jingju dataset. A new segmentation algorithm is presented which effectively captures the structural characteristics of Jingju. By evaluating the presented audio features and different segmentation algorithms incorporating different structural principles for the investigated music types, this thesis also identifies the underlying relations between audio features, segmentation methods and music genres in the scenario of music structural analysis.
    Funding: China Scholarship Council; EPSRC C4DM Travel Funding; EPSRC Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption (EP/L019981/1); EPSRC Platform Grant on Digital Music (EP/K009559/1); European Research Council project CompMusic; International Society for Music Information Retrieval Student Grant; QMUL Postgraduate Research Fund; QMUL-BUPT Joint Programme Funding; Women in Music Information Retrieval Grant.

    Explaining Listener Differences in the Perception of Musical Structure

    PhD thesis. State-of-the-art models for the perception of grouping structure in music do not attempt to account for disagreements among listeners. But understanding these disagreements, sometimes regarded as noise in psychological studies, may be essential to fully understanding how listeners perceive grouping structure. Over the course of four studies in different disciplines, this thesis develops and presents evidence to support the hypothesis that attention is a key factor in accounting for listeners' perceptions of boundaries and groupings, and hence a key to explaining their disagreements. First, we conduct a case study of the disagreements between two listeners. By studying the justifications each listener gave for their analyses, we argue that the disagreements arose directly from differences in attention, and indirectly from differences in information, expectation, and ontological commitments made in the opening moments. Second, in a large-scale corpus study, we study the extent to which acoustic novelty can account for the boundary perceptions of listeners. The results indicate that novelty is correlated with boundary salience, but that novelty is a necessary but not sufficient condition for being perceived as a boundary. Third, we develop an algorithm that optimally reconstructs a listener's analysis in terms of the patterns of similarity within a piece of music. We demonstrate how the output can identify good justifications for an analysis and account for disagreements between two analyses. Finally, having introduced and developed the hypothesis that disagreements between listeners may be attributable to differences in attention, we test the hypothesis in a sequence of experiments. We find that by manipulating the attention of participants, we are able to influence the groupings and boundaries they find most salient. From the sum of this research, we conclude that a listener's attention is a crucial factor affecting how listeners perceive the grouping structure of music.
    Funding: Social Sciences and Humanities Research Council; a PhD studentship from Queen Mary University of London; a Provost's Ph.D. Fellowship from the University of Southern California. This material is also based in part on work supported by the National Science Foundation under Grant No. 0347988.

    Patterns in Motion - From the Detection of Primitives to Steering Animations

    In recent decades, the world of technology has developed rapidly. Illustrative of this trend is the growing number of affordable methods for recording new and bigger data sets. The resulting masses of multivariate and high-dimensional data represent a new challenge for research and industry. This thesis is dedicated to the development of novel methods for processing multivariate time series data, thus meeting this data-science-related challenge. This is done by introducing a range of different methods designed to deal with time series data. The variety of methods reflects the different requirements and the typical stages of data processing, ranging from pre-processing to post-processing and data recycling. Many of the techniques introduced work in a general setting. However, various types of motion recordings of human and animal subjects were chosen as representatives of multivariate time series. The different data modalities include motion capture data, accelerations, gyroscopes, electromyography, depth data (Kinect) and animated 3D meshes. It is the goal of this thesis to provide a deeper understanding of working with multivariate time series by taking the example of multivariate motion data. In order to maintain an overview of the matter, the thesis follows a basic general pipeline. This pipeline was developed as a guideline for time series processing and is the first contribution of this work. Each part of the thesis represents one important stage of this pipeline; the stages can be summarized under the topics segmentation, analysis and synthesis. Specific examples of different data modalities, processing requirements and methods to meet them are discussed in the chapters of the respective parts. One important contribution of this thesis is a novel method for temporal segmentation of motion data. It is based on the idea of self-similarities within motion data and is capable of unsupervised segmentation of a range of motion data into distinct activities and motion primitives. The examples concerned with the analysis of multivariate time series reflect the role of data analysis in different interdisciplinary contexts and also the variety of requirements that comes with collaboration with other sciences. These requirements are directly connected to current challenges in data science. Finally, the problem of synthesis of multivariate time series is discussed using a graph-based example and examples related to rigging or steering of meshes. Synthesis is an important stage in data processing because it creates new data from existing data in a controlled way. This makes it possible to exploit existing data sets and to access more condensed data, thus providing feasible alternatives to otherwise time-consuming manual processing.
    German abstract, translated: In recent decades, the world of technology has developed rapidly. Exemplary of this development is the growing number of affordable methods for recording new and ever larger data sets. The resulting masses of multivariate and high-dimensional data confront research and industry alike with novel problems. This thesis is dedicated to developing new methods for processing multivariate time series and thereby takes on a major challenge directly connected to the new field of so-called data science. It introduces a range of different methods for processing multivariate time series. The methods each address different requirements and typical stages of data processing, ranging from pre-processing to post-processing and beyond to data reuse. Many of the techniques presented are suitable for processing general multivariate time series. Here, however, a number of different kinds of recordings of human and animal subjects were selected as representatives of general multivariate time series. The recording modalities include motion capture data, accelerations, gyroscope data, electromyography, depth images (Kinect) and animated 3D meshes. The goal of this thesis is to convey a deeper understanding of working with multivariate time series, using multivariate motion data as an example. To maintain an overview of the subject, it follows a basic, general pipeline. This pipeline was developed as a guideline for processing time series and is the first contribution of this work. Each further part of the thesis covers one of three major stages of the pipeline, which fall under the topics of segmentation, analysis and synthesis. Examples of different data modalities and the requirements for processing them illustrate the respective methods. An important contribution of this thesis is a novel method for temporal segmentation of motion data. It is based on the idea of self-similarity of motion data and can decompose a wide variety of motion data fully automatically into distinct activities and motion primitives. The examples for the analysis of multivariate time series reflect the role of data analysis in various interdisciplinary contexts and also illustrate the variety of requirements that arise in such contexts. Finally, the problem of synthesizing multivariate time series is discussed using a graph-based example and a steering example. Synthesis is an important step in data processing because it allows new data to be generated from existing data in a controlled way. This makes it possible to use existing data sets and to access denser data models, pointing to alternatives to otherwise time-consuming manual processing.
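
    To make the self-similarity idea in this abstract more concrete, here is a minimal sketch of one common way to turn a self-similarity matrix into segment boundaries: a checkerboard-kernel novelty curve. This is a swapped-in, generic technique, not the thesis's segmentation method, which the abstract does not specify. The function names, the negative L1 distance used as similarity, and the `half_width` parameter are assumptions made for illustration.

```python
def self_similarity(frames):
    """S[i][j] = negative L1 distance between feature frames i and j.

    The similarity measure is an assumption; 0 means identical frames,
    more negative means less similar.
    """
    n = len(frames)
    return [
        [-sum(abs(p - q) for p, q in zip(frames[i], frames[j])) for j in range(n)]
        for i in range(n)
    ]


def novelty_curve(S, half_width=2):
    """Slide a +/- checkerboard kernel along the main diagonal of S.

    curve[t] is high when frames just before t resemble each other,
    frames from t onward resemble each other, and the two groups differ,
    i.e. when a boundary between frame t-1 and frame t is plausible.
    """
    n = len(S)
    L = half_width
    curve = [0.0] * n
    for t in range(L, n - L + 1):
        score = 0.0
        for di in range(-L, L):
            for dj in range(-L, L):
                # +1 in the past/past and future/future quadrants, -1 across
                sign = 1.0 if (di < 0) == (dj < 0) else -1.0
                score += sign * S[t + di][t + dj]
        curve[t] = score
    return curve
```

    For a toy sequence of four identical frames followed by four different identical frames, the curve peaks at index 4, the first frame of the second activity. Real motion or audio features would need a suitable frame representation and peak picking, neither of which is shown here.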

    Semantic Audio Analysis Utilities and Applications.

    PhD thesis. Extraction, representation, organisation and application of metadata about audio recordings are the concern of semantic audio analysis. Our broad interpretation, aligned with recent developments in the field, includes methodological aspects of semantic audio, such as those related to information management, knowledge representation and applications of the extracted information. In particular, we look at how Semantic Web technologies may be used to enhance information management practices in two audio-related areas: music informatics and music production. In the first area, we are concerned with music information retrieval (MIR) and related research. We examine how structured data may be used to support reproducibility and provenance of extracted information, and aim to support multi-modality and context adaptation in the analysis. In creative music production, our motivation can be summarised as follows: off-the-shelf sound editors do not hold appropriately structured information about the edited material, thus human-computer interaction is inefficient. We believe that recent developments in sound analysis and music understanding are capable of bringing about significant improvements in the music production workflow. Providing visual cues related to music structure can serve as an example of intelligent, context-dependent functionality. The central contributions of this work are a Semantic Web ontology for describing recording studios, including a model of technological artefacts used in music production; methodologies for collecting data about music production workflows and describing the work of audio engineers, which facilitates capturing their contribution to music production; and finally a framework for creating Web-based applications for automated audio analysis. This has applications demonstrating how Semantic Web technologies and ontologies can facilitate interoperability between music research tools, and the creation of semantic audio software, for instance, for music recommendation, temperament estimation or multi-modal music tutoring.