45 research outputs found

    Rhythm Inference From Audio Recordings of Irish Traditional Music

    A new method is proposed to infer rhythmic information from audio recordings of Irish traditional tunes. The method relies on the repetitive nature of this musical genre. Low-level spectral features and autocorrelation are used to obtain a low-dimensional representation, on which logistic regression models are trained. Two experiments are conducted to predict rhythmic information at different levels of precision. The method is tested on a collection of session recordings, and high accuracy scores are reported.
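    As a rough illustration of the pipeline this abstract names, the sketch below derives an onset-strength envelope from low-level spectral features, autocorrelates it into a low-dimensional periodicity profile, and feeds that to a logistic regression classifier. File names, feature choices and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression

def rhythm_features(path, n_lags=200):
    """Low-dimensional rhythmic descriptor: autocorrelation of an
    onset-strength envelope computed from low-level spectral features."""
    y, sr = librosa.load(path, sr=22050)
    env = librosa.onset.onset_strength(y=y, sr=sr)    # spectral-flux-style envelope
    env = (env - env.mean()) / (env.std() + 1e-9)     # normalise the envelope
    ac = librosa.autocorrelate(env, max_size=n_lags)  # periodicity profile
    return ac / (ac[0] + 1e-9)                        # scale by zero-lag energy

# Hypothetical corpus: audio paths paired with rhythm labels (e.g. jig vs. reel).
# X = np.stack([rhythm_features(p) for p in train_paths])
# clf = LogisticRegression(max_iter=1000).fit(X, train_labels)
# print(clf.predict(rhythm_features("session_recording.wav")[None, :]))
```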

    Music Information Retrieval for Irish Traditional Music: Automatic Analysis of Harmonic, Rhythmic, and Melodic Features for Efficient Key-Invariant Tune Recognition

    Music making and listening practices increasingly rely on technology, and, as a consequence, techniques developed in music information retrieval (MIR) research are more readily available to end users, in particular via online tools and smartphone apps. However, the majority of MIR research focuses on Western pop and classical music, and thus does not address specificities of other musical idioms. Irish traditional music (ITM) is popular across the globe, with regular sessions organised on all continents. ITM is a distinctive musical idiom, particularly in terms of heterophony and modality, and these characteristics can constitute challenges for existing MIR algorithms. The benefits of developing MIR methods specifically tailored to ITM are evidenced by Tunepal, a query-by-playing tool that has become popular among ITM practitioners since its release in 2009. As of today, Tunepal is the state of the art for tune recognition in ITM. The research in this thesis addresses existing limitations of Tunepal. The main goal is to find solutions to add key-invariance to the tune recognition system, an important feature that is currently missing in Tunepal. Techniques from digital signal processing and machine learning are used and adapted to the specificities of ITM to extract harmonic and temporal features, respectively with improvements on existing key detection methods and a novel method for rhythm classification. These features are then used to develop a key-invariant tune recognition system that is computationally efficient while maintaining retrieval accuracy at a level comparable to that of the existing system.
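    The abstract does not detail how key-invariance is achieved; one common route, sketched below under that assumption, is to compare global chroma profiles under all twelve circular shifts so that transposed renditions of the same tune still match. The file names and the similarity measure are hypothetical.

```python
import numpy as np
import librosa

def chroma_profile(path):
    """Global pitch-class profile of a recording, unit-normalised."""
    y, sr = librosa.load(path, sr=22050)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr)  # 12 x frames
    profile = chroma.mean(axis=1)                    # average over time
    return profile / (np.linalg.norm(profile) + 1e-9)

def key_invariant_similarity(a, b):
    """Best cosine similarity over the 12 possible transpositions."""
    return max(float(np.dot(np.roll(a, k), b)) for k in range(12))

# query = chroma_profile("query.wav")  # hypothetical query recording
# best = max(corpus_profiles, key=lambda p: key_invariant_similarity(query, p))
```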

    A statistical framework for embodied music cognition


    A comparative study on human activity classification with miniature inertial and magnetic sensors

    Ankara: The Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 2011. Thesis (Master's) -- Bilkent University, 2011. Includes bibliographical references (leaves 57-67). This study provides a comparative assessment of the different techniques for classifying human activities that are performed using body-worn miniature inertial and magnetic sensors. The classification techniques compared in this study are: the naive Bayesian (NB) classifier, artificial neural networks (ANNs), the dissimilarity-based classifier (DBC), various decision-tree methods, the Gaussian mixture model (GMM), and support vector machines (SVM). The algorithms for these techniques are provided in two commonly used open-source environments: the Waikato environment for knowledge analysis (WEKA), a Java-based software package, and the pattern recognition toolbox (PRTools), a MATLAB toolbox. Human activities are classified using five sensor units worn on the chest, the arms, and the legs. Each sensor unit comprises a tri-axial gyroscope, a tri-axial accelerometer, and a tri-axial magnetometer. A feature set extracted from the raw sensor data using principal component analysis (PCA) is used in the classification process. Three different cross-validation techniques are employed to validate the classifiers. A performance comparison of the classification techniques is provided in terms of their correct differentiation rates, confusion matrices, and computational cost. The methods that result in the highest correct differentiation rates are found to be ANN (99.2%), SVM (99.2%), and GMM (99.1%). The magnetometer is the best type of sensor to use in classification, whereas the gyroscope is the least useful. Considering the locations of the sensor units on the body, the sensors worn on the legs appear to provide the most valuable information. Yüksek, Murat Cihan. M.S.
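    The evaluation pipeline described reads naturally as PCA-reduced features feeding a classifier under k-fold cross-validation. The sketch below reproduces that shape with scikit-learn in place of WEKA/PRTools; the synthetic array shapes, class count and component count are placeholders, not the study's actual data or settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 45 * 26))  # e.g. 5 units x 9 axes, 26 features each
y = rng.integers(0, 19, size=1000)    # e.g. 19 activity classes (assumed)

# PCA-reduced features feeding an SVM, validated by 10-fold cross-validation.
clf = make_pipeline(StandardScaler(), PCA(n_components=30), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean accuracy: {scores.mean():.3f}")
```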

    A Cross-Cultural Analysis of Music Structure

    PhD. Music signal analysis is a research field concerning the extraction of meaningful information from musical audio signals. This thesis analyses music signals from the note level to the song level in a bottom-up manner and situates the research in two music information retrieval (MIR) problems: audio onset detection (AOD) and music structural segmentation (MSS). Most MIR tools are developed for and evaluated on Western music, with specific musical knowledge encoded. This thesis approaches the investigated tasks from a cross-cultural perspective by developing audio features and algorithms applicable to both Western and non-Western genres. Two Chinese Jingju databases are collected to facilitate, respectively, the AOD and MSS tasks investigated. New features and algorithms for AOD are presented, relying on fusion techniques. We show that fusion can significantly improve the performance of the constituent baseline AOD algorithms. A large-scale parameter analysis is carried out to identify the relations between system configurations and the musical properties of different music types. Novel audio features are developed to summarise music timbre, harmony and rhythm for structural description. The new features serve as effective alternatives to commonly used ones, showing comparable performance on existing datasets and surpassing them on the Jingju dataset. A new segmentation algorithm is presented which effectively captures the structural characteristics of Jingju. By evaluating the presented audio features and different segmentation algorithms incorporating different structural principles for the investigated music types, this thesis also identifies the underlying relations between audio features, segmentation methods and music genres in the scenario of music structural analysis. Funding: China Scholarship Council; EPSRC C4DM Travel Funding; EPSRC Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption (EP/L019981/1); EPSRC Platform Grant on Digital Music (EP/K009559/1); European Research Council project CompMusic; International Society for Music Information Retrieval Student Grant; QMUL Postgraduate Research Fund; QMUL-BUPT Joint Programme Funding; Women in Music Information Retrieval Grant.
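    As one plausible reading of the fusion idea, the sketch below averages two normalised onset detection functions computed on different representations before peak-picking; the thesis's actual fusion scheme, features and parameters may well differ.

```python
import librosa

def fused_onsets(path):
    """Onset times from a simple late fusion of two detection functions."""
    y, sr = librosa.load(path, sr=22050)
    env_mel = librosa.onset.onset_strength(y=y, sr=sr)  # mel spectral flux
    env_chroma = librosa.onset.onset_strength(
        y=y, sr=sr, feature=librosa.feature.chroma_stft)  # harmonic change
    norm = lambda e: e / (e.max() + 1e-9)
    fused = 0.5 * norm(env_mel) + 0.5 * norm(env_chroma)  # average the two
    frames = librosa.onset.onset_detect(onset_envelope=fused, sr=sr)
    return librosa.frames_to_time(frames, sr=sr)

# print(fused_onsets("jingju_excerpt.wav"))  # hypothetical recording
```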

    Sequential decision making in artificial musical intelligence

    Over the past 60 years, artificial intelligence has grown from a largely academic field of research to a ubiquitous array of tools and approaches used in everyday technology. Despite its many recent successes and growing prevalence, certain meaningful facets of computational intelligence have not been as thoroughly explored. Such additional facets cover a wide array of complex mental tasks which humans carry out easily, yet are difficult for computers to mimic. A prime example of a domain in which human intelligence thrives, but machine understanding is still fairly limited, is music. Over the last decade, many researchers have applied computational tools to carry out tasks such as genre identification, music summarization, music database querying, and melodic segmentation. While these are all useful algorithmic solutions, we are still a long way from constructing complete music agents, able to mimic (at least partially) the complexity with which humans approach music. One key aspect which has not been sufficiently studied is that of sequential decision making in musical intelligence. This thesis strives to answer the following question: can a sequential decision making perspective guide us in the creation of better music agents, and social agents in general? And if so, how? More specifically, this thesis focuses on two aspects of musical intelligence: music recommendation and human-agent (and more generally agent-agent) interaction in the context of music. The key contributions of this thesis are the design of better music playlist recommendation algorithms; the design of algorithms for tracking user preferences over time; new approaches for modeling people's behavior in situations that involve music; and the design of agents capable of meaningful interaction with humans and other agents in settings where music plays a role (either directly or indirectly). Though motivated primarily by music-related tasks, and focusing largely on people's musical preferences, this thesis also establishes that insights from music-specific case studies can be applicable in other concrete social domains, such as different types of content recommendation. Showing the generality of insights from musical data in other contexts serves as evidence for the utility of music domains as testbeds for the development of general artificial intelligence techniques. Ultimately, this thesis demonstrates the overall usefulness of taking a sequential decision making approach in settings previously unexplored from this perspective. Computer Science.
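    A toy example of the sequential decision making framing: an epsilon-greedy agent that recommends songs while tracking drifting user preferences with a recency-weighted estimate. This is a deliberately minimal stand-in for the thesis's algorithms; the song names, parameters and reward model are all hypothetical.

```python
import random

class PlaylistAgent:
    """Epsilon-greedy recommender with recency-weighted preference estimates."""

    def __init__(self, songs, epsilon=0.1, alpha=0.2):
        self.estimates = {s: 0.0 for s in songs}  # estimated enjoyment per song
        self.epsilon = epsilon                    # exploration rate
        self.alpha = alpha                        # recency weighting

    def pick(self):
        if random.random() < self.epsilon:        # explore occasionally
            return random.choice(list(self.estimates))
        return max(self.estimates, key=self.estimates.get)  # else exploit

    def update(self, song, reward):
        # Recency-weighted update lets the agent follow changing tastes.
        self.estimates[song] += self.alpha * (reward - self.estimates[song])

# agent = PlaylistAgent(["reel_1", "jig_2", "air_3"])
# song = agent.pick(); agent.update(song, reward=user_feedback(song))
```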

    Measuring Expressive Music Performances: a Performance Science Model using Symbolic Approximation

    Music Performance Science (MPS), sometimes termed systematic musicology in Northern Europe, is concerned with designing, testing and applying quantitative measurements to music performances. It has applications in art musics, jazz and other genres. It is least concerned with aesthetic judgements or with ontological considerations of artworks that stand alone from their instantiations in performances. Musicians deliver expressive performances by manipulating multiple, simultaneous variables including, but not limited to: tempo, acceleration and deceleration, dynamics, rates of change of dynamic levels, intonation and articulation. There are significant complexities when handling multivariate music datasets of significant scale. A critical issue in analyzing any type of large dataset is the likelihood of detecting meaningless relationships as more dimensions are included. One possible choice is to create algorithms that address both volume and complexity. Another, and the approach chosen here, is to apply techniques that reduce both the dimensionality and numerosity of the music datasets while assuring the statistical significance of results. This dissertation describes a flexible computational model, based on symbolic approximation of time series, that can extract time-related characteristics of music performances to generate performance fingerprints (dissimilarities from an ‘average performance’) to be used for comparative purposes. The model is applied to recordings of Arnold Schoenberg’s Phantasy for Violin with Piano Accompaniment, Opus 47 (1949), having initially been validated on Chopin Mazurkas. The results are subsequently used to test hypotheses about evolution in performance styles of the Phantasy since its composition. It is hoped that further research will examine other works and types of music in order to improve this model and make it useful to other music researchers. In addition to its benefits for performance analysis, it is suggested that the model has clear applications at least in music fraud detection, Music Information Retrieval (MIR) and in pedagogical applications for music education.
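    The symbolic approximation step can be illustrated with a SAX-style reduction of a tempo curve: z-normalise the series, compute a piecewise aggregate approximation, and map segment means to a small alphabet via Gaussian breakpoints. The alphabet size, segment count and input series below are illustrative assumptions, not the dissertation's settings.

```python
import numpy as np

def sax(series, n_segments=16, breakpoints=(-0.67, 0.0, 0.67)):
    """SAX-style symbolic approximation with a 4-symbol alphabet
    (breakpoints are the quartiles of a standard normal)."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-9)           # z-normalise
    x = x[: len(x) // n_segments * n_segments]      # trim to equal segments
    paa = x.reshape(n_segments, -1).mean(axis=1)    # piecewise aggregate approx.
    return "".join("abcd"[np.searchsorted(breakpoints, v)] for v in paa)

# Two performances' beat-level tempo curves reduce to short strings whose
# distance can serve as a performance-fingerprint dissimilarity, e.g.:
# print(sax(tempo_curve_a), sax(tempo_curve_b))  # hypothetical curves
```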

    Machine learning techniques for music information retrieval

    Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015. The advent of digital music has changed the rules of music consumption, distribution and sales. With it has emerged the need to effectively search and manage vast music collections. Music information retrieval is an interdisciplinary field of research that focuses on the development of new techniques with that aim in mind. This dissertation addresses a specific aspect of this field: methods that automatically extract musical information exclusively based on the audio signal. We propose a method for automatic music-based classification, label inference, and music similarity estimation. Our method consists of representing the audio with a finite set of symbols and then modeling the symbols' time evolution. The symbols are obtained via vector quantization, in which a single codebook is used to quantize the audio descriptors. The symbols' time evolution is modeled via a first-order Markov process. Based on systematic evaluations we carried out on publicly available sets, we show that our method achieves performance on par with most techniques found in the literature. We also present and discuss the problems that appear when computers try to classify or annotate songs using the audio as the only source of information. In our method, the separation of the quantization process from the creation and training of classification models helped us in that analysis. It enabled us to examine how instantaneous sound attributes (henceforth features) are distributed in terms of musical genre, and how designing codebooks specially tailored for these distributions affects the performance of ours and other classification systems commonly used for this task. On this issue, we show that there is no apparent benefit in seeking a thorough representation of the feature space. This is somewhat unexpected, since it goes against the assumption, implicit in many genre recognition methods, that features carry equally relevant information loads and somehow capture the specificities of musical facets. Label inference is the task of automatically annotating songs with semantic words; this task is also known as autotagging. In this context, we illustrate the importance of a number of issues that, in our perspective, are often overlooked. We show that current techniques are fragile in the sense that small alterations in the set of labels may lead to dramatically different results. Furthermore, through a series of experiments, we show that autotagging systems fail to learn tag models capable of generalizing to datasets of different origins. We also show that the performance achieved with these techniques is not sufficient to be able to take advantage of the correlations between tags. Fundação para a Ciência e a Tecnologia (FCT).
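    The core representation described here is a single vector-quantization codebook followed by a first-order Markov model over the resulting symbol sequence. The sketch below shows that shape with a k-means codebook and an add-one-smoothed transition matrix; the descriptor choice, codebook size and similarity measure are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def markov_model(descriptors, codebook, n_symbols):
    """Row-stochastic transition matrix of the quantized symbol sequence."""
    symbols = codebook.predict(descriptors)  # VQ: frame descriptors -> symbols
    T = np.ones((n_symbols, n_symbols))      # add-one smoothing
    for a, b in zip(symbols[:-1], symbols[1:]):
        T[a, b] += 1
    return T / T.sum(axis=1, keepdims=True)  # normalise rows to probabilities

# A single codebook trained on descriptors pooled over the whole corpus:
# codebook = KMeans(n_clusters=200).fit(np.vstack(all_song_descriptors))
# models = [markov_model(d, codebook, 200) for d in all_song_descriptors]
# Song similarity can then be, e.g., a distance between transition matrices.
```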