
    Musical Instrument Timbres Classification with Spectral Features

    A set of features is evaluated for the recognition of musical instruments from monophonic musical signals. To achieve a compact representation, the adopted features describe only spectral characteristics of the sound and are limited in number. On top of these descriptors, various classification methods are implemented and tested. Over a dataset of 1007 tones from 27 musical instruments, support vector machines and quadratic discriminant analysis show comparable results, with success rates close to 70%. Canonical discriminant analysis never produced notable results, while nearest neighbours performed about average among the employed classifiers. Strings were the most misclassified instrument family, while very satisfactory results were obtained for brass and woodwinds. The most relevant features proved to be the inharmonicity, the spectral centroid, and the energy contained in the first partial.
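    A minimal sketch of this kind of pipeline, assuming librosa and scikit-learn. The paper's strongest descriptors (inharmonicity and first-partial energy) require custom harmonic analysis, so stock spectral descriptors stand in here; `tone_paths` and `labels` are hypothetical placeholders for the 1007-tone dataset.

    ```python
    # Sketch: compact spectral features with SVM and QDA classifiers.
    # `tone_paths` and `labels` are hypothetical placeholders for the dataset.
    import numpy as np
    import librosa
    from sklearn.svm import SVC
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    def spectral_features(path):
        y, sr = librosa.load(path, sr=None, mono=True)
        return np.array([
            librosa.feature.spectral_centroid(y=y, sr=sr).mean(),
            librosa.feature.spectral_bandwidth(y=y, sr=sr).mean(),
            librosa.feature.spectral_rolloff(y=y, sr=sr).mean(),
            librosa.feature.spectral_flatness(y=y).mean(),
        ])

    X = np.vstack([spectral_features(p) for p in tone_paths])
    targets = np.array(labels)

    for clf in (SVC(kernel="rbf"), QuadraticDiscriminantAnalysis()):
        print(type(clf).__name__, cross_val_score(clf, X, targets, cv=5).mean())
    ```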

    A flexible bio-inspired hierarchical model for analyzing musical timbre

    A flexible and multipurpose bio-inspired hierarchical model for analyzing musical timbre is presented in this paper. Inspired by findings in the fields of neuroscience, computational neuroscience, and psychoacoustics, the model not only extracts spectral and temporal characteristics of a signal but also analyzes amplitude modulations on different timescales. It uses a cochlear filter bank to resolve the spectral components of a sound, lateral inhibition to enhance spectral resolution, and a modulation filter bank to extract the global temporal envelope and roughness of the sound from amplitude modulations. The model was evaluated in three applications. First, it was used to simulate subjective data from two roughness experiments. Second, it was used for musical instrument classification with the k-NN algorithm and a Bayesian network. Third, it was applied to find the features that characterize sounds whose timbres were labeled in an audiovisual experiment. The successful application of the proposed model in these diverse tasks reveals its potential for capturing timbral information.
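    The front end of such a model can be approximated in a few lines. The sketch below uses a gammatone filter bank as the cochlear stage and a Hilbert envelope plus an FFT over it as the modulation analysis; lateral inhibition and the full modulation filter bank are omitted, and the 30-150 Hz band is only a crude roughness proxy, not the paper's model.

    ```python
    # Sketch: gammatone "cochlear" channels, Hilbert envelopes, and a crude
    # roughness proxy from envelope-modulation energy (30-150 Hz band).
    import numpy as np
    from scipy.signal import gammatone, hilbert, lfilter

    def cochlear_modulation_profile(x, fs, centers=(200, 400, 800, 1600, 3200)):
        profile = []
        for fc in centers:
            b, a = gammatone(fc, 'iir', fs=fs)        # one cochlear channel
            env = np.abs(hilbert(lfilter(b, a, x)))   # temporal envelope
            spec = np.abs(np.fft.rfft(env - env.mean()))
            freqs = np.fft.rfftfreq(env.size, 1 / fs)
            profile.append((fc, spec[(freqs >= 30) & (freqs <= 150)].sum()))
        return profile

    # 800 Hz tone with 70 Hz amplitude modulation: a classically "rough" sound
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 800 * t) * (1 + 0.5 * np.sin(2 * np.pi * 70 * t))
    print(cochlear_modulation_profile(x, fs))
    ```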

    Audio Classification from Time-Frequency Texture

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm that treats sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While based solely on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.
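    The idea can be prototyped by rasterising a spectrogram to an 8-bit image and extracting standard texture statistics. Gray-level co-occurrence (GLCM) features are swapped in below as a generic texture descriptor, not the specific visual scheme the paper builds on; "clip.wav" is a hypothetical input file.

    ```python
    # Sketch: spectrogram-as-texture features via GLCM statistics.
    import numpy as np
    import librosa
    from skimage.feature import graycomatrix, graycoprops

    def texture_features(path):
        y, sr = librosa.load(path, sr=22050, mono=True)
        S = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
        img = np.uint8(255 * (S - S.min()) / (np.ptp(S) + 1e-9))  # 8-bit image
        glcm = graycomatrix(img, distances=[1, 4], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        props = ["contrast", "homogeneity", "energy", "correlation"]
        return np.hstack([graycoprops(glcm, p).ravel() for p in props])

    features = texture_features("clip.wav")   # feed these to any classifier
    ```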

    Note-based segmentation and hierarchy in the classification of digital musical instruments

    The ability to automatically identify the musical instruments occurring in a recorded piece of music has important uses for various music-related applications. This paper examines the case of instrument classification where the raw data consists of musical phrases performed on digital instruments from eight instrument families. We compare the use of features extracted from a continuous sample of approximately one second with a systematic segmentation of the audio on note boundaries, using multiple, aligned note samples as input to the classifiers. The accuracy of the segmented approach was greater than that of the unsegmented approach. The best method was a two-tiered hierarchical approach, which performed slightly better than the single-tiered flat approach. The best performing instrument category was woodwind, with an accuracy of 94% for the segmented approach using the Bayesian network classifier. Distinguishing different types of pianos was difficult for all classifiers, with the segmented approach yielding an accuracy of 56%. For humans, broadly similar results were found: pianos were difficult to distinguish, along with woodwind and solo string instruments. However, there was no symmetry between human comparisons of identical instruments and of different instruments, with half of the broad instrument categories having widely different accuracies for the two cases.
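    A sketch of the segmented pipeline, assuming librosa: onset detection stands in for the paper's note-boundary segmentation, MFCC means stand in for its feature set, and "phrase.wav" is a hypothetical input file.

    ```python
    # Sketch: segment a phrase on note boundaries and build one aligned
    # feature row per note, for input to a classifier.
    import numpy as np
    import librosa

    def per_note_features(path):
        y, sr = librosa.load(path, sr=None, mono=True)
        onsets = librosa.onset.onset_detect(y=y, sr=sr, units="samples")
        bounds = np.concatenate([[0], onsets, [len(y)]])
        rows = []
        for start, end in zip(bounds[:-1], bounds[1:]):
            note = y[start:end]
            if len(note) < 512:            # skip fragments between onsets
                continue
            rows.append(librosa.feature.mfcc(y=note, sr=sr, n_mfcc=13).mean(axis=1))
        return np.vstack(rows)             # notes x features

    print(per_note_features("phrase.wav").shape)
    ```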

    Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

    Deep generative neural networks have thrived in the field of computer vision, enabling unprecedented intelligent image processing. Yet the results in audio remain less advanced, and many applications are still to be investigated. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including semantic controls that can be adapted to different sound libraries and specific tags. These generative variables should allow expressive modulations of target musical qualities and continuously mix into new styles. To this end we train auto-encoders on an orchestral database of individual note samples, along with their intrinsic attributes: note class, timbre domain (an instrument subset), and extended playing techniques. We condition the decoder for explicit control over the rendered note attributes and use latent adversarial training to learn expressive style parameters that can ultimately be mixed. We evaluate both generative performance and correlations of the attributes with the latent representation. Our ablation study demonstrates the effectiveness of the musical conditioning. The proposed model generates individual notes as magnitude spectrograms from any probabilistic latent code samples (each latent point maps to a single note), with expressive control of orchestral timbres and playing styles. Its training data subsets can be visualized directly in the three-dimensional latent representation. Waveform rendering can be done offline with the Griffin-Lim algorithm. In order to allow real-time interactions, we fine-tune the decoder with a pretrained magnitude spectrogram inversion network and embed the full waveform generation pipeline in a plugin. Moreover, the encoder can be used to process new input samples; after manipulating their latent attribute representation, the decoder can generate sample variations, as an audio effect would. Our solution remains rather lightweight and fast to train, and it can be applied directly to other sound domains, including a user's own libraries with custom sound tags that can be mapped to specific generative controls. As a result, it fosters creativity and intuitive audio style experimentation. Sound examples and additional visualizations are available on GitHub, as well as code after the review process.
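    The offline rendering path can be sketched as follows, assuming PyTorch. The decoder and attribute one-hot sizes below are throwaway stand-ins, not the paper's network; librosa's Griffin-Lim is a real call.

    ```python
    # Sketch: sample a latent code, condition on note attributes, decode a
    # magnitude spectrogram, and render offline with Griffin-Lim.
    import torch
    import torch.nn as nn
    import librosa

    N_BINS, N_FRAMES = 513, 64                 # assumed spectrogram shape
    decoder = nn.Sequential(                   # placeholder, not the paper's net
        nn.Linear(64 + 88 + 12, 1024), nn.ReLU(),
        nn.Linear(1024, N_BINS * N_FRAMES), nn.Softplus(),
    )

    z = torch.randn(1, 64)                                               # latent sample
    note = torch.nn.functional.one_hot(torch.tensor([39]), 88).float()   # note class
    timbre = torch.nn.functional.one_hot(torch.tensor([3]), 12).float()  # timbre domain
    mag = decoder(torch.cat([z, note, timbre], dim=-1))
    mag = mag.view(N_BINS, N_FRAMES).detach().numpy()

    audio = librosa.griffinlim(mag, n_iter=64)   # offline waveform rendering
    ```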

    Ontology of music performance variation

    Performance variation in rhythm determines the extent to which humans perceive and feel the effect of rhythmic pulsation and music in general. In many cases, these rhythmic variations can be linked to percussive performance. Such percussive performance variations are often absent from current percussive rhythmic models. The purpose of this thesis is to present an interactive computer model, called the PD-103, that simulates the micro-variations in human percussive performance. This thesis makes three main contributions to existing knowledge: firstly, by formalising a new method for modelling percussive performance; secondly, by developing a new compositional software tool called the PD-103 that models human percussive performance; and finally, by creating a portfolio of different musical styles to demonstrate the capabilities of the software. A large database of recorded samples is classified into zones based upon the vibrational characteristics of the instruments, to model timbral variation in human percussive performance. The degree of timbral variation is governed by principles of biomechanics and human percussive performance. A fuzzy logic algorithm is applied to analyse current and first-order sample selection in order to formulate an ontological description of music performance variation. Asynchrony values were extracted from recorded performances at three different skill levels to create "timing fingerprints" which characterise features unique to each percussionist. The PD-103 uses real performance timing data to determine asynchrony values for each synthesised note. The spectral content of the sample database forms a three-dimensional loudness/timbre space, intersecting instrumental behaviour with music composition. The reparameterisation of the sample database, following the analysis of loudness, spectral flatness, and spectral centroid, provides an opportunity to creatively explore the timbral variations inherent in percussion instruments. The PD-103 was used to create a music portfolio exploring different rhythmic possibilities, with a focus on meso-periodic rhythms common to parts of West Africa, jazz drumming, and electroacoustic music. The portfolio also includes new timbral percussive works based on spectral features and demonstrates the central aim of this thesis: the creation of a new compositional software tool that integrates human percussive performance and extends this model to different genres of music.
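    As an illustration of the timing-fingerprint idea, the sketch below perturbs a nominal rhythmic grid with per-performer asynchrony statistics; the mean and standard deviation are invented examples, not the PD-103's data.

    ```python
    # Sketch: humanising a nominal 16th-note grid with a performer's
    # asynchrony statistics (all values are hypothetical).
    import numpy as np

    rng = np.random.default_rng(0)
    grid = np.arange(16) * 0.125                     # nominal onsets (seconds)
    fingerprint_mean, fingerprint_sd = 0.004, 0.009  # per-performer stats (s)
    asynchrony = rng.normal(fingerprint_mean, fingerprint_sd, grid.size)
    performed = grid + asynchrony                    # humanised note onsets
    print(np.round(performed, 3))
    ```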

    Concatenative Synthesis for Novel Timbral Creation

    Modern-day musicians rely on a variety of instruments for musical expression. Tones produced by electronic instruments have become almost as commonplace as those produced by traditional ones, as evidenced by the plethora of artists who compose and perform with nothing more than a personal computer. This desire to embrace technical innovation as a means to augment performance art has created a budding field in computer science that explores the creation and manipulation of sound for artistic purposes. One facet of this new frontier concerns timbral creation, or the development of new sounds with unique characteristics that can be wielded by the musician as a virtual instrument. This thesis presents Timcat, a software system that can be used to create novel timbres from prerecorded audio. Various techniques for timbral feature extraction from short audio clips, or grains, are evaluated for use in timbral feature spaces. Clustering is performed on feature vectors in these spaces, and the resulting groupings are recombined using concatenative synthesis techniques to form new instrument patches. The results reveal that interesting timbres can be created using features extracted by both newly developed and existing signal analysis techniques, many of them common in other fields though not often applied to music audio signals. Several of the features employed also show high accuracy for instrument separation in randomly mixed tracks. Survey results demonstrate positive feedback concerning the timbres created by Timcat from electronic music composers, musicians, and music lovers alike.
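    A generic stand-in for this grain/cluster/concatenate pipeline, assuming librosa and scikit-learn; the grain size, feature choice, cluster count, and "source.wav" are all illustrative, not Timcat's actual parameters.

    ```python
    # Sketch: slice audio into grains, cluster by timbral features, and
    # concatenate one cluster into a new instrument patch.
    import numpy as np
    import librosa
    from sklearn.cluster import KMeans

    def grains(y, size=2048, hop=1024):
        return np.stack([y[i:i + size] for i in range(0, len(y) - size, hop)])

    y, sr = librosa.load("source.wav", sr=22050, mono=True)
    g = grains(y)
    feats = np.array([
        [librosa.feature.spectral_centroid(y=gr, sr=sr).mean(),
         librosa.feature.spectral_flatness(y=gr).mean(),
         float(np.sqrt(np.mean(gr ** 2)))]           # RMS loudness
        for gr in g
    ])
    labels = KMeans(n_clusters=8, n_init="auto", random_state=0).fit_predict(feats)
    patch = np.concatenate(g[labels == 0])           # one cluster -> one timbre
    ```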