
    Musical Instrument Timbres Classification with Spectral Features

    A set of features is evaluated for the recognition of musical instruments from monophonic musical signals. To achieve a compact representation, the adopted features describe only spectral characteristics of the sound and are limited in number. On top of these descriptors, various classification methods are implemented and tested. Over a dataset of 1007 tones from 27 musical instruments, support vector machines and quadratic discriminant analysis show comparable results, with success rates close to 70%. Canonical discriminant analysis never produced notable results, while nearest neighbours performed about average among the employed classifiers. Strings were the most misclassified instrument family, while very satisfactory results were obtained for brass and woodwinds. The most relevant features proved to be the inharmonicity, the spectral centroid, and the energy contained in the first partial.
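    A minimal sketch of this kind of pipeline, assuming librosa and scikit-learn. The paper's strongest descriptors (inharmonicity and first-partial energy) require custom harmonic analysis, so stock spectral descriptors stand in here; `tone_paths` and `labels` are hypothetical placeholders for the 1007-tone dataset.

    ```python
    # Sketch: compact spectral features with SVM and QDA classifiers.
    # `tone_paths` and `labels` are hypothetical placeholders for the dataset.
    import numpy as np
    import librosa
    from sklearn.svm import SVC
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
    from sklearn.model_selection import cross_val_score

    def spectral_features(path):
        y, sr = librosa.load(path, sr=None, mono=True)
        return np.array([
            librosa.feature.spectral_centroid(y=y, sr=sr).mean(),
            librosa.feature.spectral_bandwidth(y=y, sr=sr).mean(),
            librosa.feature.spectral_rolloff(y=y, sr=sr).mean(),
            librosa.feature.spectral_flatness(y=y).mean(),
        ])

    X = np.vstack([spectral_features(p) for p in tone_paths])
    targets = np.array(labels)

    for clf in (SVC(kernel="rbf"), QuadraticDiscriminantAnalysis()):
        print(type(clf).__name__, cross_val_score(clf, X, targets, cv=5).mean())
    ```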

    A flexible bio-inspired hierarchical model for analyzing musical timbre

    A flexible and multipurpose bio-inspired hierarchical model for analyzing musical timbre is presented in this paper. Inspired by findings in the fields of neuroscience, computational neuroscience, and psychoacoustics, the model not only extracts spectral and temporal characteristics of a signal but also analyzes amplitude modulations on different timescales. It uses a cochlear filter bank to resolve the spectral components of a sound, lateral inhibition to enhance spectral resolution, and a modulation filter bank to extract the global temporal envelope and roughness of the sound from amplitude modulations. The model was evaluated in three applications. First, it was used to simulate subjective data from two roughness experiments. Second, it was used for musical instrument classification with the k-NN algorithm and a Bayesian network. Third, it was applied to find the features that characterize sounds whose timbres were labeled in an audiovisual experiment. The successful application of the proposed model in these diverse tasks reveals its potential for capturing timbral information.
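    The front end of such a model can be approximated in a few lines. The sketch below uses a gammatone filter bank as the cochlear stage and a Hilbert envelope plus an FFT over it as the modulation analysis; lateral inhibition and the full modulation filter bank are omitted, and the 30-150 Hz band is only a crude roughness proxy, not the paper's model.

    ```python
    # Sketch: gammatone "cochlear" channels, Hilbert envelopes, and a crude
    # roughness proxy from envelope-modulation energy (30-150 Hz band).
    import numpy as np
    from scipy.signal import gammatone, hilbert, lfilter

    def cochlear_modulation_profile(x, fs, centers=(200, 400, 800, 1600, 3200)):
        profile = []
        for fc in centers:
            b, a = gammatone(fc, 'iir', fs=fs)        # one cochlear channel
            env = np.abs(hilbert(lfilter(b, a, x)))   # temporal envelope
            spec = np.abs(np.fft.rfft(env - env.mean()))
            freqs = np.fft.rfftfreq(env.size, 1 / fs)
            profile.append((fc, spec[(freqs >= 30) & (freqs <= 150)].sum()))
        return profile

    # 800 Hz tone with 70 Hz amplitude modulation: a classically "rough" sound
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 800 * t) * (1 + 0.5 * np.sin(2 * np.pi * 70 * t))
    print(cochlear_modulation_profile(x, fs))
    ```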

    Audio Classification from Time-Frequency Texture

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm that treats sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While based solely on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.
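    The idea can be prototyped by rasterising a spectrogram to an 8-bit image and extracting standard texture statistics. Gray-level co-occurrence (GLCM) features are swapped in below as a generic texture descriptor, not the specific visual scheme the paper builds on; "clip.wav" is a hypothetical input file.

    ```python
    # Sketch: spectrogram-as-texture features via GLCM statistics.
    import numpy as np
    import librosa
    from skimage.feature import graycomatrix, graycoprops

    def texture_features(path):
        y, sr = librosa.load(path, sr=22050, mono=True)
        S = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
        img = np.uint8(255 * (S - S.min()) / (np.ptp(S) + 1e-9))  # 8-bit image
        glcm = graycomatrix(img, distances=[1, 4], angles=[0, np.pi / 2],
                            levels=256, symmetric=True, normed=True)
        props = ["contrast", "homogeneity", "energy", "correlation"]
        return np.hstack([graycoprops(glcm, p).ravel() for p in props])

    features = texture_features("clip.wav")   # feed these to any classifier
    ```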

    Note-based segmentation and hierarchy in the classification of digital musical instruments

    The ability to automatically identify the musical instruments occurring in a recorded piece of music has important uses for various music-related applications. This paper examines the case of instrument classification where the raw data consists of musical phrases performed on digital instruments from eight instrument families. We compare the use of features extracted from a continuous sample of approximately one second with a systematic segmentation of the audio on note boundaries, using multiple, aligned note samples as input to the classifiers. The accuracy of the segmented approach was greater than that of the unsegmented approach. The best method was a two-tiered hierarchical approach, which performed slightly better than the single-tiered flat approach. The best performing instrument category was woodwind, with an accuracy of 94% for the segmented approach using the Bayesian network classifier. Distinguishing different types of pianos was difficult for all classifiers, with the segmented approach yielding an accuracy of 56%. For humans, broadly similar results were found: pianos were difficult to distinguish, along with woodwind and solo string instruments. However, there was no symmetry between human comparisons of identical instruments and of different instruments, with half of the broad instrument categories having widely different accuracies for the two cases.
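    A sketch of the segmented pipeline, assuming librosa: onset detection stands in for the paper's note-boundary segmentation, MFCC means stand in for its feature set, and "phrase.wav" is a hypothetical input file.

    ```python
    # Sketch: segment a phrase on note boundaries and build one aligned
    # feature row per note, for input to a classifier.
    import numpy as np
    import librosa

    def per_note_features(path):
        y, sr = librosa.load(path, sr=None, mono=True)
        onsets = librosa.onset.onset_detect(y=y, sr=sr, units="samples")
        bounds = np.concatenate([[0], onsets, [len(y)]])
        rows = []
        for start, end in zip(bounds[:-1], bounds[1:]):
            note = y[start:end]
            if len(note) < 512:            # skip fragments between onsets
                continue
            rows.append(librosa.feature.mfcc(y=note, sr=sr, n_mfcc=13).mean(axis=1))
        return np.vstack(rows)             # notes x features

    print(per_note_features("phrase.wav").shape)
    ```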

    Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

    Deep generative neural networks have thrived in the field of computer vision, enabling unprecedented intelligent image processing. Yet the results in audio remain less advanced, and many applications are still to be investigated. Our project targets real-time sound synthesis from a reduced set of high-level parameters, including semantic controls that can be adapted to different sound libraries and specific tags. These generative variables should allow expressive modulations of target musical qualities and continuously mix into new styles. To this end we train auto-encoders on an orchestral database of individual note samples, along with their intrinsic attributes: note class, timbre domain (an instrument subset), and extended playing techniques. We condition the decoder for explicit control over the rendered note attributes and use latent adversarial training to learn expressive style parameters that can ultimately be mixed. We evaluate both generative performance and correlations of the attributes with the latent representation. Our ablation study demonstrates the effectiveness of the musical conditioning. The proposed model generates individual notes as magnitude spectrograms from any probabilistic latent code samples (each latent point maps to a single note), with expressive control of orchestral timbres and playing styles. Its training data subsets can be visualized directly in the three-dimensional latent representation. Waveform rendering can be done offline with the Griffin-Lim algorithm. In order to allow real-time interactions, we fine-tune the decoder with a pretrained magnitude spectrogram inversion network and embed the full waveform generation pipeline in a plugin. Moreover, the encoder can be used to process new input samples; after manipulating their latent attribute representation, the decoder can generate sample variations, as an audio effect would. Our solution remains rather lightweight and fast to train, and it can be applied directly to other sound domains, including a user's own libraries with custom sound tags that can be mapped to specific generative controls. As a result, it fosters creativity and intuitive audio style experimentation. Sound examples and additional visualizations are available on GitHub, as well as code after the review process.
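    The offline rendering path can be sketched as follows, assuming PyTorch. The decoder and attribute one-hot sizes below are throwaway stand-ins, not the paper's network; librosa's Griffin-Lim is a real call.

    ```python
    # Sketch: sample a latent code, condition on note attributes, decode a
    # magnitude spectrogram, and render offline with Griffin-Lim.
    import torch
    import torch.nn as nn
    import librosa

    N_BINS, N_FRAMES = 513, 64                 # assumed spectrogram shape
    decoder = nn.Sequential(                   # placeholder, not the paper's net
        nn.Linear(64 + 88 + 12, 1024), nn.ReLU(),
        nn.Linear(1024, N_BINS * N_FRAMES), nn.Softplus(),
    )

    z = torch.randn(1, 64)                                               # latent sample
    note = torch.nn.functional.one_hot(torch.tensor([39]), 88).float()   # note class
    timbre = torch.nn.functional.one_hot(torch.tensor([3]), 12).float()  # timbre domain
    mag = decoder(torch.cat([z, note, timbre], dim=-1))
    mag = mag.view(N_BINS, N_FRAMES).detach().numpy()

    audio = librosa.griffinlim(mag, n_iter=64)   # offline waveform rendering
    ```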

    Ontology of music performance variation

    Performance variation in rhythm determines the extent to which humans perceive and feel the effect of rhythmic pulsation and music in general. In many cases, these rhythmic variations can be linked to percussive performance. Such percussive performance variations are often absent from current percussive rhythmic models. The purpose of this thesis is to present an interactive computer model, called the PD-103, that simulates the micro-variations in human percussive performance. This thesis makes three main contributions to existing knowledge: firstly, by formalising a new method for modelling percussive performance; secondly, by developing a new compositional software tool called the PD-103 that models human percussive performance; and finally, by creating a portfolio of different musical styles to demonstrate the capabilities of the software. A large database of recorded samples is classified into zones based upon the vibrational characteristics of the instruments, to model timbral variation in human percussive performance. The degree of timbral variation is governed by principles of biomechanics and human percussive performance. A fuzzy logic algorithm is applied to analyse current and first-order sample selection in order to formulate an ontological description of music performance variation. Asynchrony values were extracted from recorded performances at three different skill levels to create "timing fingerprints" which characterise features unique to each percussionist. The PD-103 uses real performance timing data to determine asynchrony values for each synthesised note. The spectral content of the sample database forms a three-dimensional loudness/timbre space, intersecting instrumental behaviour with music composition. The reparameterisation of the sample database, following the analysis of loudness, spectral flatness, and spectral centroid, provides an opportunity to creatively explore the timbral variations inherent in percussion instruments. The PD-103 was used to create a music portfolio exploring different rhythmic possibilities, with a focus on meso-periodic rhythms common to parts of West Africa, jazz drumming, and electroacoustic music. The portfolio also includes new timbral percussive works based on spectral features and demonstrates the central aim of this thesis: the creation of a new compositional software tool that integrates human percussive performance and extends this model to different genres of music.
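    As an illustration of the timing-fingerprint idea, the sketch below perturbs a nominal rhythmic grid with per-performer asynchrony statistics; the mean and standard deviation are invented examples, not the PD-103's data.

    ```python
    # Sketch: humanising a nominal 16th-note grid with a performer's
    # asynchrony statistics (all values are hypothetical).
    import numpy as np

    rng = np.random.default_rng(0)
    grid = np.arange(16) * 0.125                     # nominal onsets (seconds)
    fingerprint_mean, fingerprint_sd = 0.004, 0.009  # per-performer stats (s)
    asynchrony = rng.normal(fingerprint_mean, fingerprint_sd, grid.size)
    performed = grid + asynchrony                    # humanised note onsets
    print(np.round(performed, 3))
    ```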

    Concatenative Synthesis for Novel Timbral Creation

    Modern-day musicians rely on a variety of instruments for musical expression. Tones produced by electronic instruments have become almost as commonplace as those produced by traditional ones, as evidenced by the plethora of artists who compose and perform with nothing more than a personal computer. This desire to embrace technical innovation as a means to augment performance art has created a budding field in computer science that explores the creation and manipulation of sound for artistic purposes. One facet of this new frontier concerns timbral creation, or the development of new sounds with unique characteristics that can be wielded by the musician as a virtual instrument. This thesis presents Timcat, a software system that can be used to create novel timbres from prerecorded audio. Various techniques for timbral feature extraction from short audio clips, or grains, are evaluated for use in timbral feature spaces. Clustering is performed on feature vectors in these spaces, and the resulting groupings are recombined using concatenative synthesis techniques to form new instrument patches. The results reveal that interesting timbres can be created using features extracted by both newly developed and existing signal analysis techniques, many of them common in other fields though not often applied to music audio signals. Several of the features employed also show high accuracy for instrument separation in randomly mixed tracks. Survey results demonstrate positive feedback concerning the timbres created by Timcat from electronic music composers, musicians, and music lovers alike.
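    A generic stand-in for this grain/cluster/concatenate pipeline, assuming librosa and scikit-learn; the grain size, feature choice, cluster count, and "source.wav" are all illustrative, not Timcat's actual parameters.

    ```python
    # Sketch: slice audio into grains, cluster by timbral features, and
    # concatenate one cluster into a new instrument patch.
    import numpy as np
    import librosa
    from sklearn.cluster import KMeans

    def grains(y, size=2048, hop=1024):
        return np.stack([y[i:i + size] for i in range(0, len(y) - size, hop)])

    y, sr = librosa.load("source.wav", sr=22050, mono=True)
    g = grains(y)
    feats = np.array([
        [librosa.feature.spectral_centroid(y=gr, sr=sr).mean(),
         librosa.feature.spectral_flatness(y=gr).mean(),
         float(np.sqrt(np.mean(gr ** 2)))]           # RMS loudness
        for gr in g
    ])
    labels = KMeans(n_clusters=8, n_init="auto", random_state=0).fit_predict(feats)
    patch = np.concatenate(g[labels == 0])           # one cluster -> one timbre
    ```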