14 research outputs found
Efficient Coding and Statistically Optimal Weighting of Covariance among Acoustic Attributes in Novel Sounds
To the extent that sensorineural systems are efficient, redundancy should be extracted to optimize transmission of information, but perceptual evidence for this has been limited. Stilp and colleagues recently reported efficient coding of robust correlation (r = .97) among complex acoustic attributes (attack/decay, spectral shape) in novel sounds. Discrimination of sounds orthogonal to the correlation was initially inferior but later comparable to that of sounds obeying the correlation. These effects were attenuated for less-correlated stimuli (r = .54) for reasons that are unclear. Here, statistical properties of correlation among acoustic attributes essential for perceptual organization are investigated. Overall, simple strength of the principal correlation is inadequate to predict listener performance. Initial superiority of discrimination for statistically consistent sound pairs was relatively insensitive to decreased physical acoustic/psychoacoustic range of evidence supporting the correlation, and to more frequent presentations of the same orthogonal test pairs. However, increased range supporting an orthogonal dimension has substantial effects upon perceptual organization. Connectionist simulations and Eigenvalues from closed-form calculations of principal components analysis (PCA) reveal that perceptual organization is near-optimally weighted to shared versus unshared covariance in experienced sound distributions. Implications of reduced perceptual dimensionality for speech perception and plausible neural substrates are discussed
A neurally-inspired musical instrument classification system based upon the sound onset
Physiological evidence suggests that sound onset detection in the auditory system may be performed by specialized neurons as early as the cochlear nucleus. Psychoacoustic evidence shows that the sound onset can be important for the recognition of musical sounds. Here the sound onset is used in isolation to form tone descriptors for a musical instrument classification task. The task involves 2085 isolated musical tones from the McGill dataset across five instrument categories. A neurally inspired tone descriptor is created using a model of the auditory system's response to sound onset. A gammatone filterbank and spiking onset detectors, built from dynamic synapses and leaky integrate-and-fire neurons, create parallel spike trains that emphasize the sound onset. These are coded as a descriptor called the onset fingerprint. Classification uses a time-domain neural network, the echo state network. Reference strategies, based upon mel-frequency cepstral coefficients, evaluated either over the whole tone or only during the sound onset, provide context to the method. Classification success rates for the neurally-inspired method are around 75%. The cepstral methods perform between 73% and 76%. Further testing with tones from the Iowa MIS collection shows that the neurally inspired method is considerably more robust when tested with data from an unrelated dataset
Problems with Automatic Classification of Musical Sounds
Convenient searching of multimedia databases requires well annotated data. Labeling sound data with information like pitch or timbre must be done through sound analysis. In this paper, we deal with the problem of automatic classification of musical instrument on the basis of its sound. Although there are algorithms for basic sound descriptors extraction, correct identification of instrument still poses a problem. We describe di#culties encountered when classifying woodwinds, brass, and strings of contemporary orchestra. We discuss most di#cult cases and explain why these sounds cause problems. The conclusions are drawn and presented in brief summary closing the paper
Acoustic structure of the five perceptual dimensions of timbre in orchestral instrument tones
Attempts to relate the perceptual dimensions of timbre to quantitative acoustical dimensions have been tenuous, leading to claims that timbre is an emergent property, if measurable at all. Here, a three-pronged analysis shows that the timbre space of sustained instrument tones occupies 5 dimensions and that a specific combination of acoustic properties uniquely determines gestalt perception of timbre. Firstly, multidimensional scaling (MDS) of dissimilarity judgments generated a perceptual timbre space in which 5 dimensions were cross-validated and selected by traditional model comparisons. Secondly, subjects rated tones on semantic scales. A discriminant function analysis (DFA) accounting for variance of these semantic ratings across instruments and between subjects also yielded 5 significant dimensions with similar stimulus ordination. The dimensions of timbre space were then interpreted semantically by rotational and reflectional projection of the MDS solution into two DFA dimensions. Thirdly, to relate this final space to acoustical structure, the perceptual MDS coordinates of each sound were regressed with its joint spectrotemporal modulation power spectrum. Sound structures correlated significantly with distances in perceptual timbre space. Contrary to previous studies, most perceptual timbre dimensions are not the result of purely temporal or spectral features but instead depend on signature spectrotemporal patterns