
    Polyphonic music information retrieval based on multi-label cascade classification system

    Recognition and separation of sounds played by various instruments is very useful for labeling audio files with semantic information. This is a non-trivial task requiring sound analysis, but the results can aid automatic indexing and browsing of music data when searching for melodies played by user-specified instruments. Melody matching based on pitch-detection technology has drawn much attention, and many MIR systems have been developed to fulfill this task; however, musical instrument recognition remains an unsolved problem in the domain. Numerous approaches to acoustic feature extraction have already been proposed for timbre recognition. Unfortunately, none of these monophonic timbre-estimation algorithms can be successfully applied to polyphonic sounds, which are the more common case in real-world music. This has stimulated research on multi-label instrument classification and the development of new features for content-based automatic music information retrieval. Raw audio signals are large volumes of unstructured sequential values, which are not suitable for traditional data mining algorithms, while acoustic features alone are sometimes insufficient for instrument recognition in polyphonic sounds because they are higher-level representations of the raw signal that lack details of the original information. In order to capture patterns that evolve over time, new temporal features are introduced to supply more temporal information for timbre recognition. We introduce a multi-label classification system that estimates multiple timbre information from a polyphonic sound by classification based on acoustic features and short-term power-spectrum matching. In order to achieve a higher estimation rate, we introduce a hierarchically structured cascade classification system inspired by the human perceptual process.
    This cascade classification system first estimates a higher-level decision attribute representing the musical instrument family; further estimation is then performed within that specific family. Experiments showed that the hierarchical system outperforms the traditional flat classification method, which estimates the instrument directly without higher-level family analysis. Traditional hierarchical structures were constructed from human semantics, which are meaningful from a human perspective but not well suited to the cascade system. We introduce a new hierarchical instrument schema derived from clustering of the acoustic features. This new schema better describes the similarity among different instruments, and among different playing techniques of the same instrument. The classification results show higher accuracy for the cascade system with the new schema than with the traditional schemas. A query-answering system is built on top of the cascade classifier.
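The two-stage cascade described in this abstract can be sketched with a toy nearest-centroid classifier. The feature vectors, the family schema, and the nearest-centroid rule below are all illustrative assumptions for demonstration, not the paper's actual features or classifiers:

```python
import numpy as np

# Minimal sketch of cascade classification: predict the instrument family
# first, then estimate the instrument only within that family.
# The 2-D "acoustic feature" centroids below are invented for illustration.
schema = {
    "strings": {"violin": [0.0, 0.0], "cello": [0.0, 3.0]},
    "brass":   {"trumpet": [10.0, 0.0], "tuba": [10.0, 3.0]},
}

def nearest(centroids, x):
    """Return the key whose centroid is closest to feature vector x."""
    return min(centroids,
               key=lambda k: np.linalg.norm(np.asarray(centroids[k]) - x))

def cascade_predict(x):
    x = np.asarray(x, dtype=float)
    # Stage 1: each family centroid is the mean of its instruments' centroids.
    family_centroids = {
        fam: np.mean(list(instrs.values()), axis=0)
        for fam, instrs in schema.items()
    }
    family = nearest(family_centroids, x)
    # Stage 2: restrict the instrument decision to the predicted family.
    instrument = nearest(schema[family], x)
    return family, instrument

print(cascade_predict([9.2, 2.7]))  # → ('brass', 'tuba')
```

The flat alternative would run `nearest` once over all four instrument centroids; the cascade instead prunes the candidate set after the family decision, which is where the reported accuracy gain comes from when the schema groups acoustically similar instruments.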

    Audio Features Affected by Music Expressiveness

    From a Music Information Retrieval perspective, the goal of the study presented here is to investigate the impact of the musician's affective intention on sound features, namely when trying to intentionally convey emotional content via expressiveness. A preliminary experiment has been performed involving 1010 tuba players. The recordings have been analysed by extracting a variety of features, which have subsequently been evaluated by combining both classic and machine-learning statistical techniques. Results are reported and discussed.
    Comment: Submitted to ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), Pisa, Italy, July 17-21, 2016

    It's not what you play, it's how you play it: timbre affects perception of emotion in music.

    Salient sensory experiences often have a strong emotional tone, but the neuropsychological relations between perceptual characteristics of sensory objects and the affective information they convey remain poorly defined. Here we addressed the relationship between sound identity and emotional information using music. In two experiments, we investigated whether perception of emotions is influenced by altering the musical instrument on which the music is played, independently of other musical features. In the first experiment, 40 novel melodies, each representing one of four emotions (happiness, sadness, fear, or anger), were each recorded on four different instruments (an electronic synthesizer, a piano, a violin, and a trumpet), controlling for melody, tempo, and loudness between instruments. Healthy participants (23 young adults aged 18-30 years, 24 older adults aged 58-75 years) were asked to select which emotion they thought each musical stimulus represented in a four-alternative forced-choice task. Using a generalized linear mixed model, we found a significant interaction between instrument and emotion judgement, with a similar pattern in young and older adults (p < .0001 for each age group). The effect was not attributable to musical expertise. In the second experiment, using the same melodies and experimental design, the interaction between timbre and perceived emotion was replicated (p < .05) in another group of young adults for novel synthetic timbres designed to incorporate timbral cues to particular emotions. Our findings show that timbre (instrument identity) independently affects the perception of emotions in music after controlling for other acoustic, cognitive, and performance factors.

    Toward an ecological conception of timbre

    This paper is part of a series developed over the last six months and specifically investigates the notion of timbre through the ecological perspective proposed by James Gibson in his theory of direct perception. We first discuss the traditional approach to timbre, mainly as developed in acoustics and psychoacoustics. We then propose a new conception of timbre grounded in the concepts of the ecological approach. The ecological approach to perception proposed by Gibson (1966, 1979) presupposes a level of analysis of perceptual stimuli that includes, but is considerably broader than, the usual physical description. Gibson suggests focusing on the relationship between the perceiver and the environment. At the core of this approach is the notion of affordances: invariant combinations of properties at the ecological level, taken with reference to the anatomy and action systems of a species or individual, and also with reference to its biological and social needs. Objects and events are understood as related to a perceiving organism by means of structured information, thus affording possibilities of action by the organism. Event perception aims at identifying properties of events that specify changes in the environment relevant to the organism. The perception of form is understood as a special instance of event perception, in which the identity of an object depends on the nature of the events in which it is involved and on what remains invariant over time. From this perspective, perception is not in any sense created by the brain but is part of the world where information can be found. Consequently, the ecological approach represents a form of direct realism that opposes the indirect realism underlying the predominant approaches to perception borrowed from psychoacoustics and computational modelling.

    Enhancing timbre model using MFCC and its time derivatives for music similarity estimation

    One popular method for content-based music similarity estimation is to model timbre by fitting a single multivariate Gaussian with full covariance matrix to a track's MFCCs, then compare tracks with the symmetric Kullback-Leibler divergence. Borrowing from the field of speech recognition, we propose to apply the same approach to the MFCCs' time derivatives to enhance the timbre model. Gaussian models for the delta and acceleration coefficients are used to create their respective distance matrices, which are then combined linearly to form a full distance matrix for music similarity estimation. In our experiments on two datasets, this approach performs better than using MFCCs alone. Moreover, genre classification using k-NN showed that the accuracies obtained are already close to the state of the art.
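The pipeline in this abstract can be sketched end to end: fit one full-covariance Gaussian per track, compare tracks with the closed-form symmetric KL divergence between Gaussians, and linearly combine the static and delta distance matrices. The random stand-in "MFCC" frames and the 0.7/0.3 weights below are illustrative assumptions, not the paper's data or tuned weights:

```python
import numpy as np

def fit_gaussian(frames):
    """frames: (n_frames, n_coeffs) MFCC matrix -> (mean, covariance)."""
    return frames.mean(axis=0), np.cov(frames, rowvar=False)

def sym_kl(p, q):
    """Symmetric KL divergence between two multivariate Gaussians (mu, Sigma)."""
    (mu_p, cov_p), (mu_q, cov_q) = p, q
    d = mu_p.size
    def kl(mu0, c0, mu1, c1):
        c1_inv = np.linalg.inv(c1)
        diff = mu1 - mu0
        return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff - d
                      + np.log(np.linalg.det(c1) / np.linalg.det(c0)))
    return kl(mu_p, cov_p, mu_q, cov_q) + kl(mu_q, cov_q, mu_p, cov_p)

def delta(frames):
    """First-order time derivative of the coefficient trajectories."""
    return np.diff(frames, axis=0)

# Three synthetic "tracks": the first two have nearby means, the third is far.
rng = np.random.default_rng(1)
tracks = [rng.normal(loc, 1.0, size=(200, 5)) for loc in (0.0, 0.1, 2.0)]

def distance_matrix(feature_fn):
    models = [fit_gaussian(feature_fn(t)) for t in tracks]
    n = len(models)
    return np.array([[sym_kl(models[i], models[j]) for j in range(n)]
                     for i in range(n)])

# Linear combination of static and delta distances (weights are assumptions).
D = 0.7 * distance_matrix(lambda t: t) + 0.3 * distance_matrix(delta)
print(np.round(D, 2))
```

With real audio, the frames would come from an MFCC extractor (e.g. librosa's `mfcc` and `delta` functions) rather than random draws, and an acceleration term (second derivative) would be added as a third weighted matrix in the same way.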