
    Decision Manifolds: Classification Inspired by Self-Organization

    We present a classifier algorithm that approximates the decision surface of labeled data by a patchwork of separating hyperplanes. The hyperplanes are arranged in a way inspired by how Self-Organizing Maps are trained. We take advantage of the fact that the boundaries can often be approximated by linear ones connected by a low-dimensional nonlinear manifold. The resulting classifier allows for a voting scheme that averages over the classification results of neighboring hyperplanes. Our algorithm is computationally efficient both in terms of training and classification. Further, we present a model selection framework for estimation of the parameters of the classification boundary, and show results for artificial and real-world data sets.
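The voting idea described in the abstract can be illustrated with a minimal sketch. The function below is a hypothetical illustration, not the authors' implementation: each local hyperplane is represented by a center point, a normal vector, and a bias, and a query point is classified by averaging the signed outputs of the k hyperplanes whose centers lie nearest to it.

```python
import numpy as np

def vote_classify(x, centers, normals, biases, k=3):
    """Hypothetical sketch: classify x by averaging the signed outputs
    of the k locally fitted hyperplanes whose centers are nearest to x."""
    d = np.linalg.norm(centers - x, axis=1)       # distance to each hyperplane's center
    idx = np.argsort(d)[:k]                       # k nearest local models
    votes = np.sign(normals[idx] @ x + biases[idx])
    return 1 if votes.sum() >= 0 else -1          # averaged (majority) vote
```

Averaging over neighboring hyperplanes smooths the patchwork boundary, so a single badly fitted local model is outvoted by its neighbors.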

    A cartesian ensemble of feature subspace classifiers for music categorization

    We present a Cartesian ensemble classification system that is based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. The framework is built on the Weka machine learning toolkit and is able to combine arbitrary feature sets and learning schemes. In our scenario, we use it for the ensemble classification of multiple feature sets from the audio and symbolic domains. We present an extensive set of experiments in the context of music genre classification, based on numerous Music IR benchmark datasets, and evaluate a set of combination/voting rules. The results show that the approach is superior to the best choice of a single algorithm on a single feature set. Moreover, it also releases the user from making this choice explicitly.

    International Society for Music Information Retrieval
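The late-fusion scheme described above can be sketched in a few lines. This is a hypothetical illustration of the general idea, not the paper's Weka-based framework: a base model is trained for every (feature set, learning scheme) pair, and their label predictions are fused by majority vote.

```python
from collections import Counter
from itertools import product

def cartesian_ensemble_predict(feature_sets, classifiers, x_by_featureset):
    """Hypothetical sketch: late fusion over the Cartesian product of
    feature sets and learning schemes via a simple majority vote."""
    votes = []
    for fs, clf in product(feature_sets, classifiers):
        votes.append(clf(x_by_featureset[fs]))    # each base model votes a label
    return Counter(votes).most_common(1)[0][0]    # majority label wins
```

Because fusion happens at the label level, any feature representation and any learner can be plugged in without changing the combination step.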

    Music Information Technology and Professional Stakeholder Audiences: Mind the Adoption Gap

    The academic discipline focusing on the processing and organization of digital music information, commonly known as Music Information Retrieval (MIR), has multidisciplinary roots and interests. Thus, MIR technologies have the potential to have impact across disciplinary boundaries and to enhance the handling of music information in many different user communities. However, in practice, many MIR research agenda items appear to have a hard time leaving the lab to be widely adopted by their intended audiences. On the one hand, this is because the MIR field is still relatively young, and its technologies therefore need to mature. On the other hand, there may be deeper, more fundamental challenges with regard to the user audience. In this contribution, we discuss MIR technology adoption issues that were experienced with professional music stakeholders in audio mixing, performance, musicology and the sales industry. Many of these stakeholders have mindsets and priorities that differ considerably from those of most MIR academics, influencing their reception of new MIR technology. We describe the major observed differences and their backgrounds, and argue that taking these into account is essential for truly successful cross-disciplinary collaboration and technology adoption in MIR.

    2005, ‘MIREX 2005: Combined Fluctuation Features for Music Genre Classification’. Extended Abstract. MIREX genre classification contest (www.music-ir.org/evaluation/mirex-results)

    CLASSIFICATION
    We submitted a system that uses combinations of three feature sets (Rhythm Patterns, Statistical Spectrum Descriptor and Rhythm Histogram) to the MIREX 2005 audio genre classification task. All feature sets are based on fluctuations of modulation amplitudes in psychoacoustically transformed spectrum data. For classification we applied Support Vector Machines. Our best approach achieved 75.27 % combined overall classification accuracy, which is rank 5.

    1 IMPLEMENTATION
    1.1 Feature Extraction
    We extract three feature sets from audio data, using algorithms implemented in MATLAB. The algorithms process audio tracks in standard digital PCM format with 44.1 kHz or 22.05 kHz sampling frequency. Audio compressed with, e.g., the MP3 format is decoded by an external program in a pre-processing step. Audio with multiple channels is merged to mono. Prior to feature extraction, each audio track is segmented into pieces of 6 seconds length. The first and the last segment are skipped in order to exclude lead-in and fade-out effects. In the MIREX setting, only every third segment is processed. For each feature set, the characteristics of an entire piece of music are computed by averaging the feature vectors from the segments (using median or mean). For a more detailed description of the feature sets and the combination approach see (Lidy and Rauber, 2005).

    1.1.1 Rhythm Patterns
    A short-time Fast Fourier Transform (STFT) using a Hann window function (23 ms windows with 50 % overlap) is applied to retrieve the spectrum data from the audio. The frequency bands of the spectrogram are summed up to 24 so-called critical bands according to the Bark scale (Zwicker and Fastl, 1999), with narrow bands in low frequency regions and broader bands in high frequency regions, following the human auditory system. Subsequently, the data is transformed into the logarithmic decibel scale, then into the Phon scale by applying the psychoacoustically motivated equal-loudness curves (Zwicker and Fastl, 1999), and finally into the unit Sone, reflecting the specific loudness sensation.
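The front end of this pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' MATLAB code: the Bark band edges are the standard published values, and the dB-to-Phon step is left as a placeholder identity (the real pipeline applies frequency-dependent equal-loudness curves), while the Phon-to-Sone step uses the common 2^((P-40)/10) relation above 40 phon.

```python
import numpy as np

# Standard Bark critical-band edges in Hz (Zwicker scale), giving 24 bands.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300,
                 6400, 7700, 9500, 12000, 15500]

def bark_spectrogram(signal, sr=22050):
    """Hypothetical sketch: STFT (Hann, 23 ms, 50% overlap) -> 24 Bark
    bands -> dB -> (placeholder) Phon -> Sone."""
    win = int(0.023 * sr)                          # 23 ms Hann window
    hop = win // 2                                 # 50% overlap
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win] * np.hanning(win)
        frames.append(np.abs(np.fft.rfft(seg)) ** 2)
    spec = np.array(frames).T                      # (freq_bins, time)
    freqs = np.fft.rfftfreq(win, 1 / sr)
    bands = []
    for lo, hi in zip(BARK_EDGES_HZ[:-1], BARK_EDGES_HZ[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(spec[mask].sum(axis=0))       # sum bins per critical band
    bark = np.array(bands)                         # (24, time)
    db = 10 * np.log10(np.maximum(bark, 1e-12))    # logarithmic decibel scale
    phon = db      # placeholder: the real pipeline applies equal-loudness curves here
    sone = np.where(phon >= 40,
                    2 ** ((phon - 40) / 10),       # doubling per 10 phon above 40
                    (np.maximum(phon, 0) / 40) ** 2.642)
    return sone
```

The Rhythm Patterns features would then be derived from the modulation amplitudes of these Sone-scaled band envelopes over time.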