
    A quick search method for audio signals based on a piecewise linear representation of feature trajectories

    This paper presents a new method for quick similarity-based search through long unlabeled audio streams to detect and locate audio clips provided by users. The method involves feature-dimension reduction based on a piecewise linear representation of a sequential feature trajectory extracted from a long audio stream. Two techniques yield the piecewise linear representation: dynamic segmentation of feature trajectories and the segment-based Karhunen-Loève (KL) transform. In principle, the proposed search method guarantees the same search results as a search without the feature-dimension reduction. Experimental results indicate significant improvements in search speed. For example, the proposed method reduced the total search time to approximately 1/12 that of previous methods and detected queries in approximately 0.3 seconds from a 200-hour audio database. Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and Language Processing

    FRIDA: FRI-Based DOA Estimation for Arbitrary Array Layouts

    In this paper we present FRIDA, an algorithm for estimating directions of arrival of multiple wideband sound sources. FRIDA combines multi-band information coherently and achieves state-of-the-art resolution at extremely low signal-to-noise ratios. It works for arbitrary array layouts, but unlike the various steered response power and subspace methods, it does not require a grid search. FRIDA leverages recent advances in sampling signals with a finite rate of innovation. It is based on the insight that for any array layout, the entries of the spatial covariance matrix can be linearly transformed into a uniformly sampled sum of sinusoids. Comment: Submitted to ICASSP201

    Classification of music genres using sparse representations in overcomplete dictionaries

    This paper presents a simple but efficient and robust method for music genre classification that uses sparse representations in overcomplete dictionaries. The training step involves creating dictionaries, using the K-SVD algorithm, in which data corresponding to a particular music genre has a sparse representation. In the classification step, the Orthogonal Matching Pursuit (OMP) algorithm is used to separate feature vectors that consist only of Linear Predictive Coding (LPC) coefficients. The paper analyses in detail a popular case study from the literature, the ISMIR 2004 database. Using the presented method, the correct classification rate for the 6 music genres is 85.59%, a result comparable with the best results published so far.

    Multiple-F0 estimation of piano sounds exploiting spectral structure and temporal evolution

    This paper proposes a system for multiple fundamental frequency estimation of piano sounds using pitch candidate selection rules which employ spectral structure and temporal evolution. As a time-frequency representation, the Resonator Time-Frequency Image of the input signal is employed, a noise suppression model is used, and a spectral whitening procedure is performed. In addition, a spectral flux-based onset detector is employed in order to select the steady-state region of the produced sound. In the multiple-F0 estimation stage, tuning and inharmonicity parameters are extracted and a pitch salience function is proposed. Pitch presence tests are performed utilizing information from the spectral structure of pitch candidates, aiming to suppress errors occurring at multiples and sub-multiples of the true pitches. A novel feature for the estimation of harmonically related pitches is proposed, based on the common amplitude modulation assumption. Experiments are performed on the MAPS database using 8784 piano samples of classical, jazz, and random chords with polyphony levels between 1 and 6. The proposed system is computationally inexpensive, being able to perform multiple-F0 estimation experiments in real time. Experimental results indicate that the proposed system outperforms state-of-the-art approaches for the aforementioned task in a statistically significant manner. Index Terms: multiple-F0 estimation, resonator time-frequency image, common amplitude modulation

    Onset Event Decoding Exploiting the Rhythmic Structure of Polyphonic Music

    (c)2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Journal of Selected Topics in Signal Processing 5(6): 1228-1239, Oct 2011. DOI:10.1109/JSTSP.2011.214622