    Identifying Cover Songs Using Information-Theoretic Measures of Similarity

    This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/

    This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection. We consider an information-theoretic approach, where we compute pairwise measures of predictability between time series. We compare discrete-valued approaches, which operate on quantized audio features, to continuous-valued approaches. In the discrete case, we propose a method for computing the normalized compression distance that accounts for correlation between time series. In the continuous case, we propose to compute information-based measures of similarity as statistics of the prediction error between time series. We evaluate our methods on two cover song identification tasks, one using a dataset comprising 300 jazz standards and one using the Million Song Dataset. For both datasets, we observe that continuous-valued approaches outperform discrete-valued approaches. We consider approaches to estimating the normalized compression distance (NCD) based on string compression and prediction, and observe that our proposed normalized compression distance with alignment (NCDA) improves average performance over NCD for sequential compression algorithms. Finally, we demonstrate that continuous-valued distances may be combined to improve performance with respect to baseline approaches. Using a large-scale filter-and-refine approach, we demonstrate state-of-the-art performance for cover song identification using the Million Song Dataset.

    The work of P. Foster was supported by an Engineering and Physical Sciences Research Council Doctoral Training Account studentship.
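
As a rough illustration of the discrete case described above, the following sketch estimates the plain normalized compression distance (NCD) by concatenation, using zlib as a stand-in compressor. It is not the paper's NCDA variant, which additionally accounts for alignment between the series. The compressor choice, the function names, and the synthetic "quantized feature" sequences are all assumptions made for this example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a stand-in for C(.))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Plain NCD estimate: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical quantized feature sequences, one symbol (byte) per frame.
rng = random.Random(0)
track = bytes(rng.randrange(8) for _ in range(2000))
# A "cover": the same sequence with roughly 5% of symbols redrawn at random.
cover = bytes(s if rng.random() > 0.05 else rng.randrange(8) for s in track)
unrelated = bytes(rng.randrange(8) for _ in range(2000))

print("track vs cover:    ", ncd(track, cover))
print("track vs unrelated:", ncd(track, unrelated))
```

Under these assumptions the cover pair should receive a smaller distance than the unrelated pair, because the compressor can reuse long matches from the first sequence when encoding the second.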

    Buffered Reset Leads to Improved Compression in Fuzzy ARTMAP Classification of Radar Range Profiles

    Fuzzy ARTMAP has to date been applied to a variety of automatic target recognition tasks, including radar range profile classification. In simulations of this task, it has demonstrated significant compression compared to k-nearest-neighbor classifiers. During supervised learning, match tracking search allocates memory based on the degree of similarity between newly encountered and previously encountered inputs, regardless of their prior predictive success. Here we investigate techniques that buffer reset based on a category's previous predictive success and thereby substantially improve the compression achieved with minimal loss of accuracy.

    Office of Naval Research (N00014-95-1-0657, N00014-95-1-0409, N00014-96-1-0659)

    A fast compression-based similarity measure with applications to content-based image retrieval

    Compression-based similarity measures are effectively employed in applications on diverse data types with a basically parameter-free approach. Nevertheless, applying these techniques to medium-to-large datasets raises problems that have seldom been addressed. This paper proposes a similarity measure based on compression with dictionaries, the Fast Compression Distance (FCD), which reduces the complexity of these methods without degrading performance. On this basis a content-based color image retrieval system is defined, which can be compared to state-of-the-art methods based on invariant color features. Through the FCD a better understanding of compression-based techniques is achieved, by performing experiments on datasets larger than those analyzed so far in the literature.
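
The abstract does not state how the FCD is computed. As a hedged sketch of the general idea of comparing compression dictionaries rather than compressing concatenations, the example below extracts the patterns an LZW-style pass would learn from each input and measures the fraction of one dictionary not shared with the other. The formula, the function names, and the use of LZW are assumptions for illustration, not a statement of the paper's definition.

```python
def lzw_dictionary(data: bytes) -> set[bytes]:
    """Collect the multi-byte patterns an LZW pass would add to its dictionary."""
    patterns: set[bytes] = set()
    current = b""
    for value in data:
        candidate = current + bytes([value])
        # Single bytes are implicitly present in the initial LZW dictionary.
        if len(candidate) == 1 or candidate in patterns:
            current = candidate
        else:
            patterns.add(candidate)
            current = bytes([value])
    return patterns


def dictionary_distance(x: bytes, y: bytes) -> float:
    """Assumed form: fraction of x's dictionary patterns absent from y's."""
    dx, dy = lzw_dictionary(x), lzw_dictionary(y)
    if not dx:
        return 0.0
    return (len(dx) - len(dx & dy)) / len(dx)


print(dictionary_distance(b"abababababab", b"abababababab"))
print(dictionary_distance(b"abababababab", b"cdcdcdcdcdcd"))
```

Because each dictionary is built once per object and then reused, pairwise comparison reduces to a set intersection, which is the kind of saving a dictionary-based measure aims for on larger datasets.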

    Advances in Similarity-Based Audio Compression

    Existing lossy audio compression techniques such as MP3, WMA and Ogg Vorbis achieve compression ratios that substantially reduce the data size relative to the original sampled audio. These techniques employ psychoacoustic models and traditional statistical coding techniques to achieve data reduction. However, these methods do not take into account the perceived content of the audio, which is often particularly relevant in musical audio. In this paper, we present our research and development work completed to date in producing a system for audio analysis that will consider and exploit the repetitive nature of audio and the similarities that frequently occur in audio recordings. We demonstrate the feasibility and scope of the analysis system and consider the techniques employed, and the challenges involved, in achieving data reduction.

    A Codebook Generation Algorithm for Document Image Compression

    Pattern-matching-based document-compression systems (e.g., for faxing) rely on finding a small set of patterns that can be used to represent all of the ink in the document. Finding an optimal set of patterns is NP-hard; previous compression schemes have resorted to heuristics. This paper describes an extension of the cross-entropy approach, used previously for measuring pattern similarity, to this problem. This approach reduces the problem to a k-medians problem, for which the paper gives a new algorithm with a provably good performance guarantee. In comparison to previous heuristics (First Fit, with and without generalized Lloyd's/k-means postprocessing steps), the new algorithm generates a better codebook, resulting in an overall improvement in compression performance of almost 17%.

    Sequential Complexity as a Descriptor for Musical Similarity

    We propose string compressibility as a descriptor of temporal structure in audio, for the purpose of determining musical similarity. Our descriptors are based on computing track-wise compression rates of quantised audio features, using multiple temporal resolutions and quantisation granularities. To verify that our descriptors capture musically relevant information, we incorporate our descriptors into similarity rating prediction and song year prediction tasks. We base our evaluation on a dataset of 15500 track excerpts of Western popular music, for which we obtain 7800 web-sourced pairwise similarity ratings. To assess the agreement among similarity ratings, we perform an evaluation under controlled conditions, obtaining a rank correlation of 0.33 between intersected sets of ratings. Combined with bag-of-features descriptors, we obtain performance gains of 31.1% and 10.9% for similarity rating prediction and song year prediction. For both tasks, analysis of selected descriptors reveals that representing features at multiple time scales benefits prediction accuracy.

    Comment: 13 pages, 9 figures, 8 tables. Accepted versio
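
The descriptor idea above (compression rates of quantised features at several temporal resolutions and quantisation granularities) can be sketched as follows. This is a minimal illustration under stated assumptions: a single scalar feature per frame, averaging as the downsampling step, uniform quantisation, and zlib as the compressor. None of these choices, nor the function names and example data, are taken from the paper.

```python
import random
import zlib


def downsample(values, factor):
    """Average consecutive frames in windows of `factor` (assumed resolution change)."""
    chunks = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def quantise(values, levels):
    """Uniformly quantise to `levels` symbols (at most 256), one byte per frame."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return bytes(min(levels - 1, int((v - lo) / span * levels)) for v in values)


def compression_rate(symbols: bytes) -> float:
    """Compressed length divided by original length; lower means more regular."""
    return len(zlib.compress(symbols, 9)) / len(symbols)


def complexity_descriptor(values, factors=(1, 2, 4), levels=(4, 8, 16)):
    """One compression rate per (downsampling factor, quantisation level) pair."""
    return {
        (f, q): compression_rate(quantise(downsample(values, f), q))
        for f in factors
        for q in levels
    }


# Hypothetical feature series: a strictly periodic one and a noisy one.
rng = random.Random(0)
periodic = [float(i % 16) for i in range(4096)]
noisy = [rng.random() for _ in range(4096)]

print(complexity_descriptor(periodic)[(1, 8)])
print(complexity_descriptor(noisy)[(1, 8)])
```

The resulting dictionary can be flattened into a fixed-length vector per track; in this toy setting the periodic series should yield a much lower rate than the noisy one.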