344,066 research outputs found

    DeepPermNet: Visual Permutation Learning

    Full text link
    We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Unfortunately, permutation matrices are discrete, thereby posing difficulties for gradient-based methods. To this end, we resort to a continuous approximation of these matrices using doubly-stochastic matrices which we generate from standard CNN predictions using Sinkhorn iterations. Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on two challenging computer vision problems, namely, (i) relative attributes learning and (ii) self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for (i) and on the classification and segmentation tasks on the PASCAL VOC dataset for (ii).Comment: Accepted in IEEE International Conference on Computer Vision and Pattern Recognition CVPR 201

    SuPP & MaPP: Adaptable Structure-Based Representations For Mir Tasks

    Get PDF
    Accurate and flexible representations of music data are paramount to addressing MIR tasks, yet many of the existing approaches are difficult to interpret or rigid in nature. This work introduces two new song representations for structure-based retrieval methods: Surface Pattern Preservation (SuPP), a continuous song representation, and Matrix Pattern Preservation (MaPP), SuPP’s discrete counterpart. These representations come equipped with several user-defined parameters so that they are adaptable for a range of MIR tasks. Experimental results show MaPP as successful in addressing the cover song task on a set of Mazurka scores, with a mean precision of 0.965 and recall of 0.776. SuPP and MaPP also show promise in other MIR applications, such as novel-segment detection and genre classification, the latter of which demonstrates their suitability as inputs for machine learning problems

    Sparse representations in multi-kernel dictionaries for in-situ classification of underwater objects

    Get PDF
    2017 Spring.Includes bibliographical references.The performance of the kernel-based pattern classification algorithms depends highly on the selection of the kernel function and its parameters. Consequently in the recent years there has been a growing interest in machine learning algorithms to select kernel functions automatically from a predefined dictionary of kernels. In this work we develop a general mathematical framework for multi-kernel classification that makes use of sparse representation theory for automatically selecting the kernel functions and their parameters that best represent a set of training samples. We construct a dictionary of different kernel functions with different parametrizations. Using a sparse approximation algorithm, we represent the ideal score of each training sample as a sparse linear combination of the kernel functions in the dictionary evaluated at all training samples. Moreover, we incorporate the high-level operator's concepts into the learning by using the in-situ learning for the new unseen samples whose scores can not be represented suitably using the previously selected representative samples. Finally, we evaluate the viability of this method for in-situ classification of a database of underwater object images. Results are presented in terms of ROC curve, confusion matrix and correct classification rate measures

    Learning parametric dictionaries for graph signals

    Get PDF
    In sparse signal representation, the choice of a dictionary often involves a tradeoff between two desirable properties -- the ability to adapt to specific signal data and a fast implementation of the dictionary. To sparsely represent signals residing on weighted graphs, an additional design challenge is to incorporate the intrinsic geometric structure of the irregular data domain into the atoms of the dictionary. In this work, we propose a parametric dictionary learning algorithm to design data-adapted, structured dictionaries that sparsely represent graph signals. In particular, we model graph signals as combinations of overlapping local patterns. We impose the constraint that each dictionary is a concatenation of subdictionaries, with each subdictionary being a polynomial of the graph Laplacian matrix, representing a single pattern translated to different areas of the graph. The learning algorithm adapts the patterns to a training set of graph signals. Experimental results on both synthetic and real datasets demonstrate that the dictionaries learned by the proposed algorithm are competitive with and often better than unstructured dictionaries learned by state-of-the-art numerical learning algorithms in terms of sparse approximation of graph signals. In contrast to the unstructured dictionaries, however, the dictionaries learned by the proposed algorithm feature localized atoms and can be implemented in a computationally efficient manner in signal processing tasks such as compression, denoising, and classification
    • …
    corecore