344,066 research outputs found
DeepPermNet: Visual Permutation Learning
We present a principled approach to uncover the structure of visual data by
solving a novel deep learning task coined visual permutation learning. The goal
of this task is to find the permutation that recovers the structure of data
from shuffled versions of it. In the case of natural images, this task boils
down to recovering the original image from patches shuffled by an unknown
permutation matrix. Unfortunately, permutation matrices are discrete, thereby
posing difficulties for gradient-based methods. To this end, we resort to a
continuous approximation of these matrices using doubly-stochastic matrices
which we generate from standard CNN predictions using Sinkhorn iterations.
Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet,
an end-to-end CNN model for this task. The utility of DeepPermNet is
demonstrated on two challenging computer vision problems, namely, (i) relative
attributes learning and (ii) self-supervised representation learning. Our
results show state-of-the-art performance on the Public Figures and OSR
benchmarks for (i) and on the classification and segmentation tasks on the
PASCAL VOC dataset for (ii).Comment: Accepted in IEEE International Conference on Computer Vision and
Pattern Recognition CVPR 201
SuPP & MaPP: Adaptable Structure-Based Representations For Mir Tasks
Accurate and flexible representations of music data are paramount to addressing MIR tasks, yet many of the existing approaches are difficult to interpret or rigid in nature. This work introduces two new song representations for structure-based retrieval methods: Surface Pattern Preservation (SuPP), a continuous song representation, and Matrix Pattern Preservation (MaPP), SuPP’s discrete counterpart. These representations come equipped with several user-defined parameters so that they are adaptable for a range of MIR tasks. Experimental results show MaPP as successful in addressing the cover song task on a set of Mazurka scores, with a mean precision of 0.965 and recall of 0.776. SuPP and MaPP also show promise in other MIR applications, such as novel-segment detection and genre classification, the latter of which demonstrates their suitability as inputs for machine learning problems
Sparse representations in multi-kernel dictionaries for in-situ classification of underwater objects
2017 Spring.Includes bibliographical references.The performance of the kernel-based pattern classification algorithms depends highly on the selection of the kernel function and its parameters. Consequently in the recent years there has been a growing interest in machine learning algorithms to select kernel functions automatically from a predefined dictionary of kernels. In this work we develop a general mathematical framework for multi-kernel classification that makes use of sparse representation theory for automatically selecting the kernel functions and their parameters that best represent a set of training samples. We construct a dictionary of different kernel functions with different parametrizations. Using a sparse approximation algorithm, we represent the ideal score of each training sample as a sparse linear combination of the kernel functions in the dictionary evaluated at all training samples. Moreover, we incorporate the high-level operator's concepts into the learning by using the in-situ learning for the new unseen samples whose scores can not be represented suitably using the previously selected representative samples. Finally, we evaluate the viability of this method for in-situ classification of a database of underwater object images. Results are presented in terms of ROC curve, confusion matrix and correct classification rate measures
Learning parametric dictionaries for graph signals
In sparse signal representation, the choice of a dictionary often involves a
tradeoff between two desirable properties -- the ability to adapt to specific
signal data and a fast implementation of the dictionary. To sparsely represent
signals residing on weighted graphs, an additional design challenge is to
incorporate the intrinsic geometric structure of the irregular data domain into
the atoms of the dictionary. In this work, we propose a parametric dictionary
learning algorithm to design data-adapted, structured dictionaries that
sparsely represent graph signals. In particular, we model graph signals as
combinations of overlapping local patterns. We impose the constraint that each
dictionary is a concatenation of subdictionaries, with each subdictionary being
a polynomial of the graph Laplacian matrix, representing a single pattern
translated to different areas of the graph. The learning algorithm adapts the
patterns to a training set of graph signals. Experimental results on both
synthetic and real datasets demonstrate that the dictionaries learned by the
proposed algorithm are competitive with and often better than unstructured
dictionaries learned by state-of-the-art numerical learning algorithms in terms
of sparse approximation of graph signals. In contrast to the unstructured
dictionaries, however, the dictionaries learned by the proposed algorithm
feature localized atoms and can be implemented in a computationally efficient
manner in signal processing tasks such as compression, denoising, and
classification
- …