Deformable Prototypes for Encoding Shape Categories in Image Databases
We describe a method for shape-based image database search that uses deformable prototypes to represent categories. Rather than directly comparing a candidate shape with all shape entries in the database, shapes are compared in terms of the types of nonrigid deformations (differences) that relate them to a small subset of representative prototypes. To solve the shape correspondence and alignment problem, we employ the technique of modal matching, an information-preserving shape decomposition for matching, describing, and comparing shapes despite sensor variations and nonrigid deformations. In modal matching, shape is decomposed into an ordered basis of orthogonal principal components. We demonstrate the utility of this approach for shape comparison in 2-D image databases. Office of Naval Research (Young Investigator Award N00014-06-1-0661
Unsupervised Domain Adaptation with Similarity Learning
The objective of unsupervised domain adaptation is to leverage features from
a labeled source domain and learn a classifier for an unlabeled target domain,
with a similar but different data distribution. Most deep learning approaches
to domain adaptation consist of two steps: (i) learn features that preserve a
low risk on labeled samples (source domain) and (ii) make the features from
both domains as indistinguishable as possible, so that a classifier
trained on the source can also be applied on the target domain. In general, the
classifiers in step (i) consist of fully-connected layers applied directly on
the indistinguishable features learned in (ii). In this paper, we propose a
different way to do the classification, using similarity learning. The proposed
method learns a pairwise similarity function in which classification can be
performed by computing similarity between prototype representations of each
category. The domain-invariant features and the categorical prototype
representations are learned jointly and in an end-to-end fashion. At inference
time, images from the target domain are compared to the prototypes and the
label associated with the one that best matches the image is output. The
approach is simple, scalable and effective. We show that our model achieves
state-of-the-art performance in different unsupervised domain adaptation
scenarios.
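The inference step described in this abstract (compare a target image's features to one prototype per category and return the best match) can be sketched as follows. This is a minimal illustration, not the authors' model: cosine similarity stands in for the learned similarity function, and the function name `classify_by_prototype` and the 2-D embeddings are hypothetical.

```python
import numpy as np

def classify_by_prototype(features, prototypes):
    """Return, for each feature vector, the index of the most similar prototype.

    Assumption: cosine similarity replaces the pairwise similarity function
    that the paper learns jointly with the features.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    similarity = f @ p.T  # shape: (n_samples, n_categories)
    return similarity.argmax(axis=1)

# Hypothetical embeddings: one prototype per category, two target samples.
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])
target_features = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = classify_by_prototype(target_features, prototypes)
```

Here `labels` holds one category index per target sample; in the described method both the embeddings and the prototypes would come from the trained network.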
A Distributed and Approximated Nearest Neighbors Algorithm for an Efficient Large Scale Mean Shift Clustering
In this paper we target the class of modal clustering methods where clusters
are defined in terms of the local modes of the probability density function
which generates the data. The most well-known modal clustering method is the
k-means clustering. Mean Shift clustering is a generalization of the k-means
clustering which computes arbitrarily shaped clusters, defined as the basins
of attraction to the local modes created by the density gradient ascent paths.
Despite its potential, the Mean Shift approach is a computationally expensive
method for unsupervised learning. Thus, we introduce two contributions aiming
to provide clustering algorithms with a linear time complexity, as opposed to
the quadratic time complexity for the exact Mean Shift clustering. First, we
propose a scalable procedure to approximate the density gradient ascent.
Second, we present a scalable cluster labeling technique. Both
propositions are based on Locality Sensitive Hashing (LSH) to approximate
nearest neighbors. These two techniques may be used for moderate-sized
datasets. Furthermore, we show that using our proposed approximations of the
density gradient ascent as a pre-processing step in other clustering methods
can also improve dedicated classification metrics. For the latter, a
distributed implementation, written for the Spark/Scala ecosystem, is proposed.
For all these considered clustering methods, we present experimental results
illustrating their labeling accuracy and their potential to solve concrete
problems. Comment: Algorithms are available at
https://github.com/Clustering4Ever/Clustering4Eve
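For orientation, the exact (quadratic-time) Mean Shift that this abstract takes as its starting point can be sketched as below. This is not the LSH-based approximation the paper proposes: every point is compared with every other point, a Gaussian kernel is assumed, and the names `mean_shift`, `label_modes`, `bandwidth` and `tol` are illustrative.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=50):
    """Move a copy of every point uphill on a Gaussian kernel density estimate."""
    modes = points.astype(float).copy()
    for _ in range(n_iter):
        # Pairwise squared distances to all data points: the O(n^2) step.
        d2 = ((modes[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        # Each position becomes the kernel-weighted mean of the data.
        modes = (w @ points) / w.sum(axis=1, keepdims=True)
    return modes

def label_modes(modes, tol):
    """Give the same label to points whose final positions lie within tol."""
    labels = -np.ones(len(modes), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < tol:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return labels

# Two well-separated toy groups of three points each.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
modes = mean_shift(points, bandwidth=0.5)
labels = label_modes(modes, tol=0.1)
```

The pairwise distance computation is the part that an approximate nearest-neighbor search such as LSH would replace to avoid the quadratic cost.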
Spatio-temporal Video Parsing for Abnormality Detection
Abnormality detection in video poses particular challenges due to the
infinite size of the class of all irregular objects and behaviors. Thus no (or
far too few) abnormal training samples are available and we need to find
abnormalities in test data without actually knowing what they are.
Nevertheless, the prevailing concept of the field is to directly search for
individual abnormal local patches or image regions independently of one another. To
address this problem, we propose a method for joint detection of abnormalities
in videos by spatio-temporal video parsing. The goal of video parsing is to
find a set of indispensable normal spatio-temporal object hypotheses that
jointly explain all the foreground of a video, while, at the same time, being
supported by normal training samples. Consequently, we avoid a direct detection
of abnormalities and discover them indirectly as those hypotheses which are
needed for covering the foreground without finding an explanation for
themselves by normal samples. Abnormalities are localized by MAP inference in a
graphical model and we solve it efficiently by formulating it as a convex
optimization problem. We experimentally evaluate our approach on several
challenging benchmark sets, improving over the state-of-the-art on all standard
benchmarks both in terms of abnormality classification and localization. Comment: 15 pages, 12 figures, 3 tables
Residual-Sparse Fuzzy C-Means Clustering Incorporating Morphological Reconstruction and Wavelet frames
Instead of directly utilizing an observed image including some outliers,
noise or intensity inhomogeneity, the use of its ideal value (e.g. noise-free
image) has a favorable impact on clustering. Hence, the accurate estimation of
the residual (e.g. unknown noise) between the observed image and its ideal
value is an important task. To do so, we propose an $\ell_0$
regularization-based Fuzzy C-Means (FCM) algorithm incorporating a
morphological reconstruction operation and a tight wavelet frame transform. To
achieve a sound trade-off between detail preservation and noise suppression,
morphological reconstruction is used to filter an observed image. By combining
the observed and filtered images, a weighted sum image is generated. Since a
tight wavelet frame system has sparse representations of an image, it is
employed to decompose the weighted sum image, thus forming its corresponding
feature set. Taking it as data for clustering, we present an improved FCM
algorithm by imposing an $\ell_0$ regularization term on the residual between
the feature set and its ideal value, which implies that the favorable
estimation of the residual is obtained and the ideal value participates in
clustering. Spatial information is also introduced into clustering since it is
naturally encountered in image segmentation. Furthermore, it makes the
estimation of the residual more reliable. To further enhance the segmentation
effects of the improved FCM algorithm, we also employ the morphological
reconstruction to smoothen the labels generated by clustering. Finally, based
on the prototypes and smoothed labels, the segmented image is reconstructed by
using a tight wavelet frame reconstruction operation. Experimental results
reported for synthetic, medical, and color images show that the proposed
algorithm is effective and efficient, and outperforms other algorithms. Comment: 12 pages, 11 figures
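As background for this abstract, the standard FCM iteration that the proposed algorithm builds on alternates between updating prototypes and updating fuzzy memberships. The sketch below shows only that baseline: the residual regularization, morphological reconstruction, and wavelet frame steps are not implemented, and the function name, the fuzzifier `m = 2`, and the toy intensities are assumptions.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, seed=0):
    """Standard FCM: returns memberships (n_samples, n_clusters) and prototypes."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    u = rng.random((len(data), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Prototype update: membership-weighted mean of the data.
        prototypes = (um.T @ data) / um.sum(axis=0)[:, None]
        # Membership update from distances to the prototypes.
        dist = np.linalg.norm(data[:, None, :] - prototypes[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u, prototypes

# Hypothetical 1-D pixel intensities forming a dark and a bright group.
data = np.array([[0.0], [0.1], [0.9], [1.0]])
u, prototypes = fuzzy_c_means(data, n_clusters=2)
hard_labels = u.argmax(axis=1)
```

Taking the largest membership per sample, as in `hard_labels`, gives the label map that a segmentation pipeline would then post-process.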
Hypothesis-based image segmentation for object learning and recognition
Denecke A. Hypothesis-based image segmentation for object learning and recognition. Bielefeld: Universität Bielefeld; 2010. This thesis addresses the figure-ground segmentation problem in the context of complex systems for automatic object recognition as well as for the online and interactive acquisition of visual representations. First, the problem of image segmentation is introduced in general terms, followed by its importance for object learning in current state-of-the-art systems. Second, a method using artificial neural networks is presented. This approach, based on Generalized Learning Vector Quantization, is investigated in challenging scenarios such as the real-time figure-ground segmentation of complex-shaped objects under continuously changing environmental conditions. The ability to fulfill these requirements characterizes the novelty of the approach compared to state-of-the-art methods.
Finally, our technique is extended towards online adaptation of model complexity and the integration of several segmentation cues. This yields a framework for object segmentation that can be used to improve current systems for visual object learning and recognition.