Scaling Analysis of Affinity Propagation
We analyze and exploit some scaling properties of the Affinity Propagation
(AP) clustering algorithm proposed by Frey and Dueck (2007). First we observe
that a divide and conquer strategy, used on a large data set hierarchically,
reduces the complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N^{(h+2)/(h+1)})$, for a
data-set of size $N$ and a depth $h$ of the hierarchical strategy. For a
data-set embedded in a $d$-dimensional space, we show that this is obtained
without notably damaging the precision except in dimension $d=2$. In fact, for
$d$ larger than 2 the relative loss in precision scales like
$N^{(2-d)/(h+1)d}$. Finally, under some conditions we observe that there is a
value $s^*$ of the penalty coefficient, a free parameter used to fix the number
of clusters, which separates a fragmentation phase (for $s<s^*$) from a
coalescent one (for $s>s^*$) of the underlying hidden cluster structure. At
this precise point a self-similarity property holds, which can be exploited by
the hierarchical strategy to actually locate its position. From this
observation, a strategy based on AP can be defined to find out how many
clusters are present in a given dataset.
Comment: 28 pages, 14 figures, Inria research report
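As a small worked illustration of the scaling claim, the sketch below tabulates the complexity exponent for a few hierarchy depths. It assumes the exponent $(h+2)/(h+1)$ reported for this paper, with $h = 0$ standing for plain AP; the function names are hypothetical and the cost figures ignore constant factors.

```python
from fractions import Fraction


def hierarchical_ap_exponent(h: int) -> Fraction:
    """Exponent a in O(N**a) for a hierarchy of depth h (assumed formula).

    h = 0 corresponds to plain AP, which gives the quadratic exponent 2.
    """
    if h < 0:
        raise ValueError("depth h must be non-negative")
    return Fraction(h + 2, h + 1)


def estimated_cost(n: int, h: int) -> float:
    """Order-of-magnitude cost N**a, ignoring constant factors."""
    return float(n) ** float(hierarchical_ap_exponent(h))


# Tabulate the exponent and a rough cost for one million points.
for depth in range(4):
    a = hierarchical_ap_exponent(depth)
    print(f"h={depth}: exponent {a}, cost for N=10**6 ~ {estimated_cost(10**6, depth):.3g}")
```

The exponent falls from 2 toward 1 as the depth grows, which is the sense in which the hierarchical strategy reduces the quadratic cost.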
Multiclass Semi-Supervised Learning on Graphs using Ginzburg-Landau Functional Minimization
We present a graph-based variational algorithm for classification of
high-dimensional data, generalizing the binary diffuse interface model to the
case of multiple classes. Motivated by total variation techniques, the method
involves minimizing an energy functional made up of three terms. The first two
terms promote a stepwise continuous classification function with sharp
transitions between classes, while preserving symmetry among the class labels.
The third term is a data fidelity term, allowing us to incorporate prior
information into the model in a semi-supervised framework. The performance of
the algorithm on synthetic data, as well as on the COIL and MNIST benchmark
datasets, is competitive with state-of-the-art graph-based multiclass
segmentation methods.
Comment: 16 pages, to appear in Springer's Lecture Notes in Computer Science
volume "Pattern Recognition Applications and Methods 2013", part of series on
Advances in Intelligent and Soft Computing
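For orientation, the binary diffuse interface model that this abstract generalizes is commonly written as a three-term graph Ginzburg-Landau energy. The form below is a sketch in assumed notation, none of which is taken from the abstract: $u$ is the classification function on the graph nodes, $L$ a graph Laplacian, $\epsilon$ the interface width, $\hat{u}$ the known labels, and $\mu_i$ a fidelity weight that is nonzero only on labeled nodes. The multiclass functional of the paper itself is not reproduced here.

```latex
E(u) \;=\; \frac{\epsilon}{2}\,\langle u, L u\rangle
      \;+\; \frac{1}{4\epsilon}\sum_{i}\bigl(u_i^{2}-1\bigr)^{2}
      \;+\; \sum_{i}\frac{\mu_i}{2}\,\bigl(u_i-\hat{u}_i\bigr)^{2}
```

In this binary form the first term penalizes variation across strongly connected nodes, the double-well term pushes each $u_i$ toward $\pm 1$, and the last term is the data fidelity.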
Learning Behavioural Context
The original publication is available at www.springerlink.co
Uncertainty quantification in graph-based classification of high dimensional data
Classification of high dimensional data finds wide-ranging applications. In
many of these applications equipping the resulting classification with a
measure of uncertainty may be as important as the classification itself. In
this paper we introduce, develop algorithms for, and investigate the properties
of, a variety of Bayesian models for the task of binary classification; via the
posterior distribution on the classification labels, these methods
automatically give measures of uncertainty. The methods are all based around
the graph formulation of semi-supervised learning.
We provide a unified framework which brings together a variety of methods
which have been introduced in different communities within the mathematical
sciences. We study probit classification in the graph-based setting, generalize
the level-set method for Bayesian inverse problems to the classification
setting, and generalize the Ginzburg-Landau optimization-based classifier to a
Bayesian setting; we also show that the probit and level set approaches are
natural relaxations of the harmonic function approach introduced in [Zhu et al
2003].
We introduce efficient numerical methods, suited to large data-sets, for both
MCMC-based sampling and gradient-based MAP estimation. Through numerical
experiments we study classification accuracy and uncertainty quantification for
our models; these experiments showcase a suite of datasets commonly used to
evaluate graph-based semi-supervised learning algorithms.
Comment: 33 pages, 14 figures
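The harmonic function approach mentioned in this abstract can be illustrated with a minimal sketch: each unlabeled node repeatedly takes the weighted average of its neighbours while labeled nodes stay fixed. This shows the generic harmonic solution only, not the paper's probit, level-set, or Bayesian methods; the function name, the dictionary-based graph format, and the Gauss-Seidel iteration are assumptions made for illustration.

```python
def harmonic_labels(weights, labels, sweeps=2000, tol=1e-12):
    """Harmonic label scores on a weighted graph (illustrative sketch).

    weights: dict node -> dict neighbour -> positive weight (symmetric).
    labels:  dict node -> 0.0 or 1.0 for the labeled nodes.
    Assumes every unlabeled node has at least one neighbour.
    """
    # Start unlabeled nodes at 0.5 and pin labeled nodes to their labels.
    f = {node: labels.get(node, 0.5) for node in weights}
    for _ in range(sweeps):
        largest_change = 0.0
        for node, nbrs in weights.items():
            if node in labels:
                continue  # labeled nodes are boundary conditions
            total = sum(nbrs.values())
            new = sum(w * f[j] for j, w in nbrs.items()) / total
            largest_change = max(largest_change, abs(new - f[node]))
            f[node] = new
        if largest_change < tol:
            break
    return f


# Path graph 0-1-2-3-4 with the two end nodes labeled.
path = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
scores = harmonic_labels(path, {0: 0.0, 4: 1.0})
print(scores)
```

On this path graph the scores interpolate linearly between the two labeled ends. Thresholding them at 0.5 gives a hard classification without any measure of uncertainty, which is what the Bayesian models in the abstract are meant to add.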
Action Recognition with a Bio-Inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions
Here we show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves the action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates: it is a feedforward model restricted to V1-MT cortical layers, cortical cells cover the visual space with a foveated structure, and more importantly, we reproduce some of the richness of center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors, but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Defining motion maps as our feature vectors, we used a standard classification method on the Weizmann database: we obtained an average recognition rate of 98.9%, which is superior to the recent results by Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.
Music genre profiling based on Fisher manifolds and Probabilistic Quantum Clustering
Probabilistic classifiers induce a similarity metric at each location in the space of the data. This is measured by the Fisher Information Matrix. Pairwise distances in this Riemannian space, calculated along geodesic paths, can be used to generate a similarity map of the data. The novelty in the paper is twofold: to improve the methodology for visualisation of data structures in low-dimensional manifolds, and to illustrate the value of inferring the structure from a probabilistic classifier by metric learning, through application to music data. This leads to the discovery of new structures and song similarities beyond the original genre classification labels. These similarities are not directly observable by measuring Euclidean distances between features of the original space, but require the correct metric to reflect similarity based on genre. The results quantify the extent to which music from bands typically associated with one particular genre can, in fact, cross over strongly to another genre.
Minimal-cut model composition
Constructing new, complex models is often done by reusing parts of existing models, typically by applying a sequence of segmentation, alignment and composition operations. Segmentation, either manual or automatic, is rarely adequate for this task, since it is applied to each model independently, leaving it to the user to trim the models and determine where to connect them. In this paper we propose a new composition tool. Our tool obtains as input two models, aligned either manually or automatically, and a small set of constraints indicating which portions of the two models should be preserved in the final output. It then automatically negotiates the best location to connect the models, trimming and stitching them as required to produce a seamless result. We offer a method based on the graph theoretic minimal cut as a means of implementing this new tool. We describe a system intended for both expert and novice users, allowing easy and flexible control over the composition result. In addition, we show our method to be well suited for a variety of model processing applications such as model repair, hole filling, and piecewise rigid deformations.
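The graph theoretic minimal cut named in this abstract can be sketched on a toy graph. The code below is a generic Edmonds-Karp max-flow/min-cut routine, not the authors' system: the region names, the edge weights (read here as a hypothetical cost of placing the seam between two adjacent regions), and the function name are all assumptions for illustration.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Return (cut value, source-side node set) via Edmonds-Karp.

    capacity: dict u -> dict v -> capacity of the directed edge u -> v.
    An undirected edge is given as two directed edges.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    # Residual graph: forward capacities plus zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    def reachable_parents():
        # Breadth-first search along edges with remaining capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    flow = 0
    while True:
        parents = reachable_parents()
        if sink not in parents:
            # No augmenting path left: reachable nodes form the source side.
            return flow, set(parents)
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# "A" and "B" stand for regions constrained to stay with each model.
seam_costs = {
    "A": {"p": 5, "r": 2},
    "p": {"A": 5, "q": 1},
    "q": {"p": 1, "B": 4},
    "r": {"A": 2, "B": 3},
    "B": {"q": 4, "r": 3},
}
value, side_a = min_cut(seam_costs, "A", "B")
print(value, sorted(side_a))
```

Here the cheapest seam severs the edges p-q and A-r, so region p stays with "A" while q and r go with "B". How the paper actually builds its graph and weights from the two models is not stated in the abstract.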