Bayesian Inference of Log Determinants
The log-determinant of a kernel matrix appears in a variety of machine
learning problems, ranging from determinantal point processes and generalized
Markov random fields, through to the training of Gaussian processes. Exact
calculation of this term is often intractable when the size of the kernel
matrix exceeds a few thousand. In the spirit of probabilistic numerics, we
reinterpret the problem of computing the log-determinant as a Bayesian
inference problem. In particular, we combine prior knowledge in the form of
bounds from matrix theory and evidence derived from stochastic trace estimation
to obtain probabilistic estimates for the log-determinant and its associated
uncertainty within a given computational budget. Beyond its novelty and
theoretic appeal, the performance of our proposal is competitive with
state-of-the-art approaches to approximating the log-determinant, while also
quantifying the uncertainty due to budget-constrained evidence.
Comment: 12 pages, 3 figures
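
To make the stochastic trace estimation ingredient concrete, here is a minimal sketch that is not the Bayesian procedure the abstract proposes. It assumes a symmetric positive definite matrix, bounds the largest eigenvalue with a Gershgorin row sum (one example of a bound from matrix theory), and estimates the traces in a truncated Taylor series of the logarithm with Rademacher probe vectors. The function name and its parameters are hypothetical.

```python
import numpy as np

def logdet_taylor_hutchinson(K, num_probes=200, num_terms=80, seed=0):
    """Estimate log det(K) for a symmetric positive definite K.

    Uses log det(K) = n*log(alpha) - sum_k tr(A^k)/k with A = I - K/alpha,
    where each trace is estimated with random Rademacher probes.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    # Gershgorin bound: no eigenvalue exceeds the largest absolute row sum.
    alpha = np.abs(K).sum(axis=1).max()
    Z = rng.choice([-1.0, 1.0], size=(n, num_probes))
    V = Z.copy()
    series = 0.0
    for k in range(1, num_terms + 1):
        V = V - (K @ V) / alpha                 # V now holds A^k Z
        trace_k = np.sum(Z * V) / num_probes    # Monte Carlo estimate of tr(A^k)
        series += trace_k / k
    return n * np.log(alpha) - series
```

The number of probes and series terms together act as the computational budget: fewer of either makes the estimate cheaper but less accurate, which is the uncertainty the abstract sets out to quantify.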
Gaussian Process Morphable Models
Statistical shape models (SSMs) represent a class of shapes as a normal
distribution of point variations, whose parameters are estimated from example
shapes. Principal component analysis (PCA) is applied to obtain a
low-dimensional representation of the shape variation in terms of the leading
principal components. In this paper, we propose a generalization of SSMs,
called Gaussian Process Morphable Models (GPMMs). We model the shape variations
with a Gaussian process, which we represent using the leading components of its
Karhunen-Loeve expansion. To compute the expansion, we make use of an
approximation scheme based on the Nystrom method. The resulting model can be
seen as a continuous analogue of an SSM. However, while for SSMs the shape
variation is restricted to the span of the example data, with GPMMs we can
define the shape variation using any Gaussian process. For example, we can
build shape models that correspond to classical spline models, and thus do not
require any example data. Furthermore, Gaussian processes make it possible to
combine different models. For example, an SSM can be extended with a spline
model, to obtain a model that incorporates learned shape characteristics, but
is flexible enough to explain shapes that cannot be represented by the SSM. We
introduce a simple algorithm for fitting a GPMM to a surface or image. This
results in a non-rigid registration approach, whose regularization properties
are defined by a GPMM. We show how we can obtain different registration
schemes, including methods for multi-scale, spatially-varying or hybrid
registration, by constructing an appropriate GPMM. As our approach strictly
separates modelling from the fitting process, this is all achieved without
changes to the fitting algorithm. We show the applicability and versatility of
GPMMs on a clinical use case, where the goal is the model-based segmentation of
3D forearm images.
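
The low-rank representation described above can be sketched on a one-dimensional grid. This is an illustrative assumption-laden example, not the GPMM implementation: it uses a squared-exponential kernel and a direct eigendecomposition of the kernel matrix in place of the Nystrom-based scheme the abstract mentions, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(x, y, length_scale=0.3):
    """Squared-exponential kernel between two 1D point sets."""
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def kl_basis(points, kernel, rank):
    """Leading eigenpairs of the kernel matrix (discrete Karhunen-Loeve basis)."""
    K = kernel(points, points)
    eigvals, eigvecs = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1][:rank]
    # Clip tiny negative eigenvalues caused by round-off.
    return np.clip(eigvals[order], 0.0, None), eigvecs[:, order]

def sample_deformation(eigvals, eigvecs, rng):
    """Draw one deformation: sum_i alpha_i * sqrt(lambda_i) * phi_i, alpha_i ~ N(0, 1)."""
    alpha = rng.standard_normal(len(eigvals))
    return eigvecs @ (np.sqrt(eigvals) * alpha)

points = np.linspace(0.0, 1.0, 30)
eigvals, eigvecs = kl_basis(points, rbf_kernel, rank=10)
deformation = sample_deformation(eigvals, eigvecs, np.random.default_rng(0))
```

Because the model is fully described by the eigenpairs, swapping in a different kernel changes the shape prior without touching the sampling or fitting code, which mirrors the separation of modelling from fitting that the abstract emphasizes.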
Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck operators
This paper presents a diffusion based probabilistic interpretation of
spectral clustering and dimensionality reduction algorithms that use the
eigenvectors of the normalized graph Laplacian. Given the pairwise adjacency
matrix of all points, we define a diffusion distance between any two data
points and show that the low dimensional representation of the data by the
first few eigenvectors of the corresponding Markov matrix is optimal under a
certain mean squared error criterion. Furthermore, assuming that data points
are random samples from a density p(x) = e^{-U(x)}, we identify these
eigenvectors as discrete approximations of eigenfunctions of a Fokker-Planck
operator in a potential 2U(x) with reflecting boundary conditions. Finally,
applying known results regarding the eigenvalues and eigenfunctions of the
continuous Fokker-Planck operator, we provide a mathematical justification for
the success of spectral clustering and dimensionality reduction algorithms based
on these first few eigenvectors. This analysis elucidates, in terms of the
characteristics of diffusion processes, many empirical findings regarding
spectral clustering algorithms.
Comment: submitted to NIPS 200
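
The construction described above can be sketched as follows. This is a minimal example under stated assumptions: a Gaussian kernel with a hand-picked width `epsilon` defines the pairwise affinities, and the function name and parameters are hypothetical rather than taken from the paper.

```python
import numpy as np

def diffusion_map(X, epsilon=2.0, num_coords=2, t=1):
    """Return eigenvalues of the Markov matrix and diffusion-map coordinates."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / epsilon)          # pairwise affinities
    degrees = W.sum(axis=1)
    # S = D^{-1/2} W D^{-1/2} is symmetric and shares eigenvalues
    # with the Markov matrix M = D^{-1} W.
    S = W / np.sqrt(np.outer(degrees, degrees))
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Right eigenvectors of M are D^{-1/2} times the eigenvectors of S.
    psi = eigvecs / np.sqrt(degrees)[:, None]
    # Skip the constant eigenvector (eigenvalue 1); scale the rest by lambda^t.
    coords = psi[:, 1:num_coords + 1] * eigvals[1:num_coords + 1] ** t
    return eigvals, coords
```

Euclidean distance between rows of `coords` approximates the diffusion distance at time `t`, and for two well-separated groups of points the sign of the first coordinate already splits them, which is the spectral clustering connection the abstract refers to.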
On landmark selection and sampling in high-dimensional data analysis
In recent years, the spectral analysis of appropriately defined kernel
matrices has emerged as a principled way to extract the low-dimensional
structure often prevalent in high-dimensional data. Here we provide an
introduction to spectral methods for linear and nonlinear dimension reduction,
emphasizing ways to overcome the computational limitations currently faced by
practitioners with massive datasets. In particular, a data subsampling or
landmark selection process is often employed to construct a kernel based on
partial information, followed by an approximate spectral analysis termed the
Nystrom extension. We provide a quantitative framework to analyse this
procedure, and use it to demonstrate algorithmic performance bounds on a range
of practical approaches designed to optimize the landmark selection process. We
compare the practical implications of these bounds by way of real-world
examples drawn from the field of computer vision, whereby low-dimensional
manifold structure is shown to emerge from high-dimensional video data streams.
Comment: 18 pages, 6 figures, submitted for publication
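
A minimal sketch of the landmark-based approximation follows, assuming the simplest selection rule (uniform random sampling without replacement) rather than any of the optimized schemes the paper analyses. The function name, the `kernel` callable and the `rcond` cutoff are hypothetical choices for illustration.

```python
import numpy as np

def nystrom_approximation(X, kernel, num_landmarks, seed=0, rcond=1e-10):
    """Approximate the full kernel matrix from a random subset of landmark points."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=num_landmarks, replace=False)
    C = kernel(X, X[idx])   # n x m block: every point against the landmarks
    W = C[idx]              # m x m block: landmarks against landmarks
    # Nystrom extension: K is approximated by C W^+ C^T.
    return C @ np.linalg.pinv(W, rcond=rcond) @ C.T

# Usage: a rank-5 linear kernel is recovered from 10 landmarks.
rng = np.random.default_rng(0)
features = rng.standard_normal((60, 5))
linear_kernel = lambda A, B: A @ B.T
K_approx = nystrom_approximation(features, linear_kernel, num_landmarks=10)
```

Only the n x m block `C` is ever evaluated, so the cost scales with the number of landmarks rather than with the full n x n kernel; how well the result matches the full matrix depends on which landmarks are chosen, which is the question the performance bounds address.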
Parsimonious Mahalanobis Kernel for the Classification of High Dimensional Data
The classification of high dimensional data with kernel methods is considered
in this article. Exploiting the emptiness property of high dimensional
spaces, a kernel based on the Mahalanobis distance is proposed. The computation
of the Mahalanobis distance requires the inversion of a covariance matrix. In
high dimensional spaces, the estimated covariance matrix is ill-conditioned and
its inversion is unstable or impossible. Using a parsimonious statistical
model, namely the High Dimensional Discriminant Analysis model, the specific
signal and noise subspaces are estimated for each considered class making the
inverse of the class specific covariance matrix explicit and stable, leading to
the definition of a parsimonious Mahalanobis kernel. An SVM-based framework is
used for selecting the hyperparameters of the parsimonious Mahalanobis kernel
by optimizing the so-called radius-margin bound. Experimental results on three
high dimensional data sets show that the proposed kernel is suitable for
classifying high dimensional data, providing better classification accuracies
than the conventional Gaussian kernel.
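
The idea of an explicit, stable inverse can be sketched as below. This is a simplified illustration, not the paper's exact parametrization: it assumes the covariance has `signal_dim` leading eigenvalues and a single shared noise variance for the remaining directions, and it uses one global scale `gamma` instead of the per-direction hyperparameters a full model would tune. All names are hypothetical.

```python
import numpy as np

def parsimonious_precision(cov, signal_dim):
    """Inverse covariance under a signal-plus-isotropic-noise model."""
    p = cov.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    top_vals = eigvals[-signal_dim:]
    Q = eigvecs[:, -signal_dim:]                 # basis of the signal subspace
    # Noise variance: average of the eigenvalues outside the signal subspace.
    noise = (np.trace(cov) - top_vals.sum()) / (p - signal_dim)
    signal_part = (Q / top_vals) @ Q.T           # Q diag(1/lambda) Q^T
    noise_part = (np.eye(p) - Q @ Q.T) / noise
    return signal_part + noise_part

def mahalanobis_kernel(X, Y, precision, gamma=1.0):
    """k(x, y) = exp(-gamma * (x - y)^T P (x - y)) for a precision matrix P."""
    diff = X[:, None, :] - Y[None, :, :]
    d2 = np.einsum('ijk,kl,ijl->ij', diff, precision, diff)
    return np.exp(-gamma * d2)
```

No small eigenvalue is ever inverted on its own, since the noise directions share one averaged variance; that is what keeps the inverse stable when the sample covariance is ill-conditioned.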