On the optimality of shape and data representation in the spectral domain
A proof of the optimality of the eigenfunctions of the Laplace-Beltrami
operator (LBO) in representing smooth functions on surfaces is provided and
adapted to the field of applied shape and data analysis. It is based on the
Courant-Fischer min-max principle adapted to our case. The theorem we present
supports the new trend in geometry processing of treating geometric structures
by using their projection onto the leading eigenfunctions of the decomposition
of the LBO. This result can be used to construct numerically
efficient algorithms that process shapes in the spectral domain. We review a
couple of applications as practical use cases of the proposed optimality
criteria. We refer to a scale-invariant metric, which is also invariant to
bending of the manifold. This novel pseudo-metric allows constructing an LBO by
which a scale-invariant eigenspace on the surface is defined. We demonstrate
the efficiency of an intermediate metric, defined as an interpolation between
the scale-invariant metric and the regular one, in representing geometric structures
while capturing both coarse and fine details. Next, we review a numerical
acceleration technique for classical scaling, a member of a family of
flattening methods known as multidimensional scaling (MDS). There, the
optimality is exploited to efficiently approximate all geodesic distances
between pairs of points on a given surface, and thereby match and compare
almost isometric surfaces. Finally, we revisit the classical principal
component analysis (PCA) definition by coupling its variational form with a
Dirichlet energy on the data manifold. By pairing PCA with the LBO we can
handle cases that go beyond the scope of the observation set handled by
regular PCA.
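As a rough illustration of how the optimality result is used in practice, the
sketch below projects a per-vertex function onto the leading eigenfunctions of
the LBO. It assumes a stiffness matrix L and a (lumped) mass matrix M already
computed by, e.g., a cotangent-weight discretisation; all names are
illustrative, not the paper's code.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def spectral_projection(L, M, f, k=50):
    """Band-limited representation of f in the leading LBO eigenbasis."""
    # Generalized eigenproblem L phi = lambda M phi; shift-invert about 0
    # returns the k smallest eigenvalues, i.e. the smoothest eigenfunctions.
    vals, phi = eigsh(L, k=k, M=M, sigma=0, which="LM")
    coeffs = phi.T @ (M @ f)   # M-orthogonal projection coefficients
    f_hat = phi @ coeffs       # optimal k-term approximation of f
    return vals, coeffs, f_hat
```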
Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies
In motion analysis and understanding it is important to be able to fit a
suitable model or structure to the temporal series of observed data, in order
to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is
available, such a structure has to be learned from the data in a bottom-up
fashion. Recently, volumetric approaches, in which the motion is captured
from a number of cameras and a voxel-set representation of the body is built
from the camera views, have gained ground thanks to attractive features such as
inherent view-invariance and robustness to occlusions. Automatic, unsupervised
segmentation of moving bodies along entire sequences, in a temporally-coherent
and robust way, has the potential to provide a means of constructing a
bottom-up model of the moving body, and track motion cues that may be later
exploited for motion classification. Spectral methods such as locally linear
embedding (LLE) can be useful in this context, as they preserve the
"protrusions" of articulated shapes, i.e., high-curvature regions of the 3D
volume, while improving their separation in a lower-dimensional space, making
them easier to cluster. In this paper we therefore propose a spectral approach
to unsupervised and temporally-coherent body-protrusion segmentation along time
sequences. Volumetric shapes are clustered in an embedding space, clusters are
propagated in time to ensure coherence, and merged or split to accommodate
changes in the body's topology. Experiments on both synthetic and real
sequences of dense voxel-set data are shown. The results support the ability of the
proposed method to cluster body-parts consistently over time in a totally
unsupervised fashion, its robustness to sampling density and shape quality, and
its potential for bottom-up model construction.
Comment: 31 pages, 26 figures
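To make the role of the embedding concrete, here is a minimal single-frame
sketch: LLE spreads the protrusions apart, after which a plain clusterer can
separate them. The temporal propagation and merge/split handling described in
the abstract are not shown, and all names and parameter values are
illustrative assumptions.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import KMeans

def segment_voxels(voxels, n_parts=5, n_neighbors=10, dim=3):
    """Embed an (N, 3) voxel-coordinate array with LLE, then cluster.

    In the embedding, protrusions (high-curvature regions such as limbs)
    are better separated, so k-means can isolate candidate body parts.
    """
    emb = LocallyLinearEmbedding(n_neighbors=n_neighbors,
                                 n_components=dim).fit_transform(voxels)
    labels = KMeans(n_clusters=n_parts, n_init=10).fit_predict(emb)
    return emb, labels
```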
On the Sample Complexity of Subspace Learning
A large number of algorithms in machine learning, from principal component
analysis (PCA), and its non-linear (kernel) extensions, to more recent spectral
embedding and support estimation methods, rely on estimating a linear subspace
from samples. In this paper we introduce a general formulation of this problem
and derive novel learning error estimates. Our results rely on natural
assumptions on the spectral properties of the covariance operator associated to
the data distribution, and hold for a wide class of metrics between
subspaces. As special cases, we discuss sharp error estimates for the
reconstruction properties of PCA and spectral support estimation. Key to our
analysis is an operator theoretic approach that has broad applicability to
spectral learning methods.
Comment: Extended version of a conference paper
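For intuition about the quantity being bounded, the following sketch (our
own, under assumed inputs) estimates the top-k eigenspace from n samples and
measures its distance to a reference projection P_true; in practice P_true
comes from the unknown true covariance, which is exactly why learning error
estimates are needed.

```python
import numpy as np

def subspace_error(X, k, P_true):
    """Distance between the projection estimated from the rows of X and a
    reference projection P_true, in operator norm."""
    C_hat = X.T @ X / X.shape[0]              # empirical covariance
    _, V = np.linalg.eigh(C_hat)
    V_k = V[:, -k:]                           # top-k empirical eigenvectors
    P_hat = V_k @ V_k.T                       # estimated projection
    return np.linalg.norm(P_hat - P_true, 2)  # one metric between subspaces
```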
Unified Framework for Spectral Dimensionality Reduction, Maximum Variance Unfolding, and Kernel Learning By Semidefinite Programming: Tutorial and Survey
This is a tutorial and survey paper on unification of spectral dimensionality
reduction methods, kernel learning by Semidefinite Programming (SDP), Maximum
Variance Unfolding (MVU) or Semidefinite Embedding (SDE), and its variants. We
first explain how the spectral dimensionality reduction methods can be unified
as kernel Principal Component Analysis (PCA) with different kernels. This
unification can be interpreted as eigenfunction learning or representation of
kernel in terms of the distance matrix. Then, since the spectral methods are
unified as kernel PCA, the natural next step is to learn the best kernel for
unfolding the data manifold to its maximum variance. We first briefly introduce kernel
learning by SDP for the transduction task. Then, we explain MVU in detail.
Various versions of supervised MVU are explained: using the nearest-neighbors
graph, by class-wise unfolding, by the Fisher criterion, and by colored MVU. We also
explain out-of-sample extension of MVU using eigenfunctions and kernel mapping.
Finally, we introduce other variants of MVU including action respecting
embedding, relaxed MVU, and landmark MVU for big data.
Comment: To appear as part of an upcoming textbook on dimensionality reduction
and manifold learning. v2: corrected some typos
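As a concrete anchor for the "learn the best kernel" idea, here is a minimal
MVU sketch using cvxpy (an assumed choice of solver interface; the survey is
solver-agnostic): maximise the variance trace(K) subject to centering,
positive semidefiniteness, and local isometry on the kNN graph, then read the
embedding off the learned kernel exactly as in kernel PCA.

```python
import numpy as np
import cvxpy as cp
from sklearn.neighbors import kneighbors_graph

def mvu(X, n_neighbors=5):
    """Maximum Variance Unfolding on small datasets via a generic SDP solver."""
    n = X.shape[0]
    G = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    K = cp.Variable((n, n), PSD=True)
    constraints = [cp.sum(K) == 0]           # center the embedding
    for i, j in zip(*np.nonzero(G)):
        # Local isometry: ||y_i - y_j||^2 must equal ||x_i - x_j||^2.
        constraints.append(K[i, i] + K[j, j] - 2 * K[i, j] == G[i, j] ** 2)
    cp.Problem(cp.Maximize(cp.trace(K)), constraints).solve()
    vals, vecs = np.linalg.eigh(K.value)     # kernel PCA on the learned K
    return vecs[:, ::-1] * np.sqrt(np.maximum(vals[::-1], 0))
```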
From which world is your graph?
Discovering statistical structure from links is a fundamental problem in the
analysis of social networks. Choosing a misspecified model or, equivalently, an
incorrect inference algorithm can invalidate the analysis or even falsely
uncover patterns that are in fact artifacts of the model. This work
focuses on unifying two of the most widely used link-formation models: the
stochastic blockmodel (SBM) and the small world (or latent space) model (SWM).
Integrating techniques from kernel learning, spectral graph theory, and
nonlinear dimensionality reduction, we develop the first statistically sound
polynomial-time algorithm to discover latent patterns in sparse graphs for both
models. When the network comes from an SBM, the algorithm outputs a block
structure. When it is from an SWM, the algorithm outputs estimates of each
node's latent position.
Comment: To appear in NIPS 201
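The toy sketch below shows only the common first step, not the paper's full
statistically sound algorithm: an adjacency spectral embedding. Under an SBM
the embedded nodes concentrate around k centroids (a block structure), while
under a latent-space model they trace out a continuum of positions. All names
are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def adjacency_spectral_embedding(A, k):
    """Embed nodes via the top-k eigenpairs of the adjacency matrix A,
    then cluster the embedding as one would for an SBM."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(-np.abs(vals))[:k]              # top-k by magnitude
    X = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))    # latent position estimates
    blocks = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    return X, blocks
```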
High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation
The ratio between two probability density functions is an important component
of various tasks, including selection bias correction, novelty detection and
classification. Recently, several estimators of this ratio have been proposed.
Most of these methods fail if the sample space is high-dimensional, and hence
require a dimension reduction step, the result of which can be a significant
loss of information. Here we propose a simple-to-implement, fully nonparametric
density ratio estimator that expands the ratio in terms of the eigenfunctions
of a kernel-based operator; these functions reflect the underlying geometry of
the data (e.g., submanifold structure), often leading to better estimates
without an explicit dimension reduction step. We show how our general framework
can be extended to address another important problem, the estimation of a
likelihood function in situations where that function cannot be
well-approximated by an analytical form. One is often faced with this situation
when performing statistical inference with data from the sciences, due to the
complexity of the data and of the processes that generated those data. We
emphasize applications where using existing likelihood-free methods of
inference would be challenging due to the high dimensionality of the sample
space, but where our spectral series method yields a reasonable estimate of the
likelihood function. We provide theoretical guarantees and illustrate the
effectiveness of our proposed method with numerical experiments.
Comment: With supplementary material
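A minimal sketch in the spirit of the spectral series estimator described
above (our own simplification, not the authors' code): eigenfunctions of a
Gaussian-kernel operator are estimated from the denominator sample with the
Nystrom method, and the expansion coefficients of the ratio are empirical
means of those eigenfunctions over the numerator sample.

```python
import numpy as np
from scipy.spatial.distance import cdist

def spectral_density_ratio(X_num, X_den, bandwidth=1.0, J=10):
    """Estimate f/g from a numerator sample X_num and denominator sample X_den."""
    n = X_den.shape[0]
    gram = np.exp(-cdist(X_den, X_den, "sqeuclidean") / (2 * bandwidth**2))
    vals, vecs = np.linalg.eigh(gram)
    vals, vecs = vals[::-1][:J], vecs[:, ::-1][:, :J]  # top-J eigenpairs

    def psi(x):  # Nystrom estimates of eigenfunctions, orthonormal w.r.t. g
        k = np.exp(-cdist(x, X_den, "sqeuclidean") / (2 * bandwidth**2))
        return np.sqrt(n) * (k @ vecs) / vals

    coeffs = psi(X_num).mean(axis=0)   # b_j = E_f[psi_j]
    return lambda x: psi(x) @ coeffs   # ratio estimate at query points
```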
Diffusion Maps for dimensionality reduction and visualization of meteorological data
The growing interest in big data problems implies the need for unsupervised
methods for data visualization and dimensionality reduction. Diffusion Maps
(DM) is a recent technique that can capture the lower-dimensional geometric
structure underlying the sample patterns in a way that can be made independent
of the sampling distribution. Moreover, DM allows us to define an embedding
whose Euclidean metric relates to the sample's intrinsic one, which in turn
enables a principled application of k-means clustering. In this work we give a
self-contained review of DM and discuss two methods to compute the DM
embedding coordinates for new out-of-sample data. We then apply them to two
meteorological data problems that involve time and spatial compression of
numerical weather forecasts, and show how DM is capable of greatly reducing
the initial dimension while still capturing relevant information in the
original data, and how the sample-derived DM embedding coordinates can be
extended to new patterns. A definitive version was published in Neurocomputing
163 (2015), DOI 10.1016/j.neucom.2014.08.090.
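For reference, a minimal DM sketch along the lines reviewed in the paper
(parameter names and defaults are illustrative; the two out-of-sample
extension methods discussed above are not shown):

```python
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_maps(X, eps=1.0, dim=2, t=1):
    """Embed the rows of X so that Euclidean distance approximates
    diffusion distance, which justifies running k-means afterwards."""
    W = np.exp(-cdist(X, X, "sqeuclidean") / eps)  # Gaussian affinities
    d = W.sum(axis=1)
    # Symmetric conjugate of the Markov matrix P = D^-1 W, so eigh applies.
    A = W / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)
    vals, vecs = vals[::-1], vecs[:, ::-1]         # descending order
    psi = vecs / np.sqrt(d)[:, None]               # right eigenvectors of P
    # Coordinates lambda_j^t * psi_j, skipping the trivial constant pair.
    return (vals[1:dim + 1] ** t) * psi[:, 1:dim + 1]
```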