Multi-view Metric Learning in Vector-valued Kernel Spaces
We consider the problem of metric learning for multi-view data and present a
novel method for learning within-view as well as between-view metrics in
vector-valued kernel spaces, as a way to capture the multi-modal structure of the
data. We formulate two convex optimization problems to jointly learn the metric
and the classifier or regressor in kernel feature spaces. An iterative
three-step multi-view metric learning algorithm is derived from the
optimization problems. In order to scale the computation to large training
sets, a block-wise Nyström approximation of the multi-view kernel matrix is
introduced. We justify our approach theoretically and experimentally, and show
its performance on real-world datasets against relevant state-of-the-art
methods.
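For orientation, the single-matrix Nyström approximation that the block-wise scheme builds on can be sketched as follows; the landmark count m and the matrices C and W are our notation, and the paper applies the idea block-wise to the multi-view kernel matrix rather than to a single kernel matrix.
\[
K \;\approx\; C\,W^{+}\,C^{\top},
\qquad
C \in \mathbb{R}^{n \times m},\quad W \in \mathbb{R}^{m \times m},
\]
where C holds kernel evaluations between the n training points and m landmark points (with m much smaller than n), W is the kernel matrix restricted to the landmarks, and W^{+} denotes its pseudo-inverse. The approximation reduces the cost of forming and storing the kernel matrix from order n^2 to order nm, which is what makes training on large sets feasible.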
Sparse Deterministic Approximation of Bayesian Inverse Problems
We present a parametric deterministic formulation of Bayesian inverse
problems with input parameter from infinite dimensional, separable Banach
spaces. In this formulation, the forward problems are parametric, deterministic
elliptic partial differential equations, and the inverse problem is to
determine the unknown, parametric deterministic coefficients from noisy
observations comprising linear functionals of the solution.
We prove a generalized polynomial chaos representation of the posterior
density with respect to the prior measure, given noisy observational data. We
analyze the sparsity of the posterior density in terms of the summability of
the input data's coefficient sequence. To this end, we estimate the
fluctuations in the prior. We exhibit sufficient conditions on the prior model
in order for approximations of the posterior density to converge at a given
algebraic rate, in terms of the number of unknowns appearing in the
parametric representation of the prior measure. Similar sparsity and
approximation results are also exhibited for the solution and covariance of the
elliptic partial differential equation under the posterior. These results then
form the basis for efficient uncertainty quantification in the presence of noisy data.
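The central object in this formulation is the density of the posterior with respect to the prior. Writing μ0 for the prior measure on the unknown coefficient u, G for the map taking u to the vector of linear observation functionals of the PDE solution, and assuming additive Gaussian observational noise with covariance Γ (this notation is ours, chosen for illustration), Bayes' rule in function space gives
\[
\frac{d\pi^{y}}{d\mu_{0}}(u)\;\propto\;\exp\!\bigl(-\Phi(u;y)\bigr),
\qquad
\Phi(u;y)\;=\;\tfrac{1}{2}\,\bigl\|\Gamma^{-1/2}\bigl(y-\mathcal{G}(u)\bigr)\bigr\|^{2},
\]
and it is this density, viewed as a function of the parametric coordinates of u, whose sparse generalized polynomial chaos expansion is analyzed above.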
TMB: Automatic Differentiation and Laplace Approximation
TMB is an open source R package that enables quick implementation of complex
nonlinear random effect (latent variable) models in a manner similar to the
established AD Model Builder package (ADMB, admb-project.org). In addition, it
offers easy access to parallel computations. The user defines the joint
likelihood for the data and the random effects as a C++ template function,
while all the other operations are done in R; e.g., reading in the data. The
package evaluates and maximizes the Laplace approximation of the marginal
likelihood where the random effects are automatically integrated out. This
approximation, and its derivatives, are obtained using automatic
differentiation (up to order three) of the joint likelihood. The computations
are designed to be fast for problems with many random effects (~10^6) and
parameters (~10^3). Computation times using ADMB and TMB are compared on a
suite of examples ranging from simple models to large spatial models where the
random effects are a Gaussian random field. Speedups ranging from 1.5 to about
100 are obtained with increasing gains for large problems. The package and
examples are available at http://tmb-project.org
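To make the workflow above concrete, here is a minimal, hypothetical joint likelihood in the TMB C++ template form: a Gaussian observation model with one Gaussian random effect per observation. The toy model and all variable names are ours, not one of the paper's examples.

```cpp
// toy_model.cpp -- hypothetical illustration of a TMB joint likelihood
#include <TMB.hpp>

template<class Type>
Type objective_function<Type>::operator() ()
{
  DATA_VECTOR(y);          // observations, length n
  PARAMETER_VECTOR(u);     // random effects, length n (integrated out by the Laplace approximation)
  PARAMETER(mu);           // fixed effect: overall mean
  PARAMETER(log_sd_u);     // log standard deviation of the random effects
  PARAMETER(log_sd_y);     // log standard deviation of the observation noise

  Type nll = 0;            // negative joint log-likelihood of data and random effects
  for (int i = 0; i < y.size(); i++) {
    nll -= dnorm(u(i), Type(0), exp(log_sd_u), true);     // density of random effect i
    nll -= dnorm(y(i), mu + u(i), exp(log_sd_y), true);   // observation i given its random effect
  }
  return nll;
}
```

On the R side one would compile this file with TMB::compile, load it with dyn.load(dynlib("toy_model")), build the objective via MakeADFun(data = list(y = y), parameters = list(u = rep(0, n), mu = 0, log_sd_u = 0, log_sd_y = 0), random = "u"), and hand obj$fn and obj$gr to a gradient-based optimizer such as nlminb; listing u in the random argument is what triggers the Laplace approximation over the random effects.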
A mixed regularization approach for sparse simultaneous approximation of parameterized PDEs
We present and analyze a novel sparse polynomial technique for the
simultaneous approximation of parameterized partial differential equations
(PDEs) with deterministic and stochastic inputs. Our approach treats the
numerical solution as a jointly sparse reconstruction problem through the
reformulation of the standard basis pursuit denoising, where the set of jointly
sparse vectors is infinite. To achieve global reconstruction of sparse
solutions to parameterized elliptic PDEs over both physical and parametric
domains, we combine the standard measurement scheme developed for compressed
sensing in the context of bounded orthonormal systems with a novel mixed-norm
based regularization method that exploits both energy and sparsity. In
addition, we are able to prove that, with minimal sample complexity, error
estimates comparable to the best s-term and quasi-optimal approximations are
achievable, while requiring only a priori bounds on polynomial truncation error
with respect to the energy norm. Finally, we perform extensive numerical
experiments on several high-dimensional parameterized elliptic PDE models to
demonstrate the superior recovery properties of the proposed approach.
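A schematic form of the joint-sparse reconstruction described above, in our own notation (the paper's precise functional and constraint set differ in detail), is the mixed-norm basis pursuit denoising problem
\[
\min_{Z}\;\|Z\|_{2,1}
\quad\text{subject to}\quad
\|A Z - Y\|_{F}\;\le\;\eta,
\qquad
\|Z\|_{2,1}\;=\;\sum_{j}\Bigl(\sum_{k}|Z_{j,k}|^{2}\Bigr)^{1/2},
\]
where each column of Z collects the polynomial coefficients of the solution at one physical degree of freedom, A is the sampling matrix of the bounded orthonormal system evaluated at the parameter samples, and the mixed norm couples the columns so that they share a common sparsity pattern, reflecting the combination of energy and sparsity mentioned above.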
Low Complexity Regularization of Linear Inverse Problems
Inverse problems and regularization theory are central themes in contemporary
signal processing, where the goal is to reconstruct an unknown signal from
partial, indirect, and possibly noisy measurements of it. A now standard method
for recovering the unknown signal is to solve a convex optimization problem
that enforces some prior knowledge about its structure. This has proved
efficient in many problems routinely encountered in imaging sciences,
statistics and machine learning. This chapter delivers a review of recent
advances in the field where the regularization prior promotes solutions
conforming to some notion of simplicity/low-complexity. Popular examples of such
priors include sparsity and group sparsity (to capture the compressibility
of natural signals and images), total variation and analysis sparsity (to
promote piecewise regularity), and low-rank structure (as a natural extension of
sparsity to matrix-valued data). Our aim is to provide a unified treatment of all these
regularizations under a single umbrella, namely the theory of partial
smoothness. This framework is very general and accommodates all low-complexity
regularizers just mentioned, as well as many others. Partial smoothness turns
out to be the canonical way to encode low-dimensional models that can be linear
spaces or more general smooth manifolds. This review is intended to serve as a
one-stop shop for understanding the theoretical properties of the
so-regularized solutions. It covers a large spectrum including: (i) recovery
guarantees and stability to noise, both in terms of ℓ2-stability and
model (manifold) identification; (ii) sensitivity analysis to perturbations of
the parameters involved (in particular the observations), with applications to
unbiased risk estimation; (iii) convergence properties of the forward-backward
proximal splitting scheme, which is particularly well suited to solving the
corresponding large-scale regularized optimization problem.
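For reference, the forward-backward proximal splitting iteration mentioned in (iii), applied to a problem of the form min over x of ½‖y − Φx‖² + λ J(x), with J one of the low-complexity regularizers above (notation ours), reads
\[
x^{(k+1)} \;=\; \operatorname{prox}_{\gamma\lambda J}\!\Bigl(x^{(k)} + \gamma\,\Phi^{*}\bigl(y - \Phi x^{(k)}\bigr)\Bigr),
\qquad
\operatorname{prox}_{\mu J}(z) \;=\; \operatorname*{arg\,min}_{x}\;\tfrac{1}{2}\|x-z\|^{2} + \mu J(x),
\]
with step size 0 < γ < 2/‖Φ‖². For J equal to the ℓ1 norm the proximal map is entry-wise soft-thresholding, which is what makes the scheme attractive at large scale.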
Elastic-Net Regularization in Learning Theory
Within the framework of statistical learning theory we analyze in detail the
so-called elastic-net regularization scheme proposed by Zou and Hastie for the
selection of groups of correlated variables. To investigate the statistical
properties of this scheme, and in particular its consistency, we
set up a suitable mathematical framework. Our setting is random-design
regression where we allow the response variable to be vector-valued and we
consider prediction functions which are linear combinations of elements
(features) in an infinite-dimensional dictionary. Under the assumption that the
regression function admits a sparse representation on the dictionary, we prove
that there exists a particular "elastic-net representation" of the
regression function such that, as the number of data points increases, the elastic-net
estimator is consistent not only for prediction but also for variable/feature
selection. Our results include finite-sample bounds and an adaptive scheme to
select the regularization parameter. Moreover, using convex analysis tools, we
derive an iterative thresholding algorithm for computing the elastic-net
solution which is different from the optimization procedure originally proposed
by Zou and Hastie.
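The flavor of that iterative thresholding algorithm can be sketched as follows; this is a generic damped soft-thresholding iteration for the elastic-net functional, with the step size τ and the constants chosen for illustration, so it should not be read as the paper's exact scheme. For
\[
E(\beta) \;=\; \tfrac{1}{2}\,\|Y - \Phi\beta\|^{2} \;+\; \lambda_{1}\|\beta\|_{1} \;+\; \tfrac{\lambda_{2}}{2}\,\|\beta\|_{2}^{2},
\]
a forward-backward (proximal gradient) step reads
\[
\beta^{(k+1)} \;=\; \frac{1}{1+\tau\lambda_{2}}\;
S_{\tau\lambda_{1}}\!\Bigl(\beta^{(k)} + \tau\,\Phi^{*}\bigl(Y - \Phi\beta^{(k)}\bigr)\Bigr),
\qquad
\bigl(S_{\mu}(z)\bigr)_{j} \;=\; \operatorname{sign}(z_{j})\,\max\bigl(|z_{j}|-\mu,\,0\bigr),
\]
that is, a gradient step on the quadratic data term followed by entry-wise soft-thresholding from the ℓ1 penalty and a uniform shrinkage by 1/(1+τλ2) from the ℓ2 penalty.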
CayleyNets: Graph Convolutional Neural Networks with Complex Rational Spectral Filters
The rise of graph-structured data such as social networks, regulatory
networks, citation graphs, and functional brain networks, in combination with
the resounding success of deep learning in various applications, has spurred
interest in generalizing deep learning models to non-Euclidean domains. In this
paper, we introduce a new spectral domain convolutional architecture for deep
learning on graphs. The core ingredient of our model is a new class of
parametric rational complex functions (Cayley polynomials) that allow spectral
filters on graphs to be computed efficiently and to specialize on frequency
bands of interest. Our model generates rich spectral filters that are localized
in space, scales linearly with the size of the input data for
sparsely-connected graphs, and can handle different constructions of Laplacian
operators. Extensive experimental results show the superior performance of our
approach, in comparison to other spectral domain convolutional architectures,
on spectral image classification, community detection, vertex classification
and matrix completion tasks.
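For concreteness, a Cayley filter of order r evaluated at an eigenvalue λ of the graph Laplacian can be written as follows; the form below follows the usual presentation of Cayley polynomials, and minor notational details may differ from the paper's.
\[
g_{c,h}(\lambda) \;=\; c_{0} \;+\; 2\,\mathrm{Re}\!\left\{\sum_{j=1}^{r} c_{j}\Bigl(\frac{h\lambda - i}{h\lambda + i}\Bigr)^{\!j}\right\},
\]
where c0 is a real coefficient, c1 through cr are complex learnable coefficients, and h > 0 is a learnable spectral zoom that dilates the spectrum so that the filter can concentrate on a narrow frequency band. Applying such a filter to a signal requires only solutions of sparse linear systems involving the shifted Laplacian, which keeps the cost linear in the number of edges for sparsely connected graphs.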