An analytic comparison of regularization methods for Gaussian Processes
Gaussian Processes (GPs) are a popular approach to predict the output of a
parameterized experiment. They have many applications in the field of Computer
Experiments, in particular to perform sensitivity analysis, adaptive design of
experiments and global optimization. Nearly all of the applications of GPs
require the inversion of a covariance matrix that, in practice, is often
ill-conditioned. Regularization methodologies are then employed with
consequences on the GPs that need to be better understood. The two principal
methods to deal with ill-conditioned covariance matrices are i) pseudoinverse
(PI) and ii) adding a positive constant to the diagonal (the so-called nugget
regularization). The first part of this paper provides an algebraic comparison
of PI and nugget regularizations. Redundant points, responsible for covariance
matrix singularity, are defined. It is proven that pseudoinverse
regularization, in contrast to nugget regularization, averages the output values
and makes the variance zero at redundant points. However, pseudoinverse and
nugget regularizations become equivalent as the nugget value vanishes. A
measure for data-model discrepancy is proposed which serves for choosing a
regularization technique. In the second part of the paper, a distribution-wise
GP is introduced that interpolates Gaussian distributions instead of data
points. Distribution-wise GP can be seen as an improved regularization method
for GPs.
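The contrast between the two regularizations can be illustrated with a small numerical sketch. It is not taken from the paper: the squared-exponential kernel, the toy data, the nugget values and the helper names (`sq_exp_kernel`, `gp_predict`) are all assumptions chosen for illustration. Two training inputs coincide, so the covariance matrix is exactly singular.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0):
    """Unit-variance squared-exponential kernel (an assumed choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

# Toy data (hypothetical): x = 1.0 appears twice with different outputs,
# so rows 1 and 2 of K are identical and K is singular.
X = np.array([0.0, 1.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 3.0, 0.5])
K = sq_exp_kernel(X, X)

def gp_predict(x_new, K_inv):
    """Zero-mean GP prediction given a (regularized) inverse of K."""
    k = sq_exp_kernel(x_new, X)
    mean = k @ K_inv @ y
    var = 1.0 - np.einsum("ij,jk,ik->i", k, K_inv, k)
    return mean, var

x_star = np.array([1.0])  # predict at the redundant point

# i) Pseudoinverse regularization (explicit cutoff for tiny singular values)
mean_pi, var_pi = gp_predict(x_star, np.linalg.pinv(K, rcond=1e-10))

# ii) Nugget regularization with a moderate and a nearly vanishing nugget
def nugget_inverse(tau):
    return np.linalg.inv(K + tau * np.eye(len(X)))

mean_big, var_big = gp_predict(x_star, nugget_inverse(1e-1))
mean_small, var_small = gp_predict(x_star, nugget_inverse(1e-8))

print("PI:            mean", mean_pi[0], "var", var_pi[0])
print("nugget 1e-1:   mean", mean_big[0], "var", var_big[0])
print("nugget 1e-8:   mean", mean_small[0], "var", var_small[0])
```

In this toy setting the PI prediction at the repeated input equals the average of the two conflicting outputs with zero variance, the moderate nugget leaves a strictly positive variance there, and the small-nugget prediction approaches the PI one.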
Pinsker estimators for local helioseismology
A major goal of helioseismology is the three-dimensional reconstruction of
the three velocity components of convective flows in the solar interior from
sets of wave travel-time measurements. For small amplitude flows, the forward
problem is described in good approximation by a large system of convolution
equations. The input observations are highly noisy random vectors with a known
dense covariance matrix. This leads to a large statistical linear inverse
problem.
Whereas for deterministic linear inverse problems several computationally
efficient minimax optimal regularization methods exist, only one
minimax-optimal linear estimator exists for statistical linear inverse
problems: the Pinsker estimator. However, it is often computationally
inefficient because it requires a singular value decomposition of the forward
operator or it is not applicable because of an unknown noise covariance matrix,
so it is rarely used for real-world problems. These limitations do not apply in
helioseismology. We present a simplified proof of the optimality properties of
the Pinsker estimator and show that it yields significantly better
reconstructions than traditional inversion methods used in helioseismology,
i.e. Regularized Least Squares (Tikhonov regularization) and SOLA (approximate
inverse) methods.
Moreover, we discuss the incorporation of the mass conservation constraint in
the Pinsker scheme using staggered grids. With this improvement we can
reconstruct not only horizontal, but also vertical velocity components that are
much smaller in amplitude.
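For orientation, the Tikhonov baseline mentioned above can be written as a filter on the singular values of the forward operator. The sketch below is a generic illustration on a small random matrix with white noise, not the helioseismic forward problem, and it does not implement the Pinsker estimator. The matrix `A`, the noise level and the regularization parameter `alpha` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small linear inverse problem: y = A x + noise
A = rng.normal(size=(20, 10))
x_true = rng.normal(size=10)
y = A @ x_true + 0.1 * rng.normal(size=20)

alpha = 0.5  # assumed regularization parameter

# Tikhonov solution via the SVD: each singular component is damped by
# the filter factor s / (s^2 + alpha) instead of being inverted as 1 / s.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
filter_factors = s / (s**2 + alpha)
x_svd = Vt.T @ (filter_factors * (U.T @ y))

# Same estimate from the regularized normal equations
x_direct = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)

print(np.allclose(x_svd, x_direct))
```

Both routes give the same estimate. The SVD form makes explicit that a linear estimator of this kind is characterized by how it weights each singular component.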
Data-Driven Estimation in Equilibrium Using Inverse Optimization
Equilibrium modeling is common in a variety of fields such as game theory and
transportation science. The inputs for these models, however, are often
difficult to estimate, while their outputs, i.e., the equilibria they are meant
to describe, are often directly observable. By combining ideas from inverse
optimization with the theory of variational inequalities, we develop an
efficient, data-driven technique for estimating the parameters of these models
from observed equilibria. We use this technique to estimate the utility
functions of players in a game from their observed actions and to estimate the
congestion function on a road network from traffic count data. A distinguishing
feature of our approach is that it supports both parametric and
nonparametric estimation by leveraging ideas from statistical learning
(kernel methods and regularization operators). In computational experiments
involving Nash and Wardrop equilibria in a nonparametric setting, we find that
a) we effectively estimate the unknown demand or congestion function,
respectively, and b) our proposed regularization technique substantially
improves the out-of-sample performance of our estimators.
Comment: 36 pages, 5 figures. Additional theorems for generalization guarantees
and statistical analysis adde
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience.
Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox, called seqNMF, with new methods for extracting informative, non-redundant sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral datasets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.