Search CORE

3,630 research outputs found

An analytic comparison of regularization methods for Gaussian Processes

Author: Bay Xavier
Durrande Nicolas
Mohammadi Hossein
Riche Rodolphe Le
Touboul Eric
Publication venue
Publication date: 08/04/2015
Field of study

Gaussian Processes (GPs) are a popular approach to predict the output of a parameterized experiment. They have many applications in the field of Computer Experiments, in particular to perform sensitivity analysis, adaptive design of experiments and global optimization. Nearly all of the applications of GPs require the inversion of a covariance matrix that, in practice, is often ill-conditioned. Regularization methodologies are then employed with consequences on the GPs that need to be better understood.The two principal methods to deal with ill-conditioned covariance matrices are i) pseudoinverse and ii) adding a positive constant to the diagonal (the so-called nugget regularization).The first part of this paper provides an algebraic comparison of PI and nugget regularizations. Redundant points, responsible for covariance matrix singularity, are defined. It is proven that pseudoinverse regularization, contrarily to nugget regularization, averages the output values and makes the variance zero at redundant points. However, pseudoinverse and nugget regularizations become equivalent as the nugget value vanishes. A measure for data-model discrepancy is proposed which serves for choosing a regularization technique.In the second part of the paper, a distribution-wise GP is introduced that interpolates Gaussian distributions instead of data points. Distribution-wise GP can be seen as an improved regularization method for GPs

arXiv.org e-Print Archive

HAL Clermont Université

Open Research Exeter

HAL-EMSE

Pinsker estimators for local helioseismology

Author: Fournier D.
Gizon L.
Hohage T.
Holzke M.
Publication venue: 'IOP Publishing'
Publication date: 01/01/2016
Field of study

A major goal of helioseismology is the three-dimensional reconstruction of the three velocity components of convective flows in the solar interior from sets of wave travel-time measurements. For small amplitude flows, the forward problem is described in good approximation by a large system of convolution equations. The input observations are highly noisy random vectors with a known dense covariance matrix. This leads to a large statistical linear inverse problem. Whereas for deterministic linear inverse problems several computationally efficient minimax optimal regularization methods exist, only one minimax-optimal linear estimator exists for statistical linear inverse problems: the Pinsker estimator. However, it is often computationally inefficient because it requires a singular value decomposition of the forward operator or it is not applicable because of an unknown noise covariance matrix, so it is rarely used for real-world problems. These limitations do not apply in helioseismology. We present a simplified proof of the optimality properties of the Pinsker estimator and show that it yields significantly better reconstructions than traditional inversion methods used in helioseismology, i.e.\ Regularized Least Squares (Tikhonov regularization) and SOLA (approximate inverse) methods. Moreover, we discuss the incorporation of the mass conservation constraint in the Pinsker scheme using staggered grids. With this improvement we can reconstruct not only horizontal, but also vertical velocity components that are much smaller in amplitude

arXiv.org e-Print Archive

MPG.PuRe

Data-Driven Estimation in Equilibrium Using Inverse Optimization

Author: Dimitris Bertsimas
Dimitris Bertsimas
Vishal Gupta
Vishal Gupta
Publication venue
Publication date: 01/01/2014
Field of study

Equilibrium modeling is common in a variety of fields such as game theory and transportation science. The inputs for these models, however, are often difficult to estimate, while their outputs, i.e., the equilibria they are meant to describe, are often directly observable. By combining ideas from inverse optimization with the theory of variational inequalities, we develop an efficient, data-driven technique for estimating the parameters of these models from observed equilibria. We use this technique to estimate the utility functions of players in a game from their observed actions and to estimate the congestion function on a road network from traffic count data. A distinguishing feature of our approach is that it supports both parametric and \emph{nonparametric} estimation by leveraging ideas from statistical learning (kernel methods and regularization operators). In computational experiments involving Nash and Wardrop equilibria in a nonparametric setting, we find that a) we effectively estimate the unknown demand or congestion function, respectively, and b) our proposed regularization technique substantially improves the out-of-sample performance of our estimators.Comment: 36 pages, 5 figures Additional theorems for generalization guarantees and statistical analysis adde

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref

Boston University Institutional Repository (OpenBU)

Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience.

Author: Bahle Andrew H
Denisenko Natalia I
Fee Michale S
Goldman Mark S
Gu Shijie
Mackevicius Emily L
Williams Alex H
Publication venue: eScholarship, University of California
Publication date: 01/02/2019
Field of study

Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here, we describe a software toolbox-called seqNMF-with new methods for extracting informative, non-redundant, sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral datas. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs

DSpace@MIT

eScholarship - University of California