Search CORE

3,244 research outputs found

Variational Downscaling, Fusion and Assimilation of Hydrometeorological States via Regularized Estimation

Author: Ebtehaj Ardeshir Mohammad
Foufoula-Georgiou Efi
Publication venue: 'Wiley'
Publication date: 02/10/2012
Field of study

Improved estimation of hydrometeorological states from down-sampled observations and background model forecasts in a noisy environment, has been a subject of growing research in the past decades. Here, we introduce a unified framework that ties together the problems of downscaling, data fusion and data assimilation as ill-posed inverse problems. This framework seeks solutions beyond the classic least squares estimation paradigms by imposing proper regularization, which are constraints consistent with the degree of smoothness and probabilistic structure of the underlying state. We review relevant regularization methods in derivative space and extend classic formulations of the aforementioned problems with particular emphasis on hydrologic and atmospheric applications. Informed by the statistical characteristics of the state variable of interest, the central results of the paper suggest that proper regularization can lead to a more accurate and stable recovery of the true state and hence more skillful forecasts. In particular, using the Tikhonov and Huber regularization in the derivative space, the promise of the proposed framework is demonstrated in static downscaling and fusion of synthetic multi-sensor precipitation data, while a data assimilation numerical experiment is presented using the heat equation in a variational setting

arXiv.org e-Print Archive

CiteSeerX

DigitalCommons@USU

Piecewise linear regularized solution paths

Author: Rosset Saharon
Zhu Ji
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007
Field of study

We consider the generic regularized optimization problem

\hat{\mathsf{\beta}}(\lambda)=\arg \min_{\beta}L({\sf{y}},X{\sf{\beta}})+\lambda J({\sf{\beta}})

. Efron, Hastie, Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407--499] have shown that for the LASSO--that is, if

L

is squared error loss and

J(\beta)=\|\beta\|_1

is the

\ell_1

norm of

\beta

--the optimal coefficient path is piecewise linear, that is,

\partial \hat{\beta}(\lambda)/\partial \lambda

is piecewise constant. We derive a general characterization of the properties of (loss

L

, penalty

J

) pairs which give piecewise linear coefficient paths. Such pairs allow for efficient generation of the full regularized coefficient paths. We investigate the nature of efficient path following algorithms which arise. We use our results to suggest robust versions of the LASSO for regression and classification, and to develop new, efficient algorithms for existing problems in the literature, including Mammen and van de Geer's locally adaptive regression splines.Comment: Published at http://dx.doi.org/10.1214/009053606000001370 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

Crossref

Linear system identification using stable spline kernels and PLQ penalties

Author: Aravkin Aleksandr Y.
Burke James V.
Pillonetto Gianluigi
Publication venue
Publication date: 01/01/2013
Field of study

The classical approach to linear system identification is given by parametric Prediction Error Methods (PEM). In this context, model complexity is often unknown so that a model order selection step is needed to suitably trade-off bias and variance. Recently, a different approach to linear system identification has been introduced, where model order determination is avoided by using a regularized least squares framework. In particular, the penalty term on the impulse response is defined by so called stable spline kernels. They embed information on regularity and BIBO stability, and depend on a small number of parameters which can be estimated from data. In this paper, we provide new nonsmooth formulations of the stable spline estimator. In particular, we consider linear system identification problems in a very broad context, where regularization functionals and data misfits can come from a rich set of piecewise linear quadratic functions. Moreover, our anal- ysis includes polyhedral inequality constraints on the unknown impulse response. For any formulation in this class, we show that interior point methods can be used to solve the system identification problem, with complexity O(n3)+O(mn2) in each iteration, where n and m are the number of impulse response coefficients and measurements, respectively. The usefulness of the framework is illustrated via a numerical experiment where output measurements are contaminated by outliers.Comment: 8 pages, 2 figure

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

Stability

Author: Yu Bin
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/09/2013
Field of study

Reproducibility is imperative for any scientific discovery. More often than not, modern scientific findings rely on statistical analysis of high-dimensional data. At a minimum, reproducibility manifests itself in stability of statistical results relative to "reasonable" perturbations to data and to the model used. Jacknife, bootstrap, and cross-validation are based on perturbations to data, while robust statistics methods deal with perturbations to models. In this article, a case is made for the importance of stability in statistics. Firstly, we motivate the necessity of stability for interpretable and reliable encoding models from brain fMRI signals. Secondly, we find strong evidence in the literature to demonstrate the central role of stability in statistical inference, such as sensitivity analysis and effect detection. Thirdly, a smoothing parameter selector based on estimation stability (ES), ES-CV, is proposed for Lasso, in order to bring stability to bear on cross-validation (CV). ES-CV is then utilized in the encoding models to reduce the number of predictors by 60% with almost no loss (1.3%) of prediction performance across over 2,000 voxels. Last, a novel "stability" argument is seen to drive new results that shed light on the intriguing interactions between sample to sample variability and heavier tail error distribution (e.g., double-exponential) in high-dimensional regression models with

p

predictors and

n

independent samples. In particular, when

p/n\rightarrow\kappa\in(0.3,1)

and the error distribution is double-exponential, the Ordinary Least Squares (OLS) is a better estimator than the Least Absolute Deviation (LAD) estimator.Comment: Published in at http://dx.doi.org/10.3150/13-BEJSP14 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

arXiv.org e-Print Archive

Crossref

eScholarship - University of California