Variational Downscaling, Fusion and Assimilation of Hydrometeorological States via Regularized Estimation
Improved estimation of hydrometeorological states from down-sampled
observations and background model forecasts in a noisy environment has been a
subject of growing research in the past decades. Here, we introduce a unified
framework that ties together the problems of downscaling, data fusion and data
assimilation as ill-posed inverse problems. This framework seeks solutions
beyond the classic least squares estimation paradigms by imposing proper
regularization, that is, constraints consistent with the degree of smoothness
and probabilistic structure of the underlying state. We review relevant
regularization methods in derivative space and extend classic formulations of
the aforementioned problems with particular emphasis on hydrologic and
atmospheric applications. Informed by the statistical characteristics of the
state variable of interest, the central results of the paper suggest that
proper regularization can lead to a more accurate and stable recovery of the
true state and hence more skillful forecasts. In particular, using the Tikhonov
and Huber regularization in the derivative space, the promise of the proposed
framework is demonstrated in static downscaling and fusion of synthetic
multi-sensor precipitation data, while a data assimilation numerical experiment
is presented using the heat equation in a variational setting.
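To make the idea of regularization in derivative space concrete, here is a minimal sketch of Tikhonov-regularized downscaling in one dimension. It is an illustration under stated assumptions, not the paper's implementation: the coarse observation is assumed to be a block average of the fine field, the penalty is the squared first difference, and the names `tikhonov_downscale`, `factor`, and `lam` are hypothetical. The estimate solves the normal equations (H^T H + lam D^T D) x = H^T y. The Huber variant mentioned in the abstract is not covered.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def tikhonov_downscale(y, factor, lam):
    """Recover a fine 1-D field from block-averaged observations y.

    Minimizes ||y - H x||^2 + lam * ||D x||^2, where H averages each
    block of `factor` fine cells (an assumed observation model) and
    D is the first-difference operator.
    """
    m = len(y)
    n = m * factor
    H = [[1.0 / factor if i * factor <= j < (i + 1) * factor else 0.0
          for j in range(n)] for i in range(m)]
    D = [[-1.0 if j == i else 1.0 if j == i + 1 else 0.0
          for j in range(n)] for i in range(n - 1)]
    # Normal equations: (H^T H + lam * D^T D) x = H^T y
    A = [[sum(H[k][i] * H[k][j] for k in range(m))
          + lam * sum(D[k][i] * D[k][j] for k in range(n - 1))
          for j in range(n)] for i in range(n)]
    b = [sum(H[k][i] * y[k] for k in range(m)) for i in range(n)]
    return solve(A, b)


coarse = [1.0, 2.0, 4.0, 3.0]  # made-up coarse-scale observations
fine = tikhonov_downscale(coarse, factor=2, lam=0.01)
print([round(v, 3) for v in fine])
```

Without the penalty the problem is ill-posed, since many fine fields share the same block averages. The derivative-space term selects the smoothest of them, and a larger `lam` trades data fidelity for smoothness.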
Piecewise linear regularized solution paths
We consider the generic regularized optimization problem
$\hat{\beta}(\lambda)=\arg\min_{\beta}L(y,X\beta)+\lambda J(\beta)$. Efron, Hastie,
Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407--499] have shown that for
the LASSO--that is, if $L$ is squared error loss and $J(\beta)=\|\beta\|_1$ is
the $\ell_1$ norm of $\beta$--the optimal coefficient path is piecewise linear,
that is, $\partial\hat{\beta}(\lambda)/\partial\lambda$ is piecewise
constant. We derive a general characterization of the properties of (loss $L$,
penalty $J$) pairs which give piecewise linear coefficient paths. Such pairs
allow for efficient generation of the full regularized coefficient paths. We
investigate the nature of efficient path following algorithms which arise. We
use our results to suggest robust versions of the LASSO for regression and
classification, and to develop new, efficient algorithms for existing problems
in the literature, including Mammen and van de Geer's locally adaptive
regression splines. Comment: Published at http://dx.doi.org/10.1214/009053606000001370 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
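The piecewise linearity of the LASSO path is easiest to see in a special case. The sketch below assumes an orthonormal design (X^T X = I) and the objective 0.5 * ||y - X b||^2 + lam * ||b||_1, for which each coefficient is the soft-thresholded value of z_j = (X^T y)_j. The function names and the numbers in `z` are illustrative assumptions, and this is not the general path-following algorithm of the paper.

```python
def soft_threshold(z, lam):
    """Closed-form LASSO coefficient for one orthonormal predictor."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_path(z, lambdas):
    """Coefficient vector at each penalty level, given z = X^T y."""
    return [[soft_threshold(zj, lam) for zj in z] for lam in lambdas]


z = [3.0, -1.5, 0.5]  # made-up least squares coefficients
lambdas = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
path = lasso_path(z, lambdas)
for lam, beta in zip(lambdas, path):
    print(lam, beta)
```

Each coefficient moves toward zero with slope -sign(z_j) until lam reaches |z_j|, then stays at zero. The derivative of the path with respect to lam is therefore piecewise constant, with knots at the values |z_j|.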
Linear system identification using stable spline kernels and PLQ penalties
The classical approach to linear system identification is given by parametric
Prediction Error Methods (PEM). In this context, model complexity is often
unknown so that a model order selection step is needed to suitably trade off
bias and variance. Recently, a different approach to linear system
identification has been introduced, where model order determination is avoided
by using a regularized least squares framework. In particular, the penalty term
on the impulse response is defined by so-called stable spline kernels. They
embed information on regularity and BIBO stability, and depend on a small
number of parameters which can be estimated from data. In this paper, we
provide new nonsmooth formulations of the stable spline estimator. In
particular, we consider linear system identification problems in a very broad
context, where regularization functionals and data misfits can come from a rich
set of piecewise linear quadratic functions. Moreover, our analysis includes
polyhedral inequality constraints on the unknown impulse response. For any
formulation in this class, we show that interior point methods can be used to
solve the system identification problem, with complexity O(n^3)+O(mn^2) in each
iteration, where n and m are the number of impulse response coefficients and
measurements, respectively. The usefulness of the framework is illustrated via
a numerical experiment where output measurements are contaminated by outliers. Comment: 8 pages, 2 figure
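As a rough illustration of why piecewise linear quadratic (PLQ) misfits help with outliers, the sketch below evaluates four standard members of that class on a residual vector with one gross outlier. The particular selection (quadratic, absolute value, Huber, Vapnik epsilon-insensitive), the parameter names `kappa` and `eps`, and the residual values are assumptions made for illustration. The abstract does not say which penalties the paper's experiment uses.

```python
def quadratic(r):
    """Least squares misfit: grows quadratically everywhere."""
    return 0.5 * r * r


def absolute(r):
    """l1 misfit: grows linearly everywhere."""
    return abs(r)


def huber(r, kappa=1.0):
    """Quadratic for |r| <= kappa, linear beyond it."""
    a = abs(r)
    return 0.5 * r * r if a <= kappa else kappa * a - 0.5 * kappa * kappa


def vapnik(r, eps=0.5):
    """Epsilon-insensitive loss: zero inside the band, linear outside."""
    return max(abs(r) - eps, 0.0)


residuals = [0.1, -0.3, 0.2, 8.0]  # made-up values; the last is an outlier
for name, loss in [("quadratic", quadratic), ("absolute", absolute),
                   ("huber", huber), ("vapnik", vapnik)]:
    total = sum(loss(r) for r in residuals)
    share = loss(residuals[-1]) / total
    print(f"{name:9s} total={total:7.3f} outlier share={share:.3f}")
```

Under the quadratic misfit the outlier contributes almost all of the cost, so the fit is pulled toward it. The linearly growing penalties bound that influence.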
Stability
Reproducibility is imperative for any scientific discovery. More often than
not, modern scientific findings rely on statistical analysis of
high-dimensional data. At a minimum, reproducibility manifests itself in
stability of statistical results relative to "reasonable" perturbations to data
and to the model used. Jackknife, bootstrap, and cross-validation are based on
perturbations to data, while robust statistics methods deal with perturbations
to models. In this article, a case is made for the importance of stability in
statistics. Firstly, we motivate the necessity of stability for interpretable
and reliable encoding models from brain fMRI signals. Secondly, we find strong
evidence in the literature to demonstrate the central role of stability in
statistical inference, such as sensitivity analysis and effect detection.
Thirdly, a smoothing parameter selector based on estimation stability (ES),
ES-CV, is proposed for Lasso, in order to bring stability to bear on
cross-validation (CV). ES-CV is then utilized in the encoding models to reduce
the number of predictors by 60% with almost no loss (1.3%) of prediction
performance across over 2,000 voxels. Last, a novel "stability" argument is
seen to drive new results that shed light on the intriguing interactions
between sample to sample variability and heavier tail error distribution (e.g.,
double-exponential) in high-dimensional regression models with $p$ predictors
and $n$ independent samples. In particular, when $p/n\rightarrow\kappa\in(0.3,1)$
and the error distribution is
double-exponential, the Ordinary Least Squares (OLS) is a better estimator than
the Least Absolute Deviation (LAD) estimator. Comment: Published at http://dx.doi.org/10.3150/13-BEJSP14 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm