Closed-Loop Statistical Verification of Stochastic Nonlinear Systems Subject to Parametric Uncertainties
This paper proposes a statistical verification framework using Gaussian
processes (GPs) for simulation-based verification of stochastic nonlinear
systems with parametric uncertainties. Given a small number of stochastic
simulations, the proposed framework constructs a GP regression model and
predicts the system's performance over the entire set of possible
uncertainties. Included in the framework is a new metric to estimate the
confidence in those predictions based on the variance of the GP's cumulative
distribution function. This variance-based metric forms the basis of active
sampling algorithms that aim to minimize prediction error through careful
selection of simulations. In three case studies, the new active sampling
algorithms demonstrate up to a 35% improvement in prediction error over other
approaches and are able to correctly identify regions with low prediction
confidence through the variance metric.
Comment: 8 pages, submitted to ACC 201
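The GP-based verification loop described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's exact algorithm: the toy "simulation" function, the kernel choice, and the pick-the-highest-variance acquisition rule are all placeholder assumptions (the paper's metric is based on the variance of the GP's cumulative distribution function).

```python
# Variance-driven active sampling with a GP surrogate (illustrative sketch).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x) + 0.1 * rng.standard_normal(x.shape)  # noisy "simulation"

candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)  # set of parametric uncertainties
X = candidates[[0, 100, 199]]                           # small initial design
y = f(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-2)
for _ in range(10):                                     # active-sampling loop
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_next = candidates[[np.argmax(std)]]               # simulate where the GP is least confident
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

mean, std = gp.predict(candidates, return_std=True)
print(f"max predictive std after sampling: {std.max():.3f}")
```

The loop adds one simulation per iteration, so the final design holds the 3 seed points plus 10 actively chosen ones.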
Techniques of linear prediction, with application to oceanic and atmospheric fields in the tropical Pacific
The problem of constructing optimal linear prediction models by multivariate regression methods is reviewed. It is well known that as the number of predictors in a model is increased, the skill of the prediction grows, but the statistical significance generally decreases. For predictions using a large number of candidate predictors, strategies are therefore needed to determine optimal prediction models which properly balance the competing requirements of skill and significance. The popular methods of coefficient screening or stepwise regression represent a posteriori predictor selection methods and therefore cannot be used to recover statistically significant models by truncation if the complete model, including all predictors, is statistically insignificant. Higher significance can be achieved only by a priori reduction of the predictor set. To determine the maximum number of predictors which may be meaningfully incorporated in a model, a model hierarchy can be used in which a series of best-fit prediction models is constructed for an (a priori defined) nested sequence of predictor sets, the sequence being terminated when the significance level either falls below a prescribed limit or reaches a maximum value. The method requires a reliable assessment of model significance. This is characterized by a quadratic statistic which is defined independently of the model skill or artificial skill. As an example, the method is applied to the prediction of sea surface temperature anomalies at Christmas Island (representative of sea surface temperatures in the central equatorial Pacific) and variations of the central and east Pacific Hadley circulation (characterized by the second empirical orthogonal function (EOF) of the meridional component of the trade wind anomaly field) using a general multiple-time-lag prediction matrix.
The ordering of the predictors is based on an EOF sequence, defined formally as orthogonal variables in the composite space of all (normalized) predictors, irrespective of their different physical dimensions, time lag, and geographic position. The choice of a large set of 20 predictors at 12 time lags yields significant predictability only for forecast periods of 3 to 5 months. However, a prior reduction of the predictor set to 4 predictors at 10 time lags leads to 95% significant predictions with skill values of the order of 0.4 to 0.7 up to 6 or 8 months. For infinitely long time series the construction of optimal prediction models reduces essentially to the problem of linear system identification. However, the model hierarchies normally considered for the simulation of general linear systems differ in structure from the model hierarchies which appear to be most suitable for constructing pure prediction models. Thus the truncation imposed by statistical significance requirements can result in rather different models for the two cases. The relation between optimal prediction models and linear dynamical models is illustrated by the prediction of east‐west sea level changes in the equatorial Pacific from wind field anomalies. It is shown that the optimal empirical prediction is statistically consistent in this case with both the first‐order relaxation and damped oscillator models recently proposed by McWilliams and Gent (but with somewhat different model parameters than suggested by the authors). Thus the data do not allow a distinction between the two physical models; the simplest acceptable model is the first‐order damped response. Finally, the problem of estimating forecast skill is discussed. It is usually stated that the forecast skill is smaller than the true skill, which in turn is smaller than the hindcast skill, by an amount which in both cases is approximately equal to the artificial skill. 
However, this result applies to the mean skills averaged over the ensemble of all possible hindcast data sets, given the true model. Under the more appropriate side condition of a given hindcast data set and an unknown true model, the estimation of the forecast skill represents a problem of statistical inference and is dependent on the assumed prior probability distribution of true models. The Bayesian hypothesis of a uniform prior distribution yields an average forecast skill equal to the hindcast skill, but other (equally acceptable) assumptions yield lower forecast skills more compatible with the usual hindcast-averaged expressions.
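The EOF-truncation strategy above can be sketched numerically: normalize the candidate predictors, project onto the leading EOFs (principal components), and fit the prediction model in the truncated EOF space. The synthetic data, the predictand, and the truncation level k below are illustrative assumptions, not the paper's setup.

```python
# EOF-truncated linear prediction (illustrative sketch).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 200, 20                       # samples, candidate predictors
X = rng.standard_normal((n, p))
y = X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.standard_normal(n)  # toy predictand

Xn = (X - X.mean(0)) / X.std(0)      # normalize: EOFs ignore physical units
k = 4                                 # a priori truncation of the predictor set
eofs = PCA(n_components=k).fit(Xn)
Z = eofs.transform(Xn)                # predictors in the composite EOF space

model = LinearRegression().fit(Z, y)
skill = model.score(Z, y)             # hindcast skill (R^2) in the truncated space
print(f"hindcast skill with {k} EOF predictors: {skill:.2f}")
```

Raising k will raise the hindcast skill here, but, as the abstract stresses, the statistical significance of the fitted model generally drops, which is what the truncated hierarchy is designed to control.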
Invariant Manifolds and Rate Constants in Driven Chemical Reactions
Reaction rates of chemical reactions under nonequilibrium conditions can be
determined through the construction of the normally hyperbolic invariant
manifold (NHIM) [and moving dividing surface (DS)] associated with the
transition state trajectory. Here, we extend our recent methods by constructing
points on the NHIM accurately even for multidimensional cases. We also advance
the implementation of machine learning approaches to construct smooth versions
of the NHIM from a known high-accuracy set of its points. That is, we expand on
our earlier use of neural nets, and introduce the use of Gaussian process
regression for the determination of the NHIM. Finally, we compare and contrast
all of these methods for a challenging two-dimensional model barrier case so as
to illustrate their accuracy and general applicability.
Comment: 28 pages, 13 figures, table of contents figure
Hubble Space Telescope time-series photometry of the planetary transit of HD189733: no moon, no rings, starspots
We monitored three transits of the giant gas planet around the nearby K dwarf
HD 189733 with the ACS camera on the Hubble Space Telescope. The resulting
very-high accuracy lightcurve (signal-to-noise ratio near 15000 on individual
measurements, 35000 on 10-minute averages) allows a direct geometric
measurement of the orbital inclination, radius ratio and scale of the system: i
= 85.68 +- 0.04, Rpl/R*=0.1572 +- 0.0004, a/R*=8.92 +- 0.09. We derive improved
values for the stellar and planetary radius, R*=0.755+- 0.011 Rsol, Rpl=1.154
+- 0.017 RJ, and the transit ephemerides, Ttr = 2453931.12048 +- 0.00002 + n x
2.218581 +- 0.000002. The HST data also reveal clear evidence of the planet
occulting spots on the surface of the star. At least one large spot complex
(>80000 km) is required to explain the observed flux residuals and their colour
evolution. This feature is compatible in amplitude and phase with the
variability observed simultaneously from the ground. No evidence for satellites
or rings around HD 189733b is seen in the HST lightcurve. This allows us to
exclude with high probability the presence of Earth-sized moons and
Saturn-type debris rings around this planet. The timing of the three transits
sampled is stable to the level of a few seconds, excluding a massive second
planet in outer 2:1 resonance.
Comment: revised version. Significant updates and new figures; to appear in
Astronomy and Astrophysics
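The quoted planetary radius follows directly from the measured radius ratio and the stellar radius, Rpl = (Rpl/R*) * R*. A quick consistency check of the numbers in the abstract (using standard nominal values for the solar and Jovian radii):

```python
# Consistency check: Rpl = (Rpl/R*) * R*, converted from solar to Jovian radii.
R_SUN_KM = 695_700.0     # nominal solar radius, km
R_JUP_KM = 71_492.0      # nominal Jovian equatorial radius, km

ratio = 0.1572           # Rpl/R* from the transit depth
r_star = 0.755           # stellar radius in solar radii

r_planet_km = ratio * r_star * R_SUN_KM
r_planet_rj = r_planet_km / R_JUP_KM
print(f"Rpl = {r_planet_rj:.3f} RJ")   # ~1.15 RJ, consistent with the quoted 1.154
```

The >80000 km spot complex quoted above is thus comparable in size to the planet itself (~82,500 km radius).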
Sign-Perturbed Sums (SPS) with Asymmetric Noise: Robustness Analysis and Robustification Techniques
Sign-Perturbed Sums (SPS) is a recently developed finite-sample system identification method that can build exact confidence regions for linear regression problems under mild statistical assumptions. The regions are well-shaped: for example, they are centred around the least-squares (LS) estimate, star-convex and strongly consistent. One of the main assumptions of SPS is that the distributions of the noise terms are symmetric about zero. This paper analyses how robust SPS is with respect to the violation of this assumption and how it can be robustified against non-symmetric noises. First, some alternative solutions are reviewed; then a robustness analysis is performed, resulting in a robustified version of SPS. We also suggest a modification of SPS, called LAD-SPS, which builds exact confidence regions around the least-absolute-deviation (LAD) estimate instead of the LS estimate. LAD-SPS requires fewer assumptions, as the noise need only have a conditionally zero median (w.r.t. the past). Furthermore, this approach can also be robustified using similar ideas as in the LS-SPS case. Finally, some numerical experiments are presented.
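The basic SPS membership test can be sketched as follows. This is a hedged reconstruction of the general construction, not the paper's exact variant: the number of sign perturbations m, the threshold q, and the unweighted norm below are illustrative choices (the confidence level is 1 - q/m).

```python
# Basic SPS indicator for linear regression (illustrative sketch).
import numpy as np

def sps_indicator(X, y, theta, m=100, q=5, seed=0):
    """True iff theta lies inside the (1 - q/m) SPS confidence region."""
    rng = np.random.default_rng(seed)
    eps = y - X @ theta                          # residuals at the candidate theta
    h0 = np.sum((X.T @ eps) ** 2)                # unperturbed sum
    perturbed = [np.sum((X.T @ (rng.choice([-1.0, 1.0], size=len(y)) * eps)) ** 2)
                 for _ in range(m - 1)]          # sign-perturbed sums
    n_larger = sum(h > h0 for h in perturbed)
    return n_larger >= q                         # accept iff h0 is not among the q largest

# Example: the LS estimate is always inside (the region is centred on it),
# while a grossly wrong parameter vector is rejected.
rng = np.random.default_rng(2)
X = rng.standard_normal((50, 2))
theta_true = np.array([1.0, -2.0])
y = X @ theta_true + rng.laplace(size=50)        # symmetric, non-Gaussian noise
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print(sps_indicator(X, y, theta_ls))             # True
print(sps_indicator(X, y, np.array([10.0, 10.0])))  # False
```

Note the symmetry assumption enters through the sign flips: under symmetric noise the perturbed sums are distributionally exchangeable with the unperturbed one at the true parameter, which is exactly what the paper's robustness analysis relaxes.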
Guaranteed characterization of exact confidence regions for FIR models under mild assumptions on the noise via interval analysis
SPS is one of the two methods proposed recently by Campi et al. to obtain exact, non-asymptotic confidence regions for parameter estimates under mild assumptions on the noise distribution. It does not require the measurement noise to be Gaussian (or to have any other known distribution, for that matter). The numerical characterization of the resulting confidence regions is far from trivial, however, and has so far only been carried out on very low-dimensional problems, via methods that could not guarantee their results and could not be extended to large-scale problems because of their intrinsic complexity. The aim of the present paper is to show how interval analysis can contribute to a guaranteed characterization of exact confidence regions in large-scale problems. The application considered is the estimation of the parameters of finite-impulse-response (FIR) models. The structure of the problem makes it possible to define a very efficient specific contractor, allowing the treatment of models with a large number of parameters, as is the rule for FIR models, and thus escaping the curse of dimensionality that often plagues interval methods.
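The contractor idea at the heart of the interval approach can be illustrated on a toy 2-tap FIR constraint y = theta1*u1 + theta2*u2: given an interval bound on the output y and known scalar inputs, a forward-backward pass shrinks the parameter box to the part consistent with the constraint. This is a minimal sketch, not the paper's specific contractor; it assumes the intersections stay nonempty, and real solvers combine many such constraints with bisection.

```python
# Toy forward-backward interval contractor for y = th1*u1 + th2*u2.
def scale(a, c):
    """Interval [a] times a scalar c."""
    lo, hi = a[0] * c, a[1] * c
    return (min(lo, hi), max(lo, hi))

def iadd(a, b): return (a[0] + b[0], a[1] + b[1])
def isub(a, b): return (a[0] - b[1], a[1] - b[0])

def iinter(a, b):
    """Intersection of two intervals (assumed nonempty here)."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def contract(th1, th2, u1, u2, y):
    """One forward-backward pass on the constraint y = th1*u1 + th2*u2,
    with known scalar inputs u1, u2 and an interval measurement y."""
    y = iinter(y, iadd(scale(th1, u1), scale(th2, u2)))          # forward
    th1 = iinter(th1, scale(isub(y, scale(th2, u2)), 1.0 / u1))  # backward on th1
    th2 = iinter(th2, scale(isub(y, scale(th1, u1)), 1.0 / u2))  # backward on th2
    return th1, th2

# Example: inputs u = (1.0, 2.0), noisy output bounded in [2.5, 3.5],
# prior box theta in [0, 2] x [0, 2]:
th1, th2 = contract((0.0, 2.0), (0.0, 2.0), 1.0, 2.0, (2.5, 3.5))
print(th1, th2)   # th2 shrinks to (0.25, 1.75); th1 is not contracted here
```

Each measurement contributes one such constraint; iterating contractions over all of them is what yields the guaranteed enclosure of the confidence region.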
A Fresh Approach to Forecasting in Astroparticle Physics and Dark Matter Searches
We present a toolbox of new techniques and concepts for the efficient
forecasting of experimental sensitivities. These are applicable to a large
range of scenarios in (astro-)particle physics, and based on the Fisher
information formalism. Fisher information provides an answer to the question:
what is the maximum extractable information from a given observation? It is a
common tool for the forecasting of experimental sensitivities in many branches
of science, but rarely used in astroparticle physics or searches for particle
dark matter. After briefly reviewing the Fisher information matrix of general
Poisson likelihoods, we propose very compact expressions for estimating
expected exclusion and discovery limits (equivalent counts method). We
demonstrate by comparison with Monte Carlo results that they remain
surprisingly accurate even deep in the Poisson regime. We show how correlated
background systematics can be efficiently accounted for by a treatment based on
Gaussian random fields. Finally, we introduce the novel concept of Fisher
information flux. It can be thought of as a generalization of the commonly used
signal-to-noise ratio, while accounting for the non-local properties and
saturation effects of background and instrumental uncertainties. It is a
powerful and flexible tool ready to be used as core concept for informed
strategy development in astroparticle physics and searches for particle dark
matter.
Comment: 33 pages, 12 figures
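For a binned Poisson counting experiment, the Fisher matrix reviewed above takes the compact form I_ab = sum_k (dmu_k/dtheta_a)(dmu_k/dtheta_b) / mu_k, where mu_k is the expected count in bin k. A minimal sketch, assuming a two-parameter model mu_k = s*S_k + b*B_k (signal strength s, background normalization b) with illustrative spectra:

```python
# Fisher forecast for a binned Poisson likelihood (illustrative sketch).
import numpy as np

S = np.array([5.0, 20.0, 10.0, 2.0])      # expected signal counts per bin (assumed)
B = np.array([100.0, 80.0, 60.0, 40.0])   # expected background counts per bin (assumed)
s, b = 1.0, 1.0                            # fiducial parameter values

mu = s * S + b * B                         # expected total counts per bin
grads = np.stack([S, B])                   # dmu/ds and dmu/db, shape (2, n_bins)
I = grads @ np.diag(1.0 / mu) @ grads.T    # 2x2 Poisson Fisher matrix

cov = np.linalg.inv(I)                     # Cramer-Rao bound on the covariance
sigma_s = np.sqrt(cov[0, 0])               # forecast 1-sigma error on s
print(f"forecast sigma(s) = {sigma_s:.2f}")
```

Inverting the full matrix (rather than taking 1/I_ss) is what marginalizes over the background normalization; the paper's Gaussian-random-field treatment generalizes this to correlated background systematics across many bins.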