224,076 research outputs found
Computational Methods for Parameter Estimation in Climate Models
Intensive computational methods have been used by Earth scientists for a wide range of problems in data inversion and uncertainty quantification, such as earthquake epicenter location and climate projections. To quantify the uncertainties resulting from a range of plausible model configurations, it is necessary to estimate a multidimensional probability distribution. The computational cost of estimating these distributions for geoscience applications is impractical with traditional methods such as Metropolis/Gibbs algorithms, because simulation costs limit the number of experiments that can reasonably be run. Several alternative sampling strategies that could improve sampling efficiency have been proposed, including Multiple Very Fast Simulated Annealing (MVFSA) and Adaptive Metropolis algorithms. The performance of these proposed sampling strategies is evaluated with a surrogate climate model that approximates the noise and response behavior of a realistic atmospheric general circulation model (AGCM). The surrogate model is fast enough that its evaluation can be embedded in these Monte Carlo algorithms. We show that adaptive methods can outperform MVFSA, approximating the known posterior distribution with fewer forward evaluations. However, the adaptive methods can also be limited by inadequate sample mixing. The Single Component and Delayed Rejection Adaptive Metropolis algorithms were found to resolve these limitations, although challenges remain in approximating multi-modal distributions. The results show that these advanced methods of statistical inference can provide practical solutions to the climate model calibration problem and to the challenge of quantifying climate projection uncertainties. The computational methods would also be useful for problems outside climate prediction, particularly those where sampling is limited by the availability of computational resources.
Funding: National Science Foundation OCE-0415251; CONACyT-Mexico 159764; Institute for Geophysics
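As a rough sketch of the adaptive approach evaluated here, the following implements the basic Adaptive Metropolis recipe (Haario-style covariance adaptation from the chain history). It is not the authors' code: the two-dimensional Gaussian target stands in for the fast surrogate posterior, and all names and settings are illustrative.

```python
import numpy as np

def adaptive_metropolis(log_post, x0, n_steps=5000, adapt_start=500, eps=1e-8):
    """Adaptive Metropolis sketch: the Gaussian proposal covariance is
    re-estimated from the chain history once a burn-in has accumulated."""
    d = len(x0)
    sd = 2.4**2 / d                       # standard AM scaling factor
    chain = np.empty((n_steps, d))
    x, lp = np.asarray(x0, float), log_post(x0)
    cov = np.eye(d)                       # initial proposal covariance
    for t in range(n_steps):
        if t >= adapt_start:              # adapt using all samples so far
            cov = sd * (np.cov(chain[:t].T) + eps * np.eye(d))
        prop = np.random.multivariate_normal(x, cov)
        lp_prop = log_post(prop)
        if np.log(np.random.rand()) < lp_prop - lp:   # Metropolis accept step
            x, lp = prop, lp_prop
        chain[t] = x
    return chain

# Toy stand-in for a fast surrogate posterior: a correlated 2-D Gaussian.
prec = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))
log_post = lambda x: -0.5 * x @ prec @ x
samples = adaptive_metropolis(log_post, x0=np.zeros(2))
```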
mfEGRA: Multifidelity Efficient Global Reliability Analysis through Active Learning for Failure Boundary Location
This paper develops mfEGRA, a multifidelity active learning method using
data-driven adaptively refined surrogates for failure boundary location in
reliability analysis. This work addresses the issue of prohibitive cost of
reliability analysis using Monte Carlo sampling for expensive-to-evaluate
high-fidelity models by using cheaper-to-evaluate approximations of the
high-fidelity model. The method builds on the Efficient Global Reliability
Analysis (EGRA) method, which is a surrogate-based method that uses adaptive
sampling for refining Gaussian process surrogates for failure boundary location
using a single-fidelity model. Our method introduces a two-stage adaptive
sampling criterion that uses a multifidelity Gaussian process surrogate to
leverage multiple information sources with different fidelities. The method
combines the expected feasibility criterion from EGRA with a one-step
lookahead information gain to refine the surrogate around the failure boundary.
The computational savings from mfEGRA depend on the discrepancy between the
models and on the cost of evaluating each lower-fidelity model relative to the
high-fidelity model. We show that accurate estimation of
reliability using mfEGRA leads to computational savings of 46% for an
analytic multimodal test problem and 24% for a three-dimensional acoustic horn
problem, when compared to single-fidelity EGRA. We also show the effect of
using a priori drawn Monte Carlo samples in the implementation for the acoustic
horn problem, where mfEGRA leads to computational savings of 45% for the
three-dimensional case and 48% for a rarer event four-dimensional case as
compared to single-fidelity EGRA.
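For orientation, the expected feasibility criterion inherited from single-fidelity EGRA can be sketched as below; mfEGRA additionally weights an acquisition of this kind with a one-step lookahead information gain across fidelities, which is not shown here, and the candidate values are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def expected_feasibility(mu, sigma, z=0.0, eps_factor=2.0):
    """Expected feasibility function used by EGRA (Bichon et al.).
    mu, sigma: Gaussian process posterior mean and std of the limit state.
    z: failure threshold; large values flag points near the boundary."""
    eps = eps_factor * sigma
    t = (z - mu) / sigma
    tm = (z - eps - mu) / sigma
    tp = (z + eps - mu) / sigma
    return ((mu - z) * (2 * norm.cdf(t) - norm.cdf(tm) - norm.cdf(tp))
            - sigma * (2 * norm.pdf(t) - norm.pdf(tm) - norm.pdf(tp))
            + eps * (norm.cdf(tp) - norm.cdf(tm)))

# Pick the next sample where the surrogate is most likely near g(x) = 0.
mu = np.array([1.5, 0.1, -0.2, 3.0])      # GP means at candidate points
sigma = np.array([0.5, 0.4, 0.6, 0.2])    # GP standard deviations
next_point = np.argmax(expected_feasibility(mu, sigma))
```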
Recursive Partitioning for Heterogeneous Causal Effects
In this paper we study the problems of estimating heterogeneity in causal
effects in experimental or observational studies and conducting inference about
the magnitude of the differences in treatment effects across subsets of the
population. In applications, our method provides a data-driven approach to
determine which subpopulations have large or small treatment effects and to
test hypotheses about the differences in these effects. For experiments, our
method allows researchers to identify heterogeneity in treatment effects that
was not specified in a pre-analysis plan, without concern about invalidating
inference due to multiple testing. In most of the literature on supervised
machine learning (e.g. regression trees, random forests, LASSO, etc.), the goal
is to build a model of the relationship between a unit's attributes and an
observed outcome. A prominent role in these methods is played by
cross-validation which compares predictions to actual outcomes in test samples,
in order to select the level of complexity of the model that provides the best
predictive power. Our method is closely related, but it differs in that it is
tailored for predicting causal effects of a treatment rather than a unit's
outcome. The challenge is that the "ground truth" for a causal effect is not
observed for any individual unit: we observe the unit with the treatment, or
without the treatment, but not both at the same time. Thus, it is not obvious
how to use cross-validation to determine whether a causal effect has been
accurately predicted. We propose several novel cross-validation criteria for
this problem and demonstrate through simulations the conditions under which
they perform better than standard methods for estimating causal effects. We
then apply the method to a large-scale field experiment re-ranking results on a
search engine.
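One concrete way to obtain a cross-validation signal without ever observing an individual unit's effect, along the lines the abstract describes, is the transformed outcome, whose conditional mean equals the treatment effect. The sketch below is illustrative rather than the paper's estimator, and assumes a randomized treatment with known propensity p.

```python
import numpy as np

def transformed_outcome(y, w, p):
    """Transformed outcome Y* = Y * (W - p) / (p * (1 - p)).
    Under randomization, E[Y* | X] equals the treatment effect tau(X),
    so Y* can serve as a noisy 'ground truth' for cross-validation."""
    return y * (w - p) / (p * (1.0 - p))

def cv_mse(tau_hat, y, w, p):
    """Proxy test-set error for predicted effects tau_hat:
    mean squared distance to the transformed outcome."""
    return np.mean((transformed_outcome(y, w, p) - tau_hat) ** 2)

# Illustration on simulated data with a known heterogeneous effect.
rng = np.random.default_rng(0)
n, p = 10_000, 0.5
x = rng.uniform(-1, 1, n)
w = rng.binomial(1, p, n)                # randomized treatment assignment
tau = 2.0 * (x > 0)                      # true effect: 2 for x > 0, else 0
y = x + w * tau + rng.normal(0, 1, n)
print(cv_mse(tau, y, w, p), "<", cv_mse(np.zeros(n), y, w, p))
```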
Characterization of a qubit Hamiltonian using adaptive measurements in a fixed basis
We investigate schemes for Hamiltonian parameter estimation of a two-level
system using repeated measurements in a fixed basis. The simplest
(Fourier-based) schemes yield an estimate with a mean square error (MSE) that decreases
at best as a power law ~N^{-2} in the number of measurements N. By contrast, we
present numerical simulations indicating that an adaptive Bayesian algorithm,
where the time between measurements can be adjusted based on prior measurement
results, yields an MSE that appears to scale close to \exp(-0.3 N). That is,
measurements in a single fixed basis are sufficient to achieve exponential
scaling in N.
Comment: 5 pages, 3 figures, 1 table. Published version.
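A minimal sketch of such an adaptive Bayesian scheme: a grid posterior over the unknown frequency, updated after each fixed-basis shot, with an illustrative rule for choosing the next evolution time (the paper's exact utility function is not reproduced here, and all values are synthetic).

```python
import numpy as np

rng = np.random.default_rng(1)
omega_true = 0.73                      # unknown qubit frequency (illustrative units)
omegas = np.linspace(0, 1, 2000)       # discretized support for the posterior
prior = np.full_like(omegas, 1.0 / omegas.size)

def p0(omega, t):
    """Probability of measuring |0> in the fixed basis after evolving
    |0> for time t under H = (omega / 2) * sigma_x."""
    return np.cos(omega * t / 2) ** 2

for _ in range(100):
    # Illustrative adaptive rule: evolve for roughly 1/(posterior std),
    # so the fringes resolve the remaining uncertainty in omega.
    mean = np.sum(prior * omegas)
    var = max(float(np.sum(prior * omegas**2) - mean**2), 0.0)
    t = 1.0 / max(np.sqrt(var), 1e-3)
    outcome0 = rng.random() < p0(omega_true, t)    # simulate one projective shot
    likelihood = p0(omegas, t) if outcome0 else 1.0 - p0(omegas, t)
    prior = prior * likelihood
    prior /= prior.sum()                           # grid-based Bayes update

print(np.sum(prior * omegas), "vs true", omega_true)
```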
Transfer from Multiple MDPs
Transfer reinforcement learning (RL) methods leverage the experience
collected on a set of source tasks to speed up RL algorithms. A simple and
effective approach is to transfer samples from source tasks and include them
into the training set used to solve a given target task. In this paper, we
investigate the theoretical properties of this transfer method and we introduce
novel algorithms adapting the transfer process on the basis of the similarity
between source and target tasks. Finally, we report illustrative experimental
results in a continuous chain problem.
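In sketch form, sample transfer amounts to pooling transitions from several tasks and letting similarity down-weight the sources. The similarity scores below are hypothetical placeholders for the task-similarity estimates that the paper's adaptive algorithms compute; the batch RL algorithm that consumes the pooled set is not shown.

```python
import numpy as np

def pool_transitions(target, sources, weights):
    """Sample-transfer sketch: build one training set for a batch RL
    algorithm (e.g., fitted Q-iteration) from target-task transitions
    plus source-task transitions, down-weighted by an estimate of each
    source task's similarity to the target. Each task is a list of
    (s, a, r, s_next) tuples; weights[i] in [0, 1] is a hypothetical
    similarity score for source i."""
    data, w = list(target), [1.0] * len(target)
    for transitions, weight in zip(sources, weights):
        data.extend(transitions)
        w.extend([weight] * len(transitions))
    return data, np.array(w)

# Illustrative use: two source tasks, the first judged more similar.
target = [((0.0,), 0, 1.0, (0.1,))]
src_a = [((0.2,), 1, 0.5, (0.3,))]
src_b = [((0.9,), 0, -1.0, (1.0,))]
data, sample_weights = pool_transitions(target, [src_a, src_b], [0.8, 0.2])
# sample_weights would be passed to the regressor inside fitted Q-iteration.
```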
Bayesian Methods for Analysis and Adaptive Scheduling of Exoplanet Observations
We describe work in progress by a collaboration of astronomers and
statisticians developing a suite of Bayesian data analysis tools for extrasolar
planet (exoplanet) detection, planetary orbit estimation, and adaptive
scheduling of observations. Our work addresses analysis of stellar reflex
motion data, where a planet is detected by observing the "wobble" of its host
star as it responds to the gravitational tug of the orbiting planet. Newtonian
mechanics specifies an analytical model for the resulting time series, but it
is strongly nonlinear, yielding complex, multimodal likelihood functions; it is
even more complex when multiple planets are present. The parameter spaces range
in size from few-dimensional to dozens of dimensions, depending on the number
of planets in the system, and the type of motion measured (line-of-sight
velocity, or position on the sky). Since orbits are periodic, Bayesian
generalizations of periodogram methods facilitate the analysis. This relies on
the model being linearly separable, enabling partial analytical
marginalization, reducing the dimension of the parameter space. Subsequent
analysis uses adaptive Markov chain Monte Carlo methods and adaptive importance
sampling to perform the integrals required for both inference (planet detection
and orbit measurement), and information-maximizing sequential design (for
adaptive scheduling of observations). We present an overview of our current
techniques and highlight directions being explored by ongoing research.
Comment: 29 pages, 11 figures. An abridged version is accepted for publication
in Statistical Methodology for a special issue on astrostatistics, with
selected (refereed) papers presented at the Astronomical Data Analysis
Conference (ADA VI) held in Monastir, Tunisia, in May 2010. Update corrects
equation (3).
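The partial analytical marginalization can be illustrated for a circular-orbit radial-velocity model, where the sinusoid amplitudes and velocity offset enter linearly and integrate out in closed form under flat priors and known Gaussian noise, leaving a one-dimensional function of frequency (a Bayesian periodogram). The full Keplerian model adds nonlinear shape parameters, which is where the multimodality arises. A sketch, with entirely synthetic data:

```python
import numpy as np

def log_marginal_likelihood(times, rv, sigma, freq):
    """Bayesian periodogram sketch: for the circular-orbit model
    v(t) = A cos(2 pi f t) + B sin(2 pi f t) + C, the linear parameters
    (A, B, C) marginalize analytically, up to a frequency-independent
    constant, for flat priors and known Gaussian noise sigma."""
    G = np.column_stack([np.cos(2 * np.pi * freq * times),
                         np.sin(2 * np.pi * freq * times),
                         np.ones_like(times)])
    gtg = G.T @ G
    theta = np.linalg.solve(gtg, G.T @ rv)        # least-squares amplitudes
    chi2 = np.sum((rv - G @ theta) ** 2) / sigma**2
    return -0.5 * np.linalg.slogdet(gtg)[1] - 0.5 * chi2

# Toy data: one circular orbit with a 30-day period, then scan frequencies.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 200, 60))              # observation epochs (days)
rv = 12.0 * np.sin(2 * np.pi * t / 30.0) + rng.normal(0, 3.0, 60)
freqs = np.linspace(0.005, 0.1, 2000)             # cycles per day
best = freqs[np.argmax([log_marginal_likelihood(t, rv, 3.0, f) for f in freqs])]
print(best, "~ 1/30")
```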
- …