Multilevel Sparse Grid Methods for Elliptic Partial Differential Equations with Random Coefficients
Stochastic sampling methods are arguably the most direct and least intrusive
means of incorporating parametric uncertainty into numerical simulations of
partial differential equations with random inputs. However, to achieve an
overall error that is within a desired tolerance, a large number of sample
simulations may be required (to control the sampling error), each of which may
need to be run at high levels of spatial fidelity (to control the spatial
error). Multilevel sampling methods aim to achieve the same accuracy as
traditional sampling methods, but at a reduced computational cost, through the
use of a hierarchy of spatial discretization models. Multilevel algorithms
coordinate the number of samples needed at each discretization level by
minimizing the computational cost, subject to a given error tolerance. They can
be applied to a variety of sampling schemes, exploit nesting when available,
can be implemented in parallel, and can be used to inform adaptive spatial
refinement strategies. We extend the multilevel sampling algorithm to sparse
grid stochastic collocation methods, discuss its numerical implementation and
demonstrate its efficiency both theoretically and by means of numerical
examples.
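
The cost-minimizing allocation described above follows the standard
multilevel recipe: minimize the total cost sum_l N_l*C_l subject to the
total sampling variance sum_l V_l/N_l <= tol^2, which yields N_l
proportional to sqrt(V_l/C_l). The sketch below illustrates this generic
rule under assumed per-level variances and costs; it is not the paper's
sparse-grid-specific implementation.

    # Minimal sketch of the standard multilevel sample-allocation rule:
    # minimize sum_l N_l*C_l subject to sum_l V_l/N_l <= tol**2.
    # The Lagrangian solution is N_l proportional to sqrt(V_l/C_l).
    # Per-level variances and costs are illustrative assumptions.
    import math

    def allocate_samples(variances, costs, tol):
        lam = sum(math.sqrt(v * c) for v, c in zip(variances, costs)) / tol**2
        return [max(1, math.ceil(lam * math.sqrt(v / c)))
                for v, c in zip(variances, costs)]

    # Example: variances decay and costs grow geometrically across levels.
    print(allocate_samples([1.0, 0.25, 0.0625], [1.0, 4.0, 16.0], tol=0.01))
    # -> [30000, 7500, 1875]
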
Approximation with Error Bounds in Spark
We introduce a sampling framework to support approximate computing with
estimated error bounds in Spark. Our framework allows sampling to be performed
at the beginning of a sequence of multiple transformations ending in an
aggregation operation. The framework constructs a data provenance tree as the
computation proceeds, then combines the tree with multi-stage sampling and
population estimation theories to compute error bounds for the aggregation.
When information about output keys is available early, the framework can also
use adaptive stratified reservoir sampling to avoid (or reduce) key losses in
the final output and to achieve more consistent error bounds across popular and
rare keys. Finally, the framework includes an algorithm to dynamically choose
sampling rates to meet user-specified constraints on the CDF of error bounds in
the outputs. We have implemented a prototype of our framework called
ApproxSpark, and used it to implement five approximate applications from
different domains. Evaluation results show that ApproxSpark can (a)
significantly reduce execution time if users can tolerate small amounts of
uncertainties and, in many cases, loss of rare keys, and (b) automatically find
sampling rates to meet user-specified constraints on error bounds. We also
extensively explore and discuss the trade-offs among sampling rate,
execution time, accuracy, and key loss.
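
ApproxSpark's multi-stage estimators are not reproduced here; as a rough
single-stage illustration of the underlying idea (estimate an aggregate
from a sample and attach an error bound from population-estimation theory),
consider the sketch below. The data, sampling rate, and function name are
made up for illustration.

    # Single-stage illustration of sampling-based approximate aggregation:
    # estimate a population SUM from a uniform sample and attach a
    # normal-approximation error bound. A textbook estimator, not
    # ApproxSpark's multi-stage one; inputs are illustrative.
    import math, random

    def approx_sum(data, rate, z=1.96):
        sample = [x for x in data if random.random() < rate]
        n, N = len(sample), len(data)
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        return N * mean, z * N * math.sqrt(var / n)  # estimate, ~95% bound

    random.seed(0)
    data = [random.expovariate(1.0) for _ in range(100_000)]
    est, err = approx_sum(data, rate=0.01)
    print(f"estimated sum = {est:.0f} +/- {err:.0f}; true = {sum(data):.0f}")
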
Optimizing expected word error rate via sampling for speech recognition
State-level minimum Bayes risk (sMBR) training has become the de facto
standard for sequence-level training of speech recognition acoustic models. It
has an elegant formulation using the expectation semiring, and gives large
improvements in word error rate (WER) over models trained solely using
cross-entropy (CE) or connectionist temporal classification (CTC). sMBR
training optimizes the expected number of frames at which the reference and
hypothesized acoustic states differ. It may be preferable to optimize the
expected WER, but WER does not interact well with the expectation semiring, and
previous approaches based on computing expected WER exactly involve expanding
the lattices used during training. In this paper we show how to perform
optimization of the expected WER by sampling paths from the lattices used
during conventional sMBR training. The gradient of the expected WER is itself
an expectation, and so may be approximated using Monte Carlo sampling. We show
experimentally that optimizing WER during acoustic model training gives 5%
relative improvement in WER over a well-tuned sMBR baseline on a 2-channel
query recognition task (Google Home).
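
The key observation, that the gradient of an expected loss is itself an
expectation and hence amenable to Monte Carlo approximation, can be
illustrated with a generic score-function (REINFORCE-style) estimator. The
toy categorical "paths" and WER values below are invented; the paper
samples paths from sMBR lattices, which is not reproduced here.

    # Generic score-function gradient estimator for minimizing an expected
    # loss E_p[W]: grad = E_p[(W - baseline) * grad log p(path)]. For a
    # categorical softmax, grad log p(k) w.r.t. logits is onehot(k) - p.
    # Toy distribution over 4 "paths" with made-up WER values.
    import numpy as np

    rng = np.random.default_rng(0)
    logits = np.zeros(4)                  # scores for 4 toy paths
    wer = np.array([0.0, 0.2, 0.5, 1.0])  # per-path WER (illustrative)

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    for _ in range(200):
        p = softmax(logits)
        paths = rng.choice(4, size=64, p=p)      # Monte Carlo path samples
        w = wer[paths]
        baseline = w.mean()                      # simple variance reduction
        grad = np.zeros(4)
        for k, wk in zip(paths, w):
            g = -p.copy(); g[k] += 1.0           # grad log p(k)
            grad += (wk - baseline) * g
        grad /= len(paths)
        logits -= 0.5 * grad                     # descend on expected WER

    print("final path distribution:", np.round(softmax(logits), 3))

Probability mass concentrates on the lowest-WER path, mirroring how the
sampled-gradient update drives down the expected error.
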
Yield Guarantees and the Producer Welfare Benefits of Crop Insurance
Crop yield and revenue insurance products with coverage based on actual production history (APH) yields dominate the U.S. Federal Crop Insurance Program. The APH yield, which plays a critical role in determining the coverage offered to producers, is based on a small sample of historical yields for the insured unit. The properties of this yield measure are critical in determining the value of the insurance to producers. Sampling error in APH yields has the potential to lead to over-insurance in some years and under-insurance in other years. Premiums, which are in part determined by the ratio of the APH yield to the county reference yield, are also affected by variations in APH yields. Congress has enacted two measures, yield substitution and yield floors, that are intended to limit the degree to which sampling error can reduce the insurance guarantee and producer welfare. We examine the impact of sampling error and related policy provisions for Texas cotton, Kansas wheat, and Illinois corn. The analysis is conducted using county level yield data from the National Agricultural Statistics Service and individual insured-unit-level yield data obtained from the Risk Management Agency’s insurance database. Our findings indicate that sampling error in APH yields has the potential to reduce producer welfare and that the magnitude of this effect differs substantially across crops. The yield substitution and yield floor provisions reduce the negative impact of sampling error but also bias guarantees upward, leading to increased government cost of the insurance programs.
Keywords: Actual Production History, Crop Insurance, Sampling Error, Yield Guarantee, Production Economics, Risk and Uncertainty
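
The sampling-error mechanism described here can be made concrete with a
hypothetical simulation: the APH yield is the mean of a short yield
history, so the implied guarantee fluctuates around its intended level, and
a substitution floor raises the guarantee on average (the upward bias noted
above). All parameters and the substitution rule below are illustrative,
not the statutory provisions.

    # Hypothetical Monte Carlo sketch of sampling error in APH yields.
    # APH = mean of a 4-year history; guarantee = coverage * APH.
    # "Substitution" replaces any year below 60% of a reference yield with
    # that floor value -- an illustrative stand-in for the real provision.
    import numpy as np

    rng = np.random.default_rng(1)
    true_mean, cv, n_years, coverage = 100.0, 0.25, 4, 0.75
    floor = 0.6 * true_mean

    hist = rng.normal(true_mean, cv * true_mean, size=(50_000, n_years))
    for name, aph in [("raw APH", hist.mean(axis=1)),
                      ("with substitution",
                       np.maximum(hist, floor).mean(axis=1))]:
        g = coverage * aph
        print(f"{name}: mean guarantee {g.mean():.1f}, std {g.std():.2f}")

Under these assumed parameters the substitution rule raises the mean
guarantee while shrinking its spread, consistent with the abstract's
finding of reduced sampling-error impact at the cost of upward bias.
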
Efficient simulation of large deviation events for sums of random vectors using saddle-point representations
We consider the problem of efficient simulation-based estimation of the
density function at the tails, and of the probability of large deviations,
for a sum of independent, identically distributed (i.i.d.), light-tailed
and nonlattice random vectors. The latter problem, besides being of
independent interest, also forms a building block for more complex rare
event problems that arise, for instance, in queuing and financial credit
risk modeling. It has been extensively studied in the literature, where
state-independent, exponential-twisting-based importance sampling has been
shown to be asymptotically efficient, and a more nuanced state-dependent
exponential twisting has been shown to have the stronger bounded relative
error property.
We exploit the saddle-point-based representations that exist for these
rare-event quantities, which rely on inverting the characteristic functions
of the underlying random vectors. These representations reduce the rare
event estimation problem to evaluating certain integrals, which may, via
importance sampling, be represented as expectations. Furthermore, it is
easy to identify and approximate the zero-variance importance sampling
distribution for estimating these integrals. We identify such importance
sampling measures and show that they possess the asymptotically vanishing
relative error property, which is stronger than the bounded relative error
property. To illustrate the broader applicability of the proposed
methodology, we extend it to develop an asymptotically vanishing relative
error estimator for the practically important expected overshoot of sums of
i.i.d. random variables.
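
As context for the classical baseline the abstract cites, here is a minimal
sketch of state-independent exponential twisting for P(S_n/n >= a) with
i.i.d. standard normal summands, where the cumulant generating function is
psi(theta) = theta^2/2 and the saddle point theta = a makes the tilt a
simple mean shift. The paper's saddle-point estimators for general
light-tailed vectors are not reproduced.

    # Exponential-twisting importance sampling for P(S_n/n >= a) with
    # standard normal X_i (an illustrative special case). Sample under the
    # tilted law N(theta, 1) with theta = a, then reweight each sample by
    # the likelihood ratio exp(-theta*S_n + n*psi(theta)).
    from math import erfc, sqrt
    import numpy as np

    rng = np.random.default_rng(2)
    n, a, m = 20, 1.0, 100_000
    theta, psi = a, 0.5 * a**2           # saddle point and CGF value

    s = rng.normal(theta, 1.0, size=(m, n)).sum(axis=1)
    lr = np.exp(-theta * s + n * psi)    # likelihood ratio back to N(0,1)^n
    est = np.mean(lr * (s >= n * a))

    exact = 0.5 * erfc(sqrt(n) * a / sqrt(2.0))   # since S_n ~ N(0, n)
    print(f"IS estimate {est:.3e} vs exact {exact:.3e}")

Plain Monte Carlo would need on the order of 1/exact samples to see even
one such event; the tilted sampler hits the rare set on roughly half its
draws and corrects with the likelihood ratio.
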
