
    Multilevel Sparse Grid Methods for Elliptic Partial Differential Equations with Random Coefficients

    Stochastic sampling methods are arguably the most direct and least intrusive means of incorporating parametric uncertainty into numerical simulations of partial differential equations with random inputs. However, to achieve an overall error that is within a desired tolerance, a large number of sample simulations may be required (to control the sampling error), each of which may need to be run at high levels of spatial fidelity (to control the spatial error). Multilevel sampling methods aim to achieve the same accuracy as traditional sampling methods, but at a reduced computational cost, through the use of a hierarchy of spatial discretization models. Multilevel algorithms coordinate the number of samples needed at each discretization level by minimizing the computational cost, subject to a given error tolerance. They can be applied to a variety of sampling schemes, exploit nesting when available, can be implemented in parallel, and can be used to inform adaptive spatial refinement strategies. We extend the multilevel sampling algorithm to sparse grid stochastic collocation methods, discuss its numerical implementation, and demonstrate its efficiency both theoretically and by means of numerical examples.
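    As a concrete illustration, the cost-minimization step admits a closed form in the classic multilevel Monte Carlo setting: minimizing total cost subject to a variance budget yields the per-level sample counts below. This is a minimal sketch with hypothetical per-level variance and cost figures, not the paper's sparse-grid collocation algorithm, whose allocation differs in detail.

```python
import math

def mlmc_allocation(variances, costs, tol):
    """Cost-minimizing samples per level for a multilevel estimator.

    Minimizing total cost sum(N_l * C_l) subject to the sampling-variance
    budget sum(V_l / N_l) <= tol**2 / 2 gives, via Lagrange multipliers,
        N_l = ceil((2 / tol**2) * sqrt(V_l / C_l) * sum_k sqrt(V_k * C_k)).
    """
    s = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [math.ceil((2.0 / tol**2) * math.sqrt(v / c) * s)
            for v, c in zip(variances, costs)]

# Hypothetical 4-level hierarchy: level variances decay while costs grow,
# so most samples land on the cheap coarse levels.
V = [1.0, 0.25, 0.06, 0.015]
C = [1.0, 4.0, 16.0, 64.0]
print(mlmc_allocation(V, C, tol=0.01))
```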

    Approximation with Error Bounds in Spark

    We introduce a sampling framework to support approximate computing with estimated error bounds in Spark. Our framework allows sampling to be performed at the beginning of a sequence of multiple transformations ending in an aggregation operation. The framework constructs a data provenance tree as the computation proceeds, then combines the tree with multi-stage sampling and population estimation theory to compute error bounds for the aggregation. When information about output keys is available early, the framework can also use adaptive stratified reservoir sampling to avoid (or reduce) key losses in the final output and to achieve more consistent error bounds across popular and rare keys. Finally, the framework includes an algorithm to dynamically choose sampling rates to meet user-specified constraints on the CDF of error bounds in the outputs. We have implemented a prototype of our framework, called ApproxSpark, and used it to implement five approximate applications from different domains. Evaluation results show that ApproxSpark can (a) significantly reduce execution time if users can tolerate small amounts of uncertainty and, in many cases, the loss of rare keys, and (b) automatically find sampling rates that meet user-specified constraints on error bounds. We also explore and discuss at length the trade-offs among sampling rate, execution time, accuracy, and key loss.
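    To make the error-bound idea concrete, the sketch below estimates an aggregation from a Bernoulli sample and attaches a CLT-based confidence half-width. It is a single-stage simplification, not ApproxSpark's multi-stage, provenance-tree machinery; the function name and parameters are hypothetical.

```python
import random

def sampled_sum_with_bound(values, rate, z=1.96):
    """Horvitz-Thompson estimate of sum(values) from a Bernoulli sample
    taken at `rate`, with a CLT-based 95% error half-width."""
    sample = [v for v in values if random.random() < rate]
    est = sum(sample) / rate  # unbiased for the true sum
    # Variance of the estimator under Bernoulli sampling is
    # (1 - rate) / rate * sum(v_i**2); estimate it from the sample itself.
    var_est = (1 - rate) / rate**2 * sum(v * v for v in sample)
    return est, z * var_est**0.5

random.seed(0)
data = [random.expovariate(1.0) for _ in range(100_000)]
est, hw = sampled_sum_with_bound(data, rate=0.05)
print(f"estimate {est:.0f} +/- {hw:.0f}; true sum {sum(data):.0f}")
```

    Raising the sampling rate tightens the half-width at the cost of execution time, which is the trade-off a rate-selection algorithm like the one described above must navigate.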

    Optimizing expected word error rate via sampling for speech recognition

    State-level minimum Bayes risk (sMBR) training has become the de facto standard for sequence-level training of speech recognition acoustic models. It has an elegant formulation using the expectation semiring, and gives large improvements in word error rate (WER) over models trained solely using cross-entropy (CE) or connectionist temporal classification (CTC). sMBR training optimizes the expected number of frames at which the reference and hypothesized acoustic states differ. It may be preferable to optimize the expected WER directly, but WER does not interact well with the expectation semiring, and previous approaches that compute expected WER exactly involve expanding the lattices used during training. In this paper we show how to optimize the expected WER by sampling paths from the lattices used during conventional sMBR training. The gradient of the expected WER is itself an expectation, and so may be approximated by Monte Carlo sampling. We show experimentally that optimizing WER during acoustic model training gives a 5% relative improvement in WER over a well-tuned sMBR baseline on a 2-channel query recognition task (Google Home).
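    The key identity behind this is the score-function (REINFORCE) estimator, grad E[W] = E[W * grad log p(path)], which turns sampled paths and their WERs into a gradient estimate. The PyTorch sketch below is a toy illustration of that identity with a hypothetical 4-path "lattice", not the paper's lattice-based implementation.

```python
import torch

def expected_wer_surrogate(log_probs, wers):
    """Surrogate loss whose gradient is the Monte Carlo estimate of
    grad E[WER] = E[(WER - b) * grad log p(path)], using the sample-mean
    WER as a variance-reducing baseline b."""
    baseline = wers.mean()
    return ((wers - baseline).detach() * log_probs).mean()

# Toy model: theta parameterizes a categorical over 4 paths, each with a
# hypothetical WER; real use would sample from an sMBR training lattice.
theta = torch.randn(4, requires_grad=True)
idx = torch.multinomial(torch.softmax(theta, dim=0), 8, replacement=True)
log_probs = torch.log_softmax(theta, dim=0)[idx]
wers = torch.tensor([0.1, 0.3, 0.0, 0.2])[idx]

expected_wer_surrogate(log_probs, wers).backward()
print(theta.grad)  # descending this gradient lowers expected WER
```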

    YIELD GUARANTEES AND THE PRODUCER WELFARE BENEFITS OF CROP INSURANCE

    Crop yield and revenue insurance products with coverage based on actual production history (APH) yields dominate the U.S. Federal Crop Insurance Program. The APH yield, which plays a critical role in determining the coverage offered to producers, is based on a small sample of historical yields for the insured unit. The properties of this yield measure are critical in determining the value of the insurance to producers. Sampling error in APH yields has the potential to lead to over-insurance in some years and under-insurance in others. Premiums, which are determined in part by the ratio of the APH yield to the county reference yield, are also affected by variations in APH yields. Congress has enacted two measures, yield substitution and yield floors, that are intended to limit the degree to which sampling error can reduce the insurance guarantee and producer welfare. We examine the impact of sampling error and related policy provisions for Texas cotton, Kansas wheat, and Illinois corn. The analysis is conducted using county-level yield data from the National Agricultural Statistics Service and individual insured-unit-level yield data obtained from the Risk Management Agency’s insurance database. Our findings indicate that sampling error in APH yields has the potential to reduce producer welfare and that the magnitude of this effect differs substantially across crops. The yield substitution and yield floor provisions reduce the negative impact of sampling error but also bias guarantees upward, leading to increased government cost of the insurance programs.
    Keywords: Actual Production History, Crop Insurance, Sampling Error, Yield Guarantee, Production Economics, Risk and Uncertainty
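    A small Monte Carlo experiment makes the sampling-error and upward-bias claims concrete: draw short yield histories, form the APH mean, and apply a simplified substitution floor. All distributional and policy parameters below are hypothetical stand-ins, not RMA values.

```python
import random, statistics

def simulate_aph(true_mean=100.0, cv=0.25, history=4,
                 floor_frac=0.60, n_sims=50_000):
    """Spread of small-sample APH means (sampling error) and the mean
    shift induced by a substitution floor (upward bias in guarantees)."""
    sd = cv * true_mean
    floor = floor_frac * true_mean
    raw, floored = [], []
    for _ in range(n_sims):
        yields = [random.gauss(true_mean, sd) for _ in range(history)]
        raw.append(statistics.mean(yields))
        floored.append(statistics.mean(max(y, floor) for y in yields))
    print(f"raw APH:    mean {statistics.mean(raw):6.1f}, "
          f"sd {statistics.stdev(raw):5.1f}")
    print(f"with floor: mean {statistics.mean(floored):6.1f} (> true mean)")

random.seed(1)
simulate_aph()
```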

    Efficient simulation of large deviation events for sums of random vectors using saddle-point representations

    We consider the problem of efficient simulation estimation of the tail of the density function, and of the probability of large deviations, for a sum of independent, identically distributed (i.i.d.), light-tailed, nonlattice random vectors. The latter problem, besides being of independent interest, also forms a building block for more complex rare-event problems that arise, for instance, in queuing and financial credit risk modeling. It has been extensively studied in the literature, where state-independent, exponential-twisting-based importance sampling has been shown to be asymptotically efficient, and a more nuanced state-dependent exponential twisting has been shown to have the stronger bounded-relative-error property. We exploit saddle-point-based representations that exist for these rare quantities, which rely on inverting the characteristic functions of the underlying random vectors. These representations reduce the rare-event estimation problem to evaluating certain integrals, which may, via importance sampling, be represented as expectations. Furthermore, it is easy to identify and approximate the zero-variance importance sampling distribution for estimating these integrals. We identify such importance sampling measures and show that they possess the asymptotically vanishing relative error property, which is stronger than the bounded-relative-error property. To illustrate the broader applicability of the proposed methodology, we extend it to develop an asymptotically vanishing relative error estimator for the practically important expected overshoot of sums of i.i.d. random variables.
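    The simplest instance of the importance-sampling ideas above is state-independent exponential twisting for an i.i.d. Gaussian sum, sketched below; the paper's saddle-point, characteristic-function machinery and its vanishing-relative-error estimators are not reproduced here.

```python
import math, random

def twisted_tail_prob(a=1.0, n=25, n_samples=100_000):
    """Importance-sampling estimate of P(S_n >= n*a) for S_n a sum of
    n i.i.d. N(0,1) draws. Twisting by theta shifts each draw to
    N(theta, 1); the saddle-point choice here is theta = a, with
    likelihood ratio exp(-theta*S_n + n*psi(theta)), psi(t) = t**2 / 2."""
    theta, psi = a, a * a / 2.0
    acc = 0.0
    for _ in range(n_samples):
        s = sum(random.gauss(theta, 1.0) for _ in range(n))
        if s >= n * a:
            acc += math.exp(-theta * s + n * psi)
    return acc / n_samples

random.seed(2)
est = twisted_tail_prob()
exact = 0.5 * math.erfc(1.0 * math.sqrt(25) / math.sqrt(2))  # P(Z >= 5)
print(f"IS estimate {est:.3e} vs exact {exact:.3e}")
```

    A naive Monte Carlo run of the same size would almost never hit the event (its probability is roughly 3e-7), which is why the twist toward the rare region is essential.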