    Computationally Efficient Nonparametric Importance Sampling

    The variance reduction achieved by importance sampling depends strongly on the choice of the importance sampling distribution. A good choice is often hard to achieve, especially for high-dimensional integration problems. Nonparametric estimation of the optimal importance sampling distribution (known as nonparametric importance sampling) is a reasonable alternative to parametric approaches. In this article, nonparametric variants of both the self-normalized and the unnormalized importance sampling estimator are proposed and investigated. A common critique of nonparametric importance sampling is its increased computational burden compared with parametric methods. We solve this problem to a large degree by using the linear blend frequency polygon estimator instead of a kernel estimator. Mean square error convergence properties are investigated, leading to recommendations for the efficient application of nonparametric importance sampling. In particular, we show that nonparametric importance sampling asymptotically attains the optimal importance sampling variance. The efficiency of nonparametric importance sampling algorithms relies heavily on the computational efficiency of the employed nonparametric estimator. The linear blend frequency polygon outperforms kernel estimators with respect to criteria such as efficient sampling and evaluation. Furthermore, it is compatible with the inversion method for sample generation, which makes it possible to combine our algorithms with other variance reduction techniques such as stratified sampling. Empirical evidence for the usefulness of the suggested algorithms is obtained by means of three benchmark integration problems. As an application, we estimate the distribution of the queue length of a spam-filter queueing system based on real data. Comment: 29 pages, 7 figures
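    As a rough illustration of the self-normalized estimator discussed in this abstract, the sketch below uses a toy one-dimensional Gaussian target and a wider Gaussian proposal. It does not reproduce the paper's linear blend frequency polygon estimator; all densities and parameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def h(x):                    # integrand whose expectation under the target is sought
        return x ** 2

    def log_target(x):           # unnormalized log-density of the target, here N(0, 1)
        return -0.5 * x ** 2

    def log_proposal(x):         # log-density of the (wider) proposal, here N(0, 2^2)
        return -0.5 * (x / 2.0) ** 2 - np.log(2.0)

    n = 10_000
    x = rng.normal(0.0, 2.0, size=n)          # draw from the proposal
    log_w = log_target(x) - log_proposal(x)   # log importance weights
    w = np.exp(log_w - log_w.max())           # stabilize before normalizing

    # self-normalized importance sampling estimate of E[h(X)] under the target
    estimate = np.sum(w * h(x)) / np.sum(w)
    print(estimate)                           # close to 1 for this toy example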

    Concentration inequalities for order statistics

    This note describes non-asymptotic variance and tail bounds for order statistics of samples of independent, identically distributed random variables. These bounds are shown to be asymptotically tight when the sampling distribution belongs to a maximum domain of attraction. If the sampling distribution has a non-decreasing hazard rate (this includes the Gaussian distribution), we derive an exponential Efron-Stein inequality for order statistics: an inequality connecting the logarithmic moment generating function of centered order statistics with exponential moments of Efron-Stein (jackknife) estimates of variance. We use this general connection to derive variance and tail bounds for order statistics of a Gaussian sample. These bounds are not within the scope of the Tsirelson-Ibragimov-Sudakov Gaussian concentration inequality. The proofs are elementary and combine Rényi's representation of order statistics with the so-called entropy approach to concentration inequalities popularized by M. Ledoux. Comment: 13 pages
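    Rényi's representation, which underlies the proofs, writes exponential order statistics as partial sums of independent rescaled exponential spacings. The short numerical check below is purely illustrative (not taken from the note): it compares Gaussian order statistics simulated directly with those obtained through this representation.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    n, reps, k = 50, 20_000, 49             # sample size, replications, index of the maximum

    # direct simulation of the k-th order statistic of a standard Gaussian sample
    direct = np.sort(rng.normal(size=(reps, n)), axis=1)[:, k]

    # Renyi representation: E_(k) = sum_{i<=k} Y_i / (n - i + 1) with Y_i i.i.d. Exp(1)
    y = rng.exponential(size=(reps, n))
    spacings = y / (n - np.arange(n))        # denominators n, n-1, ..., 1
    exp_order = np.cumsum(spacings, axis=1)  # exponential order statistics
    # map to Gaussian order statistics via U_(k) = 1 - exp(-E_(k))
    renyi = norm.ppf(1.0 - np.exp(-exp_order[:, k]))

    print(direct.mean(), renyi.mean())       # the two empirical means should agree closely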

    Latin hypercube sampling with dependence and applications in finance

    In Monte Carlo simulation, Latin hypercube sampling (LHS) [McKay et al. (1979)] is a well-known variance reduction technique for vectors of independent random variables. The method presented here, Latin hypercube sampling with dependence (LHSD), extends LHS to vectors of dependent random variables. The resulting estimator is shown to be consistent and asymptotically unbiased. For the bivariate case and under some conditions on the joint distribution, a central limit theorem together with a closed formula for the limit variance are derived. It is shown that, for a class of estimators satisfying a monotonicity condition, the LHSD limit variance is never greater than the corresponding Monte Carlo limit variance. In some valuation examples of financial payoffs, a variance reduction by factors of up to 200 is achieved compared with standard Monte Carlo simulation. LHSD is suited for problems with rare events and for high-dimensional problems, and it may be combined with quasi-Monte Carlo methods. Keywords: Monte Carlo simulation, variance reduction, Latin hypercube sampling, stratified sampling
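    The construction behind LHSD can be sketched as follows: draw a dependent sample, then replace each margin by stratified uniforms arranged according to the observed ranks. The bivariate Gaussian copula, the payoff and all parameters below are illustrative assumptions, not the authors' implementation.

    import numpy as np
    from scipy.stats import norm, rankdata

    rng = np.random.default_rng(2)
    n, rho = 1_000, 0.7

    # draw a dependent sample (bivariate Gaussian copula with correlation rho)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)

    # LHSD step: replace each margin by stratified uniforms, preserving the rank structure
    ranks = np.column_stack([rankdata(z[:, j], method="ordinal") for j in range(2)])
    u = (ranks - rng.uniform(size=(n, 2))) / n   # exactly one point per stratum in each margin

    # transform back to the desired marginals (standard normal here)
    x = norm.ppf(u)

    # toy monotone payoff estimated from the LHSD sample
    payoff = np.maximum(x[:, 0] + x[:, 1] - 0.5, 0.0)
    print(payoff.mean())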

    Quantitative estimation of sampling uncertainties for mycotoxins in cereal shipments

    Many countries receive shipments of bulk cereals from primary producers. A substantial body of ongoing work seeks to arrive at appropriate standards for the quality of these shipments and the means to assess them as they are out-loaded. Of concern are mycotoxin and heavy metal levels, pesticide and herbicide residue levels, and contamination by genetically modified organisms (GMOs). As the ability to quantify these contaminants improves through better analytical techniques, the sampling methodologies applied to the shipments must keep pace to ensure that the uncertainties attached to the sampling procedures do not overwhelm the analytical uncertainties. There is a need to understand and quantify sampling uncertainties under varying conditions of contamination. The required analysis is statistical and is challenging because the distribution of contaminants within a shipment is not well understood; very limited data exist. Limited work has been undertaken to quantify the variability of contaminant concentrations in the flow of grain coming from a ship and the impact this has on the sampling variance. Relatively recent work by Paoletti et al. [Paoletti C, Heissenberger A, Mazzara M, Larcher S, Grazioli E, Corbisier P, Hess N, Berben G, Lubeck PS, De Loose M, et al. 2006. Kernel lot distribution assessment (KeLDA): a study on the distribution of GMO in large soybean shipments. Eur Food Res Tech. 224:129–139] provides some insight into the variation in GMO concentrations in soybeans at cargo out-turn. Paoletti et al. analysed the data using correlogram analysis with the objective of quantifying the sampling uncertainty (variance) attached to the final cargo analysis, but this is only one possible means of quantifying sampling uncertainty. It is possible that in many cases the levels of contamination passing the sampler on out-loading are essentially random, negating the value of variographic quantitation of the sampling variance. GMOs and mycotoxins appear to have a highly heterogeneous distribution within a cargo, depending on how the ship was loaded (the grain may have come from more than one terminal and set of storage silos), and mycotoxin growth may have occurred in transit. This paper examines a statistical model based on random contamination that can be used to calculate the sampling uncertainty arising from primary sampling of a cargo; it deals with what is thought to be a worst-case scenario. The sampling variance is determined both analytically and by Monte Carlo simulation; the latter approach provides the entire sampling distribution and not just the sampling variance. The sampling procedure is based on rules provided by the Canadian Grain Commission (CGC), and the levels of contamination considered are those relating to allowable levels of ochratoxin A (OTA) in wheat. The results indicate that at a loading rate of 1000 tonnes h⁻¹, a primary sample increment mass of 10.6 kg, a 2000-tonne lot and a primary composite sample mass of 1900 kg, the relative standard deviation (RSD) is about 1.05 (105%) and the distribution of the mycotoxin (MT) level in the primary composite samples is highly skewed. This result applies to a mean MT level of 2 ng g⁻¹. The rate of false-negative results under these conditions is estimated to be 16.2%.
    The corresponding contamination model assumes initial average MT concentrations of 4000 ng g⁻¹ within roughly spherical volumes of average diameter 0.3 m, which are diluted by a factor of 2 each time they pass through a handling stage; four handling stages are assumed. The Monte Carlo calculations allow for variation in the initial volume of the MT-bearing grain, the average concentration and the dilution factor, and they examine the effect of varying the sampling frequency while maintaining a primary composite sample mass of 1900 kg. The overall results are presented as operating characteristic curves that relate only to the sampling uncertainties in the primary sampling of the grain. It is concluded that cross-stream sampling is intrinsically unsuited to sampling for mycotoxins and that better sampling methods and equipment are needed to control sampling uncertainties. At the same time, it is shown that some combinations of cross-stream sampling conditions may, for a given shipment mass and MT content, yield acceptable sampling performance.
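    A rough Monte Carlo sketch of the random-contamination idea is given below. The pocket sizes, concentrations and action level are illustrative stand-ins chosen only to reproduce the qualitative behaviour (a highly skewed composite-sample distribution and a substantial false-negative rate); they are not the CGC parameters or the paper's model.

    import numpy as np

    rng = np.random.default_rng(3)

    # illustrative stand-ins for the sampling scenario
    lot_mass = 2_000_000.0        # kg, a 2000-tonne lot
    increment_mass = 10.6         # kg per primary increment
    composite_mass = 1_900.0      # kg primary composite sample
    n_increments = int(composite_mass / increment_mass)
    limit = 2.0                   # ng/g, hypothetical action level

    reps = 20_000
    sample_level = np.empty(reps)
    lot_level = np.empty(reps)
    for r in range(reps):
        # random contamination: a few small, highly contaminated pockets in a clean lot
        n_pockets = rng.poisson(4)
        pocket_mass = rng.uniform(100.0, 1_000.0, size=n_pockets)    # kg per pocket
        pocket_conc = rng.uniform(1_000.0, 4_000.0, size=n_pockets)  # ng/g in each pocket
        lot_level[r] = (pocket_mass * pocket_conc).sum() / lot_mass  # true lot mean, ng/g
        # each increment hits contaminated grain with probability = contaminated mass fraction
        p_hit = pocket_mass.sum() / lot_mass
        hits = rng.binomial(n_increments, p_hit)
        mean_pocket_conc = pocket_conc.mean() if n_pockets else 0.0
        sample_level[r] = hits * increment_mass * mean_pocket_conc / composite_mass

    rsd = sample_level.std() / sample_level.mean()
    contaminated = lot_level >= limit
    false_neg = np.mean(sample_level[contaminated] < limit)
    print(f"RSD of composite-sample level: {rsd:.2f}")
    print(f"false-negative rate among contaminated lots: {false_neg:.2f}")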

    Variance Reduction For A Discrete Velocity Gas

    We extend a variance reduction technique developed by Baker and Hadjiconstantinou [1] to a discrete velocity gas. In our previous work, the collision integral was evaluated by importance sampling of collision partners [2]. Significant computational effort may be wasted by evaluating the collision integral in regions where the flow is in equilibrium. In the current approach, substantial computational savings are obtained by solving only for the deviations from equilibrium. In the near-continuum regime, the deviations from equilibrium are small, and low-noise evaluation of the collision integral can be achieved with very coarse statistical sampling. Spatially homogeneous relaxation of the Bobylev-Krook-Wu distribution [3,4] was used as a test case to verify that the method predicts the correct evolution of a highly non-equilibrium distribution towards equilibrium. Without variance reduction, statistical noise causes the entropy to undershoot, whereas the variance-reduced method matches the analytic curve for the same number of collisions. We then extend the work to travelling shock waves and compare the accuracy and computational savings of the variance reduction method with DSMC over Mach numbers ranging from 1.2 to 10.
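    The core idea of evaluating only the deviation from equilibrium can be illustrated with a small sketch on a discrete velocity grid. The distribution, the perturbation and the estimated moment below are illustrative assumptions; the sketch only demonstrates why sampling the small deviation, while treating the equilibrium part deterministically, is far less noisy than sampling the full distribution.

    import numpy as np

    rng = np.random.default_rng(4)

    # discrete velocity grid and a distribution close to a (discrete) Maxwellian
    v = np.linspace(-5.0, 5.0, 81)
    f_eq = np.exp(-v ** 2 / 2.0)
    f_eq /= f_eq.sum()
    delta = 0.05 * v * f_eq          # small drift-like deviation; f stays positive
    f = f_eq + delta

    def moment(h, n_samples):
        # naive stochastic estimate of sum_v h(v) f(v): sample velocities from f itself
        idx = rng.choice(len(v), size=n_samples, p=f / f.sum())
        return f.sum() * np.mean(h(v[idx]))

    def moment_deviational(h, n_samples):
        # equilibrium contribution computed exactly; only the small deviation is sampled
        exact_eq = np.sum(h(v) * f_eq)
        w = np.abs(delta)
        idx = rng.choice(len(v), size=n_samples, p=w / w.sum())
        return exact_eq + w.sum() * np.mean(np.sign(delta[idx]) * h(v[idx]))

    h = lambda u: u ** 3             # a heat-flux-like moment of the distribution
    naive = [moment(h, 200) for _ in range(500)]
    dev = [moment_deviational(h, 200) for _ in range(500)]
    print(np.std(naive), np.std(dev))   # the deviational estimator is much less noisy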