    Random Bit Multilevel Algorithms for Stochastic Differential Equations

    We study the approximation of expectations $\E(f(X))$ for solutions $X$ of SDEs and functionals $f \colon C([0,1],\R^r) \to \R$ by means of restricted Monte Carlo algorithms that may only use random bits instead of random numbers. We consider the worst case setting for functionals $f$ from the Lipschitz class w.r.t.\ the supremum norm. We construct a random bit multilevel Euler algorithm and establish upper bounds for its error and cost. Furthermore, we derive matching lower bounds, up to a logarithmic factor, that are valid for all random bit Monte Carlo algorithms, and we show that, for the given quadrature problem, random bit Monte Carlo algorithms are at least almost as powerful as general randomized algorithms.
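
    To illustrate what "random bits instead of random numbers" can mean, here is a minimal sketch, not the multilevel algorithm of the paper. It assumes a single-level Euler scheme for geometric Brownian motion in which each Brownian increment is replaced by sqrt(h) times a sign drawn from one random bit. The function name, the test equation, and all parameter values are hypothetical choices made for the example; the functional is f(X) = X(1), which is Lipschitz w.r.t. the supremum norm.

```python
import math
import random


def random_bit_euler_mean(x0, mu, sigma, n_steps, n_paths, seed=0):
    """Estimate E[X_1] for dX = mu*X dt + sigma*X dW with an Euler scheme
    whose Brownian increments are replaced by sqrt(h) * (+1 or -1)."""
    rng = random.Random(seed)
    h = 1.0 / n_steps
    root_h = math.sqrt(h)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            # One random bit per step decides the sign of the increment.
            xi = 1.0 if rng.getrandbits(1) else -1.0
            x += mu * x * h + sigma * x * root_h * xi
        total += x
    return total / n_paths


estimate = random_bit_euler_mean(x0=1.0, mu=0.05, sigma=0.2, n_steps=32, n_paths=20000)
exact = math.exp(0.05)  # E[X_1] for geometric Brownian motion with X_0 = 1
```

    A multilevel variant would combine such estimates over several step sizes with coupled paths; that coupling is deliberately left out of this sketch.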

    Comparing different sampling schemes for approximating the integrals involved in the semi-Bayesian optimal design of choice experiments.

    In conjoint choice experiments, the semi-Bayesian D-optimality criterion is often used to compute efficient designs. The traditional way to compute this criterion, which involves multi-dimensional integrals over the prior distribution, is to use Pseudo-Monte Carlo samples. However, other sampling approaches are available. Examples are the Quasi-Monte Carlo approach (randomized Halton sequences, modified Latin hypercube sampling and extensible shifted lattice points with Baker's transformation), the Gaussian-Hermite quadrature approach and a method using spherical-radial transformations. Not much is known in general about which sampling scheme performs best in constructing efficient choice designs. In this study, we compare the performance of these approaches under various scenarios and try to identify the most efficient sampling scheme for each situation.

    Keywords: Conjoint choice design; Pseudo-Monte Carlo; Quasi-Monte Carlo; Gaussian-Hermite quadrature; Spherical-radial transformation
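
    The difference between Pseudo-Monte Carlo and Quasi-Monte Carlo sampling can be sketched on a toy integral. The example below is an assumption-laden illustration, not the study's setup: it uses a plain (unrandomized) Halton sequence in two dimensions and a hypothetical integrand with a known integral, rather than the D-optimality criterion under a prior distribution.

```python
import math
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def halton_points(n, bases=(2, 3)):
    """First n points of the unrandomized Halton sequence, skipping index 0."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, n + 1)]


def integrand(u):
    # Toy integrand on [0,1]^2 with known integral (e - 1)^2.
    return math.exp(u[0] + u[1])


n = 1024
exact = (math.e - 1.0) ** 2

rng = random.Random(1)
pmc_points = [(rng.random(), rng.random()) for _ in range(n)]
pmc_estimate = sum(integrand(u) for u in pmc_points) / n
qmc_estimate = sum(integrand(u) for u in halton_points(n)) / n

pmc_error = abs(pmc_estimate - exact)
qmc_error = abs(qmc_estimate - exact)
```

    Comparing `pmc_error` and `qmc_error` over several sample sizes gives a rough feel for the two schemes; a prior with normal marginals would additionally require mapping the points through an inverse CDF.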

    Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

    Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
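
    The two-stage idea described above (random sampling to find a subspace, then a deterministic factorization of the compressed matrix) can be sketched with NumPy. This is a minimal sketch under stated assumptions: it uses a Gaussian test matrix, so it costs O(mnk) flops rather than the O(mn log(k)) quoted for structured random matrices, and the function name, the oversampling parameter, and the rank-5 test matrix are hypothetical.

```python
import numpy as np


def randomized_svd(a, k, oversample=10, seed=0):
    """Approximate the k dominant singular triplets of a via a random range finder."""
    rng = np.random.default_rng(seed)
    n = a.shape[1]
    # Stage A: sample the range of a with a Gaussian test matrix.
    omega = rng.standard_normal((n, k + oversample))
    q, _ = np.linalg.qr(a @ omega)  # orthonormal basis for the sampled range
    # Stage B: compress a to that subspace and factor the small matrix deterministically.
    b = q.T @ a
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small
    return u[:, :k], s[:k], vt[:k, :]


rng = np.random.default_rng(42)
# Hypothetical test matrix of exact rank 5 (not taken from the paper).
a = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 80))
u, s, vt = randomized_svd(a, k=5)
relative_error = np.linalg.norm(a - (u * s) @ vt) / np.linalg.norm(a)
```

    For matrices whose singular values decay slowly, a few power iterations on the sampled range are a common refinement; they are omitted here to keep the sketch short.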

    Local antithetic sampling with scrambled nets

    We consider the problem of computing an approximation to the integral $I=\int_{[0,1]^d} f(x)\,dx$. Monte Carlo (MC) sampling typically attains a root mean squared error (RMSE) of $O(n^{-1/2})$ from $n$ independent random function evaluations. By contrast, quasi-Monte Carlo (QMC) sampling using carefully equispaced evaluation points can attain the rate $O(n^{-1+\varepsilon})$ for any $\varepsilon>0$ and randomized QMC (RQMC) can attain the RMSE $O(n^{-3/2+\varepsilon})$, both under mild conditions on $f$. Classical variance reduction methods for MC can be adapted to QMC. Published results combining QMC with importance sampling and with control variates have found worthwhile improvements, but no change in the error rate. This paper extends the classical variance reduction method of antithetic sampling and combines it with RQMC. One such method is shown to bring a modest improvement in the RMSE rate, attaining $O(n^{-3/2-1/d+\varepsilon})$ for any $\varepsilon>0$, for smooth enough $f$.

    Comment: Published at http://dx.doi.org/10.1214/07-AOS548 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
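
    As background for the antithetic idea, the sketch below shows classical (global) antithetic sampling for plain MC, where each point x is paired with its reflection 1 - x. It is not the local antithetic scheme with scrambled nets studied in the paper; the integrand, the dimension, and the sample sizes are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def f(x):
    # Toy smooth integrand on [0,1]^d with known integral (e - 1)^d.
    return math.exp(sum(x))


def plain_mc_terms(n, d, rng):
    return [f([rng.random() for _ in range(d)]) for _ in range(n)]


def antithetic_terms(n_pairs, d, rng):
    """Each term averages f over a point x and its reflection 1 - x."""
    terms = []
    for _ in range(n_pairs):
        x = [rng.random() for _ in range(d)]
        x_reflected = [1.0 - xi for xi in x]
        terms.append(0.5 * (f(x) + f(x_reflected)))
    return terms


d = 2
exact = (math.e - 1.0) ** d
rng = random.Random(7)

plain = plain_mc_terms(4000, d, rng)      # 4000 function evaluations
paired = antithetic_terms(2000, d, rng)   # also 4000 function evaluations

plain_estimate = statistics.fmean(plain)
antithetic_estimate = statistics.fmean(paired)
# Estimated variance of each estimator (sample variance / number of terms).
plain_var = statistics.variance(plain) / len(plain)
antithetic_var = statistics.variance(paired) / len(paired)
```

    Because this integrand is monotone in each coordinate, f(x) and f(1 - x) are negatively correlated, which is why the paired estimator has the smaller estimated variance at equal cost.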

    Validating Sample Average Approximation Solutions with Negatively Dependent Batches

    Sample-average approximations (SAA) are a practical means of finding approximate solutions of stochastic programming problems involving an extremely large (or infinite) number of scenarios. SAA can also be used to find estimates of a lower bound on the optimal objective value of the true problem which, when coupled with an upper bound, provides confidence intervals for the true optimal objective value and valuable information about the quality of the approximate solutions. Specifically, the lower bound can be estimated by solving multiple SAA problems (each obtained using a particular sampling method) and averaging the obtained objective values. State-of-the-art methods for lower-bound estimation generate batches of scenarios for the SAA problems independently. In this paper, we describe sampling methods that produce negatively dependent batches, thus reducing the variance of the sample-averaged lower bound estimator and increasing its usefulness in defining a confidence interval for the optimal objective value. We provide conditions under which the new sampling methods can reduce the variance of the lower bound estimator, and present computational results to verify that our scheme can reduce the variance significantly, by comparison with the traditional Latin hypercube approach.
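
    The lower-bound procedure (solve several SAA problems, then average their optimal values) can be sketched on a toy problem. The example assumes a hypothetical one-dimensional newsvendor model with uniform demand, and compares independent batches with batches drawn by one-dimensional Latin hypercube sampling. It does not reproduce the negatively dependent batch construction proposed in the paper; all names and data are illustrative.

```python
import random
import statistics

COST, PRICE = 1.0, 2.0  # hypothetical newsvendor data; true optimal value is -0.25


def saa_optimal_value(demands):
    """Solve min_x mean(COST*x - PRICE*min(x, d)) exactly; the piecewise-linear
    convex objective attains its minimum at one of the sampled demands."""
    def objective(x):
        return statistics.fmean(COST * x - PRICE * min(x, d) for d in demands)
    return min(objective(x) for x in demands)


def iid_batch(n, rng):
    return [rng.random() for _ in range(n)]


def latin_hypercube_batch(n, rng):
    # One-dimensional Latin hypercube: one uniform draw in each of n equal strata.
    return [(i + rng.random()) / n for i in range(n)]


def lower_bound_estimate(sampler, n_batches, batch_size, rng):
    """Average the SAA optimal values over batches; also report their spread."""
    values = [saa_optimal_value(sampler(batch_size, rng)) for _ in range(n_batches)]
    return statistics.fmean(values), statistics.stdev(values)


rng = random.Random(3)
iid_mean, iid_sd = lower_bound_estimate(iid_batch, 200, 40, rng)
lhs_mean, lhs_sd = lower_bound_estimate(latin_hypercube_batch, 200, 40, rng)
```

    The spread of the batch values drives the width of the resulting confidence interval, which is why sampling schemes that reduce it are of interest.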

    Uniform Sampling for Matrix Approximation

    Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. Unfortunately, leverage scores are difficult to compute. A simple alternative is to sample rows uniformly at random. While this often works, uniform sampling will eliminate critical row information for many natural instances. We take a fresh look at uniform sampling by examining what information it does preserve. Specifically, we show that uniform sampling yields a matrix that, in some sense, well approximates a large fraction of the original. While this weak form of approximation is not enough for solving linear regression directly, it is enough to compute a better approximation. This observation leads to simple iterative row sampling algorithms for matrix approximation that run in input-sparsity time and preserve row structure and sparsity at all intermediate steps. In addition to an improved understanding of uniform sampling, our main proof introduces a structural result of independent interest: we show that every matrix can be made to have low coherence by reweighting a small subset of its rows.
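
    The contrast between leverage-score sampling and uniform sampling can be sketched with NumPy. The example below is an illustration under assumptions, not the iterative algorithm of the paper: it computes exact leverage scores from a thin QR factorization (which is the expensive step the paper seeks to avoid), and it uses a hypothetical test matrix with a single high-leverage row.

```python
import numpy as np


def leverage_scores(a):
    """Row leverage scores: squared row norms of an orthonormal basis for range(a)."""
    q, _ = np.linalg.qr(a)
    return np.sum(q * q, axis=1)


def sample_rows(a, probs, n_samples, rng):
    """Sample rows with replacement and rescale so that E[S^T S] = A^T A."""
    idx = rng.choice(a.shape[0], size=n_samples, replace=True, p=probs)
    scale = 1.0 / np.sqrt(n_samples * probs[idx])
    return a[idx] * scale[:, None]


def relative_gram_error(a, sketch):
    """Spectral-norm error of the sketched Gram matrix, relative to ||A^T A||."""
    gram = a.T @ a
    return np.linalg.norm(gram - sketch.T @ sketch, 2) / np.linalg.norm(gram, 2)


rng = np.random.default_rng(0)
a = rng.standard_normal((2000, 5))
a[0] = [100.0, 0.0, 0.0, 0.0, 0.0]  # hypothetical high-leverage row

scores = leverage_scores(a)
uniform_probs = np.full(a.shape[0], 1.0 / a.shape[0])
leverage_probs = scores / scores.sum()

uniform_error = relative_gram_error(a, sample_rows(a, uniform_probs, 400, rng))
leverage_error = relative_gram_error(a, sample_rows(a, leverage_probs, 400, rng))
```

    On this matrix, uniform sampling usually misses the dominant row entirely, whereas leverage-score sampling selects it often; this is the kind of instance where uniform sampling "eliminates critical row information".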