137,323 research outputs found

    A filtered multilevel Monte Carlo method for estimating the expectation of discretized random fields

    We investigate the use of multilevel Monte Carlo (MLMC) methods for estimating the expectation of discretized random fields. Specifically, we consider a setting in which the input and output vectors of the numerical simulators have inconsistent dimensions across the multilevel hierarchy. This requires the introduction of grid transfer operators borrowed from multigrid methods. Starting from a simple 1D illustration, we demonstrate numerically that the resulting MLMC estimator deteriorates the estimation of high-frequency components of the discretized expectation field compared to a Monte Carlo (MC) estimator. By adapting mathematical tools initially developed for multigrid methods, we perform a theoretical spectral analysis of the MLMC estimator of the expectation of discretized random fields, in the specific case of linear, symmetric and circulant simulators. This analysis provides a spectral decomposition of the variance into contributions associated with each scale component of the discretized field. We then propose improved MLMC estimators using a filtering mechanism similar to the smoothing process of multigrid methods. The filtering operators improve the estimation of both the small- and large-scale components of the variance, resulting in a reduction of the total variance of the estimator. These improvements are quantified for the specific class of simulators considered in our spectral analysis. The resulting filtered MLMC (F-MLMC) estimator is applied to the problem of estimating the discretized variance field of a diffusion-based covariance operator, which amounts to estimating the expectation of a discretized random field. The numerical experiments support the conclusions of the theoretical analysis even with non-linear simulators, and demonstrate the improvements brought by the proposed F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator
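    The setting described in this abstract (levels with inconsistent input and output dimensions, linked by grid transfer operators) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' method: the toy `simulator`, the injection-based `restrict`, the linear-interpolation `prolong`, the grid sizes and the sample counts are all hypothetical, and no filtering step is included.

```python
import numpy as np

N0 = 5  # number of points on the coarsest grid (assumed for illustration)


def grid(level):
    """Nested uniform 1D grids on [0, 1]; each level halves the spacing."""
    return np.linspace(0.0, 1.0, (N0 - 1) * 2**level + 1)


def simulator(x, level):
    """Hypothetical toy simulator: input and output live on the level's grid."""
    return np.sin(np.pi * grid(level)) + 0.5 * x**2


def restrict(x):
    """Grid transfer (fine to coarse) by injection: keep every other point."""
    return x[::2]


def prolong(y, level, finest):
    """Grid transfer (coarse to finest) by linear interpolation."""
    return np.interp(grid(finest), grid(level), y)


def mlmc_mean(n_samples, rng):
    """Unfiltered MLMC estimate of the expectation field on the finest grid.

    n_samples[l] is the number of samples used for the level-l correction.
    """
    finest = len(n_samples) - 1
    estimate = np.zeros(grid(finest).size)
    for level, n in enumerate(n_samples):
        acc = np.zeros_like(estimate)
        for _ in range(n):
            x = rng.standard_normal(grid(level).size)
            term = prolong(simulator(x, level), level, finest)
            if level > 0:
                # Coupled coarse run: same random input, restricted one level down.
                coarse = simulator(restrict(x), level - 1)
                term = term - prolong(coarse, level - 1, finest)
            acc += term
        estimate += acc / n
    return estimate


rng = np.random.default_rng(0)
estimate = mlmc_mean([200, 50, 20], rng)  # three levels, fewer samples when finer
print(estimate.shape)
```

    With a single level, `mlmc_mean([N], rng)` reduces to a crude MC estimator, which gives a simple baseline for comparison.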

    A Bayesian generalized random regression model for estimating heritability using overdispersed count data

    Background: Faecal egg counts are a common indicator of nematode infection and since it is a heritable trait, it provides a marker for selective breeding. However, since resistance to disease changes as the adaptive immune system develops, quantifying temporal changes in heritability could help improve selective breeding programs. Faecal egg counts can be extremely skewed and difficult to handle statistically. Therefore, previous heritability analyses have log-transformed faecal egg counts to estimate heritability on a latent scale. However, such transformations may not always be appropriate. In addition, analyses of faecal egg counts have typically used univariate rather than multivariate analyses such as random regression that are appropriate when traits are correlated. We present a method for estimating the heritability of untransformed faecal egg counts over the grazing season using random regression. Results: Replicating standard univariate analyses, we showed the dependence of heritability estimates on choice of transformation. Then, using a multitrait model, we exposed temporal correlations, highlighting the need for a random regression approach. Since random regression can sometimes involve the estimation of more parameters than observations or result in computationally intractable problems, we chose to investigate reduced rank random regression. Using standard software (WOMBAT), we discuss the estimation of variance components for log-transformed data using both full and reduced rank analyses. Then, we modelled the untransformed data assuming it to be negative binomially distributed and used Metropolis-Hastings to fit a generalized reduced rank random regression model with an additive genetic, permanent environmental and maternal effect. These three variance components explained more than 80% of the total phenotypic variation, whereas the variance components for the log-transformed data accounted for considerably less. The heritability, on a link scale, increased from around 0.25 at the beginning of the grazing season to around 0.4 at the end. Conclusions: Random regressions are a useful tool for quantifying sources of variation across time. Our MCMC (Markov chain Monte Carlo) algorithm provides a flexible approach to fitting random regression models to non-normal data. Here we applied the algorithm to negative binomially distributed faecal egg count data, but this method is readily applicable to other types of overdispersed data.
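    The core idea of fitting negative binomially distributed counts with Metropolis-Hastings can be sketched in a few lines. This is only a rough illustration under stated assumptions, not the generalized reduced rank random regression model of the paper: it estimates a single mean `mu` and dispersion `k` with no genetic, permanent environmental or maternal effects. The function names, the normal priors on the log scale, the step size and the `counts` values are all made up for the example.

```python
import math
import random


def nb_loglik(counts, mu, k):
    """Negative binomial log-likelihood with mean mu and dispersion k
    (variance = mu + mu**2 / k)."""
    ll = 0.0
    for y in counts:
        ll += (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
               + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu)))
    return ll


def log_posterior(counts, log_mu, log_k, prior_sd=3.0):
    """Log-posterior on the log scale; the N(0, prior_sd^2) priors are an assumption."""
    log_prior = -0.5 * (log_mu**2 + log_k**2) / prior_sd**2
    return nb_loglik(counts, math.exp(log_mu), math.exp(log_k)) + log_prior


def metropolis_hastings(counts, n_iter=5000, step=0.2, seed=0):
    """Random-walk Metropolis-Hastings on (log mu, log k); returns (mu, k) draws."""
    rng = random.Random(seed)
    log_mu = math.log(sum(counts) / len(counts))  # start at the sample mean
    log_k = 0.0
    current = log_posterior(counts, log_mu, log_k)
    samples = []
    for _ in range(n_iter):
        prop_mu = log_mu + rng.gauss(0.0, step)
        prop_k = log_k + rng.gauss(0.0, step)
        proposed = log_posterior(counts, prop_mu, prop_k)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            log_mu, log_k, current = prop_mu, prop_k, proposed
        samples.append((math.exp(log_mu), math.exp(log_k)))
    return samples


# Made-up, strongly skewed counts used purely for illustration.
counts = [0, 3, 12, 1, 0, 45, 7, 2, 0, 19, 5, 0, 88, 4, 1, 9]
samples = metropolis_hastings(counts)
print(len(samples))
```

    Sampling on the log scale keeps both parameters positive without rejecting proposals at a boundary, which is one common design choice for this kind of sampler.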

    Myths and Truths Concerning Estimation of Power Spectra

    It is widely believed that maximum likelihood estimators must be used to provide optimal estimates of power spectra. Since such estimators require of order N_d^3 operations they are computationally prohibitive for N_d greater than a few tens of thousands. Because of this, a large and inhomogeneous literature exists on approximate methods of power spectrum estimation. These range from manifestly sub-optimal, but computationally fast methods, to near optimal but computationally expensive methods. Furthermore, much of this literature concentrates on the power spectrum estimates rather than the equally important problem of deriving an accurate covariance matrix. In this paper, I consider the problem of estimating the power spectrum of cosmic microwave background (CMB) anisotropies from large data sets. Various analytic results on power spectrum estimators are derived, or collated from the literature, and tested against numerical simulations. An unbiased hybrid estimator is proposed that combines a maximum likelihood estimator at low multipoles and pseudo-C_\ell estimates at high multipoles. The hybrid estimator is computationally fast, nearly optimal over the full range of multipoles, and returns an accurate and nearly diagonal covariance matrix for realistic experimental configurations (provided certain conditions on the noise properties of the experiment are satisfied). It is argued that, in practice, computationally expensive methods that approximate the N_d^3 maximum likelihood solution are unlikely to improve on the hybrid estimator, and may actually perform worse. The results presented here can be generalised to CMB polarization and to power spectrum estimation using other types of data, such as galaxy clustering and weak gravitational lensing. Comment: 27 pages, 15 figures, MNRAS in press. Resubmission matches accepted version.