A filtered multilevel Monte Carlo method for estimating the expectation of discretized random fields
We investigate the use of multilevel Monte Carlo (MLMC) methods for
estimating the expectation of discretized random fields. Specifically, we
consider a setting in which the input and output vectors of the numerical
simulators have inconsistent dimensions across the multilevel hierarchy. This
requires the introduction of grid transfer operators borrowed from multigrid
methods. Starting from a simple 1D illustration, we demonstrate numerically
that the resulting MLMC estimator degrades the estimation of high-frequency
components of the discretized expectation field compared to a Monte Carlo (MC)
estimator. By adapting mathematical tools initially developed for multigrid
methods, we perform a theoretical spectral analysis of the MLMC estimator of
the expectation of discretized random fields, in the specific case of linear,
symmetric and circulant simulators. This analysis provides a spectral
decomposition of the variance into contributions associated with each scale
component of the discretized field. We then propose improved MLMC estimators
using a filtering mechanism similar to the smoothing process of multigrid
methods. The filtering operators improve the estimation of both the small- and
large-scale components of the variance, resulting in a reduction of the total
variance of the estimator. These improvements are quantified for the specific
class of simulators considered in our spectral analysis. The resulting filtered
MLMC (F-MLMC) estimator is applied to the problem of estimating the discretized
variance field of a diffusion-based covariance operator, which amounts to
estimating the expectation of a discretized random field. The numerical
experiments support the conclusions of the theoretical analysis even with
non-linear simulators, and demonstrate the improvements brought by the proposed
F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator.
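As a rough illustration of the setting described in this abstract, the sketch below builds a two-level MLMC estimator of an expectation field on a periodic 1D grid, using grid transfer operators to move between a coarse and a fine level. Everything here is an assumption made for illustration: the function names, the pair-averaging restriction, the linear-interpolation prolongation, the 3-point smoothing "simulator" and the sine-plus-noise input are all hypothetical, and the sketch does not include the filtering step that defines the paper's F-MLMC estimator.

```python
import math
import random

def restrict(fine):
    # Grid transfer, fine -> coarse: average neighbouring pairs (n -> n/2 values).
    return [0.5 * (fine[2 * i] + fine[2 * i + 1]) for i in range(len(fine) // 2)]

def prolong(coarse):
    # Grid transfer, coarse -> fine: periodic linear interpolation (n -> 2n values).
    n = len(coarse)
    fine = []
    for i in range(n):
        fine.append(coarse[i])
        fine.append(0.5 * (coarse[i] + coarse[(i + 1) % n]))
    return fine

def simulate(x):
    # Toy linear, symmetric, circulant simulator: periodic 3-point smoothing.
    n = len(x)
    return [0.25 * x[i - 1] + 0.5 * x[i] + 0.25 * x[(i + 1) % n] for i in range(n)]

def sample_input(n, sigma, rng):
    # Hypothetical random input: a sine-shaped mean plus white noise.
    return [math.sin(2 * math.pi * i / n) + sigma * rng.gauss(0.0, 1.0)
            for i in range(n)]

def mean_field(samples):
    # Pointwise sample mean of a list of equally sized fields.
    n = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(n)]

def two_level_mlmc(n_fine, n_coarse_samples, n_fine_samples, sigma, seed=0):
    rng = random.Random(seed)
    # Level 0: many cheap coarse runs, prolonged to the fine grid.
    coarse_runs = [
        prolong(simulate(restrict(sample_input(n_fine, sigma, rng))))
        for _ in range(n_coarse_samples)
    ]
    # Level 1: a few corrections; the fine and coarse runs share the same input.
    corrections = []
    for _ in range(n_fine_samples):
        x = sample_input(n_fine, sigma, rng)
        fine = simulate(x)
        coarse = prolong(simulate(restrict(x)))
        corrections.append([f - c for f, c in zip(fine, coarse)])
    base = mean_field(coarse_runs)
    corr = mean_field(corrections)
    return [b + c for b, c in zip(base, corr)]

estimate = two_level_mlmc(n_fine=16, n_coarse_samples=200, n_fine_samples=20, sigma=0.5)
```

The correction term is the only place where the fine simulator is run, so most of the sampling effort can be spent on the coarse level. The prolongation step is also where interpolation error enters the fine-grid estimate, which is the kind of effect on high-frequency components that the abstract says the filtering is meant to address.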
A Bayesian generalized random regression model for estimating heritability using overdispersed count data
Background:
Faecal egg counts are a common indicator of nematode infection and, because egg count is a heritable trait, they provide a marker for selective breeding. However, since resistance to disease changes as the adaptive immune system develops, quantifying temporal changes in heritability could help improve selective breeding programs. Faecal egg counts can be extremely skewed and difficult to handle statistically. Previous heritability analyses have therefore log-transformed faecal egg counts to estimate heritability on a latent scale, but such transformations may not always be appropriate. In addition, analyses of faecal egg counts have typically been univariate rather than multivariate, even though multivariate approaches such as random regression are appropriate when traits are correlated. We present a method for estimating the heritability of untransformed faecal egg counts over the grazing season using random regression.
Results:
Replicating standard univariate analyses, we showed the dependence of heritability estimates on choice of transformation. Then, using a multitrait model, we exposed temporal correlations, highlighting the need for a random regression approach. Since random regression can sometimes involve the estimation of more parameters than observations or result in computationally intractable problems, we chose to investigate reduced rank random regression. Using standard software (WOMBAT), we discuss the estimation of variance components for log-transformed data using both full and reduced rank analyses. Then, we modelled the untransformed data assuming a negative binomial distribution and used the Metropolis-Hastings algorithm to fit a generalized reduced rank random regression model with additive genetic, permanent environmental and maternal effects. These three variance components explained more than 80 % of the total phenotypic variation, whereas the variance components for the log-transformed data accounted for considerably less. The heritability, on a link scale, increased from around 0.25 at the beginning of the grazing season to around 0.4 at the end.
Conclusions:
Random regressions are a useful tool for quantifying sources of variation across time. Our MCMC (Markov chain Monte Carlo) algorithm provides a flexible approach to fitting random regression models to non-normal data. Here we applied the algorithm to negative binomially distributed faecal egg count data, but this method is readily applicable to other types of overdispersed data.
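For orientation, heritability in a random regression model is usually written as a time-dependent ratio of variances. The notation below is assumed for illustration and is not taken from the paper: $\boldsymbol{\phi}(t)$ is a vector of basis functions evaluated at time $t$, $K_a$ is the covariance matrix of the additive genetic regression coefficients, and $\sigma_{pe}^2(t)$ and $\sigma_m^2(t)$ are built from the permanent environmental and maternal coefficient covariances in the same way. The residual term $\sigma_\varepsilon^2(t)$ is left generic, because its form on the link scale of a negative binomial model depends on the authors' formulation.

```latex
\sigma_a^2(t) = \boldsymbol{\phi}(t)^{\top} K_a \, \boldsymbol{\phi}(t),
\qquad
h^2(t) = \frac{\sigma_a^2(t)}
              {\sigma_a^2(t) + \sigma_{pe}^2(t) + \sigma_m^2(t) + \sigma_\varepsilon^2(t)}
```

Under this reading, the reported rise from around 0.25 to around 0.4 over the grazing season corresponds to $h^2(t)$ evaluated at the start and end of the season.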
Myths and Truths Concerning Estimation of Power Spectra
It is widely believed that maximum likelihood estimators must be used to
provide optimal estimates of power spectra. Since such estimators
require of order N_d^3 operations, they are computationally prohibitive for N_d
greater than a few tens of thousands. Because of this, a large and
inhomogeneous literature exists on approximate methods of power spectrum
estimation. These range from manifestly sub-optimal, but computationally fast
methods, to near optimal but computationally expensive methods. Furthermore,
much of this literature concentrates on the power spectrum estimates rather
than the equally important problem of deriving an accurate covariance matrix.
In this paper, I consider the problem of estimating the power spectrum of
cosmic microwave background (CMB) anisotropies from large data sets. Various
analytic results on power spectrum estimators are derived, or collated from the
literature, and tested against numerical simulations. An unbiased hybrid
estimator is proposed that combines a maximum likelihood estimator at low
multipoles and pseudo-C_\ell estimates at high multipoles. The hybrid estimator
is computationally fast, nearly optimal over the full range of multipoles, and
returns an accurate and nearly diagonal covariance matrix for realistic
experimental configurations (provided certain conditions on the noise
properties of the experiment are satisfied). It is argued that, in practice,
computationally expensive methods that approximate the N_d^3 maximum likelihood
solution are unlikely to improve on the hybrid estimator, and may actually
perform worse. The results presented here can be generalised to CMB
polarization and to power spectrum estimation using other types of data, such
as galaxy clustering and weak gravitational lensing.
Comment: 27 pages, 15 figures, MNRAS in press. Resubmission matches accepted
version
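Two points in this abstract lend themselves to a small numerical sketch: the N_d^3 cost of a maximum likelihood estimator, and the idea of using one estimator at low multipoles and another at high multipoles. The function names, the hard switch at a single multipole `ell_switch` and the example values below are assumptions made for illustration; the abstract does not say how the paper joins the two estimators or their covariance matrices, so this is not the author's method.

```python
def ml_operation_count(n_d):
    # Order-of-magnitude cost of a maximum likelihood estimator: N_d^3 operations.
    return n_d ** 3

def hybrid_spectrum(cl_ml, cl_pseudo, ell_switch):
    # Take maximum likelihood values below ell_switch and pseudo-C_ell values
    # at and above it. Both inputs are indexed by multipole ell = 0, 1, 2, ...
    if len(cl_ml) != len(cl_pseudo):
        raise ValueError("both spectra must cover the same multipole range")
    return [
        cl_ml[ell] if ell < ell_switch else cl_pseudo[ell]
        for ell in range(len(cl_ml))
    ]

# Cost for N_d = 30,000 data points ("a few tens of thousands").
ops = ml_operation_count(30_000)

# Hypothetical spectra for ell = 0..3, switching estimators at ell = 2.
combined = hybrid_spectrum([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0], ell_switch=2)
```

With N_d = 30,000 the count is already 2.7 × 10^13 operations, and it grows by a factor of a thousand for every tenfold increase in N_d, which is why the cheaper pseudo-C_ell estimate is attractive at high multipoles.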