
    Parameter expansion for estimation of reduced rank covariance matrices (Open Access publication)

    Parameter-expanded and standard expectation maximisation algorithms are described for reduced rank estimation of covariance matrices by restricted maximum likelihood, fitting the leading principal components only. Convergence behaviour of these algorithms is examined for several examples and contrasted with that of the average information algorithm, and implications for practical analyses are discussed. It is shown that expectation maximisation type algorithms are readily adapted to reduced rank estimation and converge reliably. However, as is well known for the full rank case, convergence is linear and thus slow. Hence, these algorithms are most useful in combination with the quadratically convergent average information algorithm, in particular in the initial stages of an iterative solution scheme.
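    As a rough illustration of how an expectation maximisation update applies to reduced rank covariance estimation, the sketch below fits the probabilistic-PCA form Sigma ~ WW' + s2*I to a sample covariance matrix. It is a simplified stand-in, not the paper's mixed-model REML algorithm; the function name, simulated-data usage and convergence settings are assumptions made for the example.

```python
import numpy as np

def em_reduced_rank_cov(Y, m, n_iter=200, tol=1e-8):
    """EM for a reduced-rank covariance model Sigma ~ W W' + s2 * I,
    retaining only the m leading principal components (Tipping & Bishop
    style probabilistic PCA). Illustrative toy version only."""
    n, k = Y.shape
    Yc = Y - Y.mean(axis=0)                # centre the records
    S = Yc.T @ Yc / n                      # k x k sample covariance
    rng = np.random.default_rng(0)
    W = rng.normal(size=(k, m))            # loadings spanning the leading PCs
    s2 = S.trace() / k                     # isotropic residual variance
    ll_old = -np.inf
    for _ in range(n_iter):
        # E-step quantities: posterior precision of the m latent scores
        M = W.T @ W + s2 * np.eye(m)
        Minv = np.linalg.inv(M)
        # M-step: closed-form updates for W and s2
        W_new = S @ W @ np.linalg.inv(s2 * np.eye(m) + Minv @ W.T @ S @ W)
        s2 = np.trace(S - S @ W @ Minv @ W_new.T) / k
        W = W_new
        # monitor the Gaussian log-likelihood; EM increases it monotonically
        Sigma = W @ W.T + s2 * np.eye(k)
        _, logdet = np.linalg.slogdet(Sigma)
        ll = -0.5 * n * (k * np.log(2 * np.pi) + logdet
                         + np.trace(np.linalg.solve(Sigma, S)))
        if abs(ll - ll_old) < tol:
            break
        ll_old = ll
    return W, s2
```

    Calling, say, em_reduced_rank_cov(np.random.default_rng(1).normal(size=(500, 8)), m=3) returns an 8 x 3 loading matrix and a residual variance; the typically linear (slow) improvement of the log-likelihood is what the abstract contrasts with the quadratically convergent average information algorithm.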

    Restricted maximum likelihood estimation of genetic principal components and smoothed covariance matrices

    Principal component analysis is a widely used 'dimension reduction' technique, albeit generally at a phenotypic level. It is shown that we can estimate genetic principal components directly through a simple reparameterisation of the usual linear mixed model. This is applicable to any analysis fitting multiple correlated genetic effects, whether effects for individual traits or sets of random regression coefficients to model trajectories. Depending on the magnitude of genetic correlations, a subset of the principal components generally suffices to capture the bulk of genetic variation. Corresponding estimates of genetic covariance matrices are more parsimonious, have reduced rank and are smoothed, with the number of parameters required to model the dispersion structure reduced from k(k + 1)/2 to m(2k - m + 1)/2 for k effects and m principal components. Estimation of these parameters, the largest eigenvalues and pertaining eigenvectors of the genetic covariance matrix, via restricted maximum likelihood using derivatives of the likelihood is described. It is shown that reduced rank estimation can substantially reduce the computational requirements of multivariate analyses. An application to the analysis of eight traits recorded via live ultrasound scanning of beef cattle is given.
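    A small sketch of the reparameterisation idea and the associated parameter counts, assuming an already estimated (positive semi-definite) genetic covariance matrix G; the function below simply truncates an eigendecomposition for illustration and does not reproduce the REML estimation described above.

```python
import numpy as np

def reduced_rank_parameterisation(G, m):
    """Keep the m leading eigenvalues/eigenvectors of a k x k genetic
    covariance matrix G, giving a rank-m, smoothed approximation.
    Free parameters drop from k(k+1)/2 to m(2k - m + 1)/2."""
    k = G.shape[0]
    evals, evecs = np.linalg.eigh(G)            # eigenvalues in ascending order
    lead = np.argsort(evals)[::-1][:m]          # indices of the m leading PCs
    Q = evecs[:, lead] * np.sqrt(evals[lead])   # k x m loading matrix
    G_rr = Q @ Q.T                              # rank-m approximation of G
    n_full = k * (k + 1) // 2
    n_reduced = m * (2 * k - m + 1) // 2
    return G_rr, n_full, n_reduced
```

    For example, with k = 8 traits (as in the cattle application) and, say, m = 3 principal components, the count falls from 36 to 21 parameters per covariance matrix.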

    Heterogeneous variances in Gaussian linear mixed models

    This paper reviews some problems encountered in estimating heterogeneous variances in Gaussian linear mixed models. The one-way and multiple classification cases are considered. EM-REML algorithms and Bayesian procedures are derived. A structural mixed linear model on log-variance components is also presented, which allows identification of meaningful sources of variation in heterogeneous residual and genetic components of variance and assessment of their magnitude and mode of action.
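    A minimal sketch of the structural model on log-variances, assuming class-specific residual variances and a hypothetical class-level design matrix X; it maximises the Gaussian likelihood directly with scipy rather than using the EM-REML or Bayesian procedures derived in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def fit_log_variance_model(resid, groups, X):
    """Structural model on log-variances: log(sigma2_g) = X[g] @ delta.
    resid : centred residuals, one per record
    groups: variance-class index of each record (0..G-1)
    X     : G x q design matrix acting on the log-variances
    Toy maximum-likelihood sketch, not the paper's EM-REML algorithm."""
    def neg_loglik(delta):
        s2 = np.exp(X @ delta)[groups]       # record-level variances
        return 0.5 * np.sum(np.log(s2) + resid**2 / s2)
    fit = minimize(neg_loglik, np.zeros(X.shape[1]), method="BFGS")
    delta_hat = fit.x
    return delta_hat, np.exp(X @ delta_hat)  # coefficients, class variances
```

    With X an identity matrix this reduces to a separate variance per class; structured columns of X (factors acting on the log scale) are what allow meaningful sources of heteroscedasticity and their mode of action to be identified, in the spirit of the structural model described above.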

    Learning from data: Plant breeding applications of machine learning

    Increasingly, new sources of data are being incorporated into plant breeding pipelines. The enormous amounts of data from field phenomics and genotyping technologies place data mining and analysis at a completely different level, one that is challenging from both practical and theoretical standpoints. Intelligent decision-making relies on our ability to extract useful information from data that may help us achieve our goals more efficiently. Many plant breeders, agronomists and geneticists perform analyses without knowing the relevant underlying assumptions, strengths or pitfalls of the methods employed. This study endeavors to assess the statistical learning properties and plant breeding applications of supervised and unsupervised machine learning techniques. A soybean nested association panel (SoyNAM) was the base population for experiments designed in situ and in silico. We used mixed models and Markov random fields to evaluate phenotypic-genotypic-environmental associations among traits and the learning properties of genome-wide prediction methods. Alternative methods of analysis were proposed.
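    As one concrete, hedged example of a genome-wide prediction method of the kind evaluated in such studies (not a reproduction of the models used here), the sketch below is a ridge-regression, RR-BLUP-style predictor; the marker matrix M, phenotype vector y and shrinkage parameter lam are placeholders.

```python
import numpy as np

def ridge_genomic_prediction(M, y, lam=1.0):
    """Ridge-regression (RR-BLUP-style) genome-wide prediction sketch.
    M  : n x p marker matrix (genotypes coded 0/1/2)
    y  : length-n phenotype vector
    lam: shrinkage parameter (placeholder; normally tied to variance ratios)"""
    Mc = M - M.mean(axis=0)                  # centre marker codes
    yc = y - y.mean()
    p = Mc.shape[1]
    # solve (Mc'Mc + lam I) b = Mc' yc for the marker effects b
    b = np.linalg.solve(Mc.T @ Mc + lam * np.eye(p), Mc.T @ yc)
    gebv = Mc @ b + y.mean()                 # genomic estimated breeding values
    return b, gebv
```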