
13,663 research outputs found

Evolutionary Inference for Function-valued Traits: Gaussian Process Regression on Phylogenies

Author: Jones Nick S.
Moriarty John
Publication venue: 'The Royal Society'
Publication date: 03/08/2012

Biological data objects often have both of the following features: (i) they are functions rather than single numbers or vectors, and (ii) they are correlated due to phylogenetic relationships. In this paper we give a flexible statistical model for such data, by combining assumptions from phylogenetics with Gaussian processes. We describe its use as a nonparametric Bayesian prior distribution, both for prediction (placing posterior distributions on ancestral functions) and model selection (comparing rates of evolution across a phylogeny, or identifying the most likely phylogenies consistent with the observed data). Our work is integrative, extending the popular phylogenetic Brownian Motion and Ornstein-Uhlenbeck models to functional data and Bayesian inference, and extending Gaussian Process regression to phylogenies. We provide a brief illustration of the application of our method. Comment: 7 pages, 1 figure.
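
As a rough illustration of the kind of prior this abstract describes, the sketch below builds a covariance over (taxon, input) pairs as the product of a Brownian-motion phylogenetic covariance (shared branch length between tips) and a squared-exponential kernel over the function argument. The separable product form, the three-taxon tree, and the names `shared_time`, `phylo_cov`, `sq_exp` and `phylo_gp_cov` are assumptions made for illustration; they are not taken from the paper.

```python
import math

# Hypothetical 3-taxon tree of total depth 1.0 (an assumption for illustration):
# A and B share 0.6 units of evolutionary history; C splits off at the root.
# Under Brownian motion, the covariance between two tips is proportional
# to the branch length they share.
shared_time = {
    ("A", "A"): 1.0, ("B", "B"): 1.0, ("C", "C"): 1.0,
    ("A", "B"): 0.6, ("A", "C"): 0.0, ("B", "C"): 0.0,
}

def phylo_cov(i, j):
    """Shared branch length of taxa i and j (symmetric lookup)."""
    return shared_time.get((i, j), shared_time.get((j, i)))

def sq_exp(x1, x2, length_scale=1.0):
    """Squared-exponential kernel over the function argument."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)

def phylo_gp_cov(points, length_scale=1.0):
    """Covariance matrix for a list of (taxon, x) observation points,
    assuming the covariance factorises as phylogeny times function space."""
    return [
        [phylo_cov(ti, tj) * sq_exp(xi, xj, length_scale) for (tj, xj) in points]
        for (ti, xi) in points
    ]

# Two observations on taxon A, one each on B and C.
points = [("A", 0.0), ("A", 1.0), ("B", 0.0), ("C", 0.0)]
K = phylo_gp_cov(points)
```

Under these assumptions, entries of `K` for closely related taxa at nearby inputs are large, while taxa that diverged at the root are uncorrelated regardless of input.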

arXiv.org e-Print Archive

Crossref

PubMed Central

The University of Manchester - Institutional Repository

A Bayesian generalized random regression model for estimating heritability using overdispersed count data

Author: Denwood Matthew
Jimenez de Cisneros Joaquin Prada
Johnson Paul
Mair Colette
Matthews Louise
Stear Michael
Stefan Thorsten
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Background: Faecal egg counts are a common indicator of nematode infection and since it is a heritable trait, it provides a marker for selective breeding. However, since resistance to disease changes as the adaptive immune system develops, quantifying temporal changes in heritability could help improve selective breeding programs. Faecal egg counts can be extremely skewed and difficult to handle statistically. Therefore, previous heritability analyses have log transformed faecal egg counts to estimate heritability on a latent scale. However, such transformations may not always be appropriate. In addition, analyses of faecal egg counts have typically used univariate rather than multivariate analyses such as random regression that are appropriate when traits are correlated. We present a method for estimating the heritability of untransformed faecal egg counts over the grazing season using random regression. Results: Replicating standard univariate analyses, we showed the dependence of heritability estimates on choice of transformation. Then, using a multitrait model, we exposed temporal correlations, highlighting the need for a random regression approach. Since random regression can sometimes involve the estimation of more parameters than observations or result in computationally intractable problems, we chose to investigate reduced rank random regression. Using standard software (WOMBAT), we discuss the estimation of variance components for log transformed data using both full and reduced rank analyses. Then, we modelled the untransformed data assuming it to be negative binomially distributed and used Metropolis Hastings to fit a generalized reduced rank random regression model with an additive genetic, permanent environmental and maternal effect. These three variance components explained more than 80 % of the total phenotypic variation, whereas the variance components for the log transformed data accounted for considerably less. 
The heritability, on a link scale, increased from around 0.25 at the beginning of the grazing season to around 0.4 at the end. Conclusions: Random regressions are a useful tool for quantifying sources of variation across time. Our MCMC (Markov chain Monte Carlo) algorithm provides a flexible approach to fitting random regression models to non-normal data. Here we applied the algorithm to negative binomially distributed faecal egg count data, but this method is readily applicable to other types of overdispersed data.

Crossref

Springer - Publisher Connector

PubMed Central

University of Surrey

Enlighten

Surrey Research Insight

Warped Functional Analysis of Variance

Author: Carter Patrick A.
Gervini Daniel
Publication date: 07/11/2013

This article presents an Analysis of Variance model for functional data that explicitly incorporates phase variability through a time-warping component, allowing for a unified approach to estimation and inference in the presence of amplitude and time variability. The focus is on single-random-factor models but the approach can be easily generalized to more complex ANOVA models. The behavior of the estimators is studied by simulation, and an application to the analysis of growth curves of flour beetles is presented. Although the model assumes a smooth latent process behind the observed trajectories, smoothness of the observed data is not required; the method can be applied to the sparsely observed data that are often encountered in longitudinal studies.

arXiv.org e-Print Archive

CiteSeerX

Implicit prices of indigenous cattle traits in central Ethiopia: Application of revealed and stated preference approaches

Author: Abdulai A.
Ayalew W.
Dessie Tadelle
Haile Aynalem
Kassie Girma T.
Okeyo Mwai Ally
Tibbo Markos
Wollny C.B.A.
Publication date: 04/03/2011

The diversity of animal genetic resources has a quasi-public good nature that makes market prices an inadequate indicator of its economic worth. Applying the characteristics theory of value, this research estimated the relative economic worth of the attributes of cattle genetic resources in central Ethiopia. Transaction level data were collected over four seasons in a year and a choice experiment survey was done in five markets to generate data on both revealed and stated preferences of cattle buyers. Heteroscedasticity efficient estimation and random parameters logit were employed to analyse the data. The results essentially show that attributes related to the subsistence functions of cattle are more valued than attributes that directly influence marketable products of the animals. The findings imply a strong need to invest in improving those attributes of cattle in the study area that enhance their subsistence functions, to which owners accord higher priority in supporting their livelihoods than they do to tradable products.

CGSpace

Detection and modelling of time-dependent QTL in animal populations

Author: Florence Jaffrézic
Mogens S. Lund
Per Madsen
Peter Sorensen
Publication venue: 'EDP Sciences'
Publication date: 01/01/2008

A longitudinal approach is proposed to map QTL affecting function-valued traits and to estimate their effect over time. The method is based on fitting mixed random regression models. The QTL allelic effects are modelled with random coefficient parametric curves and using a gametic relationship matrix. A simulation study was conducted in order to assess the ability of the approach to fit different patterns of QTL over time. It was found that this longitudinal approach was able to adequately fit the simulated variance functions and considerably improved the power of detection of time-varying QTL effects compared to the traditional univariate model. This was confirmed by an analysis of protein yield data in dairy cattle, where the model was able to detect QTL with high effect either at the beginning or the end of the lactation that were not detected with a simple 305-day model.

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Directory of Open Access Journals

PubMed Central

ProdInra

Statistical models for the genetic analysis of longitudinal data

Author: Jaffrezic Florence
Publication venue: The University of Edinburgh
Publication date: 01/01/2001

Edinburgh Research Archive

Estimation of dynamic SNP-heritability with Bayesian Gaussian process models

Author: Arjas A
Hauptmann A
Sillanpää MJ
Publication venue: 'Oxford University Press (OUP)'
Publication date: 15/06/2020

Motivation: Improved DNA technology has made it practical to estimate single nucleotide polymorphism (SNP)-heritability among distantly related individuals with unknown relationships. For growth and development related traits, it is meaningful to base SNP-heritability estimation on longitudinal data due to the time-dependency of the process. However, only a few statistical methods have been developed so far for estimating dynamic SNP-heritability and quantifying its full uncertainty. / Results: We introduce a completely tuning-free Bayesian Gaussian process (GP) based approach for estimating dynamic variance components and heritability as their function. For parameter estimation, we use a modern Markov Chain Monte Carlo (MCMC) method which allows full uncertainty quantification. Several data sets are analysed and our results clearly illustrate that the 95 % credible intervals of the proposed joint estimation method (which "borrows strength" from adjacent time points) are significantly narrower than those of a two-stage baseline method that first estimates the variance components at each time point independently and then performs smoothing. We compare the method with a random regression model using the MTG2 and BLUPF90 software, and quantitative measures indicate superior performance of our method. Results are presented for simulated and real data with up to 1000 time points. Finally, we demonstrate scalability of the proposed method for simulated data with tens of thousands of individuals. / Availability: The C++ implementation dynBGP and simulated data are available in GitHub (https://github.com/aarjas/dynBGP). The programs can be run in R. Real datasets are available in QTL archive (https://phenome.jax.org/centers/QTLA). / Supplementary information: Supplementary data are available at Bioinformatics online.
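
One way to read "heritability as their function" is as the pointwise ratio h²(t) = σ²_g(t) / (σ²_g(t) + σ²_e(t)) of the genetic variance to the total variance at each time point. The sketch below evaluates that ratio on made-up variance curves. The two-component decomposition, the function name `dynamic_heritability` and the numbers are assumptions for illustration only; this is not the dynBGP implementation and involves no GP prior or MCMC.

```python
def dynamic_heritability(genetic_var, residual_var):
    """Pointwise heritability h2(t) = g(t) / (g(t) + e(t)).

    genetic_var and residual_var are equal-length sequences of variance
    components evaluated at the same time points (hypothetical inputs).
    """
    if len(genetic_var) != len(residual_var):
        raise ValueError("variance curves must share the same time grid")
    return [g / (g + e) for g, e in zip(genetic_var, residual_var)]

# Made-up variance components at three time points.
genetic_var = [1.0, 2.0, 3.0]
residual_var = [3.0, 2.0, 1.0]
h2 = dynamic_heritability(genetic_var, residual_var)
```

In a Bayesian setting the same ratio would be computed for every posterior draw of the variance curves, which is how credible intervals for h²(t) could be obtained.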

UCL Discovery

Bayesian Sparse Factor Analysis of Genetic Covariance Matrices

Author: Mukherjee Sayan
Runcie Daniel E
Publication venue: 'Genetics Society of America'
Publication date: 15/03/2013

Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousands. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed effects model. The key idea of our model is that we need only consider G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse -- affecting only a few observed traits. The advantages of this approach are two-fold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set. Comment: 35 pages, 7 figures.

arXiv.org e-Print Archive

CiteSeerX

Crossref

PubMed Central

eScholarship - University of California