358 research outputs found

    Bayesian regression models outperform partial least squares methods for predicting milk components and technological properties using infrared spectral data

    The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acids (FA) expressed as a percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm⁻¹ were available and averaged before data analysis. Three Bayesian models (Bayesian ridge regression [Bayes RR], Bayes A, and Bayes B) and 2 reference models (PLS and modified PLS [MPLS] procedures) were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R² value of validation was obtained with Bayes B and Bayes A. Among the FA, C10:0 (% of each FA on total FA basis) had the highest R² (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield had the highest R² (0.82, achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open-source R software BGLR, which can be used to train customized prediction equations for other traits or populations.
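
The replicated training-testing validation mentioned in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' BGLR scripts: the data are simulated, the one-predictor least-squares `fit_line` is a placeholder for a real calibration model, and the 20% test fraction and all function names are assumptions. R² is taken here as the squared correlation between observed and predicted values in the test set.

```python
import random


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)


def fit_line(x, y):
    """Placeholder calibration model: ordinary least squares on one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return lambda v: intercept + slope * v


def replicated_validation(x, y, fit, n_reps=25, test_frac=0.2, seed=1):
    """Repeat a random training/testing split and return one R^2 per replicate."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_test = max(2, int(len(x) * test_frac))  # test_frac is an assumed value
    scores = []
    for _ in range(n_reps):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        scores.append(
            r_squared([y[i] for i in test], [predict(x[i]) for i in test])
        )
    return scores


# Simulated stand-in data (not the milk spectra from the study).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [2.0 * v + rng.gauss(0, 1) for v in x]

scores = replicated_validation(x, y, fit_line)
mean_r2 = sum(scores) / len(scores)
print(f"mean validation R^2 over {len(scores)} replicates: {mean_r2:.2f}")
```

Passing the model as a `fit` argument keeps the splitting logic identical across models, which mirrors the abstract's point that all methods were compared on the same samples.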

    Modeling relationships between calving traits: a comparison between standard and recursive mixed models

    Background: The use of structural equation models for the analysis of recursive and simultaneous relationships between phenotypes has become more popular recently. The aim of this paper is to illustrate how these models can be applied in animal breeding to achieve parameterizations of different levels of complexity and, more specifically, to model phenotypic recursion between three calving traits: gestation length (GL), calving difficulty (CD) and stillbirth (SB). All recursive models considered here postulate heterogeneous recursive relationships between GL and liabilities to CD and SB, and between liability to CD and liability to SB, depending on categories of GL phenotype. Methods: Four models were compared in terms of goodness of fit and predictive ability: 1) standard mixed model (SMM), a model with unstructured (co)variance matrices; 2) recursive mixed model 1 (RMM1), assuming that residual correlations are due to the recursive relationships between phenotypes; 3) RMM2, assuming that correlations between residuals and contemporary groups are due to recursive relationships between phenotypes; and 4) RMM3, postulating that the correlations between genetic effects, contemporary groups and residuals are due to recursive relationships between phenotypes. Results: For all the RMM considered, the estimates of the structural coefficients were similar. Results revealed a nonlinear relationship between GL and the liabilities both to CD and to SB, and a linear relationship between the liabilities to CD and SB. Differences in terms of goodness of fit and predictive ability of the models considered were negligible, suggesting that RMM3 is plausible. Conclusions: The applications examined in this study suggest the plausibility of a nonlinear recursive effect from GL onto CD and SB. Also, the fact that the most restrictive model, RMM3, which assumes that the only cause of correlation is phenotypic recursion, performs as well as the others indicates that phenotypic recursion may be an important cause of the observed patterns of genetic and environmental correlations.
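
To see how recursion alone can induce residual correlation (the premise behind RMM1), consider a deliberately simplified two-trait linear case for a single record, with one structural coefficient and uncorrelated structural residuals. The notation below (Λ, λ, ψ₁, ψ₂) is assumed for illustration and is not taken from the paper, which works with liabilities and with coefficients that vary across GL categories.

```latex
\begin{aligned}
\mathbf{y} &= \boldsymbol{\Lambda}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \mathbf{e},
\qquad
\boldsymbol{\Lambda} = \begin{pmatrix} 0 & 0 \\ \lambda & 0 \end{pmatrix},
\qquad
\operatorname{Var}(\mathbf{e}) = \boldsymbol{\Psi} = \begin{pmatrix} \psi_1 & 0 \\ 0 & \psi_2 \end{pmatrix} \\[4pt]
\mathbf{y} &= (\mathbf{I}-\boldsymbol{\Lambda})^{-1}\left(\mathbf{X}\boldsymbol{\beta} + \mathbf{e}\right),
\qquad
(\mathbf{I}-\boldsymbol{\Lambda})^{-1} = \begin{pmatrix} 1 & 0 \\ \lambda & 1 \end{pmatrix} \\[4pt]
\operatorname{Var}\!\left[(\mathbf{I}-\boldsymbol{\Lambda})^{-1}\mathbf{e}\right]
&= (\mathbf{I}-\boldsymbol{\Lambda})^{-1}\,\boldsymbol{\Psi}\,(\mathbf{I}-\boldsymbol{\Lambda})^{-\top}
= \begin{pmatrix} \psi_1 & \lambda\psi_1 \\ \lambda\psi_1 & \lambda^{2}\psi_1 + \psi_2 \end{pmatrix}
\end{aligned}
```

The off-diagonal term λψ₁ is nonzero whenever λ ≠ 0, so a diagonal Ψ combined with a recursive effect reproduces a residual covariance that a standard mixed model would have to fit as a free parameter.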

    Searching for phenotypic causal networks involving complex traits: an application to European quail

    Background: Structural equation models (SEM) are used to model multiple traits and the causal links among them. The number of different causal structures that can be used to fit a SEM is typically very large, even when only a few traits are studied. In recent applications of SEM in quantitative genetics mixed model settings, causal structures were pre-selected based on prior beliefs alone. Alternatively, there are algorithms that search for structures that are compatible with the joint distribution of the data. However, such a search cannot be performed directly on the joint distribution of the phenotypes since causal relationships are possibly masked by genetic covariances. In this context, the application of the Inductive Causation (IC) algorithm to the joint distribution of phenotypes conditional on unobservable genetic effects has been proposed. Methods: Here, we applied this approach to five traits in European quail: birth weight (BW), weight at 35 days of age (W35), age at first egg (AFE), average egg weight from 77 to 110 days of age (AEW), and number of eggs laid in the same period (NE). We have focused the discussion on the challenges and difficulties resulting from applying this method to field data. Statistical decisions regarding partial correlations were based on different Highest Posterior Density (HPD) interval contents, and models based on the selected causal structures were compared using the Deviance Information Criterion (DIC). In addition, we used temporal information to perform additional edge orienting, overriding the algorithm output when necessary. Results: The final causal structure consisted of two separate substructures: BW→AEW and W35→AFE→NE, where an arrow represents a direct effect. Comparison between a SEM with the selected structure and a Multiple Trait Animal Model using DIC indicated that the SEM is more plausible. Conclusions: Coupling prior knowledge with the output provided by the IC algorithm allowed further learning regarding phenotypic causal structures when compared to standard mixed effects SEM applications.
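
The partial-correlation decisions that drive this kind of structure search can be illustrated with a small sketch. It uses the standard first-order partial correlation formula on made-up correlation values. The numbers and variable names are hypothetical, and the study itself based its decisions on HPD intervals of partial correlations conditional on genetic effects rather than on point values.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations for a chain x -> z -> y (think W35 -> AFE -> NE).
r_xz, r_zy = 0.6, 0.5

# In a linear chain with standardized variables, the x-y correlation is the
# product of the two links, so conditioning on z removes it entirely.
rho_chain = partial_corr(r_xz * r_zy, r_xz, r_zy)

# If x also affected y directly, the x-y correlation would exceed that product
# and the partial correlation would stay away from zero.
rho_direct = partial_corr(0.7, r_xz, r_zy)

print(f"chain only:       partial corr = {rho_chain:.3f}")
print(f"with direct edge: partial corr = {rho_direct:.3f}")
```

A partial correlation judged to be zero is what lets the search drop the edge between x and y; a clearly nonzero one keeps it.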

    Bayesian analysis and prediction of hybrid performance

    Background: The selection of hybrids is an essential step in maize breeding. However, evaluating a large number of hybrids in field trials can be extremely costly. Genomic models can instead be used to predict the expected performance of untested genotypes. Bayesian models offer a very flexible framework for hybrid prediction. The Bayesian methodology can be used with parametric and semi-parametric assumptions for additive and non-additive effects. Furthermore, samples from the posterior distribution of Bayesian models can be used to estimate the variance due to general and specific combining abilities even in cases where additive and non-additive effects are not mutually orthogonal. Nevertheless, the use of Bayesian models for analysis and prediction of hybrid performance has remained fairly limited. Results: We provide an overview of Bayesian parametric and semi-parametric genomic models for prediction of agronomic traits in maize hybrids and discuss how these models can be used to decompose the genotypic variance into components due to general and specific combining ability. We applied the methodology to data from 906 single-cross tropical maize hybrids derived from a convergent population. Our results show that: (1) non-additive effects make a sizable contribution to the genetic variance of grain yield, whereas the relative importance of non-additive effects was much smaller for ear and plant height; (2) genomic prediction can achieve relatively high accuracy in predicting phenotypes of untested hybrids and in pre-screening. Conclusions: Genomic prediction can be a useful tool in pre-screening of hybrids and could contribute to the improvement of the efficiency and efficacy of maize hybrid breeding programs. The Bayesian framework offers a great deal of flexibility in modeling hybrid performance. The methodology can be used to estimate important genetic parameters and render predictions of the expected hybrid performance as well as measures of uncertainty about such predictions.

    Efeitos recursivos estimados em avaliação genética de bovinos de corte [Recursive effects estimated in the genetic evaluation of beef cattle]

    The objective of this work was to estimate genetic and residual recursive effects among six traits in beef cattle, with the aim of reducing the dimension of these matrices. Technical editor: Claudia Cristina Gulias Gomes