2,521 research outputs found

Restricted maximum likelihood estimation of covariances in sparse linear models

Authors: Groeneveld Eildert; Neumaier Arnold
Publication venue: BioMed Central
Publication date: 01/01/1998

This paper discusses the restricted maximum likelihood (REML) approach for the estimation of covariance matrices in linear stochastic models, as implemented in the current version of the VCE package for covariance component estimation in large animal breeding models. The main features are: 1) the representation of the equations in an augmented form that simplifies the implementation; 2) the parametrization of the covariance matrices by means of their Cholesky factors, thus automatically ensuring their positive definiteness; 3) explicit formulas for the gradients of the REML function for the case of large and sparse model equations with a large number of unknown covariance components and possibly incomplete data, using the sparse inverse to obtain the gradients cheaply; 4) use of model equations that make separate formation of the inverse of the numerator relationship matrix unnecessary. Many large-scale breeding problems were solved with the new implementation, among them an example with more than 250 000 normal equations and 55 covariance components, taking 41 h of CPU time on a Hewlett Packard 755.
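
Feature 2 of the abstract above (parametrizing a covariance matrix by its Cholesky factor) can be illustrated with a minimal sketch. The function name `cov_from_cholesky_params` and the choice to exponentiate the diagonal entries are assumptions made here for illustration; the abstract does not say how VCE stores or constrains the factor.

```python
import math

def cov_from_cholesky_params(theta, n):
    """Map an unconstrained vector theta of length n*(n+1)/2 to an
    n x n covariance matrix Sigma = L L^T (hypothetical helper)."""
    if len(theta) != n * (n + 1) // 2:
        raise ValueError("theta must have n*(n+1)/2 entries")
    # Fill the lower-triangular factor L row by row.
    L = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            # Assumption: diagonal entries are stored on a log scale so
            # that L has a strictly positive diagonal for any theta.
            L[i][j] = math.exp(theta[k]) if i == j else theta[k]
            k += 1
    # Sigma = L L^T is symmetric positive definite because L is
    # lower triangular with a nonzero diagonal.
    return [[sum(L[i][m] * L[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]

sigma = cov_from_cholesky_params([0.0, 0.5, 0.0], 2)
```

Because every parameter vector maps to a valid covariance matrix, an optimizer can search over `theta` without any explicit positive-definiteness constraint.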

CiteSeerX

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

High performance computing for large-scale genomic prediction

Author: De Coninck Arne
Publication venue: Ghent University. Faculty of Bioscience Engineering
Publication date: 01/01/2016

In the past decades, genetics has been studied intensively, leading to the knowledge that DNA is the molecule behind genetic inheritance, and since the start of the new millennium next-generation sequencing methods have made it possible to sample this DNA at an ever-decreasing cost. Animal and plant breeders have always made use of genetic information to predict the agronomic performance of new breeds. While this genetic information was previously gathered from the pedigree of the population under study, genomic information from the DNA also makes it possible to deduce correlations between individuals that do not share any known ancestors, leading to so-called genomic prediction of agronomic performance. Nowadays, the number of informative samples that can be taken from a genome ranges from one thousand to one million. Using all this information in a breeding context where agronomic performance is predicted and optimized for different environmental conditions is not a straightforward task. Moreover, the number of individuals for which this information is available keeps growing, and thus sophisticated computational methods are required for analyzing these large-scale genomic data sets. This thesis introduces some concepts of high performance computing in a genomic prediction context and shows that analyzing phenotypic records of large numbers of genotyped individuals leads to a better prediction accuracy of the agronomic performance in different environments. Finally, it is shown that the parts of the DNA that influence the agronomic performance under certain environmental conditions can be pinpointed, and that breeders can use this knowledge to select individuals that thrive better in the targeted environment.

Ghent University Academic Bibliography

Genomic prediction when some animals are not genotyped

Authors: Christensen Ole F; Lund Mogens S
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The use of genomic selection in breeding programs may increase the rate of genetic improvement, reduce the generation time, and provide higher accuracy of estimated breeding values (EBVs). A number of different methods have been developed for genomic prediction of breeding values, but many of them assume that all animals have been genotyped. In practice, not all animals are genotyped, and the methods have to be adapted to this situation. Results: In this paper we provide an extension of a linear mixed model method for genomic prediction to the situation with non-genotyped animals. The model specifies that a breeding value is the sum of a genomic and a polygenic genetic random effect, where genomic genetic random effects are correlated with a genomic relationship matrix constructed from markers and the polygenic genetic random effects are correlated with the usual relationship matrix. The extension of the model to non-genotyped animals is made by using the pedigree to derive an extension of the genomic relationship matrix to non-genotyped animals. As a result, in the extended model the estimated breeding values are obtained by blending the information used to compute traditional EBVs and the information used to compute purely genomic EBVs. Parameters in the model are estimated using average information REML and estimated breeding values are best linear unbiased predictions (BLUPs). The method is illustrated using a simulated data set. Conclusions: The extension of the method to non-genotyped animals presented in this paper makes it possible to integrate all the genomic, pedigree and phenotype information into a one-step procedure for genomic prediction. Such a one-step procedure results in more accurate estimated breeding values and has the potential to become the standard tool for genomic prediction of breeding values in future practical evaluations in pig and cattle breeding.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A rapid method for computing the inverse of the gametic covariance matrix between relatives for a marked Quantitative Trait Locus

Authors: Albert E. Freeman; Gamal Abdel-Azim
Publication venue: EDP Sciences
Publication date: 01/01/2003

Crossref

Approximate genome-based kernel models for large data sets including main effects and interactions

Authors: Crossa Jose; Cuevas Jaime; Lillemo Morten; Martini J.W.R.; Montesinos-Lopez Osval A.; Perez-Rodriguez Paulino
Publication venue: Frontiers Media SA
Publication date: 01/01/2020

The rapid development of molecular markers and sequencing technologies has made it possible to use genomic prediction (GP) and selection (GS) in animal and plant breeding. However, when the number of observations (n) is large (thousands or millions), computational difficulties when handling these large genomic kernel relationship matrices (inverting and decomposing) increase exponentially. This problem increases when genomic × environment interaction and multi-trait kernels are included in the model. In this research we propose selecting a small number of lines m (m < n) for constructing an approximate kernel of lower rank than the original and thus exponentially decreasing the required computing time. First, we describe the full genomic method for single environment (FGSE) with a covariance matrix (kernel) including all n lines. Second, we select m lines and approximate the original kernel for the single environment model (APSE). Similarly, but including main effects and G × E, we explain a full genomic method with genotype × environment model (FGGE), and including m lines, we approximated the kernel method with G × E (APGE). We applied the proposed method to two different wheat data sets of different sizes (n) using the standard linear kernel Genomic Best Linear Unbiased Predictor (GBLUP) and also using eigenvalue decomposition. In both data sets, we compared the prediction performance and computing time for FGSE versus APSE; we also compared FGGE versus APGE. Results showed a competitive prediction performance of the approximated methods with a significant reduction in computing time. Genomic prediction accuracy depends on the decay of the eigenvalues (amount of variance information loss) of the original kernel as well as on the size of the selected lines m.
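
The idea of approximating an n × n kernel from m selected lines can be sketched with a Nyström-type construction, K ≈ K_nm K_mm^{-1} K_mn. This is one standard way to build such a low-rank approximation and is an assumption here: the abstract does not spell out the authors' exact construction, and the function names `linear_kernel`, `solve` and `nystrom` are hypothetical.

```python
def linear_kernel(X):
    """Linear kernel K = X X^T for a marker matrix X given as a list of rows."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | B] with float entries.
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("K_mm is singular; choose different lines")
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def nystrom(K, idx):
    """Rank-m approximation K_nm K_mm^{-1} K_mn built from the lines in idx."""
    n = len(K)
    K_mm = [[K[i][j] for j in idx] for i in idx]
    K_mn = [[K[i][j] for j in range(n)] for i in idx]
    Z = solve(K_mm, K_mn)  # Z = K_mm^{-1} K_mn, an m x n matrix
    # K is symmetric, so K_nm[i][a] equals K_mn[a][i].
    return [[sum(K_mn[a][i] * Z[a][j] for a in range(len(idx)))
             for j in range(n)] for i in range(n)]

# Three lines, two markers: K has rank 2, so m = 2 lines recover it.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = linear_kernel(X)
K_approx = nystrom(K, [0, 1])
```

Only the m × m block has to be inverted, which is where the saving in computing time comes from when m is much smaller than n. How close the approximation is depends on how quickly the eigenvalues of the original kernel decay, as the abstract notes.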

Brage NMBU

NORA - Norwegian Open Research Archives

lme4qtl: linear mixed models with flexible covariance structure for genetic studies of related individuals.

Authors: Aschard Hugues; Brunel Helena; Martinez-Perez Angel; Soria Jose Manuel; Vázquez-Santiago Miquel; Ziyatdinov Andrey
Publication venue: BMC Bioinformatics
Publication date: 01/02/2018

BACKGROUND: Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly performed by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices; and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software. RESULTS: To address the aforementioned limitations, we developed a new R package lme4qtl as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data in the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project. CONCLUSIONS: Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl

Harvard University - DASH

Directory of Open Access Journals

Apollo (Cambridge)

HAL-Pasteur

Quantitative genetic modeling and inference in the presence of nonignorable missing data.

Authors: Jensen H.; Larsen C.T.; Roulin A.; Steinsland I.
Publication venue: Wiley
Publication date: 01/01/2014

Natural selection is typically exerted at some specific life stages. If natural selection takes place before a trait can be measured, using conventional models can lead to wrong inference about population parameters. When the missing data process relates to the trait of interest, valid inference requires explicit modeling of the missing process. We propose a joint modeling approach, a shared parameter model, to account for nonrandom missing data. It consists of an animal model for the phenotypic data and a logistic model for the missing process, linked by the additive genetic effects. A Bayesian approach is taken and inference is made using integrated nested Laplace approximations. From a simulation study we find that wrongly assuming that missing data are missing at random can result in severely biased estimates of additive genetic variance. Using real data from a wild population of Swiss barn owls Tyto alba, our model indicates that the missing individuals would display large black spots; and we conclude that genes affecting this trait are already under selection before it is expressed. Our model is a tool to correctly estimate the magnitude of both natural selection and additive genetic variance.

Serveur académique lausannois

NORA - Norwegian Open Research Archives

Using mapped quantitative trait loci in improving genetic evaluation

Author: Abdel Azim Gamal Abdel N.
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2001

The benefit of using QTL information in dairy cattle breeding schemes is investigated by means of computer simulation. In addition, algorithms are proposed to overcome computational problems that arise when marker data are included in mixed linear models. Computer simulation was conducted with parameters relative to the Holstein population of the United States. Superiority of QTL-assisted selection (QAS) over QTL-free selection was studied in four pathways of selection, namely active sires, young bulls, bull dams, and cows, for cumulative genetic response, accuracy of evaluation, and selection pressure on the QTL. Further, breeding scheme as a factor was studied. The breeding scheme was the most effective factor in increasing the superiority of QAS. In agreement with many previous studies, nucleus breeding schemes were found to be promising systems to implement QTL information. On the other hand, benefits of QAS in conventional two-stage selection programs were limited. The interaction between the type of QTL information available and the breeding system was found to be important. Using a highly polymorphic QTL in nucleus schemes was found to be very effective. Effects of different numbers of alleles per locus and different numbers of loci on the superiority of QAS were studied. An algorithm to directly build the inverse of a conditional gametic relationship matrix, given marker data, was developed. The inverse algorithm is based on matrix decomposition instead of partitioned matrix theory. Numerical techniques that greatly improved computing performance were introduced. Appropriate modifications to the conventional breeding schemes that are currently in use are highly recommended. Further, attention should be paid to the characteristics of the QTL and how they may interact with the breeding system, e.g., number of loci and alleles. Finally, the study found that the use of marked or known QTL information in genetic evaluation is computationally possible and generally useful.

Digital Repository @ Iowa State University (ISU)

Genomic analysis of dominance effects on milk production and conformation traits in Fleckvieh cattle

Authors: Edel C.; Emmerling R.; Ertl J.; Götz K.U.; Legarra A.; Varona L.; Vitezica Z.G.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: Estimates of dominance variance in dairy cattle based on pedigree data vary considerably across traits and amount to up to 50% of the total genetic variance for conformation traits and up to 43% for milk production traits. Using bovine SNP (single nucleotide polymorphism) genotypes, dominance variance can be estimated both at the marker level and at the animal level using genomic dominance effect relationship matrices. Yield deviations of high-density genotyped Fleckvieh cows were used to assess cross-validation accuracy of genomic predictions with additive and dominance models. The potential use of dominance variance in planned matings was also investigated. Results: Variance components of nine milk production and conformation traits were estimated with additive and dominance models using yield deviations of 1996 Fleckvieh cows and ranged from 3.3% to 50.5% of the total genetic variance. REML and Gibbs sampling estimates showed good concordance. Although standard errors of estimates of dominance variance were rather large, estimates of dominance variance for milk, fat and protein yields, somatic cell score and milkability were significantly different from 0. Cross-validation accuracy of predicted breeding values was higher with genomic models than with the pedigree model. Inclusion of dominance effects did not increase the accuracy of the predicted breeding and total genetic values. Additive and dominance SNP effects for milk yield and protein yield were estimated with a BLUP (best linear unbiased prediction) model and used to calculate expectations of breeding values and total genetic values for putative offspring. Selection on total genetic value instead of breeding value would result in a larger expected total genetic superiority in progeny, i.e. 14.8% for milk yield and 27.8% for protein yield and reduce the expected additive genetic gain only by 4.5% for milk yield and 2.6% for protein yield.
Conclusions: Estimated dominance variance was substantial for most of the analyzed traits. Due to small dominance effect relationships between cows, predictions of individual dominance deviations were very inaccurate and including dominance in the model did not improve prediction accuracy in the cross-validation study. Exploitation of dominance variance in assortative matings was promising and did not appear to severely compromise additive genetic gain.

Repositorio Universidad de Zaragoza

Springer - Publisher Connector

PubMed Central

ProdInra