320 research outputs found

Extension of the bayesian alphabet for genomic selection

Author: B Hayes
BJ Hayes
CR Henderson
D Garrick
D Gianola
D Habier
D Sorensen
David Habier
Dorian J Garrick
EL Heffner
HD Daetwyler
HP Piepho
JL Jannink
Kadir Kizilkaya
LA García-Cortés
PM VanRaden
RL Fernando
Rohan L Fernando
S Karlin
S Zhong
SJ Godsill
T Ohta
THE Meuwissen
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Two Bayesian methods, BayesCπ and BayesDπ, were developed for genomic prediction to address the drawback of BayesA and BayesB regarding the impact of prior hyperparameters and treat the prior probability π that a SNP has zero effect as unknown. The methods were compared in terms of inference of the number of QTL and accuracy of genomic estimated breeding values (GEBVs), using simulated scenarios and real data from North American Holstein bulls. Results: Estimates of π from BayesCπ, in contrast to BayesDπ, were sensitive to the number of simulated QTL and training data size, and provide information about genetic architecture. Milk yield and fat yield have QTL with larger effects than protein yield and somatic cell score. The drawback of BayesA and BayesB did not impair the accuracy of GEBVs. Accuracies of alternative Bayesian methods were similar. BayesA was a good choice for GEBV with the real data. Computing time was shorter for BayesCπ than for BayesDπ, and longest for our implementation of BayesA. Conclusions: Collectively, accounting for computing effort, uncertainty as to the number of QTL (which affects the GEBV accuracy of alternative methods), and fundamental interest in the number of QTL underlying quantitative traits, we believe that BayesCπ has merit for routine applications.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Persistence of accuracy of genomic estimated breeding values over generations in layer chickens

Author: A Wolc
AK Sonesson
Anna Wolc
AR Gilmour
BJ Hayes
D Habier
David Habier
Dorian J Garrick
G Moser
HD Daetwyler
Jack CM Dekkers
Janet E Fulton
JCM Dekkers
Jesus Arango
ME Goddard
Neil P O'Sullivan
Petek Settar
PM VanRaden
RL Fernando
Rohan Fernando
Rudolf Preisinger
T Luan
THE Meuwissen
TR Solberg
WM Muir
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: The predictive ability of genomic estimated breeding values (GEBV) originates both from associations between high-density markers and QTL (Quantitative Trait Loci) and from pedigree information. Thus, GEBV are expected to provide more persistent accuracy over successive generations than breeding values estimated using pedigree-based methods. The objective of this study was to evaluate the accuracy of GEBV in a closed population of layer chickens and to quantify their persistence over five successive generations using marker or pedigree information. Methods: The training data consisted of 16 traits and 777 genotyped animals from two generations of a brown-egg layer breeding line, 295 of which had individual phenotype records, while others had phenotypes on 2,738 non-genotyped relatives, or similar data accumulated over up to five generations. Validation data included phenotyped and genotyped birds from five subsequent generations (on average 306 birds/generation). Birds were genotyped for 23,356 segregating SNP. Animal models using genomic or pedigree relationship matrices and Bayesian model averaging methods were used for training analyses. Accuracy was evaluated as the correlation between EBV and phenotype in validation divided by the square root of trait heritability. Results: Pedigree relationships in outbred populations are reduced by 50% at each meiosis, therefore accuracy is expected to decrease by the square root of 0.5 every generation, as observed for pedigree-based EBV (Estimated Breeding Values). In contrast the GEBV accuracy was more persistent, although the drop in accuracy was substantial in the first generation. Traits that were considered to be influenced by fewer QTL and to have a higher heritability maintained a higher GEBV accuracy over generations. In conclusion, GEBV capture information beyond pedigree relationships, but retraining every generation is recommended for genomic selection in closed breeding populations.
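
The two quantities described in this abstract (accuracy as a correlation divided by the square root of heritability, and the expected decay of pedigree-based accuracy by the square root of 0.5 per generation) can be illustrated with a minimal sketch. The function names and the numeric inputs below are hypothetical and chosen only for illustration; they are not values from the study.

```python
import math
from typing import List


def validation_accuracy(corr_ebv_phenotype: float, heritability: float) -> float:
    """Accuracy = cor(EBV, phenotype) / sqrt(h^2), as stated in the abstract."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return corr_ebv_phenotype / math.sqrt(heritability)


def expected_pedigree_accuracy(initial_accuracy: float, generations: int) -> List[float]:
    """Expected pedigree-based accuracy for generations 0..n,
    multiplying by sqrt(0.5) once per generation."""
    factor = math.sqrt(0.5)
    return [initial_accuracy * factor ** g for g in range(generations + 1)]


# Hypothetical inputs (not taken from the paper): correlation 0.30, heritability 0.36.
acc = validation_accuracy(0.30, 0.36)
decay = expected_pedigree_accuracy(acc, 5)
for generation, value in enumerate(decay):
    print(f"generation {generation}: expected pedigree accuracy {value:.3f}")
```

Under this rule the expected pedigree-based accuracy halves every two generations, which is the baseline against which the abstract describes GEBV accuracy as more persistent.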

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Breeding value prediction for production traits in layer chickens using pedigree or genomic relationships in a reduced animal model

Author: A Legarra
Anna Wolc
AR Gilmour
BJ Hayes
Chris Stricker
D Habier
David Habier
DJ Garrick
Dorian J Garrick
HD Daetwyler
I Aguilar
IMS White
Jack CM Dekkers
Janet E Fulton
JCM Dekkers
Jesus Arango
JWM Bastiaansen
KL Verbyla
M Goddard
MS Lund
Neil P O'Sullivan
OF Christensen
Petek Settar
PM VanRaden
RL Quaas
Rohan Fernando
Rudolf Preisinger
Susan J Lamont
T Luan
THE Meuwissen
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Genomic selection involves breeding value estimation of selection candidates based on high-density SNP genotypes. To quantify the potential benefit of genomic selection, accuracies of estimated breeding values (EBV) obtained with different methods using pedigree or high-density SNP genotypes were evaluated and compared in a commercial layer chicken breeding line. Methods: The following traits were analyzed: egg production, egg weight, egg color, shell strength, age at sexual maturity, body weight, albumen height, and yolk weight. Predictions appropriate for early or late selection were compared. A total of 2,708 birds were genotyped for 23,356 segregating SNP, including 1,563 females with records. Phenotypes on relatives without genotypes were incorporated in the analysis (in total 13,049 production records). The data were analyzed with a Reduced Animal Model using a relationship matrix based on pedigree data or on marker genotypes and with a Bayesian method using model averaging. Using a validation set that consisted of individuals from the generation following training, these methods were compared by correlating EBV with phenotypes corrected for fixed effects, selecting the top 30 individuals based on EBV and evaluating their mean phenotype, and by regressing phenotypes on EBV. Results: Using high-density SNP genotypes increased accuracies of EBV up to two-fold for selection at an early age and by up to 88% for selection at a later age. Accuracy increases at an early age can be mostly attributed to improved estimates of parental EBV for shell quality and egg production, while for other egg quality traits it is mostly due to improved estimates of Mendelian sampling effects. A relatively small number of markers was sufficient to explain most of the genetic variation for egg weight and body weight.

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The impact of genetic relationship information on genomic breeding values in German Holstein cattle

Author: A Winter
APW de Roos
AR Gilmour
B Grisart
BJ Hayes
BL Harris
CR Henderson
D Gianola
D Habier
David Habier
DJ Garrick
F Farnir
Franz-Reinhold Seefried
G Malécot
Georg Thaller
HD Daetwyler
JCM Dekkers
Jens Tetens
LK Matukumalli
LR Schaeffer
M Pérez-Enciso
M Sargolzaei
ME Goddard
MPL Calus
P Croiseau
P Scheet
Peter Lichtner
PM VanRaden
RL Fernando
S König
S Purcell
S Zhong
THE Meuwissen
TR Solberg
WM Muir
Z Liu
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: The impact of additive-genetic relationships captured by single nucleotide polymorphisms (SNPs) on the accuracy of genomic breeding values (GEBVs) has been demonstrated, but recent studies on data obtained from Holstein populations have ignored this fact. However, this impact and the accuracy of GEBVs due to linkage disequilibrium (LD), which is fairly persistent over generations, must be known to implement future breeding programs. Materials and methods: The data set used to investigate these questions consisted of 3,863 German Holstein bulls genotyped for 54,001 SNPs, their pedigree and daughter yield deviations for milk yield, fat yield, protein yield and somatic cell score. A cross-validation methodology was applied, where the maximum additive-genetic relationship (a_max) between bulls in training and validation was controlled. GEBVs were estimated by a Bayesian model averaging approach (BayesB) and an animal model using the genomic relationship matrix (G-BLUP). The accuracy of GEBVs due to LD was estimated by a regression approach using accuracy of GEBVs and accuracy of pedigree-based BLUP-EBVs. Results: Accuracy of GEBVs obtained by both BayesB and G-BLUP decreased with decreasing a_max for all traits analyzed. The decay of accuracy tended to be larger for G-BLUP and with smaller training size. Differences between BayesB and G-BLUP became evident for the accuracy due to LD, where BayesB clearly outperformed G-BLUP with increasing training size. Conclusions: GEBV accuracy of current selection candidates varies due to different additive-genetic relationships relative to the training data. Accuracy of future candidates can be lower than reported in previous studies because information from close relatives will not be available when selection on GEBVs is applied. A Bayesian model averaging approach exploits LD information considerably better than G-BLUP and thus is the most promising method. Cross-validations should account for family structure in the data to allow for long-lasting genomic based breeding plans in animal and plant breeding.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PuSH

Using the Pareto principle in genome-wide breeding value estimation

Author: B Efron
B Hayes
BJ Hayes
CR Henderson
D Habier
EI George
H Ishwaran
HD Daetwyler
J Besag
J Crossa
JM Juran
M Goddard
M Stone
PM VanRaden
T Luan
T Park
TH Meuwissen
THE Meuwissen
Theo HE Meuwissen
Xijiang Yu
Publication venue: BioMed Central
Publication date: 01/01/2011

Genome-wide breeding value (GWEBV) estimation methods can be classified based on the prior distribution assumptions of marker effects. Genome-wide BLUP methods assume a normal prior distribution for all markers with a constant variance, and are computationally fast. In Bayesian methods, more flexible prior distributions of SNP effects are applied that allow for very large SNP effects although most are small or even zero, but these prior distributions are often also computationally demanding as they rely on Markov chain Monte Carlo sampling. In this study, we adopted the Pareto principle to weight available marker loci, i.e., we consider that x% of the loci explain (100 - x)% of the total genetic variance. Assuming this principle, it is also possible to define the variances of the prior distribution of the 'big' and 'small' SNP. The relatively few large SNP explain a large proportion of the genetic variance and the majority of the SNP show small effects and explain a minor proportion of the genetic variance. We name this method MixP, where the prior distribution is a mixture of two normal distributions, i.e. one with a big variance and one with a small variance. Simulation results, using a real Norwegian Red cattle pedigree, show that MixP is at least as accurate as the other methods in all studied cases. This method also reduces the hyper-parameters of the prior distribution from 2 (proportion and variance of SNP with big effects) to 1 (proportion of SNP with big effects), assuming the overall genetic variance is known. The mixture of normal distribution prior made it possible to solve the equations iteratively, which reduced computation loads by two orders of magnitude. In the era of marker density reaching million(s) and whole-genome sequence data, MixP provides a computationally feasible Bayesian method of analysis.
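
As a rough illustration of how the Pareto assumption could fix both prior variances from a single proportion, the sketch below splits a known total genetic variance between 'big' and 'small' SNP. It assumes, purely for illustration, that every marker within a class contributes equally and that marker variances sum to the total genetic variance; the paper's actual parameterization (for example, any scaling by allele frequencies) may differ, and the function name and inputs are hypothetical.

```python
from typing import Tuple


def pareto_prior_variances(
    total_genetic_variance: float, n_markers: int, prop_big: float
) -> Tuple[float, float]:
    """Per-marker prior variances under the assumption that a fraction
    `prop_big` of markers explains (1 - prop_big) of the genetic variance.

    Assumption (not from the source): equal contribution per marker within
    each class, and per-marker variances summing to the total.
    """
    if not 0.0 < prop_big < 1.0:
        raise ValueError("prop_big must lie strictly between 0 and 1")
    n_big = prop_big * n_markers
    n_small = (1.0 - prop_big) * n_markers
    var_big = (1.0 - prop_big) * total_genetic_variance / n_big
    var_small = prop_big * total_genetic_variance / n_small
    return var_big, var_small


# Hypothetical example: 1,000 markers, total genetic variance 1.0, and the
# classic 20/80 split (20% of loci explain 80% of the variance).
var_big, var_small = pareto_prior_variances(1.0, 1000, 0.2)
print(f"big-effect variance per SNP:   {var_big:.5f}")
print(f"small-effect variance per SNP: {var_small:.5f}")
```

With the proportion as the only free hyper-parameter, both variances follow once the total genetic variance is taken as known, which matches the abstract's statement that the number of hyper-parameters drops from 2 to 1.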

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The importance of identity-by-state information for the accuracy of genomic selection

Author: Alessandro Bagnato
AR Gilmour
BJ Hayes
BL Harris
D Berry
D Habier
DJ Garrick
HD Daetwyler
John A Woolliams
Jørgen Ødegård
M Goddard
Marlies Dolezal
ME Goddard
MS Lund
PM VanRaden
R Makowsky
RL Fernando
Sergio I Roman-Ponce
T Luan
T Meuwissen
THE Meuwissen
Theo HE Meuwissen
Tu Luan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Abstract. Background: It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. Methods: The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. Results: We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was as or more reliable than that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. Conclusions: Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.

Crossref

AIR Universita degli studi di Milano

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Bayesian Variable Selection to identify QTL affecting a simulated quantitative trait

Author: Anouk Schurink
BL Fridley
CJ Hoggart
D Habier
G Sahana
Henri CM Heuven
JM Elsen
LLG Janss
Luc LG Janss
RE Kass
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Recent developments in genetic technology and methodology enable accurate detection of QTL and estimation of breeding values, even in individuals without phenotypes. The QTL-MAS workshop offers the opportunity to test different methods to perform a genome-wide association study on simulated data with a QTL structure that is unknown beforehand. The simulated data contained 3,220 individuals: 20 sires and 200 dams with 3,000 offspring. All individuals were genotyped, though only 2,000 offspring were phenotyped for a quantitative trait. QTL affecting the simulated quantitative trait were identified and breeding values of individuals without phenotypes were estimated using Bayesian Variable Selection, a multi-locus SNP model in association studies. Results: Estimated heritability of the simulated quantitative trait was 0.30 (SD = 0.02). Mean posterior probability of SNP modelled having a large effect (π̂) was 0.0066 (95%HPDR: 0.0014-0.0132). Mean posterior probability of variance of second distribution was 0.409 (95%HPDR: 0.286-0.589). The genome-wide association analysis resulted in 14 significant and 43 putative SNP, comprising 7 significant QTL on chromosomes 1, 2 and 3 and putative QTL on all chromosomes. Assigning single or multiple QTL to significant SNP was not obvious, especially for SNP in the same region that were more or less in LD. Correlation between the simulated and estimated breeding values of 1,000 offspring without phenotypes was 0.91. Conclusions: Bayesian Variable Selection using thousands of SNP was successfully applied to genome-wide association analysis of a simulated dataset with unknown QTL structure. Simulated QTL with Mendelian inheritance were accurately identified, while imprinted and epistatic QTL were only putatively detected. The correlation between simulated and estimated breeding values of offspring without phenotypes was high.

Crossref

Springer - Publisher Connector

PubMed Central

Wageningen University & Research Publications

Utrecht University Repository

Plant segmentation by supervised machine learning methods

Author: Choudhury S. D.
Davies E. R.
Habier D.
Hsu C.‐W.
Johnson R. A.
Kingma D. P.
McCormick R. F.
Vibhute A.
Wager S.
Wahba G.
Xavier A.
Xiong X.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2020

High-throughput phenotyping systems provide abundant data for statistical analysis through plant imaging. Before usable data can be obtained, image processing must take place. In this study, we used supervised learning methods to segment plants from the background in such images and compared them with commonly used thresholding methods. Because obtaining accurate training data is a major obstacle to using supervised learning methods for segmentation, a novel approach to producing accurate labels was developed. We demonstrated that, with careful selection of training data through such an approach, supervised learning methods, and neural networks in particular, can outperform thresholding methods at segmentation.

Crossref

DigitalCommons@University of Nebraska

Bowling Green State University: ScholarWorks@BGSU

Fitting and validating the genomic evaluation model to Polish Holstein-Friesian cattle

Author: A Legarra
A Żarnecki
B Grisart
BJ Hayes
D Habier
E Mäntysaari
I Strandén
L Jairath
MPL Calus
MS Lund
PM VanRaden
Publication venue: Springer-Verlag
Publication date: 01/01/2011

The aim of the study was to fit the genomic evaluation model to Polish Holstein-Friesian dairy cattle. A training data set for the estimation of additive effects of single nucleotide polymorphisms (SNPs) consisted of 1227 Polish Holstein-Friesian bulls. Genotypes were obtained by the use of Illumina BovineSNP50 Genotyping BeadChip. Altogether 29 traits were considered: milk, fat and protein yields, somatic cell score, four female fertility traits, and 21 traits describing conformation. The prediction of direct genomic values was based on a mixed model containing deregressed national proofs as a dependent variable and random SNP effects as independent variables. The correlations between direct genomic values and conventional estimated breeding values estimated for the whole data set were overall very high and varied between 0.98 for production traits and 0.78 for non-return rates for cows. For the validation data set of 232 bulls the corresponding correlations were 0.38 for milk, 0.37 for protein, and 0.32 for fat yields, while the correlations between genomic enhanced breeding values and conventional estimated breeding values for the four traits were: 0.43, 0.44, 0.31, and 0.35. This model was able to pass the Interbull validation criteria for genomic selection, which indicates that it is realistic to implement genomic selection in Polish Holstein-Friesian cattle.

Crossref

Springer - Publisher Connector

PubMed Central

Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

Author: A Coster
AC Sørensen
B Gredler
B Grisart
Ben J Hayes
BJ Hayes
BS Dayal
CD Dechow
D Habier
DF Gudbjartsson
DP Berry
FG Curtis
G Lettre
G Moser
Gerhard Moser
HD Daetwyler
Herman W Raadsma
I-G Chong
JC Whittaker
KA Weigel
LR Schaeffer
M Goddard
M Haile-Mariam
M Perola
Mehar S Khatkar
MN Weedon
MP Calus
PM VanRaden
S König
S Tsuruta
S Wold
T Luan
T Meuwissen
TH Meuwissen
TR Solberg
WM Muir
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods: Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results: RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls. Conclusions: Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~3,000 to 5,000 evenly spaced SNP.
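
The two kinds of SNP subset described in the Methods (the highest ranked SNP overall, and one SNP per interval of approximately equal length) can be sketched as follows. This is a simplified illustration rather than the authors' procedure: the regression coefficients are taken as given instead of being fitted by RR or PLSR, a single chromosome is assumed, and the function names, coefficients and positions are hypothetical.

```python
from typing import List, Sequence


def top_ranked_snps(coefs: Sequence[float], k: int) -> List[int]:
    """Indices of the k SNP with the largest absolute coefficients."""
    return sorted(range(len(coefs)), key=lambda i: -abs(coefs[i]))[:k]


def best_snp_per_interval(
    coefs: Sequence[float], positions: Sequence[float], n_intervals: int
) -> List[int]:
    """Split the position range into n_intervals equal-length bins and keep,
    in each non-empty bin, the SNP with the largest absolute coefficient."""
    lo, hi = min(positions), max(positions)
    width = (hi - lo) / n_intervals or 1.0  # guard against a zero-length range
    best = {}
    for i, pos in enumerate(positions):
        # The last position falls on the upper edge, so clamp it into the final bin.
        b = min(int((pos - lo) / width), n_intervals - 1)
        if b not in best or abs(coefs[i]) > abs(coefs[best[b]]):
            best[b] = i
    return [best[b] for b in sorted(best)]


# Hypothetical coefficients and map positions for six SNP on one chromosome.
coefs = [0.1, -0.9, 0.3, 0.05, -0.2, 0.7]
positions = [0, 10, 20, 30, 40, 50]

top_two = top_ranked_snps(coefs, 2)
spaced = best_snp_per_interval(coefs, positions, 3)
print("highest ranked:", top_two)
print("one per interval:", spaced)
```

The evenly spaced variant trades some per-trait ranking information for coverage of the whole map, which is consistent with the abstract's finding that a common evenly spaced set loses little accuracy once enough SNP are included.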

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace