
EDP Sciences OAI-PMH repository (1.2.0)

Wageningen University & Research Publications

Within- and across-breed genomic prediction using whole-genome sequence and single nucleotide polymorphism panels

Author: A Coster
AC Bouwman
AK Sonesson
AP de Roos
APW de Roos
BJ Hayes
D Habier
D Habier
DS Falconer
G Su
HD Daetwyler
HD Daetwyler
JB Cole
JE Pryce
John A. Woolliams
KM Olson
LJ Corbin
ME Goddard
ME Goddard
Oscar O. M. Iheshiulor
PM VanRaden
PM VanRaden
PM VanRaden
Robin Wellmann
S Purcell
SA Clark
T Druet
T Luan
THE Meuwissen
THE Meuwissen
THE Meuwissen
THE Meuwissen
THE Meuwissen
Theo H. E. Meuwissen
TR Solberg
U Ober
WG Hill
WG Hill
X Yu
Xijiang Yu
YC Wientjes
YCJ Wientjes
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background Currently, genomic prediction in cattle is largely based on panels of about 54k single nucleotide polymorphisms (SNPs). However, with the decreasing costs of and current advances in next-generation sequencing technologies, whole-genome sequence (WGS) data on large numbers of individuals is within reach. Availability of such data provides new opportunities for genomic selection, which need to be explored. Methods This simulation study investigated how much predictive ability is gained by using WGS data under scenarios with QTL (quantitative trait loci) densities ranging from 45 to 132 QTL/Morgan and heritabilities ranging from 0.07 to 0.30, compared to different SNP densities, with emphasis on divergent dairy cattle breeds with small populations. The relative performances of best linear unbiased prediction (SNP-BLUP) and of a variable selection method with a mixture of two normal distributions (MixP) were also evaluated. Genomic predictions were based on within-population, across-population, and multi-breed reference populations. Results The use of WGS data for within-population predictions resulted in small to large increases in accuracy for low to moderately heritable traits. Depending on heritability of the trait, and on SNP and QTL densities, accuracy increased by up to 31%. The advantage of WGS data was more pronounced (7 to 92% increase in accuracy depending on trait heritability, SNP and QTL densities, and time of divergence between populations) with a combined reference population and when using MixP. While MixP outperformed SNP-BLUP at 45 QTL/Morgan, SNP-BLUP was as good as MixP when QTL density increased to 132 QTL/Morgan. Conclusions Our results show that genomic predictions in numerically small cattle populations would benefit from a combination of WGS data, a multi-breed reference population, and a variable selection method.

Brage NMBU

The importance of identity-by-state information for the accuracy of genomic selection

Author: Alessandro Bagnato
AR Gilmour
BJ Hayes
BL Harris
D Berry
D Habier
D Habier
DJ Garrick
HD Daetwyler
John A Woolliams
Jørgen Ødegård
M Goddard
Marlies Dolezal
ME Goddard
MS Lund
PM VanRaden
R Makowsky
RL Fernando
Sergio I Roman-Ponce
T Luan
T Meuwissen
THE Meuwissen
THE Meuwissen
Theo HE Meuwissen
Tu Luan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Abstract Background It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. Methods The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. Results We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was as or more reliable than that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. Conclusions Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. 
Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.

AIR Universita degli studi di Milano

Genetic prediction of complex traits: integrating infinitesimal and marked genetic effects

Author: A Legarra
Clément Carré
CR Henderson
CR Henderson
D Gianola
David Cros
Eduardo Manfredi
Fabrice Gamboa
G de los Campos
GR Abecasis
Gregor Gorjanc
I Aguilar
John Michael Hickey
JT Yang
LN Hazel
ME Goddard
ME Goddard
MS Lund
P Vanraden
RL Fernando
RL Quaas
RV Rohlfs
SI Duchemin
TH Meuwissen
TH Meuwissen
TH Meuwissen
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Genetic prediction for complex traits is usually based on models including individual (infinitesimal) or marker effects. Here, we concentrate on models including both the individual and the marker effects. In particular, we develop a "Mendelian segregation" model combining infinitesimal effects for base individuals and realized Mendelian sampling in descendants described by the available DNA data. The model is illustrated with an example and the analyses of a public simulated data file. Further, the potential contribution of such models is assessed by simulation. Accuracy, measured as the correlation between true (simulated) and predicted genetic values, was similar for all models compared under different genetic backgrounds. As expected, the segregation model is worthwhile when markers capture a low fraction of total genetic variance. (Author's abstract)

HAL Descartes

Repository of the University of Ljubljana

Agritrop

HAL-CIRAD

Accuracy of breeding values of 'unrelated' individuals predicted by dense SNP genotyping

Author: A Tenesa
AR Gilmour
BW Silverman
CR Henderson
D Falconer
D Habier
HD Daetwyler
JA Sved
JFC Kingman
M Kimura
ME Goddard
ME Goddard
MN Weedon
MP Calus
NR Wray
NR Wray
PM VanRaden
PM Visscher
RR Hudson
SH Lee
THE Meuwissen
Theo HE Meuwissen
TR Solberg
X Kong
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background Recent developments in SNP discovery and high throughput genotyping technology have made the use of high-density SNP markers to predict breeding values feasible. This involves estimation of the SNP effects in a training data set, and use of these estimates to evaluate the breeding values of other 'evaluation' individuals. Simulation studies have shown that these predictions of breeding values can be accurate when training and evaluation individuals are (closely) related. However, many general applications of genomic selection require the prediction of breeding values of 'unrelated' individuals, i.e. individuals from the same population, but not particularly closely related to the training individuals. Methods Accuracy of selection was investigated by computer simulation of small populations. Using scaling arguments, the results were extended to different populations, training data sets and genome sizes, and different trait heritabilities. Results Prediction of breeding values of unrelated individuals required a substantially higher marker density and number of training records than when prediction individuals were offspring of training individuals. However, when the number of records was 2*Ne*L and the number of markers was 10*Ne*L, the breeding values of unrelated individuals could be predicted with accuracies of 0.88–0.93, where Ne is the effective population size and L the genome size in Morgan. Reducing this requirement to 1*Ne*L individuals reduced prediction accuracies to 0.73–0.83. Conclusion For livestock populations, 1*Ne*L requires about 30,000 training records, but this may be reduced if training and evaluation animals are related. A prediction equation is presented that predicts accuracy when training and evaluation individuals are related. For humans, 1*Ne*L requires about 350,000 individuals, which means that human disease risk prediction is possible only for diseases that are determined by a limited number of genes.
Otherwise, genotyping and phenotypic recording need to become very common in the future.
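
The scaling rule in this abstract reduces to simple arithmetic. The following is a minimal sketch only: the function name `training_requirements` is hypothetical, and the Ne and L values are illustrative assumptions chosen so that 1*Ne*L matches the roughly 30,000 and 350,000 records quoted above. They are not stated in the abstract itself.

```python
def training_requirements(ne, genome_morgans, record_factor=2, marker_factor=10):
    """Return (records, markers) under the factor * Ne * L scaling rule.

    ne             -- effective population size (Ne)
    genome_morgans -- genome size in Morgan (L)
    record_factor  -- multiplier for training records (2 or 1 in the abstract)
    marker_factor  -- multiplier for markers (10 in the abstract)
    """
    nel = ne * genome_morgans
    return record_factor * nel, marker_factor * nel


# Assumed, illustrative population parameters (not taken from the abstract).
livestock = training_requirements(ne=1000, genome_morgans=30, record_factor=1)
human = training_requirements(ne=10000, genome_morgans=35, record_factor=1)

print(livestock)  # (30000, 300000)
print(human)      # (350000, 3500000)
```

With the default `record_factor=2`, the same call gives the larger 2*Ne*L training set associated with the 0.88–0.93 accuracies.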

The complete linkage disequilibrium test: a test that points to causative mutations underlying quantitative traits

Author: AR Gilmour
CS Carlson
E Uleberg
Eivind Uleberg
EK Karlsson
F Farnir
M Pérez-Enciso
M Ron
ME Goddard
RR Hudson
Theo HE Meuwissen
WG Hill
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract Background Genetically, SNP that are in complete linkage disequilibrium with the causative SNP cannot be distinguished from the causative SNP. The Complete Linkage Disequilibrium (CLD) test presented here tests whether a SNP is in complete LD with the causative mutation or not. The performance of the CLD test is evaluated in 1000 simulated datasets. Methods The CLD test consists of two steps, i.e. analysis I and analysis II. Analysis I consists of an association analysis of the investigated region. The log-likelihood values from analysis I are next ranked in descending order, and in analysis II the CLD test evaluates differences in log-likelihood ratios between the best and second best markers. Under the null-hypothesis distribution, the best SNP is in greater LD with the QTL than the second best, while under the alternative-CLD-hypothesis, the best SNP is alike-in-state with the QTL. To find a significance threshold, the test was also performed on data excluding the causative SNP. The 5th, 10th and 50th highest TCLD value from 1000 replicated analyses were used to control the type-I-error rate of the test at p = 0.005, p = 0.01 and p = 0.05, respectively. Results In a situation where the QTL explained 48% of the phenotypic variance, analysis I detected a QTL in 994 replicates (p = 0.001), of which 972 were positioned in the correct QTL position. When the causative SNP was excluded from the analysis, 714 replicates detected evidence of a QTL (p = 0.001). In analysis II, the CLD test confirmed 280 causative SNP from 1000 simulations (p = 0.05), i.e. power was 28%. When the effect of the QTL was reduced by doubling the error variance, the power of the test reduced relatively little, to 23%. When sequence data were used, the power of the test reduced to 16%. All SNP that were confirmed by the CLD test were positioned in the correct QTL position.
Conclusions The CLD test can provide evidence for a causative SNP, but its power may be low in situations with closely linked markers. In such situations, functional evidence will also be needed to definitely conclude whether the SNP is causative or not.
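
The thresholding step described in the Methods (taking the 5th, 10th and 50th highest of 1000 null statistics for p = 0.005, 0.01 and 0.05) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `empirical_threshold` is hypothetical, and the null statistics are placeholder draws from a standard normal distribution rather than real TCLD values.

```python
import random


def empirical_threshold(null_stats, alpha):
    """Return the k-th highest null statistic, where k = alpha * n.

    With n = 1000 replicates, alpha = 0.005, 0.01 and 0.05 select the
    5th, 10th and 50th highest values, respectively.
    """
    n = len(null_stats)
    k = round(alpha * n)
    if k < 1:
        raise ValueError("too few replicates for this alpha")
    return sorted(null_stats, reverse=True)[k - 1]


# Placeholder null distribution (assumption): 1000 standard-normal draws
# standing in for test statistics computed without the causative SNP.
random.seed(1)
null_stats = [random.gauss(0.0, 1.0) for _ in range(1000)]

for alpha in (0.005, 0.01, 0.05):
    print(alpha, empirical_threshold(null_stats, alpha))
```

A test statistic from the full analysis would then be declared significant at level `alpha` when it exceeds the corresponding threshold.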

Public Library of Science (PLOS)

Best Linear Unbiased Prediction of Genomic Breeding Values Using a Trait-Specific Marker-Derived Relationship Matrix

Author: A Jacquard
A Legarra
A Nejati-Javaremi
AK Sonesson
B Hayes
BJ Hayes
BJ Hayes
BJ Hayes
BJ Hayes
CR Henderson
D Habier
D Habier
Dirk-Jan de Koning
DS Falconer
EL Heffner
H Eding
HD Daetwyler
HD Daetwyler
HM Nielsen
I Strandén
JC Whittaker
Jianfeng Liu
JL Jannink
LR Schaeffer
M Kimura
ME Goddard
ME Goddard
ME Goddard
MPL Calus
MPL Calus
N Long
OF Christensen
Piter Bijma
PM VanRaden
PM Visscher
Qin Zhang
S Xu
S Zhong
SH Lee
T Luan
TH Meuwissen
TH Meuwissen
THE Meuwissen
Thomas Mailund
TR Solberg
TR Solberg
WM Muir
Xiangdong Ding
Zhe Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2010

With the availability of high density whole-genome single nucleotide polymorphism chips, genomic selection has become a promising method to estimate genetic merit with potentially high accuracy for animal, plant and aquaculture species of economic importance. With markers covering the entire genome, genetic merit of genotyped individuals can be predicted directly within the framework of mixed model equations, by using a matrix of relationships among individuals that is derived from the markers. Here we extend that approach by deriving a marker-based relationship matrix specifically for the trait of interest.

Wageningen University & Research Publications

Strategies for implementing genomic selection in family-based aquaculture breeding schemes: double haploid sib test populations

Author: AK Sonesson
AK Sonesson
Anna K Sonesson
B Villanueva
BJ Hayes
D Habier
EL Heffner
F Galton
H Komen
HD Daetwyler
HM Nielsen
JBS Haldane
JL Jannink
John A Woolliams
JP Gibson
Kahsay G Nirea
KG Nirea
M Goddard
M Kimura
M Pszczola
ME Goddard
MG Bulmer
PM VanRaden
S Wright
THE Meuwissen
Theo HE Meuwissen
VA Martinez
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Abstract Background Simulation studies have shown that accuracy and genetic gain are increased in genomic selection schemes compared to traditional aquaculture sib-based schemes. In genomic selection, accuracy of selection can be maximized by increasing the precision of the estimation of SNP effects and by maximizing the relationships between test sibs and candidate sibs. Another means of increasing the accuracy of the estimation of SNP effects is to create individuals in the test population with extreme genotypes. The latter approach was studied here with creation of double haploids and use of non-random mating designs. Methods Six alternative breeding schemes were simulated in which the design of the test population was varied: test sibs inherited maternal (Mat), paternal (Pat) or a mixture of maternal and paternal (MatPat) double haploid genomes, or test sibs were obtained by maximum coancestry mating (MaxC), minimum coancestry mating (MinC), or random (RAND) mating. Three thousand test sibs and 3000 candidate sibs were genotyped. The test sibs were recorded for a trait that could not be measured on the candidates and were used to estimate SNP effects. Selection was done by truncation on genome-wide estimated breeding values, and 100 individuals were selected as parents each generation, equally divided between both sexes. Results Results showed a 7 to 19% increase in selection accuracy and a 6 to 22% increase in genetic gain in the MatPat scheme compared to the RAND scheme. These increases were greater with lower heritabilities. Among all other scenarios, i.e. Mat, Pat, MaxC, and MinC, no substantial differences in selection accuracy and genetic gain were observed. Conclusions In conclusion, a test population designed with a mixture of paternal and maternal double haploids, i.e. the MatPat scheme, substantially increases the accuracy of selection and genetic gain.
This will be particularly interesting for traits that cannot be recorded on the selection candidates and require the use of sib tests, such as disease resistance and meat quality.

Genomic prediction based on runs of homozygosity

Author: AK Sonesson
Alessandro Bagnato
AR Gilmour
B Hayes
BJ Hayes
D Habier
IM MacLeod
JA Sved
JA Yang
JL Jannink
LR Rabiner
Marlies Dolezal
ME Goddard
PM VanRaden
RL Fernando
T Luan
THE Meuwissen
THE Meuwissen
THE Meuwissen
Theo HE Meuwissen
Tu Luan
WG Hill
Xijiang Yu
Publication venue: 'Springer Science and Business Media LLC'
Publication date

Digital Repository @ Iowa State University (ISU)

Persistence of accuracy of genomic estimated breeding values over generations in layer chickens

Author: A Wolc
AK Sonesson
Anna Wolc
AR Gilmour
BJ Hayes
D Habier
D Habier
D Habier
David Habier
Dorian J Garrick
G Moser
HD Daetwyler
Jack CM Dekkers
Janet E Fulton
JCM Dekkers
Jesus Arango
ME Goddard
Neil P O'Sullivan
Petek Settar
PM VanRaden
PM VanRaden
RL Fernando
Rohan Fernando
Rudolf Preisinger
T Luan
THE Meuwissen
THE Meuwissen
TR Solberg
WM Muir
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract Background The predictive ability of genomic estimated breeding values (GEBV) originates both from associations between high-density markers and QTL (Quantitative Trait Loci) and from pedigree information. Thus, GEBV are expected to provide more persistent accuracy over successive generations than breeding values estimated using pedigree-based methods. The objective of this study was to evaluate the accuracy of GEBV in a closed population of layer chickens and to quantify their persistence over five successive generations using marker or pedigree information. Methods The training data consisted of 16 traits and 777 genotyped animals from two generations of a brown-egg layer breeding line, 295 of which had individual phenotype records, while others had phenotypes on 2,738 non-genotyped relatives, or similar data accumulated over up to five generations. Validation data included phenotyped and genotyped birds from five subsequent generations (on average 306 birds/generation). Birds were genotyped for 23,356 segregating SNP. Animal models using genomic or pedigree relationship matrices and Bayesian model averaging methods were used for training analyses. Accuracy was evaluated as the correlation between EBV and phenotype in validation divided by the square root of trait heritability. Results Pedigree relationships in outbred populations are reduced by 50% at each meiosis; therefore, accuracy is expected to decrease by the square root of 0.5 every generation, as observed for pedigree-based EBV (Estimated Breeding Values). In contrast, the GEBV accuracy was more persistent, although the drop in accuracy was substantial in the first generation. Traits that were considered to be influenced by fewer QTL and to have a higher heritability maintained a higher GEBV accuracy over generations. In conclusion, GEBV capture information beyond pedigree relationships, but retraining every generation is recommended for genomic selection in closed breeding populations.
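
Two quantities in this abstract are easy to make concrete: accuracy as the validation correlation divided by the square root of heritability, and the expected decay of pedigree-based accuracy by a factor of the square root of 0.5 per generation. The sketch below is illustrative only; the function names are hypothetical and the input values (correlation 0.2, heritability 0.25) are assumed numbers, not results from the study.

```python
import math


def validation_accuracy(corr_ebv_phenotype, heritability):
    """Accuracy = corr(EBV, phenotype) / sqrt(h^2), as defined in the abstract."""
    return corr_ebv_phenotype / math.sqrt(heritability)


def expected_pedigree_accuracy(initial_accuracy, generations):
    """Expected pedigree-based accuracy after a number of generations,
    shrinking by sqrt(0.5) per generation."""
    return initial_accuracy * math.sqrt(0.5) ** generations


# Assumed example inputs (hypothetical, not taken from the study).
acc0 = validation_accuracy(corr_ebv_phenotype=0.2, heritability=0.25)

for generation in range(6):
    decayed = expected_pedigree_accuracy(acc0, generation)
    print(generation, round(decayed, 3))
```

Under this expectation, pedigree-based accuracy halves every two generations, which is the baseline against which the more persistent GEBV accuracy is compared.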