83 research outputs found

Using the Pareto principle in genome-wide breeding value estimation

Author: B Efron
B Hayes
BJ Hayes
CR Henderson
D Habier
EI George
H Ishwaran
HD Daetwyler
J Besag
J Crossa
JM Juran
M Goddard
M Stone
PM VanRaden
T Luan
T Park
TH Meuwissen
THE Meuwissen
Theo HE Meuwissen
Xijiang Yu
Publication venue: BioMed Central
Publication date: 01/01/2011

Genome-wide breeding value (GWEBV) estimation methods can be classified based on the prior distribution assumptions of marker effects. Genome-wide BLUP methods assume a normal prior distribution for all markers with a constant variance, and are computationally fast. Bayesian methods apply more flexible prior distributions of SNP effects that allow for very large SNP effects although most are small or even zero, but these methods are often computationally demanding because they rely on Markov chain Monte Carlo sampling. In this study, we adopted the Pareto principle to weight available marker loci, i.e., we consider that x% of the loci explain (100 - x)% of the total genetic variance. Assuming this principle, it is also possible to define the variances of the prior distribution of the 'big' and 'small' SNP. The relatively few large SNP explain a large proportion of the genetic variance, and the majority of the SNP show small effects and explain a minor proportion of the genetic variance. We name this method MixP, where the prior distribution is a mixture of two normal distributions, i.e., one with a big variance and one with a small variance. Simulation results, using a real Norwegian Red cattle pedigree, show that MixP is at least as accurate as the other methods in all studied cases. This method also reduces the hyper-parameters of the prior distribution from 2 (proportion and variance of SNP with big effects) to 1 (proportion of SNP with big effects), assuming the overall genetic variance is known. The mixture of normal distributions prior made it possible to solve the equations iteratively, which reduced computation loads by two orders of magnitude. In the era of marker density reaching million(s) and whole-genome sequence data, MixP provides a computationally feasible Bayesian method of analysis.
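
The abstract above says that the Pareto assumption fixes the prior variances of the 'big' and 'small' SNP once the proportion of big SNP and the total genetic variance are known. The following is a minimal sketch of one way that split could be computed; it is not the authors' implementation. It assumes that each marker contributes its effect variance directly to the genetic variance (allele-frequency scaling is ignored), and the function name and parameters are hypothetical.

```python
def pareto_prior_variances(total_genetic_variance, n_markers, big_fraction):
    """Split a known genetic variance between 'big' and 'small' SNP.

    Assumption (not from the abstract): a fraction `big_fraction` of the
    markers explains (1 - big_fraction) of the variance, and every marker
    in a class has the same per-marker effect variance.
    """
    if not 0.0 < big_fraction < 1.0:
        raise ValueError("big_fraction must lie strictly between 0 and 1")
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")

    n_big = big_fraction * n_markers
    n_small = (1.0 - big_fraction) * n_markers

    # Big SNP share (1 - big_fraction) of the variance among n_big markers.
    var_big = (1.0 - big_fraction) * total_genetic_variance / n_big
    # Small SNP share the remaining big_fraction among n_small markers.
    var_small = big_fraction * total_genetic_variance / n_small
    return var_big, var_small


if __name__ == "__main__":
    # Hypothetical numbers: 1000 markers, 20% 'big', genetic variance 1.0.
    print(pareto_prior_variances(1.0, 1000, 0.2))
```

Under these assumptions only `big_fraction` remains to be chosen, which mirrors the abstract's point that the prior has a single hyper-parameter when the overall genetic variance is known.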

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The importance of identity-by-state information for the accuracy of genomic selection

Author: Alessandro Bagnato
AR Gilmour
BJ Hayes
BL Harris
D Berry
D Habier
DJ Garrick
HD Daetwyler
John A Woolliams
Jørgen Ødegård
M Goddard
Marlies Dolezal
ME Goddard
MS Lund
PM VanRaden
R Makowsky
RL Fernando
Sergio I Roman-Ponce
T Luan
T Meuwissen
THE Meuwissen
Theo HE Meuwissen
Tu Luan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Background: It is commonly assumed that prediction of genome-wide breeding values in genomic selection is achieved by capitalizing on linkage disequilibrium between markers and QTL but also on genetic relationships. Here, we investigated the reliability of predicting genome-wide breeding values based on population-wide linkage disequilibrium information, based on identity-by-descent relationships within the known pedigree, and to what extent linkage disequilibrium information improves predictions based on identity-by-descent genomic relationship information. Methods: The study was performed on milk, fat, and protein yield, using genotype data on 35 706 SNP and deregressed proofs of 1086 Italian Brown Swiss bulls. Genome-wide breeding values were predicted using a genomic identity-by-state relationship matrix and a genomic identity-by-descent relationship matrix (averaged over all marker loci). The identity-by-descent matrix was calculated by linkage analysis using one to five generations of pedigree data. Results: We showed that genome-wide breeding values prediction based only on identity-by-descent genomic relationships within the known pedigree was as reliable as, or more reliable than, that based on identity-by-state, which implicitly also accounts for genomic relationships that occurred before the known pedigree. Furthermore, combining the two matrices did not improve the prediction compared to using identity-by-descent alone. Including different numbers of generations in the pedigree showed that most of the information in genome-wide breeding values prediction comes from animals with known common ancestors less than four generations back in the pedigree. Conclusions: Our results show that, in pedigreed breeding populations, the accuracy of genome-wide breeding values obtained by identity-by-descent relationships was not improved by identity-by-state information. Although, in principle, genomic selection based on identity-by-state does not require pedigree data, it does use the available pedigree structure. Our findings may explain why the prediction equations derived for one breed may not predict accurate genome-wide breeding values when applied to other breeds, since family structures differ among breeds.

Crossref

AIR Universita degli studi di Milano

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

Author: A Coster
AC Sørensen
B Gredler
B Grisart
Ben J Hayes
BJ Hayes
BS Dayal
CD Dechow
D Habier
DF Gudbjartsson
DP Berry
FG Curtis
G Lettre
G Moser
Gerhard Moser
HD Daetwyler
Herman W Raadsma
I-G Chong
JC Whittaker
KA Weigel
LR Schaeffer
M Goddard
M Haile-Mariam
M Perola
Mehar S Khatkar
MN Weedon
MP Calus
PM VanRaden
S König
S Tsuruta
S Wold
T Luan
T Meuwissen
TH Meuwissen
TR Solberg
WM Muir
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods: Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results: RR and PLSR performed very similarly in predicting DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls. Conclusions: Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing approximately 3,000 to 5,000 evenly spaced SNP.
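
The abstract above describes ranking SNP by the absolute value of their ridge regression coefficients. A minimal sketch of that ranking step follows, assuming a plain closed-form ridge fit on centered genotypes. The function name, the penalty `lam`, and the simulated data are hypothetical and are not taken from the study.

```python
import numpy as np


def rank_snps_by_ridge(genotypes, phenotypes, lam):
    """Fit ridge regression and rank SNP by absolute coefficient size.

    genotypes:  (n_animals, n_snps) array of allele counts
    phenotypes: (n_animals,) array of trait values
    lam:        ridge penalty (hypothetical tuning parameter)
    """
    x = genotypes - genotypes.mean(axis=0)   # center each SNP column
    y = phenotypes - phenotypes.mean()       # center the trait
    n_snps = x.shape[1]

    # Closed-form ridge solution: (X'X + lam * I)^-1 X'y
    coefficients = np.linalg.solve(x.T @ x + lam * np.eye(n_snps), x.T @ y)

    # SNP indices ordered from largest to smallest absolute effect.
    ranking = np.argsort(-np.abs(coefficients))
    return coefficients, ranking


if __name__ == "__main__":
    # Simulated data: only SNP 2 has a real effect.
    rng = np.random.default_rng(0)
    genotypes = rng.integers(0, 3, size=(200, 10)).astype(float)
    phenotypes = 3.0 * genotypes[:, 2] + rng.normal(0.0, 0.1, size=200)
    _, ranking = rank_snps_by_ridge(genotypes, phenotypes, lam=1.0)
    print(ranking[:3])
```

A low-density subset of the kind the abstract evaluates could then be formed by keeping the first entries of `ranking`.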

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

Breeding value prediction for production traits in layer chickens using pedigree or genomic relationships in a reduced animal model

Author: A Legarra
Anna Wolc
AR Gilmour
BJ Hayes
Chris Stricker
D Habier
David Habier
DJ Garrick
Dorian J Garrick
HD Daetwyler
I Aguilar
IMS White
Jack CM Dekkers
Janet E Fulton
JCM Dekkers
Jesus Arango
JWM Bastiaansen
KL Verbyla
M Goddard
MS Lund
Neil P O'Sullivan
OF Christensen
Petek Settar
PM VanRaden
RL Quaas
Rohan Fernando
Rudolf Preisinger
Susan J Lamont
T Luan
THE Meuwissen
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Genomic selection involves breeding value estimation of selection candidates based on high-density SNP genotypes. To quantify the potential benefit of genomic selection, accuracies of estimated breeding values (EBV) obtained with different methods using pedigree or high-density SNP genotypes were evaluated and compared in a commercial layer chicken breeding line. Methods: The following traits were analyzed: egg production, egg weight, egg color, shell strength, age at sexual maturity, body weight, albumen height, and yolk weight. Predictions appropriate for early or late selection were compared. A total of 2,708 birds were genotyped for 23,356 segregating SNP, including 1,563 females with records. Phenotypes on relatives without genotypes were incorporated in the analysis (in total 13,049 production records). The data were analyzed with a Reduced Animal Model using a relationship matrix based on pedigree data or on marker genotypes and with a Bayesian method using model averaging. Using a validation set that consisted of individuals from the generation following training, these methods were compared by correlating EBV with phenotypes corrected for fixed effects, selecting the top 30 individuals based on EBV and evaluating their mean phenotype, and by regressing phenotypes on EBV. Results: Using high-density SNP genotypes increased accuracies of EBV up to two-fold for selection at an early age and by up to 88% for selection at a later age. Accuracy increases at an early age can be mostly attributed to improved estimates of parental EBV for shell quality and egg production, while for other egg quality traits it is mostly due to improved estimates of Mendelian sampling effects. A relatively small number of markers was sufficient to explain most of the genetic variation for egg weight and body weight.

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity

Author: C Burge
Cheng-Tsung Lu
DM Shien
E Huala
F Diella
F Gnad
FF Zhou
GE Crooks
H Steen
HD Huang
J Gao
JC Obenauer
JL Heazlewood
JM Stone
KC Chou
LM Iakoucheva
M Schneider
M Steffen
MJ Hubbard
N Blom
Neil Arvin Bretaña
P Diolez
PV Hornbeck
R Aebersold
S Luan
SC Huber
SR Eddy
TD Schneider
TY Lee
Tzong-Yi Lee
V Vacic
Y Xue
YH Wong
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in intracellular signal transduction. Due to the difficulty in performing high-throughput mass spectrometry-based experiments, there is a desire to predict phosphorylation sites using computational methods. However, previous studies regarding in silico prediction of plant phosphorylation sites lack the consideration of kinase-specific phosphorylation data. Thus, we are motivated to propose a new method that investigates different substrate specificities in plant phosphorylation sites. Results: Experimentally verified phosphorylation data were extracted from TAIR9, a protein database containing 3006 phosphorylation data from the plant species Arabidopsis thaliana. In an attempt to investigate the various substrate motifs in plant phosphorylation, maximal dependence decomposition (MDD) is employed to cluster a large set of phosphorylation data into subgroups containing significantly conserved motifs. Profile hidden Markov model (HMM) is then applied to learn a predictive model for each subgroup. Cross-validation evaluation on the MDD-clustered HMMs yields an average accuracy of 82.4% for serine, 78.6% for threonine, and 89.0% for tyrosine models. Moreover, independent test results using Arabidopsis thaliana phosphorylation data from UniProtKB/Swiss-Prot show that the proposed models are able to correctly predict 81.4% phosphoserine, 77.1% phosphothreonine, and 83.7% phosphotyrosine sites. Interestingly, several MDD-clustered subgroups are observed to have similar amino acid conservation with the substrate motifs of well-known kinases from Phospho.ELM, a database containing kinase-specific phosphorylation data from multiple organisms. Conclusions: This work presents a novel method for identifying plant phosphorylation sites with various substrate motifs. Based on cross-validation and independent testing, results show that the MDD-clustered models outperform models trained without using MDD. The proposed method has been implemented as a web-based plant phosphorylation prediction tool, PlantPhos (http://csb.cse.yzu.edu.tw/PlantPhos/). Additionally, two case studies have been demonstrated to further evaluate the effectiveness of PlantPhos.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Within- and across-breed genomic prediction using whole-genome sequence and single nucleotide polymorphism panels

Author: A Coster
AC Bouwman
AK Sonesson
AP Roos De
APW Roos de
BJ Hayes
D Habier
DS Falconer
G Su
HD Daetwyler
JB Cole
JE Pryce
John A. Woolliams
KM Olson
LJ Corbin
ME Goddard
Oscar O. M. Iheshiulor
PM VanRaden
Robin Wellmann
S Purcell
SA Clark
T Druet
T Luan
THE Meuwissen
Theo H. E. Meuwissen
TR Solberg
U Ober
WG Hill
X Yu
Xijiang Yu
YC Wientjes
YCJ Wientjes
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Background: Currently, genomic prediction in cattle is largely based on panels of about 54k single nucleotide polymorphisms (SNPs). However, with the decreasing costs of and current advances in next-generation sequencing technologies, whole-genome sequence (WGS) data on large numbers of individuals is within reach. Availability of such data provides new opportunities for genomic selection, which need to be explored. Methods: This simulation study investigated how much predictive ability is gained by using WGS data under scenarios with QTL (quantitative trait loci) densities ranging from 45 to 132 QTL/Morgan and heritabilities ranging from 0.07 to 0.30, compared to different SNP densities, with emphasis on divergent dairy cattle breeds with small populations. The relative performances of best linear unbiased prediction (SNP-BLUP) and of a variable selection method with a mixture of two normal distributions (MixP) were also evaluated. Genomic predictions were based on within-population, across-population, and multi-breed reference populations. Results: The use of WGS data for within-population predictions resulted in small to large increases in accuracy for low to moderately heritable traits. Depending on heritability of the trait, and on SNP and QTL densities, accuracy increased by up to 31%. The advantage of WGS data was more pronounced (7 to 92% increase in accuracy depending on trait heritability, SNP and QTL densities, and time of divergence between populations) with a combined reference population and when using MixP. While MixP outperformed SNP-BLUP at 45 QTL/Morgan, SNP-BLUP was as good as MixP when QTL density increased to 132 QTL/Morgan. Conclusions: Our results show that genomic predictions in numerically small cattle populations would benefit from a combination of WGS data, a multi-breed reference population, and a variable selection method.

Brage NMBU

Crossref

Springer - Publisher Connector

PubMed Central

Edinburgh Research Explorer

Persistence of accuracy of genomic estimated breeding values over generations in layer chickens

Author: A Wolc
AK Sonesson
Anna Wolc
AR Gilmour
BJ Hayes
D Habier
David Habier
Dorian J Garrick
G Moser
HD Daetwyler
Jack CM Dekkers
Janet E Fulton
JCM Dekkers
Jesus Arango
ME Goddard
Neil P O'Sullivan
Petek Settar
PM VanRaden
RL Fernando
Rohan Fernando
Rudolf Preisinger
T Luan
THE Meuwissen
TR Solberg
WM Muir
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The predictive ability of genomic estimated breeding values (GEBV) originates both from associations between high-density markers and QTL (Quantitative Trait Loci) and from pedigree information. Thus, GEBV are expected to provide more persistent accuracy over successive generations than breeding values estimated using pedigree-based methods. The objective of this study was to evaluate the accuracy of GEBV in a closed population of layer chickens and to quantify their persistence over five successive generations using marker or pedigree information. Methods: The training data consisted of 16 traits and 777 genotyped animals from two generations of a brown-egg layer breeding line, 295 of which had individual phenotype records, while others had phenotypes on 2,738 non-genotyped relatives, or similar data accumulated over up to five generations. Validation data included phenotyped and genotyped birds from five subsequent generations (on average 306 birds/generation). Birds were genotyped for 23,356 segregating SNP. Animal models using genomic or pedigree relationship matrices and Bayesian model averaging methods were used for training analyses. Accuracy was evaluated as the correlation between EBV and phenotype in validation divided by the square root of trait heritability. Results: Pedigree relationships in outbred populations are reduced by 50% at each meiosis; therefore, accuracy is expected to decrease by the square root of 0.5 every generation, as observed for pedigree-based EBV (Estimated Breeding Values). In contrast, the GEBV accuracy was more persistent, although the drop in accuracy was substantial in the first generation. Traits that were considered to be influenced by fewer QTL and to have a higher heritability maintained a higher GEBV accuracy over generations. In conclusion, GEBV capture information beyond pedigree relationships, but retraining every generation is recommended for genomic selection in closed breeding populations.
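
The abstract above defines accuracy as the correlation between EBV and phenotype divided by the square root of heritability, and states that pedigree-based accuracy is expected to fall by a factor of the square root of 0.5 per generation. A minimal sketch of both calculations follows. The function names and the example numbers are hypothetical and are not taken from the study.

```python
import math

import numpy as np


def validation_accuracy(ebv, phenotype, heritability):
    """Correlation between EBV and phenotype, scaled by sqrt(heritability)."""
    if not 0.0 < heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    correlation = np.corrcoef(ebv, phenotype)[0, 1]
    return correlation / math.sqrt(heritability)


def expected_pedigree_accuracy(initial_accuracy, generations):
    """Expected pedigree-based accuracy after a number of generations.

    Each generation multiplies the accuracy by sqrt(0.5), following the
    halving of pedigree relationships at each meiosis.
    """
    return initial_accuracy * math.sqrt(0.5) ** generations


if __name__ == "__main__":
    # Hypothetical starting accuracy of 0.6, followed over five generations.
    for generation in range(6):
        print(generation, round(expected_pedigree_accuracy(0.6, generation), 3))
```

Comparing observed GEBV accuracies with this expected decay is one way to read the abstract's statement that GEBV accuracy was more persistent than pedigree-based accuracy.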

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Best Linear Unbiased Prediction of Genomic Breeding Values Using a Trait-Specific Marker-Derived Relationship Matrix

Author: A Jacquard
A Legarra
A Nejati-Javaremi
AK Sonesson
B Hayes
BJ Hayes
CR Henderson
D Habier
Dirk-Jan de Koning
DS Falconer
EL Heffner
H Eding
HD Daetwyler
HM Nielsen
I Strandén
JC Whittaker
Jianfeng Liu
JL Jannink
LR Schaeffer
M Kimura
ME Goddard
MPL Calus
N Long
OF Christensen
Piter Bijma
PM VanRaden
PM Visscher
Qin Zhang
S Xu
S Zhong
SH Lee
T Luan
TH Meuwissen
THE Meuwissen
Thomas Mailund
TR Solberg
WM Muir
Xiangdong Ding
Zhe Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2010

With the availability of high-density whole-genome single nucleotide polymorphism chips, genomic selection has become a promising method to estimate genetic merit with potentially high accuracy for animal, plant and aquaculture species of economic importance. With markers covering the entire genome, genetic merit of genotyped individuals can be predicted directly within the framework of mixed model equations, by using a matrix of relationships among individuals that is derived from the markers. Here we extend that approach by deriving a marker-based relationship matrix specifically for the trait of interest.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

Wageningen University & Research Publications

Sparse logistic regression with a L1/2 penalty for gene selection in cancer classification

Author: A Ben-Dor
A Nagai
AJ Yang
C Ding
C Moroz
CC Gavin
CH Zhang
Cheng Liu
DA Notterman
G Monari
H Zou
Hai Zhang
HD Li
I Guyon
I Rivals
I Sohn
J Fan
J Fiedman
J Wiese AH
JH Dai
JW Lee
K Shailubhai
K Yang
Kwong-Sak Leung
MA Shipp
R Maglietta
R Tibshirani
S Dudoit
SK Shevade
SL Wang
T Golub
T Li
Tak-Ming Chan
U Alon
Xin-Ze Luan
Yong Liang
ZB Xu
Zong-Ben Xu
Publication venue: Springer Science and Business Media LLC

Crossref

Genotyping of Streptococcus agalactiae (group B streptococci) isolated from vaginal and rectal swabs of women at 35-37 weeks of pregnancy

Background: Group B streptococci (GBS), or Streptococcus agalactiae, are the leading bacterial cause of meningitis and bacterial sepsis in newborns. Here we compared different culture media for GBS detection and we compared the occurrence of different genotypes and serotypes of GBS isolates from the vagina and rectum. Methods: Streptococcus agalactiae was cultured separately from both rectum and vagina, for a total of 150 pregnant women, either (i) directly onto Columbia CNA agar, or indirectly, after overnight incubation in Lim broth, onto (ii) Granada agar or (iii) Columbia CNA agar. Results: Thirty-six women (24%) were colonized by GBS. Of these, 19 harbored GBS in both rectum and vagina, 9 only in the vagina and 8 exclusively in the rectum. The combination of Lim broth and subculture on Granada agar was the only culture method that detected all GBS positive women. Using RAPD-analysis, a total of 66 genotypes could be established among the 118 isolates from 32 women for which fingerprinting was carried out. Up to 4 different genotypes in total (rectal + vaginal) were found for 4 women; one woman carried 3 different genotypes vaginally and 14 women carried 2 different genotypes vaginally. Only two subjects were found to carry strains with the same genotype, although the serotype of both of these strains was different. Eighteen of the 19 subjects with GBS at both sites had at least one vaginal and one rectal isolate with the same genotype. We report the presence of two to four different genotypes in 22 (61%) of the 36 GBS positive women and the presence of identical genotypes in both sites for all women but one. Conclusion: The combination of Lim broth and subculture on Granada medium provides high sensitivity for GBS detection from vaginal and rectal swabs from pregnant women. We established a higher genotypic diversity per individual than other studies, with up to four different genotypes among a maximum of 6 isolates per individual picked. Still, 18 of the 19 women with GBS from both rectum and vagina had at least one isolate from each sampling site with the same genotype.

Lirias

Crossref

Springer - Publisher Connector

eCommons@AKU

Directory of Open Access Journals

Ghent University Academic Bibliography

PubMed Central

Archivsystem Ask23