238 research outputs found

Estimating Effects and Making Predictions from Genome-Wide Marker Data

Author: Goddard Michael E.
Verbyla Klara
Visscher Peter M.
Wray Naomi R.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called "winner's curse." We argue that the classical definition of unbiasedness is not useful in this context and propose to use a different definition of unbiasedness that is a property of the estimator we advocate. We suggest an integrated approach to the estimation of the SNP effects and to the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used in the prediction of trait values in the genetics of livestock, which predates the availability of SNP data, can be applied to analysis of GWAS, giving better estimates of the SNP effects and predictions of phenotypic and genetic values in individuals.
Comment: Published at http://dx.doi.org/10.1214/09-STS306 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
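The contrast between fixed and random SNP effects in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the estimator the authors advocate: it assumes true effects are normally distributed with a known variance, that each estimate carries independent normal error with a known variance, and the names `shrink`, `effect_sd`, and `error_sd` are hypothetical. Under those assumptions, the expected true effect given an estimate is the estimate multiplied by effect variance / (effect variance + error variance).

```python
import random


def shrink(estimate, effect_var, error_var):
    """Expected true effect given a noisy estimate, assuming the true effect
    is N(0, effect_var) and the estimation error is N(0, error_var)."""
    return estimate * effect_var / (effect_var + error_var)


random.seed(1)
effect_sd, error_sd = 0.1, 0.1  # assumed values, chosen only for illustration

# Simulate true marker effects and their noisy fixed-effect estimates.
true_effects = [random.gauss(0.0, effect_sd) for _ in range(10000)]
estimates = [b + random.gauss(0.0, error_sd) for b in true_effects]

# "Winners": the 100 markers with the largest estimated effects in magnitude.
top = sorted(range(len(estimates)), key=lambda i: abs(estimates[i]), reverse=True)[:100]

mean_abs_estimate = sum(abs(estimates[i]) for i in top) / len(top)
mean_abs_true = sum(abs(true_effects[i]) for i in top) / len(top)
mean_abs_shrunk = sum(
    abs(shrink(estimates[i], effect_sd ** 2, error_sd ** 2)) for i in top
) / len(top)

print("mean |estimate| among winners:", mean_abs_estimate)
print("mean |true effect| among winners:", mean_abs_true)
print("mean |shrunken estimate| among winners:", mean_abs_shrunk)
```

Among the selected markers the raw estimates overstate the true effects, while the shrunken values are pulled back toward zero. In practice the variances are unknown and would have to be estimated.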

arXiv.org e-Print Archive

CiteSeerX

Crossref

University of Queensland eSpace

WGNAM: whole-genome nested association mapping

Author: Christopher Jack T.
Kelly Alison M.
Paccapelo Valeria
Verbyla Arūnas P.
Publication date: 01/01/2022

A powerful QTL analysis method for nested association mapping populations is presented. Based on a one-stage multi-locus model, it provides accurate predictions of founder-specific QTL effects.

Queensland DAF eResearch Archive

Transcriptomic analysis of wheat near-isogenic lines identifies PM19-A1 and A2 as candidates for a major dormancy QTL

Author: Barrero J.
Cavanagh C.
Gubler F.
Hayden M.
Huang B.
Rigault P.
Rosewarne G.
Stephen S.
Tibbits J.
Verbyla A.
Verbyla K.
Wang P.
Whan A.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

BACKGROUND: Next-generation sequencing technologies provide new opportunities to identify the genetic components responsible for trait variation. However, in species with large polyploid genomes, such as bread wheat, the ability to rapidly identify genes underlying quantitative trait loci (QTL) remains non-trivial. To overcome this, we introduce a novel pipeline that analyses, by RNA-sequencing, multiple near-isogenic lines segregating for a targeted QTL. RESULTS: We use this approach to characterize a major and widely utilized seed dormancy QTL located on chromosome 4AL. It exploits the power and mapping resolution afforded by large multi-parent mapping populations, whilst reducing complexity by using multi-allelic contrasts at the targeted QTL region. Our approach identifies two adjacent candidate genes within the QTL region belonging to the ABA-induced Wheat Plasma Membrane 19 family. One of them, PM19-A1, is highly expressed during grain maturation in dormant genotypes. The second, PM19-A2, shows changes in sequence causing several amino acid alterations between dormant and non-dormant genotypes. We confirm that PM19 genes are positive regulators of seed dormancy. CONCLUSIONS: The efficient identification of these strong candidates demonstrates the utility of our transcriptomic pipeline for rapid QTL to gene mapping. By using this approach we are able to provide a comprehensive genetic analysis of the major source of grain dormancy in wheat. Further analysis across a diverse panel of bread and durum wheats indicates that this important dormancy QTL predates hexaploid wheat. The use of these genes by wheat breeders could assist in the elimination of pre-harvest sprouting in wheat.
Jose M. Barrero, Colin Cavanagh, Klara L. Verbyla, Josquin F.G. Tibbits, Arunas P. Verbyla, B. Emma Huang, Garry M. Rosewarne, Stuart Stephen, Penghao Wang, Alex Whan, Philippe Rigault, Matthew J. Hayden, and Frank Gubler

Crossref

Adelaide Research & Scholarship

Springer - Publisher Connector

PubMed Central

University of Melbourne Institutional Repository

Covariance Clustering: Modelling Covariance in Designed Experiments When the Number of Variables is Greater than Experimental Units

Author: Forknall Clayton R.
Fox Glen P.
Jones Shirley H.
Kelly Alison M.
Kerr Edward
Nazarathy Yoni
Osama Sarah
Schulz Benjamin L.
Verbyla Arūnas P.
Yousif Adel
Publication date: 01/01/2023

The size and complexity of datasets resulting from comparative research experiments in the agricultural domain are constantly increasing. Often the number of variables measured in an experiment exceeds the number of experimental units composing the experiment. When there is a necessity to model the covariance relationships that exist between variables in these experiments, estimation difficulties can arise due to the resulting covariance structure being of reduced rank. A statistical method, based in a linear mixed model framework, is presented for the analysis of designed experiments where datasets are characterised by a greater number of variables than experimental units, and for which the modelling of complex covariance structures between variables is desired. Aided by a clustering algorithm, the method enables the estimation of covariance through the introduction of covariance clusters as random effects into the modelling framework, providing an extension of the traditional variance components model for building covariance structures. The method was applied to a multi-phase mass spectrometry-based proteomics experiment, with the aim of exploring changes in the proteome of barley grain over time during the malting process. The modelling approach provides a new linear mixed model-based method for the estimation of covariance structures between variables measured from designed experiments, when there are a small number of experimental units, or observations, informing covariance parameter estimates.

eScholarship - University of California

Queensland DAF eResearch Archive

Accuracy of genomic breeding values in multi-breed dairy cattle populations

Author: A Nejati-Javaremi
Amanda C Chamberlain
APW De Roos
AR Gilmour
B Villanueva
Ben J Hayes
BF Grisart
BJ Hayes
BL Harris
D Habier
JM Hickey
KL Verbyla
Klara Verbyla
M Haile-Mariam
ME Goddard
Mike E Goddard
N Ibánez-Escriche
P Scheet
Phillip J Bowman
PM VanRaden
RJ Spelman
RL Fernando
S Zhong
THE Meuwissen
Publication venue: BioMed Central
Publication date: 01/11/2009

Background: Two key findings from genomic selection experiments are 1) the reference population used must be very large to subsequently predict accurate genomic estimated breeding values (GEBV), and 2) prediction equations derived in one breed do not predict accurate GEBV when applied to other breeds. Both findings are a problem for breeds where the number of individuals in the reference population is limited. A multi-breed reference population is a potential solution, and here we investigate the accuracies of GEBV in Holstein dairy cattle and Jersey dairy cattle when the reference population is single breed or multi-breed. The accuracies were obtained both as a function of elements of the inverse coefficient matrix and from the realised accuracies of GEBV. Methods: Best linear unbiased prediction with a multi-breed genomic relationship matrix (GBLUP) and two Bayesian methods (BAYESA and BAYES_SSVS) which estimate individual SNP effects were used to predict GEBV for 400 and 77 young Holstein and Jersey bulls, respectively, from a reference population of 781 and 287 Holstein and Jersey bulls, respectively. Genotypes of 39,048 SNP markers were used. Phenotypes in the reference population were de-regressed breeding values for production traits. For the GBLUP method, expected accuracies calculated from the diagonal of the inverse of the coefficient matrix were compared to realised accuracies. Results: When GBLUP was used, expected accuracies from a function of elements of the inverse coefficient matrix agreed reasonably well with realised accuracies calculated from the correlation between GEBV and EBV in single breed populations, but not in multi-breed populations. When the Bayesian methods were used, realised accuracies of GEBV were up to 13% higher when the multi-breed reference population was used than when a pure breed reference was used. However, no consistent increase in accuracy across traits was obtained. Conclusion: Predicting genomic breeding values using a genomic relationship matrix is an attractive approach to implement genomic selection as expected accuracies of GEBV can be readily derived. However, in multi-breed populations, Bayesian approaches give higher accuracies for some traits. Finally, multi-breed reference populations will be a valuable resource to fine map QTL.
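The genomic relationship matrix used by GBLUP can be sketched as follows. The abstract does not say how the multi-breed matrix was built, so this is only an illustration of one widely used construction: genotypes coded as 0, 1, or 2 copies of an allele, centred by twice the allele frequency, with the cross-product scaled by 2 Σ p(1 − p). The function name, the toy genotypes, and the use of a single set of sample allele frequencies for all animals are assumptions.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from 0/1/2 genotype codes.

    genotypes: list of rows, one row per animal, one column per SNP.
    Allele frequencies are estimated from the sample itself (an assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Centre each genotype by its expected value 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(z[i][k] * z[l][k] for k in range(m)) / scale for l in range(n)]
        for i in range(n)
    ]


# Three animals, three SNPs (toy data).
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
G = genomic_relationship_matrix(toy)
for row in G:
    print(row)
```

In a multi-breed setting, the choice of allele frequencies (pooled or breed-specific) changes the matrix; this sketch uses pooled sample frequencies only for simplicity.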

Crossref

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

University of Queensland eSpace

Extended GMANOVA model with a linearly structured covariance matrix

Author: A. P. Verbyla
C. G. Khatri
D. Rosen von
D. von Rosen
H. Cramér
J. C. Lee
J. C.-S. Lee
J. Hu
J. Nzabanita
K. Filipiak
M. Ohlson
M. Singull
R. F. Pothoff
T. Kollo
Publication venue: 'Allerton Press'

Crossref

Identification of Mendelian inconsistencies between SNP and pedigree information of sibs

Author: B Hayes
C Israel
DS Falconer
GR Wiggans
H Geldermann
Han A Mulder
JA Woolliams
JE Powell
JI Weller
John WM Bastiaansen
KL Verbyla
L Aceto
L Huang
M Lynch
M Ron
Mario PL Calus
N Ihara
PM VanRaden
PM Visscher
SR Browning
SW Guo
T Meuwissen
WG Hill
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Methods: Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Results: Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Conclusions: Tests to remove Mendelian inconsistencies between sibs should be preceded by a test for parent-offspring inconsistencies. This parent-offspring test should not only consider parent-offspring pairs based on pedigree data, but also those based on SNP information. Both SIB tests could identify pairs of sibs with Mendelian inconsistencies. Based on type I and II error rates, counting opposing homozygotes between sibs (SIBCOUNT) appears slightly more precise than comparing genomic and pedigree relationships (SIBREL) to detect Mendelian inconsistencies between sibs.
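The counting step shared by the PAR-OFF and SIBCOUNT tests can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes genotypes are coded as 0, 1, or 2 copies of one allele with `None` for a missing call, and the function name and toy data are hypothetical. The abstract does not give the thresholds or the iterative removal rule, so neither is reproduced here.

```python
def count_opposing_homozygotes(geno_a, geno_b):
    """Count SNPs where one animal is homozygous for one allele (0) and the
    other is homozygous for the alternative allele (2).

    Missing calls (None) in either animal are skipped.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must cover the same SNPs")
    return sum(
        1
        for a, b in zip(geno_a, geno_b)
        if a is not None and b is not None and abs(a - b) == 2
    )


# Toy genotypes for a putative parent-offspring pair (hypothetical data).
parent = [0, 2, 1, 2, 0, None]
offspring = [2, 2, 1, 0, 1, 0]
print(count_opposing_homozygotes(parent, offspring))  # prints 2
```

For a true parent-offspring pair, opposing homozygotes should arise only from genotyping errors, whereas true sibs can legitimately show some, which is why a sib test has to compare the count against an expected distribution rather than against zero.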

Crossref

Springer - Publisher Connector

PubMed Central

Wageningen University & Research Publications

Estimated breeding values and association mapping for persistency and total milk yield using natural cubic smoothing splines

Author: A Frensham
AB Samoré
AB Smith
AP Verbyla
AR Gilmour
Arunas P Verbyla
B Harder
BL Muir
D Yamaji
EL Sherman
F Mekus
H Kiiveri
IM White
IMS White
J Appuhamy
J Kucerova
J Nadesalingam
JB Cole
JCM Dekkers
JH Jakobsen
JI Weller
JY Dai
K Togashi
Klara L Verbyla
KM Rasmussen
M Grossman
MS Ashwell
N Gengler
PJ Green
Q Xiao
R Development Core Team
R Rekaya
T Druet
WP Jones
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

Background: For dairy producers, a reliable description of lactation curves is a valuable tool for management and selection. From a breeding and production viewpoint, milk yield persistency and total milk yield are important traits. Understanding the genetic drivers for the phenotypic variation of both these traits could provide a means for improving these traits in commercial production. Methods: It has been shown that Natural Cubic Smoothing Splines (NCSS) can model the features of lactation curves with greater flexibility than the traditional parametric methods. NCSS were used to model the sire effect on the lactation curves of cows. The sire solutions for persistency and total milk yield were derived using NCSS and a whole-genome approach based on a hierarchical model was developed for a large association study using single nucleotide polymorphisms (SNP). Results: Estimated sire breeding values (EBV) for persistency and milk yield were calculated using NCSS. Persistency EBV were correlated with peak yield but not with total milk yield. Several SNP were found to be associated with both traits and these were used to identify candidate genes for further investigation. Conclusion: NCSS can be used to estimate EBV for lactation persistency and total milk yield, which in turn can be used in whole-genome association studies.
Klara L. Verbyla and Arunas P. Verbyla

Crossref

Adelaide Research & Scholarship

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Research Online