65 research outputs found

Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

Author: Wu Xiao-Lin
Xu Jiaqi
Feng Guofei
Wiggans George R.
Taylor Jeremy F.
He Jun
Qian Changsong
Qiu Jiansheng
Simpson Barry
Walker Jeremy
Bauck Stewart
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 03/02/2015

Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE but required more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with >3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNPs selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal.
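
The locus-averaged Shannon entropy mentioned in this abstract can be illustrated with a minimal sketch. It assumes biallelic SNPs and computes entropy from a single allele frequency per locus; the function names are hypothetical, and the abstract's uniformity adjustment, map-length term, and haplotype-averaged variant are not reproduced here, so this should not be read as the authors' objective function.

```python
import math


def locus_entropy(p):
    """Shannon entropy (in bits) of a biallelic locus with allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        # A monomorphic locus carries no information.
        return 0.0
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def locus_averaged_entropy(freqs):
    """Mean per-locus entropy over a candidate SNP set (assumed definition)."""
    if not freqs:
        raise ValueError("need at least one locus")
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


# Hypothetical allele frequencies for two candidate SNP sets.
common_snps = [0.45, 0.50, 0.40]
rare_snps = [0.05, 0.10, 0.02]
print(locus_averaged_entropy(common_snps) > locus_averaged_entropy(rare_snps))
```

Under this assumed definition, SNPs with intermediate allele frequencies score higher, which is consistent with the abstract's description of selecting highly informative SNPs.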

OpenEdition

Genomic evaluations with many more genotypes

Author: A Flaquer
A Toosi
B Harris
C Henderson
D Habier
G Wiggans
George R Wiggans
J Burdick
J Cole
J Taylor
J Yang
Jeffrey R O'Connell
K Weigel
KA Weigel
Kent A Weigel
M Calus
M Lund
M Sargolzaei
N Macciotta
P VanRaden
P Vanraden
Paul M VanRaden
PM VanRaden
R Villa-Angulo
T Druet
T Meuwissen
T Solberg
T Villumsen
Y Li
Z Liu
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Genomic evaluations in Holstein dairy cattle have quickly become more reliable over the last two years in many countries as more animals have been genotyped for 50,000 markers. Evaluations can also include animals genotyped with more or fewer markers using new tools such as the 777,000 or 2,900 marker chips recently introduced for cattle. Gains from more markers can be predicted using simulation, whereas strategies to use fewer markers have been compared using subsets of actual genotypes. The overall cost of selection is reduced by genotyping most animals at less than the highest density and imputing their missing genotypes using haplotypes. Algorithms to combine different densities need to be efficient because numbers of genotyped animals and markers may continue to grow quickly. Methods: Genotypes for 500,000 markers were simulated for the 33,414 Holsteins that had 50,000 marker genotypes in the North American database. Another 86,465 non-genotyped ancestors were included in the pedigree file, and linkage disequilibrium was generated directly in the base population. Mixed density datasets were created by keeping 50,000 (every tenth) of the markers for most animals. Missing genotypes were imputed using a combination of population haplotyping and pedigree haplotyping. Reliabilities of genomic evaluations using linear and nonlinear methods were compared. Results: Differing marker sets for a large population were combined with just a few hours of computation. About 95% of paternal alleles were determined correctly, and >95% of missing genotypes were called correctly. Reliability of breeding values was already high (84.4%) with 50,000 simulated markers. The gain in reliability from increasing the number of markers to 500,000 was only 1.6%, but more than half of that gain resulted from genotyping just 1,406 young bulls at higher density. Linear genomic evaluations had reliabilities 1.5% lower than the nonlinear evaluations with 50,000 markers and 1.6% lower with 500,000 markers. Conclusions: Methods to impute genotypes and compute genomic evaluations were affordable with many more markers. Reliabilities for individual animals can be modified to reflect success of imputation. Breeders can improve reliability at lower cost by combining marker densities to increase both the numbers of markers and animals included in genomic evaluation. Larger gains are expected from increasing the number of animals than the number of markers.
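
The mixed-density step described in the Methods (keeping every tenth of 500,000 markers to leave 50,000) can be sketched as simple list thinning. This is an illustration only: the function name and the `offset` parameter are hypothetical, and the abstract does not state which marker in each group of ten was retained.

```python
def thin_markers(markers, step=10, offset=0):
    """Keep every `step`-th marker, starting at index `offset` (assumed scheme)."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return markers[offset::step]


# Marker indices stand in for real marker identifiers.
dense_panel = list(range(500_000))
sparse_panel = thin_markers(dense_panel)
print(len(sparse_panel))
```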

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Design of a Bovine Low-Density SNP Array Optimized for Imputation

Author: AL Van Eenennaam
André Eggen
Ben J. Hayes
CJ Edwards
Curtis P. Van Tassell
Cynthia T. Lawley
Didier Boichard
EL Heffner
George R. Wiggans
GR Wiggans
HD Daetwyler
Hoyoung Chung
J Johnston
JE Pryce
JM Hickey
KA Weigel
Karine A. Viaud-Martinez
Kimberly J. Gietzen
LK Matukumalli
LR Schaeffer
P Scheet
Paul M. VanRaden
PM VanRaden
R Dassonneville
Romain Dassonneville
SR Browning
Sébastien Fritz
T Druet
Tad S. Sonstegard
THE Meuwissen
Xavier David
Zhanjiang Liu
Publication venue: Public Library of Science
Publication date: 28/03/2012

The Illumina BovineLD BeadChip was designed to support imputation to higher density genotypes in dairy and beef breeds by including single-nucleotide polymorphisms (SNPs) that had a high minor allele frequency as well as uniform spacing across the genome except at the ends of the chromosome where densities were increased. The chip also includes SNPs on the Y chromosome and mitochondrial DNA loci that are useful for determining subspecies classification and certain paternal and maternal breed lineages. The total number of SNPs was 6,909. Accuracy of imputation to Illumina BovineSNP50 genotypes using the BovineLD chip was over 97% for most dairy and beef populations. The BovineLD imputations were about 3 percentage points more accurate than those from the Illumina GoldenGate Bovine3K BeadChip across multiple populations. The improvement was greatest when neither parent was genotyped. The minor allele frequencies were similar across taurine beef and dairy breeds as was the proportion of SNPs that were polymorphic. The new BovineLD chip should facilitate low-cost genomic selection in taurine beef and dairy cattle.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Queensland eSpace

FigShare

Imputation of Missing Genotypes from Sparse to High Density Using Long-Range Phasing

Author: Arias
Ben J. Hayes
Clark
George R. Wiggans
Hans D. Daetwyler
John A. Woolliams
Meuwissen
Mike E. Goddard
Nejati-Javaremi
Villumsen
Weeks
Publication venue: Genetics Society of America
Publication date: 01/01/2011

Related individuals share potentially long chromosome segments that trace to a common ancestor. We describe a phasing algorithm (ChromoPhase) that utilizes this characteristic of finite populations to phase large sections of a chromosome. In addition to phasing, our method imputes missing genotypes in individuals genotyped at lower marker density when more densely genotyped relatives are available. ChromoPhase uses a pedigree to collect an individual's (the proband) surrogate parents and offspring and uses genotypic similarity to identify its genomic surrogates. The algorithm then cycles through the relatives and genomic surrogates one at a time to find shared chromosome segments. Once a segment has been identified, any missing information in the proband is filled in with information from the relative. We tested ChromoPhase in a simulated population consisting of 400 individuals at a marker density of 1500/M, which is approximately equivalent to a 50K bovine single nucleotide polymorphism chip. In simulated data, 99.9% of loci were correctly phased and, when imputing from 100 to 1500 markers, more than 87% of missing genotypes were correctly imputed. Performance increased when the number of generations available in the pedigree increased, but was reduced when the sparse genotype contained fewer loci. However, in simulated data, ChromoPhase correctly imputed at least 12% more genotypes than fastPHASE, depending on sparse marker density. We also tested the algorithm in a real Holstein cattle data set to impute 50K genotypes in animals with a sparse 3K genotype. In these data 92% of genotypes were correctly imputed in animals with a genotyped sire. We evaluated the accuracy of genomic predictions with the dense, sparse, and imputed simulated data sets and show that the reduction in genomic evaluation accuracy is modest even with imperfectly imputed genotype data. Our results demonstrate that imputation of missing genotypes, and potentially full genome sequence, using long-range phasing is feasible.

Crossref

PubMed Central

Edinburgh Research Explorer

Wageningen University & Research Publications

Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array

Author: Bickhart Derek M
Boichard Didier A
DeNise Sue
Eggen André
Fritz Sébastien
Hou Yali
Hvinden Miranda L
Li Congjun
Liu George E
Song Jiuzhou
Sonstegard Tad S
Van Tassell Curtis P
Wiggans George R
Publication venue: Springer Nature
Publication date: 01/01/2012

Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases. In this study using the high density BovineHD SNP array, we performed high resolution CNV analyses on both Btau_4.0 and UMD3.1 with 674 animals of 27 cattle breeds. We first compared CNV results derived from these two different SNP array platforms on Btau_4.0. With two thirds of the animals shared between studies, on Btau_4.0 we identified 3,346 candidate CNV regions representing 142.7 megabases (~4.70%) of the genome. With a similar total length but 5 times more event counts, the average CNVR length of current Btau_4.0 dataset is significantly shorter than the previous one (42.7 kb vs. 205 kb). Although subsets of these two results overlapped, 64% (91.6 megabases) of current dataset was not present in the previous study. We also performed similar analyses on UMD3.1 using these BovineHD SNP array results. Approximately 50% more and 20% longer CNVs were called on UMD3.1 as compared to those on Btau_4.0. However, a comparable result of CNVRs (3,438 regions with a total length 146.9 megabases) was obtained. We suspect that these results are due to the UMD3.1 assembly's efforts of placing unplaced contigs and removing unmerged alleles. Selected CNVs were further experimentally validated, achieving a 73% PCR validation rate, which is considerably higher than the previous validation rate. About 20-45% of CNV regions overlapped with cattle RefSeq genes and Ensembl genes. Panther and IPA analyses indicated that these genes provide a wide spectrum of biological processes involving immune system, lipid metabolism, cell, organism and system development. We present a comprehensive result of cattle CNVs at a higher resolution and sensitivity. We identified over 3,000 candidate CNV regions on both Btau_4.0 and UMD3.1, further compared current datasets with previous results, and examined the impacts of genome assemblies on CNV calling.
https://doi.org/10.1186/1471-2164-13-37
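
The average CNV region lengths quoted in this abstract follow from the reported totals and region counts. A minimal arithmetic check, assuming 1 megabase = 1,000 kilobases and using a hypothetical helper name, is sketched below; small differences from the quoted 42.7 kb are expected because the inputs are already rounded.

```python
def mean_region_length_kb(total_megabases, n_regions):
    """Average region length in kilobases from a total length in megabases."""
    if n_regions <= 0:
        raise ValueError("n_regions must be positive")
    return total_megabases * 1000.0 / n_regions


# Figures as reported in the abstract for Btau_4.0.
previous_kb = mean_region_length_kb(139.8, 682)    # BovineSNP50 study
current_kb = mean_region_length_kb(142.7, 3346)    # BovineHD study
count_ratio = 3346 / 682                           # "5 times more event counts"

print(round(previous_kb, 1), round(current_kb, 1), round(count_ratio, 1))
```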

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

ProdInra

Genome-wide association analysis of thirty one production, health, reproduction and body conformation traits in contemporary U.S. Holstein cows

Author: AV Zimin
B Grisart
Bovine HapMap Consortium
Brian A Crooker
CP Van Tassell
Curtis P Van Tassell
D Kolbehdari
DE Dostal
G Sahana
George R Wiggans
GR Wiggans
Holstein Association USA
J Hendrickx
JB Cole
Jing Yang
John B Cole
L Ma
Lakshmi K Matukumalli
Li Ma
LK Matukumalli
M Arenas
MD Mai
ME Goddard
P Cohen
PM VanRaden
R Development Core Team
S Leimkühler
S Shenolikar
Shengwen Wang
Tad S Sonstegard
THE Meuwissen
Thomas J Lawlor
TS Sonstegard
U Brandt
Y Mao
Yang Da
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Genome-wide association analysis is a powerful tool for annotating phenotypic effects on the genome and knowledge of genes and chromosomal regions associated with dairy phenotypes is useful for genome and gene-based selection. Here, we report results of a genome-wide analysis of predicted transmitting ability (PTA) of 31 production, health, reproduction and body conformation traits in contemporary Holstein cows. Results: Genome-wide association analysis identified a number of candidate genes and chromosome regions associated with 31 dairy traits in contemporary U.S. Holstein cows. Highly significant genes and chromosome regions include: BTA13's GNAS region for milk, fat and protein yields; BTA7's INSR region and BTAX's LOC520057 and GRIA3 for daughter pregnancy rate, somatic cell score and productive life; BTA2's LRP1B for somatic cell score; BTA14's DGAT1-NIBP region for fat percentage; BTA1's FKBP2 for protein yields and percentage, BTA26's MGMT and BTA6's PDGFRA for protein percentage; BTA18's 53.9-58.7 Mb region for service-sire and daughter calving ease and service-sire stillbirth; BTA18's PGLYRP1-IGFL1 region for a large number of traits; BTA18's LOC787057 for service-sire stillbirth and daughter calving ease; BTA15's CD82, BTA23's DST and the MOCS1-LRFN2 region for daughter stillbirth; and BTAX's LOC520057 and GRIA3 for daughter pregnancy rate. For body conformation traits, BTA11, BTAX, BTA10, BTA5, and BTA26 had the largest concentrations of SNP effects, and PHKA2 of BTAX and REN of BTA16 had the most significant effects for body size traits. For body shape traits, BTAX, BTA19 and BTA3 were most significant. Udder traits were affected by BTA16, BTA22, BTAX, BTA2, BTA10, BTA11, BTA20, BTA22 and BTA25, teat traits were affected by BTA6, BTA7, BTA9, BTA16, BTA11, BTA26 and BTA17, and feet/legs traits were affected by BTA11, BTA13, BTA18, BTA20, and BTA26. Conclusions: Genome-wide association analysis identified a number of genes and chromosome regions associated with 31 production, health, reproduction and body conformation traits in contemporary Holstein cows. The results provide useful information for annotating phenotypic effects on the dairy genome and for building consensus of dairy QTL effects.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Updating test-day milk yield factors for use in genetic evaluations and dairy production systems: a comprehensive review

Author: Asha M. Miles
Curtis P. Van Tassell
George R. Wiggans
H. Duan Norman
Javier Burchard
Jay Mattison
João Dürr
Malia J. Caputo
Ransom L. Baldwin
Steven Sievert
Xiao-Lin Wu
Publication venue: Frontiers Media S.A.
Publication date: 01/12/2023

Various methods have been proposed to estimate daily yield from partial yields, primarily to deal with unequal milking intervals. This paper offers an exhaustive review of daily milk yields, the foundation of lactation records. Seminal advancements in the late 20th century concentrated on two main adjustment metrics: additive correction factors (ACF) and multiplicative correction factors (MCF). An ACF model provides additive adjustments to two times AM or PM milk yield, which then becomes the estimated daily yield, whereas an MCF is a ratio of daily yield to the yield from a single milking. Recent studies highlight the potential of alternative approaches, such as exponential regression and other nonlinear models. Biologically, milk secretion rates are not linear throughout the entire milking interval, influenced by the internal mammary gland pressure. Consequently, nonlinear models are appealing for estimating daily milk yields as well. MCFs and ACFs are typically determined for discrete milking interval classes. Nonetheless, large discrete intervals can introduce systematic biases. A universal solution for deriving continuous correction factors has been proposed, ensuring reduced bias and enhanced daily milk yield estimation accuracy. When leveraging test-day milk yields for genetic evaluations in dairy cattle, two predominant statistical models are employed: lactation and test-day yield models. A lactation model capitalizes on the high heritability of total lactation yields, aligning closely with dairy producers’ needs because the total amount of milk production in a lactation directly determines farm revenue. However, a lactation yield model without harnessing all test-day records may ignore vital data about the shapes of lactation curves needed for informed breeding decisions. In contrast, a test-day model emphasizes individual test-day data, accommodating various intervals and recording plans and allowing the estimation of environmental effects on specific test days. In the United States, the patenting of test-day models in 1993 formerly restricted the use of test-day models to regional and unofficial evaluations by the patent holders. Estimated test-day milk yields have been used as if they were accurate depictions of actual milk yields, neglecting possible estimation errors. The potential consequences of this practice for subsequent genetic evaluations have not been sufficiently addressed. Moving forward, there are still numerous questions and challenges in this domain.
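
The two correction-factor definitions in this abstract can be written out as a short sketch. It follows the wording above (an ACF is added to two times a single-milking yield; an MCF multiplies a single-milking yield). The function names and all numeric values are hypothetical illustrations, not factors from the review, and in practice the factors depend on the milking interval class.

```python
def daily_yield_acf(single_milking_yield, additive_factor):
    """Estimated daily yield: two times the AM or PM yield plus an additive adjustment."""
    return 2.0 * single_milking_yield + additive_factor


def daily_yield_mcf(single_milking_yield, multiplicative_factor):
    """Estimated daily yield: a single-milking yield scaled by a ratio-type factor."""
    return multiplicative_factor * single_milking_yield


# Hypothetical AM yield (kg) and made-up correction factors.
am_yield = 12.0
print(daily_yield_acf(am_yield, additive_factor=1.5))
print(daily_yield_mcf(am_yield, multiplicative_factor=2.0))
```

With equal milking intervals one would expect an additive factor near zero and a multiplicative factor near two, so the two estimates coincide; they diverge as intervals become unequal.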

Directory of Open Access Journals

Genomic characteristics of cattle copy number variations

Background: Copy number variation (CNV) represents another important source of genetic variation complementary to single nucleotide polymorphism (SNP). High-density SNP array data have been routinely used to detect human CNVs, many of which have significant functional effects on gene expression and human diseases. In the dairy industry, a large quantity of SNP genotyping results are becoming available and can be used for CNV discovery to understand and accelerate genetic improvement for complex traits. Results: We performed a systematic analysis of CNV using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the pedigree information, we identified 682 candidate CNV regions, which represent 139.8 megabases (~4.60%) of the genome. Selected CNVs were further experimentally validated and we found that copy number "gain" CNVs were predominantly clustered in tandem rather than existing as interspersed duplications. Many CNV regions (~56%) overlap with cattle genes (1,263), which are significantly enriched for immunity, lactation, reproduction and rumination. The overlap of this new dataset and other published CNV studies was less than 40%; however, our discovery of large, high frequency (>5% of animals surveyed) CNV regions showed 90% agreement with other studies. These results highlight the differences and commonalities between technical platforms. Conclusions: We present a comprehensive genomic analysis of cattle CNVs derived from SNP data which will be a valuable genomic variation resource. Combined with SNP detection assays, gene-containing CNV regions may help identify genes undergoing artificial selection in domesticated animals.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

Digital Repository at the University of Maryland

Anglo-Dutch Premium Auctions in Eighteenth-Century Amsterdam

Author: Alvin E Roth
Amsterdamsch Effectenblad
Ann Carlos
Anne Goldgar
Audrey Hu
Avner Greif
C A Schillemans
Carsten Burhop
Christiaan Bochove
Christiaan van Bochove
Dan Levin
Daniel Quint
Darrell Duffie
De Marchi
Dietrich Ebeling
Domenic Vitiello
Edwin Meerkerk
Effectenblad Nieuw Algemeen
Eric Maskin
Ernst Baasch
Fernand Braudel
Georg Thielmann
George M Welling
H J Hoes
Helmuts Azacis
J Mak Van Waay
J Neyman
Jacob Goeree
Jan Vries
Jean Ricard
Jeffrey M Wooldridge
Johannes Postma
Johannes Voort
John Montias
Joost Jonker
Kenneth Hendricks
L A Prooije
Larry Neal
Lars Boerner
Le Moine De L'
Leydse Courant
Lodewijk Petram
Lynn Lopucki
Marie Kr�hne
Martin Shubik
Maureen O'
Mikoo Malinowski
Nederlandsche De Maandelykse
Nicole Steen
Noordkerk
Oskar Gelderblom
P G M Dickson
P J Middelhoven
Paul Klemperer
Paul Milgrom
Peter Koudijs
Pierre-Cyrille Hautcoeur
Pieter Scheltema
Prijs-Courant Der Effecten
R Liesker
Ralph Cassidy
Ranald C Michie
Remonstrantse Aca
Richard Engelbrecht-Wiggans
Richard Mclean
Robert S Lopez
Robert Sobel
Roger Myerson
Rosa Philips
Sushil Bikchandani
Vijay Krishna
W P Sautijn Kluit
Wegener Sleeswijk
Wilhelm Stieda
William R Scott
Publication venue: Elsevier BV
Publication date: 01/01/2012

Crossref

Genetic selection: Evaluation and methods

Author: Gengler Nicolas
Wiggans George R.
Publication venue: Elsevier BV
Publication date: 01/01/2011

Peer reviewed. The ultimate goal of animal selection is to create a new generation of animals that are superior to the current population. Superior is interpreted broadly to include functionality of animals, cost reduction of production, consumer perception, quality of products, and reduced environmental impact. These factors contribute to overall sustainability and long-term economic profitability of animal production. An essential element of selection is a genetic evaluation system for the detection of superior animals to be used to produce future generations. Current genetic evaluations use phenotypic records and advanced statistical methods to separate genetic and environmental effects. These traditional methods are complemented by DNA-based technologies that provide genetic information at a molecular level. Genetic evaluation systems are highly complex and involve collection of data from thousands of farms, determination of milk characteristics in laboratories, processing and storage of data in regional computing centers, and application of advanced statistical procedures to estimate genetic merit. Genetic evaluations are widely distributed and are the primary determiner of the value of semen and embryos. Internationally, bull evaluations are combined across countries so that each country has a single national ranking of all bulls worldwide. Selection decisions on farms and by artificial insemination organizations are highly dependent on that genetic information. This article covers aspects of genetic selection that stretch from basic data collection (including identification systems), traits recorded and evaluated, and characteristics of current and future evaluation systems to new DNA-based technologies.

Open Repository and Bibliography - Liège