Prediction of a deletion copy number variant by a dense SNP panel
<p>Abstract</p> <p>Background</p> <p>Copy number variation (CNV), a recently recognized type of genetic variation, has been detected in mammalian genomes, including the cattle genome. This form of variation can potentially cause phenotypic variation. Our objective was to determine whether dense single nucleotide polymorphism (SNP) panels can capture the genetic variation due to a simple bi-allelic CNV, with the prospect of including the effect of such structural variations in genomic predictions.</p> <p>Methods</p> <p>A deletion-type CNV on bovine chromosome 6 was predicted from its neighboring SNP with a multiple regression model. Our dataset consisted of CNV genotypes of 1,682 cows, along with 100 surrounding SNP genotypes. A prediction model was fitted considering 10 to 100 surrounding SNP, and the accuracy obtained directly from the model was confirmed by cross-validation.</p> <p>Results and conclusions</p> <p>The accuracy of prediction increased with an increasing number of SNP in the model, and the predicted accuracies were similar to those obtained by cross-validation. A substantial increase in accuracy was observed when the number of SNP increased from 10 to 50; thereafter the increase was smaller, reaching the highest accuracy (0.94) with 100 surrounding SNP. Thus, we conclude that the genotype of a deletion-type CNV and its putative QTL effect can be predicted with a maximum accuracy of 0.94 from surrounding SNP. This high prediction accuracy suggests that genetic variation due to simple deletion CNV is well captured by dense SNP panels. Since genomic selection relies on the availability of a dense marker panel with markers in close linkage disequilibrium with the QTL in order to predict their genetic values, we also discuss opportunities for genomic selection to predict the effects of CNV by dense SNP panels when CNV cause variation in quantitative traits.</p>
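The workflow described in this abstract (regress a CNV genotype on surrounding SNP genotypes, then confirm the accuracy by cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the genotypes are simulated rather than taken from the study, the function names are hypothetical, ordinary least squares stands in for the authors' multiple regression model, and accuracy is taken to be the correlation between predicted and observed CNV genotypes, which the abstract does not define.

```python
import numpy as np


def simulate_genotypes(n_animals, n_snp, seed=0):
    """Simulate hypothetical SNP and CNV genotypes (0/1/2 allele counts)."""
    rng = np.random.default_rng(seed)
    # Deletion allele (1) or normal allele (0) on each of two haplotypes.
    cnv_hap = rng.binomial(1, 0.3, size=(n_animals, 2))
    # Each SNP allele copies the deletion allele half of the time and is
    # random otherwise, which creates linkage disequilibrium with the CNV.
    random_allele = rng.binomial(1, 0.5, size=(n_animals, 2, n_snp))
    copies_cnv = rng.random((n_animals, 2, n_snp)) < 0.5
    snp_hap = np.where(copies_cnv, cnv_hap[:, :, None], random_allele)
    return snp_hap.sum(axis=1).astype(float), cnv_hap.sum(axis=1).astype(float)


def fit_predict(x_train, y_train, x_test):
    """Fit a least-squares regression with intercept and predict x_test."""
    a_train = np.column_stack([np.ones(len(x_train)), x_train])
    a_test = np.column_stack([np.ones(len(x_test)), x_test])
    beta, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)
    return a_test @ beta


def cv_accuracy(snp, cnv, n_folds=5, seed=1):
    """Correlation between observed and out-of-fold predicted CNV genotypes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(cnv)), n_folds)
    pred = np.empty_like(cnv)
    for test_idx in folds:
        train_mask = np.ones(len(cnv), dtype=bool)
        train_mask[test_idx] = False
        pred[test_idx] = fit_predict(snp[train_mask], cnv[train_mask], snp[test_idx])
    return np.corrcoef(pred, cnv)[0, 1]


if __name__ == "__main__":
    snp, cnv = simulate_genotypes(n_animals=500, n_snp=20)
    for k in (2, 10, 20):
        print(k, "SNP:", round(cv_accuracy(snp[:, :k], cnv), 3))
```

Running the script prints the cross-validated accuracy for an increasing number of SNP in the model. The simulated values say nothing about the accuracies reported in the abstract.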
Comparison of linear mixed model analysis and genealogy-based haplotype clustering with a Bayesian approach for association mapping in a pedigreed population
Comparison of Genome-Wide Association Methods in Analyses of Admixed Populations with Complex Familial Relationships
<div><p>Population structure is known to cause false-positive detection in association studies. We compared the power, precision, and type-I error rates of various association models in analyses of a simulated dataset with structure at the population (admixture from two populations; <i>P</i>) and family (<i>K</i>) levels. We also compared type-I error rates among models in analyses of publicly available human and dog datasets. The models corrected for none, one, or both structure levels. Correction for <i>K</i> was performed with linear mixed models incorporating familial relationships estimated from pedigrees or genetic markers. Linear models that ignored <i>K</i> were also tested. Correction for <i>P</i> was performed using principal component or structured association analysis. In analyses of simulated and real data, linear mixed models that corrected for <i>K</i> were able to control for type-I error, regardless of whether they also corrected for <i>P</i>. In contrast, correction for <i>P</i> alone in linear models was insufficient. The power and precision of linear mixed models with and without correction for <i>P</i> were similar. Furthermore, power, precision, and type-I error rate were comparable in linear mixed models incorporating pedigree and genomic relationships. In summary, in association studies using samples with both <i>P</i> and <i>K</i>, ancestries estimated using principal components or structured assignment were not sufficient to correct type-I errors. In such cases type-I errors may be controlled by use of linear mixed models with relationships derived from either pedigree or from genetic markers.</p></div>
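To make "familial relationships estimated from genetic markers" concrete, the sketch below builds a marker-based relationship matrix of the kind a linear mixed model could take as its covariance structure. The abstract does not say which estimator the authors used. This example assumes the widely used centered-genotype form G = ZZ'/(2 Σ p(1 − p)), and the function name and simulated genotypes are hypothetical.

```python
import numpy as np


def genomic_relationship(genotypes):
    """Marker-based relationship matrix from 0/1/2 genotype codes.

    Assumes allele frequencies are estimated from the sample itself and
    that no marker is monomorphic (otherwise the scaling term shrinks).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0             # allele frequency per marker
    z = m - 2.0 * p                      # center each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))  # expected total marker variance
    return z @ z.T / scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs = rng.uniform(0.1, 0.9, size=1000)             # hypothetical frequencies
    genotypes = rng.binomial(2, freqs, size=(200, 1000))  # unrelated animals
    g = genomic_relationship(genotypes)
    print(g.shape, round(float(np.mean(np.diag(g))), 3))
```

For unrelated, non-inbred individuals the diagonal of this matrix averages close to 1 and the off-diagonal elements scatter around 0. Related individuals would show positive off-diagonal values.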
Distribution of individuals' ancestries after admixture (100 replicates).
Absolute differences in allele frequencies of single-nucleotide polymorphism (SNP) markers between populations (100 replicates).
Schematic representation of the simulation.
Quantile-quantile plot of −log<sub>10</sub> <i>p</i>-values for association tests of the human GOLDN dataset using different models.
Absolute error (cM) in quantitative trait loci localization.
<p>Precision is given as the absolute genetic distance between simulated and detected quantitative trait loci (±1 cM).</p><p><i>LMMped = Linear Mixed Model Including Pedigree-Based Relationship, LMMpca = Principal Component Analysis in a Linear Mixed Model, LMMstr = Linear Mixed Model with STRUCTURE, LMMgmat = Linear Mixed Model Including Genomic Relationship.</i></p>
Average number of significant single-nucleotide polymorphisms (SNPs; 100 replicates) in five chromosomes (10,000 SNPs) without simulated quantitative trait loci.
<p>Standard errors are given in parentheses. The average number of significant SNPs (S<sub>obs</sub>) in 100 replicates was compared with the expected number (S<sub>exp</sub>) at different significance levels using <i>t</i>-tests (H<sub>0</sub>: S<sub>obs</sub> = S<sub>exp</sub>; H<sub>1</sub>: S<sub>obs</sub> &gt; S<sub>exp</sub>; *<i>p</i> &lt; 0.05, **<i>p</i> &lt; 0.01). A significance level of 0.000005 corresponds to a nominal significance level of 0.05 after Bonferroni correction for 10,000 tests.</p><p><i>LMMped = Linear Mixed Model Including Pedigree-Based Relationship, LMMstr = Linear Mixed Model with STRUCTURE, LMMpca = Principal Component Analysis in a Linear Mixed Model, LMMgmat = Linear Mixed Model Including Genomic Relationship, LM = Linear Model, LMstr = Linear Model with STRUCTURE, LMpca = Principal Component Analysis in a Linear Model, P = admixture, K = Familial relationships.</i></p>
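The arithmetic behind this caption can be checked with a short sketch. The function names are hypothetical, and the one-sample t statistic is only assumed to match the t-tests mentioned in the caption, which does not spell out its form. No replicate counts from the study are used.

```python
import math
import statistics


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests


def expected_significant(level, n_tests):
    """Expected number of significant tests (S_exp) when every null is true."""
    return level * n_tests


def one_sample_t(counts, expected):
    """t statistic for H0: mean(counts) = expected (one-sided H1: mean > expected)."""
    se = statistics.stdev(counts) / math.sqrt(len(counts))
    return (statistics.mean(counts) - expected) / se


if __name__ == "__main__":
    print(bonferroni_threshold(0.05, 10_000))  # per-test level for 10,000 tests
    print(expected_significant(0.05, 10_000))  # S_exp at a nominal level of 0.05
```

With 10,000 SNPs and no simulated quantitative trait loci, every significant SNP is a false positive, so a model controls type-I error when its S<sub>obs</sub> does not exceed the S<sub>exp</sub> computed this way.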
Differences in mean phenotypes of two populations after separation for 30 generations (100 replicates).