16 research outputs found

    Prediction of a deletion copy number variant by a dense SNP panel

    <p>Abstract</p> <p>Background</p> <p>A newly recognized type of genetic variation, Copy Number Variation (CNV), is detected in mammalian genomes, e.g. the cattle genome. This form of variation can potentially cause phenotypic variation. Our objective was to determine whether dense SNP (single nucleotide polymorphisms) panels can capture the genetic variation due to a simple bi-allelic CNV, with the prospect of including the effect of such structural variations into genomic predictions.</p> <p>Methods</p> <p>A deletion type CNV on bovine chromosome 6 was predicted from its neighboring SNP with a multiple regression model. Our dataset consisted of CNV genotypes of 1,682 cows, along with 100 surrounding SNP genotypes. A prediction model was fitted considering 10 to 100 surrounding SNP and the accuracy obtained directly from the model was confirmed by cross-validation.</p> <p>Results and conclusions</p> <p>The accuracy of prediction increased with an increasing number of SNP in the model and the predicted accuracies were similar to those obtained by cross-validation. A substantial increase in accuracy was observed when the number of SNP increased from 10 to 50 but thereafter the increase was smaller, reaching the highest accuracy (0.94) with 100 surrounding SNP. Thus, we conclude that the genotype of a deletion type CNV and its putative QTL effect can be predicted with a maximum accuracy of 0.94 from surrounding SNP. This high prediction accuracy suggests that genetic variation due to simple deletion CNV is well captured by dense SNP panels. Since genomic selection relies on the availability of a dense marker panel with markers in close linkage disequilibrium to the QTL in order to predict their genetic values, we also discuss opportunities for genomic selection to predict the effects of CNV by dense SNP panels, when CNV cause variation in quantitative traits.</p>
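    The prediction scheme described in this abstract (regress the CNV genotype on the k nearest SNP, then check accuracy by cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `cv_accuracy`, the fold count, the use of correlation as the accuracy measure, and the toy data are all assumptions, and the SNP columns are assumed to be ordered by distance from the CNV.

```python
import numpy as np

def cv_accuracy(snps, cnv, n_snps, n_folds=5, seed=0):
    """Correlation between observed CNV genotypes and their
    cross-validated predictions from the first n_snps columns."""
    rng = np.random.default_rng(seed)
    n = len(cnv)
    folds = np.array_split(rng.permutation(n), n_folds)
    # Design matrix: intercept plus the n_snps nearest SNP genotypes.
    X = np.column_stack([np.ones(n), snps[:, :n_snps]])
    pred = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Ordinary least squares fit on the training animals only.
        beta, *_ = np.linalg.lstsq(X[train], cnv[train], rcond=None)
        pred[test_idx] = X[test_idx] @ beta
    return float(np.corrcoef(pred, cnv)[0, 1])

# Toy data (made up for illustration): 400 animals, 10 SNP coded 0/1/2,
# and a "CNV genotype" that depends on the two nearest SNP plus noise.
rng = np.random.default_rng(1)
snps = rng.binomial(2, 0.5, size=(400, 10)).astype(float)
cnv = snps[:, 0] + snps[:, 1] + rng.normal(0, 0.1, size=400)

acc_1 = cv_accuracy(snps, cnv, n_snps=1)
acc_2 = cv_accuracy(snps, cnv, n_snps=2)
print(f"accuracy with 1 SNP: {acc_1:.2f}, with 2 SNP: {acc_2:.2f}")
```

    In this toy setting, accuracy rises when the second informative SNP enters the model, which mirrors the qualitative pattern reported in the abstract without reproducing its numbers.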

    Comparison of Genome-Wide Association Methods in Analyses of Admixed Populations with Complex Familial Relationships

    No full text
    <div><p>Population structure is known to cause false-positive detection in association studies. We compared the power, precision, and type-I error rates of various association models in analyses of a simulated dataset with structure at the population (admixture from two populations; <i>P</i>) and family (<i>K</i>) levels. We also compared type-I error rates among models in analyses of publicly available human and dog datasets. The models corrected for none, one, or both structure levels. Correction for <i>K</i> was performed with linear mixed models incorporating familial relationships estimated from pedigrees or genetic markers. Linear models that ignored <i>K</i> were also tested. Correction for <i>P</i> was performed using principal component or structured association analysis. In analyses of simulated and real data, linear mixed models that corrected for <i>K</i> were able to control for type-I error, regardless of whether they also corrected for <i>P</i>. In contrast, correction for <i>P</i> alone in linear models was insufficient. The power and precision of linear mixed models with and without correction for <i>P</i> were similar. Furthermore, power, precision, and type-I error rate were comparable in linear mixed models incorporating pedigree and genomic relationships. In summary, in association studies using samples with both <i>P</i> and <i>K</i>, ancestries estimated using principal components or structured assignment were not sufficient to correct type-I errors. In such cases type-I errors may be controlled by use of linear mixed models with relationships derived from either pedigree or from genetic markers.</p></div>

    Distribution of individuals' ancestries after admixture (100 replicates).

    No full text
    <p>Distribution of individuals' ancestries after admixture (100 replicates).</p>

    Absolute differences in allele frequencies of single-nucleotide polymorphism (SNP) markers between populations (100 replicates).

    No full text
    <p>Absolute differences in allele frequencies of single-nucleotide polymorphism (SNP) markers between populations (100 replicates).</p>

    Quantile-quantile plot of −log<sub>10</sub><i>p</i>-values for association tests of the human GOLDN dataset using different models.

    No full text
    <p>Quantile-quantile plot of −log<sub>10</sub><i>p</i>-values for association tests of the human GOLDN dataset using different models.</p>

    Absolute error (cM) in quantitative trait loci localization.

    No full text
    <p>Precision is given as the absolute genetic distance between simulated and detected quantitative trait loci (±1 cM).</p><p><i>LMMped = Linear Mixed Model Including Pedigree-Based Relationship, LMMpca = Principal Component Analysis in a Linear Mixed Model, LMMstr = Linear Mixed Model with STRUCTURE, LMMgmat = Linear Mixed Model Including Genomic Relationship.</i></p>

    Average number of significant single-nucleotide polymorphisms (SNPs; 100 replicates) in five chromosomes (10,000 SNPs) without simulated quantitative trait loci.

    No full text
    <p>Standard errors are given in parentheses. The average number of significant SNPs (S<sub>obs</sub>) in 100 replicates was compared with the expected number (S<sub>exp</sub>) at different significance levels using <i>t</i>-tests. (H<sub>0</sub>: S<sub>obs</sub> = S<sub>exp</sub>; H<sub>1</sub>: S<sub>obs</sub> &gt; S<sub>exp</sub>; *<i>p</i> &lt; 0.05, **<i>p</i> &lt; 0.01). A significance level of 0.000005 corresponds to a nominal significance level of 0.05 after Bonferroni correction for 10,000 tests.</p><p><i>LMMped = Linear Mixed Model Including Pedigree-Based Relationship, LMMstr = Linear Mixed Model with STRUCTURE, LMMpca = Principal Component Analysis in a Linear Mixed Model, LMMgmat = Linear Mixed Model Including Genomic Relationship, LM = Linear Model, LMstr = Linear Model with STRUCTURE, LMpca = Principal Component Analysis in a Linear Model, P = admixture, K = Familial relationships.</i></p>
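    The arithmetic behind this caption can be sketched in a few lines. The Bonferroni threshold follows directly from the text (0.05 divided by 10,000 tests). The rest is an assumption for illustration: S<sub>exp</sub> is taken to be the significance level times the number of SNPs tested, the replicate counts are made up, and only the t statistic is computed (a one-sided p-value would additionally need the t distribution).

```python
import math
import statistics

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise level at alpha."""
    return alpha / n_tests

def one_sample_t(counts, expected):
    """t statistic for H0: mean(counts) = expected."""
    se = statistics.stdev(counts) / math.sqrt(len(counts))
    return (statistics.mean(counts) - expected) / se

n_snps = 10_000
threshold = bonferroni_threshold(0.05, n_snps)

# Assumed definition: expected false positives under the null at level 0.05.
s_exp = 0.05 * n_snps

# Hypothetical per-replicate counts of significant SNPs (not from the paper).
counts = [498, 512, 505, 520, 495, 508]
t_stat = one_sample_t(counts, s_exp)

print(f"Bonferroni threshold: {threshold:.6f}")
print(f"S_exp = {s_exp:.0f}, t = {t_stat:.2f}")
```

    A positive t statistic points in the direction of the one-sided alternative S<sub>obs</sub> &gt; S<sub>exp</sub>, that is, more significant SNPs than chance alone would produce.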

    Differences in mean phenotypes of two populations after separation for 30 generations (100 replicates).

    No full text
    <p>Differences in mean phenotypes of two populations after separation for 30 generations (100 replicates).</p>