101 research outputs found

    A genome-wide study of Hardy–Weinberg equilibrium with next generation sequence data

    Statistical tests for Hardy–Weinberg equilibrium have been an important tool for detecting genotyping errors in the past, and remain important in the quality control of next generation sequence data. In this paper, we analyze complete chromosomes of the 1000 genomes project by using exact test procedures for autosomal and X-chromosomal variants. We find that the rate of disequilibrium largely exceeds what might be expected by chance alone for all chromosomes. Observed disequilibrium is, in about 60% of the cases, due to heterozygote excess. We suggest that most excess disequilibrium can be explained by sequencing problems, and hypothesize mechanisms that can explain exceptional heterozygosities. We report higher rates of disequilibrium for the MHC region on chromosome 6, regions flanking centromeres and p-arms of acrocentric chromosomes. We also detected long-range haplotypes and areas with incidental high disequilibrium. We report disequilibrium to be related to read depth, with variants having extreme read depths being more likely to be out of equilibrium. Disequilibrium rates were found to be 11 times higher in segmental duplications and simple tandem repeat regions. The variants with significant disequilibrium are seen to be concentrated in these areas. For next generation sequence data, Hardy–Weinberg disequilibrium seems to be a major indicator for copy number variation. Peer Reviewed. Postprint (published version).
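
    The exact test mentioned in this abstract can be illustrated for the simplest case, a biallelic autosomal variant. The sketch below is not the paper's own procedure: it assumes the standard conditional distribution of heterozygote counts given the allele counts, and sums the probabilities of all outcomes that are no more likely than the observed one. The function name `hwe_exact_pvalue` and the genotype counts in the usage note are hypothetical.

```python
from math import exp, lgamma, log


def _log_factorial(k):
    """Natural log of k!, computed via the log-gamma function."""
    return lgamma(k + 1)


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Two-sided exact test p-value for Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the AA, AB and BB genotypes
    at one biallelic autosomal variant.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n - n_a              # copies of allele B

    def log_prob(het):
        # Probability of `het` heterozygotes given n individuals and n_a A alleles.
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (
            _log_factorial(n)
            - _log_factorial(hom_a) - _log_factorial(het) - _log_factorial(hom_b)
            + het * log(2.0)
            + _log_factorial(n_a) + _log_factorial(n_b)
            - _log_factorial(2 * n)
        )

    # The heterozygote count must share the parity of the allele count.
    possible_hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {het: exp(log_prob(het)) for het in possible_hets}
    p_observed = probs[n_ab]

    # Sum every outcome at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties).
    p_value = sum(p for p in probs.values() if p <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)
```

    For example, `hwe_exact_pvalue(25, 50, 25)` evaluates counts that match Hardy–Weinberg proportions, while `hwe_exact_pvalue(50, 0, 50)` evaluates a sample with no heterozygotes at all. Enumerating every heterozygote count is adequate for a sketch; for very large samples a recursive formulation would be preferable.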

    Quantitative Analysis of Single Nucleotide Polymorphisms within Copy Number Variation

    BACKGROUND: Single nucleotide polymorphisms (SNPs) have been used extensively in genetics and epidemiology studies. Traditionally, SNPs that did not pass the Hardy-Weinberg equilibrium (HWE) test were excluded from these analyses. Many investigators have addressed possible causes for departure from HWE, including genotyping errors, population admixture and segmental duplication. Recent large-scale surveys have revealed abundant structural variations in the human genome, including copy number variations (CNVs). This suggests that a significant number of SNPs must be within these regions, which may cause deviation from HWE. RESULTS: We performed a Bayesian analysis on the potential effect of copy number variation, segmental duplication and genotyping errors on the behavior of SNPs. Our results suggest that copy number variation is a major factor of HWE violation for SNPs with a small minor allele frequency, when the sample size is large and the genotyping error rate is 0–1%. CONCLUSIONS: Our study provides the posterior probability that a SNP falls in a CNV or a segmental duplication, given the observed allele frequency of the SNP, sample size and the significance level of HWE testing.

    Association between DNA Damage Response and Repair Genes and Risk of Invasive Serous Ovarian Cancer

    BACKGROUND: We analyzed the association between 53 genes related to DNA repair and p53-mediated damage response and serous ovarian cancer risk using data from the North Carolina Ovarian Cancer Study (NCOCS), a population-based, case-control study. METHODS/PRINCIPAL FINDINGS: The analysis was restricted to 364 invasive serous ovarian cancer cases and 761 controls of white, non-Hispanic race. Statistical analysis was two-staged: a screen using marginal Bayes factors (BFs) for 484 SNPs and a modeling stage in which we calculated multivariate adjusted posterior probabilities of association for 77 SNPs that passed the screen. These probabilities were conditional on subject age at diagnosis/interview, batch, a DNA quality metric and genotypes of other SNPs and allowed for uncertainty in the genetic parameterizations of the SNPs and number of associated SNPs. Six SNPs had Bayes factors greater than 10 in favor of an association with invasive serous ovarian cancer. These included rs5762746 (median per-allele odds ratio (OR) = 0.66; 95% credible interval (CI) = 0.44-1.00) and rs6005835 (median per-allele OR = 0.69; 95% CI = 0.53-0.91) in CHEK2, rs2078486 (median per-allele OR = 1.65; 95% CI = 1.21-2.25) and rs12951053 (median per-allele OR = 1.65; 95% CI = 1.20-2.26) in TP53, rs411697 (median rare-homozygote OR = 0.53; 95% CI = 0.35-0.79) in BACH1 and rs10131 (median rare-homozygote OR not estimable) in LIG4. The six most highly associated SNPs are either predicted to be functionally significant or are in LD with such a variant. The variants in TP53 were confirmed to be associated in a large follow-up study. CONCLUSIONS/SIGNIFICANCE: Based on our findings, further follow-up of the DNA repair and response pathways in a larger dataset is warranted to confirm these results.

    Testing for Hardy–Weinberg equilibrium at biallelic genetic markers on the X chromosome

    Testing genetic markers for Hardy–Weinberg equilibrium (HWE) is an important tool for detecting genotyping errors in large-scale genotyping studies. For markers at the X chromosome, typically the χ² or exact test is applied to the females only, and the hemizygous males are considered to be uninformative. In this paper we show that the males are relevant, because a difference in allele frequency between males and females may indicate HWE not to hold. The testing of markers on the X chromosome has received little attention, and in this paper we lay down the foundation for testing biallelic X-chromosomal markers for HWE. We develop four frequentist statistical test procedures for X-linked markers that take both males and females into account: the χ² test, likelihood ratio test, exact test and permutation test. Exact tests that include males are shown to have a better Type I error rate. Empirical data from the GENEVA project on venous thromboembolism is used to illustrate the proposed tests. Results obtained with the new tests differ substantially from tests that are based on female genotype counts only. The new tests detect differences in allele frequencies and seem able to uncover additional genotyping error that would have gone unnoticed in HWE tests based on females only. Peer Reviewed. Postprint (published version).

    Association between TCF7L2 gene polymorphisms and susceptibility to Type 2 Diabetes Mellitus: a large Human Genome Epidemiology (HuGE) review and meta-analysis

    Background: Transcription factor 7-like 2 (TCF7L2) has been shown to be associated with type 2 diabetes mellitus (T2DM) in multiple ethnic groups in the past two years, but contradictory results were reported for Chinese and Pima Indian populations. The authors then performed a large meta-analysis of 36 studies examining the association of T2DM with polymorphisms in the TCF7L2 gene in various ethnicities, including the rs7903146 C-to-T (IVS3C>T), rs7901695 T-to-C (IVS3T>C), rs12255372 G-to-T (IVS4G>T), and rs11196205 G-to-C (IVS4G>C) polymorphisms, and evaluated the size of the gene effect and the possible genetic mode of action. Methods: Literature-based searching was conducted to collect data, and three methods, namely fixed-effects, random-effects and Bayesian multivariate meta-analysis, were used to pool the odds ratio (OR). Publication bias and between-study heterogeneity were also examined. Results: The studies included 35,843 cases of T2DM and 39,123 controls, using mainly primary data. For T2DM and the IVS3C>T polymorphism, the Bayesian ORs for TT homozygotes and TC heterozygotes versus CC homozygotes were 1.968 (95% credible interval (CrI): 1.790, 2.157) and 1.406 (95% CrI: 1.341, 1.476), respectively, and the population attributable risk (PAR) for the TT/TC genotypes of this variant is 16.9% overall. For T2DM and the IVS4G>T polymorphism, the ORs for TT homozygotes and TG heterozygotes versus GG homozygotes were 1.885 (95% CrI: 1.698, 2.088) and 1.360 (95% CrI: 1.291, 1.433), respectively. All four ORs for these two polymorphisms showed significant between-study heterogeneity (P < 0.05), and the main source of heterogeneity was ethnic differences. Data also showed significant associations between T2DM and the other two polymorphisms, but with low heterogeneity (P > 0.10). Pooled ORs fit a codominant, multiplicative genetic model for all four polymorphisms of the TCF7L2 gene, and this model was also confirmed in different ethnic populations when the IVS3C>T and IVS4G>T polymorphisms were stratified by ethnicity, except in Africans, where a dominant, additive genetic mode is suggested for the IVS3C>T polymorphism. Conclusion: This meta-analysis demonstrates that four variants of the TCF7L2 gene are all associated with T2DM, indicates a multiplicative genetic model for all four polymorphisms, and suggests that the TCF7L2 gene is involved in nearly 1/5 of all T2DM. Potential gene-gene and gene-environment interactions by which common variants in the TCF7L2 gene influence the risk of T2DM need further exploration.
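
    Of the three pooling methods named in this abstract, the fixed-effects approach is the simplest to illustrate. The sketch below assumes the common inverse-variance formulation on the log odds ratio scale, with each standard error recovered from a reported 95% interval; the abstract does not state which fixed-effects variant the authors used, so this should be read as a generic illustration. The function name `pool_fixed_effect` and all input numbers are hypothetical.

```python
from math import exp, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(studies):
    """Inverse-variance fixed-effects pooling of odds ratios.

    `studies` is a list of (odds_ratio, ci_lower, ci_upper) tuples, where the
    interval is a 95% interval assumed symmetric on the log scale.
    Returns (pooled_or, pooled_ci_lower, pooled_ci_upper).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, ci_lower, ci_upper in studies:
        # Recover the standard error of log(OR) from the interval width.
        se = (log(ci_upper) - log(ci_lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * log(odds_ratio)

    pooled_log_or = weighted_sum / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return (
        exp(pooled_log_or),
        exp(pooled_log_or - Z_95 * pooled_se),
        exp(pooled_log_or + Z_95 * pooled_se),
    )


# Hypothetical per-study odds ratios with 95% intervals (not from the paper).
example_studies = [(1.35, 1.10, 1.66), (1.50, 1.20, 1.88), (1.42, 1.25, 1.61)]
pooled_or, pooled_low, pooled_high = pool_fixed_effect(example_studies)
```

    A fixed-effects pool of this kind assumes one common effect across studies; with the significant between-study heterogeneity the abstract reports, a random-effects or Bayesian model would be the more cautious choice.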

    The Framingham Heart Study 100K SNP genome-wide association study resource: overview of 17 phenotype working group reports

    Background: The Framingham Heart Study (FHS), founded in 1948 to examine the epidemiology of cardiovascular disease, is among the most comprehensively characterized multi-generational studies in the world. Many collected phenotypes have substantial genetic contributors; yet most genetic determinants remain to be identified. Using single nucleotide polymorphisms (SNPs) from a 100K genome-wide scan, we examine the associations of common polymorphisms with phenotypic variation in this community-based cohort and provide a full-disclosure, web-based resource of results for future replication studies. Methods: Adult participants (n = 1345) of the largest 310 pedigrees in the FHS, many biologically related, were genotyped with the 100K Affymetrix GeneChip. These genotypes were used to assess their contribution to 987 phenotypes collected in FHS over 56 years of follow up, including: cardiovascular risk factors and biomarkers; subclinical and clinical cardiovascular disease; cancer and longevity traits; and traits in pulmonary, sleep, neurology, renal, and bone domains. We conducted genome-wide variance components linkage and population-based and family-based association tests. Results: The participants were white of European descent and from the FHS Original and Offspring Cohorts (examination 1 Offspring mean age 32 ± 9 years, 54% women). This overview summarizes the methods, selected findings and limitations of the results presented in the accompanying series of 17 manuscripts. The presented association results are based on 70,897 autosomal SNPs meeting the following criteria: minor allele frequency ≥ 10%, genotype call rate ≥ 80%, Hardy-Weinberg equilibrium p-value ≥ 0.001, and satisfying Mendelian consistency. Linkage analyses are based on 11,200 SNPs and short-tandem repeats. 
Results of phenotype-genotype linkages and associations for all autosomal SNPs are posted on the NCBI dbGaP website at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. Conclusion: We have created a full-disclosure resource of results, posted on the dbGaP website, from a genome-wide association study in the FHS. Because we used three analytical approaches to examine the association and linkage of 987 phenotypes with thousands of SNPs, our results must be considered hypothesis-generating and need to be replicated. Results from the FHS 100K project with NCBI web posting provide a resource for investigators to identify high priority findings for replication. Molecular and Cellular Biology
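
    The SNP inclusion criteria listed in this abstract translate directly into a filter. The sketch below uses the stated thresholds (minor allele frequency ≥ 10%, call rate ≥ 80%, HWE p-value ≥ 0.001, Mendelian consistency); the `SnpQc` record, the `passes_qc` function and the example SNP identifiers are hypothetical and do not reflect the study's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality-control summary (hypothetical record layout)."""
    snp_id: str
    maf: float                  # minor allele frequency, 0..0.5
    call_rate: float            # fraction of samples with a genotype call
    hwe_p: float                # Hardy-Weinberg equilibrium test p-value
    mendelian_consistent: bool  # True if no Mendelian inconsistency was flagged


def passes_qc(snp, min_maf=0.10, min_call_rate=0.80, min_hwe_p=0.001):
    """Apply the four inclusion criteria quoted in the abstract."""
    return (
        snp.maf >= min_maf
        and snp.call_rate >= min_call_rate
        and snp.hwe_p >= min_hwe_p
        and snp.mendelian_consistent
    )


# Made-up example records, one per failure mode plus one that passes.
candidates = [
    SnpQc("snp_ok", maf=0.25, call_rate=0.97, hwe_p=0.40, mendelian_consistent=True),
    SnpQc("snp_rare", maf=0.04, call_rate=0.99, hwe_p=0.60, mendelian_consistent=True),
    SnpQc("snp_low_call", maf=0.30, call_rate=0.75, hwe_p=0.50, mendelian_consistent=True),
    SnpQc("snp_hwe_fail", maf=0.30, call_rate=0.95, hwe_p=0.0002, mendelian_consistent=True),
    SnpQc("snp_mendel_fail", maf=0.30, call_rate=0.95, hwe_p=0.50, mendelian_consistent=False),
]
retained = [snp.snp_id for snp in candidates if passes_qc(snp)]
```

    Keeping the thresholds as keyword arguments makes it easy to see how the retained set changes under stricter or looser criteria.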

    Keywords and Cultural Change: Frame Analysis of Business Model Public Talk, 1975–2000
