292 research outputs found

    A genome-wide study of Hardy–Weinberg equilibrium with next generation sequence data

    Statistical tests for Hardy–Weinberg equilibrium have been an important tool for detecting genotyping errors in the past, and remain important in the quality control of next generation sequence data. In this paper, we analyze complete chromosomes of the 1000 genomes project by using exact test procedures for autosomal and X-chromosomal variants. We find that the rate of disequilibrium largely exceeds what might be expected by chance alone for all chromosomes. Observed disequilibrium is, in about 60% of the cases, due to heterozygote excess. We suggest that most excess disequilibrium can be explained by sequencing problems, and hypothesize mechanisms that can explain exceptional heterozygosities. We report higher rates of disequilibrium for the MHC region on chromosome 6, regions flanking centromeres and p-arms of acrocentric chromosomes. We also detected long-range haplotypes and areas with incidental high disequilibrium. We report disequilibrium to be related to read depth, with variants having extreme read depths being more likely to be out of equilibrium. Disequilibrium rates were found to be 11 times higher in segmental duplications and simple tandem repeat regions. The variants with significant disequilibrium are seen to be concentrated in these areas. For next generation sequence data, Hardy–Weinberg disequilibrium seems to be a major indicator for copy number variation. Peer reviewed. Postprint (published version).
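
The exact test mentioned in this abstract can be illustrated for a single autosomal biallelic variant. What follows is a minimal sketch, not the authors' implementation: the function name `hwe_exact_pvalue` and the genotype counts are assumptions made for illustration, and X-chromosomal variants need a different formulation that is not shown. The p-value sums the probabilities of all heterozygote counts that are no more likely than the observed one, conditional on the allele counts.

```python
from math import exp, lgamma, log


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Two-sided exact test p-value for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb are observed genotype counts for an autosomal
    biallelic variant (illustrative interface, not from the paper).
    """
    n = n_aa + n_ab + n_bb          # number of individuals
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n_bb + n_ab           # copies of allele B

    def log_prob(het):
        # log P(het heterozygotes | n, n_a) under equilibrium
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2.0)
                + lgamma(n + 1) - lgamma(hom_a + 1)
                - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1)
                - lgamma(2 * n + 1))

    # The heterozygote count must have the same parity as n_a.
    possible_hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    p_observed = exp(log_prob(n_ab))
    # Small relative tolerance so ties with the observed value are included.
    total = sum(p for p in (exp(log_prob(h)) for h in possible_hets)
                if p <= p_observed * (1.0 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    print(hwe_exact_pvalue(25, 50, 25))   # counts matching equilibrium
    print(hwe_exact_pvalue(50, 0, 50))    # complete heterozygote deficit
```

In a genome-wide setting such a function would be applied per variant, and the resulting p-values compared against a multiple-testing threshold; that step is omitted here.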

    A LASSO-based approach to analyzing rare variants in genetic association studies

    Genetic markers with rare variants are spread out in the genome, making it necessary and difficult to consider them in genetic association studies. Consequently, wisely combining rare variants into “composite” markers may facilitate meaningful analyses. In this paper, we propose a novel approach of analyzing rare variant data by incorporating the least absolute shrinkage and selection operator technique. We applied this method to the Genetic Analysis Workshop 17 data, and our results suggest that this new approach is promising. In addition, we took advantage of having 200 phenotype replications and assessed the performance of our approach by means of repeated classification tree analyses. Our method and analyses were performed without knowledge of the underlying simulating model. Our method identified 38 markers (in 65 genes) that are significantly associated with the phenotype Affected and correctly identified two causal genes, SIRT1 and PDGFD

    Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity

    There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50th and 95th percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50th percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.

    Novel associations for hypothyroidism include known autoimmune risk loci

    Hypothyroidism is the most common thyroid disorder, affecting about 5% of the general population. Here we present the first large genome-wide association study of hypothyroidism, in 2,564 cases and 24,448 controls from the customer base of 23andMe, Inc., a personal genetics company. We identify four genome-wide significant associations, two of which are well known to be involved with a large spectrum of autoimmune diseases: rs6679677 near _PTPN22_ and rs3184504 in _SH2B3_ (p-values 3.5e-13 and 3.0e-11, respectively). We also report associations with rs4915077 near _VAV3_ (p-value 8.3e-11), another gene involved in immune function, and rs965513 near _FOXE1_ (p-value 3.1e-14). Of these, the association with _PTPN22_ confirms a recent small candidate gene study, and _FOXE1_ was previously known to be associated with thyroid-stimulating hormone (TSH) levels. Although _SH2B3_ has been previously linked with a number of autoimmune diseases, this is the first report of its association with thyroid disease. The _VAV3_ association is novel. These results suggest heterogeneity in the genetic etiology of hypothyroidism, implicating genes involved in both autoimmune disorders and thyroid function. Using a genetic risk profile score based on the top association from each of the four genome-wide significant regions in our study, the relative risk between the highest and lowest deciles of genetic risk is 2.1

    Fat Mass and Obesity-Associated Gene (FTO) in Eating Disorders: Evidence for Association of the rs9939609 Obesity Risk Allele with Bulimia nervosa and Anorexia nervosa

    Objective: The common single nucleotide polymorphism (SNP) rs9939609 in the fat mass and obesity-associated gene (FTO) is associated with obesity. As genetic variants associated with weight regulation might also be implicated in the etiology of eating disorders, we evaluated whether SNP rs9939609 is associated with bulimia nervosa (BN) and anorexia nervosa (AN). Methods: Association of rs9939609 with BN and AN was assessed in 689 patients with AN, 477 patients with BN, 984 healthy non-population-based controls, and 3,951 population-based controls (KORA-S4). Based on the familial and premorbid occurrence of obesity in patients with BN, we hypothesized an association of the obesity risk A-allele with BN. Results: In accordance with our hypothesis, we observed evidence for association of the rs9939609 A-allele with BN when compared to the non-population-based controls (unadjusted odds ratio (OR) = 1.142, one-sided 95% confidence interval (CI) 1.001-infinity; one-sided p = 0.049) and a trend in the population-based controls (OR = 1.124, one-sided 95% CI 0.932-infinity; one-sided p = 0.056). Interestingly, compared to both control groups, we further detected a nominal association of the rs9939609 A-allele to AN (OR = 1.181, 95% CI 1.027-1.359, two-sided p = 0.020 or OR = 1.673, 95% CI 1.101-2.541, two-sided p = 0.015). Conclusion: Our data suggest that the obesity-predisposing FTO allele might be relevant in both AN and BN. Copyright (C) 2012 S. Karger GmbH, Freiburg

    A high density linkage map of the bovine genome

    Background: Recent technological advances have made it possible to efficiently genotype large numbers of single nucleotide polymorphisms (SNPs) in livestock species, allowing the production of high-density linkage maps. Such maps can be used for quality control of other SNPs and for fine mapping of quantitative trait loci (QTL) via linkage disequilibrium (LD). Results: A high-density bovine linkage map was constructed using three types of markers. The genotypic information was obtained from 294 microsatellites, three milk protein haplotypes and 6,769 SNPs. The map was constructed by combining genetic (linkage) and physical information in an iterative mapping process. Markers were mapped to 3,155 unique positions; the 6,924 autosomal markers were mapped to 3,078 unique positions and the 123 non-pseudoautosomal and 19 pseudoautosomal sex chromosome markers were mapped to 62 and 15 unique positions, respectively. The linkage map had a total length of 3,249 cM. For the autosomes the average genetic distance between adjacent markers was 0.449 cM, the genetic distance between unique map positions was 1.01 cM and the average genetic distance (cM) per Mb was 1.25. Conclusion: There is a high concordance between the order of the SNPs in our linkage map and their physical positions on the most recent bovine genome sequence assembly (Btau 4.0). The linkage maps provide support for fine mapping projects and LD studies in bovine populations. Additionally, the linkage map may help to resolve positions of unassigned portions of the bovine genome.

    Replication of Association between ADAM33 Polymorphisms and Psoriasis

    Polymorphisms in ADAM33, the first gene identified in asthma by positional cloning, have been recently associated with psoriasis. No replication study of this association has been published so far. Data available in the French EGEA study (Epidemiological study on Genetics and Environment of Asthma, bronchial hyperresponsiveness and Atopy) give the opportunity to attempt to replicate the association between ADAM33 and psoriasis in 2002 individuals. Psoriasis (n = 150) has been assessed by questionnaire administered by an interviewer and a sub-sample of subjects with early-onset psoriasis (n = 74) has been identified based on the age of the subjects at time of interview (<40 years). Nine SNPs in ADAM33 and 11 SNPs in PSORS1 were genotyped. Association analysis was conducted by using two methods, GEE regression-based method and a likelihood-based method (LAMP program). The rs512625 SNP in ADAM33 was found associated with psoriasis at p = 0.01, the usual threshold required for replication (OR [95% CI] for heterozygotes compared to the reference group of homozygotes for the most frequent allele = 0.61 [0.42;0.89]). The rs628977 SNP, which was not in linkage disequilibrium with rs512625, was significantly associated with early-onset psoriasis (p = 0.01, OR [95% CI] for homozygotes for the minor allele compared to the reference group = 2.52 [1.31;4.86]). Adjustment for age, sex, asthma and a PSORS1 SNP associated with psoriasis in the EGEA data did not change the significance of these associations. This suggests independent effects of ADAM33 and PSORS1 on psoriasis. This is the first study that replicates an association between genetic variants in ADAM33 and psoriasis. Interestingly, the 2 ADAM33 SNPs associated with psoriasis in the present analysis were part of the 3-SNPs haplotypes showing the strongest associations in the initial study. The identification of a pleiotropic effect of ADAM33 on asthma and psoriasis may contribute to the understanding of these common immune-mediated diseases.

    Validation of a Cost-Efficient Multi-Purpose SNP Panel for Disease Based Research

    BACKGROUND: Here we present convergent methodologies using theoretical calculations, empirical assessment on in-house and publicly available datasets as well as in silico simulations, that validate a panel of SNPs for a variety of necessary tasks in human genetics disease research before resources are committed to larger-scale genotyping studies on those samples. While large-scale well-funded human genetic studies routinely have up to a million SNP genotypes, samples in a human genetics laboratory that are not yet part of such studies may be productively utilized in pilot projects or as part of targeted follow-up work though such smaller scale applications require at least some genome-wide genotype data for quality control purposes such as DNA "barcoding" to detect swaps or contamination issues, determining familial relationships between samples and correcting biases due to population effects such as population stratification in pilot studies. PRINCIPAL FINDINGS: Empirical performance in classification of relative types for any two given DNA samples (e.g., full siblings, parental, etc.) indicated that for outbred populations the panel performs sufficiently to classify relationship in extended families and therefore also for smaller structures such as trios and for twin zygosity testing. Additionally, familial relationships do not significantly diminish the (mean match) probability of sharing SNP genotypes in pedigrees, further indicating the uniqueness of the "barcode." Simulation using these SNPs for an African American case-control disease association study demonstrated that population stratification, even in complex admixed samples, can be adequately corrected under a range of disease models using the SNP panel. CONCLUSION: The panel has been validated for use in a variety of human disease genetics research tasks including sample barcoding, relationship verification, population substructure detection and statistical correction. Given the ease of genotyping our specific assay contained herein, this panel represents a useful and economical panel for human geneticists.

    Systems medicine and infection

    By using a systems based approach, mathematical and computational techniques can be used to develop models that describe the important mechanisms involved in infectious diseases. An iterative approach to model development allows new discoveries to continually improve the model, and ultimately increase the accuracy of predictions. SIR models are used to describe epidemics, predicting the extent and spread of disease. Genome-wide genotyping and sequencing technologies can be used to identify the biological mechanisms behind diseases. These tools help to build strategies for disease prevention and treatment, an example being the recent outbreak of Ebola in West Africa where these techniques were deployed. HIV is a complex disease where much is still to be learnt about the virus and the best effective treatment. With basic mathematical modelling techniques, significant discoveries have been made over the last 20 years. With recent technological advances, the computational resources now available and interdisciplinary cooperation, further breakthroughs are inevitable. In TB, modelling has traditionally been empirical in nature, with clinical data providing the fuel for this top-down approach. Recently, projects have begun to use data derived from laboratory experiments and clinical trials to create mathematical models that describe the mechanisms responsible for the disease. A systems medicine approach to infection modelling helps identify important biological questions that then direct future experiments, the results of which improve the model in an iterative cycle. This means that data from several model systems can be integrated and synthesised to explore complex biological systems. Postprint.
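
The SIR (susceptible, infected, recovered) models mentioned in this abstract can be illustrated with a short simulation. This is a minimal sketch assuming the standard deterministic SIR equations integrated with a simple forward Euler step; the function name `simulate_sir` and all parameter values are illustrative assumptions and are not taken from the paper.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I,
    dR/dt = gamma*I with a forward Euler step of size dt (in days).

    beta is the transmission rate and gamma the recovery rate (per day).
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r                       # total population, kept constant
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical outbreak: basic reproduction number beta/gamma = 3.
    run = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_infected:.0f} infected on day {peak_time:.1f}")
```

Forward Euler is chosen only for brevity; a production model would more likely use an adaptive ODE solver and fit `beta` and `gamma` to surveillance data.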

    Genetic adult lactase persistence is associated with risk of Crohn's Disease in a New Zealand population

    Background: Mycobacterium avium subspecies paratuberculosis (MAP) is an infective agent found in ruminants and milk products, which has been suggested to increase the risk of gastrointestinal inflammation in genetically susceptible hosts. It is hypothesized that lactase persistence facilitates exposure to such milk products increasing the likelihood of adverse outcomes. Individuals either homozygous or heterozygous for the T allele of DNA variant, rs4988235, located 14 kb upstream from the LCT locus, are associated with having lactase persistence. The aim of this study was to determine whether lactase persistence as evident by the T allele of rs4988235 is associated with Crohn's Disease (CD) in a New Zealand population. Findings: Individuals homozygous for the T allele (T/T genotype) showed a significantly increased risk of having CD as compared with those homozygous for the C allele (OR = 1.61, 95% CI = 1.03-2.51). Additionally, a significant increase in the frequency of the T allele was observed in CD patients (OR = 1.30, 95% CI = 1.05-1.61, p = 0.013), indicating that the T allele encoding lactase persistence was associated with an increased risk of CD. Conclusions: Our findings indicate that lactase persistence as evident by the presence of the T allele of rs4988235 is associated with risk of CD in this New Zealand Caucasian population.
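
Odds ratios with 95% confidence intervals, as quoted in this abstract, are commonly computed from a 2x2 table of counts. The sketch below uses the Wald interval on the log scale; this is an assumption, since the abstract does not say which interval method the authors used, and the function name `odds_ratio_wald_ci` and the counts are made up for illustration.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(case_exposed, case_unexposed,
                       control_exposed, control_unexposed, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    "Exposed" could mean carrying the T allele or the T/T genotype.
    z = 1.96 gives an approximate two-sided 95% interval.
    All four counts must be positive.
    """
    odds_ratio = ((case_exposed * control_unexposed)
                  / (case_unexposed * control_exposed))
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    std_error = sqrt(1.0 / case_exposed + 1.0 / case_unexposed
                     + 1.0 / control_exposed + 1.0 / control_unexposed)
    lower = exp(log(odds_ratio) - z * std_error)
    upper = exp(log(odds_ratio) + z * std_error)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical allele counts, not data from the study.
    estimate, low, high = odds_ratio_wald_ci(20, 10, 10, 20)
    print(f"OR = {estimate:.2f}, 95% CI = {low:.2f}-{high:.2f}")
```

An interval whose lower bound stays above 1, as in the figures reported in the abstract, indicates an association that is significant at the corresponding level.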