55 research outputs found

    An Open Access Database of Genome-wide Association Results

    Background: The number of genome-wide association studies (GWAS) is growing rapidly leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may potentially strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes. However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results. Methods: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, making this database freely available here. In doing so, we met and describe here a number of challenges to creating an open access database of GWAS results. Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS. Results: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL). At the same time we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significant thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings. Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p < 4.6 × 10⁻¹⁴), a finding which was not perturbed by a sensitivity analysis. Conclusion: We provide access to a full gene-annotated GWAS database which could be used for further querying, analyses or integration with other genomic information. We make a number of general observations. Of reported associated SNPs, 40% lie within the boundaries of a RefSeq gene and 68% are within 60 kb of one, indicating a bias toward gene-centricity in the findings. We found considerable heterogeneity in information available from GWAS suggesting the wider community could benefit from standardization and centralization of results reporting.

    Genetic and expression studies of SMN2 gene in Russian patients with spinal muscular atrophy type II and III

    Background: Spinal muscular atrophy (SMA type I, II and III) is an autosomal recessive neuromuscular disorder caused by mutations in the survival motor neuron gene (SMN1). SMN2 is a centromeric copy gene that has been characterized as a major modifier of SMA severity. SMA type I patients have one or two SMN2 copies while most SMA type II patients carry three SMN2 copies and SMA III patients have three or four SMN2 copies. The SMN1 gene produces a full-length transcript (FL-SMN) while SMN2 is only able to produce a small portion of the FL-SMN because of a splice mutation which results in the production of abnormal SMNΔ7 mRNA. Methods: In this study we performed quantification of the SMN2 gene copy number in Russian patients affected by SMA type II and III (42 and 19 patients, respectively) by means of real-time PCR. Moreover, we present two families consisting of asymptomatic carriers of a homozygous absence of the SMN1 gene. We also developed a novel RT-qPCR-based assay to determine the FL-SMN/SMNΔ7 mRNA ratio as SMA biomarker. Results: Comparison of the SMN2 copy number and clinical features revealed a significant correlation between mild clinical phenotype (SMA type III) and presence of four copies of the SMN2 gene. In both asymptomatic cases we found an increased number of SMN2 copies in the healthy carriers and a biallelic SMN1 absence. Furthermore, the novel assay revealed a difference between SMA patients and healthy controls. Conclusions: We suggest that the SMN2 gene copy quantification in SMA patients could be used as a prognostic tool for discrimination between the SMA type II and SMA type III diagnoses, whereas the FL-SMN/SMNΔ7 mRNA ratio could be a useful biomarker for detecting changes during SMA pharmacotherapy.

    SMA CARNI-VAL TRIAL PART II: A Prospective, Single-Armed Trial of L-Carnitine and Valproic Acid in Ambulatory Children with Spinal Muscular Atrophy

    Multiple lines of evidence have suggested that valproic acid (VPA) might benefit patients with spinal muscular atrophy (SMA). The SMA CARNIVAL TRIAL was a two-part prospective trial to evaluate oral VPA and L-carnitine in SMA children. Part 1 targeted non-ambulatory children ages 2–8 in a 12-month crossover design. We report here Part 2, a 12-month prospective, open-label trial of VPA and L-carnitine in ambulatory SMA children. This study involved 33 genetically proven type 3 SMA subjects ages 3–17 years. Subjects underwent two baseline assessments over 4–6 weeks and then were placed on VPA and L-carnitine for 12 months. Assessments were performed at baseline, 3, 6 and 12 months. Primary outcomes included safety, adverse events and the change at 6 and 12 months in motor function assessed using the Modified Hammersmith Functional Motor Scale Extend (MHFMS-Extend), timed motor tests and fine motor modules. Secondary outcomes included changes in ulnar compound muscle action potential amplitudes (CMAP), handheld dynamometry, pulmonary function, and Pediatric Quality of Life Inventory scores. Twenty-eight subjects completed the study. VPA and carnitine were generally well tolerated. Although adverse events occurred in 85% of subjects, they were usually mild and transient. Weight gain of 20% above body weight occurred in 17% of subjects. There was no significant change in any primary outcome at 6 or 12 months. Some pulmonary function measures showed improvement at one year, as expected with normal growth. CMAP significantly improved, suggesting a modest biologic effect that was not clinically meaningful. This study, coupled with the CARNIVAL Part 1 study, indicates that VPA is not effective in improving strength or function in SMA children. The outcomes used in this study are feasible and reliable, and can be employed in future trials in SMA.

    A Fine-Mapping Study of 7 Top Scoring Genes from a GWAS for Major Depressive Disorder

    Major depressive disorder (MDD) is a psychiatric disorder characterized by, among other symptoms, persistent depressed mood, loss of interest and pleasure, and psychomotor retardation. Environmental circumstances have proven to influence the aetiology of the disease, but MDD also has an estimated 40% heritability, probably with a polygenic background. In 2009, a genome-wide association study (GWAS) was performed on the Dutch GAIN-MDD cohort. A non-synonymous coding single nucleotide polymorphism (SNP), rs2522833, in the PCLO gene became only nominally significant after post-hoc analysis with an Australian cohort which used similar ascertainment. The absence of genome-wide significance may be caused by low SNP coverage of genes. To increase SNP coverage to 100% for common variants (m.a.f.>0.1, r²>0.8), we selected seven genes from the GAIN-MDD GWAS: PCLO, GZMK, ANPEP, AFAP1L1, ST3GAL6, FGF14 and PTK2B. We genotyped 349 SNPs and obtained the lowest P-value for rs2715147 in PCLO at P = 6.8E−7. We imputed, filling in missing genotypes, after which rs2715147 and rs2715148 showed the lowest P-value at P = 1.2E−6. When we created a haplotype of these SNPs together with the non-synonymous coding SNP rs2522833, the P-value decreased to P = 9.9E−7 but was not genome-wide significant. Although our study did not identify a more strongly associated variant, the results for PCLO suggest that the causal variant is in high LD with rs2715147, rs2715148 and rs2522833.

    Knowledge-Driven Multi-Locus Analysis Reveals Gene-Gene Interactions Influencing HDL Cholesterol Level in Two Independent EMR-Linked Biobanks

    Genome-wide association studies (GWAS) are routinely being used to examine the genetic contribution to complex human traits, such as high-density lipoprotein cholesterol (HDL-C). Although HDL-C levels are highly heritable (h2∼0.7), the genetic determinants identified through GWAS contribute to a small fraction of the variance in this trait. Reasons for this discrepancy may include rare variants, structural variants, gene-environment (GxE) interactions, and gene-gene (GxG) interactions. Clinical practice-based biobanks now allow investigators to address these challenges by conducting GWAS in the context of comprehensive electronic medical records (EMRs). Here we apply an EMR-based phenotyping approach, within the context of routine care, to replicate several known associations between HDL-C and previously characterized genetic variants: CETP (rs3764261, p = 1.22e-25), LIPC (rs11855284, p = 3.92e-14), LPL (rs12678919, p = 1.99e-7), and the APOA1/C3/A4/A5 locus (rs964184, p = 1.06e-5), all adjusted for age, gender, body mass index (BMI), and smoking status. By using a novel approach which censors data based on relevant co-morbidities and lipid modifying medications to construct a more rigorous HDL-C phenotype, we identified an association between HDL-C and TRIB1, a gene which previously resisted identification in studies with larger sample sizes. Through the application of additional analytical strategies incorporating biological knowledge, we further identified 11 significant GxG interaction models in our discovery cohort, 8 of which show evidence of replication in a second biobank cohort. The strongest predictive model included a pairwise interaction between LPL (which modulates the incorporation of triglyceride into HDL) and ABCA1 (which modulates the incorporation of free cholesterol into HDL). These results demonstrate that gene-gene interactions modulate complex human traits, including HDL cholesterol

    From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes

    Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays

    The Framingham Heart Study 100K SNP genome-wide association study resource: overview of 17 phenotype working group reports

    Background: The Framingham Heart Study (FHS), founded in 1948 to examine the epidemiology of cardiovascular disease, is among the most comprehensively characterized multi-generational studies in the world. Many collected phenotypes have substantial genetic contributors; yet most genetic determinants remain to be identified. Using single nucleotide polymorphisms (SNPs) from a 100K genome-wide scan, we examine the associations of common polymorphisms with phenotypic variation in this community-based cohort and provide a full-disclosure, web-based resource of results for future replication studies. Methods: Adult participants (n = 1345) of the largest 310 pedigrees in the FHS, many biologically related, were genotyped with the 100K Affymetrix GeneChip. These genotypes were used to assess their contribution to 987 phenotypes collected in FHS over 56 years of follow up, including: cardiovascular risk factors and biomarkers; subclinical and clinical cardiovascular disease; cancer and longevity traits; and traits in pulmonary, sleep, neurology, renal, and bone domains. We conducted genome-wide variance components linkage and population-based and family-based association tests. Results: The participants were white of European descent and from the FHS Original and Offspring Cohorts (examination 1 Offspring mean age 32 ± 9 years, 54% women). This overview summarizes the methods, selected findings and limitations of the results presented in the accompanying series of 17 manuscripts. The presented association results are based on 70,897 autosomal SNPs meeting the following criteria: minor allele frequency ≥ 10%, genotype call rate ≥ 80%, Hardy-Weinberg equilibrium p-value ≥ 0.001, and satisfying Mendelian consistency. Linkage analyses are based on 11,200 SNPs and short-tandem repeats. 
Results of phenotype-genotype linkages and associations for all autosomal SNPs are posted on the NCBI dbGaP website at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?id=phs000007. Conclusion: We have created a full-disclosure resource of results, posted on the dbGaP website, from a genome-wide association study in the FHS. Because we used three analytical approaches to examine the association and linkage of 987 phenotypes with thousands of SNPs, our results must be considered hypothesis-generating and need to be replicated. Results from the FHS 100K project with NCBI web posting provide a resource for investigators to identify high priority findings for replication.

    Genetic Determinants of Lipid Traits in Diverse Populations from the Population Architecture using Genomics and Epidemiology (PAGE) Study

    For the past five years, genome-wide association studies (GWAS) have identified hundreds of common variants associated with human diseases and traits, including high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and triglyceride (TG) levels. Approximately 95 loci associated with lipid levels have been identified primarily among populations of European ancestry. The Population Architecture using Genomics and Epidemiology (PAGE) study was established in 2008 to characterize GWAS–identified variants in diverse population-based studies. We genotyped 49 GWAS–identified SNPs associated with one or more lipid traits in at least two PAGE studies and across six racial/ethnic groups. We performed a meta-analysis testing for SNP associations with fasting HDL-C, LDL-C, and ln(TG) levels in self-identified European American (∼20,000), African American (∼9,000), American Indian (∼6,000), Mexican American/Hispanic (∼2,500), Japanese/East Asian (∼690), and Pacific Islander/Native Hawaiian (∼175) adults, regardless of lipid-lowering medication use. We replicated 55 of 60 (92%) SNP associations tested in European Americans at p<0.05. Despite sufficient power, we were unable to replicate ABCA1 rs4149268 and rs1883025, CETP rs1864163, and TTC39B rs471364 previously associated with HDL-C and MAFB rs6102059 previously associated with LDL-C. Based on significance (p<0.05) and consistent direction of effect, a majority of replicated genotype-phenotype associations for HDL-C, LDL-C, and ln(TG) in European Americans generalized to African Americans (48%, 61%, and 57%), American Indians (45%, 64%, and 77%), and Mexican Americans/Hispanics (57%, 56%, and 86%). Overall, 16 associations generalized across all three populations. For the associations that did not generalize, differences in effect sizes, allele frequencies, and linkage disequilibrium offer clues to the next generation of association studies for these traits.