110 research outputs found

    A Bayesian method for evaluating and discovering disease loci associations

    Background: A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need. Methodology/Findings: We introduce the Bayesian network posterior probability (BNPP) method, which addresses these difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can be used not only to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits both better evaluation and discovery performance than does a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are reported. Conclusions/Significance: We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. © 2011 Jiang et al.
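The abstract above states that the posterior probability of a hypothesis is computed from the likelihoods of all competing hypotheses. The following is a minimal sketch of that normalisation step only, not the authors' BNPP implementation: the function name `posterior_probabilities`, the hypothesis labels, and the log-likelihood values are all hypothetical, and the Bayesian network scoring criterion that would produce the likelihoods is not shown.

```python
import math


def posterior_probabilities(log_likelihoods, priors=None):
    """Turn per-hypothesis log-likelihoods into posterior probabilities.

    log_likelihoods: dict mapping a hypothesis label to log P(data | hypothesis).
    priors: optional dict of prior probabilities; uniform if omitted (assumption).
    """
    names = list(log_likelihoods)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}

    # Work in log space: log P(data | H) + log P(H).
    log_joint = {n: log_likelihoods[n] + math.log(priors[n]) for n in names}

    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_joint.values())
    weights = {n: math.exp(v - peak) for n, v in log_joint.items()}

    # Normalise over all competing hypotheses (Bayes' theorem).
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical scores for three competing models of one disease.
    scores = {"snp_A_and_snp_B": -1000.0, "snp_A_only": -1002.0, "no_association": -1005.0}
    for name, prob in posterior_probabilities(scores).items():
        print(f"{name}: {prob:.4f}")
```

Because the result is normalised over every competing hypothesis rather than compared against a single null, adding or removing a competitor changes all of the posteriors, which is the property the abstract emphasises.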

    Genome-wide association reveals genetic effects on human Aβ42 and τ protein levels in cerebrospinal fluids: a case control study

    Background: Alzheimer's disease (AD) is common and highly heritable, with many genes and gene variants associated with AD in one or more studies, including APOE ε2/ε3/ε4. However, the genetic backgrounds for normal cognition, mild cognitive impairment (MCI) and AD in terms of changes in cerebrospinal fluid (CSF) levels of Aβ1-42, T-tau, and P-tau181P have not been clearly delineated. We carried out a genome-wide association study (GWAS) in order to better define the genetic backgrounds to these three states in relation to CSF levels. Methods: Subjects were participants in the Alzheimer's Disease Neuroimaging Initiative (ADNI). The GWAS dataset consisted of 818 participants (mainly Caucasian) genotyped using the Illumina Human Genome 610 Quad BeadChips. This sample included 410 subjects (119 Normal, 115 MCI and 176 AD) with measurements of CSF Aβ1-42, T-tau, and P-tau181P levels. We used PLINK to find genetic associations with the three CSF biomarker levels. Association of each of the 498,205 SNPs was tested using additive, dominant, and general association models while considering APOE genotype and age. Finally, an effort was made to better identify relevant biochemical pathways for associated genes using the ALIGATOR software. Results: We found that there were some associations with APOE genotype although CSF levels were about the same for each subject group; CSF Aβ1-42 levels decreased with APOE gene dose for each subject group. T-tau levels tended to be higher among AD cases than among normal subjects. In the adjusted analysis using APOE genotype and age as covariates, no SNP was associated with CSF levels among AD subjects. CYP19A1 'aromatase' (rs2899472), NCAM2, and multiple SNPs located on chromosome 10 near the ARL5B gene demonstrated the strongest associations with Aβ1-42 in normal subjects. Two genes found to be near the top SNPs, CYP19A1 (rs2899472, p = 1.90 × 10^-7) and NCAM2 (rs1022442, p = 2.75 × 10^-7), have been reported in previous studies as genetic factors related to the progression of AD. In AD subjects, APOE ε2/ε3 and ε2/ε4 genotypes were associated with elevated T-tau levels, and the ε4/ε4 genotype was associated with elevated T-tau and P-tau181P levels. Pathway analysis detected several biological pathways implicated in normal subjects with CSF β-amyloid peptide (Aβ1-42). Conclusions: Our genome-wide association analysis identified several SNPs as important factors for CSF biomarkers. We also provide new evidence for additional candidate genetic risk factors from pathway analysis that can be tested in further studies.

    The diploid genome sequence of an Asian individual

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics

    GWAS for executive function and processing speed suggests involvement of the CADM2 gene

    To identify common variants contributing to normal variation in two specific domains of cognitive functioning, we conducted a genome-wide association study (GWAS) of executive functioning and information processing speed in non-demented older adults from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) consortium. Neuropsychological testing was available for 5429 to 32,070 subjects of European ancestry aged 45 years or older, free of dementia and clinical stroke at the time of cognitive testing, from 20 cohorts in the discovery phase. We analyzed performance on the Trail Making Test parts A and B, the Letter Digit Substitution Test (LDST), the Digit Symbol Substitution Task (DSST), semantic and phonemic fluency tests, and the Stroop Color and Word Test. Replication was sought in 1311 to 21,860 subjects from 20 independent cohorts. A significant association was observed in the discovery cohorts for the single-nucleotide polymorphism (SNP) rs17518584 (discovery P-value = 3.12 × 10^-8) and in the joint discovery and replication meta-analysis (P-value = 3.28 × 10^-9 after adjustment for age, gender and education) in an intron of the gene cell adhesion molecule 2 (CADM2) for performance on the LDST/DSST. Rs17518584 is located about 170 kb upstream of the transcription start site of the major transcript for the CADM2 gene, but is within an intron of a variant transcript that includes an alternative first exon. The variant is associated with expression of CADM2 in the cingulate cortex (P-value = 4 × 10^-4). The protein encoded by CADM2 is involved in glutamate signaling (P-value = 7.22 × 10^-15), gamma-aminobutyric acid (GABA) transport (P-value = 1.36 × 10^-11) and neuron cell-cell adhesion (P-value = 1.48 × 10^-13). Our findings suggest that genetic variation in the CADM2 gene is associated with individual differences in information processing speed. Molecular Psychiatry advance online publication, 14 April 2015; doi:10.1038/mp.2015.37

    Genome-wide association study of Alzheimer's disease

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the nine regions recently implicated in Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ∼2.5 million markers. As expected, several markers in the APOE region showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69–180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P=3.05E–07). The association of this SNP with AD risk was consistent in all five samples, with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD because it is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.

    A genome-wide association study for late-onset Alzheimer's disease using DNA pooling

    Background: Late-onset Alzheimer's disease (LOAD) is an age-related neurodegenerative disease with a high prevalence that places major demands on healthcare resources in societies with increasingly aged populations. The only extensively replicable genetic risk factor for LOAD is the apolipoprotein E gene. In order to identify additional genetic risk loci, we have conducted a genome-wide association (GWA) study in a large LOAD case-control sample, reducing costs through the use of DNA pooling. Methods: DNA samples were collected from 1,082 individuals with LOAD and 1,239 control subjects. Age at onset ranged from 60 to 95, and controls were matched for age (mean = 76.53 years, SD = 33), gender and ethnicity. Equimolar amounts of each DNA sample were added to either a case or control pool. The pools were genotyped using Illumina HumanHap300 and Illumina Sentrix HumanHap240S arrays testing 561,494 SNPs. 114 of our best hit SNPs from the pooling data were identified and then individually genotyped in the case-control sample used to construct the pools. Results: Highly significant association with LOAD was observed at the APOE locus, confirming the validity of the pooled genotyping approach. For 109 SNPs outside the APOE locus, we obtained uncorrected p-values ≤ 0.05 for 74 after individual genotyping. To further test these associations, we added control data from 1400 subjects from the 1958 Birth Cohort, with the evidence for association increasing to 3.4 × 10^-6 for our strongest finding, rs727153. rs727153 lies 13 kb from the start of transcription of lecithin retinol acyltransferase (phosphatidylcholine-retinol O-acyltransferase, LRAT). Five of seven tag SNPs chosen to cover LRAT showed significant association with LOAD, with a SNP in intron 2 of LRAT showing the greatest evidence of association (rs201825, p-value = 6.1 × 10^-7). Conclusion: We have validated the pooling method for GWA studies by both identifying the APOE locus and by observing a strong enrichment for significantly associated SNPs. We provide evidence for LRAT as a novel candidate gene for LOAD. LRAT plays a prominent role in the Vitamin A cascade, a system that has been previously implicated in LOAD.

    Data mining of high density genomic variant data for prediction of Alzheimer's disease risk

    Background: The discovery of genetic associations is an important factor in the understanding of human illness to derive disease pathways. Identifying multiple interacting genetic mutations associated with disease remains challenging in studying the etiology of complex diseases. Although new single nucleotide polymorphisms (SNPs) at genes implicated in immune response, cholesterol/lipid metabolism, and cell membrane processes have recently been confirmed by genome-wide association studies (GWAS) to be associated with late-onset Alzheimer's disease (LOAD), a percentage of AD heritability continues to be unexplained. We try to find other genetic variants that may influence LOAD risk utilizing data mining methods. Methods: Two different approaches were devised to select SNPs associated with LOAD in a publicly available GWAS data set consisting of three cohorts. In both approaches, single-locus analysis (logistic regression) was conducted to filter the data with a less conservative p-value than the Bonferroni threshold; this resulted in a subset of SNPs used next in multi-locus analysis (random forest (RF)). In the second approach, we took into account prior biological knowledge, and performed sample stratification and linkage disequilibrium (LD) analysis in addition to logistic regression analysis to preselect loci to input into the RF classifier construction step. Results: The first approach gave 199 SNPs mostly associated with genes in calcium signaling, cell adhesion, endocytosis, immune response, and synaptic function. These SNPs together with APOE and GAB2 SNPs formed a predictive subset for LOAD status with an average error of 9.8% using 10-fold cross validation (CV) in RF modeling. Nineteen variants in LD with ST5, TRPC1, ATG10, ANO3, NDUFA12, and NISCH respectively, genes linked directly or indirectly with neurobiology, were identified with the second approach. These variants were part of a model that included APOE and GAB2 SNPs to predict LOAD risk, which produced a 10-fold CV average error of 17.5% in the classification modeling. Conclusions: With the two proposed approaches, we identified a large subset of SNPs in genes mostly clustered around specific pathways/functions and a smaller set of SNPs, within or in proximity to five genes not previously reported, that may be relevant for the prediction/understanding of AD.

    Age-Specific Epigenetic Drift in Late-Onset Alzheimer's Disease

    Despite an enormous research effort, most cases of late-onset Alzheimer's disease (LOAD) still remain unexplained and the current biomedical science is still a long way from the ultimate goal of revealing clear risk factors that can help in the diagnosis, prevention and treatment of the disease. Current theories about the development of LOAD hinge on the premise that Alzheimer's arises mainly from heritable causes. Yet, the complex, non-Mendelian disease etiology suggests that an epigenetic component could be involved. Using MALDI-TOF mass spectrometry in post-mortem brain samples and lymphocytes, we have performed an analysis of DNA methylation across 12 potential Alzheimer's susceptibility loci. In the LOAD brain samples we identified a notably age-specific epigenetic drift, supporting a potential role of epigenetic effects in the development of the disease. Additionally, we found that some genes that participate in amyloid-β processing (PSEN1, APOE) and methylation homeostasis (MTHFR, DNMT1) show a significant interindividual epigenetic variability, which may contribute to LOAD predisposition. The APOE gene was found to be of bimodal structure, with a hypomethylated CpG-poor promoter and a fully methylated 3′-CpG-island, that contains the sequences for the ε4-haplotype, which is the only undisputed genetic risk factor for LOAD. Aberrant epigenetic control in this CpG-island may contribute to LOAD pathology. We propose that epigenetic drift is likely to be a substantial mechanism predisposing individuals to LOAD and contributing to the course of disease

    Meta-Analysis for Genome-Wide Association Study Identifies Multiple Variants at the BIN1 Locus Associated with Late-Onset Alzheimer's Disease

    Recent GWAS studies focused on uncovering novel genetic loci related to AD have revealed associations with variants near CLU, CR1, PICALM and BIN1. In this study, we conducted a genome-wide association study in an independent set of 1034 cases and 1186 controls using the Illumina genotyping platforms. By coupling our data with available GWAS datasets from the ADNI and GenADA, we replicated the original associations in both PICALM (rs3851179) and CR1 (rs3818361). The PICALM variant seems to be non-significant after we adjusted for APOE e4 status. We further tested our top markers in 751 independent cases and 751 matched controls. Besides the markers close to the APOE locus, a marker (rs12989701) upstream of BIN1 locus was replicated and the combined analysis reached genome-wide significance level (p = 5E-08). We combined our data with the published Harold et al. study and meta-analysis with all available 6521 cases and 10360 controls at the BIN1 locus revealed two significant variants (rs12989701, p = 1.32E-10 and rs744373, p = 3.16E-10) in limited linkage disequilibrium (r2 = 0.05) with each other. The independent contribution of both SNPs was supported by haplotype conditional analysis. We also conducted multivariate analysis in canonical pathways and identified a consistent signal in the downstream pathways targeted by Gleevec (P = 0.004 in Pfizer; P = 0.028 in ADNI and P = 0.04 in GenADA). We further tested variants in CLU, PICALM, BIN1 and CR1 for association with disease progression in 597 AD patients where longitudinal cognitive measures are sufficient. Both the PICALM and CLU variants showed nominal significant association with cognitive decline as measured by change in Clinical Dementia Rating-sum of boxes (CDR-SB) score from the baseline but did not pass multiple-test correction. Future experiments will help us better understand potential roles of these genetic loci in AD pathology