482 research outputs found

    Variant biomarker discovery using mass spectrometry-based proteogenomics

    Genomic diversity plays critical roles in the risk of disease pathogenesis and in diagnosis. While genomic variants (including single nucleotide variants, frameshift variants, and mis-splicing isoforms) are commonly detected at the DNA or RNA level, their translated variant protein or polypeptide products are ultimately the functional units of the associated disease. These products are often released into biofluids and could be leveraged for clinical diagnosis and patient stratification. The recent emergence of integrated analysis of genomics with mass spectrometry-based proteomics for biomarker discovery, also known as proteogenomics, has significantly advanced the understanding of disease risk variants, precision medicine, and biomarker discovery. In this review, we discuss variant proteins in the context of cancers and neurodegenerative diseases, outline current and emerging proteogenomic approaches for biomarker discovery, and provide a comprehensive proteogenomic strategy for detection of putative biomarker candidates in human biospecimens. This strategy can be implemented for proteogenomic studies in any field of enquiry. Our review addresses the timely need for biomarkers of aging-related diseases.

    Genetic risk factors in Finnish patients with Parkinson's disease

    Introduction: Variation contributing to the risk of Parkinson's disease (PD) has been identified in several genes and at several loci including GBA, SMPD1, LRRK2, POLG1, CHCHD10 and MAPT, but the frequencies of risk variants seem to vary according to ethnic background. Our aim was to analyze how variation in these genes contributes to PD in the Finnish population. Methods: The subjects consisted of 527 Finnish patients with early-onset PD, 325 patients with late-onset PD and 403 population controls. We screened for known genetic risk variants in GBA, SMPD1, LRRK2, POLG1, CHCHD10 and MAPT. In addition, DNA from 225 patients with early-onset PD was subjected to whole exome sequencing (WES). Results: We detected a significant difference in the length variation of the CAG repeat in POLG1 between patients with early-onset PD and controls. The p.N370S and p.L444P variants in GBA contributed to a relative risk of 3.8 in early-onset PD and 2.5 in late-onset PD. WES revealed five variants in LRRK2 and SMPD1 that were found in the patients but not in the Finnish ExAC sequences. These are possible risk variants that require further confirmation. The p.G2019S variant in LRRK2, common in North African Arabs and Ashkenazi Jews, was not detected in any of the 849 PD patients. Conclusions: The POLG1 CAG repeat length variation and the GBA p.L444P variant are associated with PD in the Finnish population.

    Finnish Parkinson's disease study integrating protein-protein interaction network data with exome sequencing analysis

    Variants associated with Parkinson's disease (PD) generally have small effect sizes and, therefore, large sample sizes or targeted analyses are required to detect significant associations in a whole exome sequencing (WES) study. Here, we used protein-protein interaction (PPI) information on 36 genes with established or suggested associations with PD to target the analysis of the WES data. We performed an association analysis on WES data from 439 Finnish PD subjects and 855 controls, and included a Finnish population cohort as the replication dataset with 60 PD subjects and 8214 controls. A single variant association (SVA) test in the discovery dataset yielded 11 candidate variants in seven genes, but the associations were not significant in the replication cohort after correction for multiple testing. A polygenic risk score using variants rs2230288 and rs2291312, however, was associated with PD, with an odds ratio of 2.7 (95% confidence interval 1.4-5.2; p < 2.56e-03). Furthermore, an analysis of the PPI network revealed enriched clusters of biological processes among established and candidate genes, and these functional networks were visualized in the study. We identified novel candidate variants for PD using a gene prioritization based on PPI information, and described why these variants may be involved in the pathogenesis of PD.

    Defining the causes of sporadic Parkinson's disease in the global Parkinson's genetics program (GP2)

    The Global Parkinson's Genetics Program (GP2) will genotype over 150,000 participants from around the world, and integrate genetic and clinical data for use in large-scale analyses to dramatically expand our understanding of the genetic architecture of PD. This report details the workflow for cohort integration into the complex arm of GP2, and together with our outline of the monogenic hub in a companion paper, provides a generalizable blueprint for establishing large-scale collaborative research consortia.

    A common genetic factor for Parkinson disease in ethnic Chinese population in Taiwan

    BACKGROUND: Parkinson's disease (PD) is the most common neurodegenerative movement disorder, characterized clinically by resting tremor, bradykinesia, postural instability and rigidity. The prevalence of PD is approximately 2% of the population over 65 years of age and 1.7 million PD patients (age ≥ 55 years) live in China. Recently, a common LRRK2 variant Gly2385Arg was reported in the ethnic Chinese PD population in Taiwan. We analyzed the frequency of this variant in our independent PD case-control population of Han Chinese from Taiwan. METHODS: 305 patients and 176 genetically unrelated healthy controls were examined by neurologists and the diagnosis of PD was based on the published criteria. The region of interest was amplified with standard polymerase chain reaction (PCR). PCR fragments were then directly sequenced in both forward and reverse directions. Differences in genotype frequencies between groups were assessed by the χ² test, while χ² analysis was used to test for Hardy-Weinberg equilibrium. RESULTS: Of the 305 patients screened we identified 27 (9%) with the heterozygous G2385R variant. This mutation was found in only 1 (0.5%) of our healthy control samples (odds ratio = 16.99, 95% CI: 2.29 to 126.21, p = 0.0002). Sequencing of the entire open reading frame of LRRK2 in G2385R carriers revealed no other variants. CONCLUSION: These data suggest that the G2385R variant contributes significantly to the etiology of PD in ethnic Han Chinese individuals. Considering the enormous and expanding aging Chinese population in mainland China and in Taiwan, this variant is probably the most common known genetic factor for PD worldwide.
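
The odds ratio quoted in this abstract can be checked from the carrier counts it reports (27 of 305 patients, 1 of 176 controls). The sketch below is illustrative only: the abstract does not say how the confidence interval was computed, so the Wald (log odds ratio) interval used here is an assumption, and the function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 carrier table.

    Assumption: the interval is built on the log odds ratio with the
    standard error sqrt(1/a + 1/b + 1/c + 1/d); the abstract does not
    state which method the authors used.
    """
    a = case_carriers                      # carriers among patients
    b = case_total - case_carriers         # non-carriers among patients
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # non-carriers among controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")

    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Counts taken from the abstract: 27/305 patients and 1/176 controls carry G2385R.
estimate, lower, upper = odds_ratio_ci(27, 305, 1, 176)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the result is close to the reported odds ratio of 16.99 and interval of 2.29 to 126.21. The interval is wide because only one control carries the variant, so the 1/c term dominates the standard error.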

    Genome-wide admixture and association study of serum iron, ferritin, transferrin saturation and total iron binding capacity in African Americans

    Iron is an essential component of many important proteins and enzymes, including hemoglobin, which is responsible for carrying oxygen to the cells. African Americans (AAs) have a greater prevalence of iron deficiency compared with European Americans. We conducted genome-wide admixture-mapping and association studies for serum iron, serum ferritin, transferrin saturation (SAT) and total iron binding capacity (TIBC) in 2347 AAs participating in the Jackson Heart Study (JHS). Follow-up replication analyses for JHS iron-trait associated SNPs were conducted in 329 AA participants in the Healthy Aging in Neighborhoods of Diversity across the Life Span study (HANDLS). Higher estimated proportions of global African ancestry were significantly associated with lower levels of iron (P = 2.4 × 10⁻⁵), SAT (P = 0.0019) and TIBC (P = 0.042). We observed significant associations (P < 5 × 10⁻⁸) between serum TIBC levels and two independent SNPs around TF on chromosome 3, the first report of a genome-wide significant second independent signal in this region, and SNPs near two novel genes: HDGFL1 on chromosome 6 and MAF on chromosome 16. We also observed significant associations between ferritin levels and SNPs near GAB3 on chromosome X. We replicated our two independent associations at TF and our association at GAB3 in HANDLS. Our study provides evidence for both shared and unique genetic risk factors that are associated with iron-related measures in AAs. The top two variants in TF explain 11.2% of the total variation in TIBC levels in AAs after accounting for age, gender, body mass index and background ancestry.

    Imputation of variants from the 1000 Genomes Project modestly improves known associations and can identify low-frequency variant-phenotype associations undetected by HapMap based imputation

    PMCID: PMC3655956. Genome-wide association (GWA) studies have been limited by the reliance on common variants present on microarrays or imputable from the HapMap Project data. More recently, the completion of the 1000 Genomes Project has provided variant and haplotype information for several million variants derived from sequencing over 1,000 individuals. To help understand the extent to which more variants (including low frequency (1% ≤ MAF < 5%) and rare variants (MAF < 1%)) can enhance previously identified associations and identify novel loci, we selected 93 quantitative circulating factors for which data were available from the InCHIANTI population study. These phenotypes included cytokines, binding proteins, hormones, vitamins and ions. We selected these phenotypes because many have known strong genetic associations and are potentially important to help understand disease processes. We performed a genome-wide scan for these 93 phenotypes in InCHIANTI. We identified 21 signals and 33 signals that reached P < 5×10⁻⁸ based on HapMap and 1000 Genomes imputation, respectively, and 9 and 11 that reached a stricter, likely conservative, threshold of P < 5×10⁻¹¹, respectively. Imputation of 1000 Genomes genotype data modestly improved the strength of known associations. Of 20 associations detected at P < 5×10⁻⁸ in both analyses (17 of which represent well replicated signals in the NHGRI catalogue), six were captured by the same index SNP, five were nominally more strongly associated in 1000 Genomes imputed data and one was nominally more strongly associated in HapMap imputed data. We also detected an association between a low frequency variant and phenotype that was previously missed by HapMap based imputation approaches. An association between rs112635299 and alpha-1 globulin near the SERPINA gene represented the known association between rs28929474 (MAF = 0.007) and alpha1-antitrypsin that predisposes to emphysema (P = 2.5×10⁻¹²). Our data provide important proof of principle that 1000 Genomes imputation will detect novel, low-frequency, large-effect associations.
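
The frequency bins this abstract uses (rare: MAF < 1%; low frequency: 1% ≤ MAF < 5%) can be written as a small helper. This is a minimal sketch: the function name `frequency_class` is hypothetical, and labelling everything at or above 5% as "common" is an assumption based on general usage, since the abstract defines only the two lower bins.

```python
def frequency_class(maf):
    """Bin a minor allele frequency (MAF) using the abstract's thresholds.

    rare:          MAF < 0.01
    low frequency: 0.01 <= MAF < 0.05
    common:        MAF >= 0.05 (assumed label; not defined in the abstract)
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low frequency"
    return "common"


# rs28929474 is reported in the abstract with MAF = 0.007.
print(frequency_class(0.007))
```

With the reported MAF of 0.007, rs28929474 falls in the rare bin.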

    Multi-modality machine learning predicting Parkinson's disease

    Personalized medicine promises individualized disease prediction and treatment. The convergence of machine learning (ML) and available multimodal data is key moving forward. We build upon previous work to deliver multimodal predictions of Parkinson's disease (PD) risk and systematically develop a model using GenoML, an automated ML package, to make improved multi-omic predictions of PD, validated in an external cohort. We investigated top features, constructed hypothesis-free disease-relevant networks, and investigated drug-gene interactions. We performed automated ML on multimodal data from the Parkinson's Progression Markers Initiative (PPMI). After selecting the best performing algorithm, all PPMI data were used to tune the selected model. The model was validated in the Parkinson's Disease Biomarker Program (PDBP) dataset. Our initial model showed an area under the curve (AUC) of 89.72% for the diagnosis of PD. The tuned model was then tested for validation on external data (PDBP, AUC 85.03%). Optimizing thresholds for classification increased the diagnosis prediction accuracy and other metrics. Finally, networks were built to identify gene communities specific to PD. Combining data modalities outperforms the single biomarker paradigm. UPSIT and PRS contributed most to the predictive power of the model, but their accuracy is supplemented by many smaller-effect transcripts and risk SNPs. Our model is best suited to identifying large groups of individuals to monitor within a health registry or biobank to prioritize for further testing. This approach allows complex predictive models to be reproducible and accessible to the community, with the package, code, and results publicly available.