64 research outputs found

    Prognostic implications of troponin T variations in inherited cardiomyopathies using systems biology.

    Cardiac troponin T variations have often been used as an example of the application of clinical genotyping for prognostication and risk stratification measures for the management of patients with a family history of sudden cardiac death or familial cardiomyopathy. Given the disparity in patient outcomes and therapy options, we investigated the impact of variations on the intermolecular interactions across the thin filament complex as an example of an unbiased systems biology method to better define clinical prognosis to aid future management options. We present a novel unbiased dynamic model to define and analyse the functional, structural and physico-chemical consequences of genetic variations among the troponins. This was subsequently integrated with clinical data from accessible global multi-centre systematic reviews of familial cardiomyopathy cases from 106 articles in the literature: 136 disease-causing variations pertaining to 981 global clinical cases. Troponin T variations showed distinct pathogenic hotspots for dilated and hypertrophic cardiomyopathies; considering the causes of cardiovascular death separately, there was a worse survival in terms of sudden cardiac death for patients with a variation at regions 90-129 and 130-179 when compared to amino acids 1-89 and 200-288. Our data support variations among 90-130 as being a hotspot for sudden cardiac death and the region 131-179 for heart failure death/transplantation outcomes wherein the most common phenotype was dilated cardiomyopathy. Survival analysis into regions of high risk (regions 90-129 and 130-180) and low risk (regions 1-89 and 200-288) was significant for sudden cardiac death (p = 0.011) and for heart failure death/transplant (p = 0.028). Our integrative genomic and structural model, spanning genotype to clinical data integration, has implications for enhancing clinical genomics methodologies to improve risk stratification.

    Mendelian randomization supports bidirectional causality between telomere length and clonal hematopoiesis of indeterminate potential

    Human genetic studies support an inverse causal relationship between leukocyte telomere length (LTL) and coronary artery disease (CAD), but directionally mixed effects for LTL and diverse malignancies. Clonal hematopoiesis of indeterminate potential (CHIP), characterized by expansion of hematopoietic cells bearing leukemogenic mutations, predisposes to both hematologic malignancy and CAD. TERT (which encodes telomerase reverse transcriptase) is the most significantly associated germline locus for CHIP in genome-wide association studies. Here, we investigated the relationship between CHIP, LTL, and CAD in the Trans-Omics for Precision Medicine (TOPMed) program (n = 63,302) and UK Biobank (n = 47,080). Bidirectional Mendelian randomization studies were consistent with longer genetically imputed LTL increasing the propensity to develop CHIP, while CHIP, in turn, hastens the shortening of measured LTL (mLTL). We also demonstrated evidence of modest mediation between CHIP and CAD by mLTL. Our data promote an understanding of potential causal relationships across CHIP and LTL toward prevention of CAD.

    Clonal Haematopoiesis and Risk of Chronic Liver Disease

    Chronic liver disease is a major public health burden worldwide. Although different aetiologies and mechanisms of liver injury exist, progression of chronic liver disease follows a common pathway of liver inflammation, injury and fibrosis. Here we examined the association between clonal haematopoiesis of indeterminate potential (CHIP) and chronic liver disease in 214,563 individuals from 4 independent cohorts with whole-exome sequencing data (Framingham Heart Study, Atherosclerosis Risk in Communities Study, UK Biobank and Mass General Brigham Biobank). CHIP was associated with an increased risk of prevalent and incident chronic liver disease (odds ratio = 2.01, 95% confidence interval (95% CI) [1.46, 2.79]; P < 0.001). Individuals with CHIP were more likely to demonstrate liver inflammation and fibrosis detectable by magnetic resonance imaging compared to those without CHIP (odds ratio = 1.74, 95% CI [1.16, 2.60]; P = 0.007). To assess potential causality, Mendelian randomization analyses showed that genetic predisposition to CHIP was associated with a greater risk of chronic liver disease (odds ratio = 2.37, 95% CI [1.57, 3.6]; P < 0.001). In a dietary model of non-alcoholic steatohepatitis, mice transplanted with Tet2-deficient haematopoietic cells demonstrated more severe liver inflammation and fibrosis. These effects were mediated by the NLRP3 inflammasome and increased levels of expression of downstream inflammatory cytokines in Tet2-deficient macrophages. In summary, clonal haematopoiesis is associated with an elevated risk of liver inflammation and chronic liver disease progression through an aberrant inflammatory response.

    Germline variants at SOHLH2 influence multiple myeloma risk

    Funding Information: This work was supported by grants from the Knut and Alice Wallenberg Foundation (2012.0193 and 2017.0436), the Swedish Research Council (2017-02023), the Swedish Cancer Society (2017/265), Stiftelsen Borås Forsknings- och Utvecklingsfond mot Cancer, the Nordic Cancer Union (R217-A13329-18-S65), EU-MSCA-COFUND 754299 CanFaster, the Myeloma UK and Cancer Research UK (C1298/A8362), a Jacquelin Forbes-Nixon Fellowship, and Mr. Ralph Stockwell. We thank Ellinor Johnsson and Anna Collin for their assistance. We are indebted to the clinicians and patients who contributed samples. Open access funding provided by Lund University. Publisher Copyright: © 2021, The Author(s).
    Multiple myeloma (MM) is caused by the uncontrolled, clonal expansion of plasma cells. While there is epidemiological evidence for inherited susceptibility, the molecular basis remains incompletely understood. We report a genome-wide association study totalling 5,320 cases and 422,289 controls from four Nordic populations, and find a novel MM risk variant at SOHLH2 at 13q13.3 (risk allele frequency = 3.5%; odds ratio = 1.38; P = 2.2 × 10^-14). This gene encodes a transcription factor involved in gametogenesis that is normally only weakly expressed in plasma cells. The association is represented by 14 variants in linkage disequilibrium. Among these, rs75712673 maps to a genomic region with open chromatin in plasma cells, and upregulates SOHLH2 in this cell type. Moreover, rs75712673 influences transcriptional activity in luciferase assays, and shows a chromatin looping interaction with the SOHLH2 promoter. Our work provides novel insight into MM susceptibility. Peer reviewed

    Functional dissection of inherited non-coding variation influencing multiple myeloma risk

    Funding Information: This work was supported by grants from the Knut and Alice Wallenberg Foundation (2012.0193 and 2017.0436), the Swedish Research Council (2017-02023 and 2018-00424), the Swedish Cancer Society (2017/265), the Nordic Cancer Union (R217-A13329-18-S65), Arne and Inga-Britt Lundberg’s Stiftelse (2017-0055), European Research Council (EU-MSCA-COFUND 754299 CanFaster), Myeloma UK and Cancer Research UK (C1298/A8362), The National Institute of Health (R01 DK103794 and R01HL146500), the New York Stem Cell Foundation, a gift from the Lodish Family to Boston Children’s Hospital, and Mr. Ralph Stockwell. We thank Ellinor Johnsson for her assistance between 2011 and 2020. We are indebted to the patients who participated in the study. Publisher Copyright: © 2022, The Author(s).
    Thousands of non-coding variants have been associated with increased risk of human diseases, yet the causal variants and their mechanisms-of-action remain obscure. In an integrative study combining massively parallel reporter assays (MPRA), expression analyses (eQTL, meQTL, PCHiC) and chromatin accessibility analyses in primary cells (caQTL), we investigate 1,039 variants associated with multiple myeloma (MM). We demonstrate that MM susceptibility is mediated by gene-regulatory changes in plasma cells and B-cells, and identify putative causal variants at six risk loci (SMARCD3, WAC, ELL2, CDCA7L, CEP120, and PREX1). Notably, three of these variants co-localize with significant plasma cell caQTLs, signaling the presence of causal activity at these precise genomic positions in an endogenous chromosomal context in vivo. Our results provide a systematic functional dissection of risk loci for a hematologic malignancy. Peer reviewed

    Feature selection for classification of single amino acid variations

    Genetic variations that lead to changes in amino acid sequences can cause structural and functional changes in proteins. Not all such variations show phenotypic effects, so it is important to have classifiers that can distinguish disease-causing variations from neutral ones in order to prioritize the experimental study of variants. A large number of features associated with variations can be extracted, but many of them do not contribute to classification; instead, they increase the computational time and may even degrade classification ability. Feature selection filters out the non-relevant and redundant features from an input feature set so as to obtain a feature subset that can induce a model with higher performance. 615 features that define the physicochemical and biochemical properties of amino acids were collected from the AAindex database. Four different feature selection techniques, Least Absolute Shrinkage and Selection Operator (LASSO), random forest, Random Forest Artificial Contrast with Ensembles (RF-ACE) and Area Under the ROC Curve of Random Forest (AUCRF), were applied to select the most relevant features for classification of variations. The classification abilities of the feature subsets selected by the different approaches were compared. 7 features that can represent the 615 input features were selected. The selected feature subset takes less computational time and has slightly better classification ability compared to the whole feature set. Feature selection is an effective tool in machine learning to reduce the number of features and thus reduce the computational time. Application of feature selection can also increase the performance of the model. Keywords: feature selection, variation classification, amino acid features
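
The abstract above names LASSO as one of its four feature selection techniques. As a rough illustration of how an L1 penalty discards uninformative features, here is a minimal pure-Python coordinate-descent sketch. Everything in it is an assumption made for illustration: the toy design matrix, the penalty value `lam`, and the function names are hypothetical, and it does not reproduce the authors' pipeline or the AAindex features.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


def selected_features(beta):
    """Indices of features whose coefficient survived the penalty."""
    return [j for j, b in enumerate(beta) if b != 0.0]


# Toy data (hypothetical): three mutually orthogonal features, 8 samples.
X = [
    [1, 1, 1], [1, 1, -1], [1, -1, 1], [1, -1, -1],
    [-1, 1, 1], [-1, 1, -1], [-1, -1, 1], [-1, -1, -1],
]
# The response depends strongly on feature 0, weakly on feature 1, not on feature 2.
y = [2.0 * row[0] + 0.1 * row[1] for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)
print(selected_features(beta))  # only feature 0 survives the penalty
```

With this orthogonal toy design the weakly informative and uninformative features are driven exactly to zero, which is the property that makes LASSO usable as a feature selector. Real feature sets are correlated, so the selected subset there depends on the penalty strength.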

    Tools and pipelines for interpreting the impacts of genetic variants

    Next generation sequencing (NGS) methods have been widely used for diagnosis. As the time and cost of sequencing have fallen sharply during the last decade, genome- and exome-wide sequencing have increasingly been used. The genome and exome projects produce large amounts of variation data, and the clinical relevance of a large proportion of them is not known. Among various types of genetic variations, the single nucleotide variations (SNVs) that lead to amino acid substitutions (AASs) are the most challenging to interpret. The best way to characterize the impacts of variations is by experimental studies. Since these experiments are expensive and time consuming, they cannot be performed for all identified variants. Computational tools can be used for scoring and ranking the variants and prioritizing them for experimental studies. Reliable and fast tools are necessary for accurate variation interpretation and to cope with the amounts of generated data. Several tools are available for predicting impacts of genetic variations. These tools use various types of information and have different performances. Various performance assessment studies have shown that most of the widely used tools have inconsistent and sub-optimal performance. In this study, we implemented a systematic approach to develop four computational tools for interpreting the impacts of genetic variations. The tools are based on machine learning algorithms. Benchmark variation datasets were obtained from various sources for training and testing the tools. A systematic feature selection technique was employed to identify relevant and non-redundant features for predicting variation impact. The benchmark datasets and the features were used for training the tools. Finally, the tools were tested by using independent datasets to estimate their performance for unseen data.
The tools PON-P2, PON-MMR2, and PON-PS predict impacts of AASs in human proteins and the PON-mt-tRNA tool predicts the impacts of SNVs in human mitochondrial transfer RNAs (mt-tRNAs). All the tools showed better performance when compared with state-of-the-art tools. These tools have consistently shown the best performance in our studies as well as in independent studies. The tools developed in this study are useful for ranking variations and prioritizing the likely harmful ones for further evaluation. These tools were developed for different purposes. Three of the tools (PON-P2, PON-MMR2, and PON-mt-tRNA) predict pathogenicity of variations. While PON-P2 is a generic tool for predicting pathogenicity of AASs in all human proteins, PON-MMR2 and PON-mt-tRNA are specific tools for predicting pathogenicity of variations in mismatch repair proteins and mt-tRNA genes, respectively. PON-PS is the first tool for predicting disease severity due to AASs. Pathogenicity of a variation indicates its relevance to a disease but cannot predict the severity of the phenotype. Early identification of disease severity promotes personalized medicine by facilitating early interventions, such as preventive measures, clinical monitoring, and molecular tests, for patients and their family members. The developed computational tools were used for analysing the impacts of variations in DNA mismatch repair proteins, mt-tRNA genes, and somatic variations in cancer. The impacts of all possible AASs in four mismatch repair proteins (MLH1, MSH2, MSH6, and PMS2) were predicted using PON-MMR2 and the impacts of all possible SNVs in 22 human mt-tRNAs were predicted using PON-mt-tRNA. We also studied the distribution of predicted pathogenic and benign variations in the protein domains and 3-dimensional structures of proteins and mt-tRNAs. PON-P2 was used to identify harmful somatic AASs from among 5 million somatic variations from 7,042 genomes or exomes grouped into 30 types of cancer.
Only a small fraction of the somatic variations were identified to be harmful. Although known cancer genes contained higher numbers of harmful variations, the proportion of harmful variations was only 40%. We prioritized the proteins that were implicated (containing harmful AASs) in the largest number of samples in each cancer type and studied the networks and pathways affected by them. In the functional interaction network, the prioritized proteins were centrally located. The significantly enriched pathways included several new pathways and previously known pathways implicated in cancer. Our findings facilitate prioritization of experimental studies in various cancer types as well as interpretation of variation impacts in mismatch repair proteins and mt-tRNA genes.

    PON-mt-tRNA: a multifactorial probability-based method for classification of mitochondrial tRNA variations.

    Transfer RNAs (tRNAs) are essential for encoding the transcribed genetic information from DNA into proteins. Variations in the human tRNAs are involved in diverse clinical phenotypes. Interestingly, all pathogenic variations in tRNAs are located in mitochondrial tRNAs (mt-tRNAs). Therefore, it is crucial to identify pathogenic variations in mt-tRNAs for disease diagnosis and proper treatment. We collected mt-tRNA variations using a classification based on evidence from several sources and used the data to develop a multifactorial probability-based prediction method, PON-mt-tRNA, for classification of mt-tRNA single nucleotide substitutions. We integrated a machine learning-based predictor and an evidence-based likelihood ratio for pathogenicity using evidence of segregation, biochemistry and histochemistry to predict the posterior probability of pathogenicity of variants. The accuracy and Matthews correlation coefficient (MCC) of PON-mt-tRNA are 1.00 and 0.99, respectively. In the absence of evidence from segregation, biochemistry and histochemistry, PON-mt-tRNA classifies variations based on the machine learning method with an accuracy and MCC of 0.69 and 0.39, respectively. We classified all possible single nucleotide substitutions in all human mt-tRNAs using PON-mt-tRNA. The variations in the loops are more often tolerated compared to the variations in stems. The anticodon loop contains comparatively more predicted pathogenic variations than the other loops. PON-mt-tRNA is available at http://structure.bmc.lu.se/PON-mt-tRNA/
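
The abstract above describes combining a machine learning prediction with an evidence-based likelihood ratio into a posterior probability, and reports performance as accuracy and Matthews correlation coefficient (MCC). The sketch below shows one standard way such quantities can be computed, using the odds form of Bayes' rule and the usual MCC formula. Treating the predictor's output as the prior probability is an assumption made here for illustration, and the function names and example numbers are hypothetical rather than taken from PON-mt-tRNA.

```python
import math


def posterior_probability(prior, likelihood_ratio):
    """Combine a prior probability with a likelihood ratio (odds form of Bayes' rule).

    `prior` is assumed here to be a classifier's predicted probability of
    pathogenicity; `likelihood_ratio` summarises independent evidence.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when any marginal count is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical example: an uninformative prior of 0.5 and evidence that is
# nine times more likely under pathogenicity than under neutrality.
print(posterior_probability(0.5, 9.0))

# Hypothetical confusion matrix: 3 true positives, 3 true negatives,
# 2 false positives, 2 false negatives.
print(mcc(3, 3, 2, 2))
```

A likelihood ratio of 1 leaves the prior unchanged, which mirrors the abstract's fallback to the machine learning prediction alone when no segregation, biochemistry or histochemistry evidence is available.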

    Variation Interpretation Predictors: Principles, Types, Performance, and Choice

    Next-generation sequencing methods have revolutionized the speed of generating variation information. Sequence data have a plethora of applications and will increasingly be used for disease diagnosis. Interpretation of the identified variants is usually not possible with experimental methods. This has caused a bottleneck that many computational methods aim at addressing. Fast and efficient methods for explaining the significance and mechanisms of detected variants are required for efficient precision/personalized medicine. Computational prediction methods have been developed in three areas to address the issue. There are generic tolerance (pathogenicity) predictors for filtering harmful variants. Gene/protein/disease-specific tools are available for some applications. Mechanism and effect-specific computer programs aim at explaining the consequences of variations. Here, we discuss the different types of predictors and their applications. We review available variation databases and prediction methods useful for variation interpretation. We discuss how the performance of methods is assessed and summarize existing assessment studies. A brief introduction is provided to the principles of the methods developed for variation interpretation as well as guidelines for how to choose the optimal tools and where the field is heading in the future