11 research outputs found

    SKEMPI 2.0: an updated benchmark of changes in protein–protein binding energy, kinetics and thermodynamics upon mutation

    Motivation: Understanding the relationship between the sequence, structure, binding energy, binding kinetics and binding thermodynamics of protein–protein interactions is crucial to understanding cellular signaling, the assembly and regulation of molecular complexes, the mechanisms through which mutations lead to disease, and protein engineering. Results: We present SKEMPI 2.0, a major update to our database of binding free energy changes upon mutation for structurally resolved protein–protein interactions. This version now contains manually curated binding data for 7085 mutations, an increase of 133%, including changes in kinetics for 1844 mutations, enthalpy and entropy changes for 443 mutations, and 440 mutations that abolish detectable binding. This work has been supported by the European Molecular Biology Laboratory [I.H.M.]; Biotechnology and Biological Sciences Research Council [Future Leader Fellowship BB/N011600/1 to I.H.M.]; Spanish Ministry of Economy and Competitiveness (MINECO) [BIO2016-79930-R to J.F.R.]; Interreg POCTEFA [EFA086/15 to J.F.R.]; European Commission [H2020 grant 676556 (MuG)]. Peer Reviewed. Postprint (published version).

    Mas-related G protein-coupled receptor-X2 (MRGPRX2) in drug hypersensitivity reactions

    The human ortholog MRGPRX2 and the mouse ortholog Mrgprb2 are activated by basic secretagogues and neurokinins. A number of commonly used small-molecule drugs (e.g., neuromuscular blocking agents, fluoroquinolones, vancomycin) have recently been shown to activate these receptors under in vitro experimental conditions, which results in mast cell degranulation. The above drugs are also known to cause IgE-mediated anaphylactic reactions in allergic patients. The new findings on mechanisms of drug-induced mast cell degranulation may modify the current management of drug hypersensitivity reactions. Clinical interpretation of mild drug-provoked hypersensitivity reactions, interpretation of skin tests with a drug of interest, and further recommendations for patients suspected of drug allergy are likely to be reconsidered. In this paper we discuss future directions in research on the identification and differentiation of MRGPRX2-mediated and IgE-dependent mast cell degranulation in patients presenting clinical features of drug-induced hypersensitivity reactions.

    Mutations at protein-protein interfaces: Small changes over big surfaces have large impacts on human health.

    Many essential biological processes including cell regulation and signalling are mediated through the assembly of protein complexes. Changes to protein-protein interaction (PPI) interfaces can affect the formation of multiprotein complexes, and consequently lead to disruptions in interconnected networks of PPIs within and between cells, further leading to phenotypic changes as functional interactions are created or disrupted. Mutations altering PPIs have been linked to the development of genetic diseases including cancer and rare Mendelian diseases, and to the development of drug resistance. The importance of these protein mutations has led to the development of many resources for understanding and predicting their effects. We propose that a better understanding of how these mutations affect the structure, function, and formation of multiprotein complexes provides novel opportunities for tackling them, including the development of small-molecule drugs targeted specifically to mutated PPIs. H.J. is currently funded by an Astex Pharmaceuticals Sustaining Innovation Postdoctoral Fellowship hosted at the Wellcome Trust Sanger Institute. M.A.T. was supported by scholarships from Promega Corporation, as well as the College of Agricultural and Life Sciences and the Department of Biochemistry at the University of Wisconsin-Madison, USA. B.O.M. was supported by the Bill and Melinda Gates Foundation. D.B.A. is the recipient of a C. J. Martin Research Fellowship from the National Health and Medical Research Council of Australia (APP1072476) and is funded by the Wellcome Trust and Jack Brockhoff Foundation (JBF 4186, 2016). T.L.B. receives funding from the University of Cambridge and The Wellcome Trust for facilities and support.

    Analysis of genetic variation and potential applications in genome-scale metabolic modeling

    Genetic variation is the motor of evolution and allows organisms to overcome the environmental challenges they encounter. It can be both beneficial and harmful in the process of engineering cell factories for the production of proteins and chemicals. Throughout the history of biotechnology, there have been efforts to exploit genetic variation in our favor to create strains with favorable phenotypes. Genetic variation can either be present in natural populations or be created artificially by mutagenesis and selection or by adaptive laboratory evolution. On the other hand, unintended genetic variation during a long-term production process may lead to significant economic losses, and it is important to understand how to control this type of variation. With the emergence of next-generation sequencing technologies, genetic variation in microbial strains can now be determined on an unprecedented scale and resolution by re-sequencing thousands of strains systematically. In this article, we review challenges in the integration and analysis of large-scale re-sequencing data, present an extensive overview of bioinformatics methods for predicting the effects of genetic variants on protein function, and discuss approaches for interfacing existing bioinformatics approaches with genome-scale models of cellular processes in order to predict effects of sequence variation on cellular phenotypes.

    Determining Effects of Non-synonymous SNPs on Protein-Protein Interactions using Supervised and Semi-supervised Learning

    Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such an nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed, resulting in promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of the SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks.
The accurate and balanced performance of the SNP-IN tool makes it readily applicable to studying the rewiring of large-scale protein-protein interaction networks, and it can be useful for functional annotation of disease-associated SNPs. The SNP-IN tool is freely accessible as a web-server at http://korkinlab.org/snpintool/.

    Strategies for the intelligent integration of genetic variance information in multiscale models of neurodegenerative diseases

    A more complete understanding of the genetic architecture of complex traits and diseases can maximize the utility of human genetics in disease screening, diagnosis, prognosis, and therapy. Undoubtedly, the identification of genetic variants linked to polygenic and complex diseases is of supreme interest for clinicians, geneticists, patients, and the public. Furthermore, determining how genetic variants affect an individual’s health and translating this knowledge into the development of new medicine can revolutionize the treatment of most common deleterious diseases. However, this requires the correlation of genetic variants with specific diseases, and accurate functional assessment of genetic variation in human DNA sequencing studies is still a nontrivial challenge in clinical genomics. Assigning functional consequences and clinical significance to genetic variants is an important step in human genome interpretation. The translation of genetic variants into functional molecular mechanisms is essential in disease pathogenesis and, eventually, in therapy design. Although various statistical methods are helpful to short-list the genetic variants for fine-mapping investigation, demonstrating their role in a molecular mechanism requires knowledge of functional consequences. This undoubtedly requires comprehensive investigation. Experimental interpretation of all the observed genetic variants is still impractical. Thus, the prediction of functional and regulatory consequences of the genetic variants using in-silico approaches is an important step in the discovery of clinically actionable knowledge. Since the interactions between phenotypes and genotypes are multi-layered and biologically complex, such associations present several challenges and simultaneously offer many opportunities to design new protocols for in-silico variant evaluation strategies.
This thesis presents a comprehensive protocol based on a causal reasoning algorithm that harvests and integrates multifaceted genetic and biomedical knowledge with various types of entities from several resources and repositories to understand how genetic variants perturb molecular interactions and initiate a disease mechanism. Firstly, as a case study of genetic susceptibility loci of Alzheimer’s disease, I reviewed and summarized all the existing methodologies for Genome Wide Association Studies (GWAS) interpretation, currently available algorithms, and computable modelling approaches. In addition, I formulated a new approach for modelling and simulations of genetic regulatory networks as an extension of the syntax of the Biological Expression Language (OpenBEL). This could allow the representation of genetic variation information in cause-and-effect models to predict the functional consequences of disease-associated genetic variants. Secondly, by using the new syntax of OpenBEL, I generated an OpenBEL model for Alzheimer’s disease (AD) together with genetic variants including their DNA, RNA or protein position, variant type and associated allele. To better understand the role of genetic variants in a disease context, I subsequently tried to predict the consequences of genetic variation based on the functional context provided by the network model. I further explained how genetic variation information could help to identify candidate molecular mechanisms for aetiologically complex diseases such as Alzheimer’s disease (AD) and Parkinson’s disease (PD). Since integration of genetic variation information can enhance the evidence base for shared pathophysiology pathways in complex diseases, I have addressed one of the key questions, namely the role of shared genetic variants in initiating shared molecular mechanisms between neurodegenerative diseases.
I systematically analysed shared genetic variation information of AD and PD and mapped it to find shared molecular aetiology between neurodegenerative diseases. My methodology highlighted that a comprehensive understanding of genetic variation needs integration and analysis of all omics data, in order to build a joint model to capture all datasets concurrently. Moreover, genomic loci should be considered to investigate the effects of GWAS variants rather than an individual genetic variant, whose effect is hard to predict in a biologically complex molecular mechanism, predominantly to investigate shared pathology.

    Systems biology of plant molecular networks: from networks to models

    Developmental processes are controlled by gene regulatory networks (GRNs), which are tightly coordinated networks of transcription factors (TFs) that activate and repress gene expression within a spatial and temporal context. In Arabidopsis thaliana, the key components and network structures of the GRNs controlling major plant reproduction processes, such as floral transition and floral organ identity specification, have been comprehensively unveiled. This is thanks to advances in ‘omics’ technologies combined with genetic approaches. Yet, because of the multidimensional nature of the data and because of the complexity of the regulatory mechanisms, there is a clear need to analyse these data in such a way that we can understand how TFs control complex traits. The use of mathematical modelling facilitates the representation of the dynamics of a GRN and enables better insight into GRN complexity, while multidimensional data analysis enables the identification of properties that connect different layers from genotype to phenotype. Mathematical modelling and multidimensional data analysis are both parts of a systems biology approach, and this thesis presents the application of both types of systems biology approaches to flowering GRNs. Chapter 1 comprehensively reviews advances in understanding of GRNs underlying plant reproduction processes, as well as mathematical models and multidimensional data analysis approaches to study plant systems biology. As discussed in Chapter 1, an important aspect of understanding these GRNs is how perturbations in one part of the network are transmitted to other parts, and ultimately how this results in changes in phenotype. Given the complexity of recent versions of Arabidopsis GRNs - which involve highly-connected, non-linear networks of TFs, microRNAs, movable factors, hormones and chromatin modifying proteins - it is not possible to predict the effect of gene perturbations on e.g. 
flowering time in an intuitive way by just looking at the network structure. Therefore, mathematical modelling plays an important role in providing a quantitative understanding of GRNs. In addition, aspects of multidimensional data analysis for understanding GRNs underlying plant reproduction are also discussed in the first Chapter. This includes not only the integration of experimental data, e.g. transcriptomics with protein-DNA binding profiling, but also the integration of different types of networks identified by ‘omics’ approaches, e.g. protein-protein interaction networks and gene regulatory networks. Chapter 2 describes a mathematical model for representing the dynamics of key genes in the GRN of flowering time control. We modelled with ordinary differential equations (ODEs) the physical interactions and regulatory relationships of a set of core genes controlling Arabidopsis flowering time in order to quantitatively analyse the relationship between their expression levels and the flowering time response. We considered a core GRN composed of eight TFs: SHORT VEGETATIVE PHASE (SVP), FLOWERING LOCUS C (FLC), AGAMOUS-LIKE 24 (AGL24), SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1), APETALA1 (AP1), FLOWERING LOCUS T (FT), LEAFY (LFY) and FD. The connections and interactions amongst these components are justified based on experimental data, and the model is parameterised by fitting the equations to quantitative data on gene expression and flowering time. Then the model is validated with transcript data from a range of mutants. We verify that the model is able to describe some quantitative patterns seen in expression data under genetic perturbations, which supported the credibility of the model and its dynamic properties. The proposed model is able to predict the flowering time by assessing changes in the expression of the orchestrator of floral transition AP1. 
Overall, the work presents a framework that allows addressing how different quantitative inputs are combined into a single quantitative output, i.e. the timing of flowering. The model allowed studying the established genetic regulations, and we discuss in Chapter 5 the steps towards using the proposed framework to zoom in and obtain new insights about the molecular mechanisms underlying the regulations. Systems biology does not only involve the use of dynamic modelling but also the development of approaches for multidimensional data analysis that are able to integrate multiple levels of systems organization. In Chapter 3, we aimed at comprehensively identifying and characterizing cis-regulatory mutations that have an effect on the GRN of flowering time control. By using ChIP-seq data and information about known DNA binding motifs of TFs involved in plant reproduction, we identified single-nucleotide polymorphisms (SNPs) that are highly discriminative in the classification of the flowering time phenotypes. Often, SNPs that overlap the position of experimentally determined binding sites (e.g. by ChIP-seq) are considered putative regulatory SNPs. We showed that regulatory SNPs are difficult to pinpoint among the sea of polymorphisms localized within binding sites determined by ChIP-seq studies. To overcome this, we narrowed the resolution by focusing on the subset of SNPs that are located within ChIP-seq peaks but that are also part of known regulatory motifs. These SNPs were used as input in a classification algorithm that could predict flowering time of Arabidopsis accessions relative to Col-0. Our strategy is able to identify SNPs that have a biological link with changes in flowering time. We then surveyed the literature to formulate hypotheses that explain the regulatory mechanism underlying the difference in phenotype conferred by a SNP. 
Examples include SNPs that disrupt the flowering time gene FT, in which the mutation presumably disrupts the binding region of SVP. In Chapter 5 we discuss the steps towards extending our approach to obtain a more comprehensive survey of variants that have an effect on the flowering time control. In Chapter 4, we propose a method for genome-wide prediction of protein-protein interaction (PPI) sites from the Arabidopsis interactome. Our method, named SLIDERbio, uses features encoded in the sequence of proteins and their interactions to predict PPI sites. More specifically, our method mines PPI networks to find over-represented sequence motifs in pairs of interacting proteins. In addition, the inter-species conservation of these over-represented motifs, as well as their predicted surface accessibility, are taken into account to compute the likelihood of these motifs being located in a PPI site. Our results suggested that motifs over-represented in pairs of interacting proteins that are conserved across orthologs and that have high predicted surface accessibility are in general good putative interaction sites. We applied our method to obtain interactome-wide predictions for Arabidopsis proteins. The results were explored to formulate testable hypotheses for the molecular mechanisms underlying effects of spontaneous or induced mutagenesis on e.g. ZEITLUPE, CXIP1 and SHY2 (proteins relevant for flowering time). In addition, we showed that the binding sites are under stronger selective pressure than the overall protein sequence, and that this may be used to link sequence variability to functional divergence. Finally, Chapter 5 concludes this thesis and describes future perspectives in systems biology applied to the study of GRNs underlying plant reproduction processes. 
Two key directions are often followed in systems biology: 1) compiling systems-wide snapshots in which the relationships and interactions between the molecules of a system are comprehensively represented; and 2) generating accurate experimental data that can be used as input for the modelling concepts and techniques or multi-dimensional data analysis. Highlighted in Chapter 5 are the limitations in key steps within the systems biology framework applied to GRN studies. In addition, I discussed improvements and extensions that we envision for our model related to the GRN underlying the control of flowering time. Future steps for multi-dimensional data analysis are also discussed. To sum up, I discussed how to connect the different technologies developed in this thesis towards understanding the interplay between the roles of the genes, developmental stages and environmental conditions.

    Molecular characterization of Streptococcus agalactiae isolated from pregnant women in the Eastern Cape, South Africa and Windhoek, Namibia and antibacterial activities of some medicinal plant extracts on the isolates

    Streptococcus agalactiae (S. agalactiae), also known as group B Streptococcus (GBS), is one of the leading causes of bacterial morbidity and mortality among neonates worldwide. It causes invasive Early Onset Disease (EOD), which occurs in the first 7 days of life and is characterised by sepsis, pneumonia and meningitis, and Late Onset Disease (LOD), which occurs between 7 and 89 days of life. Late onset disease is characterised by meningitis and long-term neurological sequelae such as cerebral palsy, hearing impairment and cognitive challenges. S. agalactiae does not only infect neonates; it also infects the elderly, immunocompromised individuals and pregnant and non-pregnant women, causing invasive disease. Worldwide, 10-40 percent of healthy women are rectally or vaginally colonised with GBS and they face the risk of passing it to their babies during the process of childbirth. During parturition, a GBS-colonised pregnant woman transfers the bacterium to her new-born as the baby passes through the ruptured membrane, thus infecting the child. However, GBS has been reported to be transferred even without rupture of membranes. Once it infects the membranes, it is transferred into the amniotic fluid and subsequently infects the baby. It can be aspirated into the lungs, causing pneumonia, or it can infect the blood stream and be disseminated around the body, causing septicaemia, meningitis and other infections. Once in the neonate’s body, the bacterium is able to evade the immune system as the host immune system is not yet fully developed. Bacterial evasion of the immune system is enhanced by its various virulence factors, which are deployed to help it escape the immune system. These include the polysaccharide capsule, haemolysin and the release of complement-inactivating factors such as C5a peptidase. 
The World Health Organisation (WHO) (2010) recommends universal screening of pregnant women to identify those who are colonised and at risk of passing the bacterium to their babies during birth. WHO also recommends identifying at-risk women and providing Intrapartum Antibiotic Prophylaxis (IAP) using penicillin. However, problems arise in penicillin-allergic women, and while alternatives for IAP include erythromycin and clindamycin, there is increasing resistance to these drugs, thereby limiting therapeutic options. Antimicrobial susceptibility testing is also not always possible in most resource-constrained countries due to poor infrastructure, limited access to health care and the logistical problems in implementing the WHO guidelines. Alternative therapeutic options for GBS infection include developing new and potent antibiotics, development of a vaccine, use of medicinal plants and the use of bacteriophage therapy. While these look like better alternatives, there is massive scientific work to be carried out to ensure proper characterisation and efficiency of such alternatives. This process should be followed by in vitro diagnostic testing, experiments with animal models and clinical trials. The problems encountered during vaccine development to curtail GBS infection are compounded by the multiplicity of S. agalactiae capsular types, which vary in different geographic locations. Medicinal plants are a cheap and convenient option since they are widely used in communities, but the phytochemical components of the plants have to be identified and subjected to in vitro testing to evaluate their therapeutic efficacy as antimicrobial agents. 
This study therefore sought to isolate GBS from pregnant women between 35 and 37 weeks' gestation in Windhoek (Namibia) and the Eastern Cape (South Africa), to determine the prevalence of GBS colonisation in the vagina and rectum of the pregnant women, to characterise the isolates by molecular techniques, to determine the antimicrobial resistance profiles and genes of the isolates, and to explore the efficacies of medicinal plant extracts as possible candidates for therapeutic options.