A Coevolutionary Residue Network at the Site of a Functionally Important Conformational Change in a Phosphohexomutase Enzyme Family
Coevolution analyses identify residues that co-vary with each other during evolution, revealing sequence relationships unobservable from traditional multiple sequence alignments. Here we describe a coevolutionary analysis of phosphomannomutase/phosphoglucomutase (PMM/PGM), a widespread and diverse enzyme family involved in carbohydrate biosynthesis. Mutual information and graph theory were used to identify a statistically significant network of highly connected residues. An examination of the most tightly connected regions of the coevolutionary network reveals that most of the involved residues are localized near an interdomain interface of this enzyme, known to be the site of a functionally important conformational change. The roles of four interface residues found in this network were examined via site-directed mutagenesis and kinetic characterization. For three of these residues, mutation to alanine reduces enzyme specificity to ∼10% or less of wild-type, while the other has ∼45% activity of wild-type enzyme. An additional mutant of an interface residue that is not densely connected in the coevolutionary network was also characterized, and shows no change in activity relative to wild-type enzyme. The results of these studies are interpreted in the context of structural and functional data on PMM/PGM. Together, they demonstrate that a network of coevolving residues links the highly conserved active site with the interdomain conformational change necessary for the multi-step catalytic reaction. This work adds to our understanding of the functional roles of coevolving residue networks, and has implications for the definition of catalytically important residues.
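The abstract above names mutual information as the co-variation measure. As a minimal sketch of that idea only (not the authors' pipeline, which also applies graph theory and significance testing not reproduced here), the following computes mutual information between two columns of a multiple sequence alignment. The toy alignment and the function name `mutual_information` are hypothetical, chosen purely for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)            # residue counts in column A
    count_b = Counter(col_b)            # residue counts in column B
    count_ab = Counter(zip(col_a, col_b))  # joint counts of residue pairs
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: one string per sequence, one character per column.
alignment = ["ADKL", "ADKL", "GEKL", "GEKV"]
columns = list(zip(*alignment))

mi_01 = mutual_information(columns[0], columns[1])  # columns 0 and 1 co-vary perfectly
mi_02 = mutual_information(columns[0], columns[2])  # column 2 is invariant
print(mi_01, mi_02)
```

In this toy case the perfectly co-varying pair scores one bit and the pair involving an invariant column scores zero, which is why fully conserved positions carry no coevolutionary signal under this measure.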
L1pred: A Sequence-Based Prediction Tool for Catalytic Residues in Enzymes with the L1-logreg Classifier
To understand enzyme functions, identifying the catalytic residues is a usual first step. Moreover, knowledge about catalytic residues is also useful for protein engineering and drug design. However, experimental identification of catalytic residues remains challenging for reasons of time and cost. Therefore, computational methods have been explored to predict catalytic residues. Here, we developed a new algorithm, L1pred, for catalytic residue prediction, by using the L1-logreg classifier to integrate eight sequence-based scoring functions. We tested L1pred and compared it against several existing sequence-based methods on the carefully designed datasets Data604 and Data63. With ten-fold cross-validation on the training dataset, Data604, L1pred achieved an area under the precision-recall curve (AUPR) of 0.2198 and an area under the ROC curve (AUC) of 0.9494. On the independent test dataset, Data63, it achieved AUPR and AUC values of 0.2636 and 0.9375, respectively. Compared with other sequence-based methods, L1pred showed the best performance on both datasets. We also analyzed the importance of each attribute in the algorithm, and found that all the scores contributed more or less equally to the L1pred performance.
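The abstract above reports a high AUC alongside a much lower AUPR, a pattern that is common when positives (here, catalytic residues) are rare. The sketch below is not L1pred's evaluation code; it is a small illustration of how the two metrics can be computed from labels and scores. The function names, labels, and scores are hypothetical, and the average-precision variant of AUPR used here assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Hypothetical per-residue data: 2 catalytic residues (label 1) among 10.
labels = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.8, 0.1, 0.3, 0.05, 0.15, 0.25, 0.12, 0.4]

auc = roc_auc(labels, scores)
aupr = average_precision(labels, scores)
print(auc, aupr)
```

A single high-scoring negative barely moves the AUC but costs precision at every later recall step, so the AUPR falls further than the AUC on imbalanced data.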
The Arabidopsis DNA polymerase δ has a role in the deposition of transcriptionally active epigenetic marks, development and flowering
DNA replication is a key process in living organisms. DNA polymerase α (Polα) initiates strand synthesis, which is performed by Polε and Polδ in leading and lagging strands, respectively. Whereas loss of DNA polymerase activity is incompatible with life, viable mutants of Polα and Polε were isolated, allowing the identification of their functions beyond DNA replication. In contrast, no viable mutants in the Polδ polymerase-domain were reported in multicellular organisms. Here we identify such a mutant which is also thermosensitive. Mutant plants were unable to complete development at 28°C, looked normal at 18°C, but displayed increased expression of DNA replication-stress marker genes, homologous recombination and lysine 4 histone 3 trimethylation at the SEPALLATA3 (SEP3) locus at 24°C, which correlated with ectopic expression of SEP3. Surprisingly, high expression of SEP3 in vascular tissue promoted FLOWERING LOCUS T (FT) expression, forming a positive feedback loop with SEP3 and leading to early flowering and curly leaf phenotypes. These results strongly suggest that the DNA polymerase δ is required for the proper establishment of transcriptionally active epigenetic marks and that its failure might affect development by affecting the epigenetic control of master genes.
Affiliation: Iglesias, Francisco Manuel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
Affiliation: Bruera, Natalia Alejandra. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
Affiliation: Dergan Dylon, Leonardo Sebastian. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
Affiliation: Marino, Cristina Ester. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
Affiliation: Lorenzi, Hernán. J. Craig Venter Institute; United States
Affiliation: Mateos, Julieta Lisa. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina. Max Planck Institute for Plant Breeding Research; Germany
Affiliation: Turck, Franziska. Max Planck Institute for Plant Breeding Research; Germany
Affiliation: Coupland, George. Max Planck Institute for Plant Breeding Research; Germany
Affiliation: Cerdan, Pablo Diego. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquimicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina. Universidad de Buenos Aires. Departamento de Ciencias Exactas; Argentina
DisProt: intrinsic protein disorder annotation in 2020
The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and thereby helping to illuminate the ‘dark’ proteome.
A Comparative Structural Bioinformatics Analysis of the Insulin Receptor Family Ectodomain Based on Phylogenetic Information
The insulin receptor (IR), the insulin-like growth factor 1 receptor (IGF1R) and the insulin receptor-related receptor (IRR) are covalently linked homodimers made up of several structural domains. The molecular mechanism of ligand binding to the ectodomain of these receptors and the resulting activation of their tyrosine kinase domain is still not well understood. We have carried out an amino acid residue conservation analysis in order to reconstruct the phylogeny of the IR Family. We have confirmed the location of ligand binding site 1 of the IGF1R and IR. Importantly, we have also predicted the likely location of the insulin binding site 2 on the surface of the fibronectin type III domains of the IR. An evolutionarily conserved surface on the second leucine-rich domain that may interact with the ligand could not be detected. We suggest a possible mechanical trigger of the activation of the IR that involves a slight ‘twist’ rotation of the last two fibronectin type III domains in order to face the likely location of insulin. Finally, a strong selective pressure was found amongst the IRR orthologous sequences, suggesting that this orphan receptor has a yet unknown physiological role which may be conserved from amphibians to mammals.
Critical assessment of protein intrinsic disorder prediction
Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude.
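The Fmax values quoted above are usually understood as the maximum F1 score obtained over all decision thresholds. The following is a minimal sketch under that assumption; it is not CAID's evaluation code, and the abstract does not say whether residues are pooled across proteins or averaged per protein. The function name `f_max`, the labels, and the scores are hypothetical.

```python
def f_max(labels, scores):
    """Return (best F1, threshold) over thresholds taken from the observed scores."""
    best_f1, best_threshold = 0.0, None
    for threshold in sorted(set(scores)):
        # A residue is predicted disordered when its score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, l in zip(predicted, labels) if p and l == 1)
        fp = sum(1 for p, l in zip(predicted, labels) if p and l == 0)
        fn = sum(1 for p, l in zip(predicted, labels) if not p and l == 1)
        if tp == 0:
            continue  # F1 is zero (or undefined) without true positives
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold


# Hypothetical per-residue labels (1 = disordered) and predictor scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

best_f1, best_threshold = f_max(labels, scores)
print(best_f1, best_threshold)
```

Because the threshold is chosen to maximize F1 on the evaluated data, Fmax describes the best achievable operating point of a predictor rather than its performance at a default cutoff.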
Chasing coevolutionary signals in intrinsically disordered proteins complexes
Intrinsically disordered proteins/regions (IDPs/IDRs) are crucial components of the cell. They are highly abundant and participate ubiquitously in a wide range of biological functions, such as regulatory processes and cell signaling. Many of their important functions rely on protein interactions, by which they trigger or modulate different pathways. Sequence covariation, a powerful tool for protein contact prediction, has been applied successfully to predict protein structure and to identify protein–protein interactions, mostly of globular proteins. IDPs/IDRs also mediate a plethora of protein–protein interactions, highlighting the importance of addressing sequence covariation-based inter-protein contact prediction for this class of proteins. Despite their importance, a systematic approach to analyze the covariation phenomena of intrinsically disordered proteins and their complexes is still missing. Here we carry out a comprehensive critical assessment of coevolution-based contact prediction in IDP/IDR complexes and detail the challenges and possible limitations that emerge from their analysis. We found that the coevolutionary signal is faint in most of the complexes of disordered proteins but positively correlates with the interface size and binding affinity between partners. In addition, we discuss the state-of-the-art methodology through biological interpretation of the results, formulate evaluation guidelines and suggest future directions of development for the field.