Evolutionarily Conserved Substrate Substructures for Automated Annotation of Enzyme Superfamilies
The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. 
Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.
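The relationship between conserved and reacting substructures described above can be illustrated with a minimal sketch. It assumes each substructure has already been reduced to a set of atom indices on a reference substrate; the superfamily names, atom sets, and the function `reacting_fraction` are hypothetical and are not taken from the paper, which derives its substructures from graph isomorphisms among real substrates.

```python
def reacting_fraction(conserved_atoms, reacting_atoms):
    """Fraction of the conserved substructure that takes part in the reaction."""
    conserved = set(conserved_atoms)
    if not conserved:
        return 0.0
    return len(conserved & set(reacting_atoms)) / len(conserved)


# Hypothetical toy data: atom indices on a reference substrate per superfamily.
superfamilies = {
    "sf_A": {"conserved": {0, 1, 2, 3}, "reacting": {2, 3, 4}},
    "sf_B": {"conserved": {0, 1, 2, 3, 4, 5, 6, 7}, "reacting": {7}},
}

fractions = {
    name: reacting_fraction(data["conserved"], data["reacting"])
    for name, data in superfamilies.items()
}

for name, value in fractions.items():
    print(f"{name}: {value:.3f} of the conserved substructure is reacting")
```

A value near 1 would mean the conserved substructure is mostly the reacting part, while a value near 0 would mean the conserved part mainly serves some other role such as binding.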
The evolution of enzyme function in the isomerases.
The advent of computational approaches to measure functional similarity between enzymes adds a new dimension to existing evolutionary studies based on sequence and structure. This paper reviews research efforts aiming to understand the evolution of enzyme function in superfamilies, presenting a novel strategy to provide an overview of the evolution of enzymes belonging to an individual EC class, using the isomerases as an exemplar.
Accurate Protein Structure Annotation through Competitive Diffusion of Enzymatic Functions over a Network of Local Evolutionary Similarities
High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks
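The diffusion step described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the ETA implementation: the network, the update rule (`alpha`, `steps`), the choice to treat nodes carrying a competing label as sinks, and the function names `diffuse` and `annotate` are all hypothetical. Each label is diffused from its known nodes, scores are converted to z-scores across the network, and each node takes the label with the highest z-score.

```python
from statistics import mean, pstdev


def diffuse(adj, seeds, sinks, alpha=0.5, steps=50):
    """Spread one label over the network; seeds stay at 1.0, sinks at 0.0."""
    score = {n: (1.0 if n in seeds else 0.0) for n in adj}
    for _ in range(steps):
        new = {}
        for node, nbrs in adj.items():
            if node in seeds:
                new[node] = 1.0
            elif node in sinks:
                new[node] = 0.0
            else:
                nbr_avg = mean(score[m] for m in nbrs) if nbrs else 0.0
                new[node] = alpha * nbr_avg + (1 - alpha) * score[node]
        score = new
    return score


def annotate(adj, known):
    """Assign every node the label whose diffusion z-score is highest there."""
    labels = sorted(set(known.values()))
    z = {}
    for label in labels:
        seeds = {n for n, lab in known.items() if lab == label}
        sinks = set(known) - seeds  # nodes with a competing known label
        s = diffuse(adj, seeds, sinks)
        mu, sd = mean(s.values()), pstdev(s.values())
        z[label] = {n: ((v - mu) / sd if sd else 0.0) for n, v in s.items()}
    return {n: max(labels, key=lambda lab: z[lab][n]) for n in adj}


# Hypothetical five-protein chain with one known function at each end.
network = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"],
}
known_functions = {"a": "EC 3.1.1.1", "e": "EC 2.7.1.1"}

annotations = annotate(network, known_functions)
print(annotations)
```

In this toy chain, proteins closer to a labeled node inherit that node's label; the middle node is equidistant from both, so its assignment is a tie broken by label order.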
Exploring the biological and chemical complexity of the ligases.
Using a novel method to map and cluster chemical reactions, we have re-examined the chemistry of the ligases [Enzyme Commission (EC) Class 6] and their associated protein families in detail. The type of bond formed by the ligase can be automatically extracted from the equation of the reaction, replicating the EC subclass division. However, this subclass division hides considerable complexities, especially for the C-N forming ligases, which fall into at least three distinct types. The lower levels of the EC classification for ligases are somewhat arbitrary in their definition and add little to understanding their chemistry or evolution. By comparing the multi-domain architecture of the enzymes and using sequence similarity networks, we examined the links between overall reaction and evolution of the ligases. These show that, whilst many enzymes that perform the same overall chemistry group together, both convergent (similar function, different ancestral lineage) and divergent (different function, common ancestor) evolution of function are observed. However, a common theme is that a single conserved domain (often the nucleoside triphosphate binding domain) is combined with ancillary domains that provide the variation in substrate binding and function.
Leveraging Structural Flexibility to Predict Protein Function
Proteins are essentially versatile and flexible molecules and understanding protein function plays a fundamental role in understanding biological systems. Protein structure comparisons are widely used for revealing protein function. However, because they assume rigidity or partial rigidity, most existing comparison methods do not consider conformational flexibility in protein structures. To address this issue, this thesis seeks to develop algorithms for flexible structure comparisons to predict one specific aspect of protein function, binding specificity. Given conformational samples as flexibility representation, we focus on two predictive problems related to specificity: aggregate prediction and individual prediction. For aggregate prediction, we have designed FAVA (Flexible Aggregate Volumetric Analysis). FAVA is the first conformationally general method to compare proteins with identical folds but different specificities. FAVA is able to correctly categorize members of protein superfamilies and to identify influential amino acids that cause different specificities. A second method, PEAP (Point-based Ensemble for Aggregate Prediction), employs ensemble clustering techniques from many base clusterings to predict binding specificity. This method incorporates structural motions of functional substructures and is capable of mitigating prediction errors. For individual prediction, the first method is an atomic point representation for representing flexibilities in the binding cavity. This representation is able to predict binding specificity on each protein conformation with high accuracy, and it is the first to analyze maps of binding cavity conformations that describe proteins with different specificities. Our second method introduces a volumetric lattice representation. This representation localizes the solvent-accessible shape of the binding cavity by computing cavity volume in each user-defined space. It proves to be more informative than point-based representations.
Last but not least, we discuss a structure-independent representation. This representation builds a lattice model on protein electrostatic isopotentials. This is the first known method to predict binding specificity explicitly from the perspective of electrostatic fields. The methods presented in this thesis incorporate the variety of protein conformations into the analysis of protein ligand binding, and provide more views on flexible structure comparisons and structure-based function annotation of molecular design.
The chemistry and evolution of enzyme function: isomerases as a case study
The study of the evolution of proteins has been traditionally undertaken from a sequence and structural point of view. However, any attempt to understand how protein function changes during evolution benefits from consistent definitions of function and robust approaches to quantitatively compare them. The function of enzymes is described as their ability to catalyse biochemical reactions according to the Enzyme Commission (EC). This dissertation explores aspects of the chemistry and evolution of a small class of enzymes catalysing geometrical and structural rearrangements between isomers, the isomerases (EC 5).
A comprehensive analysis of the overall chemistry of isomerase reactions based on bond changes, reaction centres and substrates and products revealed that isomerase reactions are chemically diverse and difficult to classify using a hierarchical system. Although racemases and epimerases (EC 5.1) and cis-trans isomerases (EC 5.2) are sensibly grouped according to changes of stereochemistry, the overall chemistry of intramolecular oxidoreductases (EC 5.3), intramolecular transferases (EC 5.4) and intramolecular lyases (EC 5.5) is challenging. The subclass "other isomerases" (EC 5.99) sits apart from other subclasses and exhibits great diversity. The current classification of isomerases in six subclasses reduces to two subclasses if the type of isomerism is considered. In addition, the separation of groups of isomerases sharing similar chemistry such as oxidosqualene cyclases and pseudouridine synthases from chemically complex sub-subclasses like intramolecular transferases acting on "other groups" (EC 5.4.99) might also improve the classification.
An overview of the evolution of isomerase function in superfamilies revealed three main findings. First, isomerases are more likely to evolve new functions in different EC primary classes, especially lyases (EC 4), rather than evolve to perform different isomerase reactions. Second, isomerases change their overall chemistry and conserve the structure of their substrates and products more often than conserving the chemistry and changing substrates and products. Last, the relationship between sequence and functional similarity suggests that correlations should be investigated on the basis of closely related enzymes.
Although previous research assumes a one-to-one relationship between EC number and biochemical reaction, almost one-third of all known EC numbers are linked to more than one biochemical reaction. This complexity was characterised for isomerase reactions and used to develop an approach to automatically explore it across the entire EC classification. Remarkably, about 30% of the EC numbers bearing more than one reaction are linked to different types of reactions, bearing key differences in catalysed bond changes. Several recommendations to improve the description of complex biochemical reaction data in the EC classification were proposed.
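The two proportions discussed above (EC numbers linked to more than one reaction, and, among those, EC numbers whose reactions differ in bond changes) could be computed along the following lines. This is only a sketch on made-up data: the identifiers `EC_a` to `EC_d`, the bond-change strings, and the function `summarize` are hypothetical placeholders, and the dissertation's actual comparison of reaction types is more detailed than a set comparison.

```python
def summarize(ec_to_reactions):
    """Return (share of ECs with >1 reaction, share of those with differing
    bond-change sets, sorted list of the differing ECs)."""
    multi = {ec: rxns for ec, rxns in ec_to_reactions.items() if len(rxns) > 1}
    diverse = sorted(
        ec for ec, rxns in multi.items()
        if len({frozenset(bond_changes) for bond_changes in rxns}) > 1
    )
    share_multi = len(multi) / len(ec_to_reactions) if ec_to_reactions else 0.0
    share_diverse = len(diverse) / len(multi) if multi else 0.0
    return share_multi, share_diverse, diverse


# Hypothetical toy mapping: each reaction is described by its set of bond changes.
toy_ec_reactions = {
    "EC_a": [{"C-H cleaved", "C-H formed"}],
    "EC_b": [{"C-O cleaved", "C-O formed"}, {"C-O formed", "C-O cleaved"}],
    "EC_c": [{"C-N formed"}, {"C-O cleaved", "O-H formed"}],
    "EC_d": [{"C=C cis to trans"}],
}

share_multi, share_diverse, diverse_ecs = summarize(toy_ec_reactions)
print(f"multi-reaction ECs: {share_multi:.0%}")
print(f"of those, with different bond changes: {share_diverse:.0%} {diverse_ecs}")
```

Here `EC_b` has two reactions with identical bond changes, so it counts as multi-reaction but not as chemically diverse, whereas `EC_c` counts as both.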
This dissertation explores enzymes from a functional perspective as an alternative to classical studies based on homology. This standpoint might prove useful in the search for sequence candidates for orphan enzymes and in the design of enzymes with novel activities.