
    Why do Sequence Signatures Predict Enzyme Mechanism?:Homology versus Chemistry

    We identify, firstly, InterPro sequence signatures representing evolutionary relatedness and, secondly, signatures identifying specific chemical machinery. Thus, we predict the chemical mechanisms of enzyme-catalysed reactions from “catalytic” and “non-catalytic” subsets of InterPro signatures. We first scanned our 249 sequences with InterProScan and then used the MACiE database to identify those amino acid residues which are important for catalysis. The sequences were mutated in silico to replace these catalytic residues with glycine, and then again scanned with InterProScan. Those signature matches from the original scan which disappeared on mutation were called “catalytic”. Mechanism was predicted using all signatures, only the 78 “catalytic” signatures, or only the 519 “non-catalytic” signatures. The non-catalytic signatures gave results indistinguishable from those for the whole feature set, with precision of 0.991 and sensitivity of 0.970. The catalytic signatures alone gave less impressive predictivity, with precision and sensitivity of 0.791 and 0.735, respectively. These results show that our successful prediction of enzyme mechanism is mostly by homology rather than by identifying catalytic machinery.
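The catalytic versus non-catalytic split described in this abstract can be pictured as a set difference between two scans. The following is a minimal sketch under stated assumptions, not the authors' pipeline: `scan_signatures` is a hypothetical stand-in for an InterProScan run (here a toy motif lookup), and the signature IDs, motifs, sequence, and residue position are all invented for illustration.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical toy "signature database": signature ID -> motif that must occur.
# A real analysis would run InterProScan instead; these motifs are invented.
TOY_SIGNATURES: Dict[str, str] = {
    "SIG_A": "HDS",  # motif that overlaps the catalytic residue used below
    "SIG_B": "MKT",  # motif that does not overlap any catalytic residue
}


def scan_signatures(sequence: str) -> Set[str]:
    """Stand-in scanner: return IDs of toy signatures whose motif occurs."""
    return {sig for sig, motif in TOY_SIGNATURES.items() if motif in sequence}


def mutate_to_glycine(sequence: str, positions: Iterable[int]) -> str:
    """Replace the residues at the given 0-based positions with glycine (G)."""
    residues = list(sequence)
    for pos in positions:
        residues[pos] = "G"
    return "".join(residues)


def split_signatures(
    sequence: str,
    catalytic_positions: Iterable[int],
    scan: Callable[[str], Set[str]] = scan_signatures,
) -> Tuple[Set[str], Set[str]]:
    """Label signatures by whether they survive the in silico mutation."""
    before = scan(sequence)
    after = scan(mutate_to_glycine(sequence, catalytic_positions))
    catalytic = before - after      # matches lost on mutation
    non_catalytic = before & after  # matches that persist
    return catalytic, non_catalytic


# Invented example: residue 5 (H) is treated as catalytic.
sequence = "MKTAYHDSLL"
catalytic, non_catalytic = split_signatures(sequence, catalytic_positions=[5])
print(sorted(catalytic), sorted(non_catalytic))
```

Passing the scanner as a parameter keeps the labelling logic separate from whatever tool actually produces the signature matches.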

    Nucleoside and Nucleotide Nomenclature

    Current nomenclature in the area of nucleosides, nucleotides, and nucleic acids comprises a mixture of (1) common names that have gained official recognition, (2) guidelines that have been derived and officially recommended by the International Union of Pure and Applied Chemistry (IUPAC)/International Union of Biochemistry and Molecular Biology (IUBMB), and (3) evolving usage that is derived by individual scientists and laboratories and subjected to peer review through publication. A working group was commissioned in 1998 by IUBMB to review guidelines for nucleotide (including oligonucleotide) nomenclature. As those guidelines are developed and made available, they will be referenced in future updates of this appendix. The main purpose of this appendix is to provide pertinent references that will direct the reader to the relevant guidelines or evolving nomenclature as described in the literature. When additional suggestions or guidance are appropriate, those comments are included as well. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/143595/1/cpnca01d.pd

    Migratory Urge and Gill Na(+),K(+)-ATPase Activity of Hatchery-Reared Atlantic Salmon Smolts from the Dennys and Penobscot River Stocks, Maine

    Hatchery-reared Atlantic salmon Salmo salar smolts produced from captive-reared Dennys River and sea-run Penobscot River broodstock are released into their source rivers in Maine. The adult return rate of Dennys smolts is comparatively low, and disparity in smolt quality between stocks resulting from genetic or broodstock rearing effects is plausible. Smolt behavior and physiology were assessed during sequential 14-d trials conducted in seminatural annular tanks with circular flow. “Migratory urge” (downstream movement) was monitored remotely using passive integrated transponder tags, and gill Na(+),K(+)-ATPase activity was measured at the beginning and end of the trials to provide an index of smolt development. The migratory urge of both stocks was low in early April, increased 20-fold through late May, and declined by the end of June. The frequency and seasonal distribution of downstream movement were independent of stock. In March and April, initial gill Na(+),K(+)-ATPase activities of Penobscot River smolts were lower than those of Dennys River smolts. For these trials, however, Penobscot River smolts increased enzyme activity after exposure to the tank, whereas Dennys River smolts did not, resulting in similar activities between stocks at the end of all trials. There was no clear relationship between migratory urge and gill Na(+),K(+)-ATPase activity. Gill Na(+),K(+)-ATPase activity of both stocks increased in advance of migratory urge and then declined while migratory urge was increasing. Maximum movement was observed from 2 h after sunset through 1 h after sunrise but varied seasonally. Dennys River smolts were slightly more nocturnal than Penobscot River smolts. These data suggest that Dennys and Penobscot River stocks are not markedly different in either physiological or behavioral expression of smolting.

    Gene Function Classification Using Bayesian Models with Hierarchy-Based Priors

    We investigate the application of hierarchical classification schemes to the annotation of gene function based on several characteristics of protein sequences including phylogenic descriptors, sequence-based attributes, and predicted secondary structure. We discuss three Bayesian models and compare their performance in terms of predictive accuracy. These models are the ordinary multinomial logit (MNL) model, a hierarchical model based on a set of nested MNL models, and an MNL model with a prior that introduces correlations between the parameters for classes that are nearby in the hierarchy. We also provide a new scheme for combining different sources of information. We use these models to predict the functional class of Open Reading Frames (ORFs) from the E. coli genome. The results from all three models show substantial improvement over previous methods, which were based on the C5 algorithm. The MNL model using a prior based on the hierarchy outperforms both the non-hierarchical MNL model and the nested MNL model. In contrast to previous attempts at combining these sources of information, our approach results in a higher accuracy rate when compared to models that use each data source alone. Together, these results show that gene function can be predicted with higher accuracy than previously achieved, using Bayesian models that incorporate suitable prior information.
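To make the multinomial logit (MNL) idea in this abstract concrete, the sketch below computes class probabilities with a softmax over linear scores. It also shows one possible way to make nearby classes share parameters: building each class's coefficients as a sum of per-branch parameters along its path in the hierarchy. That construction is an assumption made here for illustration; the abstract does not specify how the prior is built, and the hierarchy, parameter values, and feature vector are invented. No Bayesian inference is performed.

```python
import math
from typing import Dict, List, Sequence


def softmax_probs(scores: Dict[str, float]) -> Dict[str, float]:
    """Convert linear scores into probabilities (numerically stable softmax)."""
    top = max(scores.values())
    exps = {cls: math.exp(score - top) for cls, score in scores.items()}
    total = sum(exps.values())
    return {cls: value / total for cls, value in exps.items()}


def mnl_probs(x: Sequence[float], betas: Dict[str, List[float]]) -> Dict[str, float]:
    """MNL: P(class | x) is proportional to exp(x . beta_class)."""
    scores = {
        cls: sum(b * xi for b, xi in zip(beta, x)) for cls, beta in betas.items()
    }
    return softmax_probs(scores)


# Hypothetical two-level hierarchy: each class is reached by a path of branches.
PATHS: Dict[str, List[str]] = {
    "1.1": ["b1", "b1.1"],
    "1.2": ["b1", "b1.2"],
    "2.1": ["b2", "b2.1"],
}

# Invented per-branch parameters; classes 1.1 and 1.2 share branch b1.
PHI: Dict[str, List[float]] = {
    "b1": [1.0, 0.0],
    "b1.1": [0.2, 0.1],
    "b1.2": [-0.2, 0.1],
    "b2": [-1.0, 0.5],
    "b2.1": [0.0, 0.0],
}


def class_betas(
    paths: Dict[str, List[str]], phi: Dict[str, List[float]]
) -> Dict[str, List[float]]:
    """Sum branch parameters along each path, so sibling classes share terms."""
    betas = {}
    for cls, branches in paths.items():
        dim = len(phi[branches[0]])
        betas[cls] = [sum(phi[b][i] for b in branches) for i in range(dim)]
    return betas


betas = class_betas(PATHS, PHI)
probs = mnl_probs([1.0, 2.0], betas)
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

Because classes 1.1 and 1.2 both include the `b1` term, any prior placed on the branch parameters would induce correlated coefficients for those two classes, which is the flavour of dependence the abstract describes.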

    A new measure for functional similarity of gene products based on Gene Ontology

    BACKGROUND: Gene Ontology (GO) is a standard vocabulary of functional terms and allows for coherent annotation of gene products. These annotations provide a basis for new methods that compare gene products regarding their molecular function and biological role. RESULTS: We present a new method for comparing sets of GO terms and for assessing the functional similarity of gene products. The method relies on two semantic similarity measures: sim(Rel) and funSim. One measure (sim(Rel)) is applied in the comparison of the biological processes found in different groups of organisms. The other measure (funSim) is used to find functionally related gene products within the same or between different genomes. Results indicate that the method, in addition to being in good agreement with established sequence similarity approaches, also provides a means for the identification of functionally related proteins independent of evolutionary relationships. The method is also applied to estimating functional similarity between all proteins in Saccharomyces cerevisiae and to visualizing the molecular function space of yeast in a map of the functional space. A similar approach is used to visualize the functional relationships between protein families. CONCLUSION: The approach enables the comparison of the underlying molecular biology of different taxonomic groups and provides a new comparative genomics tool identifying functionally related gene products independent of homology. The proposed map of the functional space provides a new global view on the functional relationships between gene products or protein families.

    Is EC class predictable from reaction mechanism?

    We thank the Scottish Universities Life Sciences Alliance (SULSA) and the Scottish Overseas Research Student Awards Scheme of the Scottish Funding Council (SFC) for financial support. Background: We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism in descriptors, and also three approaches that encode only the overall chemical reaction. Both cross-validation and an external test set are used. Results: The three descriptor sets encoding overall chemical transformation perform better than the two descriptions of mechanism. SVM and RF models perform comparably well; kNN is less successful. Oxidoreductases and hydrolases are relatively well predicted by all types of descriptor; isomerases are well predicted by overall reaction descriptors but not by mechanistic ones. Conclusions: Our results suggest that pairs of similar enzyme reactions tend to proceed by different mechanisms. Oxidoreductases, hydrolases, and to some extent isomerases and ligases, have clear chemical signatures, making them easier to predict than transferases and lyases. We find evidence that isomerases as a class are notably mechanistically diverse and that their one shared property, of substrate and product being isomers, can arise in various unrelated ways. The performance of the different machine learning algorithms is in line with many cheminformatics applications, with SVM and RF being roughly equally effective. kNN is less successful, given the role that non-local information plays in successful classification. We note also that, despite a lack of clarity in the literature, EC number prediction is not a single problem; the challenge of predicting protein function from available sequence data is quite different from assigning an EC classification from a cheminformatics representation of a reaction.
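Of the three classifiers named in this abstract, k-Nearest Neighbours is the simplest to sketch. The example below is a generic majority-vote kNN over numeric descriptor vectors, not the authors' descriptors or code: the two-dimensional vectors and their EC class labels are invented purely to show the mechanics.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# One training example: (descriptor vector, EC class label).
Example = Tuple[Sequence[float], str]


def knn_predict(train: List[Example], query: Sequence[float], k: int = 3) -> str:
    """Predict the label of `query` by majority vote among its k nearest
    training examples, using Euclidean distance."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Invented reaction descriptors (hypothetical values, two features each).
TRAIN: List[Example] = [
    ((1.0, 0.0), "oxidoreductase"),
    ((0.9, 0.1), "oxidoreductase"),
    ((0.8, 0.2), "oxidoreductase"),
    ((0.0, 1.0), "hydrolase"),
    ((0.1, 0.9), "hydrolase"),
    ((0.2, 0.8), "hydrolase"),
]

prediction = knn_predict(TRAIN, (0.85, 0.1), k=3)
print(prediction)
```

Because kNN decides from the immediate neighbourhood of the query only, it uses purely local information, which is consistent with the abstract's remark that non-local information matters for this task.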

    Finding one's way in proteomics: a protein species nomenclature

    Our knowledge of proteins has greatly improved in recent years, driven by new technologies in the fields of molecular biology and proteome research. It has become clear that from a single gene not only one single gene product but many different ones - termed protein species - are generated, all of which may be associated with different functions. Nonetheless, an unambiguous nomenclature for describing individual protein species is still lacking. With the present paper we therefore propose a systematic nomenclature for the comprehensive description of protein species. The protein species nomenclature is flexible and adaptable to every level of knowledge and of experimental data in accordance with the exact chemical composition of individual protein species. As a minimum description the entry name (gene name + species according to the UniProt knowledgebase) can be used, if no analytical data about the target protein species are available.

    A Need for Improved Cellulase Identification from Metagenomic Sequence Data
