
    How to Find “Missing” Genes

    Assigning function to “new” proteins is frequently the rate-determining step for deciphering metabolic pathways and regulatory networks. Osterman and Begley break down this barrier by demonstrating that comparative analysis of microbial genomes is a powerful strategy for identifying pathway components.

    Can sequence determine function?

    The functional annotation of proteins identified in genome sequencing projects is based on similarities to homologs in the databases. As a result of the possible strategies for divergent evolution, homologous enzymes frequently do not catalyze the same reaction, and we conclude that assignment of function from sequence information alone should be viewed with some skepticism.

    A gold standard set of mechanistically diverse enzyme superfamilies

    Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.

    Target selection and annotation for the structural genomics of the amidohydrolase and enolase superfamilies

    To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member), to illustrate how this structure-focused approach can be used to generate hypotheses about sequence–structure–function relationships.

    Tools and strategies for discovering novel enzymes and metabolic pathways

    The number of entries in the sequence databases continues to increase exponentially – the UniProt database is increasing with a doubling time of ∼4 years (2% increase/month). Approximately 50% of the entries have uncertain, unknown, or incorrect function annotations because these are made by automated methods based on sequence homology. If the potential in complete genome sequences is to be realized, strategies and tools must be developed to facilitate experimental assignment of functions to uncharacterized proteins discovered in genome projects. The Enzyme Function Initiative (EFI; previously supported by U54GM093342 from the National Institutes of Health, now supported by P01GM118303) developed web tools for visualizing and analyzing (1) sequence and function space in protein families (EFI-EST) and (2) genome neighbourhoods in microbial and fungal genomes (EFI-GNT) to assist the design of experimental strategies for discovering the in vitro activities and in vivo metabolic functions of uncharacterized enzymes. The EFI developed an experimental platform for large-scale production of the solute binding proteins (SBPs) for ABC, TRAP, and TCT transport systems and their screening with a physical ligand library to determine the identities of the ligands for these transport systems. Because the genes that encode transport systems are often co-located with the genes that encode the catabolic pathways for the transported solutes, the identity of the SBP ligand together with the EFI-EST and EFI-GNT web tools can be used to discover new enzyme functions and new metabolic pathways. This approach is demonstrated with the characterization of a novel pathway for ethanolamine catabolism.

    A Protein Structure (or Function?) Initiative


    Special Issue: Current Topics in Mechanistic Enzymology 2019


    The Need for Manuscripts To Include Database Identifiers for Proteins
