54 research outputs found

    Non-homologous isofunctional enzymes: A systematic analysis of alternative solutions in enzyme evolution

    Background: Evolutionarily unrelated proteins that catalyze the same biochemical reactions are often referred to as analogous - as opposed to homologous - enzymes. The existence of numerous alternative, non-homologous enzyme isoforms presents an interesting evolutionary problem; it also complicates genome-based reconstruction of the metabolic pathways in a variety of organisms. In 1998, a systematic search for analogous enzymes resulted in the identification of 105 Enzyme Commission (EC) numbers that included two or more proteins without detectable sequence similarity to each other, including 34 EC nodes where proteins were known (or predicted) to have distinct structural folds, indicating independent evolutionary origins. In the past 12 years, many putative non-homologous isofunctional enzymes were identified in newly sequenced genomes. In addition, efforts in structural genomics resulted in a vastly improved structural coverage of proteomes, providing for definitive assessment of (non)homologous relationships between proteins.
    Results: We report the results of a comprehensive search for non-homologous isofunctional enzymes (NISE) that yielded 185 EC nodes with two or more experimentally characterized - or predicted - structurally unrelated proteins. Of these NISE sets, only 74 were from the original 1998 list. Structural assignments of the NISE show over-representation of proteins with the TIM barrel fold and the nucleotide-binding Rossmann fold. From the functional perspective, the set of NISE is enriched in hydrolases, particularly carbohydrate hydrolases, and in enzymes involved in defense against oxidative stress.
    Conclusions: These results indicate that at least some of the non-homologous isofunctional enzymes were recruited relatively recently from enzyme families that are active against related substrates and are sufficiently flexible to accommodate changes in substrate specificity.
    Reviewers: This article was reviewed by Andrei Osterman, Keith F. Tipton (nominated by Martijn Huynen) and Igor B. Zhulin. For the full reviews, go to the Reviewers' comments section.

    Natural Product Genomics and Metabolomics of Marine Bacteria

    Marine organisms are a treasure trove for the discovery of novel natural products, and, thus, marine natural products have been a focus of interest for researchers for decades. Some marine bacteria are prolific producers of natural products, occurring either free-living or, as recently shown, in symbiosis with marine animals. Recent advances in DNA sequencing have led to an enormous increase in published bacterial genomes and in bioinformatics tools to analyze natural product biosynthetic potential by various “genome mining” approaches. Similarly, analytical NMR and MS methods for the characterization and comparison of metabolomes of natural product producers have advanced. Novel interdisciplinary approaches combine genomics and metabolomics data for accelerated and targeted natural product discovery. This Special Issue invites articles from both genomics- and metabolomics-driven studies on marine bacteria with a focus on natural product discovery and characterization. We particularly welcome articles that combine genomic and metabolomic approaches for the dereplication and characterization of marine bacterial natural products.

    Exploring functional annotation through genomic and metagenomic data mining

    Functional profiling of genomes and metagenomes, as well as data mining for novel proteins, all rely on computational methods for functional annotation of protein sequences. Standard methods assign protein function based on detected homology to reference sequences, but often leave behind a significant fraction of hypothetical sequences ("dark matter") that cannot be annotated. To maximize our ability to extract new biological insights from newly sequenced genomes, it is critical to understand the advantages and limitations of homology-based annotation, and to explore alternative methods for inferring function. In this thesis, I performed a comprehensive exploration of computational protein annotation, with a focus on bacterial genomes and metagenomes. First, I applied homology-based methods to functionally annotate and analyze original datasets including newly sequenced Streptomyces strains, a wastewater metagenome, and microbial communities involved in vertebrate decomposition. These studies identified genes and functions of interest including cellulases, antibiotic resistance genes, and virulence factors. I then explored the limits of homology-based annotation by measuring annotation coverage, the fraction of annotated proteins in a proteome, across ~27,000 organisms in the microbial tree of life. This study demonstrated a wide range in annotation coverage across bacteria, from 2% to 86%. In addition, it revealed multiple factors, including taxonomy, genome size, and research bias, as strong influences on the degree to which proteomes could be annotated. To gain biological insights into hypothetical proteins of unknown function, I analyzed 4,049 domains of unknown function (DUFs) from Pfam. Using phylogenomic, taxonomic and metagenomic information, I detected statistical associations between domains and biological traits.
Association-based methods uncovered environment, lineage, and/or pathogen associations in just under half of all DUFs and highlighted new families such as DUF4765 as intriguing virulence factor candidates. Finally, I constructed a database of "ORFan" metagenomic sequences that cannot be annotated using standard approaches, and inferred functions for tens of thousands of these sequences using profile-profile comparison approaches. Motif analysis and genomic context validated these predictions, enabling the discovery of hundreds of novel candidate metalloproteases. Protein "dark matter", which includes a large pool of unannotated coding sequences, is an incredible resource for finding new proteins and functions of interest, and suggestions are included on how to prioritize these sequences for future study. A combination of homology-based and alternative annotation methods will be most effective for broad functional profiling of genomes and metagenomes, and can push the boundaries of functional interpretation of sequence data.

    Exploring the chemical space of post-translationally modified peptides in Streptomyces with machine learning

    The ongoing increase in antimicrobial resistance, combined with the low rate of discovery of novel antibiotics, is a serious threat to our health care. Genome mining has given new potential to the field of natural product discovery, as thousands of biosynthetic gene clusters (BGCs) are discovered for which the natural product is not known. Ribosomally synthesized and post-translationally modified peptides (RiPPs) represent a highly diverse class of natural products. The large number of different modifications that can be applied to a RiPP results in a large variety of chemical structures, but also stems from a large genetic variety in BGCs. As a result, no single method can effectively mine for all RiPP BGCs, which makes RiPPs an interesting source of new molecules. In this thesis, new methods are explored to mine genomes for the BGCs of novel RiPP variants, with a focus on discovering RiPPs that have new modifications. RRE-Finder is a new tool for the detection of RiPP Recognition Elements, domains that are often found in RiPP BGCs. DecRiPPter is another tool that employs machine learning models to discover new RiPP precursor genes encoded in the genomes. Both tools can be used to prioritize novel RiPP BGCs. Two candidate BGCs are characterized, one of which could be shown to specify a new RiPP, validating the approach. Grant 731.014.206 (Syngenopep, TKI Chemie) from the Dutch Research Council (NWO). Microbial Biotechnolog

    Characterizing alternative splicing and long non-coding RNA with high-throughput sequencing technology

    Indiana University-Purdue University Indianapolis (IUPUI). Several experimental methods have been developed for the study of the central dogma since the late 20th century. Protein mass spectrometry and next generation sequencing (including DNA-Seq and RNA-Seq) form a triangle of experimental methods, corresponding to the three vertices of the central dogma, i.e., DNA, RNA and protein. Numerous RNA sequencing and protein mass spectrometry experiments have been carried out in an attempt to understand how expression changes of known genes affect biological functions in various organisms. However, it was long overlooked that the resulting data of these experiments are in fact holograms that also reveal other delicate biological mechanisms, such as RNA splicing and the expression of long non-coding RNAs. In this dissertation, we carried out five studies based on high-throughput sequencing data, in an attempt to understand how RNA splicing and differential expression of long non-coding RNAs are associated with biological functions. In the first two studies, we identified and characterized 197 stimulant-induced and 477 developmentally regulated alternative splicing events from RNA sequencing data. In the third study, we introduced a method for identifying novel alternative splicing events that had never been documented. In the fourth study, we introduced a method for identifying known and novel RNA splicing junctions from protein mass spectrometry data. In the fifth study, we introduced a method for identifying long non-coding RNAs from poly-A selected RNA sequencing data. Taking advantage of these methods, we turned RNA sequencing and protein mass spectrometry data into an information gold mine of splicing and long non-coding RNA activities. 2019-05-0

    Allergen Delivery Inhibitors: A Rationale for Targeting Sentinel Innate Immune Signaling of Group 1 House Dust Mite Allergens through Structure-Based Protease Inhibitor Design

    Diverse evidence from epidemiologic surveys and investigations into the molecular basis of allergenicity has revealed that a small cadre of “initiator” allergens promote the development of allergic diseases, such as asthma, allergic rhinitis, and atopic dermatitis. Pre-eminent among these initiators are the group 1 allergens from house dust mites (HDM). In mites, group 1 allergens function as cysteine peptidase digestive enzymes to which humans are exposed by inhalation of HDM fecal pellets. Their protease nature confers the ability to activate high gain signaling mechanisms which promote innate immune responses, leading to the persistence of allergic sensitization. An important feature of this process is that the initiator drives responses both to itself and to unrelated allergens lacking these properties through a process of collateral priming. The clinical significance of group 1 HDM allergens in disease, their serodominance as allergens, and their IgE-independent bioactivities in innate immunity make these allergens interesting therapeutic targets in the design of new small-molecule interventions in allergic disease. The attraction of this new approach is that it offers a powerful, root-cause-level intervention from which beneficial effects can be anticipated by interference in a wide range of effector pathways associated with these complex diseases. This review addresses the general background to HDM allergens and the validation of group 1 as putative targets. We then discuss structure-based drug design of the first-in-class representatives of allergen delivery inhibitors aimed at neutralizing the proteolytic effects of HDM group 1 allergens, which are essential to the development and maintenance of allergic diseases.

    Polysaccharide utilization loci and associated genes in marine Bacteroidetes - compositional diversity and ecological relevance

    The synthesis of marine organic carbon compounds by photosynthetic macroalgae, microalgae (phytoplankton) and bacteria provides a basis for life in the ocean. In marine surface waters this primary production is largely dominated by microalgae and is especially pronounced during spring phytoplankton blooms. During and after these often diatom-dominated blooms, increased amounts of organic matter are released into the surrounding waters. Here, the organic matter, rich in polysaccharides, can trigger blooms of heterotrophic bacteria. Marine members of the Bacteroidetes are consistently found related to such bloom events. These bacteria are regularly detected as the first responders to thrive after phytoplankton spring blooms in temperate coastal regions and are often equipped with a variety of polysaccharide utilization gene clusters. These gene clusters, termed polysaccharide utilization loci (PULs), encode enzymes for the extracellular hydrolysis of polysaccharides and the subsequent uptake of oligosaccharides into the periplasm, where they are shielded from competing bacteria. This mechanism allows for rapid uptake and substrate hoarding, and thus could be one reason why Bacteroidetes are often seen as the first responders of the bacterioplankton community. The investigation of the so far largely unknown diversity and the ecological relevance of PULs in marine Bacteroidetes was the major goal of the work presented here. We could show that genomes of Bacteroidetes isolates from the North Sea, with free-living to micro- and macro-algae associated lifestyles, harboured a variety of these loci predicted to target in total 18 different substrate classes. Overall, PUL repertoires of these isolates showed considerable intra-genus and inter-genus variations, suggesting that Bacteroidetes species harbour distinct glycan niches, independent of their phylogenetic relationships.
By investigating the PUL repertoires of uncultured free-living Bacteroidetes during three consecutive years of spring phytoplankton blooms at the North Sea island of Helgoland, I could further reveal that the set of targeted substrates during these bloom events was dominated by only five of the substrate classes targeted by the isolates. These were the diatom storage polysaccharide laminarin, alpha-glucans, alginates, as well as substrates rich in alpha-mannans and sulfated xylans. In addition to this constrained set of substrate classes targeted by the free-living Bacteroidetes community, I could show that the species diversity during these blooms was limited and dominated by only 27 abundant and recurrent species that carried a limited number of abundant PULs. The majority of these PULs targeted laminarin and alpha-glucan substrates, which were likely targeted throughout the blooms. The less frequent PULs, targeting alpha-mannans and sulfated xylans, were predominantly detected during mid- and late-bloom phases, suggesting a relevance of these two substrate classes in the later phases of phytoplankton blooms. Overall, these findings highlight the recurrence of a few specialized Bacteroidetes species and the environmental relevance of specific polysaccharide substrate classes during spring phytoplankton blooms. However, for some of these substrate classes the origin, structural details and abundance during blooms are as yet largely unknown. These findings can serve as a guide for future laboratory studies that shed further light on the polysaccharide niches of abundant key players.