161 research outputs found

    SOLiD sequencing of four Vibrio vulnificus genomes enables comparative genomic analysis and identification of candidate clade-specific virulence genes

    Abstract Background: Vibrio vulnificus is the leading cause of reported death from consumption of seafood in the United States. Despite several decades of research on molecular pathogenesis, much remains to be learned about the mechanisms of virulence of this opportunistic bacterial pathogen. The two complete and annotated genomic DNA sequences of V. vulnificus belong to strains of clade 2, which is the predominant clade among clinical strains. Clade 2 strains generally possess higher virulence potential in animal models of disease compared with clade 1, which predominates among environmental strains. SOLiD sequencing of four V. vulnificus strains representing different clades (1 and 2) and biotypes (1 and 2) was used for comparative genomic analysis. Results: Greater than 4,100,000 bases were sequenced of each strain, yielding approximately 100-fold coverage for each of the four genomes. Although the read lengths of SOLiD genomic sequencing were only 35 nt, we were able to make significant conclusions about the unique and shared sequences among the genomes, including identification of single nucleotide polymorphisms. Comparative analysis of the newly sequenced genomes to the existing reference genomes enabled the identification of 3,459 core V. vulnificus genes shared among all six strains and 80 clade 2-specific genes. We identified 523,161 SNPs among the six genomes. Conclusions: We were able to glean much information about the genomic content of each strain using next generation sequencing. Flp pili, GGDEF proteins, and genomic island XII were identified as possible virulence factors because of their presence in virulent sequenced strains. Genomic comparisons also point toward the involvement of sialic acid catabolism in pathogenesis.

    Identification of a conserved N-terminal domain in the first module of ACV synthetases

    Abstract The l‐δ‐(α‐aminoadipoyl)‐l‐cysteinyl‐d‐valine synthetase (ACVS) is a trimodular nonribosomal peptide synthetase (NRPS) that provides the peptide precursor for the synthesis of β‐lactams. The enzyme has been extensively characterized in terms of tripeptide formation and substrate specificity. The first module is highly specific and is the only NRPS unit known to recruit and activate the substrate l‐α‐aminoadipic acid, which is coupled to the α‐amino group of l‐cysteine through an unusual peptide bond, involving its δ‐carboxyl group. Here we carried out an in‐depth investigation on the architecture of the first module of the ACVS enzymes from the fungus Penicillium rubens and the bacterium Nocardia lactamdurans. Bioinformatic analyses revealed the presence of a previously unidentified domain at the N‐terminus which is structurally related to condensation domains, but smaller in size. Deletion variants of both enzymes were generated to investigate the potential impact on penicillin biosynthesis in vivo and in vitro. The data indicate that the N‐terminal domain is important for catalysis

    Novel genomic island modifies DNA with 7-deazaguanine derivatives

    The discovery of ∼20-kb gene clusters containing a family of paralogs of tRNA guanosine transglycosylase genes, called tgtA5, alongside 7-cyano-7-deazaguanine (preQ₀) synthesis and DNA metabolism genes, led to the hypothesis that 7-deazaguanine derivatives are inserted in DNA. This was established by detecting 2′-deoxy-preQ₀ and 2′-deoxy-7-amido-7-deazaguanosine in enzymatic hydrolysates of DNA extracted from the pathogenic, Gram-negative bacterium Salmonella enterica serovar Montevideo. These modifications were absent in the closely related S. enterica serovar Typhimurium LT2 and from a mutant of S. Montevideo, each lacking the gene cluster. This led us to rename the genes of the S. Montevideo cluster as dpdA-K for 7-deazapurine in DNA. Similar gene clusters were analyzed in ∼150 phylogenetically diverse bacteria, and the modifications were detected in DNA from other organisms containing these clusters, including Kineococcus radiotolerans, Comamonas testosteroni, and Sphingopyxis alaskensis. Comparative genomic analysis shows that, in Enterobacteriaceae, the cluster is a genomic island integrated at the leuX locus, and the phylogenetic analysis of the TgtA5 family is consistent with widespread horizontal gene transfer. Comparison of transformation efficiencies of modified or unmodified plasmids into isogenic S. Montevideo strains containing or lacking the cluster strongly suggests a restriction–modification role for the cluster in Enterobacteriaceae. Another preQ₀ derivative, 2′-deoxy-7-formamidino-7-deazaguanosine, was found in the Escherichia coli bacteriophage 9g, as predicted from the presence of homologs of genes involved in the synthesis of the archaeosine tRNA modification. These results illustrate a deep and unexpected evolutionary connection between DNA and tRNA metabolism. Funding: Deutsche Forschungsgemeinschaft; Singapore-MIT Alliance in Research and Technology (SMART)

    Synergistic use of plant-prokaryote comparative genomics for functional annotations

    Abstract Background: Identifying functions for all gene products in all sequenced organisms is a central challenge of the post-genomic era. However, at least 30-50% of the proteins encoded by any given genome are of unknown or vaguely known function, and a large number are wrongly annotated. Many of these ‘unknown’ proteins are common to prokaryotes and plants. We set out to predict and experimentally test the functions of such proteins. Our approach to functional prediction integrates comparative genomics based mainly on microbial genomes with functional genomic data from model microorganisms and post-genomic data from plants. This approach bridges the gap between automated homology-based annotations and the classical gene discovery efforts of experimentalists, and is more powerful than purely computational approaches to identifying gene-function associations. Results: Among Arabidopsis genes, we focused on those (2,325 in total) that (i) are unique or belong to families with no more than three members, (ii) occur in prokaryotes, and (iii) have unknown or poorly known functions. Computer-assisted selection of promising targets for deeper analysis was based on homology-independent characteristics associated in the SEED database with the prokaryotic members of each family. In-depth comparative genomic analysis was performed for 360 top candidate families. From this pool, 78 families were connected to general areas of metabolism and, of these families, specific functional predictions were made for 41. Twenty-one predicted functions have been experimentally tested or are currently under investigation by our group in at least one prokaryotic organism (nine of them have been validated, four invalidated, and eight are in progress). Ten additional predictions have been independently validated by other groups. Discovering the function of very widespread but hitherto enigmatic proteins such as the YrdC or YgfZ families illustrates the power of our approach. Conclusions: Our approach correctly predicted functions for 19 uncharacterized protein families from plants and prokaryotes; none of these functions had previously been correctly predicted by computational methods. The resulting annotations could be propagated with confidence to over six thousand homologous proteins encoded in over 900 bacterial, archaeal, and eukaryotic genomes currently available in public databases.

    High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource

    The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed

    Mutations in PROSC Disrupt Cellular Pyridoxal Phosphate Homeostasis and Cause Vitamin B6-Dependent Epilepsy

    Pyridoxal 5'-phosphate (PLP), the active form of vitamin B6, functions as a cofactor in humans for more than 140 enzymes, many of which are involved in neurotransmitter synthesis and degradation. A deficiency of PLP can therefore present as seizures and other symptoms that are treatable with PLP and/or pyridoxine. Deficiency of PLP in the brain can be caused by inborn errors affecting B6 vitamer metabolism, or by inactivation of PLP by compounds that accumulate as a result of inborn errors of other pathways or by ingested small molecules. Whole exome sequencing of 2 children from a consanguineous family with pyridoxine-dependent epilepsy revealed a homozygous nonsense mutation in proline synthetase co-transcribed homolog (bacterial) (PROSC), a PLP-binding protein of hitherto unknown function. Subsequent sequencing of 29 unrelated individuals with pyridoxine-responsive epilepsy identified 4 additional children with biallelic PROSC mutations. Pretreatment cerebrospinal fluid samples showed low PLP concentrations and evidence of reduced activity of PLP-dependent enzymes. However, cultured fibroblasts showed excessive PLP accumulation. An E. coli mutant lacking the PROSC homologue (ΔYggS) is pyridoxine-sensitive; complementation with human PROSC restored growth, whilst hPROSC bearing p.Leu175Pro, p.Arg241Gln and p.Ser78Ter did not. PLP, a highly reactive aldehyde, poses a problem for cells: how to supply enough PLP for apoenzymes while maintaining free PLP concentrations low enough to avoid unwanted reactions with other important cellular nucleophiles. Whilst the mechanism involved is not fully understood, our studies suggest that PROSC is involved in intracellular homeostatic regulation of PLP, supplying this cofactor to apoenzymes while minimizing any toxic side reactions

    7-Deazaguanine modifications protect phage DNA from host restriction systems

    Genome modifications are central components of the continuous arms race between viruses and their hosts. The archaeosine base (G+), which was thought to be found only in archaeal tRNAs, was recently detected in genomic DNA of Enterobacteria phage 9g and was proposed to protect phage DNA from a wide variety of restriction enzymes. In this study, we identify three additional 2′-deoxy-7-deazaguanine modifications, which are all intermediates of the same pathway, in viruses: 2′-deoxy-7-amido-7-deazaguanine (dADG), 2′-deoxy-7-cyano-7-deazaguanine (dPreQ0) and 2′-deoxy-7-aminomethyl-7-deazaguanine (dPreQ1). We identify 180 phages or archaeal viruses that encode at least one of the enzymes of this pathway with an overrepresentation (60%) of viruses potentially infecting pathogenic microbial hosts. Genetic studies with the Escherichia phage CAjan show that DpdA is essential to insert the 7-deazaguanine base in phage genomic DNA and that 2′-deoxy-7-deazaguanine modifications protect phage DNA from host restriction enzymes

    Mitochondrial and plastidial COG0354 proteins have folate-dependent functions in iron–sulphur cluster metabolism

    COG0354 proteins have been implicated in synthesis or repair of iron/sulfur (Fe/S) clusters in all domains of life, and those of bacteria, animals, and protists have been shown to require a tetrahydrofolate to function. Two COG0354 proteins were identified in Arabidopsis and many other plants, one (At4g12130) related to those of α-proteobacteria and predicted to be mitochondrial, the other (At1g60990) related to those of cyanobacteria and predicted to be plastidial. Grasses and poplar appear to lack the latter. The predicted subcellular locations of the Arabidopsis proteins were validated by in vitro import assays with purified pea organelles and by targeting assays in Arabidopsis and tobacco protoplasts using green fluorescent protein fusions. The At4g12130 protein was shown to be expressed mainly in flowers, siliques, and seeds, whereas the At1g60990 protein was expressed mainly in young leaves. The folate dependence of both Arabidopsis proteins was established by functional complementation of an Escherichia coli COG0354 (ygfZ) deletant; both plant genes restored in vivo activity of the Fe/S enzyme MiaB but restoration was abrogated when folates were eliminated by deleting folP. Insertional inactivation of At4g12130 was embryo lethal; this phenotype was reversed by genetic complementation of the mutant. These data establish that COG0354 proteins have a folate-dependent function in mitochondria and plastids, and that the mitochondrial protein is essential. That plants retain mitochondrial and plastidial COG0354 proteins with distinct phylogenetic origins emphasizes how deeply the extant Fe/S cluster assembly machinery still reflects the ancient endosymbioses that gave rise to plants

    GOBLET: the Global Organisation for Bioinformatics Learning, Education and Training

    In recent years, high-throughput technologies have brought big data to the life sciences. The march of progress has been rapid, leaving in its wake a demand for courses in data analysis, data stewardship, computing fundamentals, etc., a need that universities have not yet been able to satisfy--paradoxically, many are actually closing "niche" bioinformatics courses at a time of critical need. The impact of this is being felt across continents, as many students and early-stage researchers are being left without appropriate skills to manage, analyse, and interpret their data with confidence. This situation has galvanised a group of scientists to address the problems on an international scale. For the first time, bioinformatics educators and trainers across the globe have come together to address common needs, rising above institutional and international boundaries to cooperate in sharing bioinformatics training expertise, experience, and resources, aiming to put ad hoc training practices on a more professional footing for the benefit of all

    Gcn4 misregulation reveals a direct role for the evolutionary conserved EKC/KEOPS in the t6A modification of tRNAs

    The EKC/KEOPS complex is universally conserved in Archaea and Eukarya and has been implicated in several cellular processes, including transcription, telomere homeostasis and genomic instability. However, the molecular function of the complex has remained elusive so far. We analyzed the transcriptome of EKC/KEOPS mutants and observed a specific profile that is highly enriched in targets of the Gcn4p transcriptional activator. GCN4 expression was found to be activated at the translational level in mutants via the defective recognition of the inhibitory upstream ORFs (uORFs) present in its leader. We show that EKC/KEOPS mutants are defective for the N6-threonylcarbamoyl adenosine modification at position 37 (t6A37) of tRNAs decoding ANN codons, which affects initiation at the inhibitory uORFs and provokes Gcn4 de-repression. Structural modeling reveals similarities between Kae1 and bacterial enzymes involved in carbamoylation reactions analogous to t6A37 formation, supporting a direct role for the EKC in tRNA modification. These findings are further supported by strong genetic interactions of EKC mutants with a translation initiation factor and with threonine biosynthesis genes. Overall, our data provide a novel twist to understanding the primary function of the EKC/KEOPS and its impact on several essential cellular functions like transcription and telomere homeostasis