9 research outputs found

    Multiple mechanisms contribute to lateral transfer of an organophosphate degradation (opd) island in Sphingobium fuliginis ATCC 27551

    The complete sequence of pPDL2 (37,317 bp), an indigenous plasmid of Sphingobium fuliginis ATCC 27551 that encodes genes for organophosphate degradation (opd), revealed the existence of a site-specific integrase (int) gene with an attachment site attP, typically seen in Integrative Mobilizable Elements (IME). In agreement with this sequence information, site-specific recombination was observed between pPDL2 and an artificial plasmid having a temperature-sensitive replicon and a cloned attB site at the 3′ end of the seryl tRNA gene of Sphingobium japonicum. The opd gene cluster on pPDL2 was found to be part of an active catabolic transposon with mobile elements y4qE and Tn3 at its flanking ends. Besides the previously reported opd cluster, this transposon contains genes coding for protocatechuate dioxygenase and for two transport proteins from the major facilitator family that are predicted to be involved in transport and metabolism of aromatic compounds. A pPDL2 derivative, pPDL2-K, was horizontally transferred into Escherichia coli and Acinetobacter strains, suggesting that the oriT identified in pPDL2 is functional. A well-defined replicative origin (oriV) and repA were identified, along with a plasmid addiction module, relB/relE, that would support stable maintenance of pPDL2 in Sphingobium fuliginis ATCC 27551. However, if pPDL2 is laterally transferred into hosts that do not support its replication, the opd cluster appears to integrate into the host chromosome, either through transposition or through site-specific integration. The data presented in this study help to explain the existence of identical opd genes among soil bacteria.

    Integrating transcriptomic and proteomic data for accurate assembly and annotation of genomes

    © 2017 Wong et al.; Published by Cold Spring Harbor Laboratory Press. Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted noncoding RNAs to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes

    Characterization of human pineal gland proteome

    We employed a high-resolution mass spectrometry-based approach to characterize the proteome of the human pineal gland.

    Chromosome-centric Human Proteome Project: Deciphering Proteins Associated with Glioma and Neurodegenerative Disorders on Chromosome 12

    In line with the aims of the Chromosome-centric Human Proteome Project (C-HPP) to completely annotate proteins of each chromosome and biology/disease driven HPP (B/D-HPP) to decipher their relation to diseases, we have generated a nonredundant catalogue of protein-coding genes for Chromosome 12 (Chr. 12) and further annotated proteins associated with major neurological disorders. Integrating high level proteomic evidence from four major databases (neXtProt, Global Proteome Machine (GPMdb), PeptideAtlas, and Human Protein Atlas (HPA)) along with Ensembl data resource resulted in the identification of 1066 protein coding genes, of which 171 were defined as “missing proteins” based on the weak or complete absence of experimental evidence. With functional annotations using DAVID and GAD, about 40% of the proteins could be grouped as brain related with implications in cancer or neurological disorders. We used published and unpublished high confidence mass spectrometry data from our group and other literature consisting of more than 5000 proteins derived from clinical specimens from patients with human gliomas, Alzheimer’s disease, and Parkinson’s disease and mapped it onto Chr. 12. We observed a total of 202 proteins mapping to human Chr. 12, 136 of which were differentially expressed in these disease conditions as compared to the normal. Functional grouping indicated their association with cell cycle, cell-to-cell signaling, and other important processes and networks, whereas their disease association analysis confirmed neurological diseases and cancer as the major group along with psychological disorders, with several overexpressed genes/proteins mapping to 12q13-15 amplicon region. Using multiple strategies and bioinformatics tools, we identified 103 differentially expressed proteins with secretory potential, 17 of which have already been reported in direct analysis of the plasma or cerebrospinal fluid (CSF) from the patients and 21 of which mapped to the cancer associated protein (CAPs) database and are amenable to selective reaction monitoring (SRM) assays for targeted proteomic analysis. Our analysis also reveals, for the first time, mass spectrometric evidence for two “missing proteins” from Chr. 12, namely, synaptic vesicle 2-related protein (SVOP) and IQ motif containing D (IQCD). The analysis provides a snapshot of Chr. 12 encoded proteins associated with gliomas and major neurological conditions and their secretability which can be used to drive efforts for clinical applications.

    Annotation of the Zebrafish Genome through an Integrated Transcriptomic and Proteomic Analysis

    Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes

    Proteomic analysis of purified protein derivative of Mycobacterium tuberculosis

    BACKGROUND: Purified protein derivative (PPD) has been used for more than half a century as an antigen for the diagnosis of tuberculosis infection based on delayed type hypersensitivity. Although designated as “purified,” in reality, the composition of PPD is highly complex and remains ill-defined. In this report, high resolution mass spectrometry was applied to understand the complexity of its constituent components. A comparative proteomic analysis of various PPD preparations and their functional characterization is likely to help in short-listing the relevant antigens required to prepare a less complex and more potent reagent for diagnostic purposes. RESULTS: Proteomic analysis of Connaught Tuberculin 68 (PPD-CT68), a tuberculin preparation generated from M. tuberculosis, was carried out in this study. PPD-CT68 is the protein component of a commercially available tuberculin preparation, Tubersol, which is used for tuberculin skin testing. Using a high resolution LTQ-Orbitrap Velos mass spectrometer, we identified 265 different proteins. The identified proteins were compared with those identified from PPD M. bovis, PPD M. avium and PPD-S2 from previous mass spectrometry-based studies. In all, 142 proteins were found to be shared between PPD-CT68 and PPD-S2 preparations. Out of the 354 proteins from M. tuberculosis–derived PPDs (i.e. proteins in either PPD-CT68 or PPD-S2), 37 proteins were found to be shared with M. avium PPD and 80 were shared with M. bovis PPD. Alignment of PPD-CT68 proteins with proteins encoded by 24 lung infecting bacteria revealed a number of similar proteins (206 bacterial proteins shared epitopes with 47 PPD-CT68 proteins), which could potentially be involved in causing cross-reactivity. The data have been deposited to the ProteomeXchange with identifier PXD000377. CONCLUSIONS: Proteomic and bioinformatics analysis of different PPD preparations revealed commonly and differentially represented proteins. This information could help in delineating the relevant antigens represented in various PPDs, which could further lead to development of a less complex and better defined skin test antigen with a higher specificity and sensitivity.

    A draft map of the human proteome

    The availability of human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which includes translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease. © 2014 Macmillan Publishers Limited