9 research outputs found
Multiple mechanisms contribute to lateral transfer of an organophosphate degradation (opd) island in Sphingobium fuliginis ATCC 27551
The complete sequence of pPDL2 (37,317 bp), an indigenous plasmid of Sphingobium fuliginis ATCC 27551 that encodes genes for organophosphate degradation (opd), revealed the existence of a site-specific integrase (int) gene with an attachment site attP, typically seen in Integrative Mobilizable Elements (IME). In agreement with this sequence information, site-specific recombination was observed between pPDL2 and an artificial plasmid having a temperature-sensitive replicon and a cloned attB site at the 3′ end of the seryl tRNA gene of Sphingobium japonicum. The opd gene cluster on pPDL2 was found to be part of an active catabolic transposon with mobile elements y4qE and Tn3 at its flanking ends. Besides the previously reported opd cluster, this transposon contains genes coding for protocatechuate dioxygenase and for two transport proteins from the major facilitator family that are predicted to be involved in transport and metabolism of aromatic compounds. A pPDL2 derivative, pPDL2-K, was horizontally transferred into Escherichia coli and Acinetobacter strains, suggesting that the oriT identified in pPDL2 is functional. A well-defined replicative origin (oriV) and repA were identified, along with a plasmid addiction module, relB/relE, that would support stable maintenance of pPDL2 in Sphingobium fuliginis ATCC 27551. However, if pPDL2 is laterally transferred into hosts that do not support its replication, the opd cluster appears to integrate into the host chromosome, either through transposition or through site-specific integration. The data presented in this study help to explain the existence of identical opd genes among soil bacteria.
Integrating transcriptomic and proteomic data for accurate assembly and annotation of genomes
© 2017 Wong et al.; Published by Cold Spring Harbor Laboratory Press. Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted noncoding RNAs to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes
Surgery, Octreotide, Temozolomide, Bevacizumab, Radiotherapy, and Pegvisomant Treatment of an AIP Mutation-Positive Child
UK India Education Research Initiative and the British Council (75-2014; to P.D.); Council of Scientific and Industrial Research, University Grants Commission, for financial support (2061330632; to A.R.); Medical Research Council (MR/M018539/1; to M.K
Characterization of human pineal gland proteome
We employed a high-resolution mass spectrometry-based approach to characterize the proteome of the human pineal gland.
Chromosome-centric Human Proteome Project: Deciphering Proteins Associated with Glioma and Neurodegenerative Disorders on Chromosome 12
In line with the aims of the Chromosome-centric Human Proteome Project (C-HPP) to completely annotate proteins of each chromosome and biology/disease driven HPP (B/D-HPP) to decipher their relation to diseases, we have generated a nonredundant catalogue of protein-coding genes for Chromosome 12 (Chr. 12) and further annotated proteins associated with major neurological disorders. Integrating high level proteomic evidence from four major databases (neXtProt, Global Proteome Machine (GPMdb), PeptideAtlas, and Human Protein Atlas (HPA)) along with Ensembl data resource resulted in the identification of 1066 protein coding genes, of which 171 were defined as “missing proteins” based on the weak or complete absence of experimental evidence. With functional annotations using DAVID and GAD, about 40% of the proteins could be grouped as brain related with implications in cancer or neurological disorders. We used published and unpublished high confidence mass spectrometry data from our group and other literature consisting of more than 5000 proteins derived from clinical specimens from patients with human gliomas, Alzheimer’s disease, and Parkinson’s disease and mapped it onto Chr. 12. We observed a total of 202 proteins mapping to human Chr. 12, 136 of which were differentially expressed in these disease conditions as compared to the normal. Functional grouping indicated their association with cell cycle, cell-to-cell signaling, and other important processes and networks, whereas their disease association analysis confirmed neurological diseases and cancer as the major group along with psychological disorders, with several overexpressed genes/proteins mapping to the 12q13-15 amplicon region. Using multiple strategies and bioinformatics tools, we identified 103 differentially expressed proteins to have secretory potential, 17 of which have already been reported in direct analysis of the plasma or cerebrospinal fluid (CSF) from the patients and 21 of them mapped to cancer associated protein (CAPs) database that are amenable to selective reaction monitoring (SRM) assays for targeted proteomic analysis. Our analysis also reveals, for the first time, mass spectrometric evidence for two “missing proteins” from Chr. 12, namely, synaptic vesicle 2-related protein (SVOP) and IQ motif containing D (IQCD). The analysis provides a snapshot of Chr. 12 encoded proteins associated with gliomas and major neurological conditions and their secretability which can be used to drive efforts for clinical applications.
Annotation of the Zebrafish Genome through an Integrated Transcriptomic and Proteomic Analysis
Accurate annotation of protein-coding genes is one of the primary tasks upon the completion of whole genome sequencing of any organism. In this study, we used an integrated transcriptomic and proteomic strategy to validate and improve the existing zebrafish genome annotation. We undertook high-resolution mass-spectrometry-based proteomic profiling of 10 adult organs, whole adult fish body, and two developmental stages of zebrafish (SAT line), in addition to transcriptomic profiling of six organs. More than 7,000 proteins were identified from proteomic analyses, and ∼69,000 high-confidence transcripts were assembled from the RNA sequencing data. Approximately 15% of the transcripts mapped to intergenic regions, the majority of which are likely long non-coding RNAs. These high-quality transcriptomic and proteomic data were used to manually reannotate the zebrafish genome. We report the identification of 157 novel protein-coding genes. In addition, our data led to modification of existing gene structures including novel exons, changes in exon coordinates, changes in frame of translation, translation in annotated UTRs, and joining of genes. Finally, we discovered four instances of genome assembly errors that were supported by both proteomic and transcriptomic data. Our study shows how an integrative analysis of the transcriptome and the proteome can extend our understanding of even well-annotated genomes
Proteomic analysis of purified protein derivative of Mycobacterium tuberculosis
BACKGROUND: Purified protein derivative (PPD) has been used for more than half a century as an antigen for the diagnosis of tuberculosis infection based on delayed type hypersensitivity. Although designated as “purified,” in reality, the composition of PPD is highly complex and remains ill-defined. In this report, high resolution mass spectrometry was applied to understand the complexity of its constituent components. A comparative proteomic analysis of various PPD preparations and their functional characterization is likely to help in short-listing the relevant antigens required to prepare a less complex and more potent reagent for diagnostic purposes. RESULTS: Proteomic analysis of Connaught Tuberculin 68 (PPD-CT68), a tuberculin preparation generated from M. tuberculosis, was carried out in this study. PPD-CT68 is the protein component of a commercially available tuberculin preparation, Tubersol, which is used for tuberculin skin testing. Using a high resolution LTQ-Orbitrap Velos mass spectrometer, we identified 265 different proteins. The identified proteins were compared with those identified from PPD M. bovis, PPD M. avium and PPD-S2 from previous mass spectrometry-based studies. In all, 142 proteins were found to be shared between PPD-CT68 and PPD-S2 preparations. Out of the 354 proteins from M. tuberculosis–derived PPDs (i.e. proteins in either PPD-CT68 or PPD-S2), 37 proteins were found to be shared with M. avium PPD and 80 were shared with M. bovis PPD. Alignment of PPD-CT68 proteins with proteins encoded by 24 lung infecting bacteria revealed a number of similar proteins (206 bacterial proteins shared epitopes with 47 PPD-CT68 proteins), which could potentially be involved in causing cross-reactivity. The data have been deposited to the ProteomeXchange with identifier PXD000377. CONCLUSIONS: Proteomic and bioinformatics analysis of different PPD preparations revealed commonly and differentially represented proteins. This information could help in delineating the relevant antigens represented in various PPDs, which could further lead to development of a less complex and better defined skin test antigen with a higher specificity and sensitivity.
A draft map of the human proteome
The availability of human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in identification of proteins encoded by 17,294 genes accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, which includes translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease. © 2014 Macmillan Publishers Limited