12 research outputs found

    Integrating transcriptomic and proteomic data for accurate assembly and annotation of genomes

    © 2017 Wong et al.; Published by Cold Spring Harbor Laboratory Press. Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted noncoding RNAs to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near-complete assembly and accurate annotation of genomes.

    Characterization of human pineal gland proteome

    We employed a high-resolution mass spectrometry-based approach to characterize the proteome of the human pineal gland.

    Comprehensive Proteomics Analysis of Glycosomes from Leishmania donovani

    Leishmania donovani is a kinetoplastid protozoan that causes kala-azar, or visceral leishmaniasis, a severe and fatal disease. L. donovani infects the human host when a phlebotomine sandfly takes a blood meal, and it then resides within the phagolysosome of infected macrophages. Previous studies on host–parasite interactions have not focused on Leishmania organelles and the role that they play in the survival of this parasite within macrophages. Leishmania possess glycosomes, which are unique and specialized subcellular microbody organelles. Glycosomes are known to harbor most peroxisomal enzymes and, in addition, nine glycolytic enzymes. In the present study, we carried out proteomic profiling using high-resolution mass spectrometry of a sucrose density gradient-enriched glycosomal fraction isolated from L. donovani promastigotes. This study resulted in the identification of 4022 unique peptides, leading to the identification of 1355 unique proteins from a preparation enriched in L. donovani glycosomes. Based on protein annotation, 566 (41.8%) were identified as hypothetical proteins with no known function. A majority of the identified proteins are involved in metabolic processes such as carbohydrate, lipid, and nucleic acid metabolism. Our present proteomic analysis is the most comprehensive study to date to map the proteome of L. donovani glycosomes.

    Brain proteomics of Anopheles gambiae

    Anopheles gambiae has a well-adapted system for host localization, feeding, and mating behavior, which are all governed by neuronal processes in the brain. However, there are no published reports characterizing the brain proteome to elucidate neuronal signaling mechanisms in the vector. To this end, a large-scale mapping of the brain proteome of An. gambiae was carried out using high-resolution tandem mass spectrometry, revealing a repertoire of >1800 proteins, of which 15% could not be assigned any function. A large proportion of the identified proteins were predicted to be involved in diverse biological processes including metabolism, transport, protein synthesis, and olfaction. This study also led to the identification of 10 GPCR classes of proteins, which could govern sensory pathways in mosquitoes. Proteins involved in metabolic and neural processes, chromatin modeling, and synaptic vesicle transport associated with neuronal transmission were predominantly expressed in the brain. Proteogenomic analysis expanded our findings with the identification of 15 novel genes and 71 cases of gene refinements, a subset of which were validated by RT-PCR and sequencing. Overall, our study offers valuable insights into the brain physiology of the vector that could possibly open avenues for intervention strategies for malaria in the future. © Copyright 2014, Mary Ann Liebert, Inc.

    Moving from unsequenced to sequenced genome: Reanalysis of the proteome of Leishmania donovani

    The kinetoplastid protozoan parasite, Leishmania donovani, is the causative agent of kala azar or visceral leishmaniasis. Kala azar is a severe form of leishmaniasis that is fatal in the majority of untreated cases. Proteomic studies of L. donovani thus far have relied on homology-based identification using related Leishmania species (L. infantum, L. major and L. braziliensis) whose genomes have been sequenced. Recently, the genome of L. donovani was fully sequenced and the data became publicly available. We took advantage of the availability of its genomic sequence to carry out a more accurate proteogenomic analysis of the L. donovani proteome using our previously generated dataset. This resulted in identification of 17,504 unique peptides upon database-dependent search against the annotated proteins in L. donovani. These peptides were assigned to 3999 unique proteins in L. donovani. 2296 proteins were identified in both life stages of L. donovani, while 613 and 1090 proteins were identified only from amastigote and promastigote stages, respectively. The proteomic data were also searched against the six-frame translated L. donovani genome, which led to 255 genome search-specific peptides (GSSPs) resulting in identification of 20 novel genes and correction of 40 existing gene models in L. donovani. BIOLOGICAL SIGNIFICANCE: Leishmania donovani genome sequencing was recently completed, which permitted us to use a proteogenomic approach to map its proteome and to carry out annotation of its genome. This resulted in mapping of 50% (3999 proteins) of the L. donovani proteome. Our study identified 20 novel genes previously not predicted from the L. donovani genome in addition to correcting annotations of 40 existing gene models. The identified proteins may help in better understanding of stage-specific protein expression profiles in L. donovani and in identifying novel stage-specific drug targets in L. donovani which could be used in the treatment of leishmaniasis.

    Downregulation of S100 calcium binding protein A9 in esophageal squamous cell carcinoma

    The development of esophageal squamous cell carcinoma (ESCC) is poorly understood and the major regulatory molecules involved in the process of tumorigenesis have not yet been identified. We had previously employed a quantitative proteomic approach to identify differentially expressed proteins in ESCC tumors. A total of 238 differentially expressed proteins were identified in that study including S100 calcium binding protein A9 (S100A9) as one of the major downregulated proteins. In the present study, we carried out immunohistochemical validation of S100A9 in a large cohort of ESCC patients to determine the expression and subcellular localization of S100A9 in tumors and adjacent normal esophageal epithelia. Downregulation of S100A9 was observed in 67% (n=192) of 288 different ESCC tumors, with the most dramatic downregulation observed in the poorly differentiated tumors (99/111). Expression of S100A9 was restricted to the prickle and functional layers of normal esophageal mucosa and localized predominantly in the cytoplasm and nucleus, whereas virtually no expression was observed in the tumor and stromal cells. This points to an important role for S100A9 in maintaining the differentiated state of the epithelium and suggests that its downregulation may be associated with increased susceptibility to tumor formation.

    Neglected tropical diseases and omics science: Proteogenomics analysis of the promastigote stage of Leishmania major parasite

    Among the neglected tropical diseases, leishmaniasis is one of the most devastating, resulting in significant mortality and contributing to nearly 2 million disability-adjusted life years. Cutaneous leishmaniasis is a debilitating disorder caused by the kinetoplastid protozoan parasite Leishmania major, which results in disfiguration and scars. The L. major genome was the first to be sequenced within the genus Leishmania. Use of proteomic data for annotating genomes is a complementary approach to conventional genome annotation approaches and is referred to as proteogenomics. We have used a proteogenomics-based approach to map the proteome of L. major and also annotate its genome. In this study, we searched L. major promastigote proteomic data against the annotated L. major protein database. Additionally, we searched the proteomic data against the six-frame translated L. major genome. In all, we identified 3613 proteins in L. major promastigotes, which covered 43% of its proteome. We also identified 26 genome search-specific peptides, which led to the identification of three novel genes previously not identified in L. major. We also corrected the annotation of N-termini of 15 genes, which resulted in extension of their protein products. We have validated our proteogenomics findings by RT-PCR and sequencing. In addition, our study resulted in identification of 266 N-terminally acetylated peptides in L. major, one of the largest acetylated peptide datasets thus far in Leishmania. This dataset should be a valuable resource to researchers focusing on neglected tropical diseases.

    Dysregulation of splicing proteins in head and neck squamous cell carcinoma

    Signaling plays an important role in regulating all cellular pathways. Altered signaling is one of the hallmarks of cancers. Phosphoproteomics enables interrogation of kinase-mediated signaling pathways in biological systems. In cancers, this approach can be utilized to identify aberrantly activated pathways that potentially drive proliferation and tumorigenesis. To identify signaling alterations in head and neck squamous cell carcinoma (HNSCC), we carried out proteomic and phosphoproteomic analysis of HNSCC cell lines using a combination of tandem mass tag (TMT) labeling and titanium dioxide-based enrichment. We identified 4,920 phosphosites corresponding to 2,212 proteins in six HNSCC cell lines compared to a normal oral cell line. Our data indicated significant enrichment of proteins associated with splicing. We observed hyperphosphorylation of SRSF protein kinase 2 (SRPK2) and its downstream substrates in HNSCC cell lines. SRPK2 is a splicing kinase known to phosphorylate serine/arginine (SR)-rich domain proteins and regulate the splicing process in eukaryotes. Although genome-wide studies have reported the contribution of alternative splicing events of several genes in the progression of cancer, the involvement of splicing kinases in HNSCC is not known. In this study, we investigated the role of SRPK2 in HNSCC. Inhibition of SRPK2 resulted in a significant decrease in colony-forming and invasive ability in a panel of HNSCC cell lines. Our results indicate that phosphorylation of SRPK2 plays a crucial role in the regulation of the splicing process in HNSCC and that splicing kinases can be developed as a new class of therapeutic target in HNSCC.