21 research outputs found

    Dissecting human lymphoma using an integrated network analysis

    Lymphomas are cancers that originate in the lymphatic system. Despite their heterogeneity, lymphomas are characterized by the deregulation of the p53-signaling pathway. This study describes a systems-wide network analysis of the key malfunctioning p53-centered network of interactions involved in lymphoma pathogenesis through integration of transcriptomic and proteomic data

    Deciphering the role of autophagy and exosomes in human lymphoma

    The orchestrated homeostatic action of autophagy and exosome biogenesis provides a potential causal relation with lymphomagenesis. This study aims to characterize the differentially expressed proteins associated with both pathways, as well as to explore their potential implication in the therapeutic response controlling lymphomagenesis via N3a-induced re-activated p53 in different lymphoma subtypes

    Bioinformatics analysis of proteomics data and targeting signals in E.coli

    Cell compartmentalization serves both the isolation and the specialization of cell functions. All cells have evolved specialized folding/targeting and secretion machineries to deliver proteins to the specific locations where they perform their functions. Bacteria have evolved at least 16 currently known secretion systems. The Sec pathway is the main universal system through which the vast majority of secreted proteins are translocated. After synthesis in the cytoplasm, over a third of all proteins are targeted to other sub-cellular compartments. The complete elucidation of protein localization in sub-cellular compartments is a cornerstone for further study of any cell. The ever-increasing need for proteomics analyses requires well-described reference proteomes. Moreover, the availability of purer protein datasets can serve the development of more reliable bioinformatic tools. This thesis first describes a multicombinatorial process that led to the sub-cellular organization of the complete E. coli proteome into 13 sub-cellular locations. A necessary step towards this annotation was exhaustive manual curation. Then, using the complete sub-cellular annotation as a starting point: 1) novel targeting signals were identified and characterized in Sec secretory proteins and bioinformatic predictors were developed, 2) computational methods were developed for the optimization of experimental approaches through the investigation of the physicochemical properties of proteins, and 3) the E. coli peripheral inner membrane proteins (PIM proteins) were organized into protein-protein interaction networks and cellular functions.
    The existence of distinct regions in the cell serves both the isolation and the specialization of the different cellular functions. For this reason, cells have developed different mechanisms for secreting proteins and guiding them to the sites where they will carry out their functions. In bacteria, at least 16 such mechanisms have been identified, of which the Sec mechanism is common and essential for cell viability. It is estimated that about one third of proteins, after their synthesis in the cytoplasm, are directed to other subcellular compartments or to the extracellular space. Knowledge of how proteins are distributed within the cell, and of the interactions among them, is the first step toward understanding the cell as a whole. It is also necessary groundwork for any large-scale experimental study (e.g. proteomic analyses) with the future goal of in silico modelling and understanding of cells, and for the development of reliable bioinformatic tools. This thesis describes the combinatorial analysis followed for the thorough classification of the E. coli proteome into 13 subcellular compartments, which was based mainly on intensive literature research. Then, using the extensive subcellular classification of E. coli proteins as a springboard and main tool: 1) new bioinformatic tools were developed to reveal and characterize previously unknown secretion/targeting signals, 2) methods were developed to improve experimental methodologies by investigating the physicochemical properties of proteins, and 3) a first mapping was completed of the protein interactions and cellular functions in which the peripheral proteins participate

    Proteome-wide subcellular topologies of E.coli polypeptides database (STEPdb)

    Cell compartmentalization serves both the isolation and the specialization of cell functions. After synthesis in the cytoplasm, over a third of all proteins are targeted to other subcellular compartments. Knowing how proteins are distributed within the cell and how they interact is a prerequisite for understanding it as a whole. Surface and secreted proteins are important pathogenicity determinants. Here we present the STEP database (STEPdb) that contains a comprehensive characterization of subcellular localization and topology of the complete proteome of Escherichia coli. Two widely used E. coli proteomes (K-12 and BL21) are presented organized into thirteen subcellular classes. STEPdb exploits the wealth of genetic, proteomic, biochemical, and functional information on protein localization, secretion, and targeting in E. coli, one of the best understood model organisms. Subcellular annotations were derived from a combination of bioinformatics prediction, proteomic, biochemical, functional, topological data and extensive literature re-examination that were refined through manual curation. Strong experimental support for the location of 1553 out of 4303 proteins was based on 426 articles and some experimental indications for another 526. Annotations were provided for another 320 proteins based on firm bioinformatic predictions. STEPdb is the first database that contains an extensive set of peripheral IM proteins (PIM proteins) and includes their graphical visualization into complexes, cellular functions, and interactions. It also summarizes all currently known protein export machineries of E. coli K-12 and pairs them, where available, with the secretory proteins that use them. It catalogs the Sec- and TAT-utilizing secretomes and summarizes their topological features such as signal peptides and transmembrane regions, transmembrane topologies and orientations. 
It also catalogs physicochemical and structural features that influence topology such as abundance, solubility, disorder, heat resistance, and structural domain families. Finally, STEPdb incorporates prediction tools for topology (TMHMM, SignalP, and Phobius) and disorder (IUPred) and implements BLAST2STEP, which performs protein homology searches against STEPdb.
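
The evidence tiers reported in this abstract can be tallied with a few lines of arithmetic. The sketch below assumes the three groups are disjoint, as the abstract's "another" wording suggests; the variable names are illustrative and are not part of STEPdb.

```python
# Counts quoted in the STEPdb abstract; the tier labels are paraphrased.
TOTAL_PROTEINS = 4303
support = {
    "strong experimental support": 1553,
    "some experimental indications": 526,
    "firm bioinformatic predictions": 320,
}

# Assumption: the tiers do not overlap, so they can simply be summed.
annotated = sum(support.values())
coverage_pct = 100 * annotated / TOTAL_PROTEINS

for tier, count in support.items():
    print(f"{tier}: {count} ({100 * count / TOTAL_PROTEINS:.1f}%)")
print(f"all three tiers: {annotated} of {TOTAL_PROTEINS} ({coverage_pct:.1f}%)")
```

Under that assumption, the three tiers together cover a little over half of the 4303 proteins.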

    MatureP: prediction of secreted proteins with exclusive information from their mature regions

    More than a third of the cellular proteome is non-cytoplasmic. Most secretory proteins use the Sec system for export and are targeted to membranes using signal peptides and mature domains. To specifically analyze bacterial mature domain features, we developed MatureP, a classifier that predicts secretory sequences through features exclusively computed from their mature domains. MatureP was trained using Just Add Data Bio, an automated machine learning tool. Mature domains are predicted efficiently with ~92% success, as measured by the Area Under the Receiver Operating Characteristic Curve (AUC). Predictions were validated using experimental datasets of mutated secretory proteins. The features selected by MatureP reveal prominent differences in amino acid content between secreted and cytoplasmic proteins. Amino-terminal mature domain sequences have enhanced disorder, more hydroxyl and polar residues and fewer hydrophobic residues. Cytoplasmic proteins have prominent amino-terminal hydrophobic stretches and charged regions downstream. Presumably, secretory mature domains comprise a distinct protein class. They balance properties that promote the flexibility required for the maintenance of non-folded states during targeting and secretion with the ability to fold after secretion. These findings provide novel insight into protein trafficking, sorting and folding mechanisms and may benefit protein secretion biotechnology.
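
For readers unfamiliar with the AUC metric quoted above, the following minimal sketch computes it from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The labels and scores are made-up toy values, and `roc_auc` is an illustrative helper, not MatureP's or Just Add Data Bio's code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. secretory), 0 for the negative.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct pair
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a sense of scale for the ~92% figure.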

    Protein folding in the cell envelope of Escherichia coli

    While the entire proteome is synthesized on cytoplasmic ribosomes, almost half associates with, localizes in or crosses the bacterial cell envelope. In Escherichia coli a variety of mechanisms are important for taking these polypeptides into or across the plasma membrane, maintaining them in soluble form, trafficking them to their correct cell envelope locations and then folding them into the right structures. The fidelity of these processes must be maintained under various environmental conditions including during stress; if this fails, proteases are called in to degrade mislocalized or aggregated proteins. Various soluble, diffusible chaperones (acting as holdases, foldases or pilotins) and folding catalysts are also utilized to restore proteostasis. These responses can be general, dealing with multiple polypeptides, with functional overlaps and operating within redundant networks. Other chaperones are specialized factors, dealing only with a few exported proteins. Several complex machineries have evolved to deal with binding to, integration in and crossing of the outer membrane. This complex protein network is responsible for fundamental cellular processes such as cell wall biogenesis; cell division; the export, uptake and degradation of molecules; and resistance against exogenous toxic factors. The underlying processes, contributing to our fundamental understanding of proteostasis, are a treasure trove for the development of novel antibiotics, biopharmaceuticals and vaccines.

    The Escherichia coli Peripheral Inner Membrane Proteome

    Biological membranes are essential for cell viability. Their functional characteristics strongly depend on their protein content, which consists of transmembrane (integral) and peripherally associated membrane proteins. Both integral and peripheral inner membrane proteins mediate a plethora of biological processes. Whereas transmembrane proteins have characteristic hydrophobic stretches and can be predicted using bioinformatics approaches, peripheral inner membrane proteins are hydrophilic, exist in equilibria with soluble pools, and carry no discernible membrane targeting signals. We experimentally determined the cytoplasmic peripheral inner membrane proteome of the model organism Escherichia coli using a multidisciplinary approach. Initially, we extensively re-annotated the theoretical proteome regarding subcellular localization using literature searches, manual curation, and multi-combinatorial bioinformatics searches of the available databases. Next we used sequential biochemical fractionations coupled to direct identification of individual proteins and protein complexes using high resolution mass spectrometry. We determined that the proposed cytoplasmic peripheral inner membrane proteome occupies a previously unsuspected ∼19% of the basic E. coli BL21(DE3) proteome, and the detected peripheral inner membrane proteome occupies ∼25% of the estimated expressed proteome of this cell grown in LB medium to mid-log phase. This value might increase when fleeting interactions, not studied here, are taken into account. Several proteins previously regarded as exclusively cytoplasmic bind membranes avidly. Many of these proteins are organized in functional or/and structural oligomeric complexes that bind to the membrane with multiple interactions. Identified proteins cover the full spectrum of biological activities, and more than half of them are essential. 
Our data suggest that the cytoplasmic proteome displays remarkably dynamic and extensive communication with biological membrane surfaces that we are only beginning to decipher.

    Rapid label-free quantitative analysis of the E. coli BL21(DE3) inner membrane proteome

    Biological membranes define cells and cellular compartments and are essential in regulating bidirectional flow of chemicals and signals. Characterizing their protein content is therefore required to determine their function; nevertheless, the comprehensive determination of membrane-embedded sub-proteomes remains challenging. Here, we experimentally characterized the inner membrane proteome (IMP) of the model organism E. coli BL21(DE3). We took advantage of the recent extensive re-annotation of the theoretical E. coli IMP regarding the sub-cellular localization of all its proteins. Using surface proteolysis of IMVs with variable chemical treatments followed by nanoLC-MS/MS analysis, we experimentally identified ∼45% of the expressed IMP in wild type E. coli BL21(DE3) with 242 proteins reported here for the first time. Using modified label-free approaches we quantified 220 IM proteins. Finally, we compared protein levels between wild type cells and those over-synthesizing the membrane-embedded translocation channel SecYEG proteins. We propose that this proteomics pipeline will be generally applicable to the determination of IMP from other bacteria.

    Long-Lived Folding Intermediates Predominate the Targeting-Competent Secretome

    Secretory preproteins carry signal peptides fused amino-terminally to mature domains. They are post-translationally targeted to cross the plasma membrane in non-folded states with the help of translocases, and fold only at their final destinations. The mechanism of this process of postponed folding is unknown, but is generally attributed to signal peptides and chaperones. We herein demonstrate that, during targeting, most mature domains maintain loosely packed folding intermediates. These largely soluble states are signal peptide independent and essential for translocase recognition. These intermediates are promoted by mature domain features: residue composition, elevated disorder, and reduced hydrophobicity. Consequently, a mature domain folds slower than its cytoplasmic structural homolog. Some mature domains could not evolve stable, loose intermediates, and hence depend on signal peptides for slow folding to the detriment of solubility. These unique features of secretory proteins impact our understanding of protein trafficking, folding, and aggregation, and thus place them in a distinct class.

    Abstract 1122‐000089: Characterization of Critical Sequelae in Ischemic Stroke Using Natural Language Processing

    Introduction: Automated processing of electronic health data to classify complications of ischemic stroke serves numerous purposes, including improved electronic phenotyping for clinical research. Here, we present a natural language processing (NLP) approach to identify critical findings in acute ischemic stroke from unstructured radiology reports of computed tomography (CT) and magnetic resonance imaging (MRI). Methods: Text reports of CT and MRI scans taken from 2292 patients admitted for large (>1/2 middle cerebral artery territory), acute anterior circulation ischemic stroke were gathered from a single‐institution retrospective cohort. Reports were reviewed and labelled for the presence of hemorrhagic conversion, intracerebral edema, midline shift, intraventricular hemorrhage and parenchymal hematoma as defined by European Cooperative Acute Stroke Study PH1 and PH2 categories. For binary classifications, we quantified co‐occurrence of individual words within reports using two separate NLP methods: Bag‐of‐Words (BOW) and Term Frequency‐Inverse Document Frequency (TF‐IDF). We then trained Lasso regression, random forest, and neural network classifiers to predict all complications based on word co‐occurrence. Classifier performance was measured by area under receiver operating characteristic curves (AUC) using five separate folds of an internal test dataset. To predict midline shift as a continuous outcome, we developed a semantic rule‐based system (RBS) based on regular radiographic report expressions. This system was tested using an external validation dataset of 1472 acute large anterior circulation stroke reports from a separate hospital. Results: 2292 reports were fully labelled for the presence of all stroke complications. Lasso regression consistently displayed the best discrimination among all models. 
For BOW and TF‐IDF, Lasso yielded respective AUCs of 0.894 and 0.919 (hemorrhagic conversion), 0.935 and 0.950 (intracerebral edema), 0.968 and 0.963 (midline shift), 0.933 and 0.904 (intraventricular hemorrhage), and 0.873 and 0.879 (parenchymal hematoma). All models were well‐calibrated to underlying complication rates. The RBS also achieved strong performance in quantifying midline shift, achieving a mean absolute error (MAE) of 0.103 mm, sensitivity of 99.1% and specificity of 97.5% in the original cohort. In the external validation set of 1472 additional stroke reports, this same system achieved a MAE of 0.126 mm, sensitivity of 99.5% and specificity of 97.5% for midline shift. Wilcoxon rank sum testing on bootstrapped samples confirmed no statistically‐significant differences in RBS performance between institutions when comparing MAE (p = 0.918), sensitivity (p = 0.152), and specificity (p = 0.929). Conclusions: A machine learning pipeline based on Lasso regression successfully identified critical complications of large anterior circulation ischemic stroke from unstructured radiology reports, while our RBS quantified midline shift with a high degree of generalized accuracy between different institutions. We propose that these systems may warrant prospective validation in care settings and data mining for stroke research
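
The two text representations and the rule-based idea named in this abstract can be illustrated with a small, standard-library-only sketch. Everything here is an assumption made for illustration: the tokenizer, the unsmoothed TF-IDF variant, the regular expressions, and the example sentences are hypothetical and are not the authors' pipeline, and the classifier stage (Lasso, random forest, neural network) is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(docs):
    """Bag-of-Words: raw word counts per report."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    """TF-IDF: term frequency scaled by log inverse document frequency."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Number of reports in which each term appears at least once.
    doc_freq = Counter(term for c in counts for term in c)
    weighted = []
    for c in counts:
        total = sum(c.values())
        weighted.append(
            {t: (n / total) * math.log(n_docs / doc_freq[t]) for t, n in c.items()}
        )
    return weighted


# Hypothetical patterns for a rule-based midline-shift extractor.
NO_SHIFT = re.compile(r"\b(?:no|without)\s+(?:significant\s+)?midline shift", re.I)
SHIFT = re.compile(
    r"(\d+(?:\.\d+)?)\s*(mm|cm)\s+(?:of\s+)?(?:\w+\s+)?midline shift"
    r"|midline shift\s+(?:of\s+|measuring\s+)?(\d+(?:\.\d+)?)\s*(mm|cm)",
    re.I,
)


def midline_shift_mm(report):
    """Return the shift in mm, 0.0 if explicitly negated, None if not mentioned."""
    if NO_SHIFT.search(report):
        return 0.0
    m = SHIFT.search(report)
    if m is None:
        return None
    value, unit = (m.group(1), m.group(2)) if m.group(1) else (m.group(3), m.group(4))
    return float(value) * (10.0 if unit.lower() == "cm" else 1.0)


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and reference values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Made-up report sentences and reference measurements.
reports = [
    "There is 5 mm of rightward midline shift.",
    "Midline shift measuring 1.2 cm to the left.",
    "No midline shift.",
]
reference_mm = [5.0, 12.0, 0.0]
predicted_mm = [midline_shift_mm(r) for r in reports]
print(predicted_mm, mean_absolute_error(predicted_mm, reference_mm))
```

In this TF-IDF variant, a word that occurs in every report gets weight zero, so rarer and more discriminative terms dominate the features passed to a classifier.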