    The Case for Proteomics and Phospho-Proteomics in Personalized Cancer Medicine

    The concept of personalized medicine has predominantly been pursued through genomic and transcriptomic technologies, leading to the identification of multiple mutations in a large variety of cancers. However, it has proven challenging to distinguish driver and passenger mutations and to deal with tumor heterogeneity and resistant clonal populations. More generally, these heterogeneous mutation patterns do not in themselves predict the tumor phenotype. Analysis of the expressed proteins in a tumor and their modification states reveals if and how these mutations are translated to the functional level. It is already known that proteomic changes, including posttranslational modifications, are crucial drivers of oncogenesis, but proteomics technology has only recently become comparable in depth and accuracy to RNAseq. These advances also allow the rapid and highly sensitive analysis of formalin-fixed and paraffin-embedded biobank tissues, on both the proteome and phosphoproteome levels. In this perspective, pioneering mass spectrometry-based proteomic studies are highlighted that pave the way toward clinical implementation. It is argued that proteomics and phosphoproteomics could provide the missing link to make omics analysis actionable in the clinic.

    PLoS One

    Predicting post-translational lysine acetylation using support vector machines

    Motivation: Lysine acetylation is a post-translational protein modification and a primary regulatory mechanism that controls many cell signaling processes. Lysine acetylation sites are recognized by acetyltransferases and deacetylases through sequence patterns (motifs). Recently, we used high-resolution mass spectrometry to identify 3600 lysine acetylation sites on 1750 human proteins, covering most of the previously annotated sites and providing the most comprehensive acetylome so far. This dataset should provide an excellent source to train support vector machines (SVMs), allowing high-accuracy in silico prediction of acetylated lysine residues.
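
    As a rough illustration of how sequence context around a lysine could be turned into input for a classifier such as an SVM, the sketch below extracts a fixed-width window around each lysine and one-hot encodes it. The window size (`flank`), the padding character and the function names are assumptions made for this example; the abstract does not describe the authors' actual feature encoding.

```python
# Illustrative sketch only: the encoding used by the authors is not given in
# the abstract. Window width, padding symbol and names are assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def lysine_windows(sequence, flank=6, pad="X"):
    """Return (1-based position, window) for every lysine (K) in `sequence`.

    Each window holds `flank` residues on either side of the lysine; positions
    beyond the sequence ends are filled with `pad`.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def one_hot(window):
    """Encode a window as a flat 0/1 vector, 20 entries per residue.

    Padding or non-standard residues become an all-zero block.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector
```

    Vectors of this kind, labeled as acetylated or not, are the sort of input a standard SVM implementation would accept for training.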

    MAPU 2.0: high-accuracy proteomes mapped to genomes

    The MAPU 2.0 database contains proteomes of organelles, tissues and cell types measured by mass spectrometry (MS)-based proteomics. In contrast to other databases, it is meant to contain a limited number of experiments and only those with very high-resolution and high-accuracy data. MAPU 2.0 displays the proteomes of organelles, tissues and body fluids or, conversely, displays the occurrence of proteins of interest in all these proteomes. The new release addresses MS-specific problems, including ambiguous peptide-to-protein assignments, and it provides insight into general functional features on the protein level, ranging from gene ontology classification to comprehensive SwissProt annotation. Moreover, the derived proteomic data are used to annotate the genomes using Distributed Annotation Service (DAS) via EnsEMBL services. MAPU 2.0 is a model for a database specifically designed for high-accuracy proteomics and a member of the ProteomeXchange Consortium. It is available online at http://www.mapuproteome.com

    PHOSIDA 2011: the posttranslational modification database

    The primary purpose of PHOSIDA (http://www.phosida.com) is to manage posttranslational modification sites of various species ranging from bacteria to human. Since its last report, PHOSIDA has grown significantly in size and evolved in scope. It comprises more than 80 000 phosphorylated, N-glycosylated or acetylated sites from nine different species. All sites are obtained from high-resolution mass spectrometric data using the same stringent quality criteria. One of the main distinguishing features of PHOSIDA is the provision of a wide range of analysis tools. PHOSIDA comprises three main components: the database environment, the prediction platform and the toolkit section. The database environment integrates and combines high-resolution proteomic data with multiple annotations. High-accuracy species-specific phosphorylation and acetylation site predictors, trained on the modification sites contained in PHOSIDA, allow the in silico determination of modified sites on any protein on the basis of the primary sequence. The toolkit section contains methods that search for sequence motif matches or identify de novo consensus sequences from large-scale data sets.
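
    A motif search of the kind mentioned for the toolkit section can be sketched with a regular expression. This is not PHOSIDA's implementation; the function name and the example motif `[ST]P` (a serine or threonine followed by proline) are assumptions chosen for illustration.

```python
import re


def find_motif_sites(sequence, motif):
    """Return (1-based start position, matched text) for every motif match.

    `motif` is a regular expression over one-letter amino acid codes. A
    lookahead is used so that overlapping matches are also reported.
    Hypothetical helper, not part of any database toolkit.
    """
    pattern = re.compile(f"(?=({motif}))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Example: scan a made-up peptide for the illustrative [ST]P motif.
sites = find_motif_sites("MSPTPAS", "[ST]P")
```

    Here `sites` lists each match with its position in the peptide, so the hits can be compared against experimentally observed modification sites.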

    Mol. Syst. Biol.

    We report a proteomic analysis of microdissected material from formalin-fixed and paraffin-embedded colorectal cancer, quantifying >7500 proteins between patient-matched normal mucosa, primary carcinoma, and nodal metastases. Expression levels of 1808 proteins changed significantly between normal and cancer tissues, a much larger fraction than that reported in transcript-based studies. Tumor cells exhibit extensive alterations in the cell-surface and nuclear proteomes. Functionally similar changes in the proteome were observed when comparing rapidly growing and differentiated CaCo-2 cells. In contrast, there was minimal proteomic remodeling between primary cancer and metastases, suggesting that no drastic proteome changes are necessary for the tumor to propagate in a different tissue context. Additionally, we introduce a new way to determine protein copy numbers per cell without protein standards. Copy numbers estimated in enterocytes and cancer cells are in good agreement with CaCo-2 and HeLa cells and with the literature data. Our proteomic data set furthermore allows mapping quantitative changes of functional protein classes, enabling novel insights into the biology of colon cancer.

    Identifying Human Kinase-Specific Protein Phosphorylation Sites by Integrating Heterogeneous Information from Various Sources

    Phosphorylation is an important type of protein post-translational modification. Identification of possible phosphorylation sites of a protein is important for understanding its functions. Unbiased screening for phosphorylation sites by in vitro or in vivo experiments is time-consuming and expensive; in silico prediction can provide functional candidates and help narrow down the experimental efforts. Most of the existing prediction algorithms take only the polypeptide sequence around the phosphorylation sites into consideration. However, protein phosphorylation is a very complex biological process in vivo. The polypeptide sequences around the potential sites are not sufficient to determine the phosphorylation status of those residues. In the current work, we integrated various data sources such as protein functional domains, protein subcellular location and protein-protein interactions, along with the polypeptide sequences, to predict protein phosphorylation sites. The heterogeneous information significantly boosted the prediction accuracy for some kinase families. To demonstrate a potential application of our method, we scanned a set of human proteins and predicted putative phosphorylation sites for the Cyclin-dependent kinase, Casein kinase 2, Glycogen synthase kinase 3, Mitogen-activated protein kinase, protein kinase A, and protein kinase C families (available at http://cmbi.bjmu.edu.cn/huphospho). The predicted phosphorylation sites can serve as candidates for further experimental validation. Our strategy may also be applicable for the in silico identification of other post-translational modification substrates.

    Phospho.ELM: a database of phosphorylation sites—update 2011

    The Phospho.ELM resource (http://phospho.elm.eu.org) is a relational database designed to store in vivo and in vitro phosphorylation data extracted from the scientific literature and phosphoproteomic analyses. The resource has been actively developed for more than 7 years and currently comprises 42 574 non-redundant serine, threonine and tyrosine phosphorylation sites. Several new features have been implemented, such as structural disorder/order and accessibility information and a conservation score. Additionally, the conservation of the phosphosites can now be visualized directly on the multiple sequence alignment used for the score calculation. Finally, special emphasis has been put on linking to external resources such as interaction networks and other databases.

    Unlimited multistability in multisite phosphorylation systems

    Reversible phosphorylation on serine, threonine and tyrosine is the most widely studied posttranslational modification of proteins (1, 2). The number of phosphorylated sites on a protein (n) shows a significant increase from prokaryotes, with n ≤ 7 sites, to eukaryotes, with examples having n ≥ 150 sites (3). Multisite phosphorylation has many roles (4, 5) and site conservation indicates that increasing numbers of sites cannot be due merely to promiscuous phosphorylation. A substrate with n sites has an exponential number (2^n) of phospho-forms and individual phospho-forms may have distinct biological effects (6, 7). The distribution of these phospho-forms and how this distribution is regulated have remained unknown. Here we show that, when kinase and phosphatase act in opposition on a multisite substrate, the system can exhibit distinct stable phospho-form distributions at steady state and that the maximum number of such distributions increases with n. Whereas some stable distributions are focused on a single phospho-form, others are more diffuse, giving the phospho-proteome the potential to behave as a fluid regulatory network able to encode information and flexibly respond to varying demands. Such plasticity may underlie complex information processing in eukaryotic cells (8) and suggests a functional advantage in having many sites. Our results follow from the unusual geometry of the steady-state phospho-form concentrations, which we show to constitute a rational algebraic curve, irrespective of n. We thereby reduce the complexity of calculating steady states from simulating 3 × 2^n differential equations to solving two algebraic equations, while treating parameters symbolically. We anticipate that these methods can be extended to systems with multiple substrates and multiple enzymes catalysing different modifications, as found in posttranslational modification 'codes' (9) such as the histone code (10, 11). Whereas simulations struggle with exponentially increasing molecular complexity, mathematical methods of the kind developed here can provide a new language in which to articulate the principles of cellular information processing (12).
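
    To make the combinatorial growth concrete, the short sketch below enumerates the 2^n phospho-forms of a substrate with n sites and evaluates the 3 × 2^n equation count quoted in the abstract. The bit-string representation and the function names are assumptions for illustration; the code does not reproduce the paper's algebraic method.

```python
from itertools import product


def phospho_forms(n):
    """List all 2**n phospho-forms of an n-site substrate as bit strings.

    '1' marks a phosphorylated site and '0' an unphosphorylated one
    (an illustrative encoding, not taken from the paper).
    """
    return ["".join(bits) for bits in product("01", repeat=n)]


def simulation_size(n):
    """Number of differential equations quoted in the abstract: 3 * 2**n."""
    return 3 * 2 ** n


# Growth of both counts for a few site numbers.
for n in (2, 7, 20):
    print(n, 2 ** n, simulation_size(n))
```

    Even at a modest number of sites the equation count becomes large, which is the motivation the abstract gives for replacing simulation with two algebraic equations.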