853 research outputs found

    Estimating the total number of phosphoproteins and phosphorylation sites in eukaryotic proteomes

    Background: Phosphorylation is the most frequent post-translational modification made to proteins and may regulate protein activity as either a molecular digital switch or a rheostat. Despite the cornucopia of high-throughput (HTP) phosphoproteomic data in the last decade, it remains unclear how many proteins are phosphorylated and how many phosphorylation sites (p-sites) can exist in total within a eukaryotic proteome. We present the first reliable estimates of the total number of phosphoproteins and p-sites for four eukaryotes (human, mouse, Arabidopsis, and yeast). Results: In all, 187 HTP phosphoproteomic datasets were filtered, compiled, and studied along with two low-throughput (LTP) compendia. Estimates of the number of phosphoproteins and p-sites were inferred by two methods: Capture-Recapture, and fitting the saturation curve of cumulative redundant vs. cumulative non-redundant phosphoproteins/p-sites. Estimates were also adjusted for different levels of noise within the individual datasets and other confounding factors. We estimate that in total, 13 000, 11 000, and 3000 phosphoproteins and 230 000, 156 000, and 40 000 p-sites exist in human, mouse, and yeast, respectively, whereas estimates for Arabidopsis were not as reliable. Conclusions: Most of the phosphoproteins have been discovered for human, mouse, and yeast, while the dataset for Arabidopsis is still far from complete. The datasets for p-sites are not as close to saturation as those for phosphoproteins. Integration of the LTP data suggests that current HTP phosphoproteomics appears to be capable of capturing 70% to 95% of total phosphoproteins, but only 40% to 60% of total p-sites
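As a rough illustration of the Capture-Recapture idea named in the abstract above, the sketch below treats two phosphoproteomic datasets as two independent "captures" and estimates the size of the full population from their overlap. The abstract does not say which estimator the authors used; the Chapman-corrected Lincoln-Petersen formula here is an assumption, and the function name and protein identifiers are hypothetical toy values, not data from the study.

```python
def chapman_estimate(first_capture: set, second_capture: set) -> float:
    """Chapman's bias-corrected Lincoln-Petersen estimate of total population size.

    Assumption: this specific estimator is illustrative only; the source
    abstract names "Capture-Recapture" without giving a formula.
    """
    n1 = len(first_capture)                   # items seen in the first dataset
    n2 = len(second_capture)                  # items seen in the second dataset
    m = len(first_capture & second_capture)   # items seen in both ("recaptured")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical toy identifiers standing in for phosphoproteins.
dataset_a = {"P1", "P2", "P3", "P4", "P5", "P6"}
dataset_b = {"P4", "P5", "P6", "P7", "P8"}

estimate = chapman_estimate(dataset_a, dataset_b)
print(f"Estimated total: {estimate:.1f}")
```

The smaller the overlap between two datasets relative to their sizes, the larger the estimated total, which matches the intuition that heavily overlapping datasets indicate a nearly saturated catalogue.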

    Development and application of software and algorithms for network approaches to proteomics data analysis

    The cells making up all living organisms integrate external and internal signals to carry out the functions of life. Dysregulation of signaling can lead to a variety of grave diseases, including cancer [Slamon et al., 1987]. In order to understand signal transduction, one has to identify and characterize the main constituents of cellular signaling cascades. Proteins are involved in most cellular processes and form the major class of biomolecules responsible for signal transduction. Post-translational modifications (PTMs) of proteins can modulate their enzymatic activity and their protein-protein interactions (PPIs) which in turn can ultimately lead to changes in protein expression. Classical biochemistry has approached the study of proteins, PTMs and interaction from a reductionist view. The abundance, stability and localization of proteins was studied one protein at a time, following the one gene-one protein-one function paradigm [Beadle and Tatum, 1941]. Pathways were considered to be linear, where signals would be transmitted from a gene to proteins, eventually resulting in a specific phenotype. Establishing the crucial link between genotype and phenotype remains challenging despite great advances in omics technologies, such as liquid chromatography (LC)-mass spectrometry (MS) that allow for the system-wide interrogation of proteins. Systems and network biology [Barabási and Oltvai, 2004, Bensimon et al., 2012, Jørgensen and Locard-Paulet, 2012, Choudhary and Mann, 2010] aims to transform modern biology by utilizing omics technologies to understand and uncover the various complex networks that govern the cell. The first detected large-scale biological networks have been found to be highly structured and non-random [Albert and Barabási, 2002]. Furthermore, these are assembled from functional and topological modules. The smallest topological modules are formed by the direct physical interactions within protein-protein and protein-RNA complexes. 
These molecular machines are able to perform a diverse array of cellular functions, such as transcription and degradation [Alberts, 1998]. Members of functional modules are not required to have a direct physical interaction. Instead, such modules also include proteins with temporal co-regulation throughout the cell cycle [Olsen et al., 2010], or following the circadian day-night rhythm [Robles et al., 2014]. The signaling pathways that make up the cellular network [Jordan et al., 2000] are assembled from a hierarchy of these smaller modules [Barabási and Oltvai, 2004]. The regulation of these modules through dynamic rewiring enables the cell to respond to internal and external stimuli. The main challenge in network biology is to develop techniques to probe the topology of various biological networks, to identify topological and functional modules, and to understand their assembly and dynamic rewiring. LC-MS has become a powerful experimental platform that addresses all these challenges directly [Bensimon et al., 2012], and has long been used to study a wide range of biomolecules that participate in the cellular network. The field of proteomics in particular, which is concerned with the identification and characterization of the proteins in the cell, has been revolutionized by recent technological advances in MS. Proteomics experiments are used not only to quantify peptides and proteins, but also to uncover the edges of the cellular network, by screening for physical PPIs in a global [Hein et al., 2015] or condition-specific manner [Kloet et al., 2016]. Crucial for the interpretation of the large-scale data generated by MS experiments is the development of software tools that aid researchers in translating raw measurements into biological insights. The MaxQuant and Perseus platforms were designed for this exact purpose.
The aim of this thesis was to develop software tools for the analysis of MS-based proteomics data with a focus on network biology and apply the developed tools to study cellular signaling. The first step was the extension of the Perseus software with network data structures and activities. The new network module allows for the side-by-side analysis of matrices and networks inside an interactive workflow and is described in article 1. We subsequently apply the newly developed software to study the circadian phosphoproteome of cortical synapses (see article 2). In parallel we aimed to improve the analysis of large datasets by adapting the previously Windows-only MaxQuant software to the Linux operating system, which is more prevalent in high-performance computing environments (see article 3)

    Evaluation of the relevance and impact of kinase dysfunction in neurological disorders through proteomics and phosphoproteomics bioinformatics

    Phosphorylation is an important post-translational modification that is involved in various biological processes and its dysregulation has in particular been linked to diseases of the central nervous system including neurological disorders. The present thesis characterizes alterations in the phosphoproteome and protein abundance associated with schizophrenia and Parkinson's disease, with the goal of uncovering the underlying disease mechanisms. To support this goal, I eventually created an automated analysis pipeline in R to streamline the analysis process of proteomics and phosphoproteomics data. Mass spectrometry (MS) technology is utilized to generate proteomics and phosphoproteomics data. Study I of the thesis demonstrates an automated R pipeline, PhosPiR, created to perform multi-level functional analyses of MS data after the identification and quantification of the raw spectral data. The pipeline does not require coding knowledge to run. It supports 18 different organisms, and provides analyses of MS intensity data from preprocessing, normalization and imputation, through to figure overviews, statistical analysis, enrichment analysis, PTM-SEA, kinase prediction and activity analysis, network analysis, hub analysis, annotation mining, and homolog alignment. The LRRK2-G2019S mutation, a frequent genetic cause of late onset Parkinson's disease, was investigated in Study II and III. One study investigated the mechanism of LRRK2-G2019S function in brain, and the other identified proteins with significantly altered overall translation patterns in sporadic and LRRK2-G2019S patient samples. Specifically, study II identified that LRRK2 is localized to the small 40S ribosomal subunit and that LRRK2 activity suppresses RNA translation, as validated in cell and animal models of Parkinson's disease and in patient cells. 
Study III utilized bio-orthogonal non-canonical amino acid tagging to label newly translated proteins in order to identify which proteins were affected by repressed translation in patient samples, using mass spectrometry analysis. The analysis revealed 33 and 30 nascent proteins with reduced synthesis in sporadic and LRRK2-G2019S Parkinson’s cases, respectively. The biological process "cytosolic signal recognition particle (SRP)-dependent co-translational protein targeting to membrane" was functionally significantly affected in both sporadic and LRRK2-G2019S Parkinson's, while "Tubulin/FTsz C-terminal domain superfamily network" was only significantly enriched in LRRK2-G2019S Parkinson’s cases. The findings were validated by targeted proteomics and immunoblotting. Study IV is conducted to investigate the role of JNK1 in schizophrenia. Wild type and Jnk1-/- mice were used to analyze the phosphorylation profile using LC-MS/MS analysis. 126 proteins associated with schizophrenia were identified to overlap with the significantly differentially phosphorylated proteins in Jnk1-/- mice brain. The NMDAR trafficking pathway was found to be highly enriched, and surface staining of NMDAR subunits in neurons showed that surface expression of both subunits in Jnk1-/- neurons was significantly decreased. Further behavioral tests conducted with MK801 treatment have associated the Jnk1-/- molecular and behavioral phenotype with schizophrenia and neuropsychiatric disease

    Pharmacological approaches to understanding protein kinase signaling networks

    Protein kinases play vital roles in controlling cell behavior, and an array of kinase inhibitors are used successfully for treatment of disease. Typical drug development pipelines involve biological studies to validate a protein kinase target, followed by the identification of small molecules that effectively inhibit this target in cells, animal models, and patients. However, it is clear that protein kinases operate within complex signaling networks. These networks increase the resilience of signaling pathways, which can render cells relatively insensitive to inhibition of a single kinase, and provide the potential for pathway rewiring, which can result in resistance to therapy. It is therefore vital to understand the properties of kinase signaling networks in health and disease so that we can design effective multi-targeted drugs or combinations of drugs. Here, we outline how pharmacological and chemo-genetic approaches can contribute to such knowledge, despite the known low selectivity of many kinase inhibitors. We discuss how detailed profiling of target engagement by kinase inhibitors can underpin these studies; how chemical probes can be used to uncover kinase-substrate relationships, and how these tools can be used to gain insight into the configuration and function of kinase signaling networks

    Visualization and exploration of next-generation proteomics data


    Phosphoproteomics

    The book provides an overview of state-of-the-art techniques for the purification, analysis and quantification of proteins in complex samples using different enrichment strategies

    Practical considerations for omics experiments in biomedical sciences

    Modern analytical techniques provide an unprecedented insight into biomedical samples, allowing an in-depth characterization of cells or body fluids, to the level of genes, transcripts, peptides, proteins, metabolites, or metallic ions. The fine-grained picture provided by such approaches holds the promise of a better understanding of complex pathologies, and consequently the personalization of diagnosis, prognosis and treatment procedures. In practice, however, technical limitations restrict the resolution of the acquired data, and thus of downstream biomedical inference. As a result, the study of complex diseases like leukemia and other types of cancer is impaired by the high heterogeneity of pathologies as well as patient profiles. In this review, we propose an introduction to the general approach of characterizing samples and inferring biomedical results. We highlight the main limitations of the technique with regard to complex and heterogeneous pathologies, and provide ways to overcome these by improving the ability of experiments to discriminate samples.

    Caloric restriction and the nutrient-sensing protein kinase TOR1 alter the pattern of protein phosphorylation in quiescent and non-quiescent cells of Saccharomyces cerevisiae

    The application of yeast as a model organism for studying eukaryotic pathways, notably mechanisms and processes of chronological aging, has been recognized for decades. In fact, several signalling pathways of longevity regulation are conserved across phyla; humans (and other mammals) have orthologs and homologs of yeast proteins integrated into these pathways. One such pathway is the TOR pathway that responds to nutrient levels, notably via TORC1 (a complex with protein kinase activity; contains TOR1 as a core protein). My thesis taps into both of those advantageous properties of Saccharomyces cerevisiae: its ease of culturing for chronological aging studies, and its well-annotated proteome. I study chronologically aging quiescent and non-quiescent cell populations, with or without caloric restriction, using wild-type or tor1 single gene deletion mutant strains. I use quantitative phosphoproteomics, by means of mass spectrometry, to assess the differences and similarities between different cell populations. Caloric restriction has previously been shown to extend the chronological lifespan of yeast and other organisms. Reduced TOR1 activity (such as via inhibitors or by gene deletion) has also been shown in the literature to extend yeast chronological lifespan. Quiescence, an ability of a nutrient-limited post-mitotic cell to re-enter the cell cycle when the nutrient supply is restored, is also a lifespan-extending process. Combining these factors, I compared the phosphoproteomes of quiescent and non-quiescent yeast cells limited or not limited in calorie supply and having or lacking the TOR1 protein. I found that both the diet and the state of quiescence have a significant effect on the phosphorylation of proteins. Moreover, I found that a single-gene-deletion mutation that eliminates the TOR1 protein has a significant impact on both the state of quiescence and the cell phosphoproteome