
    Development and application of software and algorithms for network approaches to proteomics data analysis

    The cells making up all living organisms integrate external and internal signals to carry out the functions of life. Dysregulation of signaling can lead to a variety of grave diseases, including cancer [Slamon et al., 1987]. In order to understand signal transduction, one has to identify and characterize the main constituents of cellular signaling cascades. Proteins are involved in most cellular processes and form the major class of biomolecules responsible for signal transduction. Post-translational modifications (PTMs) of proteins can modulate their enzymatic activity and their protein-protein interactions (PPIs), which in turn can ultimately lead to changes in protein expression. Classical biochemistry has approached the study of proteins, PTMs and interactions from a reductionist view. The abundance, stability and localization of proteins were studied one protein at a time, following the one gene-one protein-one function paradigm [Beadle and Tatum, 1941]. Pathways were considered to be linear, where signals would be transmitted from a gene to proteins, eventually resulting in a specific phenotype. Establishing the crucial link between genotype and phenotype remains challenging despite great advances in omics technologies, such as liquid chromatography (LC)-mass spectrometry (MS), that allow for the system-wide interrogation of proteins. Systems and network biology [Barabási and Oltvai, 2004, Bensimon et al., 2012, Jørgensen and Locard-Paulet, 2012, Choudhary and Mann, 2010] aims to transform modern biology by utilizing omics technologies to understand and uncover the various complex networks that govern the cell. The first detected large-scale biological networks have been found to be highly structured and non-random [Albert and Barabási, 2002]. Furthermore, these are assembled from functional and topological modules. The smallest topological modules are formed by the direct physical interactions within protein-protein and protein-RNA complexes.
These molecular machines are able to perform a diverse array of cellular functions, such as transcription and degradation [Alberts, 1998]. Members of functional modules are not required to have a direct physical interaction. Instead, such modules also include proteins with temporal co-regulation throughout the cell cycle [Olsen et al., 2010], or following the circadian day-night rhythm [Robles et al., 2014]. The signaling pathways that make up the cellular network [Jordan et al., 2000] are assembled from a hierarchy of these smaller modules [Barabási and Oltvai, 2004]. The regulation of these modules through dynamic rewiring enables the cell to respond to internal and external stimuli. The main challenge in network biology is to develop techniques to probe the topology of various biological networks, to identify topological and functional modules, and to understand their assembly and dynamic rewiring. LC-MS has become a powerful experimental platform that addresses all these challenges directly [Bensimon et al., 2012], and has long been used to study a wide range of biomolecules that participate in the cellular network. The field of proteomics in particular, which is concerned with the identification and characterization of the proteins in the cell, has been revolutionized by recent technological advances in MS. Proteomics experiments are used not only to quantify peptides and proteins, but also to uncover the edges of the cellular network, by screening for physical PPIs in a global [Hein et al., 2015] or condition-specific manner [Kloet et al., 2016]. Crucial for the interpretation of the large-scale data generated by MS experiments is the development of software tools that aid researchers in translating raw measurements into biological insights. The MaxQuant and Perseus platforms were designed for this exact purpose.
The aim of this thesis was to develop software tools for the analysis of MS-based proteomics data with a focus on network biology and to apply the developed tools to study cellular signaling. The first step was the extension of the Perseus software with network data structures and activities. The new network module allows for the side-by-side analysis of matrices and networks inside an interactive workflow and is described in article 1. We subsequently applied the newly developed software to study the circadian phosphoproteome of cortical synapses (see article 2). In parallel, we aimed to improve the analysis of large datasets by adapting the previously Windows-only MaxQuant software to the Linux operating system, which is more prevalent in high-performance computing environments (see article 3).

    Exploring protein phosphorylation dynamics by mass spectrometry: pushing the boundaries of time resolved phosphoproteomics

    Here we used mass spectrometry to study protein phosphorylation across different experimental models, from single embryos to patient-derived tissue samples. We combined different data acquisition strategies, from shotgun proteomics to get a global view of the phosphoproteome, to targeted measurement of specific phosphorylation sites of known biological relevance. Despite these differences, all the experimental chapters of this thesis share one feature: the measurement of protein phosphorylation levels over time. First, in chapter I we introduced basic concepts of mass spectrometry and how this technology prompted the rise of phosphoproteomics as a field of study. In chapter II, we focused on studying signal transduction in the MAPK-AKT-mTOR pathway by targeted phosphoproteomics. Using selected reaction monitoring, we mapped the dynamic behavior of phosphorylation sites with known functionality after treating cells with different stimuli. We think this assay could be a valuable resource for those interested in reproducibly measuring phosphorylation dynamics of this protein network. In chapter III, we assessed the differences in specificity between the mitogen-activated protein kinases ERK1 and ERK2, which could potentially explain why absence of ERK2 has been linked to more severe phenotypes in a variety of experimental models. In our most comprehensive chapter (IV), we explored changes in the phosphoproteome during the early cell divisions of single X. laevis embryos, which showed our ability to distinguish cell-cycle-related phosphorylation in vivo. Next, our bioinformatic analysis of cell cycle phosphorylated substrates revealed enrichment of phosphorylation sites in highly disordered proteins. Using a model disordered protein (Ki-67), we showed that CDK1-driven phosphorylation modulates protein phase separation.
Altogether, this allowed us to formulate a working hypothesis about how phosphorylation regulates protein condensation during cell cycle transitions, potentially explaining the drastic cellular reorganization observed in mitosis. In the last experimental chapter (V), we combined mass spectrometry technologies to assess changes in protein expression and protein phosphorylation in human intestinal tissue samples exposed to ischemia and reperfusion, in an attempt to understand the molecular mechanisms driving the pathophysiology of this clinical event. Finally, in chapter VI we share our conclusions and outlook for the field of phosphoproteomics and the study of protein phosphorylation.

    Homology-driven assembly of NOn-redundant protEin sequence sets (NOmESS) for mass spectrometry

    To enable mass spectrometry (MS)-based proteomic studies with poorly characterized organisms, we developed a computational workflow for the homology-driven assembly of a non-redundant reference sequence dataset. In the automated pipeline, translated DNA sequences (e.g. ESTs, RNA deep-sequencing data) are aligned to those of a closely related and fully sequenced organism. Representative sequences are derived from each cluster and joined, resulting in a non-redundant reference set representing the maximal available amino acid sequence information for each protein. Here we applied NOmESS to assemble a reference database for the widely used model organism Xenopus laevis and demonstrated its use in proteomic applications.
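The clustering-and-joining idea in this abstract can be illustrated with a minimal Python sketch. This is not the NOmESS implementation: the `Hit` record, the `assemble_nonredundant` function, the ungapped-alignment assumption, the longest-fragment-first rule, and the use of `X` for uncovered positions are all hypothetical choices made only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record: a translated fragment aligned to a reference protein."""
    ref_id: str      # identifier of the homologous reference protein
    ref_start: int   # 0-based reference position where the fragment begins
    sequence: str    # amino acid sequence of the fragment (assumed ungapped)


def assemble_nonredundant(hits):
    """Cluster hits by reference protein and merge each cluster into one sequence."""
    clusters = defaultdict(list)
    for hit in hits:
        clusters[hit.ref_id].append(hit)

    assembled = {}
    for ref_id, members in clusters.items():
        residues = {}
        # Longest fragments first, so they take precedence where fragments overlap.
        for hit in sorted(members, key=lambda h: len(h.sequence), reverse=True):
            for offset, aa in enumerate(hit.sequence):
                residues.setdefault(hit.ref_start + offset, aa)
        first, last = min(residues), max(residues)
        # Reference positions inside the covered span with no fragment become 'X'.
        assembled[ref_id] = "".join(
            residues.get(pos, "X") for pos in range(first, last + 1)
        )
    return assembled


# Toy input: three fragments map to refA (two overlapping), one to refB.
hits = [
    Hit("refA", 0, "MKTAY"),
    Hit("refA", 3, "AYIAK"),
    Hit("refA", 10, "QRQ"),
    Hit("refB", 2, "GSHM"),
]
result = assemble_nonredundant(hits)
```

In this toy case the two overlapping `refA` fragments collapse into a single stretch and the third is joined across an uncovered gap, so each reference protein ends up with exactly one entry. A real pipeline would additionally have to handle gapped alignments, conflicting residues, and alignment score thresholds.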