480 research outputs found

    The BET protein FSH functionally interacts with ASH1 to orchestrate global gene activity in Drosophila

    BACKGROUND: The question of how cells re-establish gene expression states after cell division is still poorly understood. Genetic and molecular analyses have indicated that Trithorax group (TrxG) proteins are critical for the long-term maintenance of active gene expression states in many organisms. A generally accepted model suggests that TrxG proteins contribute to maintenance of transcription by protecting genes from inappropriate Polycomb group (PcG)-mediated silencing, instead of directly promoting transcription. RESULTS AND DISCUSSION: Here we report a physical and functional interaction in Drosophila between two members of the TrxG, the histone methyltransferase ASH1 and the bromodomain and extraterminal family protein FSH. We investigated this interface at the genome level, uncovering a widespread co-localization of both proteins at promoters and PcG-bound intergenic elements. Our integrative analysis of chromatin maps and gene expression profiles revealed that the observed ASH1-FSH binding pattern at promoters is a hallmark of active genes. Inhibition of FSH-binding to chromatin resulted in global down-regulation of transcription. In addition, we found that genes displaying marks of robust PcG-mediated repression also have ASH1 and FSH bound to their promoters. CONCLUSIONS: Our data strongly favor a global coactivator function of ASH1 and FSH during transcription, as opposed to the notion that TrxG proteins impede inappropriate PcG-mediated silencing, but are dispensable elsewhere. Instead, our results suggest that PcG repression needs to overcome the transcription-promoting function of ASH1 and FSH in order to silence genes

    Unravelling the proteome of chromatin bound RNA polymerase II using Proteome-ChIP in murine stem cells

    Regulation of gene expression is critical to govern distinct transcriptional programs for a cell type, lineage specification and developmental stage. Transcription is the first step in gene expression, wherein RNA Polymerase II (RNAPII) transcribes protein-coding genes. Transcription is a highly coordinated process that involves a range of chromatin interactions including transcription machinery, chromatin remodellers and co-transcriptional RNA processing. Embryonic stem (ES) cells are pluripotent, self-renewing cells that can differentiate to give rise to all lineages, making them an invaluable tool for studying early development and for therapy. Genome-wide analysis in murine ES (mES) cells has identified 30% of known genes harbouring bivalent chromatin modifications along with repressive Polycomb complexes and a novel variant of RNAPII (modified as S5p+S7p-S2p-) with mechanistic implications in stem cell pluripotency, differentiation potential and lineage specification. To explore chromatin composition associated with different variants of RNAPII, I developed an unbiased method, ‘Proteome-ChIP’ (pChIP), wherein crosslinked chromatin is purified by immunoprecipitation followed by protein extraction and identification by mass spectrometry. Using an unbiased comprehensive experimental strategy and a novel systems biology approach, I qualitatively and quantitatively dissect the proteome composition and dependencies on RNAPII modifications during different stages of the transcription cycle. The work done in this thesis provides an invaluable resource of RNAPII chromatin interactions. We identify known and novel components of the co-transcriptional machinery, chromatin remodelling and RNA processing machinery. The work also uncovers novel processes associated with unusual RNAPII (S5p+S7p-S2p-) including DNA replication, Polycomb proteins and chromatin remodellers; many of these processes are critical for stem cell viability and regulation.
Extending the RNAPII-pChIP analysis to low-complexity samples by Native-pChIP and Gradient-pChIP highlights the versatility and robustness of our method. The work described in this thesis sheds light on regulatory chromatin processes specific to mES cells, which informs our understanding of stem cell biology and reprogramming.

    Silence as a way of niche adaptation: mecC-MRSA with variations in the accessory gene regulator (agr) functionality express kaleidoscopic phenotypes

    Functionality of the accessory gene regulator (agr) quorum sensing system is an important factor promoting either acute or chronic infections by the notorious opportunistic human and veterinary pathogen Staphylococcus aureus. Spontaneous alterations of the agr system are known to frequently occur in human healthcare-associated S. aureus lineages. However, data on agr integrity and function are sparse regarding other major clonal lineages. Here we report on the agr system functionality and activity level in mecC-carrying methicillin resistant S. aureus (MRSA) of various animal origins (n = 33) obtained in Europe as well as in closely related human isolates (n = 12). Whole genome analysis assigned all isolates to four clonal complexes (CC) with distinct agr types (CC599 agr I, CC49 agr II, CC130 agr III and CC1943 agr IV). Agr functionality was assessed by a combination of phenotypic assays and proteome analysis. In each CC, isolates with varying agr activity levels were detected, including the presence of completely non-functional variants. Genomic comparison of the agr I–IV encoding regions associated these phenotypic differences with variations in the agrA and agrC genes. The genomic changes were detected independently in divergent lineages, suggesting that agr variation might foster viability and adaptation of emerging MRSA lineages to distinct ecological niches.

    Global analyses of protein complex localization, oligomerization, composition, and dynamics using quantitative mass spectrometry

    Proteins perform essential processes in plant cells such as photosynthesis, protein translation, maintenance of metabolic flux, and signal transduction. Many of these functions could never be achieved using individual proteins. Consequently, many proteins oligomerize to form complexes that act as tunable molecular machines that perform work and transmit information. Cells contain thousands of protein complexes, yet the composition of the vast majority remains unknown. Previous high-throughput approaches to identify protein interactions have relied on binary interactions or tagging individual proteins for purification. A new set of label-free correlation profiling methods was developed in which native proteins were extracted from Arabidopsis leaves and separated by liquid chromatography, and mass spectrometry was used to quantify the elution profiles of thousands of proteins in a single experiment. One new method expanded protein correlation profiling to membrane-associated proteins, which are normally discarded because of their insolubility. Hundreds of novel membrane-associated complexes were predicted based on their high apparent mass compared to the monomeric form. Dozens were dual-localized proteins that partitioned between the cytosol and cell membranes; a small subset had oligomerization states that clearly differed as a function of localization. Protein correlation profiling was then expanded to include an orthogonal separation of native complexes by charge. This added resolving power allowed us to predict protein complex composition. By creating a reproducible data mining and analysis pipeline, over 200 cytosolic and 120 chloroplast complexes were predicted. Validation included the accurate prediction of known protein complex compositions and of novel subunits of known complexes. Using reverse genetics we discovered a new protein complex subunit, AIM1PL, which appears to broadly affect protein complex assemblies that are involved in translation.
In the last chapter the SEC-based protein correlation profiling method was used to broadly analyze how the proteome responds to a stress condition. Hundreds of interesting examples were discovered in which soluble and membrane-associated proteins are predicted to change in abundance, localization, and/or oligomerization state in response to metabolic stress. These analyses uncovered interesting biology that likely underlies the post-translational control of gene expression and the mechanisms by which diverse cellular activities are integrated during plant growth and development. Collectively, a suite of new methods was created that enables high-throughput and broad analyses of protein complexes in the cell and of how the proteome responds to metabolic stress. These results are being used to generate testable hypotheses about how cellular systems respond to metabolic stress. These technologies have broad application to agriculture because they can be applied to any plant species with a well-annotated genome.

    Expression data analysis and regulatory network inference by means of correlation patterns

    With the advance of high-throughput techniques, the amount of available data in the bio-molecular field is rapidly growing. It is now possible to measure genome-wide aspects of an entire biological system as a whole. Correlations that emerge due to internal dependency structures of these systems entail the formation of characteristic patterns in the corresponding data. The extraction of these patterns has become an integral part of computational biology. By triggering perturbations and interventions it is possible to induce an alteration of patterns, which may help to derive the dependency structures present in the system. In particular, differential expression experiments may yield alternate patterns that we can use to approximate the actual interplay of regulatory proteins and genetic elements, namely, the regulatory network of a cell. In this work, we examine the detection of correlation patterns from bio-molecular data and we evaluate their applicability in terms of protein contact prediction, experimental artifact removal, the discovery of unexpected expression patterns and genome-scale inference of regulatory networks. Correlation patterns are not limited to expression data. Their analysis in the context of conserved interfaces among proteins is useful to estimate whether these may have co-evolved. Patterns that hint at correlated mutations would then occur in the associated protein sequences as well. We employ a conceptually simple sampling strategy to decide whether or not two pathway elements share a conserved interface and are thus likely to be in physical contact. We successfully apply our method to a system of ABC-transporters and two-component systems from the bacterial phylum Firmicutes. For spatially resolved gene expression data like microarrays, the detection of artifacts, as opposed to noise, corresponds to the extraction of localized patterns that resemble outliers in a given region.
We develop a method to detect and remove such artifacts using a sliding-window approach. Our method is very accurate and is shown to adapt to other platforms, such as custom arrays, as well. Further, we developed Padesco as a way to reveal unexpected expression patterns. We extract frequent and recurring patterns that are conserved across many experiments. For a specific experiment, we predict whether a gene deviates from its expected behaviour. We show that Padesco is an effective approach for selecting promising candidates from differential expression experiments. In Chapter 5, we then focus on the inference of genome-scale regulatory networks from expression data. Here, correlation patterns have proven useful for the data-driven estimation of regulatory interactions. We show that, for reliable eukaryotic network inference, the integration of prior networks is essential. We reveal that this integration leads to an overestimation of network-wide quality measures and suggest a corrective procedure, CoRe, to counterbalance this effect. CoRe drastically improves the false discovery rate of the originally predicted networks. We further suggest a consensus approach in combination with an extended set of topological features to obtain a more accurate estimate of the eukaryotic regulatory network for yeast. In the course of this work we show how correlation patterns can be detected and how they can be applied to various problem settings in computational molecular biology. We develop and discuss competitive approaches for the prediction of protein contacts, artifact repair, differential expression analysis, and network inference and show their applicability in practical setups.

    Automated sample preparation for streamlined proteomic profiling of clinical specimens

    The genetic information of all life is encoded within DNA molecules that are translated into functional entities, so-called proteins. They are responsible for operating and controlling a vast array of molecular mechanisms in any biological system and are, as a result, ubiquitous in (patho)physiology. In addition, proteins are the primary target of drugs and can have a central role as biomarkers for diagnostic, prognostic, or predictive purposes. However, many regulatory mechanisms and spatiotemporal influences prevent an accurate prediction of a protein's abundance and its associated functionality based on the genome information alone. Nowadays, it has become possible to measure and quantify thousands of proteins simultaneously; however, this requires comprehensive sample preparation procedures. Currently, no universally standardized method enables a routine application of proteome profiling in a clinical environment. In this thesis, an automated workflow for the efficient processing of the most common and quantity-limited specimens is described. In order to demonstrate the usefulness of the end-to-end pipeline, which was termed autoSP3, it was applied to the proteome profiling of histologically defined and WHO-recognized growth patterns of pulmonary adenocarcinoma (ADC) that currently have limited clinical implications. Secondly, we investigated the proteome composition of a molecularly well-defined cohort of ependymoma (EPN) pediatric brain tumors. Despite the availability of substantial NGS data and their ability to differentiate nine distinct subgroups, the majority of tumors remained without functional insight. Here, proteome profiling could provide a missing link and highlight several subgroup-specific protein targets.
In summary, this thesis describes the optimization of SP3 and its automation into a robust and cost-efficient pipeline for quantity-limited sample preparation, and provides biological insight into the proteome composition of ADC growth patterns and EPN tumor subgroups.

    Detection of protein modifications by noise-model-based analysis of regulatory information

    In quantitative proteomics the amounts of individual peptides and proteins within differentially treated cells are compared by mass spectrometry. Imprecision in the measurements can distort the results and thus the formulation of hypotheses. Low signal intensities are especially affected, since a considerable share of such a signal may be caused by noise. In this work, the intensity-dependent noise observed within a defined quantitative mass-spectrometry-based workflow was captured by developing a specific noise model. The calculation of regulation factors, both for single peptides and for groups of peptides (e.g. all peptides identified within one protein), is derived from the noise model. In doing so, all calculations are weighted according to the robustness of the underlying data. The regulatory information obtained in this way is visualised by likelihood curves presenting the likelihood of the most probable as well as alternative regulation factors. The reliability of the most suitable regulation factor, and consequently the robustness of the data, can be inferred from the shape of the curves. As the detection of novel post-translational modifications (PTM) is essential for the understanding of dynamic protein networks, many biological projects currently aim at quantitative analyses by mass spectrometry on the peptide level. Modified peptides appear differentially regulated due to the increase and decrease of the amounts of their modified and unmodified variants. The detection of differentially regulated peptides within the same protein is highly interesting for the investigation of new peptide modifications.
For this purpose, in addition to the calculation of regulatory information, a clustering algorithm was developed in this work that, based on this information, is able to find differentially regulated peptides of a protein.