28 research outputs found

A Bayesian Framework for Multivariate Differential Analysis accounting for Missing Data

Author: Chion Marie
Leroy Arthur
Publication date: 18/07/2023

Current statistical methods in differential proteomics analysis generally leave aside several challenges, such as missing values, correlations between peptide intensities and uncertainty quantification. Moreover, they provide point estimates, such as the mean intensity for a given peptide or protein in a given condition. The decision of whether an analyte should be considered as differential is then based on comparing the p-value to a significance threshold, usually 5%. In the state-of-the-art limma approach, a hierarchical model is used to deduce the posterior distribution of the variance estimator for each analyte. The expectation of this distribution is then used as a moderated estimation of variance and is injected directly into the expression of the t-statistic. However, instead of merely relying on the moderated estimates, we could provide more powerful and intuitive results by leveraging a fully Bayesian approach and hence allow the quantification of uncertainty. The present work introduces this idea by taking advantage of standard results from Bayesian inference with conjugate priors in hierarchical models to derive a methodology tailored to handle multiple imputation contexts. Furthermore, we aim to tackle a more general problem of multivariate differential analysis, to account for possible inter-peptide correlations. By defining a hierarchical model with prior distributions on both mean and variance parameters, we achieve a global quantification of uncertainty for differential analysis. The inference is thus performed by computing the posterior distribution for the difference in mean peptide intensities between two experimental conditions. In contrast to more flexible models that can be achieved with hierarchical structures, our choice of conjugate priors maintains analytical expressions for direct sampling from posterior distributions without requiring expensive MCMC methods. Comment: 21 pages, 5 figures
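
The conjugate-prior idea in the abstract above can be illustrated with a minimal univariate sketch. It assumes a Normal-Inverse-Gamma prior on the mean and variance of a single peptide's intensity in each condition, which is a simplification: the paper's multivariate model and its handling of imputed data are not reproduced here. The function names and the prior hyperparameters (`mu0`, `lam0`, `alpha0`, `beta0`) are hypothetical choices for illustration only.

```python
import numpy as np


def nig_posterior(y, mu0=0.0, lam0=1.0, alpha0=1.0, beta0=1.0):
    """Closed-form Normal-Inverse-Gamma posterior parameters for one group.

    Assumed model (illustrative, not taken from the paper):
      y_i | mu, s2 ~ N(mu, s2),  mu | s2 ~ N(mu0, s2 / lam0),  s2 ~ InvGamma(alpha0, beta0)
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()
    lam_n = lam0 + n
    mu_n = (lam0 * mu0 + n * ybar) / lam_n
    alpha_n = alpha0 + n / 2.0
    beta_n = (
        beta0
        + 0.5 * np.sum((y - ybar) ** 2)
        + lam0 * n * (ybar - mu0) ** 2 / (2.0 * lam_n)
    )
    return mu_n, lam_n, alpha_n, beta_n


def sample_mean(params, size, rng):
    """Draw the mean directly from its posterior; no MCMC is involved."""
    mu_n, lam_n, alpha_n, beta_n = params
    # If G ~ Gamma(shape=alpha_n, rate=beta_n), then 1 / G ~ InvGamma(alpha_n, beta_n).
    sigma2 = 1.0 / rng.gamma(shape=alpha_n, scale=1.0 / beta_n, size=size)
    return rng.normal(mu_n, np.sqrt(sigma2 / lam_n))


def posterior_difference(y_a, y_b, size=10_000, seed=0, **prior):
    """Posterior draws of mean(condition A) - mean(condition B)."""
    rng = np.random.default_rng(seed)
    draws_a = sample_mean(nig_posterior(y_a, **prior), size, rng)
    draws_b = sample_mean(nig_posterior(y_b, **prior), size, rng)
    return draws_a - draws_b
```

With draws of the difference in hand, quantities such as a credible interval (`np.quantile(draws, [0.025, 0.975])`) or the posterior probability that the difference is positive (`(draws > 0).mean()`) follow directly, which is the kind of uncertainty quantification a single p-value does not give.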

arXiv.org e-Print Archive

Around the scene and the sequence: problems of categorization

Author: AUMONT Jacques
BEAIRSTO Ric.
BONITZER Pascal
BORDWELL David
CHION Michel
COMPARATO Doc.
JOURNOT Marie-Thérèse
JULLIER Laurent
MACIEL Luiz Carlos
MAMET David
MCKEE Robert
METZ Christian
PUDOVKIN Vsevolod
VALE Eugene
VANOYE Francis
Publication venue: FapUNIFESP (SciELO)

Crossref

Development of new statistical methodologies for quantitative proteomics data analysis

Author: Chion Marie
Publication venue: HAL CCSD
Publication date: 16/12/2021

Proteomic analysis consists of studying all the proteins expressed by a given biological system, at a given time and under given conditions. Recent technological advances in mass spectrometry and liquid chromatography make it possible to envisage large-scale and high-throughput proteomic studies. This thesis work focuses on developing statistical methodologies for the analysis of quantitative proteomics data and thus presents three main contributions. The first part proposes to use monotone spline regression models to estimate the amounts of all peptides detected in a sample using internal standards labelled for a subset of targeted peptides. The second part presents a strategy to account for the uncertainty induced by the multiple imputation process in the differential analysis, also implemented in the mi4p R package. Finally, the third part proposes a Bayesian framework for differential analysis, making it notably possible to consider the correlations between the intensities of peptides.
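
To make concrete what "uncertainty induced by the multiple imputation process" can mean, here is a minimal sketch of Rubin's rules, a standard way to pool estimates computed on several imputed datasets. The abstract does not say which pooling scheme the thesis or the mi4p package uses, so applying Rubin's rules here is an assumption, and `pool_rubin` is a hypothetical helper, not a function of mi4p.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2:
        raise ValueError("at least two imputed datasets are needed")
    q_bar = q.mean()          # pooled point estimate
    within = u.mean()         # average within-imputation variance
    between = q.var(ddof=1)   # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total
```

The between-imputation term is the extra variance that is ignored when a single imputed dataset is analysed as if it were fully observed.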

Development of new statistical methodologies for quantitative proteomics data analysis

Author: Chion Marie
Publication venue: HAL CCSD
Publication date: 16/12/2021

HAL Descartes

Development of new statistical methodologies for quantitative proteomics data analysis

Author: Chion Marie
Publication venue: HAL CCSD
Publication date: 16/12/2021

HAL-IN2P3

Development of new statistical methodologies for quantitative proteomics data analysis

Author: Chion Marie
Publication date: 16/12/2021

Proteomic analysis consists of studying all the proteins expressed by a given biological system, at a given time and under given conditions. Recent technological advances in mass spectrometry and liquid chromatography make it possible to envisage large-scale and high-throughput proteomic studies. This thesis work focuses on developing statistical methodologies for the analysis of quantitative proteomics data and thus presents three main contributions. The first part proposes to use monotone spline regression models to estimate the amounts of all peptides detected in a sample using internal standards labelled for a subset of targeted peptides. The second part presents a strategy to account for the uncertainty induced by the multiple imputation process in the differential analysis, also implemented in the mi4p R package. Finally, the third part proposes a Bayesian framework for differential analysis, making it notably possible to consider the correlations between the intensities of peptides.

Thèses-Unistra : thèses et mémoires électroniques de l'Université de Strasbourg

Theses.fr

Development of new statistical methodologies for quantitative proteomics data analysis

Author: Chion Marie
Publication venue: HAL CCSD
Publication date: 16/12/2021

Thèses en Ligne

Dealing with imputation-caused variance using moderated t-test

Author: Bertrand Frédéric
Carapito Christine
Chion Marie
Publication venue: HAL CCSD
Publication date: 01/07/2020

Cancelled. International audience

HAL-IN2P3

HAL Descartes

Hal-Diderot