Bayesian methods for data-driven characterization of cells

Abstract

Modern biology abounds with questions on cell characterization, such as genetic and molecular composition and metabolic regulation. These questions are addressed through the experimentation-computation cycle: on one side of the cycle, experiments generate data, and on the other side, data analysis is used to draw conclusions from that data. In this thesis, we use Bayesian statistics to address data analysis challenges in the characterization of cells. In particular, we address the inference of intracellular metabolic reaction rates (fluxes) from measurements of 13C enrichments in metabolites (fluxomics), and cell lineage reconstruction from time-lapse microscopy images of cell colonies. In both cases, the work focuses on challenges where some entities central to the data analysis remain undetermined by the data. In the fluxomics case, the data is insufficient for determining a unique metabolic network model; in the single-cell analysis case, the images are insufficient for reconstructing the cell lineage with certainty. Apart from deriving the appropriate statistical formalisms for addressing these challenges, we propose computational strategies for performing the practical calculations. For the fluxomics calculations, we develop Markov chain Monte Carlo methods for single-model challenges and a reversible jump Markov chain Monte Carlo method for challenges with model uncertainty, tailored to the typically linearly constrained spaces of flux parameters. For handling uncertainty in cell lineage reconstruction, we develop a particle filtering approach.
