Recent developments in single cell sequencing allow us to elucidate
processes of individual cells in unprecedented detail. This detail
provides new insights into the progression of cells during cell type
differentiation. Cell type heterogeneity shows the complexity of cells
working together to produce organ function on a macro level. A better
understanding of single cell transcriptomics promises to bring us
closer to the ultimate goal of understanding the function of individual
cells and their contribution to higher-level function in their environment.
Characterizing the transcriptome of single cells requires us to
understand and be able to model the latent processes of cell functions
that explain the biological variance and richness of gene expression
measurements. In this thesis, we describe ways of jointly modelling
biological function and unwanted technical and biological confounding
variation using Gaussian process latent variable models. In addition
to mathematical modelling of latent processes, we provide insights
into the understanding of research code and the significance of
computer science in the development of techniques for single cell
experiments.
We describe the process of understanding complex
machine learning algorithms and translating them into usable
software. We then proceed to apply these algorithms. We show how
proper research software design underlying the implementation can lead
to a large user base in other areas of expertise, such as single cell gene
expression. To demonstrate the value of properly designed software
underlying a research project, we present other software packages built
upon the software developed during this thesis and show how they can be
applied to single cell gene expression experiments.
Understanding the underlying function of cells seems within reach
through these new techniques that allow us to unravel the
transcriptome of single cells. We describe probabilistic techniques for
identifying the latent functions of cells, while focusing on the
software and ease-of-use aspects of supplying proper research code
that other researchers can apply.