thesis

Minimal models of evolution: germline fitness effects of cancer mutations and stochastic tunneling under strong recombination

Abstract

At a time when data on the genetic make-up of organisms are available in abundance, the theory of evolution is of immediate importance for answering key questions of biology: How can one explain the variation seen in the DNA of different organisms and species? What are the effects of changes in the DNA on the function of cells? What are the driving mechanisms of diseases with a genetic component, such as cancer? Minimal mathematical models of evolution provide a basis for the interpretation of DNA data. The explanations they offer are concrete and testable, and their assumptions and limitations are explicit. The application and further development of minimal evolution models is the main theme of this work. In the first part, the functional effects of mutations found in cancer cells are analyzed from the perspective of germline evolution, the process that produced the DNA of organisms as we see it today. Mutations have an effect on the fitness of healthy cells, and this effect can be estimated from the variation seen in the sequences of protein domains. This evolutionarily informed conservation score is found to be useful for identifying cancer driver genes, especially tumor suppressor genes. The relevance of this fitness scale for cancer mutations is demonstrated on a data set of mutations in protein kinase genes. This analysis is followed by an application of Hidden Markov Models (HMMs) to the detection of signals of positive selection in cancer mutation data. Cancer as an evolutionary process of cells is markedly different from the process of germline evolution. Cancer-specific selection can be seen in genes whose activity, or lack thereof, is essential for the progress of cancer. These cancer genes exhibit an increased rate of amino-acid-changing mutations, beyond the level expected by chance. The identification of these genes is a statistical task for which HMMs are shown to be most suitable.
Finally, an extended mathematical model of evolution is analyzed that describes the adaptation of a sexually reproducing population to a global fitness maximum via compensatory mutations. In a two-locus/two-allele model, the combined effects of mutation, selection, genetic drift, recombination and sign epistasis lead to adaptation via the crossing of a fitness valley in genotype space. This bottleneck can be overcome by rare large fluctuations in the allele frequencies that outweigh the effect of recombinatorial reshuffling. The relevant time scales are derived for a parameter regime that includes large recombination
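The two-locus/two-allele setting can be illustrated with a small simulation. The following is a minimal sketch and not the model as formulated in the thesis: it assumes a haploid Wright-Fisher population with genotypes ab, Ab, aB and AB, in which the single mutants carry a fitness cost `delta` and the double mutant a fitness gain `s` (one simple form of sign epistasis). The parameter names `delta`, `s`, `r`, `mu` and `N`, the order of the life-cycle steps, and the crossing criterion (AB exceeding half the population) are all assumptions made for illustration.

```python
import random

# Genotype order used throughout: ab, Ab, aB, AB (hypothetical labels).

def expected_frequencies(x, delta, s, r, mu):
    """Deterministic part of one generation: selection, recombination, mutation."""
    # Selection: single mutants pay a cost delta, the double mutant gains s.
    w = (1.0, 1.0 - delta, 1.0 - delta, 1.0 + s)
    mean_w = sum(wi * xi for wi, xi in zip(w, x))
    x_ab, x_Ab, x_aB, x_AB = (wi * xi / mean_w for wi, xi in zip(w, x))

    # Recombination at rate r reduces the linkage disequilibrium D.
    D = x_ab * x_AB - x_Ab * x_aB
    x_ab -= r * D
    x_AB -= r * D
    x_Ab += r * D
    x_aB += r * D

    # Symmetric mutation: each locus flips independently with probability mu.
    keep, flip = 1.0 - mu, mu
    y_ab = keep * keep * x_ab + keep * flip * (x_Ab + x_aB) + flip * flip * x_AB
    y_Ab = keep * keep * x_Ab + keep * flip * (x_ab + x_AB) + flip * flip * x_aB
    y_aB = keep * keep * x_aB + keep * flip * (x_ab + x_AB) + flip * flip * x_Ab
    y_AB = keep * keep * x_AB + keep * flip * (x_Ab + x_aB) + flip * flip * x_ab
    return (y_ab, y_Ab, y_aB, y_AB)

def wright_fisher_step(counts, delta, s, r, mu, rng):
    """Genetic drift: resample N individuals from the expected frequencies."""
    N = sum(counts)
    x = tuple(c / N for c in counts)
    p = expected_frequencies(x, delta, s, r, mu)
    sample = rng.choices(range(4), weights=p, k=N)
    return tuple(sample.count(i) for i in range(4))

def crossing_time(N, delta, s, r, mu, max_generations, seed=None):
    """Generations until AB exceeds half the population, or None if not reached."""
    rng = random.Random(seed)
    counts = (N, 0, 0, 0)  # start monomorphic for the ancestral genotype ab
    for t in range(1, max_generations + 1):
        counts = wright_fisher_step(counts, delta, s, r, mu, rng)
        if counts[3] > N / 2:
            return t
    return None
```

A call such as `crossing_time(N=500, delta=0.05, s=0.1, r=0.5, mu=1e-3, max_generations=20000, seed=1)` runs a single realization. Repeating it over many seeds and over a range of `r` gives a rough empirical picture of how the waiting time for a valley crossing depends on recombination in this toy setting.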
