    Cross-validation applied to Approximate Bayesian Computation (original title: Kreuzvalidierung angewandt auf Approximate Bayesian Computation)

    Approximate Bayesian Computation (ABC) is a modern technique for simulating the posterior distribution when the likelihood cannot be determined analytically. ABC is currently applied mainly in population genetics. An important question in applying ABC that has not yet been answered satisfactorily is how to choose the acceptance threshold when simulating the posterior distribution. This thesis examines whether cross-validation can serve as a tool for selecting the acceptance threshold.
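
    The role of the acceptance threshold can be illustrated with a minimal rejection-ABC sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy model (normal data with unknown mean), the Uniform(-5, 5) prior, the sample mean as summary statistic, and the helper names `build_reference_table`, `abc_reject` and `cv_error`. The cross-validation loop treats each held-out simulation as pseudo-observed data and scores a threshold by how well the accepted parameters recover the known true value, which is one plausible way to compare thresholds, not necessarily the procedure studied in the thesis.

```python
import random
import statistics

def simulate_summary(theta, rng, n=20):
    """Simulate n draws from Normal(theta, 1) and return their mean as the summary statistic."""
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))

def build_reference_table(num_sims, rng):
    """Draw parameters from an assumed Uniform(-5, 5) prior and simulate one summary per draw."""
    table = []
    for _ in range(num_sims):
        theta = rng.uniform(-5.0, 5.0)
        table.append((theta, simulate_summary(theta, rng)))
    return table

def abc_reject(table, observed_summary, epsilon):
    """Keep the parameters whose simulated summary lies within epsilon of the observed one."""
    return [theta for theta, s in table if abs(s - observed_summary) <= epsilon]

def cv_error(table, epsilon, num_points=50):
    """Leave-one-out check: each held-out simulation plays the role of the observed data."""
    errors = []
    for i in range(min(num_points, len(table))):
        true_theta, pseudo_observed = table[i]
        rest = table[:i] + table[i + 1:]
        accepted = abc_reject(rest, pseudo_observed, epsilon)
        if accepted:  # skip held-out points for which nothing is accepted
            errors.append((statistics.fmean(accepted) - true_theta) ** 2)
    return statistics.fmean(errors) if errors else float("inf")

rng = random.Random(0)
table = build_reference_table(2000, rng)

# Approximate posterior sample for an observed summary of 1.0 at one fixed threshold.
posterior = abc_reject(table, observed_summary=1.0, epsilon=0.1)

# Mean squared prediction error of the posterior mean for several candidate thresholds.
cv_scores = {eps: cv_error(table, eps) for eps in (0.05, 0.2, 1.0, 3.0)}
```

    In this toy setting a very large threshold accepts almost every simulation, so the posterior mean drifts toward the prior mean and the cross-validated error grows; a very small threshold accepts few simulations and makes the estimate noisy.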

    Deficiency of nucleotide excision repair is associated with mutational signature observed in cancer

    Nucleotide excision repair (NER) is one of the main DNA repair pathways that protect cells against genomic damage. Disruption of this pathway can contribute to the development of cancer and accelerate aging. Mutational characteristics of NER-deficiency may reveal important diagnostic opportunities, as tumors deficient in NER are more sensitive to certain treatments. Here, we analyzed the genome-wide somatic mutational profiles of adult stem cells (ASCs) from NER-deficient Ercc1−/Δ mice. Our results indicate that NER-deficiency increases the base substitution load twofold in liver but not in small intestinal ASCs, which coincides with the tissue-specific aging pathology observed in these mice. Moreover, NER-deficient ASCs of both tissues show an increased contribution of Signature 8 mutations, which is a mutational pattern with unknown etiology that is recurrently observed in various cancer types. The scattered genomic distribution of the base substitutions indicates that deficiency of global-genome NER (GG-NER) underlies the observed mutational consequences. In line with this, we observe increased Signature 8 mutations in a GG-NER-deficient human organoid culture, in which XPC was deleted using CRISPR-Cas9 gene-editing. Furthermore, genomes of NER-deficient breast tumors show an increased contribution of Signature 8 mutations compared with NER-proficient tumors. Elevated levels of Signature 8 mutations could therefore contribute to a predictor of NER-deficiency based on a patient's mutational profile

    Sex differences in oncogenic mutational processes.

    Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here we report a pan-cancer analysis of sex differences in whole genomes of 1983 tumours of 28 subtypes as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We confirm the results of exome studies and uncover previously undescribed sex differences. These include sex biases in coding and non-coding cancer drivers, in mutation prevalence and, strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in molecular cancer research.

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006. Abstract: The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.

    An approximate maximum likelihood algorithm with case studies

    The likelihood function is the basis of many statistical inference procedures, in both Bayesian and classical statistics. However, in areas such as population genetics, systems biology, epidemiology, queuing systems and spatial statistics, complex stochastic models are required for which the likelihood cannot be obtained analytically. In recent years, increasing computing power has made it possible to circumvent this problem with simulation-based methods such as Indirect Inference and Approximate Bayesian Computation (ABC). In its most basic form, ABC involves sampling from the parameter space and keeping those parameters that produce data that fit the actually observed data sufficiently well. Exploring the whole parameter space, however, makes this approach inefficient in high-dimensional problems. This led to the proposal of more sophisticated iterative methods of inference such as particle filters. Here, we propose an alternative approach that is based on stochastic gradient methods. By moving along a simulated gradient, the algorithm produces a sequence of estimates that will eventually converge to the maximum likelihood estimate (or, equivalently, to the maximum of the posterior). This approach reduces the number of simulations in regions of low likelihood while being flexibly applicable to a large variety of problems. We present a set of conditions under which the algorithm converges to the maximum likelihood estimate with probability 1, and we also explore the properties of the resulting estimator in practical applications. To this end we first propose a set of tuning guidelines that improve the robustness of the algorithm against overly noisy simulation results. Then, we investigate the performance of our approach in simulation studies with normally distributed data and apply our algorithm to two models with intractable likelihood functions. First, we present an application to parameter estimation for a queuing process. Second, we re-analyse population genetic data and estimate parameters describing the demographic history of the Bornean and Sumatran orang-utan populations.
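
    The idea of climbing a simulated gradient can be sketched as follows. This is a generic Kiefer-Wolfowitz-style finite-difference scheme under stated assumptions, not the dissertation's algorithm: the toy model (normal data with unknown mean, sample mean as summary), the Gaussian kernel density estimate of the likelihood, the gain sequences, the gradient clipping, and the names `log_lik_estimate` and `approximate_mle` are all hypothetical choices made for illustration.

```python
import math
import random

def simulate_summary(theta, rng, n=20):
    """Simulate n draws from Normal(theta, 1) and return their mean as the summary statistic."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n

def log_lik_estimate(theta, observed, rng, num_sims=100, bandwidth=0.2):
    """Estimate the log-likelihood of the observed summary at theta
    by a Gaussian kernel density estimate over simulated summaries."""
    total = 0.0
    for _ in range(num_sims):
        z = (simulate_summary(theta, rng) - observed) / bandwidth
        total += math.exp(-0.5 * z * z)
    density = total / (num_sims * bandwidth * math.sqrt(2.0 * math.pi))
    return math.log(max(density, 1e-300))  # guard against log(0)

def approximate_mle(observed, theta0, rng, num_iters=150):
    """Stochastic gradient ascent on the simulated log-likelihood."""
    theta = theta0
    for k in range(1, num_iters + 1):
        step = 0.5 / (k + 10)          # decreasing step size (assumed gain sequence)
        width = 0.5 / k ** (1.0 / 6)   # shrinking finite-difference width (assumed)
        upper = log_lik_estimate(theta + width, observed, rng)
        lower = log_lik_estimate(theta - width, observed, rng)
        grad = (upper - lower) / (2.0 * width)
        grad = max(-10.0, min(10.0, grad))  # clip noisy gradients for robustness
        theta += step * grad
    return theta

rng = random.Random(1)
theta_hat = approximate_mle(observed=1.0, theta0=0.0, rng=rng)
```

    Because every iterate after the first few stays near the region of high likelihood, few simulations are spent on implausible parameter values, which is the efficiency argument made in the abstract. The clipping step plays the role of a simple robustness measure against very noisy gradient estimates.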

    Can secondary contact following range expansion be distinguished from barriers to gene flow?

    Secondary contact is the reestablishment of gene flow between sister populations that have diverged. For instance, at the end of the Quaternary glaciations in Europe, secondary contact occurred during the northward expansion of the populations which had found refugia in the southern peninsulas. With the advent of multi-locus markers, secondary contact can be investigated using various molecular signatures including gradients of allele frequency, admixture clines, and local increase of genetic differentiation. We use coalescent simulations to investigate whether molecular data provide enough information to distinguish between secondary contact following range expansion and an alternative evolutionary scenario consisting of a barrier to gene flow in an isolation-by-distance model. We find that an excess of linkage disequilibrium and of genetic diversity at the suture zone is a unique signature of secondary contact. We also find that the directionality index ψ, which was proposed to study range expansion, helps to distinguish between the two hypotheses. However, although evidence for secondary contact is usually conveyed by statistics related to admixture coefficients, we find that they can be confounded by isolation-by-distance. We recommend accounting for the spatial distribution of individuals when investigating secondary contact in order to better reflect the complex spatio-temporal evolution of populations and species.