126 research outputs found
Cross-validation applied to Approximate Bayesian Computation
Approximate Bayesian Computation (ABC) is a modern technique for simulating the posterior distribution when the likelihood cannot be determined analytically. At present, ABC is applied mainly in population genetics.
An important and not yet satisfactorily answered question in applying ABC is how to choose the acceptance threshold when simulating the posterior distribution. This work examines whether cross-validation can serve as a tool for selecting the acceptance threshold.
Interaction of manganese with striatal dopamine turnover in human alpha-synuclein transgenic mice
Deficiency of nucleotide excision repair is associated with mutational signature observed in cancer
Nucleotide excision repair (NER) is one of the main DNA repair pathways that protect cells against genomic damage. Disruption of this pathway can contribute to the development of cancer and accelerate aging. Mutational characteristics of NER-deficiency may reveal important diagnostic opportunities, as tumors deficient in NER are more sensitive to certain treatments. Here, we analyzed the genome-wide somatic mutational profiles of adult stem cells (ASCs) from NER-deficient Ercc1−/Δ mice. Our results indicate that NER-deficiency increases the base substitution load twofold in liver but not in small intestinal ASCs, which coincides with the tissue-specific aging pathology observed in these mice. Moreover, NER-deficient ASCs of both tissues show an increased contribution of Signature 8 mutations, which is a mutational pattern with unknown etiology that is recurrently observed in various cancer types. The scattered genomic distribution of the base substitutions indicates that deficiency of global-genome NER (GG-NER) underlies the observed mutational consequences. In line with this, we observe increased Signature 8 mutations in a GG-NER-deficient human organoid culture, in which XPC was deleted using CRISPR-Cas9 gene-editing. Furthermore, genomes of NER-deficient breast tumors show an increased contribution of Signature 8 mutations compared with NER-proficient tumors. Elevated levels of Signature 8 mutations could therefore contribute to a predictor of NER-deficiency based on a patient's mutational profile
Author Correction: Pan-cancer analysis of whole genomes
In the published version of this paper, the list of members of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium and their affiliations contained minor errors in the affiliations. The original Article has been corrected to include the corrected list
Sex differences in oncogenic mutational processes.
Sex differences have been observed in multiple facets of cancer epidemiology, treatment and biology, and in most cancers outside the sex organs. Efforts to link these clinical differences to specific molecular features have focused on somatic mutations within the coding regions of the genome. Here we report a pan-cancer analysis of sex differences in whole genomes of 1983 tumours of 28 subtypes as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium. We both confirm the results of exome studies, and also uncover previously undescribed sex differences. These include sex-biases in coding and non-coding cancer drivers, mutation prevalence and strikingly, in mutational signatures related to underlying mutational processes. These results underline the pervasiveness of molecular sex differences and strengthen the call for increased consideration of sex in molecular cancer research
Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples
Funder: NCI U24CA211006
The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.
An approximate maximum likelihood algorithm with case studies
The likelihood function is the basis of many statistical estimation methods, in both Bayesian and classical statistics. In many areas of application, however, complex stochastic models are used for which the likelihood cannot be derived analytically. Examples include stochastic models in population genetics, systems biology, epidemiology, queueing theory and spatial statistics. In recent years, ever faster computers have led to the development of alternative simulation-based estimation methods such as Indirect Inference and Approximate Bayesian Computation.
This dissertation proposes and investigates an alternative algorithm that approximates the maximum likelihood estimator by means of stochastic gradient methods, with the ascent directions determined by simulation. The algorithm converges to the maximum likelihood estimator (or, equivalently, to the maximum of the posterior distribution). This reduces the number of simulations in regions of the parameter space with very low likelihood. In addition, the algorithm can be applied flexibly to a wide variety of models.
Conditions are derived under which the approximate maximum likelihood algorithm converges almost surely to the maximum likelihood estimator. The practical applicability of the algorithm is also investigated. First, the algorithm is supplemented with measures that increase the robustness of the method. After initial investigations of the properties of the approximate maximum likelihood estimator using normally distributed data, it is applied to two examples with complex likelihood functions. The first is an application to parameter estimation for a queueing process. Second, the algorithm is used to reconstruct the evolutionary history of the orang-utan populations of Borneo and Sumatra.
The likelihood function is the basis of many statistical inference procedures. However, in various areas like population genetics, systems biology, epidemiology, queuing systems and spatial statistics, complex statistical models are required for which the likelihood cannot be obtained analytically. In recent years, increasing computing power has made it possible to circumvent this problem with simulation-based methods like Indirect Inference and Approximate Bayesian Computation.
In its most basic form, ABC involves sampling from the parameter space and keeping those parameters that produce data that fit sufficiently well to the actually observed data. Exploring the whole parameter space, however, makes this approach inefficient in high dimensional problems. This led to the proposal of more sophisticated iterative methods of inference such as particle filters.
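As a rough illustration of the basic sample-and-keep scheme described above, the following minimal Python sketch runs an ABC rejection sampler on a toy problem. Everything in it is an assumption made for illustration and does not come from the abstract: the normal model with known standard deviation 1, the uniform prior on [-5, 5], the sample mean as summary statistic, the threshold `epsilon`, and the function name `abc_rejection`.

```python
import random
import statistics


def abc_rejection(observed, n_draws=5000, epsilon=0.1, seed=0):
    """Toy ABC rejection sampler for the mean of a normal model (sd fixed at 1).

    Hypothetical illustration: the model, the prior and the summary statistic
    are assumptions, not taken from the abstract.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    n_obs = len(observed)
    accepted = []
    for _ in range(n_draws):
        # 1. Sample a parameter value from the (assumed) uniform prior.
        theta = rng.uniform(-5.0, 5.0)
        # 2. Simulate a data set of the same size under that parameter.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        # 3. Keep theta only if the simulated summary is close to the observed one.
        if abs(statistics.fmean(simulated) - obs_summary) <= epsilon:
            accepted.append(theta)
    return accepted


# Usage: synthetic "observed" data generated with true mean 2.0.
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_sample = abc_rejection(observed)
posterior_mean = statistics.fmean(posterior_sample)
```

Because the prior draws cover the whole interval, only a small share of them is accepted, which is the inefficiency the abstract refers to; a smaller `epsilon` tightens the approximation at the cost of even fewer accepted draws.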
Here, we propose an alternative approach that is based on stochastic gradient methods. By moving along a simulated gradient, the algorithm produces a sequence of estimates that will eventually converge to the maximum likelihood estimate (or, equivalently, to the maximum of the posterior). This approach reduces the number of simulations in regions of low likelihood while being flexibly applicable to a large variety of problems.
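The idea of moving along a simulated gradient can be sketched on a toy problem as follows. This is not the dissertation's algorithm, only a Kiefer-Wolfowitz-style finite-difference sketch under stated assumptions: a normal model with known standard deviation 1, the sample mean as summary statistic, a Gaussian kernel estimate of the likelihood, and hand-picked step-size and difference-width sequences. All function and parameter names are hypothetical.

```python
import math
import random
import statistics


def approx_log_likelihood(theta, obs_summary, rng, n_obs=20, n_sims=100, bandwidth=1.0):
    """Kernel estimate of the log-likelihood of the observed summary at theta.

    Assumed toy model: n_obs normal observations with mean theta and sd 1,
    summarised by their sample mean.
    """
    total = 0.0
    for _ in range(n_sims):
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        distance = (statistics.fmean(simulated) - obs_summary) / bandwidth
        total += math.exp(-0.5 * distance * distance)  # Gaussian kernel weight
    # The tiny constant guards against log(0) if every weight underflows.
    return math.log(total / n_sims + 1e-300)


def simulated_gradient_ascent(obs_summary, theta0=0.0, n_iter=60, seed=0):
    """Climb the simulated log-likelihood using central finite differences."""
    rng = random.Random(seed)
    theta = theta0
    for k in range(n_iter):
        step = 0.5 / (k + 1) ** 0.7   # decreasing step size (assumed schedule)
        width = 0.2 / (k + 1) ** 0.1  # shrinking finite-difference width (assumed)
        upper = approx_log_likelihood(theta + width, obs_summary, rng)
        lower = approx_log_likelihood(theta - width, obs_summary, rng)
        gradient = (upper - lower) / (2.0 * width)
        theta += step * gradient
    return theta


# Usage: with an observed sample mean of 2.0, the estimate should approach 2.0,
# which is the maximum likelihood estimate in this toy model.
theta_hat = simulated_gradient_ascent(obs_summary=2.0)
```

In this sketch each iteration simulates only near the current estimate, so few simulations are spent in low-likelihood regions once the sequence has moved towards the maximum.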
We present a set of conditions under which the algorithm converges to the maximum likelihood estimate with probability 1 and we also explore the properties of the resulting estimator in practical applications. To this end we first propose a set of tuning guidelines that improve the robustness of the algorithm against overly noisy simulation results. Then, we investigate the performance of our approach in simulation studies and apply our algorithm to two models with intractable likelihood functions. First, we present an application in the context of queuing systems. Second, we re-analyse population genetic data and estimate parameters describing the demographic history of Bornean and Sumatran orang-utan populations.
Can secondary contact following range expansion be distinguished from barriers to gene flow?
Secondary contact is the reestablishment of gene flow between sister populations that have diverged. For instance, at the end of the Quaternary glaciations in Europe, secondary contact occurred during the northward expansion of the populations which had found refugia in the southern peninsulas. With the advent of multi-locus markers, secondary contact can be investigated using various molecular signatures including gradients of allele frequency, admixture clines, and local increase of genetic differentiation. We use coalescent simulations to investigate if molecular data provide enough information to distinguish between secondary contact following range expansion and an alternative evolutionary scenario consisting of a barrier to gene flow in an isolation-by-distance model. We find that an excess of linkage disequilibrium and of genetic diversity at the suture zone is a unique signature of secondary contact. We also find that the directionality index ψ, which was proposed to study range expansion, is informative to distinguish between the two hypotheses. However, although evidence for secondary contact is usually conveyed by statistics related to admixture coefficients, we find that they can be confounded by isolation-by-distance. We recommend accounting for the spatial distribution of individuals when investigating secondary contact in order to better reflect the complex spatio-temporal evolution of populations and species.
