8,982 research outputs found

Inference of Ancestral Recombination Graphs through Topological Data Analysis

Author: Camara Pablo G.
Levine Arnold J.
Rabadan Raul
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2016

The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands. Comment: 33 pages, 12 figures. The accompanying software, instructions and example files used in the manuscript can be obtained from https://github.com/RabadanLab/TARGe

arXiv.org e-Print Archive

Princeton University Open Access Repository

Columbia University Academic Commons

Directory of Open Access Journals

PubMed Central

FigShare

A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks

Author: Chan Jeffrey
Jenkins Paul A.
Mathieson Sara
Perrone Valerio
Song Yun S.
Spence Jeffrey P.
Publication date: 01/01/2018

An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art. Comment: 9 pages, 8 figure
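The exchangeability property mentioned in the abstract (the output should not depend on the order of the sampled individuals) can be illustrated with a generic permutation-invariant network. The abstract does not describe the authors' architecture, so what follows is only a minimal sketch of one common construction: the same map is applied to every row and the results are mean-pooled. The names `W_phi`, `W_rho` and `exchangeable_net`, the layer sizes and the random binary data are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights: 5 sites per individual, 8 hidden features, 1 output.
W_phi = rng.normal(size=(5, 8))
W_rho = rng.normal(size=(8, 1))


def exchangeable_net(x):
    """Map an (individuals x sites) array to one output, ignoring row order."""
    h = np.maximum(x @ W_phi, 0.0)  # same ReLU layer applied to every individual
    pooled = h.mean(axis=0)         # symmetric pooling over individuals
    return pooled @ W_rho           # final map on the pooled features


# Illustrative data: 10 individuals, 5 binary sites (not real genotypes).
sample = rng.integers(0, 2, size=(10, 5)).astype(float)
shuffled = sample[rng.permutation(10)]

# Reordering the individuals leaves the output unchanged up to rounding.
print(np.allclose(exchangeable_net(sample), exchangeable_net(shuffled)))
```

Because the pooling step is symmetric, invariance holds for any weights; no hand-designed summary statistic is involved.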

arXiv.org e-Print Archive

Warwick Research Archives Portal Repository

Haverford College: Haverford Scholarship

Automated unique input output sequence generation for conformance testing of FSMs

Author: Derderian K
Harman M
Hierons RM
Qiang G
Publication venue: 'Oxford University Press (OUP)'
Publication date: 19/12/2005

This paper describes a method for automatically generating unique input output (UIO) sequences for finite state machine (FSM) conformance testing. UIOs are used in conformance testing to verify the end state of a transition sequence. UIO sequence generation is represented as a search problem and genetic algorithms are used to search this space. Empirical evidence indicates that the proposed method yields considerably better (up to 62% better) results compared with random UIO sequence generation.
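To make the notion of a UIO concrete: an input sequence is a UIO for a state if the output it produces from that state differs from the output it produces from every other state. The sketch below checks this property on a small, made-up Mealy machine. The machine `FSM`, the state names and the helpers `run` and `is_uio` are hypothetical and are not taken from the paper, which concerns searching for such sequences with genetic algorithms rather than merely checking them.

```python
# Hypothetical Mealy machine: (state, input) -> (next_state, output).
FSM = {
    ("A", "a"): ("B", 0), ("A", "b"): ("A", 1),
    ("B", "a"): ("C", 0), ("B", "b"): ("A", 0),
    ("C", "a"): ("A", 1), ("C", "b"): ("C", 1),
}
STATES = {"A", "B", "C"}


def run(fsm, state, inputs):
    """Return the list of outputs produced by applying inputs from state."""
    outputs = []
    for symbol in inputs:
        state, out = fsm[(state, symbol)]
        outputs.append(out)
    return outputs


def is_uio(fsm, states, state, inputs):
    """True if no other state yields the same outputs for these inputs."""
    expected = run(fsm, state, inputs)
    return all(run(fsm, other, inputs) != expected
               for other in states if other != state)


print(is_uio(FSM, STATES, "C", "a"))   # True: only C outputs 1 on input a
print(is_uio(FSM, STATES, "A", "a"))   # False: B also outputs 0 on input a
print(is_uio(FSM, STATES, "A", "ab"))  # True: outputs 0,0 occur only from A
```

A search-based generator, whether random or genetic, could use a check like `is_uio` as the success criterion for candidate sequences.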

CiteSeerX

Crossref

UCL Discovery

King's Research Portal

Brunel University Research Archive

When two trees go to war

Author: Kelk Steven
van Iersel Leo
Publication date: 01/01/2010

Rooted phylogenetic networks are often constructed by combining trees, clusters, triplets or characters into a single network that in some well-defined sense simultaneously represents them all. We review these four models and investigate how they are related. In general, the model chosen influences the minimum number of reticulation events required. However, when one obtains the input data from two binary trees, we show that the minimum number of reticulations is independent of the model. The numbers of reticulations necessary to represent the trees, triplets, clusters (in the softwired sense) and characters (with unrestricted multiple crossover recombination) are all equal. Furthermore, we show that these results also hold when the level of the constructed network, rather than the number of reticulations, is minimised. We use these unification results to settle several complexity questions that have been open in the field for some time. We also give explicit examples to show that already for data obtained from three binary trees the models begin to diverge.

arXiv.org e-Print Archive

Maastricht University Research Portal

Elsevier - Publisher Connector

CWI's Institutional Repository

Pure OAI Repository