3 research outputs found

    Minimizing recombinations in consensus networks for phylogeographic studies

    Abstract

    Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: given two networks G_1 and G_2, with both mutation and recombination events, defined on overlapping sets of extant units, the objective is to compute a consensus network G_3 with the minimum number of additional recombinations. We describe a polynomial-time algorithm with a guarantee that the number of computed new recombination events is within ϵ = sz(G_1, G_2) of the optimal number of recombinations, where sz is a well-behaved function of the sizes and topologies of G_1 and G_2. To date, this is the best known result for a network consensus problem.

    Results: Although the network consensus problem can be applied to a variety of domains, here we focus on the structure of human populations. With our preliminary analysis of a segment of the human Chromosome X data, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations.

    Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divides based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model. Since the model in principle gives us more information embedded in the networks, in our future work we plan to investigate more non-traditional questions via the structures computed by our methodology.

    A New Method to Reconstruct Recombination Events at a Genomic Scale

    Recombination is one of the main forces shaping genome diversity, but the information it generates is often overlooked. A recombination event creates a junction between two parental sequences that may be transmitted to subsequent generations. Just like mutations, these junctions carry evidence of the shared past of the sequences. We present the IRiS algorithm, which detects past recombination events from extant sequences and specifies the location of each recombination and which sequences are the recombinants. We have validated and calibrated IRiS for the human genome using coalescent simulations replicating standard human demographic history and a variable recombination rate model, and we have fine-tuned IRiS parameters to simultaneously optimize for false discovery rate, sensitivity, and accuracy in placing the recombination events in the sequence. Newer recombinations overwrite traces of past ones, and our results indicate that more recent recombinations are detected by IRiS with greater sensitivity. IRiS analysis of the MS32 region, previously studied using sperm typing, showed good concordance with estimated recombination rates. We also applied IRiS to haplotypes for 18 X-chromosome regions in HapMap Phase III populations. Recombination events detected for each individual were recoded as binary allelic states and combined into recotypes. Principal component analysis and multidimensional scaling based on recotypes reproduced the relationships between the eleven HapMap Phase III populations that can be expected from known human population history, thus further validating IRiS. We believe that our new method will contribute to the study of the distribution of recombination events across genomes and, for the first time, will allow the use of recombination as a genetic marker to study human genetic variation.
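
The recoding step described in this abstract (detected recombination events turned into binary allelic states and combined into recotypes) can be illustrated with a minimal sketch. This is not IRiS and not the authors' pipeline: the input format, the function names `build_recotypes`, `hamming` and `pairwise_distances`, the event labels, and the use of Hamming distance as an input to multidimensional scaling are all assumptions made for illustration.

```python
from itertools import combinations


def build_recotypes(events_by_individual):
    """Recode each individual's detected events as a 0/1 vector (a 'recotype').

    events_by_individual: hypothetical mapping from an individual's name to
    the set of recombination-event labels detected for that individual.
    """
    # One column per distinct event, in a fixed sorted order.
    all_events = sorted(set().union(*events_by_individual.values()))
    recotypes = {
        name: [1 if event in events else 0 for event in all_events]
        for name, events in events_by_individual.items()
    }
    return all_events, recotypes


def hamming(a, b):
    """Number of positions at which two equal-length recotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(recotypes):
    """Hamming distance for every unordered pair of individuals."""
    return {
        (p, q): hamming(recotypes[p], recotypes[q])
        for p, q in combinations(sorted(recotypes), 2)
    }


# Toy data: event labels r1..r4 are made up, not real detections.
detected = {
    "ind_A": {"r1", "r2"},
    "ind_B": {"r2", "r3"},
    "ind_C": {"r1", "r2", "r4"},
}
events, recotypes = build_recotypes(detected)
distances = pairwise_distances(recotypes)
```

A distance table of this kind is one possible input to multidimensional scaling, while the 0/1 matrix itself could be passed to a principal component analysis; the abstract does not state which distance measure, if any, was used.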

    A minimal descriptor of an ancestral recombinations graph

    Abstract

    Background: An Ancestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, and genetic exchange events, such as recombinations; it captures the (genetic) dynamics of a population evolving over generations.

    Results: In this paper, we identify the structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to those of G, and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as those of G. We also prove that even an unbounded G has a finite minimal descriptor that continues to preserve certain (graph-theoretic) properties of G and, for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential.

    Conclusions: Based on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend themselves very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).