Search CORE

7 research outputs found

Minimum Segmentation for Pan-genomic Founder Reconstruction in Linear Time

Author: Cazaux Bastien
Kosolobov Dmitry
Mäkinen Veli
Norri Tuukka
Publication venue: Schloss Dagstuhl Leibniz Center for Informatics
Publication date: 01/01/2018
Field of study

Peer reviewe

arXiv.org e-Print Archive

Helsingin yliopiston digitaalinen arkisto

Minimum Segmentation for Pan-genomic Founder Reconstruction in Linear Time

Author: Cazaux Bastien
Kosolobov Dmitry
Norri Tuukka
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)
Publication date: 01/01/2018
Field of study

Given a threshold L and a set R = {R_1, ..., R_m} of m strings (haplotype sequences), each having length n, the minimum segmentation problem for founder reconstruction is to partition [1,n] into set P of disjoint segments such that each segment [a,b] in P has length at least L and the number d(a,b)=|{R_i[a,b] : 1 <= i <= m}| of distinct substrings at segment [a,b] is minimized over [a,b] in P. The distinct substrings in the segments represent founder blocks that can be concatenated to form max{d(a,b) : [a,b] in P} founder sequences representing the original R such that crossovers happen only at segment boundaries. We give an optimal O(mn) time algorithm to solve the problem, improving over earlier O(mn^2). This improvement enables to exploit the algorithm on a pan-genomic setting of input strings being aligned haplotype sequences of complete human chromosomes, with a goal of finding a representative set of references that can be indexed for read alignment and variant calling. We implemented the new algorithm and give some experimental evidence on the practicality of the approach on this pan-genomic setting

Dagstuhl Research Online Publication Server

Linear Time Construction of Indexable Founder Block Graphs

Author: Cazaux Bastien
Equi Massimo
Mäkinen Veli
Norri Tuukka
Tomescu Alexandru
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik
Publication date: 01/01/2020
Field of study

Peer reviewe

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Helsingin yliopiston digitaalinen arkisto

Constructing Founder Sets Under Allelic and Non-Allelic Homologous Recombination

Author: Bonnet Konstantinn
Marschall Tobias
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Publication date: 01/01/2022
Field of study

Homologous recombination between the maternal and paternal copies of a chromosome is a key mechanism for human inheritance and shapes population genetic properties of our species. However, a similar mechanism can also act between different copies of the same sequence, then called non-allelic homologous recombination (NAHR). This process can result in genomic rearrangements - including deletion, duplication, and inversion - and is underlying many genomic disorders. Despite its importance for genome evolution and disease, there is a lack of computational models to study genomic loci prone to NAHR. In this work, we propose such a computational model, providing a unified framework for both (allelic) homologous recombination and NAHR. Our model represents a set of genomes as a graph, where human haplotypes correspond to walks through this graph. We formulate two founder set problems under our recombination model, provide flow-based algorithms for their solution, and demonstrate scalability to problem instances arising in practice

Dagstuhl Research Online Publication Server

A randomized iterated greedy algorithm for the founder sequence reconstruction problem

Author: Benedettini Stefano
Blum Christian
Roli Andrea
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/04/2010
Field of study

The problem of inferring ancestral genetic information in terms of a set of founders of a given population arises in various biological contexts. In optimization terms, this problem can be formulated as a combinatorial string problem. The main problem of existing techniques, both exact and heuristic, is that their time complexity scales exponentially, which makes them impractical for solving large-scale instances. Basing our work on previous ideas outlined in [1], we developed a randomized iterated greedy algorithm that is able to provide good solutions in a short time span. The experimental evaluation shows that our algorithm is currently the best approximate technique, especially when large problem instances are concerned.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

International Evaluation of Research and Doctoral Training at the University of Helsinki 2005-2010 : RC-Specific Evaluation of ALKO - Algorithms and Data Analysis

Author
Publication venue
Publication date: 01/01/2012
Field of study

Helsingin yliopiston digitaalinen arkisto

Haplotype Inference via Hierarchical Genotype Parsing

Author: Esko Ukkonen
Pasi Rastas
Publication venue
Publication date
Field of study

Abstract. The within-species genetic variation due to recombinations leads to a mosaic-like structure of DNA. This structure can be modeled, e.g. by parsing sample sequences of current DNA with respect to a small number of founders. The founders represent the ancestral sequence material from which the sample was created in a sequence of recombination steps. This scenario has recently been successfully applied on developing probabilistic Hidden Markov Methods for haplotyping genotypic data. In this paper we introduce a combinatorial method for haplotyping that is based on a similar parsing idea. We formulate a polynomial-time parsing algorithm that finds minimum cross-over parse in a simplified ‘flat’ parsing model that ignores the historical hierarchy of recombinations. The problem of constructing optimal founders that would give minimum possible parse for given genotypic sequences is shown NP-hard. A heuristic locally-optimal algorithm is given for founder construction. Combined with flat parsing this already gives quite good haplotyping results. Improved haplotyping is obtained by using a hierarchical parsing that properly models the natural recombination process. For finding short hierarchical parses a greedy polynomial-time algorithm is given. Empirical haplotyping results on HapMap data are reported.

CiteSeerX