507 research outputs found

Haplotype inference in general pedigrees with two sites

Author: B Reed
BMY Chan
D Gusfield
D Qian
DD Doan
Duong D Doan
J Guo
J Li
J Xiao
K Zhang
L Liu
Patricia A Evans
R Niedermeier
RG Downey
RM Karp
S Xu
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies, but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with two sites for all members. Results: We show that this NP-hard problem can be parametrically reduced to the Bipartization by Edge Removal problem and can therefore be solved by an O(2^k · n^2) exact algorithm, where n is the number of members and k is the number of recombination events. Conclusions: Our work can therefore be useful for genetic disease studies to track down how changes in haplotypes, such as recombinations, relate to genetic disease.
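For readers unfamiliar with the target problem of the reduction, the sketch below illustrates what Bipartization by Edge Removal asks: find a smallest set of edges whose removal leaves a bipartite graph. It is a brute-force illustration only, not the O(2^k · n^2) algorithm from the paper, and it says nothing about how a pedigree is turned into a graph. The function names `is_bipartite` and `min_edge_bipartization` are hypothetical, chosen for this example.

```python
from collections import deque
from itertools import combinations


def is_bipartite(n, edges):
    """Return True if the graph on vertices 0..n-1 can be 2-colored."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:  # breadth-first 2-coloring of one component
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False  # an odd cycle exists
    return True


def min_edge_bipartization(n, edges):
    """Brute force: try removal sets of increasing size k.

    Assumption for illustration only: this enumerates all edge subsets,
    so it is exponential in the number of edges, unlike the
    parameterized algorithm described in the abstract.
    """
    edges = list(edges)
    for k in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), k):
            removed_set = set(removed)
            kept = [e for i, e in enumerate(edges) if i not in removed_set]
            if is_bipartite(n, kept):
                return [edges[i] for i in removed]
    return []  # not reached: removing every edge always works
```

A triangle needs one edge removed, while an even cycle needs none; the size of the returned set plays the role of the parameter k in the abstract.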

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

An FPT haplotyping algorithm on pedigrees with a small number of sites

Author: BMY Chan
D Qian
DD Doan
Duong D Doan
F Huffner
J Guo
J Li
J Xiao
JC Picard
L Liu
Patricia A Evans
R Niedermeier
RM Karp
S Xu
TH Cormen
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: Genetic disease studies investigate relationships between changes in chromosomes and genetic diseases. Single haplotypes provide useful information for these studies, but extracting single haplotypes directly by biochemical methods is expensive. A computational method to infer haplotypes from genotype data is therefore important. We investigate the problem of computing the minimum number of recombination events for general pedigrees with a small number of sites for all members. Results: We show that this NP-hard problem can be parametrically reduced to the Bipartization by Edge Removal problem with additional parity constraints. We solve this problem with an exact algorithm that runs in [formula image: 1748-7188-6-8-i1.gif] time, where n is the number of members, m is the number of sites, and k is the number of recombination events. Conclusions: This algorithm infers haplotypes for a small number of sites, which can be useful for genetic disease studies to track down how changes in haplotypes, such as recombinations, relate to genetic disease.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Parsimony-based genetic algorithm for haplotype resolution and block partitioning

Author: Sazonova Nadezhda A.
Publication venue: The Research Repository @ WVU
Publication date: 01/12/2007

This dissertation proposes a new algorithm for performing simultaneous haplotype resolution and block partitioning. The algorithm is based on a genetic algorithm approach and the parsimony principle. The multilocus LD measure (Normalized Entropy Difference) is used as a block identification criterion. The proposed algorithm incorporates missing data as a part of the model and allows blocks of arbitrary length. In addition, the algorithm provides scores for the block boundaries, which measure the strength of the boundaries at specific positions. The performance of the proposed algorithm was validated by running it on several publicly available data sets, including the HapMap data, and comparing the results to those of existing state-of-the-art algorithms. The results show that the accuracy of haplotype decomposition achieved by the proposed genetic algorithm falls within the range achieved by the other algorithms. The block structure output by our algorithm in general agrees with the block structure that the other algorithms produce for the same data. Thus, the proposed algorithm can be successfully used for block partitioning and haplotype phasing while providing some valuable new features, such as scores for block boundaries and fully incorporated treatment of missing data. In addition, the proposed algorithm for haplotyping and block partitioning is used in the development of a new clustering algorithm for two-population mixed genotype samples. The proposed clustering algorithm extracts from the given genotype sample two clusters with substantially different block structures and finds a haplotype resolution and block partitioning for each cluster.

The Research Repository @ WVU (West Virginia University)

An Efficient Algorithm for Haplotype Inference on Pedigrees with a Small Number of Recombinants

Author: Jing Xiao
Tao Jiang
Tiancheng Lou
Publication venue: Springer Nature
Publication date: 01/01/2011

Springer - Publisher Connector

Haplotype Inference through Sequential Monte Carlo

Author: Iliadis Alexandros
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2013

Technological advances in the last decade have given rise to large genome-wide studies, which have helped researchers gain better insight into the genetic basis of many common diseases. As the number of samples and the genome coverage have increased dramatically, it is now typical for individuals to be genotyped on high-throughput platforms at more than 500,000 single nucleotide polymorphisms. At the same time, theoretical and empirical arguments have been made for the use of haplotypes, i.e. combinations of alleles at multiple loci on individual chromosomes, as opposed to genotypes, so the problem of haplotype inference is particularly relevant. Existing haplotyping methods include population-based methods, methods for pooled DNA samples, and methods for family and pedigree data. Furthermore, the vast amount of available data poses new challenges for haplotyping algorithms. Candidate methods should scale well with the size of the datasets, as the number of loci and the number of individuals run well into the thousands. In addition, as genotyping can be performed routinely, researchers encounter a number of specific new scenarios, which can be seen as hybrids between the population and pedigree inference scenarios and require special care to incorporate the maximum amount of information. In this thesis we present a Sequential Monte Carlo framework (TDS) and tailor it to address instances of haplotype inference and frequency estimation problems. Specifically, we first adjust our framework to perform haplotype inference in trio families, resulting in a methodology that demonstrates an excellent tradeoff between speed and accuracy. We then extend our method to handle general nuclear families and demonstrate the gain from using our approach as opposed to alternative scenarios. We further address the problem of haplotype inference in pooled data, where we show that our method achieves improved performance over existing approaches in datasets with a large number of markers.
We finally present a framework to handle the haplotype inference problem in regions of CNV/SNP data. Using our approach, we can phase datasets where the ploidy of an individual can vary along the region and each individual can have different breakpoints.

Columbia University Academic Commons

Statistical physics methods in computational biology

Author: Zagordi Osvaldo
Publication venue: Trieste
Publication date: 03/07/2007

The interest of statistical physics in combinatorial optimization is not new; it suffices to think of a tool as famous as simulated annealing. Recently, it has also resorted to statistical inference to address some "hard" optimization problems, developing a new class of message passing algorithms. Three applications to computational biology are presented in this thesis, namely: 1) Boolean networks, a model for gene regulatory networks; 2) haplotype inference, to study the genetic information present in a population; 3) clustering, a general machine learning tool.

Sissa Digital Library

Estimating genealogies from linked marker data: a Bayesian approach

Author: Arjas Elja
Gasbarra Dario
Pirinen Matti
Sillanpää Mikko J
Publication venue: BioMed Central
Publication date: 01/01/2007

Abstract. Background: Answers to several fundamental questions in statistical genetics would ideally require knowledge of the ancestral pedigree and of the gene flow therein. A few examples of such questions are haplotype estimation, relatedness and relationship estimation, gene mapping by combining pedigree and linkage disequilibrium information, and estimation of population structure. Results: We present a probabilistic method for genealogy reconstruction. Starting with a group of genotyped individuals from some population isolate, we explore the state space of their possible ancestral histories under our Bayesian model by using Markov chain Monte Carlo (MCMC) sampling techniques. The main contribution of our work is the development of sampling algorithms in the resulting vast state space with highly dependent variables. The main drawback is the computational complexity that limits the time horizon within which explicit reconstructions can be carried out in practice. Conclusion: The estimates for IBD (identity-by-descent) and haplotype distributions are tested in several settings using simulated data. The results appear to be promising for a further development of the method.

Crossref

Directory of Open Access Journals

Julkari

PubMed Central

Modelling dependencies in genetic-marker data and its application to haplotype analysis

Author: Schouten Michael T.
Publication venue: The University of Edinburgh
Publication date: 01/01/2008

The objective of this thesis is to develop new methods to reconstruct haplotypes from phase-unknown genotypes. The need for new methodologies is motivated by the increasing availability of high-resolution marker data for many species. Such markers typically exhibit correlations, a phenomenon known as Linkage Disequilibrium (LD). It is believed that reconstructed haplotypes for markers in high LD can be valuable for a variety of application areas in population genetics, including reconstructing population history and identifying genetic disease variants. Traditionally, haplotype reconstruction methods can be categorized according to whether they operate on a single pedigree or a collection of unrelated individuals. The thesis begins with a critical assessment of the limitations of existing methods, and then presents a unified statistical framework that can accommodate pedigree data, unrelated individuals and tightly linked markers. The framework makes use of graphical models, where inference entails representing the relevant joint probability distribution as a graph and then using associated algorithms to facilitate computation. The graphical model formalism provides invaluable tools to facilitate model specification, visualization, and inference. Once the unified framework is developed, a broad range of simulation studies are conducted using previously published haplotype data. Important contributions include demonstrating the different ways in which the haplotype frequency distribution can impact the accuracy of both the phase assignments and haplotype frequency estimates; evaluating the effectiveness of using family data to improve accuracy for different frequency profiles; and assessing the dangers of treating related individuals as unrelated in an association study.

Edinburgh Research Archive

Heuristic exploitation of genetic structure in marker-assisted gene pyramiding problems

Author
Publication venue: BioMed Central
Publication date: 30/01/2015

Springer - Publisher Connector

The Parameterized Complexity of the Shared Center Problem

Author: B. Ma
D. Marx
I. Leykin
J. Gramm
J. Li
K. Doi
L. Wang
R. Impagliazzo
R.G. Downey
W. Ma
Z.-Z. Chen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Crossref