
The Cluster Variation Method for Efficient Linkage Analysis on Extended Pedigrees

Author: A Rangarajan
A Thomas
C Jensen
Cornelis A Albers
E Sobel
EA Thompson
ES Lander
FX Du
G An
HA Bethe
Hilbert J Kappen
J Fishelson
J Pearl
Jensen
JR O'Connell
JS Yedidia
K Lange
L Kruglyak
M Chavira
M Welling
Martijn AR Leisink
N Friedman
NA Sheehan
NE Morton
R Kikuchi
RC Elston
SL Lauritzen
T Heskes
T Morita
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Computing exact multipoint LOD scores for extended pedigrees rapidly becomes infeasible as the number of markers and untyped individuals increases. When markers are excluded from the computation, significant power may be lost. Therefore, accurate approximate methods that take all markers into account are desirable. METHODS: We present a novel method for efficient estimation of LOD scores on extended pedigrees. Our approach is based on the Cluster Variation Method (CVM), which deterministically estimates likelihoods by performing exact computations on tractable subsets of variables (clusters) of a Bayesian network. First, a distribution over inheritances on the marker loci is approximated with the Cluster Variation Method. Then this distribution is used to estimate the LOD score for each location of the trait locus. RESULTS: First, we demonstrate that significant power may be lost if markers are ignored in the multipoint analysis. On a set of pedigrees where exact computation is possible, we compare the estimates of the LOD scores obtained with our method to the exact LOD scores. Secondly, we compare our method to a state-of-the-art MCMC sampler. When both methods are given equal computation time, our method is more efficient. Finally, we show that CVM scales to large problem instances. CONCLUSION: We conclude that the Cluster Variation Method is as accurate as MCMC and generally is more efficient. Our method is a promising alternative to approaches based on MCMC sampling.
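For readers unfamiliar with the quantity being estimated, the sketch below computes a LOD score in the simplest textbook setting: two loci, phase-known meioses, and a recombination fraction theta. This is not the Cluster Variation Method and not the multipoint computation described in the abstract; the function name and its parameters are hypothetical and chosen only for illustration.

```python
import math


def two_point_lod(theta, n_meioses, n_recombinants):
    """LOD score log10(L(theta) / L(0.5)) for phase-known meioses.

    Assumes each meiosis is independently a recombinant with probability
    theta, so L(theta) = theta**k * (1 - theta)**(n - k).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("n_recombinants must lie in [0, n_meioses]")
    k, n = n_recombinants, n_meioses
    likelihood_ratio = (theta ** k) * ((1.0 - theta) ** (n - k)) / (0.5 ** n)
    if likelihood_ratio == 0.0:
        # theta = 0 is incompatible with any observed recombinant.
        return -math.inf
    return math.log10(likelihood_ratio)


if __name__ == "__main__":
    # Scan a grid of recombination fractions for 1 recombinant in 10 meioses.
    for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.5):
        print(theta, round(two_point_lod(theta, 10, 1), 3))
```

In a multipoint analysis the likelihood in the numerator depends on all marker loci jointly, which is what makes exact computation expensive on large pedigrees.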

arXiv.org e-Print Archive

Radboud Repository

Computing Individual Risks based on Family History in Genetic Disease in the Presence of Competing Risks

Author: Bouaziz O
Lefebvre Antoine
Nuel G
Publication venue
Publication date: 14/09/2017

When considering a genetic disease with variable age at onset (e.g., diabetes, familial amyloid neuropathy, cancers), computing the individual risk of the disease based on family history (FH) is of critical interest both for clinicians and patients. Such a risk is very challenging to compute because: 1) the genotype X of the individual of interest is in general unknown; 2) the posterior distribution P(X|FH, T > t) changes with t (T is the age at disease onset for the targeted individual); 3) the competing risk of death is not negligible. In this work, we present a model of this problem using a Bayesian network combined with (right-censored) survival outcomes, where hazard rates depend only on the genotype of each individual. We explain how belief propagation can be used to obtain the posterior distribution of genotypes given the FH, and how to obtain a time-dependent posterior hazard rate for any individual in the pedigree. Finally, we use this posterior hazard rate to compute individual risk, with or without the competing risk of death. Our method is illustrated using the Claus-Easton model for breast cancer (BC). This model assumes an autosomal dominant genetic risk factor such that non-carriers (genotype 00) have a BC hazard rate $\lambda_0(t)$ while carriers (genotypes 01, 10 and 11) have a (much greater) hazard rate $\lambda_1(t)$. Both hazard rates are assumed to be piecewise constant with known values (cuts at 20, 30, ..., 80 years). The competing risk of death is derived from the national French registry.
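The way piecewise-constant hazard rates for carriers and non-carriers combine into a time-dependent posterior hazard can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rate values are hypothetical placeholders (the abstract does not list the Claus-Easton values), the carrier probability `p_carrier` is assumed to be supplied from elsewhere (in the paper it comes from belief propagation on the pedigree), and the competing risk of death is ignored.

```python
import math

CUTS = [20, 30, 40, 50, 60, 70, 80]  # age cuts in years, as in the abstract

# Hypothetical hazard rates per year on [0,20), [20,30), ..., [80,inf).
# These numbers are illustrative only and are NOT the Claus-Easton values.
RATES_NONCARRIER = [0.0, 0.0001, 0.0005, 0.001, 0.002, 0.002, 0.003, 0.003]
RATES_CARRIER = [0.0, 0.001, 0.01, 0.02, 0.02, 0.015, 0.01, 0.01]


def cumulative_hazard(t, rates, cuts=CUTS):
    """Integral of a piecewise-constant hazard from age 0 to age t."""
    total, lower = 0.0, 0.0
    for upper, rate in zip(list(cuts) + [math.inf], rates):
        if t <= lower:
            break
        total += rate * (min(t, upper) - lower)
        lower = upper
    return total


def rate_at(t, rates, cuts=CUTS):
    """Hazard rate in force at age t."""
    return rates[sum(1 for c in cuts if t >= c)]


def carrier_prob_given_unaffected(t, p_carrier):
    """P(carrier | disease-free at age t), by Bayes' rule on survival."""
    s1 = p_carrier * math.exp(-cumulative_hazard(t, RATES_CARRIER))
    s0 = (1.0 - p_carrier) * math.exp(-cumulative_hazard(t, RATES_NONCARRIER))
    return s1 / (s1 + s0)


def posterior_hazard(t, p_carrier):
    """Mixture hazard at age t, weighted by the updated carrier probability."""
    w = carrier_prob_given_unaffected(t, p_carrier)
    return w * rate_at(t, RATES_CARRIER) + (1.0 - w) * rate_at(t, RATES_NONCARRIER)


def risk_between(a, b, p_carrier):
    """P(onset before age b | disease-free at age a), no competing risk."""
    def mixture_survival(t):
        return (p_carrier * math.exp(-cumulative_hazard(t, RATES_CARRIER))
                + (1.0 - p_carrier) * math.exp(-cumulative_hazard(t, RATES_NONCARRIER)))
    return 1.0 - mixture_survival(b) / mixture_survival(a)
```

Because carriers are removed from the disease-free population faster than non-carriers, the carrier probability of an individual who remains unaffected falls with age, which is why the posterior hazard depends on t even within one hazard interval.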

HAL Descartes

Hal-Diderot

Parallel computations on pedigree data through mapping to configurable computing devices

Author: Henshall John M
Little Bryce Alvin
Publication venue: BioMed Central
Publication date: 01/04/2005

Pedigree data structures have a number of applications in genetics, including the estimation of allelic or haplotype probabilities in humans and agricultural species, and the estimation of breeding values in agricultural species. Sequential algorithms for general-purpose CPU-based computers are commonly used, but are inadequate for some tasks on large data sets. We show that pedigree data can be directly represented on Field Programmable Gate Arrays (FPGA), allowing highly efficient, massively parallel simulation of the flow of genes. Operating on the whole pedigree in parallel, the transmission of genes can occur for all individuals in a single clock cycle. By using FPGA, the algorithms to estimate inbreeding coefficients and allelic probabilities are shown to operate hundreds to thousands of times faster than the corresponding sequential algorithms. Where problems can be largely represented in integer form, FPGA provide an efficient platform for computations on pedigree data.
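As a point of reference for the sequential, CPU-based side of this comparison, the sketch below computes inbreeding coefficients with the standard recursive kinship definition. It is not the FPGA design from the paper, and not necessarily the sequential baseline the authors used; the pedigree encoding (integer IDs with parents numbered before their offspring, `None` for unknown parents) is an assumption made for this example.

```python
from functools import lru_cache


def inbreeding_coefficients(pedigree):
    """Return {id: F} for a pedigree given as {id: (sire, dam)}.

    Assumes integer IDs ordered so that parents have smaller IDs than
    their offspring, and None for an unknown (founder) parent.
    """

    @lru_cache(maxsize=None)
    def kinship(i, j):
        # Unknown parents are treated as unrelated to everyone.
        if i is None or j is None:
            return 0.0
        if i == j:
            sire, dam = pedigree[i]
            return 0.5 * (1.0 + kinship(sire, dam))
        if i > j:
            i, j = j, i  # make j the younger individual
        # The younger individual cannot be an ancestor of the older one,
        # so recurse through the younger individual's parents.
        sire, dam = pedigree[j]
        return 0.5 * (kinship(i, sire) + kinship(i, dam))

    # An individual's inbreeding coefficient is the kinship of its parents.
    return {i: kinship(*pedigree[i]) for i in pedigree}


if __name__ == "__main__":
    # Offspring (5) of a full-sib mating between 3 and 4.
    full_sib = {1: (None, None), 2: (None, None), 3: (1, 2), 4: (1, 2), 5: (3, 4)}
    print(inbreeding_coefficients(full_sib))
```

The recursion visits pairs of individuals one at a time, which is the kind of sequential dependency that a parallel gene-flow simulation sidesteps.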

EDP Sciences OAI-PMH repository (1.2.0)

Springer

An efficient algorithm to compute marginal posterior genotype probabilities for every member of a pedigree with loops

Author: Abraham Joseph
Fernando Rohan L
Totir Liviu R
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Marginal posterior genotype probabilities need to be computed for genetic analyses such as genetic counseling in humans and selective breeding in animal and plant species. Methods: In this paper, we describe a peeling-based, deterministic, exact algorithm to efficiently compute genotype probabilities for every member of a pedigree with loops, without recourse to junction-tree methods from graph theory. The efficiency in computing the likelihood by peeling comes from storing intermediate results in multidimensional tables called cutsets. Computing marginal genotype probabilities for individual i requires recomputing the likelihood for each of the possible genotypes of individual i. This can be done efficiently by storing intermediate results in two types of cutsets, called anterior and posterior cutsets, and reusing these intermediate results to compute the likelihood. Examples: A small example is used to illustrate the theoretical concepts discussed in this paper, and marginal genotype probabilities are computed at a monogenic disease locus for every member in a real cattle pedigree.
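To make the target quantity concrete, the sketch below computes marginal posterior genotype probabilities by brute-force enumeration of all joint genotype assignments. This is deliberately not the peeling algorithm of the paper: enumeration costs 3^n for n individuals, which is exactly the cost that peeling with stored cutsets avoids. The model is an assumption chosen for the example: one biallelic locus, Hardy-Weinberg founders, Mendelian transmission, and a fully penetrant recessive disease.

```python
from itertools import product


def founder_prior(g, q):
    """Hardy-Weinberg probability of carrying g disease alleles (0, 1 or 2)."""
    return [(1 - q) ** 2, 2 * q * (1 - q), q ** 2][g]


def child_prob(gc, gf, gm):
    """Mendelian probability of child genotype gc given parental genotypes."""
    pf, pm = gf / 2.0, gm / 2.0  # chance each parent transmits the disease allele
    return [(1 - pf) * (1 - pm), pf * (1 - pm) + (1 - pf) * pm, pf * pm][gc]


def penetrance(affected, g):
    """P(phenotype | genotype) for a fully penetrant recessive disease."""
    if affected is None:  # phenotype not observed
        return 1.0
    p_affected = 1.0 if g == 2 else 0.0
    return p_affected if affected else 1.0 - p_affected


def marginal_genotype_probs(pedigree, phenotypes, q):
    """Posterior genotype probabilities for every pedigree member.

    pedigree: {id: (father, mother)}, with (None, None) for founders.
    phenotypes: {id: True/False}; missing IDs are treated as unobserved.
    """
    ids = list(pedigree)
    marginals = {i: [0.0, 0.0, 0.0] for i in ids}
    total = 0.0
    for combo in product(range(3), repeat=len(ids)):
        g = dict(zip(ids, combo))
        weight = 1.0
        for i in ids:
            father, mother = pedigree[i]
            if father is None:
                weight *= founder_prior(g[i], q)
            else:
                weight *= child_prob(g[i], g[father], g[mother])
            weight *= penetrance(phenotypes.get(i), g[i])
        total += weight
        for i in ids:
            marginals[i][g[i]] += weight
    return {i: [w / total for w in marginals[i]] for i in ids}
```

For example, with two unaffected parents and one affected child, both parents are inferred to be heterozygous carriers, and an unphenotyped sibling has genotype probabilities 1/4, 1/2, 1/4.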

Sampling genotypes in large pedigrees with loops

Author: Alicia L. Carriquiry
Bernt Guldbrandtsen
Liviu R. Totir
Rohan L. Fernando
Soledad A. Fernández
Publication venue: EDP Sciences
Publication date: 01/01/2003

Most parsimonious haplotype allele sharing determination

Author: Cai Zhipeng
Goebel Randy
Lin Guohui
Sabaa Hadi
Stothard Paul
Wang Yining
Wang Zhiquan
Xu Jiaofen
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The "common disease – common variant" hypothesis and genome-wide association studies have achieved numerous successes in the last three years, particularly in genetic mapping in human diseases. Nevertheless, the power of association study methods is still low, in particular on quantitative traits, and a description of the full allelic spectrum is deemed still far out of reach. Given the increasing density of single nucleotide polymorphisms (SNPs) available, and as suggested by the block-like structure of the human genome, a popular and prosperous strategy is to use haplotypes to try to capture the correlation structure of SNPs in regions of little recombination. The key to the success of this strategy is thus the ability to unambiguously determine the haplotype allele sharing status among the members. Association studies based on haplotype sharing status would have significantly reduced degrees of freedom and be able to capture the combined effects of tightly linked causal variants. Results: For pedigree genotype datasets with a medium density of SNPs, we present two methods for determining haplotype allele sharing status among the pedigree members. An extensive simulation study showed that both methods performed nearly perfectly on breakpoint discovery, mutation haplotype allele discovery, and shared chromosomal region discovery. Conclusion: For pedigree genotype datasets, the haplotype allele sharing status among the members can be deterministically, efficiently, and accurately determined, even for very small pedigrees. Given their excellent performance, the presented haplotype allele sharing status determination programs can be useful in many downstream applications, including haplotype-based association studies.

CiteSeerX

arXiv.org e-Print Archive

The EM Algorithm in Genetics, Genomics and Public Health

Author: Laird Nan M.
Publication venue: Institute of Mathematical Statistics
Publication date: 11/04/2011

The popularity of the EM algorithm owes much to the 1977 paper by Dempster, Laird and Rubin. That paper gave the algorithm its name, identified the general form and some key properties of the algorithm and established its broad applicability in scientific research. This review gives a nontechnical introduction to the algorithm for a general scientific audience, and presents a few examples characteristic of its application. Comment: Published at http://dx.doi.org/10.1214/08-STS270 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
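A classic genetics illustration of the EM idea is gene counting for ABO blood group allele frequencies, sketched below. Whether the review uses this particular example is not stated in the abstract, so treat it as a textbook instance rather than a summary of the paper; the function name and the phenotype counts in the usage lines are made up for illustration. The sketch assumes Hardy-Weinberg proportions and that each allele is represented in the data, since otherwise a denominator in the E-step becomes zero.

```python
def abo_em(n_a, n_b, n_ab, n_o, iterations=500):
    """Estimate ABO allele frequencies (p for A, q for B, r for O) by EM.

    Inputs are phenotype counts. Genotypes AA/AO and BB/BO are not
    distinguishable from the phenotype, so they act as the missing data.
    """
    n = n_a + n_b + n_ab + n_o
    p = q = r = 1.0 / 3.0  # arbitrary interior starting point
    for _ in range(iterations):
        # E-step: split A and B phenotype counts into expected genotype counts.
        n_aa = n_a * (p * p) / (p * p + 2.0 * p * r)
        n_ao = n_a - n_aa
        n_bb = n_b * (q * q) / (q * q + 2.0 * q * r)
        n_bo = n_b - n_bb
        # M-step: count alleles in the completed data (2n alleles in total).
        p = (2.0 * n_aa + n_ao + n_ab) / (2.0 * n)
        q = (2.0 * n_bb + n_bo + n_ab) / (2.0 * n)
        r = (n_ao + n_bo + 2.0 * n_o) / (2.0 * n)
    return p, q, r


if __name__ == "__main__":
    # Made-up counts, chosen to match Hardy-Weinberg proportions exactly
    # for p = 0.3, q = 0.1, r = 0.6.
    print(abo_em(4500, 1300, 600, 3600))
```

A fixed iteration count keeps the sketch short; a practical version would stop when the change in the estimates or in the log-likelihood falls below a tolerance.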