
560 research outputs found

Analysis of concordance of different haplotype block partitioning algorithms

Author: Indap Amit R
Marth Gabor T
Olivier Michael
Struble Craig A
Tonellato Peter
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Different classes of haplotype block algorithms exist and the ideal dataset to assess their performance would be to comprehensively re-sequence a large genomic region in a large population. Such data sets are expensive to collect. Alternatively, we performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity based, LD based, and information theoretic algorithms under different values of SNP density and allele frequency. RESULTS: We simulated 1000 haplotypes using the standard coalescent for three world populations – European, African American, and East Asian – and applied three classes of block partitioning algorithms – diversity based, LD based, and information theoretic. We assessed algorithm differences in number, size, and coverage of blocks inferred under different conditions of SNP density, allele frequency, and sample size. Each algorithm inferred blocks differing in number, size, and coverage under different density and allele frequency conditions. Different partitions had few if any matching block boundaries. However, they still overlapped and a high percentage of total chromosomal region was common to all methods. This percentage was generally higher with a higher density of SNPs and when rarer markers were included. CONCLUSION: A gold standard definition of a haplotype block is difficult to achieve, but collecting haplotypes covered with a high density of SNPs, partitioning them with a variety of block algorithms, and identifying regions common to all methods may be the best way to identify genomic regions that harbor SNP variants that cause disease.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Ding Li
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

Digital Commons@Becker

Global haplotype partitioning for maximal associated SNP pairs

Author: A Indap
A Smith
Ali Katanforoush
B Huang
BL Browning
C Bardel
C Carlson
C Coulonges
C Durrant
C Li
C Pattaro
C Zapata
D Hinds
E Feingold
EC Anderson
Elahe Elahi
F Yates
G Hellenthal
G McVean
Hamid Pezeshk
J He
J Hwang
J Li
J Wall
JC Barrett
K Ding
K Ding
K Zhang
M Nothnagel
Mehdi Sadeghi
MJ Daly
N Patil
N Wang
N Zhou
P Fearnhead
RC Lewontin
RR Hudson
S Climer
S Gabriel
S Gu
S Lydersen
S Myers
The International HapMap Consortium
V Hasselblad
Y Zhao
Z Ding
Z Qin
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on LD between SNP pairs. We use the Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. Essentially, this model is reduced to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm. Results: In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. Our algorithm performed better than previously reported models in a case-control association study aimed at mapping a single locus trait, based on simulation results that were evaluated by a block-based statistical test. Compared to methods of haplotype block partitioning, we performed best on detection of recombination hotspots. Conclusion: Our proposed method divides chromosomes into the regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central
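
The abstract of "Global haplotype partitioning for maximal associated SNP pairs" above describes scoring each SNP pair with an LD-based index and testing it with Fisher's exact test. The sketch below is only an illustration of that general idea, not the authors' implementation: it assumes phased haplotype counts arranged as a 2x2 table, uses Lewontin's D' as a stand-in for the paper's association index (which the abstract does not define), and computes a two-sided Fisher p-value from the hypergeometric distribution. The function names and the table layout are hypothetical.

```python
from math import comb


def d_prime(n_ab, n_aB, n_Ab, n_AB):
    """Lewontin's D' from haplotype counts for two biallelic SNPs.

    Assumed layout: n_ab counts haplotypes carrying allele a at SNP 1
    and allele b at SNP 2, and so on for the other three cells.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    p_a = (n_ab + n_aB) / n          # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / n          # frequency of allele b at SNP 2
    d = n_ab / n - p_a * p_b         # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d / d_max if d_max > 0 else 0.0


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    table = (50, 0, 0, 50)  # made-up counts: two SNPs in complete LD
    print(abs(d_prime(*table)), fisher_exact_two_sided(*table))
```

How the paper turns the test result into its three labels (associated, independent, not-statistically-significant) is not stated in the abstract, so the sketch stops at the index and the p-value.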

Haplotype-aware Diplotyping from Noisy Long Reads

Author: Ebler J.
Haukness M.
Marschall T.
Paten B.
Pesout T.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2019

MPG.PuRe

Genome-wide association studies using single-nucleotide polymorphisms versus haplotypes: an empirical comparison with data from the North American Rheumatoid Arthritis Consortium

Author: Chun Hyonho
Engelman Corinne D
Payseur Bret A
Shim Heejung
Publication venue: BioMed Central
Publication date: 01/01/2009

The high genomic density of the single-nucleotide polymorphism (SNP) sets that are typically surveyed in genome-wide association studies (GWAS) now allows the application of haplotype-based methods. Although the choice of haplotype-based vs. individual-SNP approaches is expected to affect the results of association studies, few empirical comparisons of method performance have been reported on the genome-wide scale in the same set of individuals. To measure the relative ability of the two strategies to detect associations, we used a large dataset from the North American Rheumatoid Arthritis Consortium to: 1) partition the genome into haplotype blocks, 2) associate haplotypes with disease, and 3) compare the results with individual-SNP association mapping. Although some associations were shared across methods, each approach uniquely identified several strong candidate regions. Our results suggest that the application of both haplotype-based and individual-SNP testing to GWAS should be adopted as a routine procedure

Crossref

Springer - Publisher Connector

PubMed Central

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Guryev Victor
Lansdorp Peter
Porubský David
Spierings Diana
Publication venue
Publication date: 23/09/2017

ARTS repository - University of Groningen

Evaluation of sample size effect on the identification of haplotype blocks

Author: Inoue Hiroshi
Itakura Mitsuo
Keshavarz Parvaneh
Kunika Kiyoshi
Moritani Maki
Nakamura Naoto
Nomura Kyoko
Osabe Dai
Shinohara Shuichi
Shiota Hiroshi
Tanahashi Toshihito
Yamaguchi Yuka
Yoshikawa Toshikazu
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Genome-wide maps of linkage disequilibrium (LD) and haplotypes have been created for different populations. Substantial sharing of the boundaries and haplotypes among populations was observed, but haplotype variations have also been reported across populations. Conflicting observations on the extent and distribution of haplotypes require careful examination. The mechanisms that shape haplotypes have not been fully explored, although the effect of sample size has been implicated. We present a close examination of the effect of sample size on haplotype blocks using an original computational simulation. Results: A region spanning 19.31 Mb on chromosome 20q was genotyped for 1,147 SNPs in 725 Japanese subjects. One region of 445 kb exhibiting a single strong LD value (average |D'|; 0.94) was selected for the analysis of sample size effect on haplotype structure. Three different block definitions (recombination-based, LD-based, and diversity-based) were exploited to create simulations for block identification with θ value from real genotyping data. As a result, it was quite difficult to estimate a haplotype block for data with less than 200 samples. Attainment of a reliable haplotype structure with 50 samples was not possible, although the simulation was repeated 10,000 times. Conclusion: These analyses underscored the difficulties of estimating haplotype blocks. To acquire a reliable result, it would be necessary to increase sample size more than 725 and to repeat the simulation 3,000 times. Even in one genomic region showing a high LD value, the haplotype block might be fragile. We emphasize the importance of applying careful confidence measures when using the estimated haplotype structure in biomedical research.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Evaluation of sample size effect on the identification of haplotype blocks

Author: Inoue Hiroshi
Itakura Mitsuo
Keshavarz Parvaneh
Kunika Kiyoshi
Moritani Maki
Nakamura Naoto
Nomura Kyoko
Osabe Dai
Shinohara Shuichi
Shiota Hiroshi
Tanahashi Toshihito
Yamaguchi Yuka
Yoshikawa Toshikazu
Publication venue: BioMed Central|Springer Nature
Publication date: 22/08/2023

Background: Genome-wide maps of linkage disequilibrium (LD) and haplotypes have been created for different populations. Substantial sharing of the boundaries and haplotypes among populations was observed, but haplotype variations have also been reported across populations. Conflicting observations on the extent and distribution of haplotypes require careful examination. The mechanisms that shape haplotypes have not been fully explored, although the effect of sample size has been implicated. We present a close examination of the effect of sample size on haplotype blocks using an original computational simulation. Results: A region spanning 19.31 Mb on chromosome 20q was genotyped for 1,147 SNPs in 725 Japanese subjects. One region of 445 kb exhibiting a single strong LD value (average |D'|; 0.94) was selected for the analysis of sample size effect on haplotype structure. Three different block definitions (recombination-based, LD-based, and diversity-based) were exploited to create simulations for block identification with θ value from real genotyping data. As a result, it was quite difficult to estimate a haplotype block for data with less than 200 samples. Attainment of a reliable haplotype structure with 50 samples was not possible, although the simulation was repeated 10,000 times. Conclusion: These analyses underscored the difficulties of estimating haplotype blocks. To acquire a reliable result, it would be necessary to increase sample size more than 725 and to repeat the simulation 3,000 times. Even in one genomic region showing a high LD value, the haplotype block might be fragile. We emphasize the importance of applying careful confidence measures when using the estimated haplotype structure in biomedical research

Tokushima University Institutional Repository

Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data

Author: Che-Lun Hung
Chen
Gabriel
Guan-Jie Hua
Hudson
Huiru Zheng
Johnson
Patil
Reif
Suh-Jen Tsai
Wen-Pei Chen
Yaw-Ling Lin
Zahirib
Publication venue: MDPI AG
Publication date: 01/01/2015

Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

PubMed Central

Ulster University's Research Portal