15 research outputs found

Strobe sequence design for haplotype assembly

Author: A Ritz
Ali Bashir
BV Halldórsson
Christine Lo
D Altshuler
D He
DE Reich
ER Mardis
F Aversa
J Eid
J Marchini
J Shendure
JC Roach
L Ma
MA Levenstien
P Erdos
T Shiina
V Bafna
V Bansal
Vikas Bansal
Vineet Bafna
Z Guo
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Humans are diploid, carrying two copies of each chromosome, one from each parent. Separating the paternal and maternal chromosomes is an important component of genetic analyses such as determining genetic association, inferring evolutionary scenarios, computing recombination rates, and detecting cis-regulatory events. Because the two chromosomes of a pair are mostly identical to each other, linking the alleles at heterozygous sites is sufficient to phase, or separate, the two chromosomes. In haplotype assembly, the linking is done by sequenced fragments that overlap two heterozygous sites. While there has been a lot of research on correcting errors to achieve accurate haplotypes via assembly, relatively little work has been done on designing sequencing experiments to obtain long haplotypes. Here, we describe the different design parameters that can be adjusted with next-generation and upcoming sequencing technologies, and study the impact of design choice on the length of the haplotype. Results: We show that a number of parameters influence haplotype length, the most significant being the advance length (the distance between two fragments of a clone). Given technologies like strobe sequencing that allow for large variations in advance lengths, we design and implement a simulated annealing algorithm to sample a large space of distributions over advance lengths. Extensive simulations on individual genomic sequences suggest that a non-trivial distribution over advance lengths results in an improvement of one to two orders of magnitude in median haplotype length. Conclusions: Our results suggest that haplotyping of large, biologically important genomic regions is feasible with current technologies.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Haplotype Reconstruction Error as a Classical Misclassification Problem: Introducing Sensitivity and Specificity as Error Measures

Author: AG Clark
B Devlin
C Spinka
Claudia Lamina
D Fallin
D Thomas
DJ Schaid
DO Stram
DR Nyholt
DV Zaykin
DY Lin
ER Martin
Friedhelm Bongardt
GCL Johnson
H Kuchenhoff
H Xu
HE Wichmann
Helmut Küchenhoff
IM Heid
Iris M. Heid
J Akey
L Excoffier
LP Zhao
M Stephens
MA Levenstien
MJ Daly
MJ Morrissey
MP Epstein
P Kraft
RJ Carroll
RJA Little
RM Adkins
RW Morris
SL Lake
T Illig
T Niu
Vincent Macaulay
Publication venue: Public Library of Science
Publication date: 01/01/2008

BACKGROUND: Statistically reconstructing haplotypes from single nucleotide polymorphism (SNP) genotypes can lead to falsely classified haplotypes. This can be an issue when interpreting haplotype association results or when selecting subjects with certain haplotypes for subsequent functional studies. It was our aim to quantify haplotype reconstruction error and to provide tools for it. METHODS AND RESULTS: Using numerous simulation scenarios, we systematically investigated several error measures, including discrepancy, error rate, and R², and introduced sensitivity and specificity to this context. We exemplified several measures in the KORA study, a large population-based study from Southern Germany. We found that the specificity was slightly reduced only for common haplotypes, while the sensitivity was decreased for some, but not all, rare haplotypes. The overall error rate generally increased with increasing number of loci, increasing minor allele frequency of SNPs, decreasing correlation between the alleles, and increasing ambiguity. CONCLUSIONS: We conclude that, with the analytical approach presented here, haplotype-specific error measures can be computed to gain insight into haplotype uncertainty. This method indicates whether a specific risk haplotype can be expected to be reconstructed with little or with high misclassification, and thus the magnitude of the expected bias in association estimates. We also illustrate that sensitivity and specificity separate two dimensions of the haplotype reconstruction error, which completely describe the misclassification matrix and thus provide the prerequisite for methods accounting for misclassification.
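
As a rough illustration of the two error measures named in this abstract, the following minimal Python sketch computes sensitivity and specificity for one target haplotype from paired true and reconstructed labels. It is not the authors' analytical method; the function name `haplotype_sens_spec` and the toy labels are hypothetical and chosen only for illustration.

```python
import math


def haplotype_sens_spec(true_haps, inferred_haps, target):
    """Sensitivity and specificity of reconstruction for one haplotype.

    true_haps / inferred_haps: equal-length sequences of haplotype labels.
    target: the haplotype whose misclassification is being assessed.
    (Hypothetical helper for illustration; not from the paper.)
    """
    tp = fn = fp = tn = 0
    for truth, guess in zip(true_haps, inferred_haps):
        if truth == target and guess == target:
            tp += 1  # target correctly reconstructed
        elif truth == target:
            fn += 1  # target present but reconstructed as something else
        elif guess == target:
            fp += 1  # another haplotype wrongly reconstructed as target
        else:
            tn += 1  # non-target correctly kept as non-target
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy labels (made up, not study data).
true_labels = ["AB", "AB", "ab", "ab", "Ab", "aB"]
inferred_labels = ["AB", "ab", "ab", "ab", "AB", "aB"]
sens, spec = haplotype_sens_spec(true_labels, inferred_labels, "AB")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Together the two values describe the 2x2 misclassification matrix for the chosen haplotype, which is the property the abstract highlights.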

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

PuSH

Computing Power and Sample Size for Case-Control Association Studies with Copy Number Polymorphism: Application of Mixture-Based Likelihood Ratio Test

Author: A Agresti
A Kolmogoroff
A Tenenbein
AW van der Vaart
B Frank
B Jones
BJ Wegscheider
C Fraley
CE Yu
D Gordon
D Titterington
Derek Gordon
E Gonzalez
F Corbiere
F Ji
F Pompanon
FY Hsieh
GJ McLachlan
J Gudmundsson
J Healy
J Ott
J Sebat
JA Lee
JL Freeman
Jonathan Sebat
K Ahn
K Ozaki
Kenny Q. Ye
KH Cheung
LT Amundadottir
M Fanciulli
MA Levenstien
MV Osier
N Smirnoff
P Armitage
Peter Heutink
R Lucito
R Redon
R Sladek
RJ Hathaway
RJ Klein
RL Pollex
SA McCarroll
SJ Kang
SJ White
SK Mitra
Stephen J. Finch
SV Goverdhan
T Walsh
TJ Aitman
TW Anderson
VL Mote
WG Cochran
Wonkuk Kim
Y Yang
Publication venue: Public Library of Science
Publication date: 22/10/2008

Recent studies suggest that copy number polymorphisms (CNPs) may play an important role in disease susceptibility and onset. Currently, the detection of CNPs mainly depends on microarray technology. For case-control studies, conventionally, subjects are assigned to a specific CNP category based on the continuous quantitative measure produced by microarray experiments, and cases and controls are then compared using a chi-square test of independence. The purpose of this work is to specify the likelihood ratio test statistic (LRTS) for a case-control sampling design based on the underlying continuous quantitative measurement, and to assess its power and relative efficiency (as compared to the chi-square test of independence on CNP counts). The sample size and power formulas of both methods are given. For the chi-square test, the CNPs are classified using the Bayesian classification rule. The LRTS is more powerful than this chi-square test for the alternatives considered, especially alternatives in which the at-risk CNP categories have low frequencies. An example of the application of the LRTS is given for a comparison of CNP distributions in individuals of Caucasian or Taiwanese ethnicity, where the LRTS appears to be more powerful than the chi-square test, possibly due to misclassification of the most common CNP category into a less common category.
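
For readers unfamiliar with the conventional baseline mentioned in this abstract, here is a minimal Python sketch of the Pearson chi-square statistic for a case-control by CNP-category count table. It illustrates only the comparison method, not the paper's LRTS or its power formulas; the function name `chi_square_statistic` and the counts are hypothetical, and the sketch assumes every row and column total is nonzero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    table: list of rows (e.g. [cases, controls]), each a list of counts
    per CNP category. Assumes all row and column totals are nonzero.
    (Hypothetical helper for illustration; not from the paper.)
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of status and category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Made-up counts: rows are cases and controls, columns are CNP categories.
counts = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(counts)
print(f"chi-square={stat:.3f}, dof={dof}")
```

The statistic would then be compared against a chi-square distribution with the returned degrees of freedom to obtain a p-value.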

Public Library of Science (PLOS)

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

Clinical delineation and localization to chromosome 9p13.3-p12 of a unique dominant disorder in four families: hereditary inclusion body myopathy, Paget disease of bone, and frontotemporal dementia.

Author: Al-Lozi MT
Florence J
Gelber D
Glosser G
Gregg G
Khardori R
Kimonis VE
Kovach MJ
Leal SM
Levenstien MA
Lopate G
Miller T
Morris JC
Pestronk A
Rakowicz W
Shanks CA
Simmons Z
Waggoner B
Whyte MP
Publication venue: eScholarship, University of California
Publication date: 01/12/2001

Autosomal dominant myopathy, Paget disease of bone, and dementia constitute a unique disorder (MIM 605382). Here we describe the clinical, biochemical, radiological, and pathological characteristics of 49 affected (23 male, 26 female) individuals from four unrelated United States families. Among these affected individuals 90% have myopathy, 43% have Paget disease of bone, and 37% have premature frontotemporal dementia. EMG shows myopathic changes and muscle biopsy reveals nonspecific myopathic changes or blue-rimmed vacuoles. After candidate loci were excluded, a genome-wide screen in the large Illinois family showed linkage to chromosome 9 (maximum LOD score 3.64 with marker D9S301). Linkage analysis with a high density of chromosome 9 markers generated a maximum two-point LOD score of 9.29 for D9S1791, with a maximum multipoint LOD score of 12.24 between D9S304 and D9S1788. Subsequent evaluation of three additional families demonstrating similar clinical characteristics confirmed this locus, refined the critical region, and further delineated clinical features of this unique disorder. Hence, autosomal dominant inclusion body myopathy (HIBM), Paget disease of bone (PDB), and frontotemporal dementia (FTD) localizes to a 1.08-6.46 cM critical interval on 9p13.3-12 in the region of autosomal recessive IBM2

eScholarship - University of California