Online Research @ Cardiff

Directory of Open Access Journals

Edinburgh Research Explorer

UCL Discovery

King's Research Portal

ResearchOnline@GCU

Use of a targeted, combinatorial next-generation sequencing approach for the study of bicuspid aortic valve

Author: David Newsom
Don Corsmeier
Elizabeth M Bonachea
Gloria Zender
Kim L McBride
Peter White
Sara Fitzgerald-Butt
Vidu Garg
Publication venue: Springer Nature
Publication date: 01/01/2014

BACKGROUND: Bicuspid aortic valve (BAV) is the most common type of congenital heart disease, with a population prevalence of 1-2%. While BAV is known to be highly heritable, mutations in single genes (such as GATA5 and NOTCH1) have been reported in few human BAV cases. Traditional gene sequencing methods are time- and labor-intensive, while next-generation high-throughput sequencing remains costly for large patient cohorts and requires extensive bioinformatics processing. Here we describe an approach to targeted multi-gene sequencing with combinatorial pooling of samples from BAV patients. METHODS: We studied a previously described cohort of 78 unrelated subjects with echocardiogram-identified BAV. Subjects were identified as having isolated BAV or BAV associated with coarctation of the aorta (BAV-CoA). BAV cusp fusion morphology was defined as right-left cusp fusion, right non-coronary cusp fusion, or left non-coronary cusp fusion. Samples were combined into 19 pools using a uniquely overlapping combinatorial design; a given mutation could be attributed to a single individual on the basis of which pools contained the mutation. A custom gene capture of 97 candidate genes was sequenced on the Illumina HiSeq 2000. Multistep bioinformatics processing was performed for base calling, variant identification, and in-silico analysis of putative disease-causing variants. RESULTS: Targeted capture identified 42 rare, non-synonymous, exonic variants involving 35 of the 97 candidate genes. In-silico analysis classified 33 of these variants as putative disease-causing changes. Sanger sequencing confirmed 31 of these variants, found among 16 individuals. There were no significant differences in variant burden among BAV fusion phenotypes or between isolated BAV and BAV-CoA. Pathway analysis suggests a role for the WNT signaling pathway in human BAV.
CONCLUSION: We successfully developed a pooling and targeted capture strategy that enabled rapid and cost-effective next-generation sequencing of target genes in a large patient cohort. This approach identified a large number of putative disease-causing variants in a cohort of patients with BAV, including variants in 26 genes not previously associated with human BAV. The data suggest that BAV heritability is complex and polygenic. Our pooling approach saved over $39,350 compared to an unpooled, targeted capture sequencing strategy.
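
The attribution step described in the abstract above (a mutation is assigned to one individual according to which pools contain it) can be illustrated with a minimal Python sketch. The abstract gives only the cohort size (78) and the number of pools (19); the choice of two pools per sample and the function names `build_design` and `decode_single_carrier` are assumptions made for illustration, not the authors' actual design.

```python
from itertools import combinations, islice


def build_design(n_samples, n_pools, pools_per_sample):
    """Give every sample a distinct set of pools (its 'signature').

    Assumption: each sample goes into the same number of pools; the
    source does not state how many pools each sample was placed in.
    """
    signatures = list(
        islice(combinations(range(n_pools), pools_per_sample), n_samples)
    )
    if len(signatures) < n_samples:
        raise ValueError("not enough distinct pool combinations")
    return {sample: frozenset(sig) for sample, sig in enumerate(signatures)}


def decode_single_carrier(design, positive_pools):
    """Return the samples whose signature equals the set of pools
    in which the variant was observed (single-carrier case only)."""
    positive = frozenset(positive_pools)
    return [sample for sample, sig in design.items() if sig == positive]


# 78 samples, 19 pools, 2 pools per sample (hypothetical layout).
design = build_design(78, 19, 2)
observed = design[41]  # pools that would light up if sample 41 carried the variant
print(decode_single_carrier(design, observed))  # [41]
```

With two pools per sample there are 171 distinct pairs among 19 pools, which is enough to give each of the 78 samples a unique signature. A variant carried by several individuals would light up a union of signatures and would need a more careful decoder than this sketch provides.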

Genotyping common and rare variation using overlapping pool sequencing

Author: Eskin Eleazar
Halperin Eran
He Dan
Pasaniuc Bogdan
Zaitlen Noah
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions: In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.

Directory of Open Access Journals

eScholarship - University of California

poolMC: Smart pooling of mRNA samples in microarray experiments

Author: An M Kainkaryam
Angela Bruex
Anna C Gilbert
J Woolf
John Schiefelbein
Peter J Woolf
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Typically, pooling of mRNA samples in microarray experiments implies mixing mRNA from several biological-replicate samples before hybridization onto a microarray chip. Here we describe an alternative smart pooling strategy in which different samples, not necessarily biological replicates, are pooled in an information theoretic efficient way. Further, each sample is tested on multiple chips, but always in pools made up of different samples. The end goal is to exploit the compressibility of microarray data to reduce the number of chips used and increase the robustness to noise in measurements. Results: A theoretical framework to perform smart pooling of mRNA samples in microarray experiments was established and the software implementation of the pooling and decoding algorithms was developed in MATLAB. A proof-of-concept smart pooled experiment was performed using validated biological samples on commercially available gene chips. Conclusions: The theoretical developments and experimental demonstration in this paper provide a useful starting point to investigate smart pooling of mRNA samples in microarray experiments. Important conditions for its successful implementation include linearity of measurements, sparsity in data, and large experiment size.
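
The idea of pooling different samples on several chips and then decoding can be illustrated with a toy Python sketch. It is not the poolMC MATLAB implementation: the pooling layout, the one-sparse signal, and the names `pooling_matrix`, `pool`, and `decode_one_sparse` are assumptions chosen to show why linearity and sparsity matter.

```python
from itertools import combinations


def pooling_matrix(n_pools, pools_per_sample):
    """Rows are pools (chips), columns are samples; each sample is
    placed in a distinct combination of pools (hypothetical layout)."""
    cols = list(combinations(range(n_pools), pools_per_sample))
    return [[1.0 if p in col else 0.0 for col in cols] for p in range(n_pools)]


def pool(matrix, signal):
    """Linear measurement model: each pool reads the sum of its samples."""
    return [sum(a * x for a, x in zip(row, signal)) for row in matrix]


def decode_one_sparse(matrix, measurements):
    """Find the single sample and amplitude that best explain the
    measurements in the least-squares sense (assumes a 1-sparse signal)."""
    best = None
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        norm = sum(c * c for c in column)
        amplitude = sum(c * y for c, y in zip(column, measurements)) / norm
        residual = sum(
            (y - amplitude * c) ** 2 for c, y in zip(column, measurements)
        )
        if best is None or residual < best[0]:
            best = (residual, j, amplitude)
    return best[1], best[2]


A = pooling_matrix(4, 2)            # 4 chips cover 6 samples
x = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0]  # only sample 3 has a non-zero signal
y = pool(A, x)
print(decode_one_sparse(A, y))      # (3, 5.0)
```

Here four pooled measurements recover which of six samples carries the signal, which works only because the signal is sparse and the measurements add linearly, two of the conditions the abstract names.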

CiteSeerX

Deep Blue Documents at the University of Michigan

A statistical method for the detection of variants from next-generation resequencing of DNA pools

Author: Bentley
Druley
Ingman
Kim
Langmead
Levy
Maher
Manolio
Out
Rumble
Sham
Stratton
V. Bansal
Wang
Wheeler
Publication venue: Oxford University Press

Motivation: Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing.

SNVer: a statistical tool for variant calling in analysis of pooled or individual next-generation sequencing data

Author: Bansal
Benjamini
Bodmer
Calvo
Cirulli
Depristo
Druley
Frazer
Gholson J. Lyon
Hakon Hakonarson
Hayden
Hindorff
Ingman
Koboldt
Kullback
Lander
Lee
Li
Manolio
Mardis
McKenna
Momozawa
Nejentsev
Norton
Out
Pingzhao Hu
Prabhu
Sarin
Service
Sham
Shen
Sun
Vallania
Wei
Wei Wang
Zhi Wei
Publication venue: Oxford University Press
Publication date: 01/01/2011

We develop a statistical tool, SNVer, for calling common and rare variants in the analysis of pooled or individual next-generation sequencing (NGS) data. We formulate variant calling as a hypothesis testing problem and employ a binomial–binomial model to test the significance of observed allele frequency against sequencing error. SNVer reports a single overall P-value for evaluating the significance of a candidate locus being a variant, on the basis of which multiplicity control can be obtained. This is particularly desirable because tens of thousands of loci are simultaneously examined in typical NGS experiments. Each user can choose the false-positive error rate threshold he or she considers appropriate, instead of just the dichotomous decisions of whether to ‘accept or reject the candidates’ provided by most existing methods. We use both simulated data and real data to demonstrate the superior performance of our program in comparison with existing methods. SNVer runs very fast and can complete testing of 300 K loci within an hour. This excellent scalability makes it feasible for analysis of whole-exome sequencing data, or even whole-genome sequencing data using a high-performance computing cluster. SNVer is freely available at http://snver.sourceforge.net/.
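
To make the hypothesis-testing framing concrete, the following Python sketch computes a one-sided P-value for the observed alternate-allele count under the null hypothesis that all alternate reads are sequencing errors. It is a deliberately simplified single-binomial test, not SNVer's binomial–binomial model; the function names and the default error rate of 0.01 are assumptions.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def variant_p_value(alt_reads, depth, error_rate=0.01):
    """P-value for seeing at least `alt_reads` alternate reads out of
    `depth` if every alternate read were a sequencing error.

    Assumption: a fixed, known per-base error rate (hypothetical value).
    """
    return binomial_tail(alt_reads, depth, error_rate)


# Ten alternate reads at depth 100 are far less compatible with pure
# sequencing error than two alternate reads at the same depth.
print(variant_p_value(10, 100) < variant_p_value(2, 100))  # True
```

Because the result is a P-value rather than a yes/no call, a standard multiple-testing procedure can then be applied across all tested loci, which is the property the abstract emphasizes.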

Cold Spring Harbor Laboratory Institutional Repository

The Application of Pooled DNA Sequencing in Disease Association Study

Author: Chang-Yun Lin
Tao Wang
Publication venue: IntechOpen
Publication date: 20/04/2012

IntechOpen

Estimating population size via line graph reconstruction

Author: Bjarni V Halldórsson
Roded Sharan

Background: We propose a novel graph-theoretic method to estimate haplotype population size from genotype data. The method considers only the potential sharing of haplotypes between individuals and is based on transforming the graph of potential haplotype sharing into a line graph using a minimum number of edge and vertex deletions. Results: We show that the resulting line graph deletion problems are NP-complete and provide exact integer programming solutions for them. We test our approach using extensive simulations of multiple population evolution and genotype sampling scenarios. Our results also indicate that the method may be useful in comparing populations and may be used as a first step in a method for haplotype phasing. Conclusions: Our computational experiments show that when most of the sharings are true sharings, the problem can be solved very fast and the estimated size is very close to the true size; when many of the potential sharings do not stem from true haplotype sharing, our method gives reasonable lower bounds on the underlying number of haplotypes. In comparison, a naive approach of phasing the input genotypes provides trivial upper bounds of twice the number of genotypes.

CiteSeerX

Statistical Mutation Calling from Sequenced Overlapping DNA Pools in TILLING Experiments

Author: Comai Luca
Filkov Vladimir
Missirian Victor
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Background: TILLING (Targeting Induced Local Lesions IN Genomes) is an efficient reverse genetics approach for detecting induced mutations in pools of individuals. Combined with the high throughput of next-generation sequencing technologies and the resolving power of overlapping pool design, TILLING provides an efficient and economical platform for functional genomics across thousands of organisms. Results: We propose a probabilistic method for calling TILLING-induced mutations, and their carriers, from high-throughput sequencing data of overlapping population pools, where each individual occurs in two pools. We assign a probability score to each sequence position by applying Bayes' Theorem to a simplified binomial model of sequencing error and expected mutations, taking into account the coverage level. We test the performance of our method on variable-quality, high-throughput sequences from wheat and rice mutagenized populations. Conclusions: We show that our method effectively discovers mutations in large populations with sensitivity of 92.5% and specificity of 99.8%. It also outperforms existing SNP detection methods in detecting real mutations, especially at higher levels of coverage variability across sequenced pools, and in lower-quality short-read sequence data. The implementation of our method is available from: http://www.cs.ucdavis.edu/filkov/CAMBa/.
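
The scoring idea in this abstract (Bayes' Theorem applied to a binomial model of sequencing error versus expected mutation, with each individual present in two pools) can be sketched in Python as follows. This is a minimal illustration and not the CAMBa implementation: the two-hypothesis setup, the independence of the two pools, and all parameter values are assumptions.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Binomial(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mutation_posterior(observations, alt_fraction, error_rate, prior):
    """Posterior probability that a position carries a real mutation.

    observations: (alt_reads, coverage) for each pool containing the
    candidate carrier. Pools are treated as independent (assumption).
    alt_fraction: expected alternate-read fraction if the mutation is real.
    error_rate:   expected alternate-read fraction from sequencing error.
    prior:        prior probability of a mutation at this position.
    """
    like_mut = 1.0
    like_err = 1.0
    for alt_reads, coverage in observations:
        like_mut *= binom_pmf(alt_reads, coverage, alt_fraction)
        like_err *= binom_pmf(alt_reads, coverage, error_rate)
    numerator = prior * like_mut
    return numerator / (numerator + (1 - prior) * like_err)


# Hypothetical numbers: a heterozygous carrier among 8 diploid
# individuals per pool gives an expected alternate fraction of 1/16.
score = mutation_posterior(
    [(8, 200), (7, 180)], alt_fraction=1 / 16, error_rate=0.005, prior=0.001
)
print(score > 0.99)  # True
```

Coverage enters through the binomial likelihoods, so the same alternate-read count is weighed differently at low and high depth, and requiring support in both pools is what narrows the call down to a single carrier.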

Directory of Open Access Journals