3,015 research outputs found

Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly

Author: DePristo
Durbin
Gingeras
H. Li
Homer
Idury
Iqbal
Lam
Levy
Myers
Myers
Myers
Peltola
Pevzner
Staden
Zerbino
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2012

Motivation: Eugene Myers in his string graph paper (Myers, 2005) suggested that in a string graph or equivalently a unitig graph, any path spells a valid assembly. As a string/unitig graph also encodes every valid assembly of reads, such a graph, provided that it can be constructed correctly, is in fact a lossless representation of reads. In principle, every analysis based on whole-genome shotgun sequencing (WGS) data, such as SNP and insertion/deletion (INDEL) calling, can also be achieved with unitigs. Results: To explore the feasibility of using de novo assembly in the context of resequencing, we developed a de novo assembler, fermi, that assembles Illumina short reads into unitigs while preserving most of the information in the input reads. SNPs and INDELs can be called by mapping the unitigs against a reference genome. Applying the method to 35-fold human resequencing data, we showed that, in comparison to the standard pipeline, our approach yields similar accuracy for SNP calling and better results for INDEL calling. It has higher sensitivity than other de novo assembly based methods for variant calling. Our work suggests that variant calling with de novo assembly can be a beneficial complement to the standard variant calling pipeline for whole-genome resequencing. On the methodological side, we proposed the FMD-index for forward-backward extension of DNA sequences, a fast algorithm for finding all super-maximal exact matches, and one-pass construction of unitigs from an FMD-index. Availability: http://github.com/lh3/fermi Contact: [email protected] Comment: Rev2: submitted version with minor improvements; 7 pages

arXiv.org e-Print Archive

CiteSeerX

Crossref

Rotational Correction on the Morse Potential Through the Pekeris Approximation and Nikiforov-Uvarov Method

Author: Bag
Chen
Cüneyt Berkdemir
DePristo
Dong
Filho
Flügge
Han
Jiaguang Han
Killingbeck
Morales
Morse
Nikiforov
Pekeris
Szego
Varshni
Publication venue: 'Elsevier BV'
Publication date: 28/02/2005

The Nikiforov-Uvarov method is employed to solve the Schrödinger equation with a rotating Morse potential. The bound-state energy eigenvalues and the corresponding eigenfunctions are obtained. These calculations present an effective and clear method, under the Pekeris approximation, for solving the rotating Morse model. The results obtained here are in good agreement with earlier ones. Comment: 11 pages, no figure, submitted to Chemical Physics Letters (2005)
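For orientation, the setup named in this abstract can be sketched in standard textbook notation. The symbols below ($D_e$, $a$, $r_e$, $\mu$, $\ell$, $x$, $\alpha$, $c_0$, $c_1$, $c_2$) are assumptions chosen for illustration and may differ from the paper's own conventions. The coefficients are those obtained by matching both sides of the approximation to second order in $x$.

```latex
% Morse potential (one common form) and the radial equation with the centrifugal term
V(r) = D_e\left(e^{-2a(r-r_e)} - 2e^{-a(r-r_e)}\right),
\qquad
-\frac{\hbar^2}{2\mu}\frac{d^2u}{dr^2}
+ \left[V(r) + \frac{\hbar^2\,\ell(\ell+1)}{2\mu r^2}\right]u = E\,u .

% Pekeris-type approximation: with x = (r-r_e)/r_e and \alpha = a r_e,
% the centrifugal factor is replaced by exponentials of the same form as V(r)
\frac{1}{r^2} = \frac{1}{r_e^2(1+x)^2}
\approx \frac{1}{r_e^2}\left(c_0 + c_1 e^{-\alpha x} + c_2 e^{-2\alpha x}\right),

% matching the expansion 1 - 2x + 3x^2 term by term gives
c_0 = 1 - \frac{3}{\alpha} + \frac{3}{\alpha^2},
\qquad
c_1 = \frac{4}{\alpha} - \frac{6}{\alpha^2},
\qquad
c_2 = -\frac{1}{\alpha} + \frac{3}{\alpha^2}.
```

With the centrifugal term written this way, the effective potential contains only a constant, $e^{-\alpha x}$ and $e^{-2\alpha x}$, the same functional form as the non-rotating Morse problem.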

arXiv.org e-Print Archive

Crossref

CERN Document Server

Erciyes University - AVESIS

Performance analysis of a parallel, multi-node pipeline for DNA sequencing

Author: A Hatem
A McKenna
D Decap
GA Van der Auwera
H Li
H Li
J Dean
MA DePristo
ST Sherry
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Post-sequencing DNA analysis typically consists of read mapping followed by variant calling and is very time-consuming, even on a multi-core machine. Recently, we proposed Halvade, a parallel, multi-node implementation of a DNA sequencing pipeline according to the GATK Best Practices recommendations. The MapReduce programming model is used to distribute the workload among different workers. In this paper, we study the impact of different hardware configurations on the performance of Halvade. Benchmarks indicate that the lack of good multithreading capabilities in the existing tools (BWA, SAMtools, Picard, GATK), in particular, causes suboptimal scaling behavior. We demonstrate that it is possible to circumvent this bottleneck by using multiprocessing on high-memory machines rather than using multithreading. Using a 15-node cluster with 360 CPU cores in total, this results in a runtime of 1 h 31 min. Compared to a single-threaded runtime of approximately 12 days, this corresponds to an overall parallel efficiency of 53%.
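The efficiency figure quoted in this abstract can be checked with a short calculation. This is a minimal sketch, assuming the usual definition of parallel efficiency (speedup divided by core count) and taking "approximately 12 days" as exactly 12 days. The function name `parallel_efficiency` is illustrative and not part of Halvade.

```python
def parallel_efficiency(serial_hours: float, parallel_hours: float, cores: int) -> float:
    """Return speedup divided by the number of cores (assumed definition)."""
    speedup = serial_hours / parallel_hours
    return speedup / cores


# Figures from the abstract; 12 days is treated as exact for illustration.
serial_hours = 12 * 24          # single-threaded runtime in hours
parallel_hours = 1 + 31 / 60    # 1 h 31 min on the cluster
cores = 360                     # 15 nodes, 360 CPU cores in total

speedup = serial_hours / parallel_hours
efficiency = parallel_efficiency(serial_hours, parallel_hours, cores)
print(f"speedup about {speedup:.0f}x, efficiency about {efficiency:.0%}")
```

Under these assumptions the speedup is roughly 190-fold on 360 cores, which is consistent with the reported 53%.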

Crossref

Ghent University Academic Bibliography

Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations

Author: Benkovic
Bershtein
Dams
DePristo
Ding
Dobson
Drummond
E. I. Shakhnovich
Elena
Fernandez
Gronenborn
Ispolatov
Lukatsky
Malevanets
Park
Parsell
Privalov
Privalov
Robertson
Rousseau
Rousseau
S. Bershtein
Shakhnovich
Soskine
Taniguchi
Taverna
W. Mu
Weinreich
Wright
Yang
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2012

Mutations create the genetic diversity on which selective pressures can act, yet also create structural instability in proteins. How, then, is it possible for organisms to ameliorate mutation-induced perturbations of protein stability while maintaining biological fitness and gaining a selective advantage? Here we used a new technique of site-specific chromosomal mutagenesis to introduce a selected set of mostly destabilizing mutations into folA, an essential chromosomal gene of E. coli encoding dihydrofolate reductase (DHFR), to determine how changes in protein stability, activity and abundance affect fitness. In total, 27 E. coli strains carrying mutant DHFR were created. We found no significant correlation between protein stability and catalytic activity, nor between catalytic activity and fitness within the limited range of variation of catalytic activity observed in the mutants. The stability of these mutants is strongly correlated with their intracellular abundance, suggesting that protein homeostatic machinery plays an active role in maintaining intracellular concentrations of proteins. Fitness also shows a significant correlation with intracellular abundance of soluble DHFR in cells growing at 30 °C. At 42 °C, on the other hand, the picture was mixed, yet remarkable: the mutant DHFR proteins of a few strains aggregated, rendering those strains nonviable, but, intriguingly, the majority exhibited fitness higher than wild type. We found that mutational destabilization of DHFR proteins in E. coli is counterbalanced at 42 °C by their soluble oligomerization, thereby restoring structural stability and protecting against aggregation.

arXiv.org e-Print Archive

Crossref

Harvard University - DASH

Fluconazole Monotherapy Is a Suboptimal Option for Initial Treatment of Cryptococcal Meningitis Because of Emergence of Resistance.

Author: Aller
Altamirano
Arechavala
Bicanic
Bicanic
Boeva
Chang
Chen
Dannaoui
DePristo
Drusano
Harrison
Li
Longley
Loyse
Martin
Mayanja-Kizza
McKenna
Molloy
Mondon
Neely
Nixon
Nussbaum
Saag
Sionov
Sionov
Stone
Stott
Stott
Sudan
Wiederhold
Witt
Publication venue: 'American Society for Microbiology'
Publication date: 01/12/2019

Cryptococcal meningitis is a lethal disease with few therapeutic options. Induction therapy with fluconazole has been consistently demonstrated to be associated with suboptimal microbiological and clinical outcomes. Exposure to fluconazole causes dynamic changes in antifungal susceptibility, which are associated with the development of aneuploidy. The implications of this phenomenon for pharmacodynamics of fluconazole for cryptococcal meningitis are poorly understood. The pharmacodynamics of fluconazole were studied using a hollow-fiber infection model (HFIM) and a well-characterized murine model of cryptococcal meningoencephalitis. The relationship between drug exposure and both antifungal killing and the emergence of resistance was quantified. The same relationships were further evaluated in a recently described group of patients with cryptococcal meningitis undergoing induction therapy with fluconazole at 800 to 1,200 mg/day. The pattern of emergence of fluconazole resistance followed an "inverted U." Resistance amplification was maximal and suppressed at ratios of the area under the concentration-time curve for the free, unbound fraction of the drug to the MIC (fAUC:MIC) of 34.5 to 138 and 305.6, respectively. Emergence of resistance was observed in vivo with an fAUC:MIC of 231.4. Aneuploidy with duplication of chromosome 1 was demonstrated to be the underlying mechanism in both experimental models. The pharmacokinetic (PK)-pharmacodynamic model accurately described the PK, antifungal killing, and emergence of resistance. Monte Carlo simulations from the clinical pharmacokinetic-pharmacodynamic model showed that only 12.8% of simulated patients receiving fluconazole at 1,200 mg/day achieved sterilization of the cerebrospinal fluid (CSF) after 2 weeks and that 83.4% had a persistent subpopulation that was resistant to fluconazole. Fluconazole is primarily ineffective due to the emergence of resistance. 
Treatment with 1,200 mg/day leads to the killing of a susceptible subpopulation but is compromised by the emergence of resistance. IMPORTANCE: Cryptococcal meningitis is a lethal disease with few treatment options. The incidence remains high and intricately linked with the HIV/AIDS epidemic. In many parts of the world, fluconazole is the only agent that is available for the initial treatment of cryptococcal meningitis despite considerable evidence that it is associated with suboptimal microbiological and clinical outcomes. Fluconazole has a fungistatic mode of action: it predominantly inhibits growth rather than causing fungal killing. Our work shows that the pattern of fluconazole activity is caused by the emergence of resistance in Cryptococcus not detected by standard susceptibility tests, with chromosomal duplication/aneuploidy as the main mechanism. Resistance emergence is related to drug exposure and occurs with the use of clinically relevant regimens. Hence, fluconazole (and potentially other agents that target 14-alpha-demethylase) is compromised by an intrinsic property that limits its effectiveness. However, this resistance may be potentially overcome by dosage escalation or the use of combination therapy.
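The "inverted U" thresholds quoted in this abstract can be laid out as a small lookup. This is a sketch only, not the authors' pharmacokinetic-pharmacodynamic model. The function name `resistance_regime` and the regime labels are hypothetical, and treating every ratio at or above 305.6 as suppressed is an assumption drawn from the inverted-U description.

```python
def resistance_regime(fauc_mic: float) -> str:
    """Map an fAUC:MIC ratio onto the ranges reported in the abstract.

    Ratios outside the reported ranges are labelled as not characterised,
    because the abstract gives no figures for them.
    """
    if fauc_mic < 0:
        raise ValueError("fAUC:MIC ratio cannot be negative")
    if 34.5 <= fauc_mic <= 138:
        return "resistance amplification maximal"
    if fauc_mic >= 305.6:
        # Assumption: suppression persists above the reported value.
        return "resistance amplification suppressed"
    return "not characterised in the abstract"


# Ratios mentioned in the abstract, including the in vivo value 231.4.
for ratio in (34.5, 138, 231.4, 305.6):
    print(ratio, "->", resistance_regime(ratio))
```

The in vivo ratio of 231.4, at which emergence of resistance was observed, falls between the two reported ranges, so the sketch deliberately leaves it unclassified.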

University of Liverpool Repository

Crossref

Directory of Open Access Journals

St George's Online Research Archive

Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

Author: David Altshuler
Eric Banks
George B. Grant
Jason Flannick
Joshua M. Korn
Mark A. DePristo
Pierre Fontanillas
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF < 5%), when low coverage sequence reads are added to dense genome-wide SNP arrays; the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome-wide association studies and supports development of improved methods for genotype calling.

CiteSeerX

Public Library of Science (PLOS)

DSpace@MIT

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

FigShare

Workshop—Predicting the Structure of Biological Molecules

Author: Abbott
Blackburne
Burke
Calladine
Cavallo
Damian Counsell
de Bakker
DePristo
DePristo
Fernandez-Recio
Fraternali
Johannissen
Jones
Laskowski
Lin
Lindorff-Larsen
Madera
Mizuguchi
Shi
Shirai
Shirai
Soyer
Stebbings
Taylor
Taylor
Taylor
Vendruscolo
Ward
Publication venue: 'Hindawi Publishing Corporation'
Publication date: 01/01/2004

This April, in Cambridge (UK), principal investigators from the Mathematical Biology Group of the Medical Research Council's National Institute for Medical Research organized a workshop in structural bioinformatics at the Centre for Mathematical Sciences. Bioinformatics researchers of several nationalities from labs around the country presented and discussed their computational work in biomolecular structure prediction and analysis, and in protein evolution. The meeting was intensive and lively and gave attendees an overview of the healthy state of protein bioinformatics in the UK.

Crossref

Directory of Open Access Journals

PubMed Central

Illuminating Choices for Library Prep: A Comparison of Library Preparation Methods for Whole Genome Sequencing of Cryptococcus neoformans Using Illumina HiSeq.

Author: A Adey
A McKenna
BJ Loftus
EL van Dijk
GA Van der Auwera
H Li
H Li
H Li
J Dabney
JD McPherson
Johanna Rhodes
Kirsten Nielsen
L DeFrancesco
M Eisenstein
MA DePristo
MA Quail
Mathew A. Beale
Matthew C. Fisher
R Marine
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 24/10/2014

The next-generation sequencing industry is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits, the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs), to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Spiral - Imperial College Digital Repository

St George's Online Research Archive

Epistasis not needed to explain low dN/dS

Author: AL Halpern
AS Kondrashov
AU Tamuri
David M. McCandlish
DM Fowler
Etienne Rajon
J da Silva
Joshua B. Plotkin
MA DePristo
MLM Salverda
MS Breen
N Rodrigue
Premal Shah
S Kryazhimskiy
SC Choi
TF Hansen
WH Li
Yang Ding
Z Yang
Z Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/12/2012

An important question in molecular evolution is whether an amino acid that occurs at a given position makes an independent contribution to fitness, or whether its effect depends on the state of other loci in the organism's genome, a phenomenon known as epistasis. In a recent letter to Nature, Breen et al. (2012) argued that epistasis must be "pervasive throughout protein evolution" because the observed ratio between the per-site rates of non-synonymous and synonymous substitutions (dN/dS) is much lower than would be expected in the absence of epistasis. However, when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed in a protein alignment at any particular position have equal fitness. Here, we relax this unrealistic assumption and show that any dN/dS value can in principle be achieved at a site, without epistasis. Furthermore, for all nuclear and chloroplast genes in the Breen et al. dataset, we show that the observed dN/dS values and the observed patterns of amino acid diversity at each site are jointly consistent with a non-epistatic model of protein evolution. Comment: This manuscript is in response to "Epistasis as the primary factor in molecular evolution" by Breen et al., Nature 490, 535-538 (2012)

arXiv.org e-Print Archive

Crossref

Cold Spring Harbor Laboratory Institutional Repository

INRIA a CCSD electronic archive server

HAL Descartes