    Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly

    Motivation: Eugene Myers, in his string graph paper (Myers, 2005), suggested that in a string graph or, equivalently, a unitig graph, any path spells a valid assembly. As a string/unitig graph also encodes every valid assembly of reads, such a graph, provided that it can be constructed correctly, is in fact a lossless representation of reads. In principle, every analysis based on whole-genome shotgun sequencing (WGS) data, such as SNP and insertion/deletion (INDEL) calling, can also be achieved with unitigs. Results: To explore the feasibility of using de novo assembly in the context of resequencing, we developed a de novo assembler, fermi, that assembles Illumina short reads into unitigs while preserving most of the information of the input reads. SNPs and INDELs can be called by mapping the unitigs against a reference genome. By applying the method to 35-fold human resequencing data, we showed that, in comparison to the standard pipeline, our approach yields similar accuracy for SNP calling and better results for INDEL calling. It has higher sensitivity than other de novo assembly based methods for variant calling. Our work suggests that variant calling with de novo assembly can be a beneficial complement to the standard variant calling pipeline for whole-genome resequencing. On the methodological side, we proposed the FMD-index for forward-backward extension of DNA sequences, a fast algorithm for finding all super-maximal exact matches, and one-pass construction of unitigs from an FMD-index. Availability: http://github.com/lh3/fermi Contact: [email protected] Comment: Rev2: submitted version with minor improvements; 7 pages

    Rotational Correction on the Morse Potential Through the Pekeris Approximation and Nikiforov-Uvarov Method

    The Nikiforov-Uvarov method is employed to solve the Schrödinger equation with a rotating Morse potential. The bound-state energy eigenvalues and the corresponding eigenfunctions are obtained. These calculations present an effective and clear method, under the Pekeris approximation, for solving the rotating Morse model. The results obtained here are in good agreement with previous ones. Comment: 11 pages, no figure, submitted to Chemical Physics Letters (2005)
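
    For orientation, the problem named in this abstract is commonly written in the form sketched below. The abstract itself gives no formulas, so the symbols (D_e, a, r_e, \mu, \ell) and the expansion coefficients are the standard textbook form and are assumptions here; the paper's own notation may differ.

```latex
% Effective radial potential: Morse term plus centrifugal (rotational) term.
% Standard textbook form; notation assumed, not taken from the paper.
V_{\mathrm{eff}}(r) = D_e\left(1 - e^{-a(r-r_e)}\right)^2
                    + \frac{\hbar^2\,\ell(\ell+1)}{2\mu r^2}

% Pekeris approximation, with x = (r-r_e)/r_e and \alpha = a r_e:
\frac{1}{r^2} \approx \frac{1}{r_e^2}
    \left(C_0 + C_1 e^{-\alpha x} + C_2 e^{-2\alpha x}\right)

C_0 = 1 - \frac{3}{\alpha} + \frac{3}{\alpha^2}, \qquad
C_1 = \frac{4}{\alpha} - \frac{6}{\alpha^2}, \qquad
C_2 = -\frac{1}{\alpha} + \frac{3}{\alpha^2}
```

    The coefficients follow from matching the expansion of (1+x)^{-2} to second order in x. This puts the centrifugal term in the same exponential form as the Morse potential, which is what allows a method such as Nikiforov-Uvarov to be applied.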

    Performance analysis of a parallel, multi-node pipeline for DNA sequencing

    Post-sequencing DNA analysis typically consists of read mapping followed by variant calling and is very time-consuming, even on a multi-core machine. Recently, we proposed Halvade, a parallel, multi-node implementation of a DNA sequencing pipeline according to the GATK Best Practices recommendations. The MapReduce programming model is used to distribute the workload among different workers. In this paper, we study the impact of different hardware configurations on the performance of Halvade. Benchmarks indicate that the lack of good multithreading capabilities in the existing tools (BWA, SAMtools, Picard, GATK) in particular causes suboptimal scaling behavior. We demonstrate that it is possible to circumvent this bottleneck by using multiprocessing on high-memory machines rather than multithreading. Using a 15-node cluster with 360 CPU cores in total, this results in a runtime of 1 h 31 min. Compared to a single-threaded runtime of approximately 12 days, this corresponds to an overall parallel efficiency of 53%.
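
    The efficiency figure quoted in this abstract can be checked with a short calculation. The sketch below assumes the usual definition (speedup divided by number of cores) and treats the roughly 12-day single-threaded runtime as exactly 12 days; the function and variable names are illustrative and not taken from Halvade.

```python
# Figures quoted in the abstract; the 12-day value is approximate there
# and is taken as exactly 12 days here (assumption).
SINGLE_THREAD_MINUTES = 12 * 24 * 60   # 12 days expressed in minutes
PARALLEL_MINUTES = 1 * 60 + 31         # 1 h 31 min
CORES = 360                            # 15 nodes, 360 CPU cores in total


def parallel_efficiency(serial_time, parallel_time, workers):
    """Return speedup / workers, the standard parallel-efficiency ratio."""
    speedup = serial_time / parallel_time
    return speedup / workers


speedup = SINGLE_THREAD_MINUTES / PARALLEL_MINUTES
efficiency = parallel_efficiency(SINGLE_THREAD_MINUTES, PARALLEL_MINUTES, CORES)

print(f"speedup: about {speedup:.0f}x")
print(f"parallel efficiency: about {efficiency:.0%}")
```

    Under these assumptions the speedup comes out at about 190x and the efficiency at about 53%, consistent with the value stated in the abstract.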

    Soluble oligomerization provides a beneficial fitness effect on destabilizing mutations

    Mutations create the genetic diversity on which selective pressures can act, yet also create structural instability in proteins. How, then, is it possible for organisms to ameliorate mutation-induced perturbations of protein stability while maintaining biological fitness and gaining a selective advantage? Here we used a new technique of site-specific chromosomal mutagenesis to introduce a selected set of mostly destabilizing mutations into folA, an essential chromosomal gene of E. coli encoding dihydrofolate reductase (DHFR), to determine how changes in protein stability, activity and abundance affect fitness. In total, 27 E. coli strains carrying mutant DHFR were created. We found no significant correlation between protein stability and catalytic activity, nor between catalytic activity and fitness within the limited range of catalytic activity observed in the mutants. The stability of these mutants is strongly correlated with their intracellular abundance, suggesting that protein homeostatic machinery plays an active role in maintaining intracellular concentrations of proteins. Fitness also shows a significant correlation with intracellular abundance of soluble DHFR in cells growing at 30 °C. At 42 °C, on the other hand, the picture was mixed, yet remarkable: a few strains carrying mutant DHFR proteins aggregated, rendering them nonviable, but, intriguingly, the majority exhibited fitness higher than wild type. We found that mutational destabilization of DHFR proteins in E. coli is counterbalanced at 42 °C by their soluble oligomerization, thereby restoring structural stability and protecting against aggregation.

    Fluconazole Monotherapy Is a Suboptimal Option for Initial Treatment of Cryptococcal Meningitis Because of Emergence of Resistance.

    Cryptococcal meningitis is a lethal disease with few therapeutic options. Induction therapy with fluconazole has been consistently demonstrated to be associated with suboptimal microbiological and clinical outcomes. Exposure to fluconazole causes dynamic changes in antifungal susceptibility, which are associated with the development of aneuploidy. The implications of this phenomenon for pharmacodynamics of fluconazole for cryptococcal meningitis are poorly understood. The pharmacodynamics of fluconazole were studied using a hollow-fiber infection model (HFIM) and a well-characterized murine model of cryptococcal meningoencephalitis. The relationship between drug exposure and both antifungal killing and the emergence of resistance was quantified. The same relationships were further evaluated in a recently described group of patients with cryptococcal meningitis undergoing induction therapy with fluconazole at 800 to 1,200 mg/day. The pattern of emergence of fluconazole resistance followed an "inverted U." Resistance amplification was maximal and suppressed at ratios of the area under the concentration-time curve for the free, unbound fraction of the drug to the MIC (fAUC:MIC) of 34.5 to 138 and 305.6, respectively. Emergence of resistance was observed in vivo with an fAUC:MIC of 231.4. Aneuploidy with duplication of chromosome 1 was demonstrated to be the underlying mechanism in both experimental models. The pharmacokinetic (PK)-pharmacodynamic model accurately described the PK, antifungal killing, and emergence of resistance. Monte Carlo simulations from the clinical pharmacokinetic-pharmacodynamic model showed that only 12.8% of simulated patients receiving fluconazole at 1,200 mg/day achieved sterilization of the cerebrospinal fluid (CSF) after 2 weeks and that 83.4% had a persistent subpopulation that was resistant to fluconazole. Fluconazole is primarily ineffective due to the emergence of resistance. Treatment with 1,200 mg/day leads to the killing of a susceptible subpopulation but is compromised by the emergence of resistance.
    IMPORTANCE: Cryptococcal meningitis is a lethal disease with few treatment options. The incidence remains high and intricately linked with the HIV/AIDS epidemic. In many parts of the world, fluconazole is the only agent that is available for the initial treatment of cryptococcal meningitis despite considerable evidence that it is associated with suboptimal microbiological and clinical outcomes. Fluconazole has a fungistatic mode of action: it predominantly inhibits growth rather than causing fungal killing. Our work shows that the pattern of fluconazole activity is caused by the emergence of resistance in Cryptococcus not detected by standard susceptibility tests, with chromosomal duplication/aneuploidy as the main mechanism. Resistance emergence is related to drug exposure and occurs with the use of clinically relevant regimens. Hence, fluconazole (and potentially other agents that target 14-alpha-demethylase) is compromised by an intrinsic property that limits its effectiveness. However, this resistance may be potentially overcome by dosage escalation or the use of combination therapy.

    Efficiency and Power as a Function of Sequence Coverage, SNP Array Density, and Imputation

    High coverage whole genome sequencing provides near complete information about genetic variation. However, other technologies can be more efficient in some settings by (a) reducing redundant coverage within samples and (b) exploiting patterns of genetic variation across samples. To characterize as many samples as possible, many genetic studies therefore employ lower coverage sequencing or SNP array genotyping coupled to statistical imputation. To compare these approaches individually and in conjunction, we developed a statistical framework to estimate genotypes jointly from sequence reads, array intensities, and imputation. In European samples, we find similar sensitivity (89%) and specificity (99.6%) from imputation with either 1× sequencing or 1 M SNP arrays. Sensitivity is increased, particularly for low-frequency polymorphisms (MAF <5%), when low coverage sequence reads are added to dense genome-wide SNP arrays — the converse, however, is not true. At sites where sequence reads and array intensities produce different sample genotypes, joint analysis reduces genotype errors and identifies novel error modes. Our joint framework informs the use of next-generation sequencing in genome wide association studies and supports development of improved methods for genotype calling

    Workshop—Predicting the Structure of Biological Molecules

    This April, in Cambridge (UK), principal investigators from the Mathematical Biology Group of the Medical Research Council's National Institute of Medical Research organized a workshop in structural bioinformatics at the Centre for Mathematical Sciences. Bioinformatics researchers of several nationalities from labs around the country presented and discussed their computational work in biomolecular structure prediction and analysis, and in protein evolution. The meeting was intensive and lively and gave attendees an overview of the healthy state of protein bioinformatics in the UK

    Illuminating Choices for Library Prep: A Comparison of Library Preparation Methods for Whole Genome Sequencing of Cryptococcus neoformans Using Illumina HiSeq.

    The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates of the pathogenic fungus Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits, the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs), to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.

    Epistasis not needed to explain low dN/dS

    An important question in molecular evolution is whether an amino acid that occurs at a given position makes an independent contribution to fitness, or whether its effect depends on the state of other loci in the organism's genome, a phenomenon known as epistasis. In a recent letter to Nature, Breen et al. (2012) argued that epistasis must be "pervasive throughout protein evolution" because the observed ratio between the per-site rates of non-synonymous and synonymous substitutions (dN/dS) is much lower than would be expected in the absence of epistasis. However, when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed in a protein alignment at any particular position have equal fitness. Here, we relax this unrealistic assumption and show that any dN/dS value can in principle be achieved at a site, without epistasis. Furthermore, for all nuclear and chloroplast genes in the Breen et al. dataset, we show that the observed dN/dS values and the observed patterns of amino acid diversity at each site are jointly consistent with a non-epistatic model of protein evolution. Comment: This manuscript is in response to "Epistasis as the primary factor in molecular evolution" by Breen et al., Nature 490, 535-538 (2012)