8 research outputs found

    Viral population estimation using pyrosequencing

    The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an EM algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
    Comment: 23 pages, 13 figures
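
The frequency-estimation step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed candidate haplotype set, reads given as (start position, sequence) pairs already aligned to the haplotypes, and a simple uniform substitution-error model. The names `read_likelihood`, `em_haplotype_frequencies`, and the `error_rate` parameter are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def read_likelihood(read: str, start: int, haplotype: str, error_rate: float) -> float:
    """P(read | haplotype), assuming independent per-base substitution errors.

    Assumption: a wrong base is equally likely to be any of the 3 other bases.
    """
    p = 1.0
    for offset, base in enumerate(read):
        if haplotype[start + offset] == base:
            p *= 1.0 - error_rate
        else:
            p *= error_rate / 3.0
    return p


def em_haplotype_frequencies(
    reads: List[Tuple[int, str]],
    haplotypes: List[str],
    error_rate: float = 0.01,
    iterations: int = 200,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate mixture frequencies of the given haplotypes by EM."""
    n = len(haplotypes)
    freqs = [1.0 / n] * n  # start from a uniform guess

    # The likelihoods do not depend on the frequencies, so compute them once.
    lik = [
        [read_likelihood(seq, start, h, error_rate) for h in haplotypes]
        for start, seq in reads
    ]

    for _ in range(iterations):
        # E-step: expected number of reads assigned to each haplotype.
        counts = [0.0] * n
        for row in lik:
            weights = [f * l for f, l in zip(freqs, row)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is impossible under every haplotype; skip it
            for k, w in enumerate(weights):
                counts[k] += w / total

        # M-step: new frequencies are the normalised expected counts.
        assigned = sum(counts)
        if assigned == 0.0:
            break
        new_freqs = [c / assigned for c in counts]

        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


if __name__ == "__main__":
    # Toy data: two haplotypes differing at position 1, four short reads.
    toy_haplotypes = ["ACGT", "AGGT"]
    toy_reads = [(0, "ACG"), (0, "ACG"), (0, "ACG"), (1, "GGT")]
    print(em_haplotype_frequencies(toy_reads, toy_haplotypes))
```

In this toy setting, three of the four reads support the first haplotype, so its estimated frequency should come out well above that of the second. A real analysis would also need the error-correction and haplotype-construction steps that the abstract describes before this stage.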

    When simple sequence comparison fails: the cryptic case of the shared domains of the bacterial replication initiation proteins DnaB and DnaD

    DnaD and DnaB are essential DNA-replication-initiation proteins in low-G+C content Gram-positive bacteria. Here we use sensitive Hidden Markov Model-based techniques to show that the DnaB and DnaD proteins share a common structure that is evident across all their structural domains, termed DDBH1 and DDBH2 (DnaD DnaB Homology 1 and 2). Despite strong sequence divergence, many of the DNA-binding and oligomerization properties of these domains have been conserved. Although eluding simple sequence comparisons, the DDBH2 domains share the only strong sequence motif: an extremely highly conserved YxxxIxxxW sequence that contributes to DNA binding. Sequence alignments of DnaD alone fail to identify another key part of the DNA-binding module, since it includes a poorly conserved sequence, a solvent-exposed and somewhat unstable helix and a mobile segment. We show by NMR, in vitro mutagenesis and in vivo complementation experiments that the DNA-binding module of Bacillus subtilis DnaD comprises the YxxxIxxxW motif, the unstable helix and a portion of the mobile region, the latter two being essential for viability. These structural insights lead us to a re-evaluation of the oligomerization and DNA-binding properties of the DnaD and DnaB proteins.