
    DisCVR: rapid viral diagnosis from high-throughput sequencing data

    High-throughput sequencing (HTS) enables most pathogens in a clinical sample to be detected from a single analysis, thereby providing novel opportunities for diagnosis, surveillance, and epidemiology. However, this powerful technology is difficult to apply in diagnostic laboratories because of its computational and bioinformatic demands. We have developed DisCVR, which detects known human viruses in clinical samples by matching sample k-mers (22-nucleotide sequences) to k-mers from taxonomically labeled viral genomes. DisCVR was validated using published HTS data for eighty-nine clinical samples from adults with upper respiratory tract infections. These samples had been tested for viruses metagenomically and also by real-time polymerase chain reaction assay, which is the standard diagnostic method. DisCVR detected human viruses with high sensitivity (79%) and specificity (100%), and was able to detect mixed infections. Moreover, it produced results comparable to those in a published metagenomic analysis of 177 blood samples from patients in Nigeria. DisCVR has been designed as a user-friendly tool for detecting human viruses from HTS data using computers with limited RAM and processing power, and includes a graphical user interface to help users interpret and validate the output. It is written in Java and is publicly available from http://bioinformatics.cvr.ac.uk/discvr.php
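The k-mer matching idea described in the abstract can be illustrated with a minimal sketch. This is not DisCVR's implementation (the tool itself is written in Java); the function names `kmers`, `build_index`, and `classify_reads` are hypothetical, and the choice to count only k-mers that belong to a single virus label is an assumption made here for simplicity. Only the k-mer length of 22 comes from the abstract.

```python
from collections import Counter

K = 22  # k-mer length stated in the abstract


def kmers(sequence, k=K):
    """Yield each overlapping k-mer, skipping any with non-ACGT characters."""
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def build_index(labeled_genomes, k=K):
    """Map each reference k-mer to the set of virus labels containing it.

    `labeled_genomes` is a hypothetical dict of {virus label: genome string}.
    """
    index = {}
    for label, genome in labeled_genomes.items():
        for kmer in kmers(genome, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify_reads(reads, index, k=K):
    """Count sample k-mers that match exactly one virus label (assumption)."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            labels = index.get(kmer)
            if labels and len(labels) == 1:
                hits[next(iter(labels))] += 1
    return hits
```

A caller would build the index once from the labeled reference genomes, pass sample reads to `classify_reads`, and rank viruses by hit count. A dictionary keyed on exact k-mers keeps each lookup constant-time, which fits the abstract's aim of running on modest hardware, although a real tool would need a far more compact representation.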

    Identifying the genetic basis of viral spillover using Lassa virus as a test case

    The rate at which zoonotic viruses spill over into the human population varies significantly over space and time. Remarkably, we do not yet know how much of this variation is attributable to genetic variation within viral populations. This gap in understanding arises because we lack methods of genetic analysis that can be easily applied to zoonotic viruses, where the number of available viral sequences is often limited, and opportunistic sampling introduces significant population stratification. Here, we explore the feasibility of using patterns of shared ancestry to correct for population stratification, enabling genome-wide association methods to identify genetic substitutions associated with spillover into the human population. Using a combination of phylogenetically structured simulations and Lassa virus sequences collected from humans and rodents in Sierra Leone, we demonstrate that existing methods do not fully correct for stratification, leading to elevated error rates. We also demonstrate, however, that the Type I error rate can be substantially reduced by confining the analysis to a less-stratified region of the phylogeny, even in an already-small dataset. Using this method, we detect two candidate single-nucleotide polymorphisms associated with spillover in the Lassa virus polymerase gene and provide generalized recommendations for the collection and analysis of zoonotic viruses.