
    A Modified RNA-Seq Approach for Whole Genome Sequencing of RNA Viruses from Faecal and Blood Samples

    To date, very large scale sequencing of many clinically important RNA viruses has been complicated by their high population molecular variation, which creates challenges for polymerase chain reaction and sequencing primer design. Many RNA viruses are also difficult or currently not possible to culture, severely limiting the amount and purity of available starting material. Here, we describe a simple, novel, high-throughput approach to Norovirus and Hepatitis C virus whole genome sequence determination based on RNA shotgun sequencing (also known as RNA-Seq). We demonstrate the effectiveness of this method by sequencing three Norovirus samples from faeces and two Hepatitis C virus samples from blood, on an Illumina MiSeq benchtop sequencer. More than 97% of reference genomes were recovered. Compared with Sanger sequencing, our method had no nucleotide differences in 14,019 nucleotides (nt) for Noroviruses (from a total of 2 Norovirus genomes obtained with Sanger sequencing), and 8 variants in 9,542 nt for Hepatitis C virus (1 variant per 1,193 nt). The three Norovirus samples had 2, 3, and 2 distinct positions called as heterozygous, while the two Hepatitis C virus samples had 117 and 131 positions called as heterozygous. To confirm that our sample and library preparation could be scaled to true high-throughput, we prepared and sequenced an additional 77 Norovirus samples in a single batch on an Illumina HiSeq 2000 sequencer, recovering >90% of the reference genome in all but one sample. No discrepancies were observed across 118,757 nt compared between Sanger and our custom RNA-Seq method in 16 samples. By generating viral genomic sequences that are not biased by primer-specific amplification or enrichment, this method offers the prospect of large-scale, affordable studies of RNA viruses which could be adapted to routine diagnostic laboratory workflows in the near future, with the potential to directly characterize within-host viral diversity.
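The "1 variant per 1,193 nt" figure quoted for Hepatitis C virus follows directly from the two counts in the abstract. A minimal sketch of that arithmetic is below; the function names are illustrative and not taken from the study.

```python
def nt_per_variant(variants: int, compared_nt: int) -> float:
    """Average number of compared nucleotides per discordant call."""
    if variants <= 0:
        raise ValueError("rate is undefined when no variants were observed")
    return compared_nt / variants


def concordance(variants: int, compared_nt: int) -> float:
    """Fraction of compared positions that agree between the two methods."""
    return 1.0 - variants / compared_nt


# Counts quoted in the abstract for Hepatitis C virus.
hcv_variants, hcv_compared = 8, 9542
print(round(nt_per_variant(hcv_variants, hcv_compared)))   # 1193
print(f"{concordance(hcv_variants, hcv_compared):.4%}")

# Norovirus: zero differences in 14,019 nt, so concordance is exactly 1.
print(concordance(0, 14019))
```

For the Norovirus comparison the per-variant rate is undefined (no differences were found), which is why the helper raises instead of returning infinity.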

    Evolutionary tree created by BEAST (Bayesian evolutionary analysis sampling trees) depicting all the full genomic sequences with relatedness (61 sequences, excluding repeated pairs).

    Clusters of genomes are visible among viruses sampled at similar points in time. Whole genome sequencing gives adequate resolution to distinguish potentially divergent viral strains within the same time period, as illustrated in clusters from January 2010, February 2011 and March 2011. WO = ward outbreak. Each node and branch is coloured according to the posterior probability supporting that clade, as calculated by Bayesian analysis (Dark Blue = 1 (high); Light Red = 0 (low)). Analysis was performed using BEAST v.1.7.5, combining two chains run with different random number seeds (10 million iterations each, saving 1 in 1000 iterations, with a 1 million iteration burn-in), using: HKY substitution model; estimated base frequencies; strict clock; and constant population size coalescent tree prior. This maximum clade credibility tree was computed using TreeAnnotator v.1.7.5 and plotted with FigTree v.1.4.0.
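The chain settings in the caption imply a specific number of posterior trees feeding the maximum clade credibility tree. The sketch below works that out, assuming the 1 million iteration burn-in is counted inside each 10 million iteration chain and ignoring the initial state that a sampler may also log; neither detail is stated in the caption, and the function name is hypothetical.

```python
def retained_samples(chain_length: int, sample_every: int,
                     burn_in: int, n_chains: int = 1) -> int:
    """Count posterior samples kept after discarding burn-in from each chain."""
    if burn_in >= chain_length:
        raise ValueError("burn-in must be shorter than the chain")
    per_chain = (chain_length - burn_in) // sample_every
    return per_chain * n_chains


# Settings quoted in the caption: two chains of 10 million iterations,
# sampled every 1000 iterations, with 1 million iterations of burn-in.
total = retained_samples(10_000_000, 1000, 1_000_000, n_chains=2)
print(total)  # 18000
```

Under these assumptions, roughly 9,000 trees per chain (18,000 combined) would be summarised.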

    A comparison of workflows and consumable costs for various viral sequencing approaches.

    The ‘Custom’ protocol refers to the modified RNA-Seq method used in this study to create larger insert fragments. Alternative methods include Amplicon-seq and hybridisation capture (SureSelect Target Enrichment for Illumina Paired-End mRNA-Seq Library Prep; version 1.1). Failure rates are determined by failure to sequence at least one amplicon (<86% of the genome). The failure rate for SureSelect is not given as it was not performed in our study. Consumable costs are list price per sample and exclude sequencing.
    * Failure rate based on that observed with Sanger sequencing.
    ** Estimated cost for probes only. Extra cost incurred for Agilent library preparation kit plus additional reagents recommended by Agilent.
    *** Linear amplification during SPIA reverse transcription has not been accounted for.
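The failure criterion in this caption (less than 86% of the genome sequenced) can be expressed as a breadth-of-coverage check. The following is a minimal sketch only: the per-position depth list, the `min_depth` cut-off of 1 read, and the function names are assumptions for illustration and do not come from the study.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 1) -> float:
    """Fraction of reference positions covered by at least min_depth reads."""
    if not depths:
        raise ValueError("depths must contain one value per reference position")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def is_failure(depths: Sequence[int], threshold: float = 0.86,
               min_depth: int = 1) -> bool:
    """Flag a sample as failed when coverage breadth is below the threshold."""
    return breadth_of_coverage(depths, min_depth) < threshold


# Toy 100-position genome: 80 positions covered, 20 not covered.
toy_sample = [5] * 80 + [0] * 20
print(breadth_of_coverage(toy_sample))  # 0.8
print(is_failure(toy_sample))           # True
```

The same helper could be applied with a different threshold to reproduce the ">90% of the reference genome" summary used for the HiSeq batch.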

    Schematic representation of different strategies for viral genome resequencing.

    A) Total RNA library: all the RNA species present in the sample are sequenced, with no assumption about which genome is present. B) Hybridisation capture of an mRNA library: a good reference genome is needed to design the capture probes. C) PCR enrichment: the desired genome is amplified from cDNA, and a reference genome is needed to design specific oligos. Red lines, genomes of interest; Blue segments, Illumina adapters; Black lines, other RNA species.