Search CORE

13 research outputs found

Examples of large insertion-deletion polymorphisms within single hosts.

A) 23.9 kb deletion of a Panton-Valentine leukocidin-encoding prophage in four colonies isolated from participant J (contig C618:c65). B) 3.5 kb indel knocking out adhE in three colonies isolated from participant F (contig C608:c44). In both panels, the deleted region is indicated in red. Coding sequences (CDS, dark blue), tRNA (dark red), rRNA (purple) and other features (gray) are indicated by filled rectangles. Sliding windows show GC content (black) and positive (green) or negative (purple) GC skew. Positions are given relative to the concatenated Velvet assemblies of the host-specific reference genomes. Figures were extracted from a circular chromosome plot generated using CGView [81].

FigShare

A Modified RNA-Seq Approach for Whole Genome Sequencing of RNA Viruses from Faecal and Blood Samples

To date, very large scale sequencing of many clinically important RNA viruses has been complicated by their high population molecular variation, which creates challenges for polymerase chain reaction and sequencing primer design. Many RNA viruses are also difficult or currently not possible to culture, severely limiting the amount and purity of available starting material. Here, we describe a simple, novel, high-throughput approach to Norovirus and Hepatitis C virus whole genome sequence determination based on RNA shotgun sequencing (also known as RNA-Seq). We demonstrate the effectiveness of this method by sequencing three Norovirus samples from faeces and two Hepatitis C virus samples from blood, on an Illumina MiSeq benchtop sequencer. More than 97% of reference genomes were recovered. Compared with Sanger sequencing, our method had no nucleotide differences in 14,019 nucleotides (nt) for Noroviruses (from a total of 2 Norovirus genomes obtained with Sanger sequencing), and 8 variants in 9,542 nt for Hepatitis C virus (1 variant per 1,193 nt). The three Norovirus samples had 2, 3, and 2 distinct positions called as heterozygous, while the two Hepatitis C virus samples had 117 and 131 positions called as heterozygous. To confirm that our sample and library preparation could be scaled to true high-throughput, we prepared and sequenced an additional 77 Norovirus samples in a single batch on an Illumina HiSeq 2000 sequencer, recovering >90% of the reference genome in all but one sample. No discrepancies were observed across 118,757 nt compared between Sanger and our custom RNA-Seq method in 16 samples. By generating viral genomic sequences that are not biased by primer-specific amplification or enrichment, this method offers the prospect of large-scale, affordable studies of RNA viruses which could be adapted to routine diagnostic laboratory workflows in the near future, with the potential to directly characterize within-host viral diversity.

Directory of Open Access Journals

PubMed Central

FigShare

Genomic diversity of Staphylococcus aureus in 13 singly-colonized nasal carriers.

For each carriage study participant (A–M) a representation of the maximum likelihood tree is shown relating all colonies isolated and sequenced from that host. Gray circles represent observed genotypes, where area is proportional to sample frequency, and small black circles represent hypothetical intermediate genotypes. Edges (branches) represent mutations, color-coded as follows: synonymous (green), non-synonymous (orange), premature stop (red), non-coding (grey), structural variant (black). Solid edges represent SNPs and dashed edges represent indels. The ordering of mutations along a branch is arbitrary.

FigShare

Evidence for natural selection on the Staphylococcus aureus genome during asymptomatic carriage.

A) The relative number of synonymous versus non-synonymous SNPs on all branches of the within-host genealogies relating colonies sampled from hosts A–M. Each pie represents a branch in Figure 1, divided into segments according to the proportion of synonymous (green) and non-synonymous (orange) mutations on that branch. The area of the pie is proportional to the number of synonymous and non-synonymous mutations on that branch. The solid line is the uncorrected dN/dS ratio estimated from SNPs within hosts, which was significantly greater than the uncorrected dN/dS ratio estimated from SNPs between hosts (dashed line, McDonald-Kreitman test p<0.001). B) The sample frequency of SNPs, represented by the minor (less frequent) allele. Bars are color-coded according to SNP type: synonymous (green), non-synonymous (orange), nonsense (red) and intergenic (grey). C) The expected and observed number of within-host mutations per gene (solid and hatched bars respectively), combined across participants A–M, Q and R.
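The comparison in panel A can be illustrated with a small sketch. It assumes that the uncorrected dN/dS ratio is the raw count of non-synonymous SNPs divided by the count of synonymous SNPs, and that the McDonald-Kreitman comparison is run as a two-sided Fisher's exact test on the 2×2 table of SNP type by within-host versus between-host origin. Both choices are assumptions, since the caption does not state how the test was computed. The function names are hypothetical, and the counts are invented placeholders, not data from the study.

```python
from math import comb


def uncorrected_dn_ds(nonsyn, syn):
    """Raw ratio of non-synonymous to synonymous SNP counts (assumed definition)."""
    return nonsyn / syn


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def pmf(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        pmf(x)
        for x in range(lowest, highest + 1)
        if pmf(x) <= p_observed * (1 + 1e-9)
    )


# Placeholder counts for illustration only (NOT taken from the study).
within_nonsyn, within_syn = 60, 20
between_nonsyn, between_syn = 300, 600

print("within-host dN/dS:", uncorrected_dn_ds(within_nonsyn, within_syn))
print("between-host dN/dS:", uncorrected_dn_ds(between_nonsyn, between_syn))
print(
    "p-value:",
    fisher_exact_two_sided(within_nonsyn, within_syn, between_nonsyn, between_syn),
)
```

A small p-value here would indicate that the split between non-synonymous and synonymous SNPs differs between the within-host and between-host sets, which is the kind of contrast the solid and dashed lines summarize.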

FigShare

Genomic diversity in asymptomatically carried nasal populations of Staphylococcus aureus.

a Recent antibiotic use: ★ amoxicillin, ★★ antibiotic with expected anti-staphylococcal activity (trimethoprim or ciprofloxacin). b Average SNP divergence between colonies. c Single-locus variant of ST22. MLST: multilocus sequence type, MRSA: methicillin-resistant Staphylococcus aureus, syn: synonymous, stop: premature stop codon, CDS: coding sequence, SNP: single nucleotide polymorphism, indel: insertion/deletion, ST: sequence type, CC: clonal complex.

FigShare

Estimating the number of transmission events from genomic divergence.

We used a simple population genetics model to calculate the probability of the number of mutational differences between two bacterial genomes, conditional on the number of transmission events that have occurred since their most recent common ancestor, under A) slow transmission (0.3 transmissions per year) and B) rapid transmission (1.2 transmissions per year). We employed the model to estimate the Bayesian posterior probability of the number of transmission events conditional on the observed number of mutational differences, under C) slow and D) rapid transmission. E) When we applied the model to CC22 genomes under the slow transmission model, we detected evidence for very recent transmission between some pairs of hosts, including the possibility of direct transmission. In A) and B), the lines are color-coded according to the number of transmission events, indicated by the key. In C) and D), the magnitude of the posterior probability is indicated by the intensity of the shading, as shown by the key. In E), the ten pairs of hosts with most evidence for recent transmission are shown. The colors distinguish transmission pairs.
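The two quantities described above, a likelihood of mutational differences given a number of transmission events and a posterior over transmission events given observed differences, can be sketched as follows. This is not the authors' model, which the caption does not specify. The sketch assumes that transmissions form a Poisson process (so the time spanned by k transmissions is gamma distributed), that mutations accumulate as a Poisson process at a constant rate, and that the prior over k is uniform on 1 to `max_k`. Under those assumptions the number of differences follows a negative binomial distribution. The function names, the uniform prior, and the mutation rate value are all hypothetical; only the transmission rates of 0.3 and 1.2 per year come from the caption.

```python
from math import comb

# Hypothetical placeholder: mutations per genome per year (NOT from the caption).
MUTATION_RATE = 8.0


def prob_differences(d, k, mutation_rate, transmission_rate):
    """P(d mutational differences | k transmission events), assumed model.

    Gamma(k, transmission_rate) waiting time combined with Poisson mutation
    gives a negative binomial distribution for d.
    """
    p = transmission_rate / (transmission_rate + mutation_rate)
    return comb(d + k - 1, d) * p**k * (1 - p) ** d


def posterior_transmissions(d, mutation_rate, transmission_rate, max_k=20):
    """Posterior P(k | d) for k = 1..max_k under an assumed uniform prior."""
    likelihoods = {
        k: prob_differences(d, k, mutation_rate, transmission_rate)
        for k in range(1, max_k + 1)
    }
    total = sum(likelihoods.values())
    return {k: value / total for k, value in likelihoods.items()}


for label, rate in (("slow", 0.3), ("rapid", 1.2)):
    posterior = posterior_transmissions(5, MUTATION_RATE, rate)
    best_k = max(posterior, key=posterior.get)
    print(label, "most probable number of transmissions:", best_k)
```

With few observed differences the posterior concentrates on a small number of transmission events, which is the pattern that would flag a pair of hosts as a candidate for recent or direct transmission.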

FigShare

Genes affected by multiple mutations among hosts A–M, Q and R.

FigShare

Large structural variation within hosts.

a Underscoring indicates contigs that were present in the host-specific reference.

FigShare

Coverage across the genome for two Hepatitis C samples sequenced directly from RNA.

FigShare

Coverage profiles of one Norovirus sample from amplicon and direct RNA sequencing.

A – Coverage across the genome for one Norovirus sample sequenced from PCR amplicons (others similar). Green and orange dotted lines mark the locations of the PCR primers used to generate the amplicons. B – Coverage across the genome for the same Norovirus sample sequenced directly from RNA.

FigShare
