Microbiome profiling by Illumina sequencing of combinatorial sequence-tagged PCR products
We developed a low-cost, high-throughput microbiome profiling method that
uses combinatorial sequence tags attached to PCR primers that amplify the rRNA
V6 region. Amplified PCR products are sequenced using an Illumina paired-end
protocol to generate millions of overlapping reads. Combinatorial sequence
tagging can be used to examine hundreds of samples with far fewer primers than
is required when sequence tags are incorporated at only a single end. The
number of reads generated permitted saturating or near-saturating analysis of
samples of the vaginal microbiome. The large number of reads allowed an
in-depth analysis of errors, and we found that PCR-induced errors composed the
vast majority of non-organism derived species variants, an observation that
has significant implications for sequence clustering of similar high-throughput
data. We show that the short reads are sufficient to assign organisms to the
genus or species level in most cases. We suggest that this method will be
useful for the deep sequencing of any short nucleotide region that is
taxonomically informative; these include the V3 and V5 regions of the bacterial
16S rRNA genes and the eukaryotic V9 region that is gaining popularity for
sampling protist diversity.
Comment: 28 pages, 13 figures
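The primer savings from combinatorial tagging described in this abstract can be illustrated with a short sketch. It is not the authors' code: the function names, the near-balanced split of tags between the two ends, and the two-letter tag sequences in the usage note are all assumptions made for illustration. The sketch counts tagged primers only and assumes that every forward/reverse tag pair identifies exactly one sample.

```python
import math
from itertools import product


def primers_required(n_samples):
    """Return (combinatorial, single_end) tagged-primer counts for n_samples.

    Assumption: with tags on both ends, f forward and r reverse tagged
    primers can label up to f * r samples; with tags on one end only,
    each sample needs its own tagged primer.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Near-balanced split: f is the ceiling of sqrt(n_samples).
    f = math.isqrt(n_samples - 1) + 1
    # r is the smallest count with f * r >= n_samples (ceiling division).
    r = -(-n_samples // f)
    return f + r, n_samples


def assign_tag_pairs(samples, forward_tags, reverse_tags):
    """Map each (forward_tag, reverse_tag) pair to one sample name."""
    if len(samples) > len(forward_tags) * len(reverse_tags):
        raise ValueError("not enough tag combinations for all samples")
    # product() enumerates every forward/reverse combination in order.
    return dict(zip(product(forward_tags, reverse_tags), samples))
```

Under these assumptions, primers_required(200) gives 29 tagged primers for the combinatorial design against 200 for single-end tagging. Calling assign_tag_pairs(["s1", "s2", "s3"], ["AC", "GT"], ["TG", "CA"]) with these hypothetical tags yields a lookup table that a demultiplexing step could use to route each read pair to its sample.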
Next-Generation Phylogeography: A Targeted Approach for Multilocus Sequencing of Non-Model Organisms
The field of phylogeography has long recognized the need for, and utility of, incorporating nuclear DNA (nDNA) sequences into analyses. However, the use of nDNA sequence data at the population level has been hindered by technical laboratory difficulty, sequencing costs, and problematic analytical methods for dealing with genotypic sequence data, especially in non-model organisms. Here, we present a method utilizing the 454 GS-FLX Titanium pyrosequencing platform with the capacity to simultaneously sequence two species of sea star (Meridiastra calcar and Parvulastra exigua) at five different nDNA loci across 16 different populations of 20 individuals each per species. We compare results from three populations with traditional Sanger sequencing-based methods, and demonstrate that this next-generation sequencing platform is more time- and cost-effective and more sensitive to rare variants than Sanger-based sequencing. A crucial advantage is that the high coverage of clonally amplified sequences simplifies haplotype determination, even in highly polymorphic species. This targeted next-generation approach can greatly increase the use of nDNA sequence loci in phylogeographic and population genetic studies by mitigating many of the time, cost, and analytical issues associated with highly polymorphic, diploid sequence markers.
The Mitochondrial Genome of the Legume Vigna radiata and the Analysis of Recombination across Short Mitochondrial Repeats
The mitochondrial genomes of seed plants are exceptionally fluid in size, structure, and sequence content, with the accumulation and activity of repetitive sequences underlying much of this variation. We report the first fully sequenced mitochondrial genome of a legume, Vigna radiata (mung bean), and show that despite its unexceptional size (401,262 nt), the genome is unusually depauperate in repetitive DNA and "promiscuous" sequences from the chloroplast and nuclear genomes. Although Vigna lacks the large, recombinationally active repeats typical of most other seed plants, a PCR survey of its modest repertoire of short (38–297 nt) repeats nevertheless revealed evidence for recombination across all of them. A set of novel control assays showed, however, that these results could instead reflect, in part or entirely, artifacts of PCR-mediated recombination. Consequently, we recommend that other methods, especially high-depth genome sequencing, be used instead of PCR to infer patterns of plant mitochondrial recombination. The average-sized but repeat- and feature-poor mitochondrial genome of Vigna makes it ever more difficult to generalize about the factors shaping the size and sequence content of plant mitochondrial genomes.
Tiny vampires in ancient seas: evidence for predation via perforation in fossils from the 780–740 million-year-old Chuar Group, Grand Canyon, USA
One explanation for the Early Neoproterozoic expansion of eukaryotes is the appearance of eukaryovorous predators, i.e. protists that preyed on other protists. Evidence for eukaryovory at this time, however, is indirect, based on inferences from character state reconstructions and molecular clocks, and on the presence of possible defensive structures in some protistan fossils. Here I describe 0.1–3.4 µm circular holes in seven species of organic-walled microfossils from the 780–740 million-year-old Chuar Group, Grand Canyon, Arizona, USA, that are similar to those formed today by predatory protists that perforate the walls of their prey to consume the contents inside. Although best known in the vampyrellid amoebae, this 'vampire-like' behaviour is widespread among eukaryotes, making it difficult to infer confidently the identity of the predator. Nonetheless, the identity of the prey is clear: some, and perhaps all, of the fossils are eukaryotes. These holes thus provide the oldest direct evidence for predation on eukaryotes. Larger circular and half-moon-shaped holes in vase-shaped microfossils from the upper part of the unit may also be the work of 'tiny vampires', suggesting a diversity of eukaryovorous predators lived in the ancient Chuar sea.
Denoising PCR-amplified metagenome data
Background: PCR amplification and high-throughput sequencing theoretically enable the characterization of the finest-scale diversity in natural microbial and viral populations, but each of these methods introduces random errors that are difficult to distinguish from genuine biological diversity. Several approaches have been proposed to denoise these data but lack either speed or accuracy. Results: We introduce a new denoising algorithm that we call DADA (Divisive Amplicon Denoising Algorithm). Without training data, DADA infers both the sample genotypes and error parameters that produced a metagenome data set. We demonstrate performance on control data sequenced on Roche’s 454 platform, and compare the results to the most accurate denoising software currently available, AmpliconNoise. Conclusions: DADA is more accurate and over an order of magnitude faster than AmpliconNoise. It eliminates the need for training data to establish error parameters, fully utilizes sequence-abundance information, and enables inclusion of context-dependent PCR error rates. It should be readily extensible to other sequencing platforms such as Illumina.