Phylogeography and conservation of Pedicularis (Orobanchaceae) in the Hengduan Mountains of SW China and Tibet
The Hengduan Mountains Region (HMR) of southwest China is a temperate biodiversity hotspot characterized by high rates of plant endemism, including approximately 300 endemic species of Pedicularis (Orobanchaceae). Intersecting processes of mountain uplift during the early Oligocene (~30 Mya), followed by monsoon intensification during the mid-Miocene (~8-10 Mya) and glacial cycles throughout the Quaternary, have contributed to high rates of speciation within this genus due to population isolation in allopatry. In order to accurately identify the major geographic barriers influencing this speciation process, genomic data (RADseq) and comparative phylogeographic methods were used to characterize the history of population divergence among six widespread species of Pedicularis. Three biogeographic regions, which are delineated by four main tributaries of the Upper Yangtze River system, were identified within the HMR. In recent decades, infrastructure development has substantially altered this mountain landscape - connecting previously isolated areas by roads and tunnels - with the potential to threaten endemic biodiversity through genetic homogenization. This inferred phylogeographic history of Pedicularis within the HMR can be used in future research to determine whether recent gene flow has occurred across historical barriers due to human influence, thus aiding plant conservation efforts in this biodiversity hotspot
A High-Quality Reference Genome for the Invasive Mosquitofish Gambusia affinis Using a Chicago Library
The western mosquitofish, Gambusia affinis, is a freshwater poecilid fish native to the southeastern United States but with a global distribution due to widespread human introduction. Gambusia affinis has been used as a model species for a broad range of evolutionary and ecological studies. We sequenced the genome of a male G. affinis to facilitate genetic studies in diverse fields including invasion biology and comparative genetics. We generated Illumina short read data from paired-end libraries and in vitro proximity-ligation libraries. We obtained 54.9× coverage, N50 contig length of 17.6 kb, and N50 scaffold length of 6.65 Mb. Compared to two other species in the Poeciliidae family, G. affinis has slightly fewer genes that have shorter total, exon, and intron length on average. Using a set of universal single-copy orthologs in fish genomes, we found 95.5% of these genes were complete in the G. affinis assembly. The number of transposable elements in the G. affinis assembly is similar to those of closely related species. The high-quality genome sequence and annotations we report will be valuable resources for scientists to map the genetic architecture of traits of interest in this species
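The contig and scaffold N50 values quoted above follow the usual definition: the length L such that sequences of length L or longer together hold at least half of the assembled bases. A minimal sketch of that calculation is below. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the G. affinis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: the two longest contigs (80 + 70) reach half of 300
```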
Adapterama IV: Sequence Capture of Dual-digest RADseq Libraries with Identifiable Duplicates (RADcap)
Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods such as restriction site associated DNA sequencing (RADseq) and sequence capture are constrained by costs associated with inefficient use of sequencing data and sample preparation, respectively. Here, we demonstrate RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. The RADcap approach uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design, and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96 to 384 Wisteria plants. Our results demonstrate that our RADcap method: (1) can methodologically reduce (to <5%) and computationally remove PCR duplicate reads from data; (2) achieves 80-90% reads-on-target in 11 of 12 enrichments; (3) returns consistent coverage (≥4x) across >90% of individuals at up to 99.9% of the targeted loci; (4) produces consistently high occupancy matrices of genotypes across hundreds of individuals; and (5) is inexpensive, with reagent and sequencing costs totaling <$6/sample and adapter and primer costs of only a few hundred dollars.
Adapterama II: universal amplicon sequencing on Illumina platforms (TaggiMatrix)
Next-generation sequencing (NGS) of amplicons is used in a wide variety of contexts. In many cases, NGS amplicon sequencing remains overly expensive and inflexible, with library preparation strategies relying upon the fusion of locus-specific primers to full-length adapter sequences with a single identifying sequence or ligating adapters onto PCR products. In Adapterama I, we presented universal stubs and primers to produce thousands of unique index combinations and a modifiable system for incorporating them into Illumina libraries. Here, we describe multiple ways to use the Adapterama system and other approaches for amplicon sequencing on Illumina instruments. In the variant we use most frequently for large-scale projects, we fuse partial adapter sequences (TruSeq or Nextera) onto the 5′ end of locus-specific PCR primers with variable-length tag sequences between the adapter and locus-specific sequences. These fusion primers can be used combinatorially to amplify samples within a 96-well plate (8 forward primers + 12 reverse primers yield 8 × 12 = 96 combinations), and the resulting amplicons can be pooled. The initial PCR products then serve as template for a second round of PCR with dual-indexed iTru or iNext primers (also used combinatorially) to make full-length libraries. The resulting quadruple-indexed amplicons have diversity at most base positions and can be pooled with any standard Illumina library for sequencing. The number of sequencing reads from the amplicon pools can be adjusted, facilitating deep sequencing when required or reducing sequencing costs per sample to an economically trivial amount when deep coverage is not needed. We demonstrate the utility and versatility of our approaches with results from six projects using different implementations of our protocols. Thus, we show that these methods facilitate amplicon library construction for Illumina instruments at reduced cost with increased flexibility.
A simple web page to design fusion primers compatible with iTru primers is available at: http://baddna.uga.edu/tools-taggi.html. A fast and easy to use program to demultiplex amplicon pools with internal indexes is available at: https://github.com/lefeverde/Mr_Demuxy
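The combinatorial use of fusion primers described in this abstract (8 forward and 12 reverse primers giving 8 × 12 = 96 unique pairs) can be sketched as a plate map. This is only an illustration: the assignment of forward primers to rows and reverse primers to columns, and the primer names `fwd_A` and `rev_1`, are hypothetical and do not come from the TaggiMatrix protocol.

```python
from itertools import product

ROWS = "ABCDEFGH"        # assumed: one forward fusion primer per plate row (8)
COLUMNS = range(1, 13)   # assumed: one reverse fusion primer per plate column (12)


def plate_layout(rows=ROWS, columns=COLUMNS):
    """Map each well name to its (forward, reverse) primer pair.

    Primer names are placeholders; every well receives a distinct pair,
    so 8 + 12 = 20 primers are enough to tag 96 samples.
    """
    return {
        f"{row}{col}": (f"fwd_{row}", f"rev_{col}")
        for row, col in product(rows, columns)
    }


layout = plate_layout()
print(len(layout))    # 96 wells
print(layout["C7"])   # ('fwd_C', 'rev_7')
```

Applying the same pairing idea again in the second PCR, with the dual-indexed primers, is what gives each amplicon four indexes in total.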
Adapterama III: Quadruple-indexed, double/triple-enzyme RADseq libraries (2RAD/3RAD)
Molecular ecologists frequently use genome reduction strategies that rely upon restriction enzyme digestion of genomic DNA to sample consistent portions of the genome from many individuals (e.g., RADseq, GBS). However, researchers often find the existing methods expensive to initiate and/or difficult to implement consistently, especially because it is difficult to multiplex sufficient numbers of samples to fill entire sequencing lanes. Here, we introduce a low-cost and highly robust approach for the construction of dual-digest RADseq libraries that build on adapters and primers designed in Adapterama I. Major features of our method include: (1) minimizing the number of processing steps; (2) focusing on a single strand of sample DNA for library construction, allowing the use of a non-phosphorylated adapter on one end; (3) ligating adapters in the presence of active restriction enzymes, thereby reducing chimeras; (4) including an optional third restriction enzyme to cut apart adapter-dimers formed by the phosphorylated adapter, thus increasing the efficiency of adapter ligation to sample DNA, which is particularly effective when only low quantity/quality DNA samples are available; (5) interchangeable adapter designs; (6) incorporating variable-length internal indexes within the adapters to increase the scope of sample indexing, facilitate pooling, and increase sequence diversity; (7) maintaining compatibility with universal dual-indexed primers and thus, Illumina sequencing reagents and libraries; and, (8) easy modification for the identification of PCR duplicates. We present eight adapter designs that work with 72 restriction enzyme combinations. We demonstrate the efficiency of our approach by comparing it with existing methods, and we validate its utility through the discovery of many variable loci in a variety of non-model organisms.
Our 2RAD/3RAD method is easy to perform, has low startup costs, has increased utility with low-concentration input DNA, and produces libraries that can be highly-multiplexed and pooled with other Illumina libraries.
Supplemental Material for Hoffberg et al., 2018
Figure S1: Comparison of the size distribution of library inserts in the Meraculous and HiRise assemblies.
Figure S2: The frequency of kmers at each kmer length.
Figure S3: The distribution of scaffold lengths in the HiRise assembly.
Figure S4: The cumulative percent of the assembly for a given scaffold size in the Meraculous and HiRise assemblies.
Table S1: A detailed list of the number of copies and percent of the assembly of transposons and repeatable elements.
File S1: Submission script for MAKER.
File S2: MAKER executable file (maker_exe.ctl).
File S3: Specifications for downstream filtering of BLAST and Exonerate alignments (maker_bopts.ctl).
File S4: Primary configuration of MAKER specific options (maker_opts.ctl).
File S5: Commands for training SNAP.
File S6: Submission script for BLAST comparing Gambusia affinis with related fish.
File S7: Submission script for BUSCO.
File S8: Submission script for predicting ncRNAs.
File S9: Illumina reads mapped to the reference in BAM format.
File S10: Sequence of tRNAs.
File S11: Structure of tRNAs.
File S12: rRNA, snRNA, snoRNA, and miRNA sequences.