A reference library for Canadian invertebrates with 1.5 million barcodes, voucher specimens, and DNA samples
The synthesis of this dataset was enabled by funding from the Canada Foundation for Innovation, from Genome Canada through Ontario Genomics, from NSERC, and from the Ontario Ministry of Research, Innovation and Science in support of the International Barcode of Life project. It was also enabled by philanthropic support from the Gordon and Betty Moore Foundation and from Ann McCain Evans and Chris Evans. The release of the data on GGBN was supported by a GGBN – Global Genome Initiative Award and we thank G. Droege, L. Loo, K. Barker, and J. Coddington for their support. Our work depended heavily on the analytical capabilities of the Barcode of Life Data Systems (BOLD, www.boldsystems.org). We also thank colleagues at the CBG for their support, including S. Adamowicz, S. Bateson, E. Berzitis, V. Breton, V. Campbell, A. Castillo, C. Christopoulos, J. Cossey, C. Gallant, J. Gleason, R. Gwiazdowski, M. Hajibabaei, R. Hanner, K. Hough, P. Janetta, A. Pawlowski, S. Pedersen, J. Robertson, D. Roes, K. Seidle, M. A. Smith, B. St. Jacques, A. Stoneham, J. Stahlhut, R. Tabone, J. Topan, S. Walker, and C. Wei. For bioblitz-related assistance, we are grateful to D. Ireland, D. Metsger, A. Guidotti, J. Quinn and other members of Bioblitz Canada and Ontario Bioblitz. For our work in Canada's national parks, we thank S. Woodley and J. Waithaka for their lead role in organizing permits and the many Parks Canada staff who facilitated specimen collections, including M. Allen, D. Amirault-Langlais, J. Bastick, C. Belanger, C. Bergman, J.-F. Bisaillon, S. Boyle, J. Bridgland, S. Butland, L. Cabrera, R. Chapman, J. Chisholm, B. Chruszcz, D. Crossland, H. Dempsey, N. Denommee, T. Dobbie, C. Drake, J. Feltham, A. Forshner, K. Forster, S. Frey, L. Gardiner, P. Giroux, T. Golumbia, D. Guedo, N. Guujaaw, S. Hairsine, E. Hansen, C. Harpur, S. Hayes, J. Hofman, S. Irwin, B. Johnston, V. Kafa, N. Kang, P. Langan, P. Lawn, M. Mahy, D. Masse, D. Mazerolle, C. McCarthy, I. McDonald, J. McIntosh, C. McKillop, V. Minelga, C. Ouimet, S. Parker, N. Perry, J. Piccin, A. Promaine, P. Roy, M. Savoie, D. Sigouin, P. Sinkins, R. Sissons, C. Smith, R. Smith, H. Stewart, G. Sundbo, D. Tate, R. Tompson, E. Tremblay, Y. Troutet, K. Tulk, J. Van Wieren, C. Vance, G. Walker, D. Whitaker, C. White, R. Wissink, C. Wong, and Y. Zharikov. For our work near Canada's ports in Vancouver, Toronto, Montreal, and Halifax, we thank R. Worcester, A. Chreston, M. Larrivee, and T. Zemlak, respectively. Many other organizations improved coverage in the reference library by providing access to specimens; they included the Canadian National Collection of Insects, Arachnids and Nematodes, Smithsonian Institution's National Museum of Natural History, the Canadian Museum of Nature, the University of Guelph Insect Collection, the Royal British Columbia Museum, the Royal Ontario Museum, the Pacific Forestry Centre, the Northern Forestry Centre, the Lyman Entomological Museum, the Churchill Northern Studies Centre, and rare Charitable Research Reserve. We also thank the many taxonomic specialists who identified specimens, including A. Borkent, B. Brown, M. Buck, C. Carr, T. Ekrem, J. Fernandez Triana, C. Guppy, K. Heller, J. Huber, L. Jacobus, J. Kjaerandsen, J. Klimaszewski, D. Lafontaine, J-F. Landry, G. Martin, A. Nicolai, D. Porco, H. Proctor, D. Quicke, J. Savage, B. C. Schmidt, M. Sharkey, A. Smith, E. Stur, A. Tomas, J. Webb, N. Woodley, and X. Zhou. We also thank K. Kerr and T. Mason for facilitating collections at Toronto Zoo and D. Iles for servicing the trap at Wapusk National Park. This paper contributes to the University of Guelph's Food from Thought research program supported by the Canada First Research Excellence Fund. The Barcode of Life Data System (BOLD; www.boldsystems.org) [8] was used as the primary workbench for creating, storing, analyzing, and validating the specimen and sequence records and the associated data resources [48]. The BOLD platform has a private, password-protected workbench for the steps from specimen data entry to data validation (see details in Data Records), and a public data portal for the release of data in various formats. The latter is accessible through an API (http://www.boldsystems.org/index.php/resources/api?type=webservices) that can also be controlled through R [75] with the package "bold" [76].
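As a rough illustration of the public data portal's API described above, the following Python sketch only builds a request URL; it does not contact the server. The base path `API_Public`, the `combined` endpoint, and the `taxon`, `geo`, and `format` parameters are assumptions drawn from the general shape of the BOLD web services, not from this record, and should be checked against the API page linked above. The helper name `build_bold_query` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base path for the public web services; verify against the API page.
BOLD_API_BASE = "http://www.boldsystems.org/index.php/API_Public"


def build_bold_query(endpoint, taxa, geo=None, fmt="tsv"):
    """Build a query URL for a (hypothetical) BOLD public endpoint.

    Multiple taxa or regions are joined with '|', which is assumed to be
    the delimiter the service expects for multi-valued parameters.
    """
    params = {"taxon": "|".join(taxa)}
    if geo:
        params["geo"] = "|".join(geo)
    params["format"] = fmt
    # Keep '|' unescaped so the multi-value delimiter stays readable.
    return f"{BOLD_API_BASE}/{endpoint}?{urlencode(params, safe='|')}"


url = build_bold_query("combined", ["Lepidoptera"], geo=["Canada"])
print(url)
```

The resulting string could be passed to any HTTP client; the R package "bold" cited above wraps comparable requests.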
CCSReads_DS-PBBC4
Sequel CCS reads for 99%, 99.9%, and 99.99% partitions of data set DS-PBBC4
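For readers unfamiliar with accuracy partitions, the sketch below converts each threshold into a Phred-scaled quality score, Q = -10 log10(1 - accuracy), and into the expected number of erroneous bases in one amplicon. It assumes that the percentages denote predicted per-base read accuracy and borrows the 658 bp amplicon length from the associated study's abstract; neither assumption is stated in this data set record.

```python
import math

AMPLICON_LENGTH_BP = 658  # assumed: COI barcode length from the related abstract


def phred_quality(accuracy):
    """Phred-scaled quality for a per-base accuracy between 0 and 1."""
    return -10 * math.log10(1 - accuracy)


def expected_errors(accuracy, length_bp=AMPLICON_LENGTH_BP):
    """Expected number of erroneous bases in a read of the given length."""
    return (1 - accuracy) * length_bp


for accuracy in (0.99, 0.999, 0.9999):
    q = phred_quality(accuracy)
    errs = expected_errors(accuracy)
    print(f"{accuracy:.2%} accuracy: Q{q:.0f}, about {errs:.3f} errors per read")
```

Under these assumptions, the three partitions correspond to roughly Q20, Q30, and Q40.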
Data from: A sequel to Sanger: amplicon sequencing that scales
Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. By examining templates from more than 5,000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL reduces costs 40-fold from Sanger analysis. Reflecting the capacity of each instrument to recover sequences from more than five million DNA extracts a year, this platform facilitates massive amplicon characterization.
A Sequel to Sanger: amplicon sequencing that scales
Abstract. Background: Although high-throughput sequencers (HTS) have largely displaced their Sanger counterparts, the short read lengths and high error rates of most platforms constrain their utility for amplicon sequencing. The present study tests the capacity of single molecule, real-time (SMRT) sequencing implemented on the SEQUEL platform to overcome these limitations, employing 658 bp amplicons of the mitochondrial cytochrome c oxidase I gene as a model system. Results: By examining templates from more than 5,000 species and 20,000 specimens, the performance of SMRT sequencing was tested with amplicons showing wide variation in GC composition and varied sequence attributes. SMRT and Sanger sequences were very similar, but SMRT sequencing provided more complete coverage, especially for amplicons with homopolymer tracts. Because it can characterize amplicon pools from 10,000 DNA extracts in a single run, the SEQUEL can greatly reduce sequencing costs in comparison to first-generation (Sanger) and second-generation (Illumina, Ion) platforms. Conclusions: SMRT analysis generates high-fidelity sequences from amplicons with varying GC content and is resilient to homopolymer tracts. Analytical costs are low, substantially less than those for first- or second-generation sequencers. When implemented on the SEQUEL platform, SMRT analysis enables massive amplicon characterization because each instrument can recover sequences from more than 5 million DNA extracts a year.
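The two throughput figures in the abstract above (10,000 DNA extracts per run and 5 million extracts per instrument per year) imply a run count that can be checked with simple arithmetic. The sketch below is only that check; it treats both figures as exact and assumes every run is filled to capacity, which the abstract does not claim. The function name `runs_needed` is hypothetical.

```python
EXTRACTS_PER_RUN = 10_000       # from the abstract: amplicon pools per run
EXTRACTS_PER_YEAR = 5_000_000   # from the abstract: per instrument, per year


def runs_needed(extracts_per_year, extracts_per_run):
    """Smallest whole number of runs that covers the yearly extract count."""
    return -(-extracts_per_year // extracts_per_run)  # ceiling division


runs = runs_needed(EXTRACTS_PER_YEAR, EXTRACTS_PER_RUN)
print(f"{runs} runs per year, or about {runs / 365:.2f} runs per day")
```

Under these assumptions, reaching 5 million extracts requires 500 full runs a year, that is, somewhat more than one run per day on a single instrument.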
Primer sets employed in amplification of the COI barcode region from Lepidoptera in ANIC.
Variation in percentage of specimens yielding a sequence versus collection year and the length of these records for Lepidoptera from ANIC.
Heat map showing the collection sites for the 41,650 specimens of Lepidoptera analyzed in this study.
The numbers indicate the count for each state or territory.
The tiers of PCR employed to recover barcode sequences from Lepidoptera specimens in ANIC.
The quality of sequences recovered from ANIC specimens in three age categories as measured by trace scores and by the number of uncertain base calls per kilobase.