Hyperoliid MYbaits-3 custom probe set
Design of the MYbaits-3 custom bait library (MYcroarray). The file contains 60,179 120-mer baits, which provide 2x tiling (one bait every 60 bp) across the 5,060 target sequences. Because the kit accommodates 60,060 probes, 119 probes were randomly dropped for the final kit design.
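The tiling and random-drop steps can be illustrated with a minimal Python sketch. It is not the vendor's design procedure: the function names, the fixed random seed, and the handling of the sequence tail (one extra end-anchored bait) are assumptions made for illustration only.

```python
import random


def tile_baits(seq, bait_len=120, step=60):
    """Return bait_len-mers starting every `step` bp along seq (2x tiling by default)."""
    if len(seq) < bait_len:
        return []
    last_start = len(seq) - bait_len
    starts = list(range(0, last_start + 1, step))
    # Assumption: if the regular tiling leaves an uncovered tail,
    # add one bait anchored to the end of the sequence.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [seq[s:s + bait_len] for s in starts]


def drop_to_capacity(baits, capacity, seed=0):
    """Randomly drop baits so that at most `capacity` remain, preserving order."""
    if len(baits) <= capacity:
        return list(baits)
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    keep = sorted(rng.sample(range(len(baits)), capacity))
    return [baits[i] for i in keep]
```

With the figures quoted above, `drop_to_capacity` applied to a list of 60,179 baits with `capacity=60060` discards 119 of them.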
Hyperolius balfouri annotated transcriptome
Whole RNA from a portion of liver sample preserved in RNA Later was extracted using the RNeasy Protect Mini Kit (Qiagen). Sequencing libraries were prepared using half reactions of the TruSeq RNA Library Preparation Kit V2 (Illumina), beginning with Poly-A selection for samples with high RIN scores (> 7.0) and Ribo-Zero Magnetic Gold (Epicentre) ribosomal RNA removal for samples with low RIN scores (< 7.0). Libraries were sequenced on an Illumina HiSeq2500 with 100 bp paired-end reads. Transcriptomic data were cleaned following Singhal (2013). Cleaned data were assembled using TRINITY (Grabherr et al. 2011) and annotated with Xenopus tropicalis (Ensembl) as a reference genome using reciprocal BLASTX (Altschul et al. 1997) and EXONERATE (Slater & Birney 2005)
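One common reading of a "reciprocal" BLAST annotation is a reciprocal-best-hit filter, sketched below in Python. This is an assumption about the general technique, not a reconstruction of the authors' pipeline: the input tuples (query, subject, bit score) and both function names are hypothetical, and no BLAST command-line options are implied.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples (hypothetical format).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Keep query/subject pairs that are each other's best hit in both searches."""
    fwd = best_hits(forward)   # e.g. transcript -> reference protein
    rev = best_hits(reverse)   # e.g. reference protein -> transcript
    return {q: s for q, s in fwd.items() if rev.get(s) == q}
```

A transcript is retained only when its best reference hit points back to that same transcript in the reverse search.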
Hyperolius riggenbachi annotated transcriptome
Whole RNA from a portion of liver sample preserved in RNA Later was extracted using the RNeasy Protect Mini Kit (Qiagen). Sequencing libraries were prepared using half reactions of the TruSeq RNA Library Preparation Kit V2 (Illumina), beginning with Poly-A selection for samples with high RIN scores (> 7.0) and Ribo-Zero Magnetic Gold (Epicentre) ribosomal RNA removal for samples with low RIN scores (< 7.0). Libraries were sequenced on an Illumina HiSeq2500 with 100 bp paired-end reads. Transcriptomic data were cleaned following Singhal (2013). Cleaned data were assembled using TRINITY (Grabherr et al. 2011) and annotated with Xenopus tropicalis (Ensembl) as a reference genome using reciprocal BLASTX (Altschul et al. 1997) and EXONERATE (Slater & Birney 2005)
Hyperoliid Orthologous Transcript Set
Marker set consisting of 1,265 orthologous transcripts (trimmed to 500-850 bp) from four species of hyperoliid frogs (5,060 total sequences). We compared annotated transcripts from the four species to identify orthologs via BLAST (Altschul et al. 1990) and removed mitochondrial loci. We kept only transcripts with GC content between 40% and 70%, because extreme GC content reduces capture efficiency for the targets (Bi et al. 2012). Orthologous transcripts with a minimum length of 500 base pairs (bp) were identified across all four samples, yielding 2,444 shared transcripts. Transcripts exceeding 850 bp were trimmed to this length for probe design; the cutoff is arbitrary and reflects a trade-off between locus length and the total number of loci included in the experiment. The orthologous transcripts were subjected to additional filtering steps before a final gene set was chosen. The first filter applied lower and upper limits on average transcript divergence, eliminating loci with low variation (< 5.0% average divergence) and exceptionally high variation (> 15.0% average divergence), which removed 266 genes. The remaining 2,178 genes were examined for repetitive elements, short repeats, and low-complexity regions, which are problematic for probe design and capture. The four transcripts per gene (8,712 sequences in total) were screened using the REPEATMASKER Web Server (Smit et al. 2015). Repetitive elements or low-complexity regions were masked in 929 sequences, and 7,783 sequences passed the filters. To be conservative, a gene was removed from the final marker set if any of its four transcripts contained masked sites, which removed an additional 468 markers.
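The length, GC, and divergence thresholds described above can be expressed as a small Python sketch. The function names are hypothetical, and several details are assumptions not stated in the text: the GC filter is applied to each of the four sequences separately, threshold boundaries are inclusive, and trimming keeps the first 850 bp.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def passes_filters(transcripts, mean_divergence, min_len=500,
                   gc_min=0.40, gc_max=0.70, div_min=0.05, div_max=0.15):
    """Check one gene, given its orthologous sequences and average divergence."""
    if any(len(seq) < min_len for seq in transcripts):
        return False
    # Assumption: every sequence must fall inside the GC window.
    if any(not (gc_min <= gc_fraction(seq) <= gc_max) for seq in transcripts):
        return False
    return div_min <= mean_divergence <= div_max


def trim(seq, max_len=850):
    """Trim a transcript to at most max_len bp (assumption: keep the 5' portion)."""
    return seq[:max_len]
```

The counts reported above are consistent with one another: 2,444 - 266 = 2,178 genes, 2,178 x 4 = 8,712 sequences, and 2,178 - 468 = 1,710 markers.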
From this reduced set of 1,710 markers, the 400 markers with the highest divergence were selected (average divergence ranging from 10.4% to 14.9%), followed by 860 markers drawn at random from the remaining subset. This marker set was supplemented with five positive controls, consisting of nuclear sequence data generated by Sanger sequencing for five loci: POMC (624 bp), RAG-1 (777 bp), TYR (573 bp), FICD (524 bp), and KIAA2013 (540 bp). The final marker set selected for probe design included 1,265 genes from four species and 5,060 individual sequences.
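The two-stage selection (highest-divergence markers first, then a random draw from the rest) can be sketched as follows. The function name, the dictionary input, and the fixed seed are illustrative assumptions; the original random draw is not reproducible from the description.

```python
import random


def select_markers(divergence, n_top=400, n_random=860, seed=0):
    """Pick the n_top most divergent markers, then n_random of the remainder.

    divergence: dict mapping marker name -> average divergence (hypothetical input).
    """
    ranked = sorted(divergence, key=divergence.get, reverse=True)
    top = ranked[:n_top]
    rest = ranked[n_top:]
    rng = random.Random(seed)  # fixed seed for a reproducible draw
    drawn = rng.sample(rest, n_random)
    return top, drawn
```

Adding the five positive controls gives 400 + 860 + 5 = 1,265 genes, and four species per gene gives 1,265 x 4 = 5,060 sequences, matching the totals above.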
Kassina decorata annotated transcriptome
Whole RNA from a portion of liver sample preserved in RNA Later was extracted using the RNeasy Protect Mini Kit (Qiagen). Sequencing libraries were prepared using half reactions of the TruSeq RNA Library Preparation Kit V2 (Illumina), beginning with Poly-A selection for samples with high RIN scores (> 7.0) and Ribo-Zero Magnetic Gold (Epicentre) ribosomal RNA removal for samples with low RIN scores (< 7.0). Libraries were sequenced on an Illumina HiSeq2500 with 100 bp paired-end reads. Transcriptomic data were cleaned following Singhal (2013). Cleaned data were assembled using TRINITY (Grabherr et al. 2011) and annotated with Xenopus tropicalis (Ensembl) as a reference genome using reciprocal BLASTX (Altschul et al. 1997) and EXONERATE (Slater & Birney 2005)
Afrixalus paradorsalis annotated transcriptome
Whole RNA from a portion of liver sample preserved in RNA Later was extracted using the RNeasy Protect Mini Kit (Qiagen). Sequencing libraries were prepared using half reactions of the TruSeq RNA Library Preparation Kit V2 (Illumina), beginning with Poly-A selection for samples with high RIN scores (> 7.0) and Ribo-Zero Magnetic Gold (Epicentre) ribosomal RNA removal for samples with low RIN scores (< 7.0). Libraries were sequenced on an Illumina HiSeq2500 with 100 bp paired-end reads. Transcriptomic data were cleaned following Singhal (2013). Cleaned data were assembled using TRINITY (Grabherr et al. 2011) and annotated with Xenopus tropicalis (Ensembl) as a reference genome using reciprocal BLASTX (Altschul et al. 1997) and EXONERATE (Slater & Birney 2005)
Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing
For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded or fragmented DNA, such as DNA damaged by formalin fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due to the conditions of preservation, the amount of tissue used for extraction, or both. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful.
The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis.
Phylogenetic inference using ND2 sequence data.
Maximum-likelihood tree inferred from the ND2 sequence alignment of the formalin-fixed sample (MVZ 214979), the Anocar2.0 reference genome (Anocar2.0), four Anolis carolinensis collected from Louisiana, USA, eight other Anolis species, and Oplurus cyclurus. A. carolinensis image printed under a CC BY license with permission of the original photographer and copyright owner J. Losos.
Patterns of mismatches in MVZ 214979 sequences.
The frequencies of the 12 types of mismatches (y-axis) are plotted as a function of distance from the 5′ and 3′ ends of the sequence reads (x-axis). The frequencies of each mismatch type are coded in different colors and line patterns. ‘After cleaning’ shows mismatch frequencies after deleting the first 50 bp from the end of each read.
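A per-position mismatch profile of the kind plotted here can be computed with a short Python sketch. It is not the authors' analysis: the input (pairs of gap-free, equal-length read and reference strings written 5′ to 3′) and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict

BASES = set("ACGT")


def mismatch_profile(pairs, max_dist=10):
    """Frequency of each reference>read mismatch type by distance from the 5' end.

    pairs: iterable of (read, ref) aligned strings without gaps (hypothetical input).
    Returns {distance: {"C>T": frequency, ...}} for distances 0..max_dist-1.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for read, ref in pairs:
        for i, (r, f) in enumerate(zip(read[:max_dist], ref[:max_dist])):
            if r not in BASES or f not in BASES:
                continue  # skip ambiguous bases such as N
            totals[i] += 1
            if r != f:
                counts[i][f"{f}>{r}"] += 1
    return {i: {k: v / totals[i] for k, v in counts[i].items()} for i in totals}
```

Passing each read and reference reversed gives the same profile measured from the 3′ end. With four bases there are 4 x 3 = 12 possible mismatch types, matching the 12 curves described in the caption.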
Nuclear coverage.
(a) Read count and depth in 10 Kbp bins along the length of the six largest A. carolinensis chromosomes (green bars) using the MVZ 214979 library. The green line indicates a read count of 100 and a coverage of 1X. (b) Read count and depth are shown in 1 Kbp bins along a randomly selected 5 Mbp segment of chromosome 1 using the MVZ 214979 library.
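The correspondence between a read count of 100 and 1X coverage in a 10 Kbp bin follows from depth = read count x read length / bin size. The Python sketch below illustrates it under the assumption of 100 bp reads, which is inferred from the caption's figures rather than stated for this library; both function names are hypothetical.

```python
from collections import Counter


def mean_depth(read_count, read_len=100, bin_size=10_000):
    """Average depth in a bin, assuming every read lies fully inside the bin."""
    return read_count * read_len / bin_size


def bin_counts(read_starts, bin_size=10_000):
    """Count reads per bin from 0-based read start positions on one chromosome."""
    return dict(Counter(pos // bin_size for pos in read_starts))
```

Under the same read-length assumption, a 1 Kbp bin as in panel (b) reaches 1X with only 10 reads.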