2 research outputs found

    DNA sonication inverse PCR for genome scale analysis of uncharacterized flanking sequences

    1. There are few available tools to comprehensively and economically identify uncharacterized flanking regions that are not extremely labour intensive and which exploit the advantages of emerging long‐read sequencing platforms.
    2. We describe SIP, a sonication‐based inverse PCR high‐throughput sequencing strategy to investigate uncharacterized flanking region sequences, including those flanking mobile DNA. SIP combines unbiased fragmentation by sonication and target enrichment by coupling outward‐facing PCR priming with long‐read sequencing technologies.
    3. We demonstrate the effectiveness of SIP by determining retroviral integrations, which are high copy and challenging to characterize. We further describe SIP's workflow, examine retroviral (proviral) enrichment and characterize the viral structural variants identified. When SIP was coupled with long‐read sequencing using the PacBio RS II platform, proviral integration was extensively characterized at high sequence depth per integration. By interrogating the sequence data, we were also able to test several intrinsic factors, including SIP's propensity to form chimeric sequences and adapter ligation efficiencies.
    4. SIP is an adaptation of a traditional molecular biology technique that can be used to characterize any unknown genomic flanking sequence or to extend any sequence for which only minimal sequence information is available. SIP can be applied broadly to study complex biological systems, such as mobile genetic elements, with high throughput.

    Herbarium genomics: plastome sequence assembly from a range of herbarium specimens using an Iterative Organelle Genome Assembly pipeline

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different Angiosperm families, 73 of which were from herbarium material with ages up to 146 years old. For 84 specimens, a sufficient number of paired-end reads were generated (in total 9.4 × 10¹² nucleotides), yielding successful plastome assemblies for 74 specimens. Those derived from herbarium specimens have lower fractions of plastome-derived reads compared with those from fresh and silica-gel-dried specimens, but total herbarium assembly lengths are only slightly shorter. Specimens from wet-tropical conditions appear to have a higher number of contigs per assembly and lower N50 values. We find no significant correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome-enrichment approaches), and can be performed with limited sample destruction.