
    Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data

    Abstract

    Background: The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large data set is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to quickly search a set of reads for near-exact text matches.

    Methods: A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC. The tools can be used by anyone without prior knowledge of Linux and without installing Linux on the computer. They permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set, and gathering counts of sequences in the reads.

    Results: Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension.

    Conclusion: The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.
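The abstract describes searching a read set for near-exact text matches and gathering counts of sequences in the reads. The following is only a minimal sketch of that idea, not the authors' tools: the function names (`find_near_matches`, `count_kmers`) and the mismatch threshold are hypothetical, and "near exact" is assumed here to mean a bounded number of substitutions. Pyrophosphate reads also carry homopolymer insertions and deletions, which this simplification ignores.

```python
from collections import Counter

# Assumption: reads use the plain alphabet A, C, G, T, N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a, b, limit):
    """Count mismatches between equal-length strings, stopping once past limit."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            count += 1
            if count > limit:
                break
    return count


def find_near_matches(reads, query, max_mismatches=1):
    """Return (read_index, offset, strand) for every near-exact hit.

    Hypothetical helper: scans each read on both strands and accepts
    windows that differ from the query by at most max_mismatches bases.
    """
    patterns = [("+", query)]
    rc = reverse_complement(query)
    if rc != query:  # avoid reporting palindromic queries twice
        patterns.append(("-", rc))

    hits = []
    for strand, pattern in patterns:
        n = len(pattern)
        for index, read in enumerate(reads):
            for offset in range(len(read) - n + 1):
                window = read[offset:offset + n]
                if mismatches(window, pattern, max_mismatches) <= max_mismatches:
                    hits.append((index, offset, strand))
    return hits


def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for offset in range(len(read) - k + 1):
            counts[read[offset:offset + k]] += 1
    return counts


if __name__ == "__main__":
    # Made-up reads, for illustration only.
    reads = ["ACGTACGTTT", "GGGAAACGTA", "TTTTTTTTTT"]
    print(find_near_matches(reads, "ACGTT", max_mismatches=0))
    print(count_kmers(reads, 3).most_common(3))
```

A linear scan like this is slow on a full sequencing run; an indexed lookup of k-mers would be the usual next step, but the scan keeps the matching rule easy to see.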