
    Split and Merge Functions for Supporting Multiple Processing Pipelines in Mercury BLASTN

    Biosequence similarity search is an important application in computational biology. Mercury BLASTN, an FPGA-based implementation of BLAST for DNA, is one of the alternatives for fast DNA sequence comparison. The re-design of BLAST into a streaming application, combined with a high-throughput hardware pipeline, has enabled Mercury BLAST to emerge as one of the fastest implementations of biosequence similarity search. This performance can be further enhanced by exploiting the data-level parallelism present within the application. Here we present a multiple FPGA-based Mercury BLASTN design to double the speed and throughput of DNA sequence computation. This paper describes a dual Mercury BLASTN design, the detailed design of the split and merge functions, and simulation results.

    Throughput-optimal systolic arrays from recurrence equations

    Many compute-bound software kernels have seen order-of-magnitude speedups on special-purpose accelerators built on specialized architectures such as field-programmable gate arrays (FPGAs). These architectures are particularly good at implementing dynamic programming algorithms that can be expressed as systems of recurrence equations, which in turn can be realized as systolic array designs. To efficiently find good realizations of an algorithm for a given hardware platform, we pursue software tools that can search the space of possible parallel array designs to optimize various design criteria. Most existing design tools in this area produce a design that is latency-space optimal. However, we instead wish to target applications that operate on a large collection of small inputs, e.g., a database of biological sequences. For such applications, overall throughput rather than latency per input is the most important measure of performance. In this work, we introduce a new procedure to optimize the throughput of a systolic array subject to resource constraints, in this case the area and bandwidth constraints of an FPGA device. We show that the throughput of an array depends on the maximum number of lattice points executed by any processor in the array, which to a close approximation is determined solely by the array’s projection vector. We describe a bounded search process to find throughput-optimal projection vectors and a tool to perform automated design space exploration, discovering a range of array designs that are optimal for inputs of different sizes. We apply our techniques to the Nussinov RNA folding algorithm to generate multiple mappings of this algorithm into systolic arrays. By combining our library of designs with run-time reconfiguration of an FPGA device to dynamically switch among them, we predict significant speedup over a single, latency-space optimal array.
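
For readers unfamiliar with the Nussinov recurrence named in the abstract above, the following is a minimal sequential sketch of the textbook base-pair maximization form. It is not the paper's systolic-array mapping; the function names, the `min_loop` parameter, and the restriction to Watson-Crick pairs are assumptions made for illustration. Each table cell `(i, j)` is one point of the kind of iteration lattice that an array design would distribute across processors.

```python
# Textbook Nussinov base-pair maximization (sequential sketch).
# Assumption: only Watson-Crick pairs count; min_loop is a hypothetical knob.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def can_pair(a, b):
    """Return True if bases a and b form an allowed pair."""
    return (a, b) in PAIRS


def nussinov(seq, min_loop=0):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases required
    between the two ends of a pair (0 reproduces the basic recurrence).
    """
    n = len(seq)
    if n == 0:
        return 0
    # table[i][j] holds the best score for the subsequence seq[i..j].
    table = [[0] * n for _ in range(n)]
    # Fill by increasing span so every dependency is already computed.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = max(table[i + 1][j], table[i][j - 1])  # i or j unpaired
            if j - i > min_loop and can_pair(seq[i], seq[j]):
                inner = table[i + 1][j - 1] if i + 1 <= j - 1 else 0
                best = max(best, inner + 1)  # i pairs with j
            for k in range(i + 1, j):  # split into two independent parts
                best = max(best, table[i][k] + table[k + 1][j])
            table[i][j] = best
    return table[0][n - 1]


if __name__ == "__main__":
    print(nussinov("GGGAAACCC"))
```

The inner loop over `k` makes this an O(n^3) computation over a triangular table, which is the kind of regular dependence structure that the abstract describes as realizable in a systolic array.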

    WOODSTOCC: Extracting Latent Parallelism from a DNA Sequence Aligner on a GPU

    An exponential increase in the speed of DNA sequencing over the past decade has driven demand for fast, space-efficient algorithms to process the resultant data. The first step in processing is alignment of many short DNA sequences, or reads, against a large reference sequence. This work presents WOODSTOCC, an implementation of short-read alignment designed for Graphics Processing Unit (GPU) architectures. WOODSTOCC translates a novel CPU implementation of gapped short-read alignment, which has guaranteed optimal and complete results, to the GPU. Our implementation combines an irregular trie search with dynamic programming to expose regularly structured parallelism. We first describe this implementation, then discuss its port to the GPU. WOODSTOCC’s GPU port exploits three generally useful techniques for extracting regular parallelism from irregular computations: dynamic thread mapping with a worklist, kernel stage decoupling, and kernel slicing. We discuss the performance impact of these techniques and suggest further opportunities for improvement.

    Comparison of dot chromosome sequences from D. melanogaster and D. virilis reveals an enrichment of DNA transposon sequences in heterochromatic domains

    BACKGROUND: Chromosome four of Drosophila melanogaster, known as the dot chromosome, is largely heterochromatic, as shown by immunofluorescent staining with antibodies to heterochromatin protein 1 (HP1) and histone H3K9me. In contrast, the absence of HP1 and H3K9me from the dot chromosome in D. virilis suggests that this region is euchromatic. D. virilis diverged from D. melanogaster 40 to 60 million years ago. RESULTS: Here we describe finished sequencing and analysis of 11 fosmids hybridizing to the dot chromosome of D. virilis (372,650 base-pairs) and seven fosmids from major euchromatic chromosome arms (273,110 base-pairs). Most genes from the dot chromosome of D. melanogaster remain on the dot chromosome in D. virilis, but many inversions have occurred. The dot chromosomes of both species are similar to the major chromosome arms in gene density and coding density, but the dot chromosome genes of both species have larger introns. The D. virilis dot chromosome fosmids have a high repeat density (22.8%), similar to homologous regions of D. melanogaster (26.5%). There are, however, major differences in the representation of repetitive elements. Remnants of DNA transposons make up only 6.3% of the D. virilis dot chromosome fosmids, but 18.4% of the homologous regions from D. melanogaster; DINE-1 and 1360 elements are particularly enriched in D. melanogaster. Euchromatic domains on the major chromosomes in both species have very few DNA transposons (less than 0.4%). CONCLUSION: Combining these results with recent findings about RNAi, we suggest that specific repetitive elements, as well as density, play a role in determining higher-order chromatin packaging.

    Undergraduate research. Genomics Education Partnership

    The Genomics Education Partnership offers an inclusive model for undergraduate research experiences incorporated into the academic-year science curriculum, with students pooling their work to contribute to international databases.