15 research outputs found

    DIDA: Distributed Indexing Dispatched Alignment

    One essential application in bioinformatics that is affected by the high-throughput sequencing data deluge is the sequence alignment problem, where nucleotide or amino acid sequences are queried against targets to find regions of close similarity. When queries are too many and/or targets are too large, the alignment process becomes computationally challenging. This is usually addressed by preprocessing techniques, where the queries and/or targets are indexed for easy access while searching for matches. When the target is static, such as in an established reference genome, the cost of indexing is amortized by reusing the generated index. However, when the targets are non-static, such as contigs in the intermediate steps of a de novo assembly process, a new index must be computed for each run. To address such scalability problems, we present DIDA, a novel framework that distributes the indexing and alignment tasks into smaller subtasks over a cluster of compute nodes. It provides a workflow beyond the common practice of embarrassingly parallel implementations. DIDA is a cost-effective, scalable and modular framework for the sequence alignment problem in terms of memory usage and runtime. It can be employed in large-scale alignments to draft genomes and intermediate stages of de novo assembly runs. The DIDA source code, sample files and user manual are available through http://www.bcgsc.ca/platform/bioinfo/software/dida. The software is released under the British Columbia Cancer Agency License (BCCA), and is free for academic use.
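The abstract's idea of splitting indexing and alignment into per-partition subtasks can be illustrated with a toy, single-process sketch. Everything here is an assumption made for illustration and is not taken from the DIDA source: the function names, the round-robin partitioning, the plain k-mer sets used as indexes, and the exact-substring search standing in for a real aligner. On a cluster, each partition's index and alignment step would run on its own node; here they run in a loop.

```python
from collections import defaultdict


def partition_targets(targets, n_parts):
    """Split target sequences round-robin into n_parts groups (hypothetical scheme)."""
    parts = [dict() for _ in range(n_parts)]
    for i, (name, seq) in enumerate(sorted(targets.items())):
        parts[i % n_parts][name] = seq
    return parts


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(part, k):
    """Index one partition as the set of k-mers found in its targets."""
    index = set()
    for seq in part.values():
        index |= kmers(seq, k)
    return index


def dispatch(reads, indexes, k):
    """Route each read to every partition that shares at least one k-mer with it."""
    routed = defaultdict(list)
    for name, seq in reads.items():
        read_kmers = kmers(seq, k)
        for p, index in enumerate(indexes):
            if read_kmers & index:
                routed[p].append(name)
    return routed


def align(read, part):
    """Stand-in aligner: exact substring search within one partition."""
    hits = []
    for tname, tseq in part.items():
        pos = tseq.find(read)
        if pos >= 0:
            hits.append((tname, pos))
    return hits


def distributed_align(targets, reads, n_parts=2, k=4):
    """Partition, index, dispatch, align per partition, then merge the hits."""
    parts = partition_targets(targets, n_parts)
    indexes = [build_index(p, k) for p in parts]
    routed = dispatch(reads, indexes, k)
    merged = defaultdict(list)
    for p, names in routed.items():
        for name in names:
            merged[name].extend(align(reads[name], parts[p]))
    return {name: sorted(hits) for name, hits in merged.items() if hits}


targets = {"t1": "ACGTACGTTT", "t2": "GGGCCCAAAT"}
reads = {"r1": "CGTT", "r2": "CCCA", "r3": "TTTTTT"}
result = distributed_align(targets, reads)
print(result)
```

Because each partition is indexed independently, the memory needed for any one index shrinks with the number of partitions, and a read is only aligned against partitions it plausibly matches.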

    Parallel Benchmarks and Comparison-Based Computing

    Non-numeric algorithms have been largely ignored in parallel benchmarking suites. Prior studies have concentrated mainly on the computational speed of processors within very regular and structured numeric codes. In this paper, we survey the current state of non-numeric benchmark algorithms and investigate the use of in-place merging as a suitable candidate for this role. In-place merging enjoys several important advantages, including the scalability of efficient memory utilization, the generality of comparison-based computing and the representativeness of near-random data access patterns. Experimental results over several families of parallel architectures are presented. A preliminary version of a portion of this paper was presented at the International Conference on Parallel Computing held in Gent, Belgium, in September, 1995. This research has been supported in part by the National Science Foundation under grant CDA--9115428 and by the Office of Naval Research under contract N00014..
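For readers unfamiliar with the operation, in-place merging combines two adjacent sorted runs using only comparisons and a constant amount of extra memory. The sketch below is a deliberately simple, quadratic-time variant chosen for clarity; it is an assumption for illustration only and is not the algorithm benchmarked in the paper, which the abstract does not specify. The function name `merge_in_place` is hypothetical.

```python
def merge_in_place(a, mid):
    """Merge sorted runs a[:mid] and a[mid:] using O(1) extra space.

    Simple shifting variant: worst-case quadratic time, stable,
    and comparison-based (only <= on elements is required).
    """
    i, j = 0, mid
    while i < j < len(a):
        if a[i] <= a[j]:
            # a[i] is already in its final position.
            i += 1
        else:
            # a[j] belongs before a[i]: shift a[i:j] right by one slot.
            value = a[j]
            k = j
            while k > i:
                a[k] = a[k - 1]
                k -= 1
            a[i] = value
            i += 1
            j += 1
    return a


data = [1, 4, 7, 2, 3, 9]
merge_in_place(data, 3)
print(data)
```

The only storage beyond the input list is a handful of indices and one temporary element, which is the memory property the abstract highlights.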

    Algorithm 821


    Work funded by the DoD High Performance Computing Modernization Program CEWES Major Shared Resource Center through

    this report are those of the author(s) and should not be construed as an official Department of Defense position, policy, or decision unless so designated by other official documentation. Using the MPE Graphics Library with Fortran9

    Scalability of different aligners using DIDA for C. elegans data.

    Y-axis indicates the runtime/memory scalability in the [0, 1] interval for different alignment tools. The scalability of each tool is shown in the standalone case and within the DIDA framework on 2, 4, 8, and 12 nodes.