Parallel String Sample Sort
We discuss how string sorting algorithms can be parallelized on modern
multi-core shared memory machines. As a synthesis of the best sequential string
sorting algorithms and successful parallel sorting algorithms for atomic
objects, we propose string sample sort. The algorithm makes effective use of
the memory hierarchy, uses additional word level parallelism, and largely
avoids branch mispredictions. Additionally, we parallelize variants of multikey
quicksort and radix sort that are also useful in certain situations.
Comment: 34 pages, 7 figures and 12 tables
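The bucket-classification idea behind sample sort can be sketched for strings as follows. This is a sequential, simplified illustration and not the authors' implementation: it compares whole strings with binary search, whereas the abstract describes a cache-aware, word-parallel classifier. The function name and the parameters num_splitters, oversample, threshold and seed are assumptions made for this example.

```python
import bisect
import random


def string_sample_sort(strings, num_splitters=3, oversample=2, threshold=16, seed=0):
    """Sequential sketch of sample sort on a list of strings (hypothetical helper)."""
    if len(strings) <= threshold:
        return sorted(strings)  # small inputs: fall back to the built-in sort
    rng = random.Random(seed)
    # Draw a random sample and pick evenly spaced splitters from it.
    sample = sorted(rng.sample(strings, min(len(strings), num_splitters * oversample)))
    step = max(1, len(sample) // (num_splitters + 1))
    splitters = sorted(set(sample[step::step][:num_splitters])) or [sample[0]]
    # Even buckets hold strings strictly between splitters;
    # odd buckets hold strings equal to a splitter (already "sorted").
    buckets = [[] for _ in range(2 * len(splitters) + 1)]
    for s in strings:
        i = bisect.bisect_left(splitters, s)
        if i < len(splitters) and splitters[i] == s:
            buckets[2 * i + 1].append(s)
        else:
            buckets[2 * i].append(s)
    out = []
    for idx, bucket in enumerate(buckets):
        if idx % 2 == 1:
            out.extend(bucket)
        else:
            out.extend(string_sample_sort(bucket, num_splitters, oversample, threshold, seed))
    return out
```

Because every splitter is itself an input string, each equality bucket is non-empty, so the recursive calls always work on strictly smaller inputs. In a parallel setting the buckets could be sorted independently, which is what makes the scheme attractive on multi-core machines.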
Engineering Parallel String Sorting
We discuss how string sorting algorithms can be parallelized on modern
multi-core shared memory machines. As a synthesis of the best sequential string
sorting algorithms and successful parallel sorting algorithms for atomic
objects, we first propose string sample sort. The algorithm makes effective use
of the memory hierarchy, uses additional word level parallelism, and largely
avoids branch mispredictions. Then we focus on NUMA architectures, and develop
parallel multiway LCP-merge and -mergesort to reduce the number of random
memory accesses to remote nodes. Additionally, we parallelize variants of
multikey quicksort and radix sort that are also useful in certain situations.
Comprehensive experiments on five current multi-core platforms are then
reported and discussed. The experiments show that our implementations scale
very well on real-world inputs and modern machines.
Comment: 46 pages, extension of "Parallel String Sample Sort" arXiv:1305.115
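To illustrate why LCP information reduces character accesses during merging, here is a minimal sketch of a binary LCP-aware merge. It is an assumption-laden simplification: the abstract describes a parallel multiway variant, while this version merges only two sorted sequences sequentially. The helper names lcp, lcp_array and lcp_merge are hypothetical.

```python
def lcp(a, b, start=0):
    """Length of the longest common prefix of a and b, scanning from `start`."""
    i = start
    n = min(len(a), len(b))
    while i < n and a[i] == b[i]:
        i += 1
    return i


def lcp_array(sorted_strings):
    """LCP of each string with its predecessor; the first entry is 0."""
    return [0] + [lcp(p, q) for p, q in zip(sorted_strings, sorted_strings[1:])]


def lcp_merge(a, lcp_a, b, lcp_b):
    """Merge two sorted string lists, using LCP values to skip comparisons."""
    out = []
    i = j = 0
    ha = hb = 0  # LCP of the current head of a / b with the last string written
    while i < len(a) and j < len(b):
        if ha > hb:
            take_a = True   # a shares a longer prefix with the last output, so a < b
        elif ha < hb:
            take_a = False  # symmetric case, b < a
        else:
            # Only here are characters inspected, starting after the shared prefix.
            h = lcp(a[i], b[j], ha)
            take_a = a[i][h:h + 1] <= b[j][h:h + 1]
            if take_a:
                hb = h
            else:
                ha = h
        if take_a:
            out.append(a[i])
            i += 1
            if i < len(a):
                ha = lcp_a[i]
        else:
            out.append(b[j])
            j += 1
            if j < len(b):
                hb = lcp_b[j]
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

When the two LCP values differ, the order is decided without touching any characters, which is the property that helps limit random accesses to string data.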
GPU-Accelerated BWT Construction for Large Collection of Short Reads
Advances in DNA sequencing technology have stimulated the development of
algorithms and tools for processing very large collections of short strings
(reads). Short-read alignment and assembly are among the most well-studied
problems. Many state-of-the-art aligners, at their core, have used the
Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome
(a typical example being the NCBI human genome). Recently, BWT has also found its use in
string-graph assembly, for indexing the reads (i.e., raw data from DNA
sequencers). In a typical data set, the volume of reads is tens of times that of the
sequenced genome and can be up to 100 Gigabases. Note that a reference genome
is relatively stable and computing the index is not a frequent task. For reads,
the index has to be computed from scratch for each given input. Efficient BWT
construction therefore becomes a much bigger concern than before. In this
paper, we present a practical method called CX1 for constructing the BWT of
very large string collections. CX1 is the first tool that can take advantage of
the parallelism given by a graphics processing unit (GPU, a relatively cheap
device providing a thousand or more primitive cores), while simultaneously using
the parallelism of a multi-core CPU and, more interestingly, of a cluster of
GPU-enabled nodes. Using CX1, the BWT of a short-read collection of up to 100
Gigabases can be constructed in less than 2 hours using a machine equipped with
a quad-core CPU and a GPU, or in about 43 minutes using a cluster with 4 such
machines (the speedup is almost linear after excluding the first 16 minutes for
loading the reads from the hard disk). The previously fastest tool, BRC, was
measured to take 12 hours to process 100 Gigabases on one machine; it is
non-trivial to parallelize BRC so that it takes advantage of a cluster of
machines, let alone GPUs.
Comment: 11 pages
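For readers unfamiliar with the BWT of a string collection, the following naive sketch shows what is being computed. It is not the CX1 method: it simply sorts all suffixes in memory, which is only feasible for tiny inputs. The function name bwt_of_reads and the use of "$" as the end marker for every read (with ties broken by read index) are assumptions made for this example.

```python
def bwt_of_reads(reads):
    """Naive BWT of a collection of reads; each read is terminated by '$'."""
    suffixes = []
    for r_idx, read in enumerate(reads):
        text = read + "$"
        for pos in range(len(text)):
            # Equal suffixes from different reads are ordered by read index.
            suffixes.append((text[pos:], r_idx, pos))
    suffixes.sort()  # '$' sorts before A, C, G, T in ASCII
    out = []
    for _, r_idx, pos in suffixes:
        text = reads[r_idx] + "$"
        # The BWT character precedes the suffix; position 0 wraps around to '$'.
        out.append(text[pos - 1])
    return "".join(out)
```

The quadratic memory use of materialised suffixes is exactly what practical tools avoid; the sketch is meant only to pin down the definition.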
GaKCo: a Fast GApped k-mer string Kernel using COunting
String Kernel (SK) techniques, especially those using gapped $k$-mers as
features (gk), have obtained great success in classifying sequences like DNA,
protein, and text. However, the state-of-the-art gk-SK runs extremely slowly when
we increase the dictionary size ($\Sigma$) or allow more mismatches ($M$). This
is because current gk-SK uses a trie-based algorithm to calculate the co-occurrence
of mismatched substrings, resulting in a time cost proportional to
$O(\Sigma^{M})$. We propose a fast algorithm for calculating the
Gapped $k$-mer Kernel using Counting
(GaKCo). GaKCo uses associative arrays to calculate the co-occurrence of
substrings using cumulative counting. This algorithm is fast, scalable to
larger $\Sigma$ and $M$, and naturally parallelizable. We provide a rigorous
asymptotic analysis that compares GaKCo with the state-of-the-art gk-SK.
Theoretically, the time cost of GaKCo is independent of the $\Sigma^{M}$ term
that slows down the trie-based approach. Experimentally, we observe that GaKCo
achieves the same accuracy as the state-of-the-art and outperforms it in speed by
factors of 2, 100, and 4, on classifying sequences of DNA (5 datasets), protein
(12 datasets), and character-based English text (2 datasets), respectively.
GaKCo is shared as an open source tool at
https://github.com/QData/GaKCo-SVM
Comment: @ECML 201
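As a rough illustration of computing a gapped k-mer kernel with associative arrays, the sketch below counts gapped features in dictionaries and takes their dot product. It is a direct feature-counting version written for clarity, not GaKCo's cumulative-counting algorithm, and it does not reproduce the tool's API. The function names and the parameters g (window length) and k (number of informative positions, so that g - k positions are gaps) are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, g, k):
    """Count gapped k-mer features: k kept positions inside each length-g window."""
    counts = Counter()
    for start in range(len(seq) - g + 1):
        window = seq[start:start + g]
        for kept in combinations(range(g), k):
            # A feature is the set of kept positions plus the characters found there.
            counts[(kept, tuple(window[i] for i in kept))] += 1
    return counts


def gapped_kmer_kernel(x, y, g, k):
    """Kernel value: dot product of the two gapped k-mer count vectors."""
    cx = gapped_kmer_counts(x, g, k)
    cy = gapped_kmer_counts(y, g, k)
    if len(cx) > len(cy):
        cx, cy = cy, cx  # iterate over the smaller dictionary
    return sum(c * cy[f] for f, c in cx.items())
```

For example, with g = 3 and k = 2 the sequences "ACG" and "ACT" share only the feature that keeps the first two positions ("AC"), so the kernel value is 1. The cost depends on the number of windows and position subsets, with no enumeration of the alphabet.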
CHR Grammars
A grammar formalism based upon CHR is proposed analogously to the way
Definite Clause Grammars are defined and implemented on top of Prolog. These
grammars execute as robust bottom-up parsers with an inherent treatment of
ambiguity and a high flexibility to model various linguistic phenomena. The
formalism extends previous logic programming based grammars with a form of
context-sensitive rules and the possibility to include extra-grammatical
hypotheses in both head and body of grammar rules. Among the applications are
straightforward implementations of Assumption Grammars and abduction under
integrity constraints for language analysis. CHR grammars appear as a powerful
tool for specification and implementation of language processors and may be
proposed as a new standard for bottom-up grammars in logic programming.
To appear in Theory and Practice of Logic Programming (TPLP), 2005
Comment: 36 pp. To appear in TPLP, 200
