Hybridizing Non-dominated Sorting Algorithms: Divide-and-Conquer Meets Best Order Sort
Many production-grade algorithms benefit from combining an asymptotically
efficient algorithm for solving big problem instances, by splitting them into
smaller ones, and an asymptotically inefficient algorithm with a very small
implementation constant for solving small subproblems. A well-known example is
stable sorting, where mergesort is often combined with insertion sort to
achieve a constant but noticeable speed-up.
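
The mergesort and insertion sort combination mentioned above can be sketched as follows. This is a minimal illustration, not code from the paper; the names `hybrid_merge_sort` and `insertion_sort` and the `CUTOFF` value of 16 are assumptions chosen for the example.

```python
CUTOFF = 16  # assumed threshold; production libraries tune this empirically


def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place. Quadratic, stable, very low overhead."""
    for i in range(lo + 1, hi):
        item = a[i]
        j = i - 1
        # Shift strictly larger elements right; equal ones stay put (stability).
        while j >= lo and a[j] > item:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = item


def hybrid_merge_sort(a, lo=0, hi=None):
    """Stable sort of a[lo:hi]: mergesort on large ranges, insertion sort on small ones."""
    if hi is None:
        hi = len(a)
    if hi - lo <= CUTOFF:
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(a, lo, mid)
    hybrid_merge_sort(a, mid, hi)
    left, right = a[lo:mid], a[mid:hi]
    i = j = 0
    k = lo
    while i < len(left) and j < len(right):
        # "<=" prefers the left half on ties, which keeps the merge stable.
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # At most one of the two halves still has elements left.
    rest = left[i:] + right[j:]
    a[k:k + len(rest)] = rest
```

The asymptotic cost stays at O(n log n) because insertion sort only ever runs on ranges of bounded size; the cutoff changes the constant factor, not the growth rate.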
We apply this idea to non-dominated sorting. Namely, we combine the
divide-and-conquer algorithm, which has the currently best known asymptotic
runtime of $O(N (\log N)^{M - 1})$, with the Best Order Sort algorithm, which
has the runtime of $O(N^2 M)$ but demonstrates the best practical performance
out of quadratic algorithms.
Empirical evaluation shows that the hybrid's running time is typically not
worse than that of either original algorithm, while for large numbers of points
it outperforms them by at least 20%. For smaller numbers of objectives, the
speedup can be as large as four times.
Comment: A two-page abstract of this paper will appear in the proceedings
companion of the 2017 Genetic and Evolutionary Computation Conference (GECCO
2017).
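
For readers unfamiliar with the problem, the sketch below states what non-dominated sorting computes. It is a deliberately naive reference, assuming minimization of every objective; it is neither the divide-and-conquer algorithm nor Best Order Sort, and the function names are hypothetical.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def non_dominated_ranks(points):
    """Return the rank (front index) of each point; rank 0 is the Pareto front.

    Naive peeling: repeatedly extract the points that no remaining point
    dominates. Much slower than the algorithms discussed in the abstract,
    but easy to check by hand.
    """
    ranks = [None] * len(points)
    remaining = set(range(len(points)))
    rank = 0
    while remaining:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        ]
        for i in front:
            ranks[i] = rank
        remaining.difference_update(front)
        rank += 1
    return ranks
```

A reference like this is mainly useful for validating a faster implementation on small random inputs.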
Combinatorial and Asymptotical Results on the Neighborhood Grid
In 2009, Joselli et al. introduced the Neighborhood Grid data structure for
fast computation of neighborhood estimates in point clouds. Even though the
data structure has been used in several applications and shown to be
practically relevant, it is theoretically not yet well understood. The purpose
of this paper is to present a polynomial-time algorithm to build the data
structure. Furthermore, it is investigated whether the presented algorithm is
optimal. This investigation leads to several combinatorial questions for which
partial results are given. Finally, we present several limits and experiments
regarding the quality of the obtained neighborhood relation.
Comment: 33 pages, 18 figures
GPU-Accelerated BWT Construction for Large Collection of Short Reads
Advances in DNA sequencing technology have stimulated the development of
algorithms and tools for processing very large collections of short strings
(reads). Short-read alignment and assembly are among the most well-studied
problems. Many state-of-the-art aligners, at their core, have used the
Burrows-Wheeler transform (BWT) as a main-memory index of a reference genome
(a typical example is the NCBI human genome). Recently, BWT has also found its
use in string-graph assembly, for indexing the reads (i.e., raw data from DNA
sequencers). In a typical data set, the volume of reads is tens of times that of the
sequenced genome and can be up to 100 Gigabases. Note that a reference genome
is relatively stable and computing the index is not a frequent task. For reads,
the index has to be computed from scratch for each given input. Efficient BWT
construction therefore becomes a much bigger concern than before. In this
paper, we present a practical method called CX1 for constructing the BWT of
very large string collections. CX1 is the first tool that can take advantage of
the parallelism given by a graphics processing unit (GPU, a relatively cheap
device providing a thousand or more primitive cores), together with the
parallelism of a multi-core CPU and, more interestingly, of a cluster of
GPU-enabled nodes. Using CX1, the BWT of a short-read collection of up to 100
Gigabases can be constructed in less than 2 hours using a machine equipped with
a quad-core CPU and a GPU, or in about 43 minutes using a cluster with 4 such
machines (the speedup is almost linear after excluding the first 16 minutes for
loading the reads from the hard disk). The previously fastest tool, BRC, was
measured to take 12 hours to process 100 Gigabases on one machine; it is
non-trivial to parallelize BRC to take advantage of a cluster of
machines, let alone GPUs.
Comment: 11 pages
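
To make the object being constructed concrete, here is a naive sketch of the BWT of a read collection. It is not CX1 and uses no GPU; it sorts every suffix explicitly, so it is only suitable for toy inputs. The function name and the tie-breaking convention (a single `$` terminator per read, with identical suffixes ordered by read index) are assumptions for illustration.

```python
def bwt_of_collection(reads):
    """Naive BWT of a set of reads.

    Each read gets a '$' terminator, which sorts before A, C, G, T in ASCII.
    All suffixes of all reads are sorted; identical suffixes are ordered by
    read index, which mimics distinct terminators $_0 < $_1 < ...
    The BWT is the character preceding each suffix in that sorted order.
    """
    texts = [read + "$" for read in reads]
    suffixes = []
    for r, text in enumerate(texts):
        for pos in range(len(text)):
            suffixes.append((text[pos:], r, pos))
    suffixes.sort()
    # For pos == 0, text[-1] wraps around to the read's own terminator.
    return "".join(texts[r][pos - 1] for _, r, pos in suffixes)


print(bwt_of_collection(["banana"]))  # prints annb$aa
```

Materializing every suffix costs memory quadratic in the read length, which is exactly what practical tools avoid at the 100-Gigabase scale described above.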
Fast construction of FM-index for long sequence reads
Summary: We present a new method to incrementally construct the FM-index for
both short and long sequence reads, up to the size of a genome. It is the first
algorithm that can build the index while implicitly sorting the sequences in
the reverse (complement) lexicographical order without a separate sorting step.
The implementation is among the fastest for indexing short reads and the only
one that works in practice for reads averaging kilobases in length.
Availability and implementation: https://github.com/lh3/ropebwt2
Contact: [email protected]
Comment: 2 pages
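
As background on what an FM-index provides once built, the sketch below counts pattern occurrences with backward search. It builds the index statically from a single string by sorting suffixes, which is not the incremental construction the summary describes and is unrelated to the ropebwt2 code; all names are hypothetical, and the text is assumed not to contain `$`.

```python
def build_fm_index(text):
    """Build a toy FM-index (C table and full Occ table) for text + '$'."""
    text += "$"
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)


def count_occurrences(index, pattern):
    """Count occurrences of pattern via backward search over the index."""
    C, occ, n = index
    lo, hi = 0, n  # half-open range of suffix-array rows matching the suffix so far
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_fm_index("banana")
print(count_occurrences(index, "ana"))  # prints 2
```

Real implementations replace the full Occ table with sampled or compressed rank structures; the full table here just keeps the search logic easy to follow.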