
    Sorting by Block Moves

    The research in this thesis focuses on the problem of Block Sorting, which has applications in Computational Biology and in Optical Character Recognition (OCR). A block in a permutation is a maximal sequence of consecutive elements that are also consecutive in the identity permutation. BLOCK SORTING is the process of transforming an arbitrary permutation into the identity permutation through a sequence of block moves. Given an arbitrary permutation π and an integer m, the Block Sorting Problem, that is, the problem of deciding whether the transformation can be accomplished in at most m block moves, has been shown to be NP-hard. After being known to be 3-approximable for over a decade, block sorting has been researched extensively, and there are now several 2-approximation algorithms for it. This work introduces new structures on a permutation, called runs and ordered pairs, and uses them to develop two new approximation algorithms. Both new algorithms are 2-approximation algorithms, matching the current best approximation ratio. This work also includes an analysis of both algorithms that establishes the 2-approximation bound.
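
    The block definition above can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names `blocks` and `block_move` are hypothetical, and the sketch assumes that a block move relocates one whole block to a new position among the remaining blocks. It does not implement either approximation algorithm.

```python
def blocks(perm):
    """Split perm into maximal runs whose values increase by exactly 1."""
    result = []
    for value in perm:
        if result and value == result[-1][-1] + 1:
            # value continues the current block
            result[-1].append(value)
        else:
            # value starts a new block
            result.append([value])
    return result


def block_move(perm, index, target):
    """Assumed move: take block number `index` and reinsert it as block
    number `target` among the remaining blocks (both 0-based)."""
    parts = blocks(perm)
    moved = parts.pop(index)
    parts.insert(target, moved)
    return [value for part in parts for value in part]


perm = [3, 4, 1, 2, 5]
print(blocks(perm))            # [[3, 4], [1, 2], [5]]
print(block_move(perm, 1, 0))  # [1, 2, 3, 4, 5]
```

    In this example the permutation 3 4 1 2 5 has three blocks, and a single block move sorts it.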

    Importance of Explicit Vectorization for CPU and GPU Software Performance

    Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed in addition, are less frequently discussed. In this paper, we present an analysis of several optimizations applied to both central processing unit (CPU) and GPU implementations of a particular computationally intensive Metropolis Monte Carlo algorithm. Explicit vectorization on the CPU and its GPU equivalent, explicit memory coalescing, are found to be critical to achieving good performance of this algorithm in both environments. The fully optimized CPU version achieves a 9x to 12x speedup over the original CPU version, in addition to the speedup from multi-threading. The fully optimized CPU version is 2x faster than the fully optimized GPU version. Comment: 17 pages, 17 figures