3 research outputs found

    Lightweight Massively Parallel Suffix Array Construction

    The suffix array lists the suffixes of a string in lexicographic order, with each suffix represented by its starting position in the input string. It is a fundamental data structure with applications in string processing, text indexing, data compression, computational biology, and many other areas. Over the last three decades, researchers have proposed a broad spectrum of suffix array construction algorithms (SACAs). However, the majority of SACAs were implemented using sequential and parallel programming models. The maturity of GPU programming opened the door to massively parallel GPU SACAs that outperform the fastest suffix sorting algorithms optimized for parallel CPU computing. Over the last five years, several GPU SACA approaches were proposed and implemented, but they prioritized running time over lightweight design. In this thesis, we design and implement a lightweight massively parallel SACA on the GPU using the prefix-doubling technique. Our prefix-doubling implementation is memory-efficient and can successfully construct the suffix array for input strings as large as 640 megabytes (MB) on a Tesla P100 GPU. On large datasets, our implementation achieves a speedup of 7-16x over libdivsufsort, the fastest, highly optimized, OpenMP-accelerated suffix array constructor, which leverages CPU shared-memory parallelism. The performance of our algorithm relies on several high-performance parallel primitives: radix sort, conditional filtering, inclusive prefix sum, random memory scattering, and segmented sort. We evaluate our implementation on a variety of real-world datasets with respect to runtime, throughput, memory usage, and scalability. We compare our results against libdivsufsort running on a Haswell compute node equipped with 24 cores. Our GPU SACA is simple and compact, consisting of fewer than 300 lines of readable and effective source code. Additionally, we design and implement a fast and lightweight algorithm for checking the correctness of the suffix array.
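    To make the prefix-doubling idea concrete, here is a minimal sequential sketch in Python. It is not the thesis's GPU implementation: the function names `suffix_array_prefix_doubling` and `is_suffix_array` are hypothetical, Python's built-in sort stands in for the parallel radix and segmented sorts, and the checker is a naive quadratic comparison rather than the lightweight algorithm the abstract mentions.

```python
def suffix_array_prefix_doubling(s: str) -> list[int]:
    """Sequential prefix-doubling sketch (illustrative, not the GPU version)."""
    n = len(s)
    if n == 0:
        return []
    # rank[i] orders suffix i by its first h characters; start with h = 1.
    rank = [ord(c) for c in s]
    sa = list(range(n))
    h = 1
    while True:
        # Combining the rank at i with the rank at i + h orders suffixes
        # by their first 2h characters; -1 marks "past the end of the string".
        def key(i: int) -> tuple[int, int]:
            return (rank[i], rank[i + h] if i + h < n else -1)

        sa.sort(key=key)  # stand-in for a parallel sort primitive

        # Re-rank: a running sum over "differs from predecessor" flags,
        # which plays the role of the inclusive prefix sum.
        new_rank = [0] * n
        for j in range(1, n):
            differs = key(sa[j]) != key(sa[j - 1])
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (1 if differs else 0)
        rank = new_rank

        # Once all ranks are distinct, every suffix is fully ordered.
        if rank[sa[-1]] == n - 1:
            return sa
        h *= 2


def is_suffix_array(s: str, sa: list[int]) -> bool:
    """Naive check: sa is a permutation and the suffixes strictly increase."""
    n = len(s)
    if sorted(sa) != list(range(n)):
        return False
    return all(s[sa[j - 1]:] < s[sa[j]:] for j in range(1, n))


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array_prefix_doubling(text)
    print(sa, is_suffix_array(text, sa))
```

    Each round doubles the compared prefix length, so at most about log2(n) rounds are needed. The per-round sort and re-ranking are the steps a GPU version would presumably hand to parallel primitives.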

    Reconfigurations of Combinatorial Problems: Graph Colouring and Hamiltonian Cycle

    We explore algorithmic aspects of two well-known combinatorial problems, Graph Colouring and Hamiltonian Cycle, by examining properties of their solution spaces. One can model the set of solutions of a combinatorial problem $P$ by the solution graph $R(P)$, whose vertices are the solutions of $P$ and in which two vertices are joined by an edge when the corresponding solutions satisfy an adjacency reconfiguration rule. For example, we can define the reconfiguration rule for graph colouring so that two solutions are adjacent when they differ in the colour of exactly one vertex. Exploring the properties of the solution graph $R(P)$ gives rise to interesting questions, the most prominent of which is the connectivity of $R(P)$. This is reasonable, since the main motivation for modelling combinatorial solutions as a graph is to be able to transform one solution into another in a stepwise fashion, by following paths between solutions in the graph. Connectivity questions can be expressed as decision problems with a 'yes' or 'no' answer. For example, given two specific solutions, is there a path between them? Is the solution graph $R(P)$ connected? In this thesis, we first show that the diameter of the solution graph $R_l(G)$ of vertex $l$-colourings of $k$-colourable chordal and chordal bipartite graphs $G$ is $O(n^2)$, where $l > k$ and $n$ is the number of vertices of $G$. Then, we formulate a decision problem on the connectivity of the graph colouring solution graph, in which extra colours may be used to create a path between two colourings that otherwise have no path between them. We give some results for general instances, and we also explore which kinds of graphs make it challenging to determine the complexity of the problem for general instances. Finally, we give a linear-time algorithm that decides whether there is a path between two solutions of the Hamiltonian Cycle Problem for graphs of maximum degree five, thus providing insights towards the complexity classification of the decision problem.
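    The single-vertex recolouring rule described above can be illustrated with a small brute-force sketch in Python. It is only an illustration under stated assumptions, not an algorithm from the thesis: the helper names `is_proper`, `neighbours`, and `recolouring_distance` are hypothetical, and the search explores the solution graph explicitly, which is feasible only for tiny graphs.

```python
from collections import deque

Colouring = tuple[int, ...]
Edge = tuple[int, int]


def is_proper(c: Colouring, edges: list[Edge]) -> bool:
    """A colouring is proper when every edge joins two different colours."""
    return all(c[u] != c[v] for u, v in edges)


def neighbours(c: Colouring, edges: list[Edge], l: int):
    """Yield the proper l-colourings that differ from c in exactly one vertex."""
    for v in range(len(c)):
        for colour in range(l):
            if colour != c[v]:
                d = c[:v] + (colour,) + c[v + 1:]
                if is_proper(d, edges):
                    yield d


def recolouring_distance(source: Colouring, target: Colouring,
                         edges: list[Edge], l: int):
    """Breadth-first search in the solution graph; None means no path."""
    if not (is_proper(source, edges) and is_proper(target, edges)):
        raise ValueError("both endpoints must be proper colourings")
    dist = {source: 0}
    queue = deque([source])
    while queue:
        c = queue.popleft()
        if c == target:
            return dist[c]
        for d in neighbours(c, edges, l):
            if d not in dist:
                dist[d] = dist[c] + 1
                queue.append(d)
    return None


if __name__ == "__main__":
    single_edge = [(0, 1)]  # two vertices joined by one edge
    print(recolouring_distance((0, 1), (1, 0), single_edge, 2))
    print(recolouring_distance((0, 1), (1, 0), single_edge, 3))
```

    On a single edge with two colours, the colourings (0, 1) and (1, 0) are not connected, so the search returns None. Allowing a third colour yields a path of three recolouring steps, a toy instance of using extra colours to connect two colourings.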