337 research outputs found

Analysing the Performance of GPU Hash Tables for State Space Exploration

Author: Cassee Nathan
Wijs Anton
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2017

In the past few years, General Purpose Graphics Processors (GPUs) have been used to significantly speed up numerous applications. One of the areas in which GPUs have recently led to a significant speed-up is model checking. In model checking, state spaces, i.e., large directed graphs, are explored to verify whether models satisfy desirable properties. GPUexplore is a GPU-based model checker that uses a hash table to efficiently keep track of already explored states. As a large number of states is discovered and stored during such an exploration, the hash table should be able to quickly handle many inserts and queries concurrently. In this paper, we experimentally compare two different hash tables optimised for the GPU, one being the GPUexplore hash table, and the other using Cuckoo hashing. We compare the performance of both hash tables using random and non-random data obtained from model checking experiments, to analyse the applicability of the two hash tables for state space exploration. We conclude that Cuckoo hashing is three times faster than GPUexplore hashing for random data, and that Cuckoo hashing is five to nine times faster for non-random data. This suggests great potential to further speed up GPUexplore in the near future. Comment: In Proceedings GaM 2017, arXiv:1712.0834
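
To make the Cuckoo hashing scheme named in this abstract concrete, here is a minimal sequential sketch in Python. It is not the GPU implementation evaluated in the paper: the class name `CuckooSet`, the salted use of Python's built-in `hash`, the eviction limit `max_kicks`, and the doubling rebuild are all assumptions chosen for illustration.

```python
class CuckooSet:
    """Sequential sketch of a cuckoo hash set with two tables (illustrative only)."""

    def __init__(self, capacity=16, max_kicks=32):
        self.capacity = capacity
        self.max_kicks = max_kicks  # eviction limit before rebuilding (assumed value)
        self.tables = [[None] * capacity, [None] * capacity]

    def _slot(self, which, key):
        # Two different hash functions, obtained by salting Python's hash (assumption).
        return hash((which, key)) % self.capacity

    def contains(self, key):
        # A key can only live in one of its two candidate slots.
        return any(self.tables[t][self._slot(t, key)] == key for t in (0, 1))

    def insert(self, key):
        """Insert key (must not be None); return False if it was already stored."""
        if self.contains(key):
            return False
        which = 0
        for _ in range(self.max_kicks):
            i = self._slot(which, key)
            # Place the key and pick up whatever occupied the slot before.
            key, self.tables[which][i] = self.tables[which][i], key
            if key is None:
                return True
            which = 1 - which  # the evicted key tries its slot in the other table
        self._rebuild(key)
        return True

    def _rebuild(self, pending):
        # Too many evictions: double the capacity and reinsert every key.
        old = [k for t in self.tables for k in t if k is not None] + [pending]
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        for k in old:
            self.insert(k)
```

In a state space exploration setting, `insert` returning `False` would correspond to a state that has already been visited, so a lookup touches at most two slots regardless of how full the table is.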

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

GPU processing of sketches

Author: Brouwer M.R.
Publication date: 15/04/2020

Pure OAI Repository

Fast $k$-NNG construction with GPU-based quick multi-select

Author: D'Souza Roshan
Dashti Ali
Komarov Ivan
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 21/09/2013

In this paper we describe a new brute force algorithm for building the k-Nearest Neighbor Graph (k-NNG). The k-NNG algorithm has many applications in areas such as machine learning, bio-informatics, and clustering analysis. While there are very efficient algorithms for data of low dimensions, for high dimensional data the brute force search is the best algorithm. There are two main parts to the algorithm: the first part is finding the distances between the input vectors, which may be formulated as a matrix multiplication problem. The second is the selection of the k-NNs for each of the query vectors. For the second part, we describe a novel graphics processing unit (GPU)-based multi-select algorithm based on quick sort. Our optimization makes clever use of warp voting functions available on the latest GPUs along with user-controlled cache. Benchmarks show significant improvement over state-of-the-art implementations of the k-NN search on GPUs.
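
The selection step described in this abstract can be sketched sequentially with a quicksort-style partition. The Python below is only an illustration of selecting the k smallest distances per query row: it does not use warp voting or any GPU feature, and the function names `k_smallest` and `multi_select` are hypothetical.

```python
import random


def k_smallest(values, k, rng=random):
    """Return the k smallest entries of values (in no particular order) via quickselect."""
    items = list(values)
    result = []
    while k > 0 and items:
        pivot = rng.choice(items)
        less = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        greater = [v for v in items if v > pivot]
        if k < len(less):
            items = less  # the answer lies entirely in the smaller part
        elif k <= len(less) + len(equal):
            result += less + equal[:k - len(less)]
            k = 0
        else:
            result += less + equal  # keep these and continue in the larger part
            k -= len(less) + len(equal)
            items = greater
    return result


def multi_select(distance_rows, k):
    """For each row of distances, return the k nearest (distance, index) pairs, sorted."""
    return [
        sorted(k_smallest([(d, j) for j, d in enumerate(row)], k))
        for row in distance_rows
    ]
```

Unlike a full sort, each partition step discards the part of the row that cannot contain the answer, which is the property a multi-select over many query rows relies on.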

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

GPU LSM: A Dynamic Dictionary Data Structure for the GPU

Author: Amenta Nina
Ashkiani Saman
Farach-Colton Martin
Li Shengren
Owens John D.
Publication date: 01/01/2017

We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and deletions, based on the Log Structured Merge tree (LSM). Our implementation on an NVIDIA K40c GPU has an average update (insertion or deletion) rate of 225 M elements/s, 13.5x faster than merging items into a sorted array. The GPU LSM supports the retrieval operations of lookup, count, and range query with an average rate of 75 M, 32 M and 23 M queries/s respectively. The trade-off for the dynamic updates is that the sorted array is almost twice as fast on retrievals. We believe that our GPU LSM is the first dynamic general-purpose dictionary data structure for the GPU. Comment: 11 pages, accepted to appear in the Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS'18)
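
The following Python sketch illustrates the general idea of an LSM-style dictionary built from sorted runs that are merged level by level. It is a sequential toy, not the GPU LSM of the paper: the class name `TinyLSM`, the `TOMBSTONE` deletion marker, and the binary-counter merging policy are assumptions based on how LSM structures are commonly described, and the abstract does not confirm any of them.

```python
from bisect import bisect_left
from heapq import merge

TOMBSTONE = object()  # value marking a deleted key (assumed convention)


class TinyLSM:
    """Sequential sketch of an LSM-style dictionary (illustrative only)."""

    def __init__(self):
        # levels[i] is None (empty) or a (keys, values) pair sorted by key;
        # lower levels hold more recent updates.
        self.levels = []

    def update(self, batch):
        """Apply a list of (key, value) updates; pass TOMBSTONE as value to delete."""
        # Reverse before the stable sort so the latest update of a key comes first.
        run = sorted(reversed(batch), key=lambda kv: kv[0])
        i = 0
        # Like incrementing a binary counter: merge into full levels and carry upward.
        while i < len(self.levels) and self.levels[i] is not None:
            older = zip(*self.levels[i])
            # heapq.merge is stable, so on equal keys the newer run stays in front.
            run = list(merge(run, older, key=lambda kv: kv[0]))
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        self.levels[i] = ([k for k, _ in run], [v for _, v in run])

    def lookup(self, key, default=None):
        # Search from the newest level to the oldest; the first hit wins.
        for level in self.levels:
            if level is None:
                continue
            keys, values = level
            j = bisect_left(keys, key)
            if j < len(keys) and keys[j] == key:
                return default if values[j] is TOMBSTONE else values[j]
        return default
```

Updates only ever sort and merge whole runs, which is why they are cheap relative to inserting into one large sorted array, while a lookup may have to binary-search several levels. This sketch never purges stale entries or tombstones, which a practical structure would need to do.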

arXiv.org e-Print Archive

eScholarship - University of California

Efficient Computation of K-Nearest Neighbor Graphs for Large High-Dimensional Data Sets on GPU Clusters

Author: Dashti Ali
Publication venue: UWM Digital Commons
Publication date: 01/08/2013

The k-Nearest Neighbor Graph (k-NNG) and the related k-Nearest Neighbor (k-NN) methods have a wide variety of applications in areas such as bioinformatics, machine learning, data mining, clustering analysis, and pattern recognition. Our application of interest is manifold embedding. Due to the large dimensionality of the input data (<15k), spatial subdivision based techniques such as OBBs, k-d trees, and BSP are not viable. The only alternative is the brute-force search, which has two distinct parts. The first finds distances between individual vectors in the corpus based on a pre-defined metric. Given the distance matrix, the second step selects the k nearest neighbors for each member of the query data set. This thesis presents the development and implementation of a distributed exact k-Nearest Neighbor Graph (k-NNG) construction method. The proposed method uses Graphics Processing Units (GPUs) and exploits multiple levels of parallelism for distributed computational systems using GPUs. It is scalable for different cluster sizes, with each compute node in the cluster containing multiple GPUs. The distance computation is formulated as a basic matrix multiplication and reduction operation. The optimized CUBLAS matrix multiplication library is used for this purpose. Various distance metrics such as Euclidean, cosine, and Pearson are supported. For k-NNG construction, two different methods are presented. The first is based on an approach called batch index sorting to build the k-NNG with three sorting operations. This method uses the optimized radix sort implementation in the Thrust library for GPU. The second is an efficient implementation, using the latest GPU functionalities, of a variant of the quick select algorithm. Overall, the batch index sorting based k-NNG method is approximately 13x faster than a distributed MATLAB implementation. The quick select algorithm itself has a 5x speedup over state-of-the-art GPU methods.
This has enabled the processing of k-NNG construction on a data set containing 20 million image vectors, each with dimension 15,000, as part of a manifold embedding technique for analyzing the conformations of biomolecules.
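
The formulation of the distance step as a matrix multiplication plus a reduction can be illustrated for the squared Euclidean metric, using the identity ||q - c||^2 = ||q||^2 + ||c||^2 - 2 q·c. The pure-Python sketch below stands in for the CUBLAS call mentioned in the abstract. The function names `gram`, `squared_distances`, and `knn_graph` are hypothetical, and the neighbor selection simply sorts each row rather than using batch index sorting or quick select.

```python
def gram(queries, corpus):
    """Matrix product Q * C^T as nested lists (stand-in for a GEMM library call)."""
    return [[sum(a * b for a, b in zip(q, c)) for c in corpus] for q in queries]


def squared_distances(queries, corpus):
    """Squared Euclidean distances from row norms and the Gram matrix."""
    q_norms = [sum(a * a for a in q) for q in queries]
    c_norms = [sum(b * b for b in c) for c in corpus]
    g = gram(queries, corpus)
    return [
        [q_norms[i] + c_norms[j] - 2.0 * g[i][j] for j in range(len(corpus))]
        for i in range(len(queries))
    ]


def knn_graph(points, k):
    """Exact brute-force k-NNG: neighbor indices for every point, nearest first."""
    dist = squared_distances(points, points)
    graph = []
    for i, row in enumerate(dist):
        # Exclude the point itself, then keep the k closest candidates.
        neighbours = sorted((d, j) for j, d in enumerate(row) if j != i)[:k]
        graph.append([j for _, j in neighbours])
    return graph
```

For example, `knn_graph([(0, 0), (1, 0), (0, 3), (5, 5)], 2)` yields `[[1, 2], [0, 2], [0, 1], [2, 1]]`. The only expensive operation is the Gram matrix, which is why handing it to an optimized matrix multiplication routine dominates the speed-up.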

University of Wisconsin-Milwaukee

Optimum Algorithms for a Model of Direct Chaining

Author: Chen Wen-Chin
Vitter Jeffrey Scott
Publication venue: Society for Industrial and Applied Mathematics
Publication date: 01/05/1985

Direct chaining is a popular and efficient class of hashing algorithms. In this paper we study optimum algorithms among direct chaining methods, under the restrictions that the records in the hash table are not moved after they are inserted, that for each chain the relative ordering of the records in the chain does not change after more insertions, and that only one link field is used per table slot. The varied-insertion coalesced hashing method (VICH), which is proposed and analyzed in [CV84], is conjectured to be optimum among all direct chaining algorithms in this class. We give strong evidence in favor of the conjecture by showing that VICH is optimum under fairly general conditions.

KU ScholarWorks

DACHash: A Dynamic, Cache-Aware and Concurrent Hash Table on GPUs

Author: Zhou Hao
Publication venue: eGrove
Publication date: 01/01/2021

GPU acceleration of hash tables in high-volume transaction applications such as computational geometry and bio-informatics is emerging. Recently, several hash table designs have been proposed on GPUs, but our analysis shows that they still do not adequately factor in several important aspects of a GPU's execution environment, leaving large room for further optimization.

eGrove (Univ. of Mississippi)