118 research outputs found

GPU LSM: A Dynamic Dictionary Data Structure for the GPU

Author: Amenta Nina
Ashkiani Saman
Farach-Colton Martin
Li Shengren
Owens John D.
Publication date: 01/01/2018

We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and deletions, based on the Log Structured Merge tree (LSM). Our implementation on an NVIDIA K40c GPU has an average update (insertion or deletion) rate of 225 M elements/s, 13.5x faster than merging items into a sorted array. The GPU LSM supports the retrieval operations of lookup, count, and range query with average rates of 75 M, 32 M, and 23 M queries/s, respectively. The trade-off for the dynamic updates is that the sorted array is almost twice as fast on retrievals. We believe that our GPU LSM is the first dynamic general-purpose dictionary data structure for the GPU. Comment: 11 pages, accepted to appear in the Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS'18)
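
The abstract does not spell out the update mechanism, so the following is only a generic, CPU-side sketch of the LSM idea it builds on, not the authors' GPU implementation. It assumes fixed-size update batches, levels that are either empty or hold `batch_size * 2**i` sorted entries, and deletions recorded as tombstones. The class name `TinyLSM` and its methods are hypothetical.

```python
from bisect import bisect_left
from heapq import merge


class TinyLSM:
    """Illustrative LSM dictionary (hypothetical names, CPU only)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.levels = []  # levels[i]: None, or a sorted run of batch_size * 2**i entries
        self.clock = 0    # increases with every update, so newer entries can be told apart

    def update(self, batch):
        """Apply one full batch of (key, is_insert) pairs; is_insert=False is a tombstone."""
        if len(batch) != self.batch_size:
            raise ValueError("this sketch only accepts full batches")
        run = []
        for key, is_insert in batch:
            self.clock += 1
            # Negating the clock makes the newest entry for a key sort first.
            run.append((key, -self.clock, is_insert))
        run.sort()
        # Carry the run upward like a binary counter: merge with each full level.
        i = 0
        while i < len(self.levels) and self.levels[i] is not None:
            run = list(merge(run, self.levels[i]))
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        self.levels[i] = run

    def lookup(self, key):
        """Return True if the newest entry for key is an insertion."""
        for run in self.levels:  # lower levels hold newer data
            if run is None:
                continue
            pos = bisect_left(run, (key,))
            if pos < len(run) and run[pos][0] == key:
                return run[pos][2]
        return False
```

Each lookup binary-searches every non-empty level, which is one way to see why a single sorted array can answer retrievals faster while being slower to update.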

arXiv.org e-Print Archive

eScholarship - University of California

Optimum Algorithms for a Model of Direct Chaining

Author: Chen Wen-Chin
Vitter Jeffrey Scott
Publication venue: Society for Industrial and Applied Mathematics
Publication date: 01/05/1985

Direct chaining is a popular and efficient class of hashing algorithms. In this paper we study optimum algorithms among direct chaining methods, under the restrictions that the records in the hash table are not moved after they are inserted, that for each chain the relative ordering of the records in the chain does not change after more insertions, and that only one link field is used per table slot. The varied-insertion coalesced hashing method (VICH), which is proposed and analyzed in [CV84], is conjectured to be optimum among all direct chaining algorithms in this class. We give strong evidence in favor of the conjecture by showing that VICH is optimum under fairly general conditions

KU ScholarWorks

Analysis of Early-Insertion Standard Coalesced Hashing

Author: Chen Wen-Chin
Vitter Jeffrey Scott
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 16/03/2011

This paper analyzes the early-insertion standard coalesced hashing method (EISCH), which is a variant of the standard coalesced hashing algorithm (SCH) described in [Knu73], [Vit80] and [Vit82b]. The analysis answers the open problem posed in [Vit80]. The number of probes per successful search in full tables is 5% better with EISCH than with SCH
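
As a rough illustration of the difference between the two variants, the sketch below follows the usual textbook description of coalesced hashing without a cellar: a colliding record takes the highest-numbered empty slot, and is linked either at the end of its chain (SCH) or immediately after its hash address (EISCH). This is an assumption-laden sketch, not the paper's analysis. The class `CoalescedTable`, its `early_insertion` flag, and the `probes` helper are hypothetical names, and keys are assumed to be distinct and not `None`.

```python
class CoalescedTable:
    """Illustrative coalesced hash table (hypothetical names, distinct keys only)."""

    def __init__(self, size, early_insertion, hash_fn=hash):
        self.keys = [None] * size
        self.link = [None] * size   # link[i]: next slot in the chain, or None
        self.free = size - 1        # empty-slot pointer; it only ever moves downward
        self.early = early_insertion
        self.hash_fn = hash_fn

    def insert(self, key):
        home = self.hash_fn(key) % len(self.keys)
        if self.keys[home] is None:
            self.keys[home] = key
            return
        # Collision: find the highest-numbered empty slot.
        while self.free >= 0 and self.keys[self.free] is not None:
            self.free -= 1
        if self.free < 0:
            raise OverflowError("table is full")
        slot = self.free
        self.keys[slot] = key
        if self.early:
            # EISCH: splice the new record in right after the hash address.
            self.link[slot] = self.link[home]
            self.link[home] = slot
        else:
            # SCH: append the new record at the end of the chain.
            tail = home
            while self.link[tail] is not None:
                tail = self.link[tail]
            self.link[tail] = slot

    def probes(self, key):
        """Slots inspected by a successful search; assumes key is in the table."""
        slot = self.hash_fn(key) % len(self.keys)
        count = 1
        while self.keys[slot] != key:
            slot = self.link[slot]
            count += 1
        return count
```

With the identity hash on a five-slot table, inserting 0, 5 and 10 produces the chain 0, 5, 10 under SCH and 0, 10, 5 under EISCH, so the most recently inserted key is found sooner with early insertion.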

KU ScholarWorks

Meerkat: A framework for Dynamic Graph Algorithms on GPUs

Author: Cheramangalath Unnikrishnan
Concessao Kevin Jude
Dev MJ Ricky
Nasre Rupesh
Publication date: 02/06/2023

Graph algorithms are challenging to implement due to their varying topology and irregular access patterns. Real-world graphs are dynamic in nature and routinely undergo edge and vertex additions, as well as deletions. Typical examples of dynamic graphs are social networks, collaboration networks, and road networks. Applying static algorithms repeatedly on dynamic graphs is inefficient. Unfortunately, we know little about how to efficiently process dynamic graphs on massively parallel architectures such as GPUs. Existing approaches to represent and process dynamic graphs are either not general or inefficient. In this work, we propose a library-based framework for dynamic graph algorithms that proposes a GPU-tailored graph representation and exploits the warp-cooperative execution model. The library, named Meerkat, builds upon a recently proposed dynamic graph representation on GPUs. This representation exploits a hashtable-based mechanism to store a vertex's neighborhood. Meerkat also enables fast iteration through a group of vertices, such as the whole set of vertices or the neighbors of a vertex. Based on the efficient iterative patterns encoded in Meerkat, we implement dynamic versions of the popular graph algorithms such as breadth-first search, single-source shortest paths, triangle counting, weakly connected components, and PageRank. Compared to the state-of-the-art dynamic graph analytics framework Hornet, Meerkat is 12.6×, 12.94×, and 6.1× faster, for query, insert, and delete operations, respectively. Using a variety of real-world graphs, we observe that Meerkat significantly improves the efficiency of the underlying dynamic graph algorithm. Meerkat performs 1.17× for BFS, 1.32× for SSSP, 1.74× for PageRank, and 6.08× for WCC, better than Hornet on average

arXiv.org e-Print Archive

Data structures for set manipulation- hash table, 1986


The most important issue addressed in this thesis is the efficient implementation of hash table methods. There are credential trade-offs in a desired implementation. These are discussed in issues such as hash addressing, collision handling, hash table layout, and bucket overflow problems. The criterion of a good hash function is that it provides an even distribution. Collision is the major problem in hash table methods. Two major hash table methods are discussed. The Open Addressing Method places synonymous items somewhere within the table. The Chaining Method, however, chains all synonyms and stores them outside the table in an overflow area. The hash table is widely used by system software as an ideal data structure. Hash table applications can be found in compilers' symbol tables, databases, directories of file organizations, as well as in problem-solving application programs
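
The two collision-handling methods named in the abstract can be contrasted with a minimal sketch. It is not taken from the thesis: linear probing is assumed as the open addressing rule, the overflow area is modelled as a separate list of `[key, next]` cells, and both function names are hypothetical.

```python
def insert_open_addressing(table, key):
    """Open addressing (linear probing assumed): a synonym stays inside the table."""
    n = len(table)
    home = hash(key) % n
    for step in range(n):
        slot = (home + step) % n
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise OverflowError("table is full")


def insert_chaining(table, overflow, key):
    """Chaining: a synonym is stored in a separate overflow area and linked in.

    table[i] is None or a [key, next] cell; next indexes into overflow (or is None).
    """
    home = hash(key) % len(table)
    if table[home] is None:
        table[home] = [key, None]
        return
    cell = table[home]
    # Walk the chain until the key or the last cell is reached.
    while cell[0] != key and cell[1] is not None:
        cell = overflow[cell[1]]
    if cell[0] != key:
        overflow.append([key, None])
        cell[1] = len(overflow) - 1
```

For the small integer keys 1, 5 and 9 in a four-slot table, all three hash to slot 1. Open addressing spreads them over slots 1, 2 and 3, whereas chaining keeps only key 1 in the table and places 5 and 9 in the overflow list.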

DigitalCommons@Robert W. Woodruff Library

Scalable Hash Tables

Author: Maier Tobias
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 19/07/2022

The term scalability with regard to this dissertation has two meanings: It means taking the best possible advantage of the provided resources (both computational and memory resources) and it also means scaling data structures in the literal sense, i.e., growing the capacity, by “rescaling” the table. Scaling well to computational resources implies constructing the fastest, best performing algorithms and data structures. On today’s many-core machines the best performance is immediately associated with parallelism. Since CPU frequencies stopped growing about 10-15 years ago, parallelism is the only way to take advantage of growing computational resources. But for data structures in general and hash tables in particular, performance is not only linked to faster computations. Most execution time is actually spent waiting for memory. Thus optimizing data structures to reduce the number of memory accesses or to take better advantage of the memory hierarchy, especially through predictable access patterns and prefetching, is just as important. In terms of scaling the size of hash tables we have identified three domains where scaling hash-based data structures has been lacking previously, i.e., space efficient growing, concurrent hash tables, and Approximate Membership Query data structures (AMQ-filter). Throughout this dissertation, we describe the problems in these areas and develop efficient solutions. We highlight three different libraries that we have developed over the course of this dissertation, each containing multiple implementations that have shown throughout our testing to be among the best implementations in their respective domains. In this composition they offer a comprehensive toolbox that can be used to solve many kinds of hashing related problems or to develop individual solutions for further ones. DySECT is a library for space efficient hash tables, specifically growing space efficient hash tables that scale with their input size. It contains the namesake DySECT data structure in addition to a number of different probing and cuckoo based implementations. Growt is a library for highly efficient concurrent hash tables. It contains a very fast base table and a number of extensions to adapt this table to match any purpose. All extensions can be combined to create a variety of different interfaces. In our extensive experimental evaluation, each adaptation has shown to be among the best hash tables for their specific purpose. Lpqfilter is a library for concurrent approximate membership query (AMQ) data structures. It contains some original data structures, like the linear probing quotient filter, as well as some novel approaches to dynamically sized quotient filters

KITopen

Internal hashing for dynamic and static tables

Author: Cook Curtis R.
Lewis Ted G.
Publication venue: Oregon State University. Department of Computer Science

This tutorial discusses one of the oldest problems in computing: how to search and retrieve keyed information from a list in the least amount of time. Hashing - a technique that mathematically converts a key into a storage address - is one of the best methods of finding and retrieving information associated with a unique identifying key. We briefly survey techniques which have evolved over the past 25 years and then introduce more recent research results for extremely compact and fast methods based on perfect and minimal perfect hashing. Perfect and minimal perfect hashing are useful for rapid lookup in a static table such as keywords in a compiler, spelling checkers, and database management systems. The results presented here show techniques for constructing long lists which can be searched in one memory reference. KEYWORDS AND PHRASES: Key-to-address transformation, hash coding, hash table, scatter table, bucket hashing, perfect hashing, minimal perfect hashing
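
To make the idea of a minimal perfect hash for a static table concrete, here is a small brute-force sketch. It is not one of the construction techniques from the tutorial: it simply tries seeds for a salted BLAKE2b hash until every key of a fixed set lands in a distinct slot of a table with exactly as many slots as keys. The function names, the seed search, and the `max_seed` limit are all assumptions made for illustration, and the approach is only practical for very small key sets.

```python
import hashlib


def slot_of(key, seed, table_size):
    """Map a string key to a slot using BLAKE2b salted with the seed."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % table_size


def find_minimal_perfect_seed(keys, max_seed=100_000):
    """Return the first seed that maps the keys one-to-one onto range(len(keys))."""
    n = len(keys)
    for seed in range(max_seed):
        if len({slot_of(k, seed, n) for k in keys}) == n:
            return seed
    raise ValueError("no suitable seed found below max_seed")


def build_table(keys):
    """Build a static table in which each key is found with a single probe."""
    seed = find_minimal_perfect_seed(keys)
    table = [None] * len(keys)
    for k in keys:
        table[slot_of(k, seed, len(keys))] = k
    return seed, table
```

Once `build_table` has run, a membership test is a single comparison, `table[slot_of(word, seed, len(table))] == word`, which corresponds to the one-memory-reference lookup the abstract describes.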

ScholarsArchive@OSU

Simulating Uniform Hashing in Constant Time and Optimal Space

Author: Pagh Rasmus
Östlin Anna
Publication venue: Aarhus University Library
Publication date: 05/06/2002

Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. In this paper it is shown how to implement hash functions that can be evaluated on a RAM in constant time, and behave like truly random functions on any set of n inputs, with high probability. The space needed to represent a function is O(n) words, which is the best possible (and a polynomial improvement compared to previous fast hash functions). As a consequence, a broad class of hashing schemes can be implemented to meet, with high probability, the performance guarantees of their uniform hashing analysis

Tidsskrift.dk (Det Kongelige Bibliotek)