4,257 research outputs found

    Data structure animation tutorial

    This study is an animation tutorial for people who want to learn data structures. The emphasis is placed on vivid animations that help learners understand data structure algorithms easily. The implementations addressed include: stack (Array-Based Stack, Linked Stack), queue (Array-Based Queue), list (Circular Linked List, Double Linked List, Linear Linked List), sort (Quick Sort, Merge Sort, Bubble Sort, Shell Sort, Insertion Sort, Heap Sort, Radix Sort, Selection Sort), heap (Priority Queue, Heap Build, Heap Sort), recursion (Tower of Hanoi), hashing (Open Hashing, Closed Hashing), binary search (Loop, Recursive), and tree (2-3 Tree, Huffman Tree, Binary Search Tree, Balanced Tree). Conclusions are formulated in terms of further work to be accomplished in order to better support understanding of the completed algorithms.

    Cache-Oblivious Peeling of Random Hypergraphs

    The computation of a peeling order in a randomly generated hypergraph is the most time-consuming step in a number of constructions, such as perfect hashing schemes, random $r$-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires $O(\mathrm{sort}(n))$ I/Os and $O(n \log n)$ time to peel a random hypergraph with $n$ edges. We experimentally evaluate the performance of our implementation of this algorithm in a real-world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of $7.6$ billion keys in less than $21$ hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.
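
For orientation, the "straightforward linear time algorithm" mentioned in this abstract can be sketched as below. This is a minimal in-memory illustration of the textbook peeling procedure, not the cache-oblivious algorithm the paper proposes; the function name `peel` and the representation of edges as tuples of vertex ids are assumptions made for this example.

```python
from collections import defaultdict


def peel(edges):
    """Return a peeling order as a list of (edge_index, vertex) pairs,
    or None if the hypergraph cannot be fully peeled.

    `edges` is a list of tuples of vertex ids (a hypothetical
    representation chosen for this sketch).
    """
    # Map each vertex to the set of edges still incident to it.
    incident = defaultdict(set)
    for e, verts in enumerate(edges):
        for v in verts:
            incident[v].add(e)

    # Start from every vertex that currently has degree one.
    stack = [v for v, es in incident.items() if len(es) == 1]
    order = []
    while stack:
        v = stack.pop()
        if len(incident[v]) != 1:
            continue  # degree changed since v was pushed
        (e,) = incident[v]
        order.append((e, v))  # edge e is peeled via vertex v
        # Remove e from all its vertices; new degree-one vertices
        # become candidates for peeling.
        for u in edges[e]:
            incident[u].discard(e)
            if len(incident[u]) == 1:
                stack.append(u)

    return order if len(order) == len(edges) else None


if __name__ == "__main__":
    print(peel([(0, 1, 2), (1, 2, 3), (2, 3, 4)]))
```

Each step looks up an arbitrary vertex's incidence set, so memory is accessed in a scattered pattern; that is a plausible reading of the poor I/O behaviour the abstract attributes to this approach once the hypergraph no longer fits in internal memory.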

    Fast and Powerful Hashing using Tabulation

    Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees. Simple tabulation hashing dates back to Zobrist [1970]. Keys are viewed as consisting of $c$ characters and we have precomputed character tables $h_1,\dots,h_c$ mapping characters to random hash values. A key $x=(x_1,\dots,x_c)$ is hashed to $h_1[x_1] \oplus h_2[x_2] \oplus \cdots \oplus h_c[x_c]$. This scheme is very fast with character tables in cache. While simple tabulation is not even 4-independent, it does provide many of the guarantees that are normally obtained via higher independence, e.g., linear probing and Cuckoo hashing. Next we consider twisted tabulation where one input character is "twisted" in a simple way. The resulting hash function has powerful distributional properties: Chernoff-Hoeffding type tail bounds and a very small bias for min-wise hashing. This also yields an extremely fast pseudo-random number generator that is provably good for many classic randomized algorithms and data-structures. Finally, we consider double tabulation where we compose two simple tabulation functions, applying one to the output of the other, and show that this yields very high independence in the classic framework of Carter and Wegman [1977]. In fact, w.h.p., for a given set of size proportional to that of the space consumed, double tabulation gives fully-random hashing. We also mention some more elaborate tabulation schemes getting near-optimal independence for given time and space. While these tabulation schemes are all easy to implement and use, their analysis is not.
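
The simple tabulation scheme described in this abstract can be illustrated with a short sketch. It assumes integer keys split into `c` characters of 8 bits each and 32-bit hash values; the helper names `make_tables` and `simple_tabulation`, and the use of Python's `random` module to fill the tables, are choices made for this example rather than anything taken from the paper.

```python
import random


def make_tables(c, char_bits=8, hash_bits=32, seed=None):
    """Build c character tables, each mapping every possible
    character value to a random hash_bits-bit value."""
    rng = random.Random(seed)
    return [
        [rng.getrandbits(hash_bits) for _ in range(1 << char_bits)]
        for _ in range(c)
    ]


def simple_tabulation(key, tables, char_bits=8):
    """Hash an integer key by XOR-ing one table entry per character.

    The key is read as len(tables) characters of char_bits bits each,
    least significant character first.
    """
    mask = (1 << char_bits) - 1
    h = 0
    for table in tables:
        h ^= table[key & mask]  # look up the current character
        key >>= char_bits       # move on to the next character
    return h


if __name__ == "__main__":
    tables = make_tables(c=4, seed=1)
    print(hex(simple_tabulation(0xDEADBEEF, tables)))
```

With two characters, the four keys $(a,b)$, $(a,b')$, $(a',b)$, $(a',b')$ always have hash values that XOR to zero, because every table entry appears exactly twice. This is one concrete way to see the abstract's remark that simple tabulation is not 4-independent.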