83 research outputs found

    A Robust Scheme for Multilevel Extendible Hashing

    Dynamic hashing, while surpassing other access methods for uniformly distributed data, usually performs badly for non-uniformly distributed data. We propose a robust scheme for multilevel extendible hashing that allows efficient processing of skewed data as well as uniformly distributed data. To test our access method, we implemented it and compared it to several existing hashing schemes. The results of the experimental evaluation demonstrate the superiority of our approach in both index size and performance.

    A fast retrieval method for local or distributed data

    In this paper, we propose an improvement to a data retrieval approach that requires only one access to a bucket hash table or file. The idea is to let the system assign one digit to the record key so that the new hashed record key is "forced" to fall in a bucket according to some practical criteria. From the user's point of view, this forced hash procedure can be thought of as a “user-system cooperating code assignment”, since the user is free to code an object to be retrieved but the system may append a digit to that code. For one-access retrieval, the new key-digit code is used to find the record's address. However, should the digit not be known, the retrieval process will find the key in its surroundings, provided it exists. This approach needs no bucket overflow area of any kind, since the method allows a high load factor in practical use. If the hash table is nearly full, a simple procedure can be run to extend the table size, either keeping the original digits or assigning new ones. For distributed data sets, this methodology shows appealing performance in both real-life and simulation results. Track: Concurrent Programming. Red de Universidades con Carreras en Informática (RedUNCI)
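
The forced-hash idea in the abstract above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class name `ForcedHashTable`, the use of CRC-32 as the hash function, the "first bucket with free space" rule standing in for the paper's unspecified "practical criteria", and the fallback that simply tries all ten digits when the digit is unknown.

```python
import zlib

DIGITS = "0123456789"


def bucket_of(code: str, num_buckets: int) -> int:
    """Map a full code (user key plus system digit) to a bucket index."""
    return zlib.crc32(code.encode("utf-8")) % num_buckets


class ForcedHashTable:
    """Hypothetical table where the system appends one digit to each key."""

    def __init__(self, num_buckets: int = 8, bucket_capacity: int = 4):
        self.buckets = [[] for _ in range(num_buckets)]
        self.capacity = bucket_capacity

    def insert(self, key: str, value) -> str:
        """Append the first digit that sends the code to a bucket with room.

        Returns the full code (key + digit) that the user should keep.
        """
        for digit in DIGITS:
            code = key + digit
            bucket = self.buckets[bucket_of(code, len(self.buckets))]
            if len(bucket) < self.capacity:
                bucket.append((code, value))
                return code
        # Assumed behaviour: the caller would extend the table at this point.
        raise OverflowError("no digit reaches a bucket with free space")

    def lookup(self, code: str):
        """One bucket access when the full code (with digit) is known."""
        bucket = self.buckets[bucket_of(code, len(self.buckets))]
        for stored_code, value in bucket:
            if stored_code == code:
                return value
        return None

    def lookup_without_digit(self, key: str):
        """Fallback when the digit is unknown: probe each candidate digit."""
        for digit in DIGITS:
            value = self.lookup(key + digit)
            if value is not None:
                return value
        return None
```

With the full code, `lookup` touches a single bucket; without the digit, the sketch probes up to ten buckets, which is only one possible reading of finding the key "in its surroundings".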

    Early Grouping Gets the Skew

    We propose a new algorithm for external grouping with large results. Our approach handles skewed data gracefully and considerably lowers the amount of random I/O on disk. Contrary to existing grouping algorithms, our new algorithm does not require the optimizer to employ complicated or error-prone procedures for adjusting parameters prior to query plan execution. We implemented several variants of our algorithm as well as the most commonly used algorithms for grouping and carried out extensive experiments on both synthetic and real data. The results of these experiments reveal the dominance of our approach. For heavily skewed data, we outperform the other algorithms by a factor of two.

    Analysis and Comparison of Extendible Hashing and B+ Trees Access Methods

    This thesis discusses and evaluates both extendible hashing and B+ trees. The study includes a design and implementation under the UNIX system. Comparisons and analysis are made using empirical results. Computing and Information Science

    Study of Two Competing Index Mechanisms: Prefix B+-tree and Trie Structures

    This thesis deals with two competing index mechanisms, namely prefix B+-trees and trie structures, which are useful for handling variable-size keys in document retrieval systems. Refinements and variants of these two indexing methods are studied. Tradeoffs between storage requirements and retrieval time, and between performance benefits and maintenance difficulties, are examined for the various refinements. Computing and Information Science

    Prefix Recoding: a Front-end Compression Technique for Simple Prefix B-trees

    This study examines the effect of recoding common prefixes of shortest separators, thus extending the alphabet and compressing both the sequence set and the simple prefix B-tree index. The purpose of the study is to investigate the effect on a simple prefix B-tree of recoding prefixes with a shorter symbol that maintains collating sequence order. Computing and Information Science
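
For readers unfamiliar with the "shortest separators" mentioned above, the following sketch shows how a simple prefix B-tree typically derives one from two adjacent keys. The function name `shortest_separator` is hypothetical. The recoding step studied in the thesis (replacing common prefixes with shorter symbols) is not specified in the abstract and is not shown here.

```python
def shortest_separator(left: str, right: str) -> str:
    """Return the shortest prefix s of `right` with left < s <= right.

    `left` is the last key of one leaf block and `right` is the first key
    of the next block; the index only needs to store s, not a full key.
    """
    if not left < right:
        raise ValueError("keys must be distinct and in ascending order")
    i = 0
    # Skip the common prefix. Because left < right, `right` cannot run
    # out before `left` does, so right[i] is always a valid index here.
    while i < len(left) and left[i] == right[i]:
        i += 1
    # One character past the common prefix is enough to separate the keys.
    return right[: i + 1]
```

For example, the keys "computer" and "computing" share the prefix "comput", so the separator stored in the index would be "computi" rather than either full key.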

    Round-Hashing for Data Storage: Distributed Servers and External-Memory Tables

    This paper proposes round-hashing, which is suitable for data storage on distributed servers and for implementing external-memory tables in which each lookup retrieves at most a single block of external memory, using a stash. For data storage, round-hashing resembles consistent hashing in that it avoids a full rehashing of the keys when new servers are added. Experiments show that requests are served at least ten times faster than with the state of the art. In distributed data storage, this guarantees better throughput for serving requests and, moreover, greatly reduces the time needed to decide which data should move to new servers, as rescanning data is much faster.
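
The property the abstract borrows from consistent hashing, namely that adding a server does not force a full rehash, can be seen in a small sketch. Note that this is plain consistent hashing on a ring, not round-hashing itself, whose details the abstract does not give. The class name `Ring`, the use of MD5 to place points, and the `replicas` parameter are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Place a label on the ring using the first 8 bytes of its MD5 digest."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Hypothetical consistent-hashing ring with virtual nodes per server."""

    def __init__(self, servers, replicas: int = 64):
        self.replicas = replicas
        self.points = []  # sorted list of (ring position, server name)
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        """Insert `replicas` virtual nodes for the new server."""
        for r in range(self.replicas):
            bisect.insort(self.points, (_point(f"{server}#{r}"), server))

    def server_for(self, key: str) -> str:
        """Return the server owning the first ring point at or after the key."""
        idx = bisect.bisect_left(self.points, (_point(key),))
        return self.points[idx % len(self.points)][1]
```

When a server is added to such a ring, a key either stays where it was or moves to the new server; keys never move between the existing servers, which is why no full rehash is needed.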

    The Ubiquitous B-tree: Volume II

    Major developments relating to the B-tree from early 1979 through the fall of 1986 are presented. This updates the well-known article, The Ubiquitous B-Tree by Douglas Comer (Computing Surveys, June 1979). After a basic overview of B and B+ trees, recent research is cited as well as descriptions of nine B-tree variants developed since Comer's article. The advantages and disadvantages of each variant over the basic B-tree are emphasized. Also included are a discussion of concurrency control issues in B-trees and a speculation on the future of B-trees.

    Graphical image persistence and code generation for object oriented databases

    Attached is the detailed description of the design and implementation of graphical image persistence and code generation for object-oriented databases. Graphical image persistence is incorporated into a graphics editor called OODINI. OODINI creates and manipulates graphical schemas for object-oriented databases. The graphical image on secondary storage is then translated into an abstract, generic code for dual model databases. This abstract code, DAL, can then be converted into different dual model database languages. We provide an example by generating code for the VODAK Data Modeling language. It is also possible to generate a different abstract language code, OODAL, from a graphical schema. This language does not have any dual model database architectural dependencies.