2 research outputs found

    Performance Of Load Balancing Techniques For Join Operations In Shared-Nothing Database Management Systems

    We investigate various load balancing approaches for hash-based join techniques popular in multicomputer-based shared-nothing database systems. When the tuples are not uniformly distributed among the hash buckets, redistribution of these buckets among the processors is necessary to maintain good system performance. Two recent load balancing techniques which rely on sampling and incremental balancing, respectively, have been shown to be more robust than conventional methods. The comparison of these two approaches, however, has not been investigated. In this study, we improve these two schemes and implement them along with a conventional method and a standard join technique which does not do load balancing on an nCUBE/2 parallel computer to compare their performance. Our experimental results indicate that the sampling technique is the better approach. To further evaluate the performance of these techniques under diverse hardware conditions, we also develop a cost model and implement a simulator to perform sensitivity analyses with respect to various hardware parameters. The simulation results show that both sampling and incremental techniques provide noticeable savings over conventional methods, with the sampling approach being more scalable in supporting very large database systems. © 1999 Academic Press
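To make the bucket-redistribution problem in this abstract concrete, here is a minimal sketch of why skewed hash buckets hurt a shared-nothing join and how reassigning buckets can help. It is not the sampling or incremental technique the paper evaluates; it uses a generic greedy "largest bucket to least-loaded processor" heuristic as a stand-in. The function names and the bucket sizes in the usage note are hypothetical.

```python
import heapq


def naive_assignment(bucket_sizes, num_processors):
    """No load balancing: bucket i goes to processor i mod P, ignoring size."""
    assignment = {p: [] for p in range(num_processors)}
    for i, bucket in enumerate(sorted(bucket_sizes)):
        assignment[i % num_processors].append(bucket)
    return assignment


def greedy_assignment(bucket_sizes, num_processors):
    """Illustrative heuristic (an assumption, not the paper's method):
    place buckets largest-first on the currently least-loaded processor."""
    heap = [(0, p) for p in range(num_processors)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}
    by_size = sorted(bucket_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for bucket, size in by_size:
        load, p = heapq.heappop(heap)
        assignment[p].append(bucket)
        heapq.heappush(heap, (load + size, p))
    return assignment


def processor_loads(assignment, bucket_sizes):
    """Total number of tuples each processor must join."""
    return {p: sum(bucket_sizes[b] for b in buckets)
            for p, buckets in assignment.items()}
```

For example, with hypothetical tuple counts `{0: 900, 1: 100, 2: 100, 3: 100, 4: 800, 5: 100, 6: 100, 7: 100}` and two processors, the naive assignment puts both large buckets on the same processor, while the greedy one separates them. The join finishes only when the most heavily loaded processor finishes, so the maximum load is the quantity to compare.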

    Performance Analysis of Three Database Server Distribution Algorithms

    Although the concept of the distributed database has been around for over 20 years, it has not dominated the computer landscape, especially in business-related applications. This paper explores the effectiveness of distributed databases under a variety of conditions by conducting experiments over different combinations of the three variables listed below. Specifically, the following questions are researched: How does workload intensity influence the need for, and the performance of, distributed database applications? How does the number of nodes on which the database is stored affect data access time? How does the method used to assign a given query to a specific database node influence access time? The first variable is workload intensity. It is expected that as intensity increases, the need to use some form of distributed database increases. The second variable is the number of nodes across which the database is distributed. One would expect access time to fall as the number of nodes increases. The third variable is the algorithm used to distribute inquiries across multiple nodes. A symmetric algorithm, one that gives any given inquiry an equal chance of landing on any specific node, would be expected to offer the most promise. However, results indicate that the load-balanced method outperforms both the sequential and random selection methods.
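The three query-assignment methods named in this abstract can be sketched as follows. The abstract does not define them, so this sketch assumes that "sequential" means round-robin, "random" means uniform random choice, and "load balanced" means choosing the node with the fewest outstanding queries. The class names and the `queue_lengths` parameter are hypothetical.

```python
import random


class SequentialDispatcher:
    """Assumed meaning of 'sequential': round-robin over the nodes."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self._next = 0

    def pick(self, queue_lengths):
        node = self._next
        self._next = (self._next + 1) % self.num_nodes
        return node


class RandomDispatcher:
    """Assumed meaning of 'random': every node is equally likely."""

    def __init__(self, num_nodes, seed=None):
        self.num_nodes = num_nodes
        self._rng = random.Random(seed)

    def pick(self, queue_lengths):
        return self._rng.randrange(self.num_nodes)


class LoadBalancedDispatcher:
    """Assumed meaning of 'load balanced': the node with the fewest
    outstanding queries wins; ties go to the lowest node index."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes

    def pick(self, queue_lengths):
        return min(range(self.num_nodes), key=lambda n: queue_lengths[n])
```

Only the load-balanced dispatcher reads `queue_lengths`; the other two ignore the current state of the nodes, which is one plausible reason a state-aware policy could do better under uneven load.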