3,337 research outputs found

    Improved Analysis of Deterministic Load-Balancing Schemes

    Full text link
    We consider the problem of deterministic load balancing of tokens in the discrete model. A set of nn processors is connected into a dd-regular undirected network. In every time step, each processor exchanges some of its tokens with each of its neighbors in the network. The goal is to minimize the discrepancy between the number of tokens on the most-loaded and the least-loaded processor as quickly as possible. Rabani et al. (1998) present a general technique for the analysis of a wide class of discrete load balancing algorithms. Their approach is to characterize the deviation between the actual loads of a discrete balancing algorithm with the distribution generated by a related Markov chain. The Markov chain can also be regarded as the underlying model of a continuous diffusion algorithm. Rabani et al. showed that after time T=O(log⁥(Kn)/ÎŒ)T = O(\log (Kn)/\mu), any algorithm of their class achieves a discrepancy of O(dlog⁥n/ÎŒ)O(d\log n/\mu), where ÎŒ\mu is the spectral gap of the transition matrix of the graph, and KK is the initial load discrepancy in the system. In this work we identify some natural additional conditions on deterministic balancing algorithms, resulting in a class of algorithms reaching a smaller discrepancy. This class contains well-known algorithms, eg., the Rotor-Router. Specifically, we introduce the notion of cumulatively fair load-balancing algorithms where in any interval of consecutive time steps, the total number of tokens sent out over an edge by a node is the same (up to constants) for all adjacent edges. We prove that algorithms which are cumulatively fair and where every node retains a sufficient part of its load in each step, achieve a discrepancy of O(min⁥{dlog⁥n/ÎŒ,dn})O(\min\{d\sqrt{\log n/\mu},d\sqrt{n}\}) in time O(T)O(T). We also show that in general neither of these assumptions may be omitted without increasing discrepancy. We then show by a combinatorial potential reduction argument that any cumulatively fair scheme satisfying some additional assumptions achieves a discrepancy of O(d)O(d) almost as quickly as the continuous diffusion process. This positive result applies to some of the simplest and most natural discrete load balancing schemes.Comment: minor corrections; updated literature overvie

    Analysis, Tracing, Characterization and Performance Modeling of Select ASCI Applications for BlueGene/L Using Parallel Discrete Event Simulation

    Get PDF
    Caltech's Jet Propulsion Laboratory (JPL) and Center for Advanced Computer Architecture (CACR) are conducting application and simulation analyses of Blue Gene/L[1] in order to establish a range of effectiveness of the architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real world ASCI application execution

    Adaptive Parallel Iterative Deepening Search

    Full text link
    Many of the artificial intelligence techniques developed to date rely on heuristic search through large spaces. Unfortunately, the size of these spaces and the corresponding computational effort reduce the applicability of otherwise novel and effective algorithms. A number of parallel and distributed approaches to search have considerably improved the performance of the search process. Our goal is to develop an architecture that automatically selects parallel search strategies for optimal performance on a variety of search problems. In this paper we describe one such architecture realized in the Eureka system, which combines the benefits of many different approaches to parallel heuristic search. Through empirical and theoretical analyses we observe that features of the problem space directly affect the choice of optimal parallel search strategy. We then employ machine learning techniques to select the optimal parallel search strategy for a given problem space. When a new search task is input to the system, Eureka uses features describing the search space and the chosen architecture to automatically select the appropriate search strategy. Eureka has been tested on a MIMD parallel processor, a distributed network of workstations, and a single workstation using multithreading. Results generated from fifteen puzzle problems, robot arm motion problems, artificial search spaces, and planning problems indicate that Eureka outperforms any of the tested strategies used exclusively for all problem instances and is able to greatly reduce the search time for these applications

    Towards Optimal Distributed Node Scheduling in a Multihop Wireless Network through Local Voting

    Full text link
    In a multihop wireless network, it is crucial but challenging to schedule transmissions in an efficient and fair manner. In this paper, a novel distributed node scheduling algorithm, called Local Voting, is proposed. This algorithm tries to semi-equalize the load (defined as the ratio of the queue length over the number of allocated slots) through slot reallocation based on local information exchange. The algorithm stems from the finding that the shortest delivery time or delay is obtained when the load is semi-equalized throughout the network. In addition, we prove that, with Local Voting, the network system converges asymptotically towards the optimal scheduling. Moreover, through extensive simulations, the performance of Local Voting is further investigated in comparison with several representative scheduling algorithms from the literature. Simulation results show that the proposed algorithm achieves better performance than the other distributed algorithms in terms of average delay, maximum delay, and fairness. Despite being distributed, the performance of Local Voting is also found to be very close to a centralized algorithm that is deemed to have the optimal performance
    • 

    corecore