Improved Analysis of Deterministic Load-Balancing Schemes
We consider the problem of deterministic load balancing of tokens in the
discrete model. A set of $n$ processors is connected into a $d$-regular
undirected network. In every time step, each processor exchanges some of its
tokens with each of its neighbors in the network. The goal is to minimize the
discrepancy between the number of tokens on the most-loaded and the
least-loaded processor as quickly as possible.
Rabani et al. (1998) present a general technique for the analysis of a wide
class of discrete load balancing algorithms. Their approach is to characterize
the deviation between the actual loads of a discrete balancing algorithm and
the distribution generated by a related Markov chain. The Markov chain can also
be regarded as the underlying model of a continuous diffusion algorithm. Rabani
et al. showed that after time $T = O(\log(Kn)/\mu)$, any algorithm of their
class achieves a discrepancy of $O(d \log n/\mu)$, where $\mu$ is the spectral
gap of the transition matrix of the graph, and $K$ is the initial load
discrepancy in the system.
In this work we identify some natural additional conditions on deterministic
balancing algorithms, resulting in a class of algorithms reaching a smaller
discrepancy. This class contains well-known algorithms, e.g., the Rotor-Router.
Specifically, we introduce the notion of cumulatively fair load-balancing
algorithms where in any interval of consecutive time steps, the total number of
tokens sent out over an edge by a node is the same (up to constants) for all
adjacent edges. We prove that algorithms which are cumulatively fair and where
every node retains a sufficient part of its load in each step achieve a
discrepancy of $O(\min\{d\sqrt{\log n/\mu}, d\sqrt{n}\})$ in time $O(T)$. We
also show that in general neither of these assumptions may be omitted without
increasing discrepancy. We then show by a combinatorial potential reduction
argument that any cumulatively fair scheme satisfying some additional
assumptions achieves a discrepancy of $O(d)$ almost as quickly as the
continuous diffusion process. This positive result applies to some of the
simplest and most natural discrete load balancing schemes.
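To make the notion of a cumulatively fair scheme concrete, the following is a minimal sketch of a rotor-router style balancing step. It is an illustration under stated assumptions, not the construction analyzed in the paper: the graph is a cycle (a 2-regular graph), each node gets a hypothetical number of self-loop ports (`self_loops`) so that it retains part of its load, and the function names are invented for this example. Each node hands out its tokens one at a time over its ports in round-robin order, and the rotor position carries over between steps, so over any interval the number of tokens sent over any two ports of a node differs by at most one.

```python
def cycle_neighbors(n):
    """Adjacency lists of a cycle on n nodes (a 2-regular graph)."""
    return [[(v - 1) % n, (v + 1) % n] for v in range(n)]


def rotor_router_step(loads, neighbors, rotors, self_loops):
    """One synchronous step of a rotor-router style scheme.

    Each node distributes its tokens one by one over its ports in
    round-robin order. Ports are the real neighbors followed by
    `self_loops` copies of the node itself (an assumption of this sketch,
    modelling that a node keeps part of its load). `rotors` is updated in
    place so that the next step continues where this one stopped.
    """
    new_loads = [0] * len(loads)
    for v, tokens in enumerate(loads):
        ports = neighbors[v] + [v] * self_loops
        for _ in range(tokens):
            new_loads[ports[rotors[v]]] += 1
            rotors[v] = (rotors[v] + 1) % len(ports)
    return new_loads


def discrepancy(loads):
    """Difference between the most-loaded and least-loaded node."""
    return max(loads) - min(loads)


def simulate(loads, neighbors, steps, self_loops):
    """Run `steps` balancing steps, starting with all rotors at port 0."""
    rotors = [0] * len(loads)
    for _ in range(steps):
        loads = rotor_router_step(loads, neighbors, rotors, self_loops)
    return loads
```

For example, starting with all 64 tokens on node 0 of an 8-node cycle and two self-loops per node, `simulate([64] + [0] * 7, cycle_neighbors(8), 50, 2)` returns the load vector after 50 steps, and `discrepancy` can be applied to it to observe how far the loads have evened out. The total number of tokens is conserved by construction.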
Analysis, Tracing, Characterization and Performance Modeling of Select ASCI Applications for BlueGene/L Using Parallel Discrete Event Simulation
Caltech's Jet Propulsion Laboratory (JPL) and Center for Advanced Computer Architecture (CACR) are conducting application and simulation analyses of Blue Gene/L[1] in order to establish a range of effectiveness of the architecture in performing important classes of computations and to determine the design sensitivity of the global interconnect network in support of real-world ASCI application execution.
Adaptive Parallel Iterative Deepening Search
Many of the artificial intelligence techniques developed to date rely on
heuristic search through large spaces. Unfortunately, the size of these spaces
and the corresponding computational effort reduce the applicability of
otherwise novel and effective algorithms. A number of parallel and distributed
approaches to search have considerably improved the performance of the search
process. Our goal is to develop an architecture that automatically selects
parallel search strategies for optimal performance on a variety of search
problems. In this paper we describe one such architecture realized in the
Eureka system, which combines the benefits of many different approaches to
parallel heuristic search. Through empirical and theoretical analyses we
observe that features of the problem space directly affect the choice of
optimal parallel search strategy. We then employ machine learning techniques to
select the optimal parallel search strategy for a given problem space. When a
new search task is input to the system, Eureka uses features describing the
search space and the chosen architecture to automatically select the
appropriate search strategy. Eureka has been tested on a MIMD parallel
processor, a distributed network of workstations, and a single workstation
using multithreading. Results generated from fifteen puzzle problems, robot arm
motion problems, artificial search spaces, and planning problems indicate that
Eureka outperforms any of the tested strategies used exclusively for all
problem instances and is able to greatly reduce the search time for these
applications.
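For readers unfamiliar with the serial algorithm that these parallel strategies build on, the following is a minimal sketch of plain iterative deepening depth-first search. It is not the Eureka system and contains no parallelism or strategy selection; the function names and the `successors` callback are assumptions made for this illustration.

```python
def depth_limited(node, goal, successors, limit, path):
    """Depth-first search from `node` that explores at most `limit` more edges.

    Returns the path to `goal` as a list of nodes, or None if the goal is
    not reachable within the limit.
    """
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in successors(node):
        if nxt in path:  # avoid cycling back along the current path
            continue
        found = depth_limited(nxt, goal, successors, limit - 1, path + [nxt])
        if found is not None:
            return found
    return None


def iterative_deepening(start, goal, successors, max_depth):
    """Repeat depth-limited search with limits 0, 1, ..., max_depth.

    The first path found uses the fewest edges, because every smaller
    limit was exhausted before the limit was raised.
    """
    for limit in range(max_depth + 1):
        found = depth_limited(start, goal, successors, limit, [start])
        if found is not None:
            return found
    return None
```

A small usage example: with `graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": ["F"], "E": ["F"], "F": []}`, calling `iterative_deepening("A", "F", lambda v: graph[v], 5)` returns a path from `"A"` to `"F"` that uses three edges.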
Towards Optimal Distributed Node Scheduling in a Multihop Wireless Network through Local Voting
In a multihop wireless network, it is crucial but challenging to schedule
transmissions in an efficient and fair manner. In this paper, a novel
distributed node scheduling algorithm, called Local Voting, is proposed. This
algorithm tries to semi-equalize the load (defined as the ratio of the queue
length over the number of allocated slots) through slot reallocation based on
local information exchange. The algorithm stems from the finding that the
shortest delivery time or delay is obtained when the load is semi-equalized
throughout the network. In addition, we prove that, with Local Voting, the
network system converges asymptotically towards the optimal scheduling.
Moreover, through extensive simulations, the performance of Local Voting is
further investigated in comparison with several representative scheduling
algorithms from the literature. Simulation results show that the proposed
algorithm achieves better performance than the other distributed algorithms in
terms of average delay, maximum delay, and fairness. Despite being distributed,
the performance of Local Voting is also found to be very close to a centralized
algorithm that is deemed to have the optimal performance.
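The load defined above (queue length divided by the number of allocated slots) and the idea of evening it out through local slot reallocation can be illustrated with a small sketch. This is not the Local Voting algorithm from the paper; it is a simple greedy rule invented for illustration, and the names `load` and `rebalance_once` are hypothetical. Each node with more than one slot offers one slot to its most-loaded neighbor, and the transfer happens only if it lowers the larger of the two loads.

```python
def load(queue_length, slots):
    """Load of a node: queue length divided by number of allocated slots."""
    return queue_length / slots


def rebalance_once(queues, slots, neighbors):
    """One pass of a greedy, purely local slot reallocation (illustrative).

    Nodes are visited in index order. A node with more than one slot gives
    one slot to its most-loaded neighbor if doing so lowers the larger of
    the two loads. Returns a new slot allocation; the total number of
    slots is unchanged and every node keeps at least one slot.
    """
    slots = list(slots)
    for u in range(len(queues)):
        if slots[u] <= 1 or not neighbors[u]:
            continue
        # Most-loaded neighbor under the current allocation.
        v = max(neighbors[u], key=lambda w: load(queues[w], slots[w]))
        before = max(load(queues[u], slots[u]), load(queues[v], slots[v]))
        after = max(load(queues[u], slots[u] - 1), load(queues[v], slots[v] + 1))
        if after < before:
            slots[u] -= 1
            slots[v] += 1
    return slots
```

For instance, on a three-node line with queue lengths `[8, 2, 2]` and slots `[1, 3, 2]`, one pass moves a slot from the middle node to the heavily loaded first node, which reduces the largest load from 8 to 4.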