347 research outputs found
Multi-threading a state-of-the-art maximum clique algorithm
We present a threaded parallel adaptation of a state-of-the-art maximum clique
algorithm for dense, computationally challenging graphs. We show that near-linear speedups
are achievable in practice and that superlinear speedups are common. We include results for
several previously unsolved benchmark problems
A New Perspective on Randomized Gossip Algorithms
In this short note we propose a new approach for the design and analysis of
randomized gossip algorithms which can be used to solve the average consensus
problem. We show how that Randomized Block Kaczmarz (RBK) method - a method for
solving linear systems - works as gossip algorithm when applied to a special
system encoding the underlying network. The famous pairwise gossip algorithm
arises as a special case. Subsequently, we reveal a hidden duality of
randomized gossip algorithms, with the dual iterative process maintaining a set
of numbers attached to the edges as opposed to nodes of the network. We prove
that RBK obtains a superlinear speedup in the size of the block, and
demonstrate this effect through experiments
Adaptive Parallel Iterative Deepening Search
Many of the artificial intelligence techniques developed to date rely on
heuristic search through large spaces. Unfortunately, the size of these spaces
and the corresponding computational effort reduce the applicability of
otherwise novel and effective algorithms. A number of parallel and distributed
approaches to search have considerably improved the performance of the search
process. Our goal is to develop an architecture that automatically selects
parallel search strategies for optimal performance on a variety of search
problems. In this paper we describe one such architecture realized in the
Eureka system, which combines the benefits of many different approaches to
parallel heuristic search. Through empirical and theoretical analyses we
observe that features of the problem space directly affect the choice of
optimal parallel search strategy. We then employ machine learning techniques to
select the optimal parallel search strategy for a given problem space. When a
new search task is input to the system, Eureka uses features describing the
search space and the chosen architecture to automatically select the
appropriate search strategy. Eureka has been tested on a MIMD parallel
processor, a distributed network of workstations, and a single workstation
using multithreading. Results generated from fifteen puzzle problems, robot arm
motion problems, artificial search spaces, and planning problems indicate that
Eureka outperforms any of the tested strategies used exclusively for all
problem instances and is able to greatly reduce the search time for these
applications
A distributed evolutionary algorithm with a superlinear speedup for solving the vehicle routing problem
In this paper we present a distributed evolutionary algorithm for solving the capacitated vehicle routing problem. Our algorithm consists of autonomous processes that create heterogeneous evolutionary environments, perform evolution on separate populations of chromosomes, and communicate asynchronously through occasional migrations of chromosomes. The paper also presents experiments where the algorithm has been tested on some benchmark problem instances. By measuring the effects of distribution on solution quality and on computing time, the experiments confirm that the algorithm achieves a superlinear speedup
The Effects of Microprocessor Architecture on Speedup in Distrbuted Memory Supercomputers
Amdahl\u27s Law states that speedup in moving from one processor to N identical processors can never be greater than N, and in fact usually is lower than N because of operations that must be done sequentially. Amdahl\u27s Law gives us the following formula for speedup: Speedup \u3c or = (S+P)/(S+(P/N)) where is the number of processors, S is the percentage of the code that is serial (i.e., cannot be parallelized), and P is the percentage of code that is parallelizable. We can substitute 1 - S for P in the above formula and we see that as S approaches zero speedup approaches N. It can also be shown that seemingly small values of S can severely limit the maximum speedup. Researchers at the University of Maine saw speedups that seemed to contradict Amdahl\u27s Law, and identified an assumption made by the law that is not always true. When this assumption is not true, it is possible to achieve speedups that are larger than the theoretical maximum speedup of N given by Amdahl\u27s Law. The assumption in question is that the computer performance scales linearly as the size of the problem is reduced by dividing it over a larger number of processors. This assumption is not valid for computers with tiered memory. In this thesis we investigate superlinear speedup through a series of test programs specifically designed to exhibit superlinear speedup. After demonstrating these programs show superlinear speedup, we suggest methods for detecting the potential for superlinear speedup in a variety of algorithms
- …