Parallel Simulated Annealing
Since the paper by Kirkpatrick, Gelatt and Vecchi in 1983, the use of Simulated Annealing (SA) in solving combinatorial optimization problems has increased substantially. The SA algorithm has been applied to difficult problems in digital design automation such as cell placement and wire routing. While these studies have yielded good or near-optimum solutions, they have required very long computer execution times (hours and days). These long times, coupled with the recent availability of a number of commercial parallel processors, have prompted the search for parallel implementations of the SA algorithm. The goal has been to obtain algorithmic speedup through the exploitation of parallelism. This paper presents a method for mapping the SA algorithm onto a dynamically structured tree of processors. Such a tree of processors can be mapped onto both shared-memory and message-based styles of parallel processors. The parallel SA (PSA) algorithm is discussed and its performance evaluated using simulation techniques. An important property of the PSA algorithm presented is that it maintains the same move decision sequence as the serial SA (SSA) algorithm, thus avoiding the problems associated with move conflicts, erroneous move acceptance/rejection decisions, and oscillations that have been associated with other PSA algorithm proposals. The PSA algorithm presented fully preserves the convergence properties of the SSA algorithm, with speedups varying roughly as log2 N, where N is the number of processors in the parallel processor.
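The serial Metropolis-based SA loop that these parallel schemes preserve can be sketched as follows (a minimal generic illustration, not the tree-structured parallel algorithm the paper proposes):

```python
import math
import random

def metropolis_accept(delta_cost, temperature, rng=random):
    """Accept a move that changes the cost by delta_cost at this temperature."""
    if delta_cost <= 0:
        return True  # improving moves are always accepted
    return rng.random() < math.exp(-delta_cost / temperature)

def simulated_annealing(initial, neighbor, cost, t0=10.0, alpha=0.95, steps=1000):
    """Generic serial SA with geometric cooling; neighbor() proposes a move."""
    state, temperature = initial, t0
    best, best_cost = state, cost(state)
    for _ in range(steps):
        candidate = neighbor(state)
        if metropolis_accept(cost(candidate) - cost(state), temperature):
            state = candidate
            if cost(state) < best_cost:
                best, best_cost = state, cost(state)
        temperature *= alpha
    return best
```

The sequence of accept/reject decisions produced by this loop is exactly what the paper's PSA algorithm reproduces in parallel.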
Combining spectral sequencing and parallel simulated annealing for the MinLA problem
In this paper we present and analyze new sequential and parallel
heuristics to approximate the Minimum Linear Arrangement problem
(MinLA). The heuristics consist in obtaining a first global solution
using Spectral Sequencing and improving it locally through Simulated
Annealing. In order to accelerate the annealing process, we present a
special neighborhood distribution that tends to favor moves with high
probability to be accepted. We show how to make use of this
neighborhood to parallelize the Metropolis stage on distributed memory
machines by mapping partitions of the input graph to processors and
performing moves concurrently. The paper reports the results obtained
with this new heuristic when applied to a set of large graphs,
including graphs arising from finite elements methods and graphs
arising from VLSI applications. Compared to other heuristics, the
measurements obtained show that the new heuristic improves the
solution quality, decreases the running time and offers an excellent
speedup when run on a commodity network made of nine personal
computers.
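The abstract's key acceleration idea, a move distribution biased toward moves likely to be accepted, can be illustrated with a toy sketch; the displacement law below is an assumption for illustration, not the paper's actual neighborhood distribution:

```python
import random

def biased_displacement(position, n, temperature, rng=random):
    """Sample a new position for a vertex in a linear arrangement of n slots,
    concentrating on small displacements: small moves change the MinLA cost
    little, so they have a higher probability of being accepted."""
    spread = max(1, round(temperature))        # shrink the move range as we cool
    step = rng.randint(-spread, spread)
    return min(n - 1, max(0, position + step))
```

At low temperature this proposes mostly local moves, which keeps the Metropolis acceptance rate (and hence useful work per proposed move) high.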
On the convergence of parallel simulated annealing
We consider a parallel simulated annealing algorithm that is closely related to the so-called parallel chain algorithm. Periodically, a new state is chosen from among p >= 1 states as the initial state for p simulated annealing Markov chains running independently of each other. We use selection strategies such as best-wins or worst-wins and show that, in the case of best-wins, the algorithm does not in general converge to the set of global minima; indeed, the period length and the number p have to be large enough. In the case of worst-wins, the convergence result is true. The phenomenon of the superiority of worst-wins over best-wins already occurs in finite-time simulations.
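The periodic selection step described above can be sketched as follows; the chain runner, period length, and seeding scheme are illustrative assumptions, not the paper's construction (and a real implementation would run the p chains on separate processors rather than serially):

```python
import math
import random

def run_chain(start, neighbor, cost, temperature, length, rng):
    """Run one independent Metropolis chain of the given length."""
    state = start
    for _ in range(length):
        candidate = neighbor(state, rng)
        delta = cost(candidate) - cost(state)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            state = candidate
    return state

def parallel_chain_step(start, neighbor, cost, temperature, p, period,
                        strategy, seed=0):
    """Run p chains from a common start; pick the next start by strategy."""
    finals = [run_chain(start, neighbor, cost, temperature, period,
                        random.Random(seed + i)) for i in range(p)]
    if strategy == "best-wins":
        return min(finals, key=cost)  # greedy choice; shown not to converge in general
    return max(finals, key=cost)      # worst-wins: the strategy with the convergence result
```

Counter-intuitively, the paper's result says the greedier best-wins selection can destroy convergence while worst-wins preserves it.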
Speedup and accuracy of parallel simulated annealing algorithms
This work presents two parallel simulated annealing algorithms to solve the vehicle routing problem with time
windows (VRPTW). The aim is to explore speedups and to investigate how shorter annealing chains (the ISR algorithm) and
a smaller number of cooling stages (the ISC algorithm) influence the accuracy of solutions to the problem.
A parallel simulated annealing algorithm for standard cell placement on a hypercube computer
A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two-dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small- and large-distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.
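One standard way to place a 2-D cell grid on hypercube nodes so that grid neighbors land on directly connected processors is a Gray-code mapping; this is a generic sketch of that idea, not the paper's specific strategy (which additionally supports large-distance moves):

```python
def gray(k):
    """Binary reflected Gray code of k: consecutive codes differ in one bit."""
    return k ^ (k >> 1)

def cell_to_node(row, col, col_bits):
    """Map grid cell (row, col) to a hypercube node id. Horizontally or
    vertically adjacent cells map to node ids differing in exactly one bit,
    i.e. to directly connected hypercube neighbors."""
    return (gray(row) << col_bits) | gray(col)
```

With such a mapping, a small-distance cell exchange only requires communication across a single hypercube link.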
Parallel genetic algorithm and parallel simulated annealing algorithm for the closest string problem
In this paper, we design a genetic algorithm and a simulated
annealing algorithm, together with their parallel versions, to solve the Closest String
problem. Our implementation and experiments show the usefulness of the
parallel GA and SA algorithms.
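For context, the objective in the Closest String problem is the maximum Hamming distance from a candidate string to the input strings; a minimal sketch of the cost function and a natural single-position mutation operator (illustrative only, not the authors' parallel design):

```python
import random

def max_hamming(candidate, strings):
    """Closest String objective: maximum Hamming distance from the
    candidate to any input string (all strings share one length)."""
    return max(sum(a != b for a, b in zip(candidate, s)) for s in strings)

def mutate(candidate, alphabet, rng=random):
    """A natural SA/GA neighborhood: resample one position."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(alphabet) + candidate[i + 1:]
```

Both GA and SA variants can be built on this pair: SA accepts or rejects single mutations, while a GA applies them alongside crossover.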
Networked Computing in Wireless Sensor Networks for Structural Health Monitoring
This paper studies the problem of distributed computation over a network of
wireless sensors. While this problem applies to many emerging applications, to
keep our discussion concrete we will focus on sensor networks used for
structural health monitoring. Within this context, the heaviest computation is
to determine the singular value decomposition (SVD) to extract mode shapes
(eigenvectors) of a structure. Compared to collecting raw vibration data and
performing SVD at a central location, computing SVD within the network can
result in significantly lower energy consumption and delay. Using recent
results on decomposing SVD, a well-known centralized operation, into
components, we seek to determine a near-optimal communication structure that
enables the distribution of this computation and the reassembly of the final
results, with the objective of minimizing energy consumption subject to a
computational delay constraint. We show that this reduces to a generalized
clustering problem; a cluster forms a unit on which a component of the overall
computation is performed. We establish that this problem is NP-hard. By
relaxing the delay constraint, we derive a lower bound to this problem. We then
propose an integer linear program (ILP) to solve the constrained problem
exactly as well as an approximate algorithm with a proven approximation ratio.
We further present a distributed version of the approximate algorithm. We
present both simulation and experimentation results to demonstrate the
effectiveness of these algorithms.
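As a point of reference for the centralized baseline, the mode-shape extraction step can be sketched with NumPy; the data matrix here is synthetic, with assumed spatial mode shapes:

```python
import numpy as np

# Synthetic vibration data: rows are time samples, columns are sensor channels.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 500)
mode1 = np.outer(np.sin(2 * np.pi * 1.0 * t), [1.0, 0.8, 0.3])   # assumed mode shape
mode2 = np.outer(np.sin(2 * np.pi * 2.5 * t), [0.3, -0.5, 0.9])  # assumed mode shape
data = mode1 + 0.5 * mode2 + 0.01 * rng.standard_normal((500, 3))

# SVD of the data matrix; the right singular vectors approximate the
# spatial mode shapes (eigenvectors) the monitoring application needs.
U, s, Vt = np.linalg.svd(data, full_matrices=False)
dominant_shape = Vt[0]  # strongest spatial pattern across sensors
```

The in-network approach decomposes exactly this computation into components executed by clusters of sensors, rather than shipping the full `data` matrix to a central node.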