Implementation of a fixing strategy and parallelization in a recent global optimization method
The Electromagnetism-like Mechanism (EM) heuristic is a population-based stochastic global optimization method inspired by the attraction-repulsion mechanism of electromagnetism theory. EM was originally proposed for solving continuous global optimization problems with bound constraints, and it has been shown that the algorithm performs quite well compared to some other global optimization methods. In this work, we propose two extensions to improve the performance of the original algorithm. First, we introduce a fixing strategy that provides a mechanism for avoiding entrapment in local minima and thus improves the effectiveness of the search. Second, we use the proposed fixing strategy to parallelize the algorithm and carry out a cooperative parallel search of the solution space. We then evaluate the performance of our approach under three criteria: the quality of the solutions, the number of function evaluations, and the number of local minima obtained. Test problems are generated by an algorithm suggested in the literature that builds test problems with varying degrees of difficulty. Finally, we benchmark our results against those of the Knitro solver with the multistart option set.
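The abstract does not spell out the attraction-repulsion rule. As a rough illustration only, the sketch below follows the charge and force formulas commonly attributed to the original EM heuristic: better points receive larger charges, a point is attracted by every better point and repelled by every worse one. The formulas as written here, the function name `em_forces`, and the toy objective are assumptions made for illustration, not the authors' code; the fixing strategy and the parallel search are not shown.

```python
import math


def em_forces(points, f):
    """Total force on each point of the population (hypothetical helper).

    points: list of equal-length coordinate lists; f: objective to minimize.
    """
    n = len(points[0])                      # problem dimension
    values = [f(p) for p in points]
    best = min(values)
    spread = sum(v - best for v in values)

    # Charge: 1 for the best point, decaying as the objective value worsens.
    if spread == 0:
        charges = [1.0] * len(points)       # all points equally good
    else:
        charges = [math.exp(-n * (v - best) / spread) for v in values]

    forces = [[0.0] * n for _ in points]
    for i, xi in enumerate(points):
        for j, xj in enumerate(points):
            if i == j:
                continue
            diff = [b - a for a, b in zip(xi, xj)]   # vector from x_i to x_j
            dist2 = sum(d * d for d in diff)
            if dist2 == 0:
                continue                              # coincident points exert no force
            scale = charges[i] * charges[j] / dist2
            # Attraction toward better points, repulsion from worse ones.
            sign = 1.0 if values[j] < values[i] else -1.0
            for k in range(n):
                forces[i][k] += sign * scale * diff[k]
    return forces


if __name__ == "__main__":
    population = [[0.0], [2.0]]
    print(em_forces(population, lambda p: p[0] ** 2))
```

In the one-dimensional usage example, the worse point at 2.0 is pulled toward the better point at 0.0, so its force is negative.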
Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines
In this paper, we address the problem of efficient execution of a computation
pattern, referred to here as the irregular wavefront propagation pattern
(IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in
several image processing operations. In the IWPP, data elements in the
wavefront propagate waves to their neighboring elements on a grid if a
propagation condition is satisfied. Elements receiving the propagated waves
become part of the wavefront. This pattern results in irregular data accesses
and computations. We develop and evaluate strategies for efficient computation
and propagation of wavefronts using a multi-level queue structure. This queue
structure improves the utilization of fast memories in a GPU and reduces
synchronization overheads. We also develop a tile-based parallelization
strategy to support execution on multiple CPUs and GPUs. We evaluate our
approaches on a state-of-the-art GPU accelerated machine (equipped with 3 GPUs
and 2 multicore CPUs) using the IWPP implementations of two widely used image
processing operations: morphological reconstruction and Euclidean distance
transform. Our results show significant performance improvements on GPUs. The
cooperative use of multiple CPUs and GPUs attains speedups of 50x and 85x
with respect to single-core CPU executions for morphological reconstruction and
Euclidean distance transform, respectively. Comment: 37 pages, 16 figures
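The propagation pattern described in this abstract can be sketched sequentially with a single queue. The example below is a minimal, assumed illustration using grayscale morphological reconstruction with 4-connectivity: an element pushes its value to a neighbor when doing so raises the neighbor (capped by the mask), and the updated neighbor then joins the wavefront. The function name `reconstruct` and the single-queue design are assumptions; the paper's multi-level queue, GPU kernels, and tile-based parallelization are not reproduced here.

```python
from collections import deque


def reconstruct(marker, mask):
    """Queue-based morphological reconstruction by dilation (sequential sketch)."""
    rows, cols = len(marker), len(marker[0])
    # The result can never exceed the mask, so clamp the marker first.
    out = [[min(marker[r][c], mask[r][c]) for c in range(cols)]
           for r in range(rows)]

    # Simplest valid initial wavefront: every element of the grid.
    wavefront = deque((r, c) for r in range(rows) for c in range(cols))

    while wavefront:
        r, c = wavefront.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                candidate = min(out[r][c], mask[nr][nc])
                # Propagation condition: the neighbor would be raised.
                if candidate > out[nr][nc]:
                    out[nr][nc] = candidate
                    wavefront.append((nr, nc))   # neighbor joins the wavefront
    return out


if __name__ == "__main__":
    marker = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
    mask = [[3, 3, 0], [3, 5, 0], [0, 0, 4]]
    print(reconstruct(marker, mask))  # [[3, 3, 0], [3, 5, 0], [0, 0, 0]]
```

Which elements enter the queue depends on the data, which is the source of the irregular memory accesses the abstract mentions. In the example, the isolated mask peak of 4 in the corner stays at 0 because no wave can reach it through the surrounding zeros.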
Scalable Parallel Numerical Constraint Solver Using Global Load Balancing
We present a scalable parallel solver for numerical constraint satisfaction
problems (NCSPs). Our parallelization scheme consists of homogeneous worker
solvers, each of which runs on an available core and communicates with others
via the global load balancing (GLB) method. The parallel solver is implemented
in X10, which provides an implementation of GLB as a library. In experiments,
several NCSPs from the literature were solved, attaining up to a 516-fold
speedup using 600 cores of the TSUBAME2.5 supercomputer. Comment: To be presented at X10'15 Workshop