Parallel Local Search for the Costas Array Problem
The Costas Array Problem is a highly combinatorial problem linked to radar applications. This paper presents a detailed model of the problem and its solution by Adaptive Search, a constraint-based local search method. Experiments were run on both sequential and parallel hardware, up to several hundreds of cores. Performance evaluation of the sequential version shows results outperforming previous implementations, while the parallel version shows nearly linear speedups up to 8,192 cores.
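The abstract does not define the problem, so here is a minimal sketch for orientation. A Costas array of order n is a permutation of 1..n in which, for every distance d, the differences perm[i+d] - perm[i] are pairwise distinct (every row of the "difference triangle" has no repeated value). The function names below are illustrative assumptions, and the violation count is only the kind of cost a local search could minimize; it is not claimed to be the authors' Adaptive Search model.

```python
def difference_violations(perm):
    """Count repeated values within each row of the difference triangle."""
    n = len(perm)
    violations = 0
    for d in range(1, n):            # row d holds differences at distance d
        seen = set()
        for i in range(n - d):
            diff = perm[i + d] - perm[i]
            if diff in seen:         # a repeat breaks the Costas property
                violations += 1
            seen.add(diff)
    return violations


def is_costas(perm):
    """True if perm is a permutation of 1..n with no repeated differences."""
    is_permutation = sorted(perm) == list(range(1, len(perm) + 1))
    return is_permutation and difference_violations(perm) == 0
```

For instance, `is_costas([1, 2, 4, 3])` holds, while `[1, 2, 3, 4]` fails because its distance-1 differences are all equal.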
Tackling Dynamic Vehicle Routing Problem with Time Windows by means of Ant Colony System
The Dynamic Vehicle Routing Problem with Time Windows (DVRPTW) is an
extension of the well-known Vehicle Routing Problem (VRP), which takes into
account the dynamic nature of the problem. This aspect requires the vehicle
routes to be updated in an ongoing manner as new customer requests arrive in
the system and must be incorporated into an evolving schedule during the
working day. Besides the vehicle capacity constraint involved in the classical
VRP, DVRPTW considers in addition time windows, which are able to better
capture real-world situations. Despite its practical importance, few studies
have so far tackled this problem. To this end, this study devises an ant colony
optimization based algorithm for DVRPTW, which relies on a joint solution
construction mechanism able to construct the vehicle routes in parallel. This
method is coupled with a local search procedure, aimed at further improving the
solutions built by the ants, and with an insertion heuristic, which tries to
reduce the number of vehicles used to service the available customers. The
experiments indicate that the proposed
algorithm is competitive and effective, and on DVRPTW instances with a higher
dynamicity level, it is able to yield better results compared to existing
ant-based approaches. Comment: 10 pages, 2 figures
High-Quality Shared-Memory Graph Partitioning
Partitioning graphs into blocks of roughly equal size such that few edges run
between blocks is a frequently needed operation in processing graphs. Recently, the
size, variety, and structural complexity of these networks have grown
dramatically. Unfortunately, previous approaches to parallel graph partitioning
have problems in this context since they often show a negative trade-off
between speed and quality. We present an approach to multi-level shared-memory
parallel graph partitioning that guarantees balanced solutions, shows high
speed-ups for a variety of large graphs and yields very good quality
independently of the number of cores used. For example, on 31 cores, our
algorithm partitions our largest test instance into 16 blocks, cutting fewer than
half as many edges as our main competitor when both algorithms are
given the same amount of time. Important ingredients include parallel label
propagation for both coarsening and improvement, parallel initial partitioning,
a simple yet effective approach to parallel localized local search, and fast
locality-preserving hash tables.
Scalable Parallel Numerical Constraint Solver Using Global Load Balancing
We present a scalable parallel solver for numerical constraint satisfaction
problems (NCSPs). Our parallelization scheme consists of homogeneous worker
solvers, each of which runs on an available core and communicates with others
via the global load balancing (GLB) method. The parallel solver is implemented
in X10, which provides an implementation of GLB as a library. In experiments on
several NCSPs from the literature, the solver attained up to a 516-fold
speedup using 600 cores of the TSUBAME2.5 supercomputer. Comment: To be presented at X10'15 Workshop
Parallel Graph Partitioning for Complex Networks
Processing large complex networks like social networks or web graphs has
recently attracted considerable interest. In order to do this in parallel, we
need to partition them into pieces of about equal size. Unfortunately, previous
parallel graph partitioners originally developed for more regular mesh-like
networks do not work well for these networks. This paper addresses this problem
by parallelizing and adapting the label propagation technique originally
developed for graph clustering. By introducing size constraints, label
propagation becomes applicable for both the coarsening and the refinement phase
of multilevel graph partitioning. We obtain very high quality by applying a
highly parallel evolutionary algorithm to the coarsened graph. The resulting
system is both more scalable and achieves higher quality than state-of-the-art
systems like ParMetis or PT-Scotch. For large complex networks the performance
differences are very big. For example, our algorithm can partition a web graph
with 3.3 billion edges in less than sixteen seconds using 512 cores of a high
performance cluster while producing a high quality partition -- none of the
competing systems can handle this graph on our system. Comment: Review article. Parallelization of our previous approach
arXiv:1402.328
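As a rough illustration of the idea of label propagation with a size constraint, the following sequential sketch lets each node adopt the label that is strictly most frequent among its neighbors, but only if the target cluster stays within a bound. The function name, the `max_size` parameter, the fixed visiting order, and the tie-breaking rule are all assumptions made for this example; the paper's parallel algorithm is not reproduced here.

```python
from collections import defaultdict


def size_constrained_label_propagation(adj, max_size, rounds=10):
    """adj maps each node to a list of neighbors; returns node -> label."""
    labels = {v: v for v in adj}          # every node starts in its own cluster
    size = defaultdict(int)
    for v in adj:
        size[labels[v]] += 1
    for _ in range(rounds):
        moved = False
        for v in sorted(adj):             # fixed order keeps the sketch deterministic
            weight = defaultdict(int)
            for u in adj[v]:
                weight[labels[u]] += 1    # count neighbors per label
            current = labels[v]
            best, best_w = current, weight.get(current, 0)
            for lab, w in sorted(weight.items()):
                # move only on strict improvement and if the target has room
                if lab != current and w > best_w and size[lab] + 1 <= max_size:
                    best, best_w = lab, w
            if best != current:
                size[current] -= 1
                size[best] += 1
                labels[v] = best
                moved = True
        if not moved:                     # converged: no node changed its label
            break
    return labels
```

On two triangles joined by a single edge, a bound of 3 keeps the triangles as separate clusters, whereas a bound of 6 lets this sketch merge everything into one cluster, which is why the size constraint matters for coarsening.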
Submodular memetic approximation for multiobjective parallel test paper generation
Parallel test paper generation is a biobjective distributed resource optimization problem, which aims to generate multiple similarly optimal test papers automatically according to multiple user-specified assessment criteria. Generating high-quality parallel test papers is challenging because both collective objective functions are NP-hard to optimize. In this paper, we propose a submodular memetic approximation algorithm for solving this problem. The proposed algorithm is an adaptive memetic algorithm (MA), which exploits the submodular property of the collective objective functions to design greedy-based approximation algorithms that enhance the steps of the multiobjective MA. By combining the intensification of a submodular local search mechanism with the diversification of a population-based submodular crossover operator, our algorithm jointly optimizes the total quality maximization objective and the fairness quality maximization objective. Our MA achieves provably near-optimal solutions in the huge search space of large datasets in polynomial runtime. Performance results on various datasets show that our algorithm drastically outperforms current techniques in terms of paper quality and runtime efficiency.
Parallel local search for solving Constraint Problems on the Cell Broadband Engine (Preliminary Results)
We explore the use of the Cell Broadband Engine (Cell/BE for short) for
combinatorial optimization applications: we present a parallel version of a
constraint-based local search algorithm that has been implemented on a
multiprocessor BladeCenter machine with twin Cell/BE processors (total of 16
SPUs per blade). This algorithm was chosen because it fits the Cell/BE
architecture very well and requires neither shared memory nor communication
between processors, while retaining a compact memory footprint. We study the
performance on several large optimization benchmarks and show that this
approach achieves mostly linear, and sometimes even super-linear, speedups. This is
possible because the parallel implementation may simultaneously explore
different parts of the search space and therefore converge faster towards the
best sub-space and thus towards a solution. Besides getting speedups, the
resulting times exhibit a much smaller variance, which benefits applications
where a timely reply is critical.
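The communication-free scheme described above can be pictured as independent multi-walk search: several walks start from different random points, and the fastest one determines the time to solution. The sketch below simulates this sequentially on a toy N-queens model; the problem, the swap neighborhood, and all names are assumptions chosen for illustration and are not taken from the paper's implementation.

```python
import random


def conflicts(perm):
    """Number of queen pairs on a shared diagonal (rows and columns are distinct)."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if abs(perm[i] - perm[j]) == j - i)


def walk(n, seed, max_iters=20_000):
    """One independent walk; returns (iterations, solution) or None on failure."""
    rng = random.Random(seed)             # private generator: no shared state
    perm = list(range(n))
    rng.shuffle(perm)
    cost = conflicts(perm)
    for it in range(max_iters):
        if cost == 0:
            return it, perm
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]
        new_cost = conflicts(perm)
        if new_cost <= cost:              # accept improving and sideways moves
            cost = new_cost
        else:
            perm[i], perm[j] = perm[j], perm[i]   # undo a worsening swap
    return (max_iters, perm) if cost == 0 else None


def multi_walk(n, seeds):
    """Simulate independent walks; the one needing fewest iterations wins."""
    results = [r for r in (walk(n, s) for s in seeds) if r is not None]
    return min(results, key=lambda r: r[0]) if results else None
```

Because the walks share nothing, each call to `walk` could run on its own core; taking the minimum iteration count over walks mimics stopping as soon as any one of them finds a solution.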