8 research outputs found
Scaling Techniques for Parallel Ant Colony Optimization on Large Problem Instances
Ant Colony Optimization (ACO) is a nature-inspired optimization metaheuristic that has been successfully applied to a wide range of problems. However, a significant limiting factor in its scalability is memory complexity: in many problems, the pheromone matrix that encodes the trails left by ants grows quadratically with the instance size. For very large instances this memory requirement makes ACO an impractical technique. In this paper we propose a restricted variant of the pheromone matrix with linear memory complexity, which stores pheromone values only for members of a candidate set of next moves. We also evaluate two selection methods for moves outside the candidate set. Using a combination of these techniques we achieve, in reasonable time, the best solution qualities recorded by ACO on the Art TSP instances, and present the first evaluation of a parallel implementation of MAX-MIN Ant System on instances of this scale (≥ 10^5 vertices). We find that, although ACO cannot yet match the solutions found by state-of-the-art genetic algorithms, it rapidly finds approximate solutions within 1--2% of the best known.
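The restricted pheromone matrix described above can be sketched as follows. This is a minimal illustration under assumed naming (the `CandidatePheromone` class and its methods are hypothetical, not the paper's code): each city stores pheromone only for its k candidate successors, so memory is O(n·k) rather than O(n²).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a candidate-set pheromone store (hypothetical API): instead of
// an n x n matrix, each city keeps pheromone values only for its k candidate
// successors, giving O(n*k) memory in total.
class CandidatePheromone {
public:
    CandidatePheromone(std::size_t n, std::size_t k, double initial)
        : k_(k), cand_(n * k, 0), tau_(n * k, initial) {}

    // Register city `to` as the c-th candidate successor of city `from`.
    void set_candidate(std::size_t from, std::size_t c, std::size_t to) {
        cand_[from * k_ + c] = to;
    }

    // Pheromone on edge (from, to); edges outside the candidate set fall
    // back to a fixed default value, so no quadratic table is needed.
    double get(std::size_t from, std::size_t to, double fallback) const {
        for (std::size_t c = 0; c < k_; ++c)
            if (cand_[from * k_ + c] == to) return tau_[from * k_ + c];
        return fallback;
    }

    // Deposits on edges outside the candidate set are simply dropped.
    void deposit(std::size_t from, std::size_t to, double amount) {
        for (std::size_t c = 0; c < k_; ++c)
            if (cand_[from * k_ + c] == to) tau_[from * k_ + c] += amount;
    }

private:
    std::size_t k_;
    std::vector<std::size_t> cand_;  // candidate city indices, k per city
    std::vector<double> tau_;        // pheromone, one value per candidate edge
};
```

The linear scan over k candidates is cheap because k is a small constant (typically 16-32 nearest neighbours), independent of the instance size.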
Accelerating supply chains with Ant Colony Optimization across range of hardware solutions
This pre-print, arXiv:2001.08102v1 [cs.NE], was subsequently published by Elsevier in Computers and Industrial Engineering, vol. 147, 106610, pp. 1-14, on 29 Jun 2020 and is available at https://doi.org/10.1016/j.cie.2020.106610
The Ant Colony algorithm has been applied to various optimization problems; however, most previous work on scaling and parallelism focuses on Travelling Salesman Problems (TSPs). Although useful for benchmarking and comparing new ideas, their algorithmic dynamics do not always transfer to complex real-life problems, where additional meta-data is required during solution construction. This paper examines a real-life outbound supply chain problem using Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO architectures: Independent Ant Colonies (IAC) and Parallel Ants (PA). Results showed that PA reached a higher solution quality in fewer iterations as the number of parallel instances increased. Furthermore, speed was measured across three different hardware solutions: a 16-core CPU, a 68-core Xeon Phi and up to 4 GeForce GPUs. State-of-the-art ACO vectorization techniques such as SS-Roulette were implemented using C++ and CUDA. Although excellent for TSP, it was concluded that GPUs are not suitable for the given supply chain problem due to the required meta-data access footprint. Furthermore, compared to their sequential counterparts, the vectorized CPU AVX2 implementation achieved a 25.4x speedup, while the Xeon Phi with its AVX512 instruction set reached 148x on PA with Vectorization (PAwV). PAwV is therefore able to scale to at least 1024 parallel instances on the supply chain network problem solved.
Parallelised and vectorised ant colony optimization
Ant Colony Optimisation (ACO) is a versatile population-based optimisation metaheuristic
based on the foraging behaviour of certain species of ant, and is part of the
Evolutionary Computation family of algorithms. While ACO generally provides good
quality solutions to the problems it is applied to, two key limitations prevent it from
being truly viable on large-scale problems: a high memory requirement that grows
quadratically with instance size, and high execution time. This thesis presents a parallelised
and vectorised implementation of ACO using OpenMP and AVX SIMD instructions;
while this alone is enough to improve upon the execution time of the algorithm,
this implementation also features an alternative memory structure and a novel candidate
set approach, the use of which significantly reduces the memory requirement of
ACO. This parallelism is enabled through the use of Max-Min Ant System, an ACO
variant that utilises only local memory during the solution process and therefore avoids
synchronisation issues, and an adaptation of vRoulette, a vector-compatible variant
of the common roulette wheel selection method. Through the use of these techniques
ACO is also able to find good-quality solutions for the very large Art TSPs, a problem
set that has traditionally been infeasible to solve with ACO due to its high memory
requirements and execution time. These techniques can also benefit ACO when it
comes to solving other problems. In this case the Virtual Machine Placement problem,
in which Virtual Machines have to be efficiently allocated to Physical Machines in a
cloud environment, is used as a benchmark, with significant improvements to execution
time.
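The roulette-wheel selection that vRoulette vectorises can be illustrated in its plain sequential form. This is a minimal sketch (function name and weights are illustrative, not the thesis's code): each candidate move is picked with probability proportional to its weight, which in ACO combines pheromone and heuristic information.

```cpp
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Illustrative roulette-wheel selection over candidate move weights
// (pheromone^alpha * heuristic^beta in ACO). This is the plain sequential
// scan; vRoulette restructures the same computation so that several
// weights are processed per SIMD instruction, while the selected index
// follows the same probability distribution.
std::size_t roulette(const std::vector<double>& w, std::mt19937& rng) {
    double total = std::accumulate(w.begin(), w.end(), 0.0);
    std::uniform_real_distribution<double> dist(0.0, total);
    double r = dist(rng);
    double acc = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) {
        acc += w[i];
        if (r <= acc) return i;
    }
    return w.size() - 1;  // guard against floating-point rounding at the edge
}
```

Because each ant draws independently from its own local weights, Max-Min Ant System can run one ant per thread without synchronising on shared state during construction.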
OptPlatform: metaheuristic optimisation framework for solving complex real-world problems
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
We optimise daily, whether planning a round trip that visits the most attractions within a given holiday budget or simply taking a train instead of driving a car at rush hour. Many problems just like these are solved by individuals as part of our daily schedule, and they are effortless and straightforward. If we scale that up to many individuals with many different schedules, as in a school timetable, we reach a point where it is simply not feasible or practical to solve by hand. In such instances, optimisation methods are used to obtain an optimal solution. In this thesis, a practical approach to optimisation has been taken by developing an optimisation platform with all the necessary tools to be used by practitioners who are not necessarily familiar with the subject of optimisation. First, a high-performance metaheuristic optimisation framework (MOF) called OptPlatform is implemented, and its versatility and performance are evaluated across multiple benchmarks and real-world optimisation problems. Results show that OptPlatform outperforms competing MOFs in both solution quality and computation time. Second, the most suitable hardware platform for OptPlatform is determined by an in-depth analysis of Ant Colony Optimisation scaling across CPU, GPU and enterprise Xeon Phi. Contrary to the common benchmark problems used in the literature, the supply chain problem solved could not scale on GPUs. Third, a variety of metaheuristics are implemented in OptPlatform, including a new metaheuristic based on the Imperialist Competitive Algorithm (ICA), called ICA with Independence and Constrained Assimilation (ICAwICA). ICAwICA was compared against two different types of benchmark problems, and results show the versatile application of the algorithm, matching and in some cases outperforming custom-tuned approaches.
Finally, essential MOF features like automatic algorithm selection and tuning, which are lacking in existing frameworks, are implemented in OptPlatform. Two novel approaches are proposed and compared to existing methods. Results indicate the superiority of the implemented tuning algorithms within a constrained tuning budget environment.
Tools and Algorithms for the Construction and Analysis of Systems
This book is Open Access under a CC BY licence. The LNCS 11427 and 11428 proceedings set constitutes the proceedings of the 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019. The 42 full and 8 short tool demo papers presented in these volumes were carefully reviewed and selected from 164 submissions. The papers are organized in topical sections as follows: Part I: SAT and SMT, SAT solving and theorem proving; verification and analysis; model checking; tool demo; and machine learning. Part II: concurrent and distributed systems; monitoring and runtime verification; hybrid and stochastic systems; synthesis; symbolic verification; and safety and fault-tolerant systems.
Discontinuous Galerkin Spectral Element Methods for Astrophysical Flows in Multi-physics Applications
In engineering applications, discontinuous Galerkin (DG) methods have proven to be a powerful and flexible class of high-order methods for problems in computational fluid dynamics. However, the potential benefits of DG for applications in astrophysical contexts are still relatively unexplored in their entirety. To this day, a decent number of studies surveying DG for astrophysical flows have been conducted, but the adoption of DG by the astrophysics community is only beginning to gain traction, and the integration of DG into established multi-physics simulation frameworks for comprehensive astrophysical modeling is still lacking. It is our firm belief that the full potential of novel approaches for numerically solving the fluid equations only shows under the pressure of real-world simulations, with all aspects of multi-physics, challenging flow configurations, resolution and runtime constraints, and efficiency metrics on high-performance systems involved. Thus, we see the pressing need to propel DG from the well-trodden path of cataloguing test results under "optimal" laboratory conditions towards the harsh and unforgiving environment of large-scale astrophysics simulations. Consequently, the core of this work is the development and deployment of a robust DG scheme solving the ideal magneto-hydrodynamics equations with multiple species on three-dimensional Cartesian grids with adaptive mesh refinement. We chose to implement DG within the venerable simulation framework FLASH, with a specific focus on multi-physics problems in astrophysics. This entails modifications of the vanilla DG scheme to make it fit seamlessly within FLASH in such a way that all other physics modules can be naturally coupled without additional implementation overhead. A key ingredient is that our DG scheme uses mean value data organized into blocks - the central data structure in FLASH.
Having the opportunity to work on mean values allows us to rely on a rock-solid, monotone Finite Volume (FV) scheme as a "backup" whenever the high-order DG method fails in cases where the flow gets too harsh. Finding ways to combine the two schemes in a fail-safe manner without losing primary conservation, while still maintaining high-order accuracy for smooth, well-resolved flows, involves a series of careful considerations, which we document in this thesis. The result of our work is a novel shock-capturing scheme - a hybrid between FV and DG - with smooth transitions between low- and high-order fluxes according to solution smoothness estimators. We present extensive validations and test cases, specifically its interaction with multi-physics modules in FLASH such as (self-)gravity and radiative transfer. We also investigate the benefits and pitfalls of integrating end-to-end entropy stability into our numerical scheme, with a special focus on highly compressible turbulent flows and shocks. Our implementation of DG in FLASH allows us to conduct preliminary yet comprehensive astrophysics simulations, proving that our new solver is ready for assessment and investigation by the astrophysics community.
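The smooth transition between the two flux families can be sketched as a convex combination steered by the smoothness estimator. This is a minimal illustration of the blending idea only, not the thesis's actual scheme; the function name and the meaning of `alpha` are assumptions.

```cpp
#include <algorithm>

// Illustrative hybrid flux blending (not the thesis's actual scheme):
// a smoothness estimator alpha in [0, 1] steers a convex combination of
// the high-order DG flux and the monotone FV flux.
// alpha = 0 -> pure DG (smooth, well-resolved flow),
// alpha = 1 -> pure FV (shock or under-resolved flow).
double blended_flux(double flux_dg, double flux_fv, double alpha) {
    alpha = std::clamp(alpha, 0.0, 1.0);  // keep the weight admissible
    return (1.0 - alpha) * flux_dg + alpha * flux_fv;
}
```

Because the combination is convex, the blended flux inherits a bound between the two ingredient fluxes, which is what makes the fail-safe fallback possible without abrupt switching.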
XXIII Congreso Argentino de Ciencias de la Computación - CACIC 2017: Libro de actas
Papers presented at the XXIII Argentine Congress of Computer Science (CACIC), held in the city of La Plata from 9 to 13 October 2017, organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Informática of the Universidad Nacional de La Plata (UNLP).