Search CORE

150 research outputs found

Parallelization Strategies for Ant Colony Optimisation on GPUs

Author: Amos Martyn
Cecilia Jose M.
Garcia Jose M.
Nisbet Andy
Ujaldon Manuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

Ant Colony Optimisation (ACO) is an effective population-based meta-heuristic for the solution of a wide variety of problems. As a population-based algorithm, its computation is intrinsically massively parallel, and it is there- fore theoretically well-suited for implementation on Graphics Processing Units (GPUs). The ACO algorithm comprises two main stages: Tour construction and Pheromone update. The former has been previously implemented on the GPU, using a task-based parallelism approach. However, up until now, the latter has always been implemented on the CPU. In this paper, we discuss several parallelisation strategies for both stages of the ACO algorithm on the GPU. We propose an alternative data-based parallelism scheme for Tour construction, which fits better on the GPU architecture. We also describe novel GPU programming strategies for the Pheromone update stage. Our results show a total speed-up exceeding 28x for the Tour construction stage, and 20x for Pheromone update, and suggest that ACO is a potentially fruitful area for future research in the GPU domain.Comment: Accepted by 14th International Workshop on Nature Inspired Distributed Computing (NIDISC 2011), held in conjunction with the 25th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS 2011

arXiv.org e-Print Archive

CiteSeerX

Northumbria Research Link

Crossref

Applying ACO To Large Scale TSP Instances

Author: A DeléVacq
JH Holland
JM Cecilia
M Dorigo
M Dorigo
M Randall
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/09/2017
Field of study

Ant Colony Optimisation (ACO) is a well known metaheuristic that has proven successful at solving Travelling Salesman Problems (TSP). However, ACO suffers from two issues; the first is that the technique has significant memory requirements for storing pheromone levels on edges between cities and second, the iterative probabilistic nature of choosing which city to visit next at every step is computationally expensive. This restricts ACO from solving larger TSP instances. This paper will present a methodology for deploying ACO on larger TSP instances by removing the high memory requirements, exploiting parallel CPU hardware and introducing a significant efficiency saving measure. The approach results in greater accuracy and speed. This enables the proposed ACO approach to tackle TSP instances of up to 200K cities within reasonable timescales using a single CPU. Speedups of as much as 1200 fold are achieved by the technique

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

A Review on GPU Based Parallel Computing for NP Problems

Author: Swati S. Dhable, Santosh Kumar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2016
Field of study

Now a days there are different number of optimization problems are present. Which are NP problems to solve this problems parallel metaheuristic algorithm are required. Graph theories are most commonly studied combinational problems. In this paper providing the new move towards solve this combinational problem with GPU based parallel computing using CUDA architecture. Comparing those problem with relevant to the transfer rate, effective memory utilization and speedup etc. to acquire the paramount possible solution. By applying the different algorithms on the optimization problem to catch the efficient memory exploitation, synchronized execution, saving time and increasing speedup of execution. Due to this the speedup factor is enhance and get the best optimal solution

International Journal on Recent and Innovation Trends in Computing and Communication

Dynamic load balancing on heterogeneous clusters for parallel ant colony optimization

Author: A Delévacq
A Martin
Antonia Sánchez
Antonio Llanes
B Yu
BR Ke
DE Goldberg
DE Goldberg
DS Johnson
E Alba
E Lawler
G Michell De
G Reinelt
G Rozenberg
J Nickolls
J Shalf
JM Cecilia
JM Cecilia
José M. Cecilia
José M. García
Komarudin
M Dorigo
M Dorigo
M Dorigo
M Dorigo
M Dorigo
M Manfrin
M Pedemonte
Manuel Ujaldón
Martyn Amos
MP Garcia
R González
R Rahman
RSS Chang
T Stutzle
T Stützle
W Wolf
Y Chen
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

© 2016 Springer Science+Business Media New York Ant colony optimisation (ACO) is a nature-inspired, population-based metaheuristic that has been used to solve a wide variety of computationally hard problems. In order to take full advantage of the inherently stochastic and distributed nature of the method, we describe a parallelization strategy that leverages these features on heterogeneous and large-scale, massively-parallel hardware systems. Our approach balances workload effectively, by dynamically assigning jobs to heterogeneous resources which then run ACO implementations using different search strategies. Our experimental results confirm that we can obtain significant improvements in terms of both solution quality and energy expenditure, thus opening up new possibilities for the development of metaheuristic-based solutions to “real world” problems on high-performance, energy-efficient contemporary heterogeneous computing platforms

Institutional Repository UCAM

Northumbria Research Link

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

E-space: Manchester Metropolitan University's Research Repository

Parallel Ant Colony Optimization: Algorithmic Models and Hardware Implementations

Author: Pierre Delisle
Publication venue: 'IntechOpen'
Publication date: 20/02/2013
Field of study

IntechOpen

Generic Techniques in General Purpose GPU Programming with Applications to Ant Colony and Image Processing Algorithms

Author: DAWSON LAURENCE,JAMES
Publication venue
Publication date: 01/01/2015
Field of study

In 2006 NVIDIA introduced a new unified GPU architecture facilitating general-purpose computation on the GPU. The following year NVIDIA introduced CUDA, a parallel programming architecture for developing general purpose applications for direct execution on the new unified GPU. CUDA exposes the GPU's massively parallel architecture of the GPU so that parallel code can be written to execute much faster than its sequential counterpart. Although CUDA abstracts the underlying architecture, fully utilising and scheduling the GPU is non-trivial and has given rise to a new active area of research. Due to the inherent complexities pertaining to GPU development, in this thesis we explore and find efficient parallel mappings of existing and new parallel algorithms on the GPU using NVIDIA CUDA. We place particular emphasis on metaheuristics, image processing and designing reusable techniques and mappings that can be applied to other problems and domains. We begin by focusing on Ant Colony Optimisation (ACO), a nature inspired heuristic approach for solving optimisation problems. We present a versatile improved data-parallel approach for solving the Travelling Salesman Problem using ACO resulting in significant speedups. By extending our initial work, we show how existing mappings of ACO on the GPU are unable to compete against their sequential counterpart when common CPU optimisation strategies are employed and detail three distinct candidate set parallelisation strategies for execution on the GPU. By further extending our data-parallel approach we present the first implementation of an ACO-based edge detection algorithm on the GPU to reduce the execution time and improve the viability of ACO-based edge detection. We finish by presenting a new color edge detection technique using the volume of a pixel in the HSI color space along with a parallel GPU implementation that is able to withstand greater levels of noise than existing algorithms

Durham e-Theses

An Efficient Ant Colony Optimization Framework for HPC Environments

Author: Banga Julio R.
Doallo Ramón
González Patricia
Osorio Roberto
Pardo Xoán C.
Publication venue: 'Elsevier BV'
Publication date: 18/11/2021
Field of study

Financiado para publicación en acceso aberto: Universidade da Coruña/CISUG[Abstract] Combinatorial optimization problems arise in many disciplines, both in the basic sciences and in applied fields such as engineering and economics. One of the most popular combinatorial optimization methods is the Ant Colony Optimization (ACO) metaheuristic. Its parallel nature makes it especially attractive for implementation and execution in High Performance Computing (HPC) environments. Here we present a novel parallel ACO strategy making use of efficient asynchronous decentralized cooperative mechanisms. This strategy seeks to fulfill two objectives: (i) acceleration of the computations by performing the ants’ solution construction in parallel; (ii) convergence improvement through the stimulation of the diversification in the search and the cooperation between different colonies. The two main features of the proposal, decentralization and desynchronization, enable a more effective and efficient response in environments where resources are highly coupled. Examples of such infrastructures include both traditional HPC clusters, and also new distributed environments, such as cloud infrastructures, or even local computer networks. The proposal has been evaluated using the popular Traveling Salesman Problem (TSP), as a well-known NP-hard problem widely used in the literature to test combinatorial optimization methods. An exhaustive evaluation has been carried out using three medium and large size instances from the TSPLIB library, and the experiments show encouraging results with superlinear speedups compared to the sequential algorithm (e.g. speedups of 18 with 16 cores), and a very good scalability (experiments were performed with up to 384 cores improving execution time even at that scale).This work was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00 / AEI / 10.13039/501100011033), and by Xunta de Galicia and FEDER funds of the EU (Centro de Investigación de Galicia accreditation 2019–2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30). JRB acknowledges funding from the Ministry of Science and Innovation of Spain MCIN / AEI / 10.13039/501100011033 through grant PID2020-117271RB-C22 (BIODYNAMICS), and from MCIN / AEI / 10.13039/501100011033 and “ERDF A way of making Europe” through grant DPI2017-82896-C2-2-R (SYNBIOCONTROL). Authors also acknowledge the Galician Supercomputing Center (CESGA) for the access to its facilities. Funding for open access charge: Universidade da Coruña/CISUGXunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2021/3

Repositorio da Universidade da Coruña

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Digital.CSIC

Soft Computing Techiniques for the Protein Folding Problem on High Performance Computing Architectures

Author: Arcas Túnez Francisco
Bueno Crespo Andrés
Cecilia Canales José María
García Valverde Teresa
Llanes Antonio
Muñoz Andrés
Pérez Sánchez Horacio
Sánchez Antonia María
Publication venue: 'Bentham Science Publishers Ltd.'
Publication date: 01/01/2016
Field of study

The protein-folding problem has been extensively studied during the last fifty years. The understanding of the dynamics of global shape of a protein and the influence on its biological function can help us to discover new and more effective drugs to deal with diseases of pharmacological relevance. Different computational approaches have been developed by different researchers in order to foresee the threedimensional arrangement of atoms of proteins from their sequences. However, the computational complexity of this problem makes mandatory the search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame. We present in this revision work the past and last tendencies regarding protein folding simulations from both perspectives; hardware and software. Of particular interest to us are both the use of inexact solutions to this computationally hard problem as well as which hardware platforms have been used for running this kind of Soft Computing techniques.This work is jointly supported by the FundaciónSéneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13, by the Spanish MEC and European Commission FEDER under grant with reference TEC2012-37945-C02-02 and TIN2012-31345, by the Nils Coordinated Mobility under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF). We also thank NVIDIA for hardware donation within UCAM GPU educational and research centers.Ingeniería, Industria y Construcció

Institutional Repository UCAM

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas