
6 research outputs found

Problema de asignación quadrática (pac) sobre gpu a través de una pga maestro-esclavo

Author: Amarillo Calvo Víctor Hugo
Castellanos Millán Julián Octavio
Poveda Chaves Roberto Manuel
Publication venue: Universidad Distrital Francisco Jose de Caldas
Publication date: 20/12/2016

This document describes the implementation of a master-slave parallel genetic algorithm (PGA) on graphics processing units (GPUs) to find optimal or near-optimal solutions to particular instances of the Quadratic Assignment Problem (QAP). The efficiency of the algorithm is tested on a set of problems from the standard QAPLIB library.

Universidad Distrital de la ciudad de Bogotá: Open Journal Systems

GPU-accelerated Parallel Solutions to the Quadratic Assignment Problem

Author: Novoa Clara
Qasem Apan
Publication date: 20/07/2023

The Quadratic Assignment Problem (QAP) is an important combinatorial optimization problem with applications in many areas, including logistics and manufacturing. QAP is known to be NP-hard, so finding acceptable solutions for most real-world data sets requires sophisticated heuristics. In this paper, we present GPU-accelerated implementations of a 2opt and a tabu search algorithm for solving the QAP. For both algorithms, we extract parallelism at multiple levels and implement novel code optimization techniques that fully utilize the GPU hardware. In a series of experiments on the well-known QAPLIB data sets, our solutions run, on average, an order of magnitude faster than previous implementations and deliver speedups of up to a factor of 63 on specific instances. The quality of the solutions produced by our implementations of 2opt and tabu search is within 1.03% and 0.15% of the best known values, respectively. The experimental results also provide key insight into the performance characteristics of accelerated QAP solvers. In particular, the results reveal that both algorithmic choice and the shape of the input data sets are key factors in finding efficient implementations.

Comment: 25 pages, 9 figures; parts of this work appeared as short papers at the XSEDE14 and XSEDE15 conferences. This version of the paper is a substantial extension of previous work, with optimizations for newer GPU platforms and extended experimental results.

arXiv.org e-Print Archive
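
As a rough illustration of the 2opt local search named in the abstract above, the following Python sketch evaluates the usual QAP objective (the sum over facility pairs of flow times distance) and repeatedly applies improving pairwise swaps. It is a sequential sketch under stated assumptions, not the authors' GPU implementation: the names `qap_cost`, `two_opt`, `flow`, and `dist` are hypothetical, and the first-improvement acceptance rule is an assumption.

```python
from itertools import combinations


def qap_cost(flow, dist, perm):
    """Cost of placing facility i at location perm[i] (standard QAP objective)."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


def two_opt(flow, dist, perm):
    """Pairwise-swap descent: keep any swap that lowers the cost.

    Assumption: first-improvement acceptance; the paper may use another rule.
    """
    perm = list(perm)
    best = qap_cost(flow, dist, perm)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(perm)), 2):
            perm[i], perm[j] = perm[j], perm[i]      # try the swap
            cost = qap_cost(flow, dist, perm)
            if cost < best:
                best = cost                          # keep the improvement
                improved = True
            else:
                perm[i], perm[j] = perm[j], perm[i]  # undo the swap
    return perm, best


if __name__ == "__main__":
    # Toy 3x3 instance (made up for illustration, not from QAPLIB).
    flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
    dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
    print(two_opt(flow, dist, [0, 1, 2]))
```

Recomputing the full cost after every swap costs O(n^2) per move; practical solvers usually evaluate only the change caused by the swap, which is one place where GPU parallelism can be applied.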

PGAGrid: A Parallel Genetic Algorithm of Fine-Grained implemented on GPU to find solutions near the optimum to the Quadratic Assignment Problem (QAP)

Author: Poveda Chaves Roberto Manuel
Publication date: 01/01/2019

This work implements a fine-grained parallel genetic algorithm, improved with a greedy 2-opt heuristic, to find near-optimal solutions to the Quadratic Assignment Problem (QAP). The proposed algorithm was fully implemented on Graphics Processing Units (GPUs). A two-dimensional GPU grid of size 8x8 defines the population of the genetic algorithm (a set of permutations of the QAP), and each GPU block consists of n GPU threads, where n is the size of the QAP. Each GPU block represents the chromosome of a single individual, and each GPU thread represents a gene of that chromosome. The proposed algorithm was tested on a subset of the standard QAPLIB data set. Results show that this implementation is able to find good solutions for large QAP instances in a few parallel iterations of the evolutionary process.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Nacional De Colombia - Repositorio Institucional UN
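
The population layout described in the abstract above (an 8x8 grid of blocks, one individual per block, one thread per gene) can be mimicked on the CPU with the Python sketch below. It only illustrates the data layout: the split of the cost evaluation into one partial sum per "thread" followed by a reduction is an assumption, and the names `make_population`, `thread_partial`, `block_cost`, and `evaluate` are hypothetical rather than taken from the thesis.

```python
import random

GRID = 8  # 8x8 grid of blocks, one individual per block (from the abstract)


def make_population(n, seed=0):
    """Build an 8x8 grid of random permutations of size n (one per 'block')."""
    rng = random.Random(seed)
    return [[rng.sample(range(n), n) for _ in range(GRID)]
            for _ in range(GRID)]


def thread_partial(flow, dist, perm, t):
    """Work of hypothetical 'thread' t: the cost contribution of gene t."""
    return sum(flow[t][j] * dist[perm[t]][perm[j]] for j in range(len(perm)))


def block_cost(flow, dist, perm):
    """Reduce the n per-thread partial sums into the individual's QAP cost."""
    return sum(thread_partial(flow, dist, perm, t) for t in range(len(perm)))


def evaluate(flow, dist, population):
    """Evaluate every block of the grid; on a GPU these would run in parallel."""
    return [[block_cost(flow, dist, ind) for ind in row] for row in population]


if __name__ == "__main__":
    # Toy 4x4 instance (made up for illustration, not from QAPLIB).
    flow = [[0, 3, 0, 2], [3, 0, 1, 0], [0, 1, 0, 4], [2, 0, 4, 0]]
    dist = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
    costs = evaluate(flow, dist, make_population(4))
    print(min(min(row) for row in costs))
```

In a CUDA version, the two grid indices would correspond to the block indices and `t` to the thread index within a block; the sequential loops here stand in for that parallel execution.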

Scalable parallel evolutionary optimisation based on high performance computing

Author: Jin C
Publication venue: RMIT University

Evolutionary algorithms (EAs) have been successfully applied to solve various challenging optimisation problems. Due to their stochastic nature, EAs typically require considerable time to find desirable solutions, especially for increasingly complex and large-scale problems. As a result, many works have studied implementing EAs on parallel computing facilities to accelerate the time-consuming processes. Recently, the rapid development of modern parallel computing facilities such as high performance computing (HPC) has brought not only unprecedented computational capabilities but also challenges in designing parallel algorithms. This thesis mainly focuses on designing scalable parallel evolutionary optimisation (SPEO) frameworks that run efficiently on HPC. Motivated by the observation that many EAs are beginning to employ increasingly large population sizes, this thesis first studies the effect of a large population size through comprehensive experiments. Numerical results indicate that a large population benefits the solving of complex problems but requires a large maximum number of fitness evaluations (FEs). However, since sequential EAs usually require considerable computing time to achieve extensive FEs, we propose a scalable parallel evolutionary optimisation framework that can efficiently deploy parallel EAs over many CPU cores on CPU-only HPC. On the other hand, since EAs using a large number of FEs can produce a large amount of useful information in the course of evolution, we design a surrogate-based approach to learn from this historical information and to better solve complex problems. This approach is then implemented in parallel on the proposed scalable parallel framework to achieve remarkable speedups. Since demanding great computing power on CPU-only HPC is usually very expensive, we design a framework based on GPU-enabled HPC to improve the cost-effectiveness of parallel EAs. The proposed framework can efficiently accelerate parallel EAs using many GPUs and can achieve superior cost-effectiveness. However, since it is very challenging to correctly implement parallel EAs on the GPU, we propose a set of guidelines to verify the correctness of GPU-based EAs. To examine these guidelines, they are employed to verify a GPU-based brain storm optimisation that is also proposed in this thesis. In conclusion, a comprehensive experimental study is first conducted to investigate the impacts of a large population. After that, an SPEO framework based on CPU-only HPC is proposed and employed to accelerate a time-consuming implementation of an EA. Finally, the correctness verification of EAs implemented on a single GPU is discussed, and the SPEO framework is then extended to be deployed on GPU-enabled HPC.

RMIT Research Repository

ACO on multiple GPUS with CUDA for faster solution of QAPs

Author: Shigeyoshi Tsutsui
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2012

In this paper, we implement ACO algorithms on a PC that has four GTX 480 GPUs. We implement two types of ACO models: the island model and the master/slave model. When we compare the two, the island model shows promising speedup values on class (iv) QAP instances. The master/slave model, on the other hand, shows promising speedup values on both classes (i) and (iv) with large-size QAP instances.

CiteSeerX