Search CORE

131 research outputs found

A Review on GPU Based Parallel Computing for NP Problems

Author: Swati S. Dhable, Santosh Kumar
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2016

Many present-day optimization problems are NP problems, and solving them calls for parallel metaheuristic algorithms. Graph-theoretic problems are among the most commonly studied combinatorial problems. This paper presents a new approach to solving such combinatorial problems with GPU-based parallel computing on the CUDA architecture. The problems are compared with respect to transfer rate, effective memory utilization, speedup, and similar measures in order to obtain the best possible solution. Different algorithms are applied to the optimization problem to achieve efficient memory use, synchronized execution, time savings, and increased execution speedup. As a result, the speedup factor improves and the best solution is obtained.

International Journal on Recent and Innovation Trends in Computing and Communication

Parallelization of Ant System for GPU under the PRAM Model

Author: Brodnik Andrej, Grgurovič Marko
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 03/05/2018

We study the parallelized ant system algorithm solving the traveling salesman problem on n cities. First, following the series of recent results for the graphics processing unit, we show that they translate to the PRAM (parallel random access machine) model. In addition, we develop a novel pheromone matrix update method under the PRAM CREW (concurrent-read exclusive-write) model and translate it to the graphics processing unit without atomic instructions. As a consequence, we give new asymptotic bounds for the parallel ant system, resulting in step complexities O(n lg lg n) on CRCW (concurrent-read concurrent-write) and O(n lg n) on CREW variants of PRAM using n^2 processors in both cases. Finally, we present an experimental comparison with the currently known pheromone matrix update methods on the graphics processing unit and obtain encouraging results.
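
The idea of an exclusive-write pheromone update can be illustrated with a small Python sketch. This is not the method from the paper; it is a simple gather-style formulation, assuming the standard Ant System rule (evaporation by a factor 1 - rho, then a deposit of q / L for every ant whose tour of length L uses the edge). The names `gather_pheromone_update`, `rho` and `q` are hypothetical. Each matrix cell is computed by exactly one logical owner that only reads the tours, so no two writers ever touch the same cell and no atomic additions are needed.

```python
def tour_length(tour, dist):
    """Length of the closed tour that returns to its first city."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def gather_pheromone_update(tau, tours, dist, rho=0.5, q=1.0):
    """Return a new pheromone matrix for a symmetric TSP.

    Assumption: standard Ant System update, not the paper's PRAM method.
    """
    n = len(tau)
    lengths = [tour_length(t, dist) for t in tours]

    # successor[a][i] is the city that ant a visits right after city i.
    successor = []
    for t in tours:
        nxt = [0] * n
        for k in range(n):
            nxt[t[k]] = t[(k + 1) % n]
        successor.append(nxt)

    new_tau = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # The body of this loop is the work of ONE owner of cell (i, j):
            # it reads shared data and writes only new_tau[i][j].
            deposit = 0.0
            for a, nxt in enumerate(successor):
                if nxt[i] == j or nxt[j] == i:
                    deposit += q / lengths[a]
            new_tau[i][j] = (1.0 - rho) * tau[i][j] + deposit
    return new_tau
```

On a GPU the two outer loops would map to one thread per cell (n^2 threads); the sketch keeps them sequential only to stay runnable without special hardware.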

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

A Parallel Meta-Heuristic Approach to Reduce Vehicle Travel Time in Smart Cities

Author: Jimeno-Morenilla Antonio, Migallón Gomis Héctor, Rico Héctor, Sanchez-Romero Jose-Luis
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

The development of the smart city concept and inhabitants’ need to reduce travel time, in addition to society’s awareness of the importance of reducing fuel consumption and respecting the environment, have led to a new approach to the classic travelling salesman problem (TSP) applied to urban environments. This problem can be formulated as “Given a list of geographic points and the distances between each pair of points, what is the shortest possible route that visits each point and returns to the departure point?”. At present, with the development of Internet of Things (IoT) devices and increased capabilities of sensors, a large amount of data and measurements are available, allowing researchers to model accurately the routes to choose. In this work, the aim is to provide a solution to the TSP in smart city environments using a modified version of the metaheuristic optimization algorithm Teacher Learner Based Optimization (TLBO). In addition, to improve performance, the solution is implemented by means of a parallel graphics processing unit (GPU) architecture, specifically a Compute Unified Device Architecture (CUDA) implementation. This research was supported by the Spanish Ministry of Science, Innovation and Universities and the Research State Agency under Grant RTI2018-098156-B-C54 co-financed by FEDER funds, and by the Spanish Ministry of Economy and Competitiveness under Grant TIN2017-89266-R, co-financed by FEDER funds.
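
The TSP formulation quoted in the abstract can be made concrete with a short Python sketch. It is not the modified TLBO algorithm of the paper; it only shows the objective (the length of a closed route) and an exhaustive search that is practical for a handful of points. The function names and the four-point example are hypothetical.

```python
import math
from itertools import permutations


def closed_route_length(route, dist):
    """Sum of leg distances, including the leg back to the departure point."""
    n = len(route)
    return sum(dist[route[k]][route[(k + 1) % n]] for k in range(n))


def brute_force_tsp(dist):
    """Try every route that starts at point 0; only feasible for tiny inputs."""
    n = len(dist)
    best_route, best_len = None, math.inf
    for perm in permutations(range(1, n)):
        route = (0,) + perm
        length = closed_route_length(route, dist)
        if length < best_len:
            best_route, best_len = route, length
    return best_route, best_len


# Hypothetical example: the four corners of a unit square.
points = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[math.dist(p, q) for q in points] for p in points]
route, length = brute_force_tsp(dist)
```

The number of candidate routes grows factorially with the number of points, which is why metaheuristics and parallel hardware are used for realistic instances.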

Repositorio Institucional de la Universidad de Alicante

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Parallel Ant Colony Optimization: Algorithmic Models and Hardware Implementations

Author: Pierre Delisle
Publication venue: 'IntechOpen'
Publication date: 20/02/2013

IntechOpen

Parallelization Strategies for Ant Colony Optimisation on GPUs

Author: Amos Martyn, Cecilia Jose M., Garcia Jose M., Nisbet Andy, Ujaldon Manuel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Ant Colony Optimisation (ACO) is an effective population-based meta-heuristic for the solution of a wide variety of problems. As a population-based algorithm, its computation is intrinsically massively parallel, and it is therefore theoretically well-suited for implementation on Graphics Processing Units (GPUs). The ACO algorithm comprises two main stages: Tour construction and Pheromone update. The former has been previously implemented on the GPU, using a task-based parallelism approach. However, up until now, the latter has always been implemented on the CPU. In this paper, we discuss several parallelisation strategies for both stages of the ACO algorithm on the GPU. We propose an alternative data-based parallelism scheme for Tour construction, which fits better on the GPU architecture. We also describe novel GPU programming strategies for the Pheromone update stage. Our results show a total speed-up exceeding 28x for the Tour construction stage, and 20x for Pheromone update, and suggest that ACO is a potentially fruitful area for future research in the GPU domain. Comment: Accepted by 14th International Workshop on Nature Inspired Distributed Computing (NIDISC 2011), held in conjunction with the 25th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS 2011).
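
The Tour construction stage mentioned above can be sketched in Python as follows. This is a minimal sequential illustration, assuming the usual random-proportional rule (weight = pheromone^alpha * (1/distance)^beta); it is not the GPU scheme from the paper, and the names `construct_tour`, `construct_all_tours`, `alpha` and `beta` are hypothetical. The comments mark where the two kinds of parallelism discussed in the abstract would apply.

```python
import random


def construct_tour(tau, dist, alpha=1.0, beta=2.0, rng=None):
    """Build one ant's tour with roulette-wheel selection of the next city."""
    if rng is None:
        rng = random.Random()
    n = len(dist)
    current = rng.randrange(n)
    tour = [current]
    unvisited = set(range(n)) - {current}
    while unvisited:
        candidates = sorted(unvisited)
        # Each weight depends on a single candidate city only, so this list
        # is the part a data-parallel scheme could compute one city per thread.
        weights = [
            (tau[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
            for j in candidates
        ]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(current)
        unvisited.remove(current)
    return tour


def construct_all_tours(tau, dist, n_ants, rng=None):
    """Ants do not interact while building tours, so a task-parallel
    scheme could run each iteration of this loop as one independent task."""
    if rng is None:
        rng = random.Random()
    return [construct_tour(tau, dist, rng=rng) for _ in range(n_ants)]
```

The sketch assumes strictly positive distances between distinct cities, since the heuristic term divides by the distance.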

arXiv.org e-Print Archive

CiteSeerX

Northumbria Research Link

Crossref

Adaptive large neighborhood search algorithm – performance evaluation under parallel schemes & applications

Author: Kumar Sandip
Publication venue: Scholars Junction
Publication date: 12/05/2023

Adaptive Large Neighborhood Search (ALNS) is a fairly recent yet popular single-solution heuristic for solving discrete optimization problems. Although the heuristic has been a popular choice among researchers in recent times, its parallelization has not been widely studied in the literature compared with other classical metaheuristics. To extend the existing literature, this study proposes several parallel schemes for the basic/sequential ALNS algorithm. More specifically, seven different parallel schemes are employed to target different characteristics of the ALNS algorithm and the capabilities of the local computers. The schemes are implemented in a master-slave architecture to manage and assign loads across the processors of the local computers. The overall goal is to explore different areas of the search space simultaneously in an attempt to escape local minima, take effective steps toward the optimal solution and, ultimately, accelerate the convergence of the ALNS algorithm. The performance of the schemes is tested by solving a capacitated vehicle routing problem (CVRP) on well-known test instances. Our computational results indicate that all the parallel schemes provide a competitive optimality gap for CVRP on the investigated test instances. However, scheme 1 provides the best running time, solving the instances approximately 10.5 times faster than the basic/sequential ALNS algorithm. In this scheme, the ALNS algorithm runs independently on different slave processors (e.g., without sharing any information with other slave processors), and synchronization occurs only when one of the processors meets its predefined termination criteria and reports its solution to the master processor. These findings are applied in a real-life fulfillment process using mixed-mode delivery with trucks and drones. Complex but optimized routes are generated in a short time, making them applicable to last-mile delivery to customers.
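
The control flow of the independent-run scheme (scheme 1) can be sketched in Python. This is only an illustration under stated assumptions: a plain hill-climbing step stands in for the ALNS destroy and repair operators, a toy sum-of-squares objective stands in for CVRP, and threads stand in for the slave processors (Python threads show the coordination pattern, not a real speedup). All names are hypothetical.

```python
import random
import threading


def sum_of_squares(x):
    """Toy objective standing in for a routing cost."""
    return sum(v * v for v in x)


def nudge(x, rng):
    """Toy move: change one coordinate by +1 or -1."""
    i = rng.randrange(len(x))
    y = list(x)
    y[i] += rng.choice((-1, 1))
    return tuple(y)


def _worker(idx, seed, initial, cost, perturb, max_iters, stop, results):
    rng = random.Random(seed)
    best = initial
    for _ in range(max_iters):
        if stop.is_set():          # another worker has already terminated
            break
        candidate = perturb(best, rng)
        if cost(candidate) < cost(best):
            best = candidate
    results[idx] = best            # report to the master
    stop.set()                     # the first finisher triggers the synchronization


def run_independent_workers(initial, cost, perturb, n_workers=4, max_iters=2000):
    """Master: start independent searches, wait, keep the best reported solution."""
    stop = threading.Event()
    results = [None] * n_workers
    threads = [
        threading.Thread(
            target=_worker,
            args=(i, i, initial, cost, perturb, max_iters, stop, results),
        )
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return min(results, key=cost)


start = (5, -3, 4)
best = run_independent_workers(start, sum_of_squares, nudge)
```

Workers share nothing except the stop flag, which mirrors the absence of information exchange described for this scheme.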

Scholars Junction - Mississippi State University Institutional Repository

Re-engineering the ant colony optimization for CMP architectures

Author: Cecilia-Canales José María, García Carrasco Jose Manuel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2020

The ant colony optimization (ACO) is inspired by the behavior of real ants, and as a bioinspired method, its underlying computation is massively parallel by definition. This paper shows re-engineering strategies to migrate the ACO algorithm applied to the Traveling Salesman Problem to modern Intel-based multi- and many-core architectures in a step-by-step methodology. The paper provides detailed guidelines on how to optimize the algorithm for the intra-node (thread and vector) parallelization, showing the performance scalability along with the number of cores on different Intel architectures, reporting up to 5.5x speedup factor between the Intel Xeon Phi Knights Landing and Intel Xeon v2. Moreover, parallel efficiency is provided for all targeted architectures, finding that core load imbalance, memory bandwidth limitations, and NUMA effects on data placement are some of the key factors limiting performance. Finally, a distributed implementation is also presented, reaching up to 2.96x speedup factor when running the code on 3 nodes over the single-node counterpart version. In the latter case, the parallel efficiency is affected by the synchronization frequency, which also affects the quality of the solution found by the distributed implementation. This work was partially supported by the Fundación Séneca, Agencia de Ciencia y Tecnología de la Región de Murcia under Project 20813/PI/18, and by Spanish Ministry of Science, Innovation and Universities as well as European Commission FEDER funds under Grants TIN2015-66972-C5-3-R, RTI2018-098156-B-C53, TIN2016-78799-P (AEI/FEDER, UE), and RTC-2017-6389-5. We acknowledge the excellent work done by Victor Montesinos while he was doing a research internship supported by the University of Murcia. Cecilia-Canales, JM.; García Carrasco, JM. (2020). Re-engineering the ant colony optimization for CMP architectures. The Journal of Supercomputing (Online). 76(6):4581-4602.
https://doi.org/10.1007/s11227-019-02869-8
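
The speedup and parallel efficiency figures in this abstract are related by simple arithmetic, sketched below in Python. The definitions used are the common ones (speedup = baseline time / parallel time, efficiency = speedup / number of workers); whether the paper computes efficiency in exactly this way is an assumption, and the function names are hypothetical.

```python
def speedup(t_baseline, t_parallel):
    """How many times faster the parallel run is than the baseline run."""
    return t_baseline / t_parallel


def parallel_efficiency(speedup_factor, n_workers):
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup_factor / n_workers


# Reported distributed result: 2.96x on 3 nodes over the single-node version.
efficiency_3_nodes = parallel_efficiency(2.96, 3)   # about 0.987
```

Under these definitions, a 2.96x speedup on 3 nodes corresponds to an efficiency of roughly 98.7%, that is, close to linear scaling.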

RiuNet