100 research outputs found

    A Parallel Meta-Heuristic Approach to Reduce Vehicle Travel Time in Smart Cities

    Get PDF
    The development of the smart city concept and inhabitants’ need to reduce travel time, together with society’s awareness of the importance of reducing fuel consumption and respecting the environment, have led to a new approach to the classic travelling salesman problem (TSP) applied to urban environments. The problem can be formulated as: “Given a list of geographic points and the distances between each pair of points, what is the shortest possible route that visits each point and returns to the departure point?” With the development of Internet of Things (IoT) devices and the increased capabilities of sensors, a large volume of data and measurements is now available, allowing researchers to model candidate routes accurately. In this work, the aim is to provide a solution to the TSP in smart city environments using a modified version of the metaheuristic optimization algorithm Teacher Learner Based Optimization (TLBO). In addition, to improve performance, the solution is implemented on a parallel graphics processing unit (GPU) architecture, specifically as a Compute Unified Device Architecture (CUDA) implementation. This research was supported by the Spanish Ministry of Science, Innovation and Universities and the Research State Agency under Grant RTI2018-098156-B-C54, co-financed by FEDER funds, and by the Spanish Ministry of Economy and Competitiveness under Grant TIN2017-89266-R, co-financed by FEDER funds.
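    The TSP formulation quoted in this abstract reduces to minimising the length of a closed tour over a distance matrix. Below is a minimal Python sketch of that objective with a nearest-neighbour baseline as an illustrative starting tour; it is not the paper's modified TLBO or its CUDA implementation, and the toy distance matrix is an assumption.

    ```python
    # Sketch of the TSP objective described above (not the paper's modified TLBO).
    def tour_length(tour, dist):
        """Total length of a closed tour, given a full distance matrix."""
        return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

    def nearest_neighbour(dist, start=0):
        """A simple constructive baseline; metaheuristics such as TLBO refine such tours."""
        n = len(dist)
        unvisited = set(range(n)) - {start}
        tour = [start]
        while unvisited:
            nxt = min(unvisited, key=lambda j: dist[tour[-1]][j])
            tour.append(nxt)
            unvisited.remove(nxt)
        return tour

    # Toy example: 4 points on a line, so the optimal closed tour has length 6.
    dist = [[abs(i - j) for j in range(4)] for i in range(4)]
    t = nearest_neighbour(dist)
    print(t, tour_length(t, dist))
    ```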

    Generic Techniques in General Purpose GPU Programming with Applications to Ant Colony and Image Processing Algorithms

    Get PDF
    In 2006 NVIDIA introduced a new unified GPU architecture facilitating general-purpose computation on the GPU. The following year NVIDIA introduced CUDA, a parallel programming architecture for developing general-purpose applications for direct execution on the new unified GPU. CUDA exposes the GPU's massively parallel architecture so that parallel code can be written to execute much faster than its sequential counterpart. Although CUDA abstracts the underlying architecture, fully utilising and scheduling the GPU is non-trivial and has given rise to a new and active area of research. Given the inherent complexities of GPU development, in this thesis we explore and find efficient parallel mappings of existing and new parallel algorithms on the GPU using NVIDIA CUDA. We place particular emphasis on metaheuristics, image processing, and designing reusable techniques and mappings that can be applied to other problems and domains. We begin by focusing on Ant Colony Optimisation (ACO), a nature-inspired heuristic approach to solving optimisation problems. We present a versatile, improved data-parallel approach for solving the Travelling Salesman Problem using ACO, resulting in significant speedups. Extending our initial work, we show how existing mappings of ACO on the GPU are unable to compete with their sequential counterparts when common CPU optimisation strategies are employed, and we detail three distinct candidate-set parallelisation strategies for execution on the GPU. By further extending our data-parallel approach, we present the first implementation of an ACO-based edge detection algorithm on the GPU, reducing the execution time and improving the viability of ACO-based edge detection. We finish by presenting a new colour edge detection technique using the volume of a pixel in the HSI colour space, along with a parallel GPU implementation that is able to withstand greater levels of noise than existing algorithms.
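    As a rough illustration of the data-parallel structure described here, the step each ant repeats when building a TSP tour is sketched below in Python. This is the generic ACO proportional-selection rule combining pheromone and inverse distance; the thesis's CUDA kernels, candidate sets and edge-detection variant are not reproduced, and the toy instance is an assumption.

    ```python
    # Generic per-ant tour construction: the unit of work that data-parallel GPU
    # ACO implementations typically map to a thread block per ant.
    import random

    def construct_tour(dist, pheromone, alpha=1.0, beta=2.0, start=0):
        n = len(dist)
        tour, unvisited = [start], set(range(n)) - {start}
        while unvisited:
            cur = tour[-1]
            # Attractiveness combines pheromone level and inverse distance (standard ACO rule).
            weights = {j: (pheromone[cur][j] ** alpha) * ((1.0 / dist[cur][j]) ** beta)
                       for j in unvisited}
            total = sum(weights.values())
            r, acc = random.random() * total, 0.0
            for j, w in weights.items():   # roulette-wheel selection of the next city
                acc += w
                if acc >= r:
                    tour.append(j)
                    unvisited.remove(j)
                    break
        return tour

    dist = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]
    pher = [[1.0] * 4 for _ in range(4)]
    print(construct_tour(dist, pher))
    ```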

    A new parallelisation technique for heterogeneous CPUs

    Get PDF
    Parallelisation has moved in recent years into mainstream compilers, and the demand for parallelising tools that can do a better job of automatic parallelisation is higher than ever. During the last decade, considerable attention has been focused on developing programming tools that support both explicit and implicit parallelism, to keep up with the power of the new multi-core technology. Yet success in developing automatic parallelising compilers has been limited, mainly because of the complexity of the analysis required to exploit available parallelism and to manage other parallelisation concerns such as data partitioning, alignment and synchronisation. This dissertation investigates the development of a programming tool that automatically parallelises operations on large data structures on a heterogeneous architecture, and whether a high-level programming language compiler can use this tool to exploit implicit parallelism and make use of the performance potential of modern multi-core technology. The work involved the development of a fully automatic parallelisation tool, called VSM, that completely hides the underlying details of general-purpose heterogeneous architectures. The VSM implementation provides direct and simple access for users to parallelise array operations on the Cell’s accelerators without the need for any annotations or process directives. The work also involved extending the Glasgow Vector Pascal compiler to work with the VSM implementation as a single compiler system. The resulting compiler system, called VP-Cell, takes a single source file and parallelises array expressions automatically. Several experiments were conducted using Vector Pascal benchmarks to show the validity of the VSM approach. The VP-Cell system achieved significant runtime performance on one accelerator compared with the master processor’s performance, and near-linear speedups when code runs across the Cell’s accelerators. Although VSM was designed mainly for developing parallelising compilers, it also showed considerable performance when running C code on the Cell’s accelerators.
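    The core idea, partitioning an array expression across accelerator cores without user annotations, can be illustrated by doing the partitioning by hand. The Python sketch below is not VSM or Vector Pascal; the expression, chunk sizes and worker count are illustrative assumptions only.

    ```python
    # Hand-written data partitioning of an element-wise array expression,
    # the kind of work a tool like VSM performs automatically.
    from multiprocessing import Pool

    def eval_chunk(args):
        a_chunk, b_chunk = args
        # Element-wise expression, e.g. the body of "c := a * 2 + b".
        return [x * 2 + y for x, y in zip(a_chunk, b_chunk)]

    def parallel_expr(a, b, workers=4):
        step = (len(a) + workers - 1) // workers
        chunks = [(a[i:i + step], b[i:i + step]) for i in range(0, len(a), step)]
        with Pool(workers) as pool:
            # Each worker evaluates one slice; results are concatenated in order.
            return [x for part in pool.map(eval_chunk, chunks) for x in part]

    if __name__ == "__main__":
        a, b = list(range(8)), list(range(8, 16))
        print(parallel_expr(a, b))
    ```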

    Parallel Ant Colony Algorithm for Shortest Path Problem

    Get PDF
    When travelling, more and more information must be taken into account, and travellers have to make several complex decisions. To support these decisions, IT solutions are unavoidable, and as the computational demand is constantly growing, the examination of state-of-the-art methodologies is necessary. In our research, a parallelised Ant Colony algorithm was investigated and a parameter study on a real network was carried out. The aim was to examine the sensitivity of the method and to demonstrate its applicability in a multi-threaded system (e.g. cloud-based systems). The results show that increased effectiveness can be achieved by using more threads. The novelty of the paper is the use of the processor’s parallel computing capability for routing with the Ant Colony algorithm.
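    A minimal sketch of the multi-threaded setup examined here is shown below: independent ants are constructed concurrently and the best tour is kept. The ants are simplified to random tours, and the network, ant count and thread count are assumptions; note also that CPython threads illustrate the structure only, since real CPU speedup needs processes or native code.

    ```python
    # Thread-parallel ants (structure only): each thread builds tours independently.
    import random
    from concurrent.futures import ThreadPoolExecutor

    def tour_length(tour, dist):
        return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

    def random_ant(dist):
        tour = list(range(len(dist)))
        random.shuffle(tour)          # stand-in for a real ACO construction step
        return tour

    def parallel_ants(dist, n_ants=16, threads=4):
        with ThreadPoolExecutor(max_workers=threads) as ex:
            tours = list(ex.map(lambda _: random_ant(dist), range(n_ants)))
        return min(tours, key=lambda t: tour_length(t, dist))

    dist = [[abs(i - j) for j in range(6)] for i in range(6)]
    print(parallel_ants(dist))
    ```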

    A dynamic programming model to solve optimisation problems using GPUs

    Get PDF
    This thesis presents a parallel, dynamic programming based model which is deployed on the GPU of a system to accelerate the solving of optimisation problems. This is achieved by running GPU computations and memory transfers simultaneously, so that computation never pauses and the memory constraints of solving large problem instances are overcome. As a result, some optimisation problems that are currently not solved exactly for real-world-sized instances, because of their complexity, are moved into the solvable realm. The model is implemented to solve a range of different test problems, where artificially constructed test data is used to ensure good performance even in the worst cases. Through this extensive testing, we can be confident the model will perform well when used to solve real-world cases. Testing of the model was carried out using a range of different implementation parameters relating to deployment on the GPU, in order to identify both the optimal implementation parameters and how the model operates when running on different systems. All problems, when implemented in parallel using the model, show run-time improvements over the sequential implementations, in some instances up to hundreds of times faster; more importantly, they also show high efficiency metrics for the utilisation of GPU resources. Throughout testing, emphasis has been placed on GPU-based metrics to ensure the wider, generic applicability of the model. Finally, the parallel model allows new problems to be defined through a simple file format, enabling wider usage of the model.
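    To make the dynamic-programming setting concrete, here is a small Python sketch of a classic DP optimisation problem (0/1 knapsack) updated one row at a time, so only a single DP row is ever held in memory. It is not the thesis's GPU model; on a GPU, the row update is the part that would run in parallel while the next batch of item data is transferred, which is the overlap the model exploits.

    ```python
    # Row-by-row 0/1 knapsack DP, keeping only one DP row in memory at a time.
    def knapsack(values, weights, capacity):
        row = [0] * (capacity + 1)
        for v, w in zip(values, weights):
            # Iterate capacities downwards so each item is used at most once.
            for c in range(capacity, w - 1, -1):
                row[c] = max(row[c], row[c - w] + v)
        return row[capacity]

    print(knapsack([60, 100, 120], [10, 20, 30], 50))  # expected optimum: 220
    ```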

    Computer Vision for Volunteer Cotton Detection in a Corn Field with UAS Remote Sensing Imagery and Spot Spray Applications

    Full text link
    To control re-infestation by the boll weevil (Anthonomus grandis L.) pest in cotton fields, the current practice for detecting volunteer cotton (VC) (Gossypium hirsutum L.) plants in fields of rotation crops such as corn (Zea mays L.) and sorghum (Sorghum bicolor L.) involves manual field scouting at the edges of fields. As a result, many VC plants growing in the middle of fields remain undetected and continue to grow alongside the corn and sorghum. Once they reach the pinhead squaring stage (5-6 leaves), they can serve as hosts for boll weevil pests. Therefore, they need to be detected, located and then precisely spot-sprayed with chemicals. In this paper, we present the application of YOLOv5m to radiometrically and gamma-corrected low-resolution (1.2 megapixel) multispectral imagery for detecting and locating VC plants growing in the middle of a cornfield at the tasseling (VT) growth stage. Our results show that VC plants can be detected with a mean average precision (mAP) of 79% and a classification accuracy of 78% on images of size 1207 x 923 pixels, at an average inference speed of nearly 47 frames per second (FPS) on an NVIDIA Tesla P100 16 GB GPU and 0.4 FPS on an NVIDIA Jetson TX2 GPU. We also demonstrate the application of a customized unmanned aircraft system (UAS) for spot-spray applications based on the developed computer vision (CV) algorithm, and how it can be used for near real-time detection and mitigation of VC plants growing in corn fields for efficient management of the boll weevil pest.
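    A hedged sketch of running a trained YOLOv5m detector over an image tile and timing inference, roughly as described above, is given below. The weights file, image path and confidence threshold are assumptions, not the paper's artefacts; the torch.hub loading pattern is the standard Ultralytics YOLOv5 interface.

    ```python
    # Load custom YOLOv5m weights, run inference on one tile, and report FPS.
    import time
    import torch

    model = torch.hub.load("ultralytics/yolov5", "custom", path="vc_yolov5m.pt")  # hypothetical weights file
    model.conf = 0.25  # confidence threshold (assumed)

    img = "corn_tile_1207x923.png"  # hypothetical tile exported from the multispectral imagery
    start = time.perf_counter()
    results = model(img)
    elapsed = time.perf_counter() - start

    print(f"inference: {1.0 / elapsed:.1f} FPS")
    results.print()          # per-class detection summary
    boxes = results.xyxy[0]  # (x1, y1, x2, y2, confidence, class) per detected VC plant
    ```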

    Parallel implementation of maximum parsimony search algorithm on multicore CPUs

    Get PDF
    Phylogenetics is the study of the evolutionary relationships among species. The term is derived from the ancient Greek words phylon, meaning “race”, and genetikos, meaning “relative to birth”. An important methodology in phylogenetics is cladistics (parsimony) applied to the study of taxonomic classification. Modern studies include as source data aspects of molecular biology, such as the DNA sequences of homologous (orthologous) genes. The algorithms used attempt to reconstruct evolutionary relationships in the form of phylogenetic trees, based on the available morphological data, behavioral data, and usually DNA sequence data (Fitch, 1971). The topic of this thesis is the parallel implementation of an existing algorithm, Maximum Parsimony, a search for the guaranteed optimal tree(s) based on the fewest number of mutations required for tree construction. The algorithm's cost grows linearly with DNA sequence length and combinatorially with the number of organisms studied (Felsenstein, 1978), and a run may take hours to complete. A limitation of current implementations such as PAUP is that they are restricted to a single CPU core, even if eight are available; this parallel implementation can use as many cores as are available. The method of research is to replicate the accuracy of existing serial software, parallelize the algorithm across many cores without losing accuracy, optimize it by various methods, and then attempt to port it to other hardware architectures. Some time is spent on implementing the algorithms on GPUs and clusters. The results are that, while this implementation matches the accuracy of the current standard and speeds up in parallel, it does not presently match the speed of PAUP, for reasons yet to be determined.
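    The per-site kernel that a maximum parsimony search evaluates over many candidate trees is Fitch's small-parsimony scoring (Fitch, 1971). A minimal Python sketch for one alignment column on a rooted binary tree follows; the toy tree and character states are assumptions, and this is the textbook algorithm rather than the thesis's parallel implementation.

    ```python
    # Fitch small-parsimony score for one site: post-order traversal where each
    # internal node keeps the intersection of its children's state sets if it is
    # non-empty, else their union at a cost of one extra mutation.
    def fitch_score(node, states):
        """node: leaf name or (left, right) tuple; states: leaf name -> nucleotide."""
        if not isinstance(node, tuple):
            return {states[node]}, 0
        (l_set, l_cost), (r_set, r_cost) = (fitch_score(child, states) for child in node)
        common = l_set & r_set
        if common:
            return common, l_cost + r_cost
        return l_set | r_set, l_cost + r_cost + 1

    tree = (("A", "B"), ("C", "D"))
    states = {"A": "G", "B": "G", "C": "T", "D": "G"}
    print(fitch_score(tree, states)[1])  # expected parsimony cost: 1
    ```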

    Comparative evaluation of platforms for parallel Ant Colony Optimization

    Get PDF
    The rapidly growing field of nature-inspired computing concerns the development and application of algorithms and methods based on biological or physical principles. This approach is particularly compelling for practitioners in high-performance computing, as natural algorithms are often inherently parallel (for example, they may be based on a “swarm”-like model that uses a population of agents to optimize a function). Coupled with rising interest in nature-based algorithms is the growth in heterogeneous computing: systems that use more than one kind of processor. We are therefore interested in the performance characteristics of nature-inspired algorithms on a number of different platforms. To this end, we present a new OpenCL-based implementation of the Ant Colony Optimization algorithm and use it as the basis of extensive experimental tests. We benchmark the algorithm against existing implementations on a wide variety of hardware platforms and offer extensive analysis. This work provides rigorous foundations for future investigations of Ant Colony Optimization on high-performance platforms.
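    A comparison of this kind ultimately rests on timing the same kernel on each platform. The Python sketch below shows a minimal benchmarking harness around a stand-in ACO pheromone-update kernel; the paper's OpenCL kernels, platforms and problem sizes are not reproduced, and all parameters here are assumptions.

    ```python
    # Time the same optimisation kernel repeatedly and report mean and spread.
    import statistics
    import time

    def pheromone_update(pheromone, tours, lengths, rho=0.5, q=1.0):
        n = len(pheromone)
        for i in range(n):
            for j in range(n):
                pheromone[i][j] *= (1.0 - rho)           # evaporation
        for tour, length in zip(tours, lengths):
            for i in range(len(tour)):                   # deposit along each tour
                a, b = tour[i], tour[(i + 1) % len(tour)]
                pheromone[a][b] += q / length

    def benchmark(fn, repeats=5):
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            times.append(time.perf_counter() - start)
        return statistics.mean(times), statistics.stdev(times)

    n = 200
    pher = [[1.0] * n for _ in range(n)]
    tours = [list(range(n))] * 10
    mean, dev = benchmark(lambda: pheromone_update(pher, tours, [float(n)] * 10))
    print(f"{mean * 1e3:.2f} ms ± {dev * 1e3:.2f} ms")
    ```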