
    GPU accelerated Nature Inspired Methods for Modelling Large Scale Bi-Directional Pedestrian Movement

    Pedestrian movement, although ubiquitous and well studied, is still not well understood because of the complicating nature of the embedded social dynamics. Interest among researchers in simulating pedestrian movement and interactions has grown significantly, in part due to the increased computational and visualization capabilities afforded by high-power computing. Different approaches have been adopted to simulate pedestrian movement under various circumstances and interactions. In the present work, bi-directional crowd movement is simulated, where equal numbers of individuals try to reach the opposite sides of an environment. Two movement methods are considered. First, a Least Effort Model (LEM) is investigated, where agents try to take an optimal path with as few changes from their intended path as possible. Following this, a modified form of Ant Colony Optimization (ACO) is proposed, where individuals are guided both by the goal of reaching the other side in a least-effort mode and by a pheromone trail left by predecessors. The basic idea is to increase agent interaction, thereby more closely reflecting a real-world scenario. The methodology utilizes Graphics Processing Units (GPUs) for general-purpose computing using the CUDA platform. GPUs are well suited to this task because of the inherently parallel properties of pedestrian movement, such as proximate interactions of individuals on a 2D grid. The main feature of the implementation undertaken here is that the parallelism is data driven. The data-driven implementation leads to a speedup of up to 18x compared to its sequential counterpart running on a single-threaded CPU. The numbers of pedestrians considered in the model ranged from 2K to 100K, representing numbers typical of mass gathering events. A detailed discussion addresses the implementation challenges faced and averted.

    GPU accelerated Hungarian algorithm for traveling salesman problem

    In this thesis, we present a model of the Traveling Salesman Problem (TSP) cast in a quadratic assignment problem framework with a linearized objective function and constraints. This is referred to as the Reformulation Linearization Technique at Level 2 (RLT2). We apply a dual ascent procedure for obtaining lower bounds that employs the Linear Assignment Problem (LAP) solver recently developed by Date (2016). The solver is a parallelized Hungarian Algorithm that uses Compute Unified Device Architecture (CUDA) enabled NVIDIA Graphics Processing Units (GPUs) as the parallel programming architecture. The aim of this thesis is to make use of a modified version of the Dual Ascent-LAP solver to solve the TSP. Though this procedure is computationally expensive, the bounds obtained are tight, and our experimental results confirm that the gap is within 2% for most problems. However, due to limitations in computational resources, we could only test problem sizes N < 30. Further work can be directed at theoretical and computational analysis to test the efficiency of our approach for larger problem instances.
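
    As a rough illustration of why an assignment problem can supply a lower bound for the TSP, the sketch below solves a tiny LAP by brute force and compares it with the optimal tour length. It is not the RLT2 dual ascent procedure or the GPU Hungarian solver described in the abstract; the cost matrix `COST` and the function names are hypothetical, and brute-force enumeration stands in for the Hungarian algorithm purely to keep the example short.

```python
from itertools import permutations

def assignment_lower_bound(cost):
    """Cheapest way to give every city one successor (self-loops forbidden).

    Every tour is such an assignment, so this value can never exceed
    the optimal tour length. Brute force replaces the Hungarian algorithm.
    """
    n = len(cost)
    best = float("inf")
    for succ in permutations(range(n)):
        if any(succ[i] == i for i in range(n)):
            continue  # a city may not be its own successor
        best = min(best, sum(cost[i][succ[i]] for i in range(n)))
    return best

def optimal_tour_length(cost):
    """Exact TSP optimum by enumerating all tours that start at city 0."""
    n = len(cost)
    best = float("inf")
    for order in permutations(range(1, n)):
        tour = (0,) + order
        length = sum(cost[tour[k]][tour[(k + 1) % n]] for k in range(n))
        best = min(best, length)
    return best

# Hypothetical symmetric cost matrix for five cities (illustration only).
COST = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

bound = assignment_lower_bound(COST)
optimum = optimal_tour_length(COST)
gap_percent = 100.0 * (optimum - bound) / optimum
print(f"LAP bound = {bound}, optimal tour = {optimum}, gap = {gap_percent:.1f}%")
```

    The plain assignment bound shown here can be loose because it allows several disjoint subtours; tighter formulations such as the one in the abstract aim to close that gap at a higher computational cost.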

    Recent Advances on GPU Computing in Operations Research

    In the last decade, Graphics Processing Units (GPUs) have gained increasing popularity as accelerators for High Performance Computing (HPC) applications. Recent GPUs are not only powerful graphics engines but also highly threaded parallel computing processors that can achieve sustainable speedup as compared with CPUs. In this context, researchers try to exploit the capability of this architecture to solve difficult problems in many domains in science and engineering. In this article, we present recent advances on GPU Computing in Operations Research. We focus in particular on Integer Programming and Linear Programming.

    Parallel genetic approach for routing optimization in large ad hoc networks

    This article presents a new approach to integrating parallelism into the genetic algorithm (GA) in order to solve the routing problem in a large ad hoc network, where the goal is to find the shortest routing path. We first fix the source and destination and use variable-length chromosomes (routes) whose genes are nodes. Our work answers the following question: which method is better for finding the shortest path, the sequential or the parallel one? All modern systems support simultaneous processes and threads. Processes are instances of programs that generally run independently; for example, when a program is started, the operating system spawns a new process that runs in parallel with other programs. Within these processes, threads can execute code simultaneously, so we can make the most of the available central processing unit (CPU) cores. The results obtained show that our algorithm gives much better solution quality. We first study an example network with 40 nodes to compare the sequential and parallel methods, and then increase the number of sensors to 100 nodes to solve the shortest-path problem in a large ad hoc network.

    Accelerating supply chains with Ant Colony Optimization across range of hardware solutions

    This pre-print, arXiv:2001.08102v1 [cs.NE], was published subsequently by Elsevier in Computers and Industrial Engineering, vol. 147, 106610, pp. 1-14 on 29 Jun 2020 and is available at https://doi.org/10.1016/j.cie.2020.106610. The Ant Colony algorithm has been applied to various optimization problems; however, most of the previous work on scaling and parallelism focuses on Travelling Salesman Problems (TSPs). Although TSPs are useful for benchmarks and for comparing new ideas, the algorithmic dynamics do not always transfer to complex real-life problems, where additional meta-data is required during solution construction. This paper looks at a real-life outbound supply chain problem using Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO architectures: Independent Ant Colonies (IAC) and Parallel Ants (PA). Results showed that PA was able to reach a higher solution quality in fewer iterations as the number of parallel instances increased. Furthermore, speed performance was measured across three different hardware solutions: a 16-core CPU, a 68-core Xeon Phi and up to 4 GeForce GPUs. State-of-the-art ACO vectorization techniques such as SS-Roulette were implemented using C++ and CUDA. Although GPUs are excellent for TSP, it was concluded that they are not suitable for the given supply chain problem due to the meta-data access footprint required. Furthermore, compared to their sequential counterpart, the vectorized CPU AVX2 implementation achieved a 25.4x speedup on CPU, while Xeon Phi with its AVX512 instruction set reached 148x on PA with Vectorized (PAwV). PAwV is therefore able to scale at least up to 1024 parallel instances on the supply chain network problem solved.
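
    For readers unfamiliar with how an ant chooses its next node, the sketch below shows the textbook sequential roulette-wheel selection used in basic ACO. It is not the vectorized SS-Roulette technique named in the abstract, nor the paper's supply chain model; the matrices, the parameters `alpha` and `beta`, and the function names are assumptions made for illustration.

```python
import random

def transition_probabilities(current, candidates, pheromone, distance,
                             alpha=1.0, beta=2.0):
    """Probability of moving from `current` to each candidate node.

    Weight = pheromone^alpha * (1 / distance)^beta, normalised to sum to 1.
    """
    weights = [
        (pheromone[current][j] ** alpha) * ((1.0 / distance[current][j]) ** beta)
        for j in candidates
    ]
    total = sum(weights)
    return [w / total for w in weights]

def roulette_select(candidates, probabilities, rng):
    """Pick one candidate with probability proportional to its share."""
    r = rng.random()
    cumulative = 0.0
    for node, p in zip(candidates, probabilities):
        cumulative += p
        if r < cumulative:
            return node
    return candidates[-1]  # guard against floating-point round-off

# Hypothetical 4-node instance with uniform initial pheromone.
DISTANCE = [
    [0.0, 1.0, 2.0, 4.0],
    [1.0, 0.0, 3.0, 2.0],
    [2.0, 3.0, 0.0, 1.0],
    [4.0, 2.0, 1.0, 0.0],
]
PHEROMONE = [[1.0] * 4 for _ in range(4)]

candidates = [1, 2, 3]  # nodes not yet visited by an ant standing at node 0
probs = transition_probabilities(0, candidates, PHEROMONE, DISTANCE)
rng = random.Random(0)  # seeded so repeated runs make the same choice
chosen = roulette_select(candidates, probs, rng)
print(probs, chosen)
```

    The cumulative scan is inherently sequential for a single ant, which is one reason parallel implementations replace it with selection schemes that map better onto vector units.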