
2 research outputs found

Effective Implementation of GPU-based Revised Simplex algorithm applying new memory management and cycle avoidance strategies

Author: Gahrouei Arash Raeisi
Ghatee Mehdi
Publication date: 12/03/2018

Graphics Processing Units (GPUs) with high computational capabilities are used as modern parallel platforms to deal with complex computational problems. We use this platform to solve large-scale linear programming problems with the revised simplex algorithm. To implement this algorithm, we propose some new memory management strategies. In addition, to avoid cycling caused by degeneracy, we use a tabu rule for entering variable selection in the revised simplex algorithm. To evaluate this algorithm, we consider two sets of benchmark problems and compare the speedup factors for these problems. The comparisons demonstrate that the proposed method is highly effective and solves the problems with maximum speedup factors of 165.2 and 65.46 with respect to the sequential version and the Matlab Linprog solver, respectively. Comment: 27 pages, 6 Tables, 10 Figures, Extracted from a PhD research program in Department of Computer Science of Amirkabir University of Technology, Tehran, Ira

arXiv.org e-Print Archive
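
The abstract above mentions a tabu rule for entering variable selection but does not spell it out. The following is only a minimal sketch of one plausible reading, not the authors' method: variables that entered the basis on a degenerate (zero-step) pivot are kept on a short tabu list and skipped on later selections. The names `select_entering` and `record_pivot`, the tenure of 3, and the lowest-index fallback are all assumptions made for illustration.

```python
from collections import deque


def select_entering(reduced_costs, tabu, tol=1e-9):
    """Return the index of the entering variable, or None if no candidate exists.

    Picks the most negative reduced cost among candidates that are not tabu.
    If every candidate is tabu, falls back to the lowest-index candidate
    (an assumed fallback, not taken from the paper).
    """
    candidates = [j for j, d in enumerate(reduced_costs) if d < -tol]
    if not candidates:
        return None  # no improving column: current basis is optimal
    allowed = [j for j in candidates if j not in tabu]
    if allowed:
        return min(allowed, key=lambda j: reduced_costs[j])
    return min(candidates)


def record_pivot(tabu, entering, step_length, tol=1e-9):
    """Mark the entering variable tabu when the pivot was degenerate (zero step)."""
    if abs(step_length) <= tol:
        tabu.append(entering)


# Hypothetical tabu tenure: only the 3 most recent degenerate entries are kept.
tabu = deque(maxlen=3)
costs = [-1.0, -5.0, 2.0, -3.0]
first = select_entering(costs, tabu)      # column 1 has the most negative cost
record_pivot(tabu, first, step_length=0.0)  # pretend that pivot was degenerate
second = select_entering(costs, tabu)     # column 1 is now skipped
```

With the sample costs, the first call selects column 1; once that column is recorded as a degenerate entry, the second call selects column 3 instead.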

Simultaneous Solving of Batched Linear Programs on a GPU

Author: Gurung Amit
Ray Rajarshi
Publication date: 21/02/2018

Linear Programs (LPs) appear in a large number of applications, and offloading them to a GPU is a viable way to gain performance. Existing work on offloading and solving an LP on a GPU suggests that there is a performance gain generally on large LPs (typically 500 constraints, 500 variables and above). In order to gain performance from a GPU for applications involving small to medium-sized LPs, we propose batched solving of a large number of LPs in parallel. In this paper, we present the design and implementation of a batched LP solver in CUDA, with coalesced memory access, low CPU-GPU memory transfer latency and load balancing as the goals. The performance of the batched LP solver is compared against sequential solving on the CPU using the open source solver GLPK (GNU Linear Programming Kit) and the CPLEX solver from IBM. The evaluation on selected LP benchmarks from the Netlib repository shows a maximum speed-up of 95x and 5x with respect to the CPLEX and GLPK solvers respectively, for a batch of 1e5 LPs. We demonstrate the application of our batched LP solver to enhance performance in the domain of state-space exploration of mathematical models of control systems design. Comment: Around 13 figures and 24 pages. arXiv admin note: substantial text overlap with arXiv:1609.0811

arXiv.org e-Print Archive
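
The batching idea in the abstract above can be illustrated with a small sequential sketch. It is not the authors' CUDA solver: it is a plain-Python tableau simplex, assumed here for illustration, that advances many small LPs in lock-step and drops each LP from the active set once it can no longer pivot. On a GPU, each loop iteration over the active set would correspond to independent work that can run in parallel. The function names, the lowest-index entering rule, and the restriction to bounded problems of the form maximize c.x subject to A x <= b, x >= 0, b >= 0 are all assumptions.

```python
def make_tableau(c, A, b):
    """Build a tableau for: maximize c.x subject to A x <= b, x >= 0, b >= 0."""
    m = len(A)
    rows = []
    for i in range(m):
        slack = [1.0 if k == i else 0.0 for k in range(m)]
        rows.append([float(v) for v in A[i]] + slack + [float(b[i])])
    # Objective row: negated costs, zeros for slacks, objective value last.
    rows.append([-float(v) for v in c] + [0.0] * m + [0.0])
    return rows


def pivot_step(T, tol=1e-9):
    """Do one pivot in place. Return False when no pivot is possible
    (the tableau is optimal, or the LP is unbounded)."""
    obj = T[-1]
    # Entering column: lowest index with a negative objective-row entry.
    col = next((j for j in range(len(obj) - 1) if obj[j] < -tol), None)
    if col is None:
        return False
    # Ratio test over rows with a positive entry in the entering column.
    ratios = [(row[-1] / row[col], i)
              for i, row in enumerate(T[:-1]) if row[col] > tol]
    if not ratios:
        return False  # unbounded; this sketch assumes bounded LPs
    _, r = min(ratios)
    p = T[r][col]
    T[r] = [v / p for v in T[r]]
    for i in range(len(T)):
        if i != r and T[i][col] != 0.0:
            f = T[i][col]
            T[i] = [v - f * w for v, w in zip(T[i], T[r])]
    return True


def solve_batch(lps, max_iters=1000):
    """Advance all LPs in lock-step and return their objective values."""
    tableaux = [make_tableau(c, A, b) for (c, A, b) in lps]
    active = list(range(len(tableaux)))
    for _ in range(max_iters):
        if not active:
            break
        # Each active LP takes one pivot; finished LPs leave the active set.
        active = [k for k in active if pivot_step(tableaux[k])]
    return [T[-1][-1] for T in tableaux]


batch = [
    ([3, 2], [[1, 1], [1, 3], [1, 0]], [4, 6, 3]),
    ([1, 1], [[1, 0], [0, 1]], [2, 3]),
]
values = solve_batch(batch)
```

For the two sample LPs, the optimal objective values are 11 and 5. LPs that finish early simply stop consuming pivots, which is the sequential analogue of the load-balancing concern the abstract raises.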