Solving the Uncapacitated Single Allocation p-Hub Median Problem on GPU

Author: A Ilic
AT Ernst
D Bryan
EG Talbi
H Damgacioglu
H Topcuoglu
I Contreras
J Kratica
J Sohn
JF Campbell
JF Chen
M Labbe
M Maric
MR Silva
MW Horner
R Abyazi-Sani
RS Camargo de
S Abdinnour-Helm
T Meyer
TV Luong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/04/2017

A parallel genetic algorithm (GA) implemented on GPU clusters is proposed to solve the Uncapacitated Single Allocation p-Hub Median problem. The GA uses binary and integer encoding and genetic operators adapted to this problem. The GA is improved by generating initial solutions with hubs located at middle nodes. The experimental results are compared with the best known solutions on all benchmarks, on instances of up to 1000 nodes. Furthermore, we solve our own randomly generated instances of up to 6000 nodes. Our approach outperforms most well-known heuristics in terms of solution quality and execution time, and it allows hitherto unsolved problems to be solved.
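The abstract does not give the encoding or the objective in detail, so the following is only a minimal sketch of how an integer-encoded solution could be evaluated, assuming the standard single-allocation p-hub median cost (collection, inter-hub transfer, distribution). The function names and the cost factors `chi`, `alpha`, and `delta` are illustrative assumptions, not taken from the paper.

```python
def usaphmp_cost(flow, dist, alloc, chi=1.0, alpha=0.75, delta=1.0):
    """Total cost of an integer-encoded solution (assumed standard objective).

    alloc[i] is the hub that node i is allocated to. chi, alpha and delta
    are assumed cost factors for collection, transfer and distribution.
    """
    n = len(alloc)
    total = 0.0
    for i in range(n):
        for j in range(n):
            hi, hj = alloc[i], alloc[j]
            # Route i -> hub(i) -> hub(j) -> j, with a discounted hub-to-hub leg.
            path = chi * dist[i][hi] + alpha * dist[hi][hj] + delta * dist[hj][j]
            total += flow[i][j] * path
    return total


def hub_vector(alloc):
    """Binary encoding derived from the integer one: 1 marks a hub node."""
    hubs = [0] * len(alloc)
    for h in alloc:
        hubs[h] = 1
    return hubs


def is_feasible(alloc, p):
    """Exactly p hubs are used, and every hub is allocated to itself."""
    hubs = set(alloc)
    return len(hubs) == p and all(alloc[h] == h for h in hubs)


# Three nodes on a line, unit flow between distinct nodes, one hub in the middle.
dist = [[abs(i - j) for j in range(3)] for i in range(3)]
flow = [[0 if i == j else 1 for j in range(3)] for i in range(3)]
print(usaphmp_cost(flow, dist, [1, 1, 1]))  # 8.0
```

In this toy instance the middle node is the hub, which loosely mirrors the idea of seeding initial solutions with hubs at middle nodes. A GPU version would typically evaluate many such allocations in parallel, one per thread or block.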

arXiv.org e-Print Archive

Crossref

Interior Point Methods on GPU with application to Model Predictive Control

Author: Gade-Nielsen Nicolai Fog
Publication venue: Technical University of Denmark
Publication date: 01/01/2014

Online Research Database In Technology

Activity recognition from videos with parallel hypergraph matching on GPUs

Author: Celiktutan Oya
Lombardi Eric
Sankur Bülent
Wolf Christian
Publication date: 04/05/2015

In this paper, we propose a method for activity recognition from videos based on sparse local features and hypergraph matching. We exploit special properties of the temporal domain in the data to derive a sequential and fast graph matching algorithm for GPUs. Traditionally, graphs and hypergraphs are frequently used to recognize complex and often non-rigid patterns in computer vision, either through graph matching or point-set matching with graphs. Most formulations resort to the minimization of a difficult discrete energy function mixing geometric or structural terms with data-attached terms involving appearance features. Traditional methods solve this minimization problem approximately, for instance with spectral techniques. In this work, instead of solving the problem approximately, the exact solution for the optimal assignment is calculated in parallel on GPUs. The graphical structure is simplified and regularized, which allows an efficient recursive minimization algorithm to be derived. The algorithm distributes subproblems over the calculation units of a GPU, which solves them in parallel, allowing the system to run faster than real-time on medium-end GPUs.

arXiv.org e-Print Archive

Hal-Diderot

Adaptive large neighborhood search algorithm – performance evaluation under parallel schemes & applications

Author: Kumar Sandip
Publication venue: Scholars Junction
Publication date: 12/05/2023

Adaptive Large Neighborhood Search (ALNS) is a fairly recent yet popular single-solution heuristic for solving discrete optimization problems. Although the heuristic has been a popular choice among researchers in recent years, its parallelization has not been studied as widely in the literature as that of other classical metaheuristics. To extend the existing literature, this study proposes several schemes to parallelize the basic/sequential ALNS algorithm. More specifically, seven different parallel schemes are employed to target different characteristics of the ALNS algorithm and the capability of the local computers. The schemes are implemented in a master-slave architecture to manage and assign loads across the processors of the local computers. The overall goal is to explore different areas of the search space simultaneously in an attempt to escape local minima, take effective steps toward the optimal solution and, ultimately, accelerate the convergence of the ALNS algorithm. The performance of the schemes is tested by solving a capacitated vehicle routing problem (CVRP) on available well-known test instances. Our computational results indicate that all the parallel schemes are capable of providing a competitive optimality gap in solving the CVRP on the investigated test instances. However, scheme 1 provides the best running time, solving the instances approximately 10.5 times faster than the basic/sequential ALNS algorithm. In this scheme, the ALNS algorithm runs independently on each slave processor, without sharing any information with the other slave processors, and synchronization occurs only when one of the processors meets its predefined termination criterion and reports its solution to the master processor. These findings are applied in a real-life fulfillment process using mixed-mode delivery with trucks and drones. Complex but optimized routes are generated in a short time, which makes the approach applicable to last-mile delivery to customers.
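The control flow of the independent-run scheme (scheme 1) can be sketched roughly as follows. This is not the thesis's implementation: it assumes a toy closed-tour problem instead of the CVRP, two illustrative destroy operators, greedy reinsertion, improvement-only acceptance, and Python threads standing in for slave processors (so it shows the synchronization pattern, not a real speed-up). All names and parameter values are hypothetical.

```python
import math
import random
import threading
from concurrent.futures import ThreadPoolExecutor


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[k]], pts[tour[(k + 1) % n]]) for k in range(n))


def random_removal(tour, k, rng, pts):
    """Destroy operator: remove k cities chosen uniformly at random."""
    removed = rng.sample(tour, k)
    return [c for c in tour if c not in removed], removed


def worst_removal(tour, k, rng, pts):
    """Destroy operator: remove the k cities with the largest detour cost."""
    n = len(tour)

    def saving(idx):
        a, b, c = tour[idx - 1], tour[idx], tour[(idx + 1) % n]
        return (math.dist(pts[a], pts[b]) + math.dist(pts[b], pts[c])
                - math.dist(pts[a], pts[c]))

    worst = sorted(range(n), key=saving, reverse=True)[:k]
    removed = [tour[i] for i in worst]
    return [c for c in tour if c not in removed], removed


def greedy_insert(partial, removed, pts):
    """Repair operator: reinsert each city at its cheapest position."""
    tour = list(partial)
    for city in removed:
        def extra(i):
            a, b = tour[i - 1], tour[i]
            return (math.dist(pts[a], pts[city]) + math.dist(pts[city], pts[b])
                    - math.dist(pts[a], pts[b]))
        tour.insert(min(range(len(tour)), key=extra), city)
    return tour


def alns_worker(seed, pts, max_iters, stop):
    """One independent ALNS run; shares nothing except the stop signal."""
    rng = random.Random(seed)
    ops, weights = [random_removal, worst_removal], [1.0, 1.0]
    current = list(range(len(pts)))
    rng.shuffle(current)
    cur_cost = tour_length(current, pts)
    best, best_cost = current, cur_cost
    for _ in range(max_iters):
        if stop.is_set():  # another worker already met its termination criterion
            break
        idx = rng.choices(range(len(ops)), weights=weights)[0]
        partial, removed = ops[idx](current, 3, rng, pts)
        cand = greedy_insert(partial, removed, pts)
        cost = tour_length(cand, pts)
        reward = 0.0
        if cost < cur_cost:  # improvement-only acceptance (an assumption)
            current, cur_cost, reward = cand, cost, 1.0
        if cost < best_cost:
            best, best_cost, reward = cand, cost, 2.0
        # Adaptive step: smooth the operator weight toward its recent reward.
        weights[idx] = 0.8 * weights[idx] + 0.2 * (reward + 0.1)
    stop.set()  # tell the remaining workers to stop and report
    return best_cost, best


def parallel_alns(pts, n_workers=4, max_iters=200):
    """Master: launch independent workers, then keep the best reported tour."""
    stop = threading.Event()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(alns_worker, seed, pts, max_iters, stop)
                   for seed in range(n_workers)]
        results = [f.result() for f in futures]
    return min(results)
```

Calling `parallel_alns(points)` on a list of `(x, y)` tuples with more than three points returns a `(cost, tour)` pair. Swapping the thread pool for separate processes would be the natural next step for an actual parallel speed-up.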

Scholars Junction - Mississippi State University Institutional Repository