246 research outputs found

A GPU-accelerated Branch-and-Bound Algorithm for the Flow-Shop Scheduling Problem

Author: Chakroun Imen
Mohand Mezmaz
Nouredine Melab
Tuyttens Daniel
Publication date: 01/01/2012

Branch-and-Bound (B&B) algorithms are time-intensive, tree-based exploration methods for solving combinatorial optimization problems to optimality. In this paper, we investigate the use of GPU computing as a major complementary way to speed up those methods. The focus is on the bounding mechanism of B&B algorithms, which is the most time-consuming part of their exploration process. We propose a parallel B&B algorithm based on a GPU-accelerated bounding model. The proposed approach concentrates on optimizing data access management to further improve the performance of the bounding mechanism, which uses large and intermediate data sets that do not completely fit in GPU memory. Extensive experiments have been carried out on well-known flow-shop scheduling problem (FSP) benchmarks using an Nvidia Tesla C2050 GPU card. We compared the obtained performance to single-threaded and multithreaded CPU-based executions. Speedups of up to 100x are achieved for large problem instances.
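The bounding step described in this abstract can be illustrated with a minimal CPU-side sketch. It evaluates a lower bound for every sub-problem in a pool, which is the loop a parallel bounding model would map onto GPU threads. The bound used here (completion time of the partial schedule on the last machine plus the remaining work on that machine) is a deliberately simple stand-in and is an assumption, not the bound used by the authors; the function names are hypothetical.

```python
# Simplified sketch of pool-based bounding for the permutation flow-shop problem.
# p[j][k] is the processing time of job j on machine k.
# The bound below is a simple one-machine bound, NOT the authors' bound.

def partial_completion(p, prefix):
    """Completion time on each machine after scheduling the jobs in prefix."""
    machines = len(p[0])
    c = [0] * machines
    for j in prefix:
        c[0] += p[j][0]
        for k in range(1, machines):
            # A job starts on machine k once machine k is free
            # and the job has finished on machine k - 1.
            c[k] = max(c[k], c[k - 1]) + p[j][k]
    return c

def lower_bound(p, prefix):
    """Last-machine completion plus remaining processing on the last machine."""
    scheduled = set(prefix)
    c = partial_completion(p, prefix)
    remaining = sum(p[j][-1] for j in range(len(p)) if j not in scheduled)
    return c[-1] + remaining

def bound_pool(p, pool):
    """Evaluate every sub-problem of the pool; each call is independent,
    so this loop is the part a GPU kernel would run in parallel."""
    return [lower_bound(p, prefix) for prefix in pool]
```

For example, with three jobs on two machines and `p = [[3, 2], [1, 4], [2, 1]]`, calling `bound_pool(p, [[0], [1], [2]])` bounds the three sub-problems obtained by fixing the first job.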

arXiv.org e-Print Archive

HAL - Lille 3

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

An Incremental Parallel PGAS-based Tree Search Algorithm

Author: Carneiro Tiago
Melab Nouredine
Publication venue: HAL CCSD
Publication date: 15/07/2019

In this work, we show that the Chapel high-productivity language is suitable for the design and implementation of all aspects involved in the conception of parallel tree search algorithms for solving combinatorial problems. Initially, it is possible to hand-optimize the data structures involved in the search process in a way equivalent to C. As a consequence, the single-threaded search in Chapel is on average only 7% slower than its counterpart written in C. Whereas programming a multicore tree search in Chapel is equivalent to C-OpenMP in terms of performance and programmability, its productivity-aware features for distributed programming stand out. It is possible to incrementally conceive a distributed tree search algorithm starting from its multicore counterpart by adding a few lines of code. The distributed implementation performs load balancing among different computer nodes and also exploits all CPU cores of the system. Chapel presents an interesting trade-off between programmability and performance despite the high level of its features. The distributed tree search in Chapel is on average 16% slower and reaches up to 80% of the scalability achieved by its C-MPI+OpenMP counterpart.

An Adaptative Multi-GPU based Branch-and-Bound. A Case Study: the Flow-Shop Scheduling Problem

Author: Chakroun Imen
Melab Nouredine
Publication date: 21/06/2012

Solving Combinatorial Optimization Problems (COPs) exactly using a Branch-and-Bound (B&B) algorithm requires a huge amount of computational resources. Therefore, we recently investigated designing B&B algorithms on top of graphics processing units (GPUs) using a parallel bounding model. The proposed model parallelizes the evaluation of the lower bounds on pools of sub-problems. The results demonstrated that the size of the evaluated pool has a significant impact on the performance of B&B and that it depends strongly on the problem instance being solved. In this paper, we design an adaptive parallel B&B algorithm for solving permutation-based combinatorial optimization problems such as the Flow-shop Scheduling Problem (FSP) on GPU accelerators. To do so, we propose a dynamic heuristic for parameter auto-tuning at runtime. Another challenge of this work is to exploit larger degrees of parallelism by using the combined computational power of multiple GPU devices. The approach has been applied to the permutation flow-shop problem. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla S1070 Computing System equipped with two Tesla T10 GPUs. Compared to a CPU-based execution, speedups of up to 105x are achieved for large problem instances.
Comment: 14th IEEE International Conference on High Performance Computing and Communications, HPCC 2012 (2012)

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

Adaptive Dynamic Load Balancing in Heterogenous Multiple GPUs-CPUs Distributed Setting: Case Study of B&B Tree Search

Author: Derbel Bilel
Melab Nouredine
Vu Trong-Tuan
Publication venue: Lecture Notes in Computer Science
Publication date: 07/01/2013

The emergence of new hybrid and heterogeneous multi-GPU, multi-CPU large-scale platforms offers new opportunities and poses new challenges when solving difficult optimization problems. This paper targets irregular tree search algorithms in which the workload is unpredictable. We propose an adaptive distributed approach that distributes the load dynamically at runtime while taking into account the computing abilities of both GPUs and CPUs. Using Branch-and-Bound and Flowshop as a case study, we deployed our approach using up to 20 GPUs together with up to 128 CPUs. Through extensive experiments in different system configurations, we report near-optimal speedups, thus providing new insights into how to take full advantage of the power of both GPUs and CPUs in modern computing platforms.

HAL - Lille 3

INRIA a CCSD electronic archive server

Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem

Author: Bendjoudi Ahcène
Chakroun Imen
Melab Nouredine
Publication venue: HAL CCSD
Publication date: 10/09/2011

In this paper, we propose a pioneering work on designing and programming B&B algorithms on GPU. To the best of our knowledge, no prior contribution has addressed this challenge. We focus on the parallel evaluation of the bounds for the Flow-shop scheduling problem. To deal with thread divergence caused by the bounding operation, we investigate two software-based approaches called thread data reordering and branch refactoring. Experiments show that parallel evaluation of bounds speeds up execution by up to 54.5 times compared to a CPU version.

HAL - Lille 3

INRIA a CCSD electronic archive server

B&B@Grid : une approche efficace pour la gridification d'un algorithme Branch and Bound [B&B@Grid: an efficient approach for gridifying a Branch and Bound algorithm]

Author: Melab Nouredine
Mezmaz Mohand
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2009

Solving large combinatorial optimization problems exactly, such as scheduling problems, is a real challenge for computational grids. Solution algorithms must be rethought to account for the characteristics of such environments, notably their large scale, the heterogeneity and dynamic availability of their resources, and their multiple administrative domains. In this article, we propose B&B@Grid, a new approach for porting exact Branch-and-Bound methods to computational grids. The approach encodes work units (sub-problems) as intervals, which minimizes the cost of the communications induced by load balancing, fault tolerance, and termination detection. This approach, which far outperforms the best approaches known in the literature in terms of communication and checkpointing cost, made it possible to solve to optimality, on the French national grid Grid'5000, a standard Flow-Shop instance that had remained unsolved for some fifteen years. The Flow-Shop is one of the most studied scheduling problems.
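The idea of encoding work units as intervals can be sketched as follows. This is a minimal illustration, assuming a permutation tree whose leaves are numbered in lexicographic order, so that a node at depth d over n jobs covers (n - d)! consecutive leaves. The function names and the halving policy used for work sharing are hypothetical and are not taken from the abstract.

```python
from math import factorial

def node_interval(prefix, n):
    """Return the half-open interval [start, end) of leaf ranks covered by
    the sub-tree whose first jobs are fixed to prefix (jobs are 0..n-1)."""
    remaining = list(range(n))
    start = 0
    for depth, job in enumerate(prefix):
        idx = remaining.index(job)
        # Each earlier sibling at this depth hides (n - depth - 1)! leaves.
        start += idx * factorial(n - depth - 1)
        remaining.pop(idx)
    return start, start + factorial(n - len(prefix))

def split_interval(a, b):
    """Illustrative work-sharing policy: hand over the upper half of [a, b)."""
    mid = (a + b) // 2
    return (a, mid), (mid, b)
```

With such an encoding, a work unit is described by two integers regardless of how many tree nodes it contains, so a load-balancing or checkpointing message only needs to carry the pair `(start, end)`.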

HAL - Lille 3

INRIA a CCSD electronic archive server

A Multi-start Local Search Scheduler for an Energy-aware Cloud Manager

Author: Kessaci Yacine
Nouredine Melab
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 28/10/2012

The field of cloud computing uses different management techniques for data center virtualization, such as OpenNebula. However, the computers composing the cloud infrastructure use a significant and growing portion of the world's energy, especially when dealing with virtualization for high performance computing (HPC). Therefore, energy-aware computing is crucial for large-scale systems that consume a considerable amount of energy. In this paper, we present a new work that addresses energy consumption within a realistic cloud infrastructure using OpenNebula as a software management solution. Our scheduler is based on a multi-start local search heuristic that helps to find the best schedule by dispatching arriving virtual machines (VMs) according to the minimum energy consumption.
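A multi-start local search of the kind named in this abstract can be sketched in a few lines. The energy model below (an idle cost per powered-on host, a linear cost per unit of load, and a penalty for exceeding capacity) is a hypothetical stand-in, as are all function and parameter names; the abstract does not specify the authors' model or neighborhood.

```python
import random

def cost(assign, demand, capacity, idle, per_unit, penalty=1000.0):
    """Hypothetical energy model: powered-on hosts pay idle + linear load cost;
    capacity violations are penalized rather than forbidden."""
    load = [0.0] * len(capacity)
    for vm, host in enumerate(assign):
        load[host] += demand[vm]
    total = 0.0
    for h, used in enumerate(load):
        if used > 0:
            total += idle[h] + per_unit[h] * used
        total += penalty * max(0.0, used - capacity[h])
    return total

def local_search(assign, evaluate, n_hosts):
    """Move one VM at a time to its best host until no move improves the cost."""
    assign = list(assign)
    best = evaluate(assign)
    improved = True
    while improved:
        improved = False
        for vm in range(len(assign)):
            keep = assign[vm]
            for host in range(n_hosts):
                if host == keep:
                    continue
                assign[vm] = host
                c = evaluate(assign)
                if c < best:
                    best, keep, improved = c, host, True
            assign[vm] = keep  # Leave the VM on the best host found.
    return assign, best

def multi_start(demand, capacity, idle, per_unit, starts=20, seed=0):
    """Run the local search from several random assignments; keep the best."""
    rng = random.Random(seed)
    n_hosts = len(capacity)
    evaluate = lambda a: cost(a, demand, capacity, idle, per_unit)
    best_assign, best_cost = None, float("inf")
    for _ in range(starts):
        start = [rng.randrange(n_hosts) for _ in demand]
        assign, c = local_search(start, evaluate, n_hosts)
        if c < best_cost:
            best_assign, best_cost = assign, c
    return best_assign, best_cost
```

Restarting from several random assignments reduces the chance of stopping in a poor local optimum, at the price of repeating the search; the result is a local optimum for the single-VM move neighborhood, not a guaranteed global one.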

HAL - Lille 3

INRIA a CCSD electronic archive server

Parallel Hybrid Evolutionary Algorithms on GPU

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2010

Over recent years, interest in hybrid metaheuristics has risen considerably in the field of optimization. Combinations of methods such as evolutionary algorithms and local searches have provided very powerful search algorithms. However, due to their complexity, the computational time of the search remains exorbitant when large problem instances are to be solved. Therefore, GPU-based parallel computing is required as a complementary way to speed up the search. This paper presents a new methodology to design and implement hybrid evolutionary algorithms efficiently and effectively on GPU accelerators. The methodology enables efficient mappings of the explored search space onto the GPU memory hierarchy. The experimental results show that the approach is very efficient, especially for large problem instances.

HAL - Lille 3

INRIA a CCSD electronic archive server

A Pareto-based Genetic Algorithm for Optimized Assignment of VM Requests on a Cloud Brokering Environment

Author: Kessaci Yacine
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2013

In this paper, we deal with cloud brokering for the assignment optimization of VM requests in three-tier cloud infrastructures. We investigate the Pareto-based meta-heuristic approach to take into account multiple client-centric and broker-centric optimization criteria. We propose a new multi-objective Genetic Algorithm (MOGA-CB) that can be integrated in a cloud broker. Two objectives are considered in the optimization process: minimizing both the response time and the cost of the selected VM instances, to satisfy the clients and to maximize the profit of the broker. The approach was evaluated using realistic data on different types of Amazon EC2 instances and their pricing history. The reported results show that MOGA-CB efficiently provides effective Pareto sets of solutions.
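The notion of a Pareto set for two minimized objectives, such as the response time and cost mentioned in this abstract, can be illustrated with a short dominance filter. This is a generic sketch of Pareto dominance, not the MOGA-CB algorithm itself; the function names and the sample values are hypothetical.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (response_time, cost) pairs for candidate VM assignments.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1)]
front = pareto_front(candidates)  # (3, 4) is dominated by (2, 3)
```

A multi-objective genetic algorithm would typically apply a filter of this kind to its population to decide which solutions to retain; the quadratic scan is adequate for small sets and is chosen here for clarity.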

HAL - Lille 3

CiteSeerX

INRIA a CCSD electronic archive server