5 research outputs found
A Parallel Branch-and-Bound Method for Cluster Analysis
Cluster analysis is a generic term for procedures used to objectively group entities based on their similarities and differences. The primary objective of these procedures is to group n items into K mutually exclusive clusters so that items within each cluster are relatively homogeneous while the clusters themselves are distinct. In this research, we developed, implemented and tested an asynchronous, dynamic parallel branch-and-bound algorithm to solve the clustering problem. In the development environment, several processes (tasks) work independently on the various subproblems generated by the branch-and-bound procedure. This parallel algorithm can solve very large-scale optimal clustering problems in a reasonable amount of wall-clock time. Linear and superlinear speedups are obtained. Thus, solutions to real-world, complex clustering problems, previously out of reach for lack of efficient parallel algorithms, can now be attempted.
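The core recursion of such a method can be sketched as a compact sequential branch-and-bound for optimal K-clustering. This is an illustrative sketch only, not the paper's parallel algorithm; the function name `cluster_bb`, the pairwise-dissimilarity objective, and the symmetry-breaking rule are our own choices:

```python
def cluster_bb(dist, K):
    """Optimal K-clustering by branch and bound (illustrative sketch).

    dist: symmetric matrix of pairwise dissimilarities.
    Returns (best_cost, best_labels) minimizing the total
    within-cluster sum of pairwise dissimilarities.
    """
    n = len(dist)
    best = [float("inf"), None]      # incumbent: [cost, labels]

    def branch(i, labels, cost):
        if cost >= best[0]:          # bound: cost only grows as items are added
            return
        if i == n:                   # complete assignment better than incumbent
            best[0], best[1] = cost, labels[:]
            return
        # symmetry breaking: item i may reuse an open cluster or open one new one
        top = max(labels) if labels else -1
        for c in range(min(top + 1, K - 1) + 1):
            extra = sum(dist[i][j] for j in range(i) if labels[j] == c)
            labels.append(c)
            branch(i + 1, labels, cost + extra)
            labels.pop()

    branch(0, [], 0.0)
    return best[0], best[1]
```

The bound works because the within-cluster cost is monotone nondecreasing as items are assigned; a parallel version like the one described above would explore disjoint subtrees of this recursion concurrently while sharing the incumbent.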
EFFICIENT IMPLEMENTATION OF BRANCH-AND-BOUND METHOD ON DESKTOP GRIDS
The Berkeley Open Infrastructure for Network Computing (BOINC) is an open-source middleware system for volunteer and desktop grid computing. In this paper we propose BNBTEST, a BOINC version of the distributed branch-and-bound method. The crucial issues of a distributed branch-and-bound method are traversing the search tree and load balancing. We developed a subtask-packaging method and three different subtask distribution strategies to address these issues.
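The subtask-packaging idea can be illustrated with a minimal sketch: expand the top of the search tree until enough open subproblems exist, then bundle them into packages for distribution. The function names and the round-robin bundling strategy are assumptions made for illustration, not BNBTEST's actual scheme:

```python
from collections import deque

def package_subtasks(root, branch, n_packages, min_frontier=None):
    """Expand the top of a branch-and-bound tree breadth-first until the
    frontier holds enough open subproblems, then deal them round-robin
    into n_packages subtask packages for distribution to clients.

    branch(node) -> list of child subproblems, or [] if node is solved.
    """
    if min_frontier is None:
        min_frontier = 4 * n_packages      # a few subtasks per package
    frontier = deque([root])
    while frontier and len(frontier) < min_frontier:
        children = branch(frontier.popleft())
        frontier.extend(children)          # [] for a solved node: it is dropped
    nodes = list(frontier)
    return [nodes[i::n_packages] for i in range(n_packages)]
```

Packaging several subtasks per work unit amortizes the per-message overhead of a desktop grid, where communication with volunteer hosts is slow and unreliable.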
Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm
In this paper, we address the design and implementation of GPU-accelerated Branch-and-Bound algorithms (B&B) for solving Flow-shop scheduling optimization problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD execution model. A major issue that arises when executing a B&B applied to FSP on a GPU is thread (branch) divergence, caused by the FSP lower-bound function, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP to deal with thread divergence. Extensive experiments with the proposed approach were carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, speedups of up to ×77.46 are achieved for large problem instances.
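The divergence problem, and the benefit of grouping work items by branch outcome, can be simulated in a few lines. This is a toy model of SIMD execution (a warp pays one pass per distinct branch outcome among its lanes), not the paper's CUDA implementation; all names are ours:

```python
def warp_divergence_cost(flags, warp_size=32):
    """Simulated SIMD cost: each warp pays one pass per distinct branch
    outcome among its lanes (2 when lanes diverge, 1 when they agree)."""
    total = 0
    for w in range(0, len(flags), warp_size):
        total += len(set(flags[w:w + warp_size]))
    return total

def regroup_by_branch(items, predicate):
    """Reorder work items so that items taking the same branch are
    contiguous, making warps branch-homogeneous."""
    return sorted(items, key=predicate)
```

For 128 work items with alternating branch outcomes, every warp diverges and the simulated cost is 8 passes; sorting the items by their branch predicate makes each warp homogeneous and halves the cost to 4.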
Parallelization of the Branch-and-Bound Technique with PVM
Advisor: Roberto A. Hexsel. Dissertation (Master's) - Universidade Federal do Paraná. Abstract: This work presents a parallel implementation of the Branch-and-Bound technique for solving combinatorial optimization problems, specifically graph search. The implementation is based on message passing, using the Parallel Virtual Machine (PVM) library on top of the Linux operating system on a network of PCs. The behaviour of this implementation of Branch-and-Bound is analyzed, with emphasis on the relationships between (a) three graph search strategies, (b) memory resource utilization, and (c) granularity (computation versus communication). The implementation uses a semi-distributed master-slave scheme: the master process is responsible for allocating work amongst the slaves, and the slave processes perform the graph search and communicate partial results to the master. We present experimental data from several runs of the program on the network of PCs and analyse the performance of the implementation. We also discuss several issues related to the global performance attained and to the use of the PVM library on this architecture. Our implementation achieved parallel efficiencies of up to 98% relative to serial execution in some of the experiments performed.
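The master-slave scheme can be sketched as follows, with threads and a shared queue standing in for PVM tasks and messages (an assumption of this sketch; the dissertation distributes processes over a network of PCs, and all names here are ours):

```python
import threading
import queue

def master_worker_bb(subproblems, solve, n_workers=4):
    """Master-worker B&B skeleton: the master deals subproblems to workers;
    workers solve them against a shared incumbent and report improvements
    (the "dissemination of partial results" of the dissertation)."""
    tasks = queue.Queue()
    for sp in subproblems:
        tasks.put(sp)
    incumbent = {"cost": float("inf"), "solution": None}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                sp = tasks.get_nowait()
            except queue.Empty:
                return                         # no work left: worker retires
            cost, sol = solve(sp, incumbent["cost"])  # solver may prune on bound
            with lock:                         # report a partial result
                if cost < incumbent["cost"]:
                    incumbent["cost"], incumbent["solution"] = cost, sol

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return incumbent["cost"], incumbent["solution"]
```

Passing the current incumbent cost to each worker lets subproblem solvers prune branches that cannot improve on it, which is where the parallel efficiency of such schemes comes from.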
Decomposition Methods for the Parallelization of the Integral Simplex
ABSTRACT: The SPP (set partitioning problem) is an integer linear programming problem used to model industrial problems in many domains, such as personnel scheduling, logistics and pattern recognition. In the transport industry, it consists of partitioning a set of tasks (plane flights, bus itinerary segments, ...) into subsets
(vehicle routes or crew rotations) so that the selected subsets have a minimum total cost and each task belongs to exactly one subset. Usually, SPP is solved by the branch-and-bound method or its variants. These methods are known to be slow on large, difficult problems. However, in industry it is valuable to obtain a solution as quickly as possible and to exploit available information, such as an existing initial solution, especially when re-optimizing. This is naturally provided by primal methods, which, starting from an initial solution, produce a sequence of solutions of decreasing cost that converges towards an optimal or near-optimal solution. The Integral Simplex Using Decomposition (ISUD) is a primal method dedicated to solving SPP. At each iteration, it decomposes the original problem into two sub-problems. The first, called the reduced problem (RP), considers only the so-called compatible columns, i.e., those that can be written as a linear combination of the non-degenerate columns of the current solution. The second, called the complementary problem (CP), deals only with the columns that are incompatible with the current solution. The complementary problem finds a descent direction composed of several variables that may lead to a fractional or an integer solution. In the case of a fractional solution, depth-first branching often leads quickly to an integer solution.

Computing hardware has evolved impressively: a current laptop is the equivalent of the biggest machines of the 1970s. The current trend is to produce multi-core processors and to design and implement parallel computing techniques. The general objective of this thesis is to study and apply parallel computing techniques to ISUD. We propose parallel implementations of ISUD in order to improve its performance and to profit from the contemporary evolution of computing hardware. To design these algorithms we exploited parallelism within ISUD and introduced decompositions specific to SPP.

First, we group the columns of the current solution into clusters in order to decompose the initial problem into independent sub-problems. These are solved in parallel, and an improving solution is obtained by combining the optimal solutions of the sub-problems. To do so, we construct a graph whose nodes are the columns of the current solution. The weight of edge (i, j) is computed by density functions that use information from the original problem, such as the number of columns that cover tasks of both columns Ai and Aj of the current solution. The constructed graph is split into subgraphs, yielding a set of clusters of the current solution. Thus, we add a second, dynamic decomposition to the RP-CP decomposition that is intrinsic to ISUD. The result is a parallel algorithm, the Integral Simplex Using Double Decomposition, called ISU2D. We tested it on bus driver scheduling instances with up to 1600 constraints and 570000 variables. ISU2D reduces the computing time of ISUD by a factor of 3, even 4 for some instances. It reaches an optimal or near-optimal solution for the majority of these instances in less than 10 minutes, while the commercial solver CPLEX cannot even find a feasible solution with a gap of less than 10% within a one-hour time limit. ISU2D suffers, however, from the limitation of using a single decomposition of the current solution at a time.
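The clustering of the current solution's columns can be sketched with a simplified weighting function: for each pair of solution columns, count the outside columns that touch tasks of both, then take connected components after dropping light edges. The thesis's density functions and graph-splitting method are more elaborate; everything below, including the names and the threshold rule, is illustrative:

```python
def decompose_solution(solution_cols, other_cols, threshold=1):
    """Group the columns of the current solution into clusters.

    Columns are sets of tasks. The weight of edge (i, j) counts the
    columns outside the solution covering tasks of both solution
    columns i and j (a simplified stand-in for the density functions).
    Clusters are the connected components after dropping edges whose
    weight is below `threshold`; each becomes an independent sub-problem.
    """
    n = len(solution_cols)
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            w = sum(1 for c in other_cols
                    if c & solution_cols[i] and c & solution_cols[j])
            if w >= threshold:
                adj[i].append(j)
                adj[j].append(i)
    # connected components = clusters of strongly linked columns
    seen, clusters = set(), []
    for s in range(n):
        if s in seen:
            continue
        comp, stack = [], [s]
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        clusters.append(sorted(comp))
    return clusters
```

Columns that interact through many shared covering columns end up in the same cluster, so the sub-problems cut as few improving exchanges as possible when solved independently.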
In a second step, we improve our algorithm by generalizing the second decomposition concept. Our goal is to use multiple dynamic decompositions simultaneously. We propose DISUD, a distributed algorithm based on ISUD and the multi-agent system (MAS) paradigm. Each agent is an entity that is, at least partially, autonomous and is characterized by the dynamic decomposition of the current solution that it applies. We implemented two variants, in which agents are independent or cooperate according to the adopted strategy. We thus increase the performance of ISU2D and benefit further from the evolution of computing hardware. We tested DISUD on airline crew scheduling problems. The results show that DISUD outperforms DCPLEX, the distributed version of the state-of-the-art commercial solver CPLEX, on our test instances. It reaches solutions of better quality than DCPLEX and reduces the computing time by an average factor of 4, up to 5 for some instances. In addition, it solved large instances that DCPLEX could not improve within a one-hour time limit.

In a third work, we integrate DISUD into a column generation (CG) context. This choice is justified by the fact that coupling column generation with enumeration methods such as branch-and-price is widely used in industry. ISUD has the potential to replace the usual enumeration methods for solving SPP, so there is potential in integrating CG and DISUD to address industrial problems. We develop DICG, a distributed column generation algorithm that uses DISUD instead of enumeration methods. Our tests showed that DICG yields solutions of good quality (gap below 1%) and reduces the computing time by a factor of 2, even 4, compared to DRMH, a distributed version of the Restricted Master Heuristic. With these three works, we have contributed to the evolution of ISUD, improved its performance, and reduced its computing time. We have also paved the way for future work broadening the uses of the distributed version of ISUD, for example by making the agents more intelligent through learning algorithms.
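The primal ISUD loop described in this abstract (reduced problem, complementary problem, depth-first branching) can be summarized as a skeleton in which the three sub-solvers are placeholder callbacks; this is a structural sketch of the iteration, not the thesis's implementation:

```python
def isud(x0, solve_reduced, solve_complementary, branch_to_integer):
    """High-level skeleton of the primal ISUD iteration.

    solve_reduced(x)        -> improved integer solution over compatible
                               columns, or None if RP cannot improve x.
    solve_complementary(x)  -> descent direction over incompatible columns
                               (possibly fractional), or None if optimal.
    branch_to_integer(x, d) -> integer solution recovered from direction d
                               by depth-first branching.
    """
    x = x0
    while True:
        y = solve_reduced(x)            # cheap improvement over compatible columns
        if y is not None:
            x = y
            continue
        d = solve_complementary(x)      # descent direction, maybe fractional
        if d is None:
            return x                    # no improving direction: x is optimal
        x = branch_to_integer(x, d)     # depth-first branching to integrality
```

Each pass through the loop produces a solution of strictly lower cost, which is the defining property of the primal approach the abstract contrasts with branch-and-bound.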