5 research outputs found

    A Parallel Branch-and-Bound Method for Cluster Analysis

    Cluster analysis is a generic term for procedures used to group entities objectively based on their similarities and differences. The primary objective of these procedures is to group n items into K mutually exclusive clusters so that items within each cluster are relatively homogeneous while the clusters themselves are distinct. In this research, we developed, implemented and tested an asynchronous, dynamic parallel branch-and-bound algorithm to solve the clustering problem. In the development environment, several processes (tasks) work independently on various subproblems generated by the branch-and-bound procedure. This parallel algorithm can solve very large-scale, optimal clustering problems in a reasonable amount of wall-clock time. Linear and superlinear speedups are obtained. Thus, solutions to real-world, complex clustering problems, which previously could not be solved for lack of efficient parallel algorithms, can now be attempted.
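The branch-and-bound clustering formulation summarized above can be sketched compactly. The following Python fragment is an illustrative serial version only, not the authors' parallel code; the pairwise-distance objective, the pruning bound, and the symmetry-breaking rule are our own simplifications. It assigns n items to K clusters one at a time and prunes any partial assignment whose accumulated within-cluster distance already matches or exceeds the best complete clustering found so far:

```python
def bb_cluster(dist, n, K):
    """Exact clustering by branch-and-bound: assign items 0..n-1 to K
    mutually exclusive clusters, minimizing the total within-cluster
    pairwise distance. `dist` is a symmetric n x n distance matrix."""
    best_cost = float("inf")
    best_assign = None

    def branch(i, assign, cost, used):
        nonlocal best_cost, best_assign
        if cost >= best_cost:          # bound: partial cost only grows
            return
        if i == n:
            if used == K:              # require all K clusters non-empty
                best_cost, best_assign = cost, assign[:]
            return
        # symmetry breaking: item i may only open cluster number `used`
        for c in range(min(used + 1, K)):
            extra = sum(dist[j][i] for j in range(i) if assign[j] == c)
            assign.append(c)
            branch(i + 1, assign, cost + extra, max(used, c + 1))
            assign.pop()

    branch(0, [], 0, 0)
    return best_cost, best_assign
```

A parallel version in the spirit of the paper would hand disjoint subtrees of this search to independent processes, each pruning against a shared incumbent bound.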

    EFFICIENT IMPLEMENTATION OF BRANCH-AND-BOUND METHOD ON DESKTOP GRIDS

    The Berkeley Open Infrastructure for Network Computing (BOINC) is an open-source middleware system for volunteer and desktop grid computing. In this paper we propose BNBTEST, a BOINC version of the distributed branch-and-bound method. The crucial issues of the distributed branch-and-bound method are traversing the search tree and load balancing. We developed a subtask-packaging method and three different subtask distribution strategies to address these issues.
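The subtask-packaging idea can be illustrated with a small sketch. This Python fragment uses hypothetical helper names (the abstract does not describe BNBTEST's actual BOINC work-unit format): it expands the root of the branch-and-bound tree until enough open subproblems exist, then deals them into packages that grid clients can process independently:

```python
from collections import deque

def package_subtasks(root, expand, n_packages):
    """Expand the branch-and-bound tree breadth-first until at least
    n_packages open subproblems exist, then deal them round-robin into
    packages that desktop-grid clients can solve independently."""
    frontier = deque([root])
    while 0 < len(frontier) < n_packages:
        # replace one open subproblem by its children
        frontier.extend(expand(frontier.popleft()))
    packages = [[] for _ in range(n_packages)]
    for k, sub in enumerate(frontier):
        packages[k % n_packages].append(sub)   # round-robin distribution
    return packages
```

Round-robin dealing is just one of several possible distribution strategies; the paper compares three, which the abstract does not detail.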

    Reducing thread divergence in a GPU-accelerated branch-and-bound algorithm

    In this paper, we address the design and implementation of GPU-accelerated branch-and-bound algorithms (B&B) for solving flow-shop scheduling problems (FSP). Such applications are CPU-time consuming and highly irregular. On the other hand, GPUs are massively multi-threaded accelerators using the SIMD execution model. A major issue that arises when executing a B&B applied to FSP on a GPU is thread (branch) divergence. This divergence is caused by the FSP lower-bound function, which contains many irregular loops and conditional instructions. Our challenge is therefore to revisit the design and implementation of B&B applied to FSP to deal with thread divergence. Extensive experiments with the proposed approach were carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. Compared to a CPU-based execution, speedups of up to 77.46x are achieved for large problem instances.
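The cost of thread divergence, and the benefit of regrouping similar nodes, can be shown with a toy SIMD cost model rather than actual GPU code. In this Python sketch (our own simplification, not the paper's method: each pending node is reduced to the number of loop iterations its lower-bound evaluation needs, and a warp is charged for its slowest thread), sorting workloads before packing them into warps lowers the total cost:

```python
def warp_cost(warps):
    """SIMD cost model: a warp finishes when its slowest thread does, so
    each warp costs warp_size * max(iteration counts in the warp)."""
    return sum(len(w) * max(w) for w in warps)

def schedule(iter_counts, warp_size, regroup):
    """Pack per-node bound-function workloads into warps; with regroup=True,
    sort first so threads in the same warp follow similar control flow."""
    work = sorted(iter_counts) if regroup else list(iter_counts)
    return [work[i:i + warp_size] for i in range(0, len(work), warp_size)]
```

With alternating light and heavy nodes, naive packing puts a heavy node in every warp and every warp pays the heavy price; regrouping confines the heavy nodes to fewer warps.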

    Parallelization of the Branch-and-Bound Technique with PVM

    Advisor: Roberto A. Hexsel. Master's dissertation - Universidade Federal do Paraná. Abstract: This work presents a parallel implementation of the Branch-and-Bound technique for solving combinatorial optimization problems, specifically graph search. The implementation is based on message passing, using the Parallel Virtual Machine (PVM) library on top of the Linux operating system on a network of PCs. The behaviour of this Branch-and-Bound implementation is analyzed, with emphasis on the relationships between (a) three graph search strategies, (b) memory resource utilization, and (c) granularity (computation versus communication). The implementation follows a semi-distributed master-slave scheme: the master process is responsible for allocating work among the slaves, and the slave processes perform the graph search and communicate partial results to the master. We present experimental data from several runs of the program on the network of PCs and analyse the performance of the implementation. We also discuss several issues related to the global performance attained and to the use of the PVM library on this architecture. Our implementation achieved up to 98% efficiency in some of the experiments performed.
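The master-slave scheme described in the abstract can be sketched with threads and shared queues standing in for PVM message passing (a deliberate simplification: real PVM exchanges messages between processes on different machines via pvm_send/pvm_recv):

```python
import queue
import threading

def master_slave_bb(subproblems, solve, n_slaves):
    """The master hands out subproblems from a shared pool; each slave
    solves them and reports its partial results (here: bound values)
    back to the master, which keeps the best one."""
    tasks, results = queue.Queue(), queue.Queue()
    for sp in subproblems:
        tasks.put(sp)

    def slave():
        while True:
            try:
                sp = tasks.get_nowait()   # request work from the pool
            except queue.Empty:
                return                    # no work left: slave terminates
            results.put(solve(sp))        # report a partial result

    workers = [threading.Thread(target=slave) for _ in range(n_slaves)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    best = None                           # master keeps the best bound
    while not results.empty():
        r = results.get()
        best = r if best is None or r < best else best
    return best
```

In the semi-distributed variant of the dissertation, slaves additionally disseminate partial results among themselves rather than routing everything through the master.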

    Decomposition Methods for Parallelizing the Integral Simplex

    Abstract: The set partitioning problem (SPP) is an integer linear programming problem used to model many industrial problems such as personnel scheduling, logistics and pattern recognition. In the transport industry, it consists of partitioning a set of tasks (plane flights, bus itinerary segments, ...) into subsets (vehicle routes or crew rotations) so that the selected subsets have minimum total cost and each task belongs to exactly one subset. Usually, the SPP is solved by the branch-and-bound method or its variants. These methods are known to be slow on large, difficult problems. In industry, however, it is valuable to obtain a solution quickly and to exploit available information such as an existing initial solution, especially in re-optimization. This is naturally provided by primal methods, which, starting from an initial solution, produce a sequence of solutions of decreasing cost that converges towards an optimal or near-optimal solution. The Integral Simplex Using Decomposition (ISUD) is a primal method dedicated to solving the SPP. At each iteration, it decomposes the original problem into two subproblems. The first, called the reduced problem (RP), considers only the columns that are compatible with the current solution, i.e., those expressible as a linear combination of the non-degenerate columns/variables of the current solution. The second, called the complementary problem (CP), deals only with the columns that are incompatible with the current solution. The complementary problem finds a descent direction composed of several variables that guarantees a better, but not necessarily integer, solution. In the case of fractional solutions, a depth-first branching often leads quickly to an integer solution.
    Computing hardware has evolved impressively: a current laptop is the equivalent of the biggest machines of the 1970s, and the current trend is towards multi-core processors and the design and implementation of parallel algorithms. The general objective of this thesis is to study the contribution of parallelism to ISUD. We propose parallel implementations of ISUD in order to improve its performance and profit from this evolution. To design these algorithms, we exploited the parallelism within ISUD and introduced decompositions specific to the SPP.
    In a first step, we group the columns of the current solution into clusters in order to decompose the initial problem into independent subproblems, which are solved in parallel so that an improving solution can be obtained by combining the subproblems' optimal solutions. To do so, we construct a graph whose nodes are the columns of the current solution. The weight of edge (i, j) is computed by density functions that use information from the original problem, such as the number of columns that cover tasks of both columns Ai and Aj of the current solution. The constructed graph is split into subgraphs, yielding a set of clusters of the current solution. Thus, we add a second, dynamic decomposition to the RP-CP decomposition that is intrinsic to ISUD. The result is a parallel algorithm, the Integral Simplex Using Double Decomposition, called ISU2D. We tested it on bus driver scheduling instances with up to 1,600 constraints and 570,000 variables. ISU2D reduces the computing time of ISUD by a factor of 3, even 4 for some instances. It reaches an optimal or near-optimal solution for the majority of these instances in less than 10 minutes, whereas the commercial solver CPLEX cannot even find a feasible solution with a gap of less than 10% within a one-hour time limit. ISU2D, however, suffers from the limitation of using only a single decomposition of the current solution at a time.
    In a second step, we improve our algorithm by generalizing the second decomposition concept, with the goal of using multiple dynamic decompositions simultaneously. We propose DISUD, a distributed algorithm based on ISUD and the multi-agent system (MAS) paradigm. Each agent is an entity that is, at least partially, autonomous, characterized by the dynamic decomposition of the current solution that it applies. We implemented two variants in which agents are either independent or cooperating, according to the adopted strategy. Thus, we increase the performance of ISU2D and benefit further from the evolution of computing hardware. Tests on airline crew scheduling instances show that DISUD performs better than DCPLEX, the distributed version of the state-of-the-art commercial solver CPLEX: it reaches solutions of better quality and reduces the computing time by a factor of 4 on average, up to 5 for some instances. In addition, it solved large instances that DCPLEX could not improve within a one-hour time limit.
    In a third work, we integrate DISUD into a column generation (CG) context. This choice is justified by the fact that coupling column generation with enumeration methods such as branch-and-price is widely used in industry, and ISUD has the potential to replace the usual enumeration methods for solving the SPP. We therefore develop DICG, a distributed column generation algorithm that uses DISUD instead of enumeration methods. Our tests showed that DICG yields good-quality solutions (gap less than 1%) and reduces the computing time by a factor of 2, even 4, compared to DRMH, a distributed version of the Restricted Master Heuristic. With these three works, we have contributed to the evolution of ISUD, improved its performance and reduced its computing time, and paved the way for future work to broaden the use of the distributed version of ISUD, for example by making the agents more intelligent via learning algorithms.
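The notion of columns compatible with the current solution, central to ISUD's RP/CP split, can be illustrated with a simplified test. Assuming a non-degenerate current solution of a set partitioning problem (so its columns partition the tasks, a simplification of the linear-combination definition used by ISUD), a column is compatible exactly when the tasks it covers are the union of the task sets of some current-solution columns:

```python
def is_compatible(column, solution_columns):
    """Simplified ISUD compatibility test: `column` and each entry of
    `solution_columns` are lists of task ids; the current solution is
    assumed non-degenerate, so its columns partition the tasks."""
    covered = set(column)
    union = set()
    for c in solution_columns:
        if set(c) <= covered:   # solution column lies fully inside `column`
            union |= set(c)
    return union == covered     # chosen columns must cover `column` exactly
```

Compatible columns go to the reduced problem; the remaining, incompatible columns feed the complementary problem that searches for a multi-column descent direction.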