72,490 research outputs found

    Parallel branch and bound on an MIMD system

    In this paper we give a classification of parallel branch and bound algorithms and develop a class of asynchronous branch and bound algorithms for execution on an MIMD system. We develop sufficient conditions to prevent the anomalies that can occur due to parallelism, asynchronicity, or nondeterminism from degrading the performance of the algorithm. Such conditions were known already for the synchronous case. It turns out that these conditions are sufficient for asynchronous algorithms as well. We also investigate the consequences of nonhomogeneous processing elements in a parallel computer system. We introduce the notions of perfect parallel time and achieved efficiency to empirically measure the effects of parallelism, because the traditional notions of speedup and efficiency are not capable of fully characterizing the actual execution of an asynchronous parallel algorithm. Finally, we present some computational results obtained for the symmetric traveling salesman problem.
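
    For orientation, the traditional notions of speedup and efficiency that the abstract contrasts with its new measures can be sketched as follows. This is a minimal illustration of the textbook definitions only (speedup as sequential time over parallel time, efficiency as speedup per processor); the function names are hypothetical, and the paper's own "perfect parallel time" and "achieved efficiency" are not defined in the abstract, so they are not reproduced here.

```python
def speedup(t_sequential: float, t_parallel: float) -> float:
    """Traditional speedup: sequential run time divided by parallel run time."""
    if t_sequential <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential: float, t_parallel: float, processors: int) -> float:
    """Traditional efficiency: speedup divided by the number of processors."""
    if processors < 1:
        raise ValueError("need at least one processor")
    return speedup(t_sequential, t_parallel) / processors


if __name__ == "__main__":
    # Hypothetical timings, not taken from the paper.
    print(speedup(120.0, 20.0))        # 6.0
    print(efficiency(120.0, 20.0, 8))  # 0.75
    # An acceleration anomaly would show up as an efficiency above 1.
    print(efficiency(100.0, 10.0, 8))  # 1.25
```

    An efficiency above 1, as in the last line, is the kind of superlinear result that an acceleration anomaly in branch and bound can produce, which is one reason a single ratio may not characterize an asynchronous run well.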

    A simulation tool for the performance evaluation of parallel branch and bound algorithms

    Parallel computation offers a challenging opportunity to speed up the time-consuming enumerative procedures that are necessary to solve hard combinatorial problems. Theoretical analysis of such a parallel branch and bound algorithm is very hard, and empirical analysis is not straightforward because the performance of a parallel algorithm cannot be evaluated simply by executing the algorithm on a few parallel systems. Among the difficulties encountered are the noise produced by other users on the system, the limited variation in parallelism (the number of processors in the system is strictly bounded) and the waste of resources involved: most of the time, the outcomes of all computations are already known and the only issue of interest is when these outcomes are produced. We will describe a way to simulate the execution of parallel branch and bound algorithms on arbitrary parallel systems in such a way that the memory and CPU requirements are very reasonable. The use of simulation has only minor consequences for the formulation of the algorithm.

    Parallel branch and bound and anomalies

    In this paper we present a classification of parallel branch and bound algorithms and investigate the anomalies that can occur during the execution of such algorithms. We develop sufficient conditions to prevent deceleration anomalies from degrading the performance. Such conditions were already known for some synchronous cases. It turns out that these conditions can be generalized to arbitrary cases. Finally, we develop necessary conditions for acceleration anomalies to improve upon the performance.

    Experiments with parallel algorithms for combinatorial problems

    In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines constructed so far all use a simple model of parallel computation. Therefore, not every existing parallel machine is equally well suited for each type of algorithm. The adaptation of a certain algorithm to a specific parallel architecture may severely increase the complexity of the algorithm or severely obscure its essence. Little is known about the performance of some standard combinatorial algorithms on existing parallel machines. In this paper we present computational results concerning the solution of knapsack, shortest paths and change-making problems by branch and bound, dynamic programming, and divide and conquer algorithms on the ICL-DAP (an SIMD computer), the Manchester dataflow machine and the CDC-CYBER-205 (a pipeline computer).

    Replicable parallel branch and bound search

    Combinatorial branch and bound searches are a common technique for solving global optimisation and decision problems. Their performance often depends on good search order heuristics, refined over decades of algorithms research. Parallel search necessarily deviates from the sequential search order, sometimes dramatically and unpredictably, e.g. by distributing work at random. This can disrupt effective search order heuristics and lead to unexpected and highly variable parallel performance. The variability makes it hard to reason about the parallel performance of combinatorial searches. This paper presents a generic parallel branch and bound skeleton, implemented in Haskell, with replicable parallel performance. The skeleton aims to preserve the search order heuristic by distributing work in an ordered fashion, closely following the sequential search order. We demonstrate the generality of the approach by applying the skeleton to 40 instances of three combinatorial problems: Maximum Clique, 0/1 Knapsack and Travelling Salesperson. The overheads of our Haskell skeleton are reasonable: giving slowdown factors of between 1.9 and 6.2 compared with a class-leading, dedicated, and highly optimised C++ Maximum Clique solver. We demonstrate scaling up to 200 cores of a Beowulf cluster, achieving speedups of 100x for several Maximum Clique instances. We demonstrate low variance of parallel performance across all instances of the three combinatorial problems and at all scales up to 200 cores, with median Relative Standard Deviation (RSD) below 2%. Parallel solvers that do not follow the sequential search order exhibit far higher variance, with median RSD exceeding 85% for Knapsack
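
    The Relative Standard Deviation (RSD) used above as the variability measure can be sketched as follows. This is a minimal illustration assuming the usual definition (standard deviation divided by the mean, expressed as a percentage) and the sample standard deviation; the abstract does not say which estimator the authors used, and the function name and run times below are hypothetical.

```python
import statistics
from typing import Sequence


def relative_standard_deviation(runtimes: Sequence[float]) -> float:
    """Return the RSD of repeated run times as a percentage of their mean."""
    if len(runtimes) < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(runtimes)
    if mean == 0:
        raise ValueError("mean run time must be non-zero")
    # Assumption: sample (n - 1) standard deviation.
    return 100.0 * statistics.stdev(runtimes) / mean


if __name__ == "__main__":
    # Hypothetical run times in seconds for repeated runs of one instance.
    stable = [9.0, 10.0, 11.0]
    erratic = [2.0, 10.0, 18.0]
    print(relative_standard_deviation(stable))
    print(relative_standard_deviation(erratic))
```

    Because RSD is normalized by the mean, it allows variability to be compared across instances whose absolute run times differ widely.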

    A GPU-accelerated Branch-and-Bound Algorithm for the Flow-Shop Scheduling Problem

    Branch-and-Bound (B&B) algorithms are time-intensive tree-based exploration methods for solving combinatorial optimization problems to optimality. In this paper, we investigate the use of GPU computing as a major complementary way to speed up those methods. The focus is put on the bounding mechanism of B&B algorithms, which is the most time-consuming part of their exploration process. We propose a parallel B&B algorithm based on a GPU-accelerated bounding model. The proposed approach concentrates on optimizing data access management to further improve the performance of the bounding mechanism, which uses large and intermediate data sets that do not completely fit in GPU memory. Extensive experiments have been carried out on well-known FSP benchmarks using an Nvidia Tesla C2050 GPU card. We compared the obtained performance to single-threaded and multithreaded CPU-based executions. Accelerations of up to 100x are achieved for large problem instances.

    On parallel Branch and Bound frameworks for Global Optimization

    Branch and Bound (B&B) algorithms are known to exhibit an irregularity of the search tree. Therefore, developing a parallel approach for this kind of algorithm is a challenge. The efficiency of a B&B algorithm depends on the chosen Branching, Bounding, Selection, Rejection, and Termination rules. The question we investigate is how the chosen platform, consisting of programming language, libraries, or skeletons, influences programming effort and algorithm performance. In frameworks with a high level of abstraction, the selection rule and data management structures are usually hidden from programmers, as is the load balancing strategy when the algorithm is run in parallel. We investigate the question by implementing a multidimensional Global Optimization B&B algorithm with the help of three frameworks with different levels of abstraction (from more to less): Bobpp, Threading Building Blocks (TBB), and a customized Pthread implementation. The following has been found. The Bobpp implementation is easy to code, but exhibits the poorest scalability. In contrast, the TBB and Pthread implementations scale almost linearly on the platform used. The TBB approach shows slightly better productivity.
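
    The five rules named in the abstract can be made concrete with a small sequential sketch. Note the assumptions: the paper applies B&B to multidimensional Global Optimization, whereas this illustration swaps in the 0/1 knapsack problem because it is compact; the function name is hypothetical, weights are assumed to be positive, and nothing here reflects the Bobpp, TBB, or Pthread implementations.

```python
import heapq
from typing import Sequence


def knapsack_branch_and_bound(values: Sequence[int],
                              weights: Sequence[int],
                              capacity: int) -> int:
    """Best-first branch and bound for 0/1 knapsack (assumes weights > 0)."""
    # Order items by value density so the fractional bound below is valid.
    items = sorted(zip(values, weights),
                   key=lambda vw: vw[0] / vw[1], reverse=True)
    n = len(items)

    def upper_bound(level: int, value: int, room: int) -> float:
        # Bounding rule: greedy fractional relaxation over remaining items.
        bound = float(value)
        for v, w in items[level:]:
            if w <= room:
                room -= w
                bound += v
            else:
                bound += v * room / w
                break
        return bound

    best = 0  # value of the incumbent (best feasible solution so far)
    # Selection rule: best-first, using a max-heap via negated bounds.
    heap = [(-upper_bound(0, 0, capacity), 0, 0, capacity)]
    while heap:  # Termination rule: stop when no open subproblem remains.
        neg_bound, level, value, room = heapq.heappop(heap)
        # Rejection rule: discard nodes that cannot beat the incumbent.
        if -neg_bound <= best or level == n:
            continue
        v, w = items[level]
        # Branching rule: either include or exclude the next item.
        if w <= room:
            best = max(best, value + v)
            heapq.heappush(heap, (-upper_bound(level + 1, value + v, room - w),
                                  level + 1, value + v, room - w))
        heapq.heappush(heap, (-upper_bound(level + 1, value, room),
                              level + 1, value, room))
    return best


if __name__ == "__main__":
    print(knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50))  # 220
```

    In a parallel setting, the priority queue behind the selection rule is the shared data structure whose management and load balancing a high-level framework would typically hide.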

    Efficient Computation of Expected Hypervolume Improvement Using Box Decomposition Algorithms

    In the field of multi-objective optimization algorithms, multi-objective Bayesian Global Optimization (MOBGO) is an important branch, in addition to evolutionary multi-objective optimization algorithms (EMOAs). MOBGO utilizes Gaussian Process models learned from previous objective function evaluations to decide the next evaluation site by maximizing or minimizing an infill criterion. A common criterion in MOBGO is the Expected Hypervolume Improvement (EHVI), which shows good performance on a wide range of problems with respect to exploration and exploitation. However, so far it has been a challenge to calculate exact EHVI values efficiently. In this paper, an efficient algorithm for the computation of the exact EHVI for a generic case is proposed. This efficient algorithm is based on partitioning the integration volume into a set of axis-parallel slices. Theoretically, the upper bound time complexities are improved from the previous $O(n^2)$ and $O(n^3)$, for two- and three-objective problems respectively, to $\Theta(n \log n)$, which is asymptotically optimal. This article generalizes the scheme to the higher-dimensional case by utilizing a new hyperbox decomposition technique, which was proposed by Dächert et al. (EJOR, 2017). It also utilizes a generalization of the multilayered integration scheme that scales linearly in the number of hyperboxes of the decomposition. The speed comparison shows that the proposed algorithm significantly reduces computation time. Finally, this decomposition technique is applied in the calculation of the Probability of Improvement (PoI).
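
    The idea of partitioning a dominated region into axis-parallel slices can be illustrated on the simpler, deterministic hypervolume indicator in two objectives. This sketch is not the paper's EHVI algorithm: it only computes the plain hypervolume of a point set, assuming minimization and a user-supplied reference point, and the function name is hypothetical. Sorting dominates the cost, so it runs in O(n log n) time.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_y = ref[1]  # lowest second objective seen so far
    for x, y in pts:  # ascending in the first objective
        if y < best_y:
            # Each non-dominated point contributes one horizontal slice
            # reaching from its x coordinate to the reference point.
            area += (ref[0] - x) * (best_y - y)
            best_y = y
        # Points with y >= best_y are dominated and add no area.
    return area


if __name__ == "__main__":
    front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    print(hypervolume_2d(front, (4.0, 4.0)))  # 6.0
```

    EHVI replaces the deterministic area of each slice with its expectation under the Gaussian Process prediction, which is where the integration scheme described in the abstract comes in.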