5,876 research outputs found

    A GPU-accelerated Branch-and-Bound Algorithm for the Flow-Shop Scheduling Problem

    Get PDF
    Branch-and-Bound (B&B) algorithms are time intensive tree-based exploration methods for solving to optimality combinatorial optimization problems. In this paper, we investigate the use of GPU computing as a major complementary way to speed up those methods. The focus is put on the bounding mechanism of B&B algorithms, which is the most time consuming part of their exploration process. We propose a parallel B&B algorithm based on a GPU-accelerated bounding model. The proposed approach concentrate on optimizing data access management to further improve the performance of the bounding mechanism which uses large and intermediate data sets that do not completely fit in GPU memory. Extensive experiments of the contribution have been carried out on well known FSP benchmarks using an Nvidia Tesla C2050 GPU card. We compared the obtained performances to a single and a multithreaded CPU-based execution. Accelerations up to x100 are achieved for large problem instances

    Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage

    Full text link
    We propose a fast, parallel maximum clique algorithm for large sparse graphs that is designed to exploit characteristics of social and information networks. The method exhibits a roughly linear runtime scaling over real-world networks ranging from 1000 to 100 million nodes. In a test on a social network with 1.8 billion edges, the algorithm finds the largest clique in about 20 minutes. Our method employs a branch and bound strategy with novel and aggressive pruning techniques. For instance, we use the core number of a vertex in combination with a good heuristic clique finder to efficiently remove the vast majority of the search space. In addition, we parallelize the exploration of the search tree. During the search, processes immediately communicate changes to upper and lower bounds on the size of maximum clique, which occasionally results in a super-linear speedup because vertices with large search spaces can be pruned by other processes. We apply the algorithm to two problems: to compute temporal strong components and to compress graphs.Comment: 11 page

    Multi-threading a state-of-the-art maximum clique algorithm

    Get PDF
    We present a threaded parallel adaptation of a state-of-the-art maximum clique algorithm for dense, computationally challenging graphs. We show that near-linear speedups are achievable in practice and that superlinear speedups are common. We include results for several previously unsolved benchmark problems

    Anytime coalition structure generation on synergy graphs

    No full text
    We consider the coalition structure generation (CSG) problem on synergy graphs, which arises in many practical applications where communication constraints, social or trust relationships must be taken into account when forming coalitions. We propose a novel representation of this problem based on the concept of edge contraction, and an innovative branch and bound approach (CFSS), which is particularly efficient when applied to a general class of characteristic functions. This new model provides a non-redundant partition of the search space, hence allowing an effective parallelisation. We evaluate CFSS on two benchmark functions, the edge sum with coordination cost and the collective energy purchasing functions, comparing its performance with the best algorithm for CSG on synergy graphs: DyCE. The latter approach is centralised and cannot be efficiently parallelised due to the exponential memory requirements in the number of agents, which limits its scalability (while CFSS memory requirements are only polynomial). Our results show that, when the graphs are very sparse, CFSS is 4 orders of magnitude faster than DyCE. Moreover, CFSS is the first approach to provide anytime approximate solutions with quality guarantees for very large systems (i.e., with more than 2700 agents

    Towards Work-Efficient Parallel Parameterized Algorithms

    Full text link
    Parallel parameterized complexity theory studies how fixed-parameter tractable (fpt) problems can be solved in parallel. Previous theoretical work focused on parallel algorithms that are very fast in principle, but did not take into account that when we only have a small number of processors (between 2 and, say, 1024), it is more important that the parallel algorithms are work-efficient. In the present paper we investigate how work-efficient fpt algorithms can be designed. We review standard methods from fpt theory, like kernelization, search trees, and interleaving, and prove trade-offs for them between work efficiency and runtime improvements. This results in a toolbox for developing work-efficient parallel fpt algorithms.Comment: Prior full version of the paper that will appear in Proceedings of the 13th International Conference and Workshops on Algorithms and Computation (WALCOM 2019), February 27 - March 02, 2019, Guwahati, India. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-10564-8_2
    corecore