52 research outputs found

    Formula partitioning revisited

    Get PDF
    Dividing a Boolean formula into smaller independent sub-formulae can be a useful technique for accelerating the solution of Boolean problems, including SAT and #SAT. Nevertheless, and despite promising early results, formula partitioning is hardly used in state-of-the-art solvers. In this paper, we show that this is rooted in a lack of consistency of the usefulness of formula partitioning techniques. In particular, we evaluate two existing and a novel partitioning model, coupled with two existing and two novel partitioning algorithms, on a wide range of benchmark instances. Our results show that there is no one-size-fits-all solution: for different formula types, different partitioning models and algorithms are the most suitable. While these results might seem negative, they help to improve our understanding about formula partitioning; moreover, the findings also give guidance as to which method to use for what kinds of formulae

    High-Quality Hypergraph Partitioning

    Get PDF
    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer kk, partition its vertex set into kk disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) nn levels, where nn is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the nn-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve kk-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine kk-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and equally fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa - both in terms of solution quality and running time

    Efficient successor retrieval operations for aggregate query processing on clustered road networks

    Get PDF
    Cataloged from PDF version of article.Get-Successors (GS) which retrieves all successors of a junction is a kernel operation used to facilitate aggregate computations in road network queries. Efficient implementation of the GS operation is crucial since the disk access cost of this operation constitutes a considerable portion of the total query processing cost. Firstly, we propose a new successor retrieval operation Get-Unevaluated-Successors (GUS), which retrieves only the unevaluated successors of a given junction. The GUS operation is an efficient implementation of the GS operation, where the candidate successors to be retrieved are pruned according to the properties and state of the algorithm. Secondly, we propose a hypergraph-based model for clustering successively retrieved junctions by the GUS operations to the same pages. The proposed model utilizes query logs to correctly capture the disk access cost of GUS operations. The proposed GUS operation and associated clustering model are evaluated for two different instances of GUS operations which typically arise in Dijkstra's single source shortest path algorithm and incremental network expansion framework. Our simulation results show that the proposed successor retrieval operation together with the proposed clustering hypergraph model is quite effective in reducing the number of disk accesses in query processing. (C) 2010 Published by Elsevier Inc

    Replicated hypergraph partitioning

    Get PDF
    Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2010.Thesis (Master's) -- Bilkent University, 2010.Includes bibliographical references leaves 62-68.Hypergraph partitioning is recently used in distributed information retrieval (IR) and spatial databases to correctly capture the communication and disk access costs. In the hypergraph models for these areas, the quality of the partitions obtained using hypergraph partitioning can be crucial for the objective of the targeted problem. Replication is a widely used terminology to address different performance issues in distributed IR and database systems. The main motivation behind replication is to improve the performance of the targeted issue at the cost of using more space. In this work, we focus on replicated hypergraph partitioning schemes that improve the quality of hypergraph partitioning by vertex replication. To this end, we propose a replicated partitioning scheme where replication and partitioning are performed in conjunction. Our approach utilizes successful multilevel and recursive bipartitioning methodologies for hypergraph partitioning. The replication is achieved in the uncoarsening phase of the multilevel methodology by extending the efficient Fiduccia-Mattheyses (FM) iterative improvement heuristic. We call this extended heuristic replicated FM (rFM). The proposed rFM heuristic supports move, replication and unreplication operations on the vertices by introducing new algorithms and vertex states. We show rFM has the same complexity as FM and integrate the proposed replication scheme into the multilevel hypergraph partitioning tool PaToH. We test the proposed replication scheme on realistic datasets and obtain promising results.Selvitopi, Reha OğuzM.S

    Multi-level direct K-way hypergraph partitioning with multiple constraints and fixed vertices

    Get PDF
    K-way hypergraph partitioning has an ever-growing use in parallelization of scientific computing applications. We claim that hypergraph partitioning with multiple constraints and fixed vertices should be implemented using direct K-way refinement, instead of the widely adopted recursive bisection paradigm. Our arguments are based on the fact that recursive-bisection-based partitioning algorithms perform considerably worse when used in the multiple constraint and fixed vertex formulations. We discuss possible reasons for this performance degradation. We describe a careful implementation of a multi-level direct K-way hypergraph partitioning algorithm, which performs better than a well-known recursive-bisection-based partitioning algorithm in hypergraph partitioning with multiple constraints and fixed vertices. We also experimentally show that the proposed algorithm is effective in standard hypergraph partitioning. © 2007 Elsevier Inc. All rights reserved

    Hypergraph-partitioning-based remapping models for image-space-parallel direct volume rendering of unstructured grids

    Get PDF
    In this work, image-space-parallel direct volume rendering (DVR) of unstructured grids is investigated for distributed-memory architectures. A hypergraph-partitioning-based model is proposed for the adaptive screen partitioning problem in this context. The proposed model aims to balance the rendering loads of processors while trying to minimize the amount of data replication. In the parallel DVR framework we adopted, each data primitive is statically owned by its home processor, which is responsible from replicating its primitives on other processors. Two appropriate remapping models are proposed by enhancing the above model for use within this framework. These two remapping models aim to minimize the total volume of communication in data replication while balancing the rendering loads of processors. Based on the proposed models, a parallel DVR algorithm is developed. The experiments conducted on a PC cluster show that the proposed remapping models achieve better speedup values compared to the remapping models previously suggested for image-space-parallel DVR. © 2007 IEEE

    Power models, energy models and libraries for energy-efficient concurrent data structures and algorithms

    Get PDF
    EXCESS deliverable D2.3. More information at http://www.excess-project.eu/This deliverable reports the results of the power models, energy models and librariesfor energy-efficient concurrent data structures and algorithms as available by projectmonth 30 of Work Package 2 (WP2). It reports i) the latest results of Task 2.2-2.4 onproviding programming abstractions and libraries for developing energy-efficient datastructures and algorithms and ii) the improved results of Task 2.1 on investigating andmodeling the trade-off between energy and performance of concurrent data structuresand algorithms. The work has been conducted on two main EXCESS platforms: Intelplatforms with recent Intel multicore CPUs and Movidius Myriad platforms

    A link-based storage scheme for efficient aggregate query processing on clustered road networks

    Get PDF
    Cataloged from PDF version of article.The need to have efficient storage schemes for spatial networks is apparent when the volume of query processing in some road networks (e.g., the navigation systems) is considered. Specifically, under the assumption that the road network is stored in a central server, the adjacent data elements in the network must be clustered on the disk in such a way that the number of disk page accesses is kept minimal during the processing of network queries. In this work, we introduce the link-based storage scheme for clustered road networks and compare it with the previously proposed junction-based storage scheme. in order to investigate the performance of aggregate network queries in clustered road networks, we extend our recently proposed clustering hypergraph model from junction-based storage to link-based storage. We propose techniques for additional storage savings in bidirectional networks that make the link-based storage scheme even more preferable in terms of the storage efficiency. We evaluate the performance of our link-based storage scheme against the junction-based storage scheme both theoretically and empirically. The results of the experiments conducted on a wide range of road network datasets show that the link-based storage scheme is preferable in terms of both storage and query processing efficiency. (C) 2009 Elsevier B.V. All rights reserved

    Doctor of Philosophy

    Get PDF
    dissertationWith the explosion of chip transistor counts, the semiconductor industry has struggled with ways to continue scaling computing performance in line with historical trends. In recent years, the de facto solution to utilize excess transistors has been to increase the size of the on-chip data cache, allowing fast access to an increased portion of main memory. These large caches allowed the continued scaling of single thread performance, which had not yet reached the limit of instruction level parallelism (ILP). As we approach the potential limits of parallelism within a single threaded application, new approaches such as chip multiprocessors (CMP) have become popular for scaling performance utilizing thread level parallelism (TLP). This dissertation identifies the operating system as a ubiquitous area where single threaded performance and multithreaded performance have often been ignored by computer architects. We propose that novel hardware and OS co-design has the potential to significantly improve current chip multiprocessor designs, enabling increased performance and improved power efficiency. We show that the operating system contributes a nontrivial overhead to even the most computationally intense workloads and that this OS contribution grows to a significant fraction of total instructions when executing several common applications found in the datacenter. We demonstrate that architectural improvements have had little to no effect on the performance of the OS over the last 15 years, leaving ample room for improvements. We specifically consider three potential solutions to improve OS execution on modern processors. First, we consider the potential of a separate operating system processor (OSP) operating concurrently with general purpose processors (GPP) in a chip multiprocessor organization, with several specialized structures acting as efficient conduits between these processors. Second, we consider the potential of segregating existing caching structures to decrease cache interference between the OS and application. Third, we propose that there are components within the OS itself that should be refactored to be both multithreaded and cache topology aware, which in turn, improves the performance and scalability of many-threaded applications
    corecore