1,757 research outputs found

    Cycle time optimization by timing driven placement with simultaneous netlist transformations

    Get PDF
    We present new concepts to integrate logic synthesis and physical design. Our methodology uses general Boolean transformations as known from technology-independent synthesis, and a recursive bi-partitioning placement algorithm. In each partitioning step, the precision of the layout data increases. This allows effective guidance of the logic synthesis operations for cycle time optimization. An additional advantage of our approach is that no complicated layout corrections are needed when the netlist is changed

    Replication for Logic Bipartitioning

    Get PDF
    Logic replication, the duplication of logic in order to limit communication between partitions, is an effective part of a complete partitioning solution. In this paper we seek a better understanding of the important issues in logic replication. By developing new optimizations to existing algorithms we are able to significantly improve the quality of these techniques, achieving up to 12.5 % better results than the best existing replication techniques. When integrated into our already state-of-the-art partitioner, we improve overall cutsizes by 37.8%, while requiring the duplication of at most 7 % of the logic.

    SAT-based optimal hypergraph partitioning with replication

    Get PDF
    We propose a methodology for optimal k-way partitioning with replication of directed hypergraphs via Boolean satisfiability. We begin by leveraging the power of existing and emerging SAT solvers to attack traditional logic bipartitioning and show good scaling behavior. We continue to present the first optimal partitioning results that admit generation and assignment of replicated nodes concurrently. Our framework is general enough that we also give the first published optimal results for partitioning with respect to the maximum subdomain degree metric and the sum of external degrees metric. We show that for the bipartitioning case we can feasibly solve problems of up to 150 nodes with simultaneous replication in hundreds of seconds. For other partitioning metrics, we are able to solve problems up to 40 nodes in hundreds of seconds

    High-Quality Hypergraph Partitioning

    Get PDF
    This dissertation focuses on computing high-quality solutions for the NP-hard balanced hypergraph partitioning problem: Given a hypergraph and an integer kk, partition its vertex set into kk disjoint blocks of bounded size, while minimizing an objective function over the hyperedges. Here, we consider the two most commonly used objectives: the cut-net metric and the connectivity metric. Since the problem is computationally intractable, heuristics are used in practice - the most prominent being the three-phase multi-level paradigm: During coarsening, the hypergraph is successively contracted to obtain a hierarchy of smaller instances. After applying an initial partitioning algorithm to the smallest hypergraph, contraction is undone and, at each level, refinement algorithms try to improve the current solution. With this work, we give a brief overview of the field and present several algorithmic improvements to the multi-level paradigm. Instead of using a logarithmic number of levels like traditional algorithms, we present two coarsening algorithms that create a hierarchy of (nearly) nn levels, where nn is the number of vertices. This makes consecutive levels as similar as possible and provides many opportunities for refinement algorithms to improve the partition. This approach is made feasible in practice by tailoring all algorithms and data structures to the nn-level paradigm, and developing lazy-evaluation techniques, caching mechanisms and early stopping criteria to speed up the partitioning process. Furthermore, we propose a sparsification algorithm based on locality-sensitive hashing that improves the running time for hypergraphs with large hyperedges, and show that incorporating global information about the community structure into the coarsening process improves quality. Moreover, we present a portfolio-based initial partitioning approach, and propose three refinement algorithms. Two are based on the Fiduccia-Mattheyses (FM) heuristic, but perform a highly localized search at each level. While one is designed for two-way partitioning, the other is the first FM-style algorithm that can be efficiently employed in the multi-level setting to directly improve kk-way partitions. The third algorithm uses max-flow computations on pairs of blocks to refine kk-way partitions. Finally, we present the first memetic multi-level hypergraph partitioning algorithm for an extensive exploration of the global solution space. All contributions are made available through our open-source framework KaHyPar. In a comprehensive experimental study, we compare KaHyPar with hMETIS, PaToH, Mondriaan, Zoltan-AlgD, and HYPE on a wide range of hypergraphs from several application areas. Our results indicate that KaHyPar, already without the memetic component, computes better solutions than all competing algorithms for both the cut-net and the connectivity metric, while being faster than Zoltan-AlgD and equally fast as hMETIS. Moreover, KaHyPar compares favorably with the current best graph partitioning system KaFFPa - both in terms of solution quality and running time

    Minimizing communication through computational redundancy in parallel iterative solvers

    Get PDF
    Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2011.Thesis (Master's) -- Bilkent University, 2011.Includes bibliographical references leaves 55-60.Sparse matrix vector multiplication (SpMxV) of the form y = Ax is a kernel operation in iterative linear solvers used in scientific applications. In these solvers, the SpMxV operation is performed repeatedly with the same sparse matrix through iterations until convergence. Depending on the matrix and its decomposition, parallel SpMxV operation necessitates communication among processors in the parallel environment. The communication can be reduced by intelligent decomposition. However, we can further decrease the communication through data replication and redundant computation. The communication occurs due to the transfer of x-vector entries in row-parallel SpMxV computation. The input vector x of the next iteration is computed from the output vector of the current iteration through linear vector operations. Hence, a processor may compute a y-vector entry redundantly, which leads to a x-vector entry in the following iteration, instead of receiving that x-vector entry from another processor. Thus, redundant computation of that y-vector entry may lead to reduction in communication. In this thesis, we devise a directed-graph-based model that correctly captures the computation and communication pattern for above-mentioned iterative solvers. Moreover, we formulate the communication minimization by utilizing redundant computation of y-vector entries as a combinatorial problem on this directed graph model. We propose two heuristics to solve this combinatorial problem. Experimental results indicate that the communication reducing strategy by redundantly computing is promising.Torun, Fahreddin ŞükrüM.S

    Delay driven multi-way circuit partitioning.

    Get PDF
    Wong Sze Hon.Thesis (M.Phil.)--Chinese University of Hong Kong, 2003.Includes bibliographical references (leaves 88-91).Abstracts in English and Chinese.Chapter 1 --- Introduction --- p.1Chapter 1.1 --- Preliminaries --- p.1Chapter 1.2 --- Motivations --- p.1Chapter 1.3 --- Contributions --- p.3Chapter 1.4 --- Organization of the Thesis --- p.4Chapter 2 --- VLSI Physical Design Automation --- p.5Chapter 2.1 --- Preliminaries --- p.5Chapter 2.2 --- VLSI Design Cycle [1] --- p.6Chapter 2.2.1 --- System Specification --- p.6Chapter 2.2.2 --- Architectural Design --- p.6Chapter 2.2.3 --- Functional Design --- p.6Chapter 2.2.4 --- Logic Design --- p.8Chapter 2.2.5 --- Circuit Design --- p.8Chapter 2.2.6 --- Physical Design --- p.8Chapter 2.2.7 --- Fabrication --- p.8Chapter 2.2.8 --- Packaging and Testing --- p.9Chapter 2.3 --- Physical Design Cycle [1] --- p.9Chapter 2.3.1 --- Partitioning --- p.9Chapter 2.3.2 --- Floorplanning and Placement --- p.11Chapter 2.3.3 --- Routing --- p.11Chapter 2.3.4 --- Compaction --- p.12Chapter 2.3.5 --- Extraction and Verification --- p.12Chapter 2.4 --- Chapter Summary --- p.12Chapter 3 --- Recent Approaches on Circuit Partitioning --- p.14Chapter 3.1 --- Preliminaries --- p.14Chapter 3.2 --- Circuit Representation --- p.15Chapter 3.3 --- Delay Modelling --- p.16Chapter 3.4 --- Partitioning Objectives --- p.19Chapter 3.4.1 --- Interconnections between Partitions --- p.19Chapter 3.4.2 --- Delay Minimization --- p.19Chapter 3.4.3 --- Area and Number of Partitions --- p.20Chapter 3.5 --- Partitioning Algorithms --- p.20Chapter 3.5.1 --- Cut-size Driven Partitioning Algorithm --- p.21Chapter 3.5.2 --- Delay Driven Partitioning Algorithm --- p.32Chapter 3.5.3 --- Acyclic Circuit Partitioning Algorithm --- p.33Chapter 4 --- Clustering Based Acyclic Multi-way Partitioning --- p.38Chapter 4.1 --- Preliminaries --- p.38Chapter 4.2 --- Previous Works on Clustering Based Partitioning --- p.39Chapter 4.2.1 --- Multilevel Circuit Partitioning [2] --- p.40Chapter 4.2.2 --- Cluster-Oriented Iterative-Improvement Partitioner [3] --- p.42Chapter 4.2.3 --- Section Summary --- p.44Chapter 4.3 --- Problem Formulation --- p.45Chapter 4.4 --- Clustering Based Acyclic Multi-Way Partitioning --- p.46Chapter 4.5 --- Modified Fan-out Free Cone Decomposition --- p.47Chapter 4.6 --- Clustering Phase --- p.48Chapter 4.7 --- Partitioning Phase --- p.51Chapter 4.8 --- The Acyclic Constraint --- p.52Chapter 4.9 --- Experimental Results --- p.57Chapter 4.10 --- Chapter Summary --- p.58Chapter 5 --- Network Flow Based Multi-way Partitioning --- p.61Chapter 5.1 --- Preliminaries --- p.61Chapter 5.2 --- Notations and Definitions --- p.62Chapter 5.3 --- Net Modelling --- p.63Chapter 5.4 --- Previous Works on Network Flow Based Partitioning --- p.64Chapter 5.4.1 --- Network Flow Based Min-Cut Balanced Partitioning [4] --- p.65Chapter 5.4.2 --- Network Flow Based Circuit Partitioning for Time-multiplexed FPGAs [5] --- p.66Chapter 5.5 --- Proposed Net Modelling --- p.70Chapter 5.6 --- Partitioning Properties Based on the Proposed Net Modelling --- p.73Chapter 5.7 --- Partitioning Step --- p.75Chapter 5.8 --- Constrained FM Post Processing Step --- p.79Chapter 5.9 --- Experiment Results --- p.81Chapter 6 --- Conclusion --- p.86Bibliography --- p.8

    Replicated partitioning for undirected hypergraphs

    Get PDF
    Cataloged from PDF version of article.Hypergraph partitioning (HP) and replication are diverse but powerful tools that are traditionally applied separately to minimize the costs of parallel and sequential systems that access related data or process related tasks. When combined together, these two techniques have the potential of achieving significant improvements in performance of many applications. In this study, we provide an approach involving a tool that simultaneously performs replication and partitioning of the vertices of an undirected hypergraph whose vertices represent data and nets represent task dependencies among these data. In this approach, we propose an iterative-improvement-based replicated bipartitioning heuristic, which is capable of move, replication, and unreplication of vertices. In order to utilize our replicated bipartitioning heuristic in a recursive bipartitioning framework, we also propose appropriate cut-net removal, cut-net splitting, and pin selection algorithms to correctly encapsulate the two most commonly used cutsize metrics. We embed our replicated bipartitioning scheme into the state-of-the-art multilevel HP tool PaToH to provide an effective and efficient replicated HP tool, rpPaToH. The performance of the techniques proposed and the tools developed is tested over the undirected hypergraphs that model the communication costs of parallel query processing in information retrieval systems. Our experimental analysis indicates that the proposed technique provides significant improvements in the quality of the partitions, especially under low replication ratios. (C) 2012 Elsevier Inc. All rights reserved
    • …
    corecore