1,650 research outputs found

    OPTIMIZING LARGE COMBINATIONAL NETWORKS FOR K-LUT BASED FPGA MAPPING

    Get PDF
    Optimizing by partitioning is a central problem in VLSI design automation, addressing circuit’s manufacturability. Circuit partitioning has multiple applications in VLSI design. One of the most common is that of dividing combinational circuits (usually large ones) that will not fit on a single package among a number of packages. Partitioning is of practical importance for k-LUT based FPGA circuit implementation. In this work is presented multilevel a multi-resource partitioning algorithm for partitioning large combinational circuits in order to efficiently use existing and commercially available FPGAs packagestwo-way partitioning, multi-way partitioning, recursive partitioning, flat partitioning, critical path, cutting cones, bottom-up clusters, top-down min-cut

    Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

    Get PDF
    The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

    Cycle time optimization by timing driven placement with simultaneous netlist transformations

    Get PDF
    We present new concepts to integrate logic synthesis and physical design. Our methodology uses general Boolean transformations as known from technology-independent synthesis, and a recursive bi-partitioning placement algorithm. In each partitioning step, the precision of the layout data increases. This allows effective guidance of the logic synthesis operations for cycle time optimization. An additional advantage of our approach is that no complicated layout corrections are needed when the netlist is changed

    Exploiting Local Logic Structures to Optimize Multi-Core SoC Floorplanning

    Get PDF
    We present a throughput-driven partitioning and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) systems. These two algorithms are integrated along with a published floorplanner in a new iterative physical synthesis flow to optimize system throughput and reduce area occupation. The synthesis flow iterates a floorplanning-partitioning-floorplanning-merging sequence of operations to improve the system topology and the physical locations of cores. The partitioning algorithm performs bottom-up clustering of the internal logic of a given IP core to divide it into smaller ones, each of which has no combinational path from input to output and thus is legal for LI-interface encapsulation. Applying this algorithm to cores on critical feedback loops optimizes their length and in turn enables throughput optimization via the subsequent floorplanning. The merging algorithm reduces the number of cores on non-critical loops, lowering the overall area taken by LI interfaces without hurting the system throughput. Experimental results on a large system-on-chip design show a 16.7% speedup in system throughput and a 2.1% reduction in area occupation

    CAD Tools for Synthesis of Sleep Convention Logic

    Get PDF
    This dissertation proposes an automated flow for the Sleep Convention Logic (SCL) asynchronous design style. The proposed flow synthesizes synchronous RTL into an SCL netlist. The flow utilizes commercial design tools, while supplementing missing functionality using custom tools. A method for determining the performance bottleneck in an SCL design is proposed. A constraint-driven method to increase the performance of linear SCL pipelines is proposed. Several enhancements to SCL are proposed, including techniques to reduce the number of registers and total sleep capacitance in an SCL design

    A New Paradigm in Split Manufacturing: Lock the FEOL, Unlock at the BEOL

    Full text link
    Split manufacturing was introduced as an effective countermeasure against hardware-level threats such as IP piracy, overbuilding, and insertion of hardware Trojans. Nevertheless, the security promise of split manufacturing has been challenged by various attacks, which exploit the well-known working principles of physical design tools to infer the missing BEOL interconnects. In this work, we advocate a new paradigm to enhance the security for split manufacturing. Based on Kerckhoff's principle, we protect the FEOL layout in a formal and secure manner, by embedding keys. These keys are purposefully implemented and routed through the BEOL in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks. We provide our secure physical design flow to the community. We also define the security of split manufacturing formally and provide the associated proofs. At the same time, our technique is competitive with current schemes in terms of layout overhead, especially for practical, large-scale designs (ITC'99 benchmarks).Comment: DATE 2019 (https://www.date-conference.com/conference/session/4.5

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Get PDF
    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur

    Speeding Up Technology-Independent Timing Optimization by Network Partitioning

    Get PDF
    Abstract Technology-independenttiming optimizationis an importantproblem in logic synthesis. Although many promising techniques have been proposed in the past, unfortunately they are quite slow and thus impractical for large networks. In this paper, we propose DE-PART, a delay-basedpartitioner-cum-optimizer, which purports to solve this problem. Given a combinational logic network that is to be optimized for timing, DEPART divides it into sub-networks using timing information and a constraint on the maximum number of gates allowed in a single sub-network. These sub-networks are then dispatched, one by one, to a standard timing optimizer. The optimized sub-networks are re-glued, generating an optimized network. The challenge is how to partition the original network into sub-networks so that the nal solution quality after partitioning and optimization is comparable to that from the timing optimizer. We propose a partitioning technique that is timing-driven and is simple yet e ective. We compare DEPART with speed up 21 , a state-of-the-art timing optimization tool, and with various partitioning techniques such as min-cut based and region growing, on a suite of large industrial and ISCAS circuits. On more than half of the benchmarks, DEPART yields run-time improvements of 20 to 450 times over a normal invocation of speed up the overall average improvement being 8 times, without compromising the solution quality m uch. Min-cut and region growing partitioning schemes, not being timing-driven, perform poorly in terms of the nal circuit delay
    • …
    corecore