12,736 research outputs found
Cell replication and redundancy elimination during placement for cycle time optimization
This paper presents a new timing driven approach for cell replication tailored to the practical needs of standard cell layout design. Cell replication methods have been studied extensively in the context of generic partitioning problems. However, until now it has remained unclear what practical benefit can be obtained from this concept in a realistic environment for timing driven layout synthesis. Therefore, this paper presents a timing driven cell replication procedure, demonstrates its incorporation into a standard cell placement and routing tool and examines its benefit on the final circuit performance in comparison with conventional gate or transistor sizing techniques. Furthermore, we demonstrate that cell replication can deteriorate the stuck-at fault testability of circuits and show that stuck-at redundancy elimination must be integrated into the placement procedure. Experimental results demonstrate the usefulness of the proposed methodology and suggest that cell replication should be an integral part of the physical design flow complementing traditional gate sizing techniques
Recommended from our members
Microarchitecture optimization for timing and layout
In recent years the drive to produce more complex integrated circuits while spending less design time has driven the demand for design automation tools. The search for design automation methods has resulted in the design of numerous behavioral synthesis and logic synthesis tools. This report describes a system that fills the gap between traditional behavioral synthesis and logic synthesis tools. Techniques are introduced for improving the microarchitecture structure and using feedback from lower-level optimization tools to guide design optimizations while attempting to meet user specified area and time constraints. These techniques include the capability for mixing layout styles such as custom layout for random-logic components and bit-slicing for regularly structured components. In this manner the entire design, control logic and datapath, can be optimized at the same time. Further, this paper presents a new methodology for microarchitecture-level optimization that greatly reduces the amount of technology-specific knowledge necessary to perform the optimizations
Magnetic skyrmion logic gates: conversion, duplication and merging of skyrmions
Magnetic skyrmions, which are topological particle-like excitations in
ferromagnets, have attracted a lot of attention recently. Skyrmionics is an
attempt to use magnetic skyrmions as information carriers in next generation
spintronic devices. Proposals of manipulations and operations of skyrmions are
highly desired. Here, we show that the conversion, duplication and merging of
isolated skyrmions with different chirality and topology are possible all in
one system. We also demonstrate the conversion of a skyrmion into another form
of a skyrmion, i.e., a bimeron. We design spin logic gates such as the AND and
OR gates based on manipulations of skyrmions. These results provide important
guidelines for utilizing the topology of nanoscale spin textures as information
carriers in novel magnetic sensors and spin logic devices.Comment: 17 pages, 6 figure
Comparison of Various Pipelined and Non-Pipelined SCl 8051 ALUs
This paper describes the development of an 8-bit SCL 8051 ALU with two versions: SCL 8051 ALU with nsleep and sleep signals and SCL 8051 ALU without nsleep. Both versions have combinational logic (C/L), registers, and completion components, which all utilize slept gates. Both three-stage pipelined and non-pipelined designs were examined for both versions. The four designs were compared in terms of area, speed, leakage power, average power and energy per operation. The SCL 8051 ALU without nsleep is smaller and faster, but it has greater leakage power. It also has lower average power, and less energy consumption than the SCL 8051 ALU with both nsleep and sleep signals. The pipelined SCL 8051 ALU is bigger, slower, and has larger leakage power, average power and energy consumption than the non-pipelined SCL 8051 ALU
An FPGA Architecture and CAD Flow Supporting Dynamically Controlled Power Gating
© 2015 IEEE.Leakage power is an important component of the total power consumption in field-programmable gate arrays (FPGAs) built using 90-nm and smaller technology nodes. Power gating was shown to be effective at reducing the leakage power. Previous techniques focus on turning OFF unused FPGA resources at configuration time; the benefit of this approach depends on resource utilization. In this paper, we present an FPGA architecture that enables dynamically controlled power gating, in which FPGA resources can be selectively powered down at run-time. This could lead to significant overall energy savings for applications having modules with long idle times. We also present a CAD flow that can be used to map applications to the proposed architecture. We study the area and power tradeoffs by varying the different FPGA architecture parameters and power gating granularity. The proposed CAD flow is used to map a set of benchmark circuits that have multiple power-gated modules to the proposed architecture. Power savings of up to 83% are achievable for these circuits. Finally, we study a control system of a robot that is used in endoscopy. Using the proposed architecture combined with clock gating results in up to 19% energy savings in this application
Interprocedural Type Specialization of JavaScript Programs Without Type Analysis
Dynamically typed programming languages such as Python and JavaScript defer
type checking to run time. VM implementations can improve performance by
eliminating redundant dynamic type checks. However, type inference analyses are
often costly and involve tradeoffs between compilation time and resulting
precision. This has lead to the creation of increasingly complex multi-tiered
VM architectures.
Lazy basic block versioning is a simple JIT compilation technique which
effectively removes redundant type checks from critical code paths. This novel
approach lazily generates type-specialized versions of basic blocks on-the-fly
while propagating context-dependent type information. This approach does not
require the use of costly program analyses, is not restricted by the precision
limitations of traditional type analyses.
This paper extends lazy basic block versioning to propagate type information
interprocedurally, across function call boundaries. Our implementation in a
JavaScript JIT compiler shows that across 26 benchmarks, interprocedural basic
block versioning eliminates more type tag tests on average than what is
achievable with static type analysis without resorting to code transformations.
On average, 94.3% of type tag tests are eliminated, yielding speedups of up to
56%. We also show that our implementation is able to outperform Truffle/JS on
several benchmarks, both in terms of execution time and compilation time.Comment: 10 pages, 10 figures, submitted to CGO 201
- …