On Timing Model Extraction and Hierarchical Statistical Timing Analysis
In this paper, we investigate the challenges of applying Statistical Static
Timing Analysis (SSTA) in a hierarchical design flow, where modules supplied by
IP vendors are used to hide design details for IP protection and to reduce the
complexity of design and verification. For the three basic circuit types,
combinational, flip-flop-based and latch-controlled, we propose methods to
extract timing models which contain interfacing as well as compressed internal
constraints. Using these compact timing models, the runtime of full-chip timing
analysis can be reduced, while circuit details from IP vendors are not exposed.
We also propose a method to reconstruct the correlation between modules during
full-chip timing analysis. This correlation cannot be incorporated into timing
models because it depends on the layout of the corresponding modules in the
chip. In addition, we investigate how to apply the extracted timing models with
the reconstructed correlation to evaluate the performance of the complete
design. Experiments demonstrate that, using the extracted timing models and
reconstructed correlation, full-chip timing analysis can be several times faster
than analyzing the flattened circuit directly, while the accuracy of statistical
timing analysis is still well maintained.
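To illustrate why the correlation between modules matters at the full-chip level, the sketch below computes the mean of the maximum of two jointly Gaussian arrival times. The abstract does not describe the authors' reconstruction method; this example uses Clark's classical moment formula instead, and the function name and the numeric values are assumptions chosen for illustration only.

```python
import math

def mean_of_max(mu1, sigma1, mu2, sigma2, rho):
    """Mean of max(X, Y) for jointly Gaussian X, Y with correlation rho (Clark's formula)."""
    theta_sq = sigma1 ** 2 + sigma2 ** 2 - 2.0 * rho * sigma1 * sigma2
    if theta_sq <= 0.0:
        # X - Y has no spread, so the max is simply the larger mean.
        return max(mu1, mu2)
    theta = math.sqrt(theta_sq)
    alpha = (mu1 - mu2) / theta
    cdf = 0.5 * (1.0 + math.erf(alpha / math.sqrt(2.0)))            # standard normal CDF at alpha
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)  # standard normal PDF at alpha
    return mu1 * cdf + mu2 * (1.0 - cdf) + theta * pdf

# Two hypothetical module output arrival times with identical distributions.
independent = mean_of_max(0.0, 1.0, 0.0, 1.0, rho=0.0)
correlated = mean_of_max(0.0, 1.0, 0.0, 1.0, rho=0.9)
```

With these inputs the independent case gives a larger expected maximum than the strongly correlated case, so treating correlated modules as independent would bias the full-chip estimate.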
Latch-based RISC-V core with popcount instruction for CNN acceleration
Energy efficiency is essential for the vast majority of mobile and embedded battery-powered systems. The Internet-of-Things paradigm combines requirements for high computational capability, extreme energy efficiency and low cost. Increasing manufacturing process variations pose formidable challenges for deep-submicron integrated circuit designs. The effects of variation are further exacerbated by lowered voltages in energy-efficient designs. Compared to traditional flip-flop-based design, latch-based design offers area, energy-efficiency and variation-tolerance benefits at the cost of more complex timing behavior. A method for converting a flip-flop-based processor core to a latch-based core at register-transfer level is presented in this work.
Convolutional neural networks have enabled image recognition in the field of computer vision with unprecedented accuracy. The performance and memory requirements of canonical convolutional neural networks have been out of reach for low-cost IoT devices. In collaboration with Tampere University, a custom popcount instruction was added to the cores to accelerate an IoT-optimized vehicle classification convolutional neural network.
This work compares simulation results from synthesized flip-flop-based and latch-based versions of an SCR1 RISC-V processor core and the effects of the custom instruction for CNN acceleration. The latch core achieved roughly 50% lower energy per operation than the flip-flop core, and a 2.1x speedup was observed in the execution of the CNN when using the custom instruction.
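The abstract does not say how the popcount instruction is used inside the network. One common use, assumed here, is the XNOR-popcount dot product of binarized layers, where weights and activations in {-1, +1} are packed into machine words. The sketch below is a software model of that idea; `popcount`, `pack` and `binary_dot` are hypothetical names and do not come from the work.

```python
def popcount(x: int) -> int:
    """Count the set bits of a non-negative integer (what a hardware popcount does in one step)."""
    return bin(x).count("1")

def pack(values) -> int:
    """Pack a sequence of -1/+1 values into an integer: +1 becomes bit 1, -1 becomes bit 0."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n -1/+1 vectors given in packed form."""
    mask = (1 << n) - 1
    agree = popcount(~(a_bits ^ w_bits) & mask)  # XNOR marks positions where the signs match
    return 2 * agree - n                         # matches contribute +1, mismatches -1

activations = [1, -1, 1, 1]
weights = [1, 1, -1, 1]
result = binary_dot(pack(activations), pack(weights), len(weights))
```

Here `result` equals the ordinary dot product of the two sign vectors, obtained with one XNOR and one popcount instead of per-element multiplications.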
Elastic circuits
Elasticity in circuits and systems provides tolerance to variations in computation and communication delays. This paper presents a comprehensive overview of elastic circuits for those designers who are mainly familiar with synchronous design. Elasticity can be implemented both synchronously and asynchronously, although it was traditionally more often associated with asynchronous circuits. This paper shows that synchronous and asynchronous elastic circuits can be designed, analyzed, and optimized using similar techniques. Thus, choices between synchronous and asynchronous implementations are localized and deferred until late in the design process. Peer Reviewed. Postprint (published version)
Redundant Skewed Clocking of Pulse-Clocked Latches for Low Power Soft-Error Mitigation
An integrated methodology combining redundant clock tree synthesis and pulse-clocked latches mitigates both single event upsets (SEU) and single event transients (SET) with reduced power consumption. This methodology allows the hardness of the design to be changed on the fly. With minimal additional overhead circuitry, the approach can work in three different modes of operation depending on the speed, hardness and power consumption required by the design. The design was implemented in a 90nm low-standby power (LSP) process and tested using commercial CAD tools. Spatial separation of critical nodes in the physical design mitigates multi-node charge collection (MNCC) upsets. An advanced encryption system is implemented with the proposed design and compared to a previous design with non-redundant clock trees and local delay generation. The proposed approach reduces energy per operation by up to 18% over an improved version of the prior approach, with negligible area impact. When used in the non-redundant mode of operation, it can save up to two-thirds of the power consumption and reach the maximum possible frequency. Dissertation/Thesis. Masters Thesis, Electrical Engineering 201
Mix & Latch: An Optimization Flow for High-Performance Designs with Single-Clock Mixed-Polarity Latches and Flip-Flops
Flip-flops are the most used sequential elements in synchronous circuits, but designs based on latches can operate at higher frequencies and occupy less area. Techniques to increase the maximum operating frequency of flip-flop-based designs, such as time-borrowing, rely on tight hold constraints that are difficult to satisfy using traditional back-end optimization techniques. We propose Mix & Latch, a methodology to increase the operating frequency of synchronous digital circuits using a single clock tree and a mixed distribution of positive- and negative-edge-triggered flops, and positive- and negative-level-sensitive latches. An efficient mathematical model is proposed to optimize the type and location of the sequential elements of the circuit. We ensure that the initial registers are not moved from their initial location, although they may change type, thus allowing the use of equivalence checking and static timing analysis to formally verify the correctness of the transformation. The technique is validated using a 28nm CMOS FDSOI technology, obtaining a 1.33X post-layout average operating frequency improvement on a broad set of benchmarks over a standard commercial design flow. Additionally, the circuit area was reduced by more than 1.19X on average for the same benchmarks, although overall area reduction is not a goal of the optimization algorithm. To the best of our knowledge, this is the first work that proposes combining mixed-polarity flip-flops and latches to improve circuit performance.
A walk in the statistical mechanical formulation of neural networks
Neural networks are nowadays both powerful operational tools (e.g., for
pattern recognition, data mining, error correction codes) and complex
theoretical models at the focus of scientific investigation. As for the
research branch, neural networks are handled and studied by psychologists,
neurobiologists, engineers, mathematicians and theoretical physicists. In
particular, in theoretical physics, the key instrument for the quantitative
analysis of neural networks is statistical mechanics. From this perspective,
here, we first review attractor networks: starting from ferromagnets and
spin-glass models, we discuss the underlying philosophy and we recover the
path paved by Hopfield and Amit-Gutfreund-Sompolinsky. One step forward, we
highlight the structural equivalence between Hopfield networks (modeling
retrieval) and Boltzmann machines (modeling learning), hence realizing a deep
bridge linking two inseparable aspects of biological and robotic spontaneous
cognition. As a sideline, in this walk we derive two alternative (with respect
to the original Hebb proposal) ways to recover the Hebbian paradigm, stemming
from ferromagnets and from spin-glasses, respectively. Further, as these notes
are intended for an engineering audience, we also highlight the mappings
between ferromagnets and operational amplifiers and between antiferromagnets
and flip-flops (as neural networks, built from op-amps and flip-flops, are
particular spin-glasses and the latter are indeed combinations of ferromagnets
and antiferromagnets), hoping that such a bridge serves as a concrete
prescription to capture the beauty of robotics from the statistical mechanical
perspective. Comment: Contribution to the proceedings of the conference NCTA 2014. Contains
12 pages, 7 figures.
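As a concrete companion to the attractor-network and Hebbian-learning discussion, here is a minimal sketch of a Hopfield network that stores patterns with the standard Hebbian rule and retrieves them by asynchronous sign updates. It is a textbook construction rather than code from the paper; the function names, the example pattern and the number of sweeps are assumptions.

```python
def hebbian_weights(patterns):
    """Hebbian couplings J_ij = (1/N) * sum over patterns of xi_i * xi_j, with zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, sweeps=5):
    """Asynchronous dynamics: each neuron in turn aligns with the sign of its local field."""
    s = list(state)
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
    return s

stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = hebbian_weights([stored])

noisy = list(stored)
noisy[2] = -noisy[2]  # corrupt one neuron
retrieved = recall(weights, noisy)
```

Starting from the corrupted state, the dynamics relax back to the stored pattern, which is the retrieval behavior the abstract attributes to Hopfield networks.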
CAD Tools for Synthesis of Sleep Convention Logic
This dissertation proposes an automated flow for the Sleep Convention Logic (SCL) asynchronous design style. The proposed flow synthesizes synchronous RTL into an SCL netlist. The flow utilizes commercial design tools, supplementing missing functionality with custom tools. A method for determining the performance bottleneck in an SCL design is proposed. A constraint-driven method to increase the performance of linear SCL pipelines is proposed. Several enhancements to SCL are proposed, including techniques to reduce the number of registers and total sleep capacitance in an SCL design.