AI/ML Algorithms and Applications in VLSI Design and Technology
An evident challenge ahead for the integrated circuit (IC) industry in the
nanometer regime is the investigation and development of methods that can
reduce the design complexity ensuing from growing process variations and
curtail the turnaround time of chip manufacturing. Conventional methodologies
employed for such tasks are largely manual and thus time-consuming and
resource-intensive. In contrast, the unique learning strategies of artificial
intelligence (AI) provide numerous exciting automated approaches for handling
complex and data-intensive tasks in very-large-scale integration (VLSI) design
and testing. Employing AI and machine learning (ML) algorithms in VLSI design
and manufacturing reduces the time and effort for understanding and processing
the data within and across different abstraction levels via automated learning
algorithms. This, in turn, improves the IC yield and reduces the manufacturing
turnaround time. This paper thoroughly reviews the AI/ML automated approaches
introduced in the past towards VLSI design and manufacturing. Moreover, we
discuss the scope of AI/ML applications in the future at various abstraction
levels to revolutionize the field of VLSI design, aiming for high-speed, highly
intelligent, and efficient implementations.
FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture
Neural Network (NN) accelerators with emerging ReRAM (resistive random access
memory) technologies have been investigated as one of the promising solutions
to address the memory wall challenge, due to the unique capability of
processing-in-memory within ReRAM-crossbar-based processing elements
(PEs). However, the high efficiency and high density advantages of ReRAM have
not been fully utilized due to the huge communication demands among PEs and the
overhead of peripheral circuits.
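The processing-in-memory capability mentioned above can be illustrated with an idealized crossbar model. This is a minimal sketch, not the FPSA or PRIME design: the function name `crossbar_mvm` and the example values are hypothetical, and wire resistance, device nonlinearity, and peripheral circuits are ignored.

```python
# Idealized ReRAM crossbar (assumption: ideal wires and linear cells).
# Each cell stores a conductance G[i][j]. Driving row i with voltage V[i]
# makes column j collect the current I[j] = sum_i V[i] * G[i][j]
# (Ohm's law per cell, Kirchhoff's current law per column), so one read
# of the array performs a matrix-vector multiplication in place.

def crossbar_mvm(voltages, conductances):
    """Return the column currents of an idealized crossbar."""
    rows = len(conductances)
    cols = len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("one input voltage is required per crossbar row")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(rows))
        for j in range(cols)
    ]


# Hypothetical 2x2 array: row voltages in volts, conductances in siemens.
V = [1.0, 0.5]
G = [[2.0, 0.0],
     [4.0, 2.0]]
currents = crossbar_mvm(V, G)
print(currents)
```

In this picture the weights never leave the array, which is why communication among PEs and the peripheral circuits, rather than the multiply-accumulate itself, become the limiting costs.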
In this paper, we propose a full system stack solution, composed of a
reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and
its software system including neural synthesizer, temporal-to-spatial mapper,
and placement & routing. We highly leverage the software system to make the
hardware design compact and efficient. To satisfy the high-performance
communication demand, we optimize it with a reconfigurable routing architecture
and the placement & routing tool. To improve the computational density, we
greatly simplify the PE circuit with the spiking schema and then adopt neural
synthesizer to enable the high density computation-resources to support
different kinds of NN operations. In addition, we provide spiking memory blocks
(SMBs) and configurable logic blocks (CLBs) in hardware and leverage the
temporal-to-spatial mapper to utilize them to balance the storage and
computation requirements of NN. Owing to the end-to-end software system, we can
efficiently deploy existing deep neural networks to FPSA. Evaluations show
that, compared to PRIME, one of the state-of-the-art ReRAM-based NN accelerators,
the computational density of FPSA improves by 31x; for representative NNs, its
inference performance can achieve up to 1000x speedup.
Comment: Accepted by ASPLOS 201
A 64mW DNN-based Visual Navigation Engine for Autonomous Nano-Drones
Fully autonomous miniaturized robots (e.g., drones) with artificial
intelligence (AI) based visual navigation capabilities are extremely
challenging drivers of Internet-of-Things edge intelligence capabilities.
Visual navigation based on AI approaches, such as deep neural networks (DNNs),
is becoming pervasive for standard-size drones, but is considered out of
reach for nano-drones with a size of a few cm. In this work, we
present the first (to the best of our knowledge) demonstration of a navigation
engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based
visual navigation. To achieve this goal we developed a complete methodology for
parallel execution of complex DNNs directly on board resource-constrained
milliwatt-scale nodes. Our system is based on GAP8, a novel parallel
ultra-low-power computing platform, and a 27 g commercial, open-source
CrazyFlie 2.0 nano-quadrotor. As part of our general methodology we discuss the
software mapping techniques that enable the state-of-the-art deep convolutional
neural network presented in [1] to be fully executed on-board within a strict 6
fps real-time constraint with no compromise in terms of flight results, while
all processing is done with only 64 mW on average. Our navigation engine is
flexible and can be used to span a wide performance range: at its peak
performance corner it achieves 18 fps while still consuming on average just
3.5% of the power envelope of the deployed nano-aircraft.
Comment: 15 pages, 13 figures, 5 tables, 2 listings, accepted for publication
in the IEEE Internet of Things Journal (IEEE IOTJ
An overview of memristive cryptography
Smaller, smarter and faster edge devices in the Internet of Things era
demand secure data analysis and transmission under the resource constraints of the
hardware architecture. Lightweight cryptography on edge hardware is an emerging
topic that is essential to ensure data security in near-sensor computing
systems such as mobiles, drones, smart cameras, and wearables. In this article,
the current state of memristive cryptography is placed in the context of
lightweight hardware cryptography. The paper provides a brief overview of the
traditional hardware lightweight cryptography and cryptanalysis approaches. The
contrast between memristive cryptography and traditional approaches is
made evident in this article, and the need to develop a more concrete approach to
memristive cryptanalysis for testing memristive cryptographic approaches
is highlighted.
Comment: European Physical Journal: Special Topics, Special Issue on
"Memristor-based systems: Nonlinearity, dynamics and applicatio
Null Convention Logic applications of asynchronous design in nanotechnology and cryptographic security
This dissertation presents two Null Convention Logic (NCL) applications of asynchronous logic circuit design in nanotechnology and cryptographic security. The first application is the Asynchronous Nanowire Reconfigurable Crossbar Architecture (ANRCA); the second is an asynchronous S-Box design that protects cryptographic systems against Side-Channel Attacks (SCA). The contributions of the first application are as follows: 1) Proposed a diode- and resistor-based ANRCA (DR-ANRCA). Three configurable logic block (CLB) structures were designed to efficiently reconfigure a given DR-PGMB as one of the 27 arbitrary NCL threshold gates. A hierarchical architecture was also proposed to implement higher-level logic that requires a large number of DR-PGMBs, such as multiple-bit NCL registers. 2) Proposed a memristor look-up-table based ANRCA (MLUT-ANRCA). An equivalent circuit simulation model has been presented in VHDL and simulated in Quartus II, and the two ANRCAs have been compared numerically. 3) Presented defect-tolerance and repair strategies for both DR-ANRCA and MLUT-ANRCA. The contributions of the second application are as follows: 1) Designed an NCL-based S-Box for the Advanced Encryption Standard (AES). Functional verification has been done using Modelsim and a Field-Programmable Gate Array (FPGA). 2) Implemented two different power analysis attacks on both the NCL S-Box and a conventional synchronous S-Box. 3) Developed a novel approach based on stochastic logic to enhance resistance against DPA and CPA attacks. The functionality of the proposed design has been verified using an 8-bit AES S-box design. The effects of decision weight, bitstream length, and input repetition times on error rates have also been studied. Experimental results show that the proposed approach enhances resistance against the CPA attack by successfully protecting the hidden key --Abstract, page iii
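The NCL threshold gates mentioned in this abstract can be illustrated with a small behavioral model. This is a minimal sketch of an unweighted m-of-n threshold gate with hysteresis, following the general NCL convention; it is not the dissertation's DR-PGMB or MLUT circuit, the class name `ThresholdGate` is hypothetical, and the weighted gates among the 27 are not covered.

```python
class ThresholdGate:
    """Behavioral model of an unweighted NCL THmn gate (assumed convention).

    The output is set once at least m of the n inputs are asserted, is
    cleared only when all inputs are deasserted, and otherwise holds its
    previous value (hysteresis).
    """

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("threshold m must satisfy 1 <= m <= n")
        self.m = m
        self.n = n
        self.output = 0  # gates start in the NULL (deasserted) state

    def step(self, inputs):
        """Apply one input vector and return the resulting output."""
        if len(inputs) != self.n:
            raise ValueError("expected exactly n inputs")
        asserted = sum(1 for bit in inputs if bit)
        if asserted >= self.m:
            self.output = 1      # threshold reached: set
        elif asserted == 0:
            self.output = 0      # all inputs NULL: reset
        # otherwise: hold the previous output
        return self.output


# TH23 example: the output stays high until every input returns to zero.
gate = ThresholdGate(2, 3)
trace = [gate.step(v) for v in [(1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 0)]]
print(trace)
```

The hold state is what lets NCL circuits signal completion without a clock: the output changes only once the inputs form a complete DATA or NULL wavefront.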
Exploiting Adaptive Techniques to Improve Processor Energy Efficiency
Rapid device miniaturization keeps inducing challenges in building energy-efficient microprocessors. As the size of transistors continuously decreases, more uncertainties emerge in their operation. On the other hand, integrating more and more transistors on a single chip accentuates the need to lower its supply voltage. This dissertation investigates one of the primary device uncertainties, timing errors, as a microprocessor performance bottleneck in the NTC era. It then proposes various innovative techniques to exploit these opportunities to maintain processor energy efficiency in the context of emerging challenges. Evaluated with a cross-layer methodology, the proposed approaches achieve substantial improvements in processor energy efficiency compared to other state-of-the-art techniques.
Design Automation and Application for Emerging Reconfigurable Nanotechnologies
In the last few decades, two major phenomena have revolutionized the electronic industry – the ever-increasing dependence on electronic circuits and the Complementary Metal Oxide Semiconductor (CMOS) downscaling. These two phenomena have been complementing each other in a way that while electronics, in general, have demanded more computations per functional unit, CMOS downscaling has aptly supported such needs. However, while the computational demand is still rising exponentially, CMOS downscaling is reaching its physical limits. Hence, the need to explore viable emerging nanotechnologies is more imperative than ever. This thesis focuses on streamlining the existing design automation techniques for a class of emerging reconfigurable nanotechnologies. Transistors based on this technology exhibit duality in conduction, i.e. they can be configured dynamically either as a p-type or an n-type device on the application of an external bias. Owing to this dynamic reconfiguration, these transistors are also referred to as Reconfigurable Field-Effect Transistors (RFETs).
Exploring and developing new technologies, just as with CMOS, requires tackling two main challenges. First, the design automation flow has to be modified to enable tailor-made circuit designs. Second, possible application opportunities should be explored where such technologies can outsmart the existing CMOS technologies. This thesis targets these two objectives for emerging reconfigurable nanotechnologies by proposing approaches for enabling an Electronic Design Automation (EDA) flow for circuits based on RFETs and by exploring hardware security as an application that exploits the transistor-level dynamic reconfiguration offered by this technology.
This thesis explains the bottom-up approach adopted to propose a logic synthesis flow by identifying new logic gates and circuit design paradigms that can particularly exploit the dynamic reconfiguration offered by these novel nanotechnologies. This led to the subsequent need of finding natural Boolean logic abstraction for emerging reconfigurable nanotechnologies as it is shown that the existing abstraction of negative unate logic for CMOS technologies is sub-optimal for RFETs-based circuits. In this direction, it has been shown that duality in Boolean logic is a natural abstraction for this technology and can truly represent the duality in conduction offered by individual transistors. Finding this abstraction paved the way for defining suitable primitives and proposing various algorithms for logic synthesis and technology mapping.
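The notion of duality used here can be made concrete with a truth-table check. This is a minimal sketch and not the thesis's synthesis flow: a Boolean function f is self-dual when complementing every input complements the output, and the helper name `is_self_dual` and the three example gates are chosen purely for illustration.

```python
from itertools import product


def is_self_dual(f, n):
    """Return True if f(x1..xn) == NOT f(NOT x1..NOT xn) for every input."""
    return all(
        f(*bits) == 1 - f(*(1 - b for b in bits))
        for bits in product((0, 1), repeat=n)
    )


def maj3(a, b, c):
    """Three-input majority."""
    return int(a + b + c >= 2)


def xor3(a, b, c):
    """Three-input XOR."""
    return a ^ b ^ c


def and2(a, b):
    """Two-input AND (its dual is OR, so it is not self-dual)."""
    return a & b


print(is_self_dual(maj3, 3), is_self_dual(xor3, 3), is_self_dual(and2, 2))
```

Three-input majority and three-input XOR both pass the check, while AND does not because its dual is OR.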
The next step is to explore a compatible physical synthesis flow for emerging reconfigurable nanotechnologies. Using silicon nanowire-based RFETs, .lef and .lib files have been provided which enable an end-to-end flow to generate a .GDSII file for circuits exclusively based on RFETs. Additionally, new approaches have been explored to improve placement and routing for circuits based on reconfigurable nanotechnologies. It has been demonstrated how these approaches lead to superior results compared to the native flow meant for CMOS.
Lastly, the unique property of transistor-level reconfiguration offered by RFETs is utilized to implement efficient Intellectual Property (IP) protection schemes against adversarial attacks. The ability to control the conduction of individual transistors can be argued as one of the impactful features of this technology and suitably fits into the paradigm of security measures. Prior security schemes based on CMOS technology often come with large overheads in terms of area, power, and delay. In contrast, RFETs-based hardware security measures such as logic locking, split manufacturing, etc. proposed in this thesis, demonstrate affordable security solutions with low overheads.
Overall, this thesis lays a strong foundation for the two main objectives, design automation and hardware security as an application, to push emerging reconfigurable nanotechnologies towards commercial integration. Additionally, the contributions of this thesis are made available under open-source licenses so as to foster new research directions and collaborations.
Abstract
List of Figures
List of Tables
1 Introduction
1.1 What are emerging reconfigurable nanotechnologies?
1.2 Why does this technology look so promising?
1.3 Electronics Design Automation
1.4 The game of see-saw: key challenges vs benefits for emerging reconfigurable nanotechnologies
1.4.1 Abstracting ambipolarity in logic gate designs
1.4.2 Enabling electronic design automation for RFETs
1.4.3 Enhanced functionality: a suitable fit for hardware security applications
1.5 Research questions
1.6 Entire RFET-centric EDA Flow
1.7 Key Contributions and Thesis Organization
2 Preliminaries
2.1 Reconfigurable Nanotechnology
2.1.1 1D devices
2.1.2 2D devices
2.1.3 Factors favoring circuit-flexibility
2.2 Feasibility aspects of RFET technology
2.3 Logic Synthesis Preliminaries
2.3.1 Circuit Model
2.3.2 Boolean Algebra
2.3.3 Monotone Function and the property of Unateness
2.3.4 Logic Representations
3 Exploring Circuit Design Topologies for RFETs
3.1 Contributions
3.2 Organization
3.3 Related Works
3.4 Exploring design topologies for combinational circuits: functionality-enhanced logic gates
3.4.1 List of Combinational Functionality-Enhanced Logic Gates based on RFETs
3.4.2 Estimation of gate delay using the logical effort theory
3.5 Invariable design of Inverters
3.6 Sequential Circuits
3.6.1 Dual edge-triggered TSPC-based D-flip flop
3.6.2 Exploiting RFET’s ambipolarity for metastability
3.7 Evaluations
3.7.1 Evaluation of combinational logic gates
3.7.2 Novel design of 1-bit ALU
3.7.3 Comparison of the sequential circuit with an equivalent CMOS-based design
3.8 Concluding remarks
4 Standard Cells and Technology Mapping
4.1 Contributions
4.2 Organization
4.3 Related Work
4.4 Standard cells based on RFETs
4.4.1 Interchangeable Pull-Up and Pull-Down Networks
4.4.2 Reconfigurable Truth-Table
4.5 Distilling standard cells
4.6 HOF-based Technology Mapping Flow for RFETs-based circuits
4.6.1 Area adjustments through inverter sharings
4.6.2 Technology Mapping Flow
4.6.3 Realizing Parameters For The Generic Library
4.6.4 Defining RFETs-based Genlib for HOF-based mapping
4.7 Experiments
4.7.1 Experiment 1: Distilling standard-cells from a benchmark suite
4.7.2 Experiment 2A: HOF-based mapping
4.7.3 Experiment 2B: Using the distilled standard-cells during mapping
4.8 Concluding Remarks
5 Logic Synthesis with XOR-Majority Graphs
5.1 Contributions
5.2 Organization
5.3 Motivation
5.4 Background and Preliminaries
5.4.1 Terminologies
5.4.2 Self-duality in NPN classes
5.4.3 Majority logic synthesis
5.4.4 Earlier work on XMG
5.4.5 Classification of Boolean functions
5.5 Preserving Self-Duality
5.5.1 During logic synthesis
5.5.2 During versatile technology mapping
5.6 Advanced Logic synthesis techniques
5.6.1 XMG resubstitution
5.6.2 Exact XMG rewriting
5.7 Logic representation-agnostic Mapping
5.7.1 Versatile Mapper
5.7.2 Support of supergates
5.8 Creating Self-dual Benchmarks
5.9 Experiments
5.9.1 XMG-based Flow
5.9.2 Experimental Setup
5.9.3 Synthetic self-dual benchmarks
5.9.4 Cryptographic benchmark suite
5.10 Concluding remarks and future research directions
6 Physical synthesis flow and liberty generation
6.1 Contributions
6.2 Organization
6.3 Background and Related Work
6.3.1 Related Works
6.3.2 Motivation
6.4 Silicon Nanowire Reconfigurable Transistors
6.5 Layouts for Logic Gates
6.5.1 Layouts for Static Functional Logic Gates
6.5.2 Layout for Reconfigurable Logic Gate
6.6 Table Model for Silicon Nanowire RFETs
6.7 Exploring Approaches for Physical Synthesis
6.7.1 Using the Standard Place & Route Flow
6.7.2 Open-source Flow
6.7.3 Concept of Driver Cells
6.7.4 Native Approach
6.7.5 Island-based Approach
6.7.6 Utilization Factor
6.7.7 Placement of the Island on the Chip
6.8 Experiments
6.8.1 Preliminary comparison with CMOS technology
6.8.2 Evaluating different physical synthesis approaches
6.9 Results and discussions
6.9.1 Parameters Which Affect The Area
6.9.2 Use of Germanium Nanowires Channels
6.10 Concluding Remarks
7 Polymorphic Primitives for Hardware Security
7.1 Contributions
7.2 Organization
7.3 The Shift To Explore Emerging Technologies For Security
7.4 Background
7.4.1 IP protection schemes
7.4.2 Preliminaries
7.5 Security Promises
7.5.1 RFETs for logic locking (transistor-level locking)
7.5.2 RFETs for split manufacturing
7.6 Security Vulnerabilities
7.6.1 Realization of short-circuit and open-circuit scenarios in an RFET-based inverter
7.6.2 Circuit evaluation on sub-circuits
7.6.3 Reliability concerns: A consequence of short-circuit scenario
7.6.4 Implication of the proposed security vulnerability
7.7 Analytical Evaluation
7.7.1 Investigating the security promises
7.7.2 Investigating the security vulnerabilities
7.8 Concluding remarks and future research directions
8 Conclusion
8.1 Concluding Remarks
8.2 Directions for Future Work
Appendices
A Distilling standard-cells
B RFETs-based Genlib
C Layout Extraction File (.lef) for Silicon Nanowire-based RFET
D Liberty (.lib) file for Silicon Nanowire-based RFET
ReDO: Cross-Layer Multi-Objective Design-Exploration Framework for Efficient Soft Error Resilient Systems
Designing soft-error-resilient systems is a complex engineering task, which nowadays follows a cross-layer approach. It requires careful planning of different fault-tolerance mechanisms at different layers of the system, from the technology up to the software domain. While these design decisions have a positive effect on the reliability of the system, they usually have a detrimental effect on its size, power consumption, performance and cost. Design space exploration for cross-layer reliability is therefore a multi-objective search problem in which reliability must be traded off against other design dimensions. This paper proposes a cross-layer multi-objective design space exploration algorithm developed to help designers build soft-error-resilient electronic systems. The algorithm exploits a system-level Bayesian reliability estimation model to analyze the effect of different cross-layer combinations of protection mechanisms on the reliability of the full system. A new heuristic based on the extremal optimization theory is used to efficiently explore the design space. An extended set of simulations shows the capability of this framework when applied both to benchmark applications and realistic systems, providing optimized systems that outperform those obtained by applying state-of-the-art cross-layer reliability techniques.
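The multi-objective trade-off described in this abstract can be illustrated with a Pareto-dominance filter. This is a minimal sketch only: it is neither the paper's extremal-optimization heuristic nor its Bayesian reliability model, and the design names and the (failure probability, relative cost) values are hypothetical.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(designs):
    """Keep only the designs that no other design dominates."""
    return {
        name: objectives
        for name, objectives in designs.items()
        if not any(dominates(other, objectives) for other in designs.values())
    }


# Hypothetical candidates: (failure probability, relative cost), both minimized.
designs = {
    "no_protection": (0.20, 1.0),
    "ecc_memory": (0.05, 1.3),
    "tmr_logic": (0.02, 3.0),
    "ecc_plus_slow_retry": (0.05, 1.6),
}
front = pareto_front(designs)
print(sorted(front))
```

Here `ecc_plus_slow_retry` is discarded because `ecc_memory` reaches the same failure probability at lower cost; the remaining three designs are mutually non-dominated, and choosing among them is the designer's trade-off.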