15 research outputs found

    Advances in Nanowire-Based Computing Architectures

    Get PDF

    Null Convention Logic applications of asynchronous design in nanotechnology and cryptographic security

    Get PDF
    This dissertation presents two Null Convention Logic (NCL) applications of asynchronous logic circuit design in nanotechnology and cryptographic security. The first application is the Asynchronous Nanowire Reconfigurable Crossbar Architecture (ANRCA); the second one is an asynchronous S-Box design for cryptographic system against Side-Channel Attacks (SCA). The following are the contributions of the first application: 1) Proposed a diode- and resistor-based ANRCA (DR-ANRCA). Three configurable logic block (CLB) structures were designed to efficiently reconfigure a given DR-PGMB as one of the 27 arbitrary NCL threshold gates. A hierarchical architecture was also proposed to implement the higher level logic that requires a large number of DR-PGMBs, such as multiple-bit NCL registers. 2) Proposed a memristor look-up-table based ANRCA (MLUT-ANRCA). An equivalent circuit simulation model has been presented in VHDL and simulated in Quartus II. Meanwhile, the comparison between these two ANRCAs have been analyzed numerically. 3) Presented the defect-tolerance and repair strategies for both DR-ANRCA and MLUT-ANRCA. The following are the contributions of the second application: 1) Designed an NCL based S-Box for Advanced Encryption Standard (AES). Functional verification has been done using Modelsim and Field-Programmable Gate Array (FPGA). 2) Implemented two different power analysis attacks on both NCL S-Box and conventional synchronous S-Box. 3) Developed a novel approach based on stochastic logics to enhance the resistance against DPA and CPA attacks. The functionality of the proposed design has been verified using an 8-bit AES S-box design. The effects of decision weight, bitstream length, and input repetition times on error rates have been also studied. Experimental results shows that the proposed approach enhances the resistance to against the CPA attack by successfully protecting the hidden key --Abstract, page iii

    Virtualizing Reconfigurable Architectures: From Fpgas To Beyond

    Get PDF
    With field-programmable gate arrays (FPGAs) being widely deployed in data centers to enhance the computing performance, an efficient virtualization support is required to fully unleash the potential of cloud FPGAs. However, the system support for FPGAs in the context of the cloud environment is still in its infancy, which leads to a low resource utilization due to the tight coupling between compilation and resource allocation. Moreover, the system support proposed in existing works is limited to a homogeneous FPGA cluster comprising identical FPGA devices, which is hard to be extended to a heterogeneous FPGA cluster that comprises multiple types of FPGAs. As the FPGA cloud is expected to become increasingly heterogeneous due to the hardware rolling upgrade strategy, it is necessary to provide efficient virtualization support for the heterogeneous FPGA cluster. In this dissertation, we first identify three pairs of conflicting requirements from runtime management and offline compilation, which are related to the tradeoff between flexibility and efficiency. These conflicting requirements are the fundamental reason why the single-level abstraction proposed in prior works for the homogeneous FPGA cluster cannot be trivially extended to the heterogeneous cluster. To decouple these conflicting requirements, we provide a two-level system abstraction. Specifically, the high-level abstraction is FPGA-agnostic and provides a simple and homogeneous view of the FPGA resources to simplify the runtime management and maximize the flexibility. On the contrary, the low-level abstraction is FPGA-specific and exposes sufficient low-level hardware details to the compilation framework to ensure the mapping quality and maximize the efficiency. This generic two-level system abstraction can also be specialized to the homogeneous FPGA cluster and/or be extended to leverage application-specific information to further improve the efficiency. We also develop a compilation framework and a modular runtime system with a heuristic-based runtime management policy to support this two-level system abstraction. By enabling a dynamic FPGA sharing at the sub-FPGA granularity, the proposed virtualization solution can deploy 1.62x more applications using the same amount of FPGA resources and reduce the compilation time by 22.6% (perform as many compilation tasks in parallel as possible) with an acceptable virtualization overhead, i.e., Finally, we use Liquid Silicon as a case study to show that the proposed virtualization solution can be extended to other spatial reconfigurable architectures. Liquid Silicon is a homogeneous reconfigurable architecture enabled by the non-volatile memory technology (i.e., RRAM). It extends the configuration capability of existing FPGAs from computation to the whole spectrum ranging from computation to data storage. It allows users to better customize hardware by flexibly partitioning hardware resources between computation and memory based on the actual usage. Instead of naively applying the proposed virtualization solution onto Liquid Silicon, we co-optimize the system abstraction and Liquid Silicon architecture to improve the performance

    Cutting Edge Nanotechnology

    Get PDF
    The main purpose of this book is to describe important issues in various types of devices ranging from conventional transistors (opening chapters of the book) to molecular electronic devices whose fabrication and operation is discussed in the last few chapters of the book. As such, this book can serve as a guide for identifications of important areas of research in micro, nano and molecular electronics. We deeply acknowledge valuable contributions that each of the authors made in writing these excellent chapters

    Variability-aware architectures based on hardware redundancy for nanoscale reliable computation

    Get PDF
    During the last decades, human beings have experienced a significant enhancement in the quality of life thanks in large part to the fast evolution of Integrated Circuits (IC). This unprecedented technological race, along with its significant economic impact, has been grounded on the production of complex processing systems from highly reliable compounding devices. However, the fundamental assumption of nearly ideal devices, which has been true within the past CMOS technology generations, today seems to be coming to an end. In fact, as MOSFET technology scales into nanoscale regime it approaches to fundamental physical limits and starts experiencing higher levels of variability, performance degradation, and higher rates of manufacturing defects. On the other hand, ICs with increasing number of transistors require a decrease in the failure rate per device in order to maintain the overall chip reliability. As a result, it is becoming increasingly important today the development of circuit architectures capable of providing reliable computation while tolerating high levels of variability and defect rates. The main objective of this thesis is to analyze and propose new fault-tolerant architectures based on redundancy for future technologies. Our research is founded on the principles of redundancy established by von Neumann in the 1950s and extends them to three new dimensions: 1. Heterogeneity: Most of the works on fault-tolerant architectures based on redundancy assume homogeneous variability in the replicas like von Neumann's original work. Instead, we explore the possibilities of redundancy when heterogeneity between replicas is taken into account. In this sense, we propose compensating mechanisms that select the weighting of the redundant information to maximize the overall reliability. 2. Asynchrony: Each of the replicas of a redundant system may have associated different processing delays due to variability and degradation; especially in future nanotechnologies. If we design our system to work locally in asynchronous mode then we may consider different voting policies to deal with the redundant information. Depending on how many replicas we collect before taking a decision we can obtain different trade-off between processing delay and reliability. We propose a mechanism for providing these facilities and analyze and simulate its operation. 3. Hierarchy: Finally, we explore the possibilities of redundancy applied at different hierarchy layers of complex processing systems. We propose to distribute redundancy across the various hierarchy layers and analyze the benefits that can be obtained. Drawing on the scenario of future ICs technologies, we push the concept of redundancy to its fullest expression through the study of realistic nano-device architectures. Most of the redundant architectures considered so far do not face properly the era of Terascale Computing and the nanotechnology trends. Since von Neumann applied for the first time redundancy at electronic circuits, never until now effects as common in nanoelectronics as degradation and interconnection failures have been treated directly from the standpoint of redundancy. In this thesis we address in a comprehensive manner the reliability of digital processing systems in the upcoming technology generations

    Non-volatile FPGA architecture using resistive switching devices

    Get PDF
    This dissertation reports the research work that was conducted to propose a non-volatile architecture for FPGA using resistive switching devices. This is achieved by designing a Configurable Memristive Logic Block (CMLB). The CMLB comprises of memristive logic cells (MLC) interconnected to each other using memristive switch matrices. In the MLC, novel memristive D flip-flop (MDFF), 6-bit non-volatile look-up table (NVLUT), and CMOS-based multiplexers are used. Other than the MDFF, a non-volatile D-latch (NVDL) was also designed. The MDFF and the NVDL are proposed to replace CMOS-based D flip-flops and D-latches to improve energy consumption. The CMLB shows a reduction of 8.6% of device area and 1.094 times lesser critical path delay against the SRAM-based FPGA architecture. Against similar CMOS-based circuits, the MDFF provides switching speed of 1.08 times faster; the NVLUT reduces power consumption by 6.25nW and improves device area by 128 transistors; while the memristive logic cells reduce overall device area by 60.416μm2. The NVLUT is constructed using novel 2TG1M memory cells, which has the fastest switching times of 12.14ns, compared to other similar memristive memory cells. This is due to the usage of transmission gates which improves voltage transfer from input to the memristor. The novel 2TG1M memory cell also has lower energy consumption than the CMOS-based 6T SRAM cell. The memristive-based switch matrices that interconnects the MLCs together comprises of novel 7T1M SRAM cells, which has the lowest energy-delay-area-product value of 1.61 among other memristive SRAM cells. Two memristive logic gates (MLG) were also designed (OR and AND), that introduces non-volatility into conventional logic gates. All the above circuits and design simulations were performed on an enhanced SPICE memristor model, which was improved from a previously published memristor model. The previously published memristor model was fault to not be in good agreement with memristor theory and the physical model of memristors. Therefore, the enhanced SPICE memristor model provides a memristor model which is in good agreement with the memristor theory and the physical model of memristors, which is used throughout this research work

    Non-volatile FPGA architecture using resistive switching devices

    Get PDF
    This dissertation reports the research work that was conducted to propose a non-volatile architecture for FPGA using resistive switching devices. This is achieved by designing a Configurable Memristive Logic Block (CMLB). The CMLB comprises of memristive logic cells (MLC) interconnected to each other using memristive switch matrices. In the MLC, novel memristive D flip-flop (MDFF), 6-bit non-volatile look-up table (NVLUT), and CMOS-based multiplexers are used. Other than the MDFF, a non-volatile D-latch (NVDL) was also designed. The MDFF and the NVDL are proposed to replace CMOS-based D flip-flops and D-latches to improve energy consumption. The CMLB shows a reduction of 8.6% of device area and 1.094 times lesser critical path delay against the SRAM-based FPGA architecture. Against similar CMOS-based circuits, the MDFF provides switching speed of 1.08 times faster; the NVLUT reduces power consumption by 6.25nW and improves device area by 128 transistors; while the memristive logic cells reduce overall device area by 60.416μm2. The NVLUT is constructed using novel 2TG1M memory cells, which has the fastest switching times of 12.14ns, compared to other similar memristive memory cells. This is due to the usage of transmission gates which improves voltage transfer from input to the memristor. The novel 2TG1M memory cell also has lower energy consumption than the CMOS-based 6T SRAM cell. The memristive-based switch matrices that interconnects the MLCs together comprises of novel 7T1M SRAM cells, which has the lowest energy-delay-area-product value of 1.61 among other memristive SRAM cells. Two memristive logic gates (MLG) were also designed (OR and AND), that introduces non-volatility into conventional logic gates. All the above circuits and design simulations were performed on an enhanced SPICE memristor model, which was improved from a previously published memristor model. The previously published memristor model was fault to not be in good agreement with memristor theory and the physical model of memristors. Therefore, the enhanced SPICE memristor model provides a memristor model which is in good agreement with the memristor theory and the physical model of memristors, which is used throughout this research work

    Circuit Design, Architecture and CAD for RRAM-based FPGAs

    Get PDF
    Field Programmable Gate Arrays (FPGAs) have been indispensable components of embedded systems and datacenter infrastructures. However, energy efficiency of FPGAs has become a hard barrier preventing their expansion to more application contexts, due to two physical limitations: (1) The massive usage of routing multiplexers causes delay and power overheads as compared to ASICs. To reduce their power consumption, FPGAs have to operate at low supply voltage but sacrifice performance because the transistors drive degrade when working voltage decreases. (2) Using volatile memory technology forces FPGAs to lose configurations when powered off and to be reconfigured at each power on. Resistive Random Access Memories (RRAMs) have strong potentials in overcoming the physical limitations of conventional FPGAs. First of all, RRAMs grant FPGAs non-volatility, enabling FPGAs to be "Normally powered off, Instantly powered on". Second, by combining functionality of memory and pass-gate logic in one unique device, RRAMs can greatly reduce area and delay of routing elements. Third, when RRAMs are embedded into datpaths, the performance of circuits can be independent from their working voltage, beyond the limitations of CMOS circuits. However, researches and development of RRAM-based FPGAs are in their infancy. Most of area and performance predictions were achieved without solid circuit-level simulations and sophisticated Computer Aided Design (CAD) tools, causing the predicted improvements to be less convincing. In this thesis,we present high-performance and low-power RRAM-based FPGAs fromtransistorlevel circuit designs to architecture-level optimizations and CAD tools, using theoretical analysis, industrial electrical simulators and novel CAD tools. We believe that this is the first systematic study in the field, covering: From a circuit design perspective, we propose efficient RRAM-based programming circuits and routing multiplexers through both theoretical analysis and electrical simulations. The proposed 4T(ransitor)1R(RAM) programming structure demonstrates significant improvements in programming current, when compared to most popular 2T1R programming structure. 4T1R-based routingmultiplexer designs are proposed by considering various physical design parasitics, such as intrinsic capacitance of RRAMs and wells doping organization. The proposed 4T1R-based multiplexers outperformbest CMOS implementations significantly in area, delay and power at both nominal and near-Vt regime. From a CAD perspective, we develop a generic FPGA architecture exploration tool, FPGASPICE, modeling a full FPGA fabric with SPICE and Verilog netlists. FPGA-SPICE provides different levels of testbenches and techniques to split large SPICE netlists, in order to obtain better trade-off between simulation time and accuracy. FPGA-SPICE can capture area and power characteristics of SRAM-based and RRAM-based FPGAs more accurately than the currently best analyticalmodels. From an architecture perspective, we propose architecture-level optimizations for RRAMbased FPGAs and quantify their minimumrequirements for RRAM devices. Compared to the best SRAM-based FPGAs, an optimized RRAM-based FPGA architecture brings significant reduction in area, delay and power respectively. In particular, RRAM-based FPGAs operating in the near-Vt regime demonstrate a 5x power improvement without delay overhead as compared to optimized SRAM-based FPGA operating at nominal working voltage

    Ant colony optimization on runtime reconfigurable architectures

    Get PDF
    corecore