13 research outputs found

    MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators

    Full text link
    As a result of the increasing demand for deep neural network (DNN)-based services, efforts to develop dedicated hardware accelerators for DNNs are growing rapidly. However,while accelerators with high performance and efficiency on convolutional deep neural networks (Conv-DNNs) have been developed, less progress has been made with regards to fully-connected DNNs (FC-DNNs). In this paper, we propose MATIC (Memory Adaptive Training with In-situ Canaries), a methodology that enables aggressive voltage scaling of accelerator weight memories to improve the energy-efficiency of DNN accelerators. To enable accurate operation with voltage overscaling, MATIC combines the characteristics of destructive SRAM reads with the error resilience of neural networks in a memory-adaptive training process. Furthermore, PVT-related voltage margins are eliminated using bit-cells from synaptic weights as in-situ canaries to track runtime environmental variation. Demonstrated on a low-power DNN accelerator that we fabricate in 65 nm CMOS, MATIC enables up to 60-80 mV of voltage overscaling (3.3x total energy reduction versus the nominal voltage), or 18.6x application error reduction.Comment: 6 pages, 12 figures, 3 tables. Published at Design, Automation and Test in Europe Conference and Exhibition (DATE) 201

    On the Critical Role of Ferroelectric Thickness for Negative Capacitance Device-Circuit Interaction

    Get PDF
    This paper demonstrates the critical role that Ferroelectric (FE) layer thickness (tFE) plays in Negative Capacitance (NC) transistors connecting device and circuit levels together. The study is done through fully-calibrated TCAD simulations for a 14nm FDSOI technology node, exploring the impact of tFE on the figures of merit of n-type and p-type devices, voltage transfer characteristic (VTC) and noise margin of inverter as well as the speed of buffer circuits. First, we analyze the device electrical parameters (e.g., ION, SS, ION/IOFF and Cgg) by varying tFE up to the maximum level at which hysteresis in the I-V characteristic starts. Then, we analyze the deleterious impact of Negative Differential Resistance (NDR), due to the drain to gate coupling, demonstrating how it imposes an additional constraint limiting the maximum tFE. We show the consequences of NDR effects on the VTC and noise margin of inverter, which are essential components for constructing robust clock trees in any chip. We demonstrate how the considerable increase in the gate’s capacitance due to FE seriously degrades the circuit’s performance imposing further constraints limiting the maximum tFE. Further, we analyze the impact of tFE on the SRAM cell static performance metrics such hold noise margin (HNM), read noise margin (RNM) and write noise margin (WNM) at supply voltages of 0.7V and 0.4V. We demonstrate that the HNM and RNM in a NC-FDSOI FET based SRAM cell are higher then those of the baseline FDSOI FET based SRAM cell noise margin and further increase with tFE. However, the WNM in general follows a non monotonic trend w.r.t tFE, and the trend also depends on the supply voltage. Finally, we optimize the design of the SRAM cell considering overall performance metrics. All in all, our analysis provides guidance for device and circuit designers to select the optimal FE thickness for NCFETs in which hysteresis-free operations, reliability, and performance are optimized

    Characterization of File Memory Compiler

    Get PDF
    As the usage of hand held devices is increasing rapidly, it is a serious requirement to design the SoC with much smaller size. Majority of the area on the SoC is occupied by the memories used like RAM, Cache etc. It is required to reduce the size of these memories on the SoC. Also as these SoC run on batteries such memories must consume very less power. So the requirement is to design high density embedded memories with less in area, with less power consumption and meeting the designer timing requirements like access time, setup and hold time constraints. The main aim of the proposed work is, once the design is done, the memory compiler need to be characterized whether it is meeting the design requirements for the given instance and given process, voltage and temperature corners and to get the memory timing and power information in datasheet and liberty formats

    Design of High Performance SRAM Based Memory Chip

    Get PDF
    The semiconductor memory SRAM uses bi-stable latch circuit to store the logic data 1 or 0. It differs from Dynamic RAM (DRAM) which needs periodic refreshment operation for the storage of logic data. Depending upon the frequency of operation SRAM power consumption varies i.e. it consumes very high power at higher frequencies like DRAM. The Cache memory present in the microprocessor needs high speed memory hence SRAM can be used for that purpose in microprocessors. The DRAM is normally used in the Main memory of processors, where importance is given to the density than its speed. The SRAM is also used in industrial subsystems, scientific and automotive electronics. In this thesis 16-Kb Memory is designed by using memory banking method in UMC 90nm technology ,which operates at a frequency of 1GHz.The post layout simulation for the complete design is performed and also obtained power analysis for the overall design. All peripherals like pre-charge, Row Decoder, Word line driver, Sense amplifier, Column Decoder/Mux and write driver are designed and layouts of all the above peripherals also drawn in an optimised manner such that their layout occupies minimum area. The 6T SRAM cell is designed with operating frequency of 8 GHz and stability analysis are also performed for single SRAM cell. The layout of Single SRAM cell is drawn in a symmetric manner, such that two adjacent cells can share same contact, which results reduction in the area of cell layout. The Static Noise Margin, Read noise margin and Write Noise Margin of single cell are found to be 240mV, 115mV and 425mV respectively for a supply voltage of 1V.The effect of pull-up ratio and cell ratio on the stability of SRAM cell is observed

    Bit Error Robustness for Energy-Efficient DNN Accelerators

    Get PDF
    Deep neural network (DNN) accelerators received considerable attention in past years due to saved energy compared to mainstream hardware. Low-voltage operation of DNN accelerators allows to further reduce energy consumption significantly, however, causes bit-level failures in the memory storing the quantized DNN weights. In this paper, we show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors in (quantized) DNN weights significantly. This leads to high energy savings from both low-voltage operation as well as low-precision quantization. Our approach generalizes across operating voltages and accelerators, as demonstrated on bit errors from profiled SRAM arrays. We also discuss why weight clipping alone is already a quite effective way to achieve robustness against bit errors. Moreover, we specifically discuss the involved trade-offs regarding accuracy, robustness and precision: Without losing more than 1% in accuracy compared to a normally trained 8-bit DNN, we can reduce energy consumption on CIFAR-10 by 20%. Higher energy savings of, e.g., 30%, are possible at the cost of 2.5% accuracy, even for 4-bit DNNs

    Random and Adversarial Bit Error Robustness: {E}nergy-Efficient and Secure {DNN} Accelerators

    Get PDF
    Deep neural network (DNN) accelerators received considerable attention in recent years due to the potential to save energy compared to mainstream hardware. Low-voltage operation of DNN accelerators allows to further reduce energy consumption significantly, however, causes bit-level failures in the memory storing the quantized DNN weights. Furthermore, DNN accelerators have been shown to be vulnerable to adversarial attacks on voltage controllers or individual bits. In this paper, we show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) or adversarial bit error training (AdvBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly. This leads not only to high energy savings for low-voltage operation as well as low-precision quantization, but also improves security of DNN accelerators. Our approach generalizes across operating voltages and accelerators, as demonstrated on bit errors from profiled SRAM arrays, and achieves robustness against both targeted and untargeted bit-level attacks. Without losing more than 0.8%/2% in test accuracy, we can reduce energy consumption on CIFAR10 by 20%/30% for 8/4-bit quantization using RandBET. Allowing up to 320 adversarial bit errors, AdvBET reduces test error from above 90% (chance level) to 26.22% on CIFAR10

    Contributions on using embedded memory circuits as physically unclonable functions considering reliability issues

    Get PDF
    [eng] Moving towards Internet-of-Things (IoT) era, hardware security becomes a crucial research topic, because of the growing demand of electronic products that are remotely connected through networks. Novel hardware security primitives based on manufacturing process variability are proposed to enhance the security of the IoT systems. As a trusted root that provides physical randomness, a physically unclonable function is an essential base for hardware security. SRAM devices are becoming one of the most promising alternatives for the implementation of embedded physical unclonable functions as the start-up value of each bit-cell depends largely on the variability related with the manufacturing process. Not all bit-cells experience the same degree of variability, so it is possible that some cells randomly modify their logical starting value, while others will start-up always at the same value. However, physically unclonable function applications, such as identification and key generation, require more constant logical starting value to assure high reliability in PUF response. For this reason, some kind of post-processing is needed to correct the errors in the PUF response. Unfortunately, those cells that have more constant logic output are difficult to be detected in advance. This work characterizes by simulation the start-up value reproducibility proposing several metrics suitable for reliability estimation during design phases. The aim is to be able to predict by simulation the percentage of cells that will be suitable to be used as PUF generators. We evaluate the metrics results and analyze the start-up values reproducibility considering different external perturbation sources like several power supply ramp up times, previous internal values in the bit-cell, and different temperature scenarios. The characterization metrics can be exploited to estimate the number of suitable SRAM cells for use in PUF implementations that can be expected from a specific SRAM design.[cat] En l’era de la Internet de les coses (IoT), garantir la seguretat del hardware ha esdevingut un tema de recerca crucial, en especial a causa de la creixent demanda de productes electrònics que es connecten remotament a través de xarxes. Per millorar la seguretat dels sistemes IoT, s’han proposat noves solucions hardware basades en la variabilitat dels processos de fabricació. Les funcions físicament inclonables (PUF) constitueixen una font fiable d’aleatorietat física i són una base essencial per a la seguretat hardware. Les memòries SRAM s’estan convertint en una de les alternatives més prometedores per a la implementació de funcions físicament inclonables encastades. Això és així ja que el valor d’encesa de cada una de les cel·les que formen els bits de la memòria depèn en gran mesura de la variabilitat pròpia del procés de fabricació. No tots els bits tenen el mateix grau de variabilitat, així que algunes cel·les canvien el seu estat lògic d’encesa de forma aleatòria entre enceses, mentre que d’altres sempre assoleixen el mateix valor en totes les enceses. No obstant això, les funcions físicament inclonables, que s’utilitzen per generar claus d’identificació, requereixen un valor lògic d’encesa constant per tal d’assegurar una resposta fiable del PUF. Per aquest motiu, normalment es necessita algun tipus de postprocessament per corregir els possibles errors presents en la resposta del PUF. Malauradament, les cel·les que presenten una resposta més constant són difícils de detectar a priori. Aquest treball caracteritza per simulació la reproductibilitat del valor d’encesa de cel·les SRAM, i proposa diverses mètriques per estimar la fiabilitat de les cel·les durant les fases de disseny de la memòria. L'objectiu és ser capaç de predir per simulació el percentatge de cel·les que seran adequades per ser utilitzades com PUF. S’avaluen els resultats de diverses mètriques i s’analitza la reproductibilitat dels valors d’encesa de les cel·les considerant diverses fonts de pertorbacions externes, com diferents rampes de tensió per a l’encesa, els valors interns emmagatzemats prèviament en les cel·les, i diferents temperatures. Es proposa utilitzar aquestes mètriques per estimar el nombre de cel·les SRAM adients per ser implementades com a PUF en un disseny d‘SRAM específic.[spa] En la era de la Internet de las cosas (IoT), garantizar la seguridad del hardware se ha convertido en un tema de investigación crucial, en especial a causa de la creciente demanda de productos electrónicos que se conectan remotamente a través de redes. Para mejorar la seguridad de los sistemas IoT, se han propuesto nuevas soluciones hardware basadas en la variabilidad de los procesos de fabricación. Las funciones físicamente inclonables (PUF) constituyen una fuente fiable de aleatoriedad física y son una base esencial para la seguridad hardware. Las memorias SRAM se están convirtiendo en una de las alternativas más prometedoras para la implementación de funciones físicamente inclonables empotradas. Esto es así, puesto que el valor de encendido de cada una de las celdas que forman los bits de la memoria depende en gran medida de la variabilidad propia del proceso de fabricación. No todos los bits tienen el mismo grado de variabilidad. Así pues, algunas celdas cambian su estado lógico de encendido de forma aleatoria entre encendidos, mientras que otras siempre adquieren el mismo valor en todos los encendidos. Sin embargo, las funciones físicamente inclonables, que se utilizan para generar claves de identificación, requieren un valor lógico de encendido constante para asegurar una respuesta fiable del PUF. Por este motivo, normalmente se necesita algún tipo de posprocesado para corregir los posibles errores presentes en la respuesta del PUF. Desafortunadamente, las celdas que presentan una respuesta más constante son difíciles de detectar a priori. Este trabajo caracteriza por simulación la reproductibilidad del valor de encendido de celdas SRAM, y propone varias métricas para estimar la fiabilidad de las celdas durante las fases de diseño de la memoria. El objetivo es ser capaz de predecir por simulación el porcentaje de celdas que serán adecuadas para ser utilizadas como PUF. Se evalúan los resultados de varias métricas y se analiza la reproductibilidad de los valores de encendido de las celdas considerando varias fuentes de perturbaciones externas, como diferentes rampas de tensión para el encendido, los valores internos almacenados previamente en las celdas, y diferentes temperaturas. Se propone utilizar estas métricas para estimar el número de celdas SRAM adecuadas para ser implementadas como PUF en un diseño de SRAM específico
    corecore