MATIC: Learning Around Errors for Efficient Low-Voltage Neural Network Accelerators
As a result of the increasing demand for deep neural network (DNN)-based
services, efforts to develop dedicated hardware accelerators for DNNs are
growing rapidly. However, while accelerators with high performance and
efficiency on convolutional deep neural networks (Conv-DNNs) have been
developed, less progress has been made with regard to fully-connected DNNs
(FC-DNNs). In this paper, we propose MATIC (Memory Adaptive Training with
In-situ Canaries), a methodology that enables aggressive voltage scaling of
accelerator weight memories to improve the energy-efficiency of DNN
accelerators. To enable accurate operation with voltage overscaling, MATIC
combines the characteristics of destructive SRAM reads with the error
resilience of neural networks in a memory-adaptive training process.
Furthermore, PVT-related voltage margins are eliminated using bit-cells from
synaptic weights as in-situ canaries to track runtime environmental variation.
Demonstrated on a low-power DNN accelerator that we fabricate in 65 nm CMOS,
MATIC enables up to 60-80 mV of voltage overscaling (3.3x total energy
reduction versus the nominal voltage), or 18.6x application error reduction.
Comment: 6 pages, 12 figures, 3 tables. Published at Design, Automation and
Test in Europe Conference and Exhibition (DATE) 201
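The abstract does not spell out how read failures enter the training process. As a rough illustration only, the Python sketch below assumes that bit-cells failing at a reduced voltage can be profiled into per-word stuck-at-0 and stuck-at-1 masks, and that these masks are applied to the stored weight words so that training sees the same corrupted values the hardware would read. The function name `apply_fault_map`, the mask representation and the example values are hypothetical and are not taken from the paper.

```python
def apply_fault_map(weights, stuck_at_zero, stuck_at_one, bits=8):
    """Return weight words as they would be read through faulty bit-cells.

    weights, stuck_at_zero and stuck_at_one are equal-length lists of
    unsigned integers; a 1 in a mask marks a bit-cell assumed to be
    stuck at that value (hypothetical fault model).
    """
    full = (1 << bits) - 1
    read_back = []
    for w, m0, m1 in zip(weights, stuck_at_zero, stuck_at_one):
        # Clear the bits stuck at 0, then set the bits stuck at 1.
        read_back.append(((w & ~m0) | m1) & full)
    return read_back


# Hypothetical 8-bit weight words and a hypothetical profiled fault map.
weights = [0b10110010, 0b00001111, 0b11110000]
stuck_at_zero = [0b10000000, 0b00000000, 0b00010000]
stuck_at_one = [0b00000000, 0b01000000, 0b00000000]

faulty = apply_fault_map(weights, stuck_at_zero, stuck_at_one)
```

In a training loop, the forward pass would use `faulty` in place of `weights`, so that the network learns around the bits it cannot rely on.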
On the Critical Role of Ferroelectric Thickness for Negative Capacitance Device-Circuit Interaction
This paper demonstrates the critical role that ferroelectric (FE) layer thickness (tFE) plays in negative capacitance (NC) transistors, connecting the device and circuit levels. The study uses fully calibrated TCAD simulations for a 14 nm FDSOI technology node to explore the impact of tFE on the figures of merit of n-type and p-type devices, on the voltage transfer characteristic (VTC) and noise margin of inverters, and on the speed of buffer circuits. First, we analyze the device electrical parameters (e.g., ION, SS, ION/IOFF and Cgg) by varying tFE up to the maximum level at which hysteresis in the I-V characteristic starts. Then, we analyze the deleterious impact of negative differential resistance (NDR), caused by drain-to-gate coupling, and demonstrate how it imposes an additional constraint on the maximum tFE. We show the consequences of NDR effects on the VTC and noise margin of inverters, which are essential components for constructing robust clock trees in any chip. We demonstrate how the considerable increase in gate capacitance due to the FE layer seriously degrades circuit performance, imposing a further constraint on the maximum tFE. Further, we analyze the impact of tFE on the static performance metrics of the SRAM cell, such as hold noise margin (HNM), read noise margin (RNM) and write noise margin (WNM), at supply voltages of 0.7 V and 0.4 V. We demonstrate that the HNM and RNM of an NC-FDSOI FET based SRAM cell are higher than those of the baseline FDSOI FET based SRAM cell and increase further with tFE. However, the WNM generally follows a non-monotonic trend with respect to tFE, and the trend also depends on the supply voltage. Finally, we optimize the design of the SRAM cell considering the overall performance metrics. All in all, our analysis provides guidance for device and circuit designers in selecting the optimal FE thickness for NCFETs, so that hysteresis-free operation, reliability, and performance are optimized.
Characterization of File Memory Compiler
As the use of handheld devices is increasing rapidly, SoCs must be designed with a much smaller size. The majority of the SoC area is occupied by memories such as RAM and cache, so the size of these memories must be reduced. Also, because these SoCs run on batteries, the memories must consume very little power. The requirement is therefore to design high-density embedded memories that occupy less area, consume less power, and meet the designer's timing requirements, such as access time and setup and hold time constraints. The main aim of the proposed work is, once the design is done, to characterize the memory compiler to determine whether it meets the design requirements for the given instance and the given process, voltage and temperature corners, and to obtain the memory timing and power information in datasheet and Liberty formats.
Design of High Performance SRAM Based Memory Chip
SRAM is a semiconductor memory that uses a bistable latch circuit to store a logic 1 or 0. It differs from dynamic RAM (DRAM), which needs periodic refresh operations to retain the stored data. SRAM power consumption varies with the frequency of operation, i.e., like DRAM, it consumes very high power at higher frequencies. The cache memory in a microprocessor needs high-speed memory, so SRAM can be used for that purpose. DRAM is normally used in the main memory of processors, where density matters more than speed. SRAM is also used in industrial subsystems and in scientific and automotive electronics. In this thesis, a 16-Kb memory is designed using the memory banking method in UMC 90 nm technology, operating at a frequency of 1 GHz. Post-layout simulation is performed for the complete design, and power analysis is obtained for the overall design. All peripherals, namely the pre-charge circuit, row decoder, word-line driver, sense amplifier, column decoder/mux and write driver, are designed, and their layouts are drawn in an optimised manner so that they occupy minimum area. The 6T SRAM cell is designed with an operating frequency of 8 GHz, and stability analysis is also performed for a single SRAM cell. The layout of a single SRAM cell is drawn symmetrically so that two adjacent cells can share the same contact, which reduces the cell layout area. The static noise margin, read noise margin and write noise margin of a single cell are found to be 240 mV, 115 mV and 425 mV respectively for a supply voltage of 1 V. The effect of the pull-up ratio and cell ratio on the stability of the SRAM cell is observed.
Bit Error Robustness for Energy-Efficient DNN Accelerators
Deep neural network (DNN) accelerators have received considerable attention in
past years because they save energy compared to mainstream hardware. Low-voltage
operation of DNN accelerators reduces energy consumption significantly further
but causes bit-level failures in the memory storing the
quantized DNN weights. In this paper, we show that a combination of robust
fixed-point quantization, weight clipping, and random bit error training
(RandBET) improves robustness against random bit errors in (quantized) DNN
weights significantly. This leads to high energy savings from both low-voltage
operation as well as low-precision quantization. Our approach generalizes
across operating voltages and accelerators, as demonstrated on bit errors from
profiled SRAM arrays. We also discuss why weight clipping alone is already a
quite effective way to achieve robustness against bit errors. Moreover, we
specifically discuss the involved trade-offs regarding accuracy, robustness and
precision: Without losing more than 1% in accuracy compared to a normally
trained 8-bit DNN, we can reduce energy consumption on CIFAR-10 by 20%. Higher
energy savings of, e.g., 30%, are possible at the cost of 2.5% accuracy, even
for 4-bit DNNs.
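The interplay of fixed-point quantization, weight clipping and random bit errors described above can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the symmetric quantization scheme, the helper names (`quantize`, `dequantize`, `flip_random_bits`), the clipping bound `w_max` and the bit error rate `p` are all hypothetical choices made for the example.

```python
import random


def quantize(w, w_max, bits=8):
    """Clip w to [-w_max, w_max] and map it to an unsigned integer code."""
    levels = (1 << bits) - 1
    w = max(-w_max, min(w_max, w))
    return round((w + w_max) / (2 * w_max) * levels)


def dequantize(q, w_max, bits=8):
    """Map an integer code back to a real value in [-w_max, w_max]."""
    levels = (1 << bits) - 1
    return q / levels * 2 * w_max - w_max


def flip_random_bits(q, p, rng, bits=8):
    """Flip each bit of q independently with probability p."""
    for b in range(bits):
        if rng.random() < p:
            q ^= 1 << b
    return q


rng = random.Random(0)          # fixed seed so the run is repeatable
w_max = 0.1                     # hypothetical clipping bound
weights = [0.03, -0.25, 0.08, -0.01]

codes = [quantize(w, w_max) for w in weights]
noisy = [flip_random_bits(q, 0.01, rng) for q in codes]
restored = [dequantize(q, w_max) for q in noisy]
```

Because every code decodes to a value inside `[-w_max, w_max]`, a single bit flip can change a weight by at most `2 * w_max`. This gives one intuition for why a tighter clipping bound limits the damage of bit errors; injecting such flips during training would then let the network adapt to them.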
Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators
Deep neural network (DNN) accelerators have received considerable attention in recent years due to their potential to save energy compared to mainstream hardware. Low-voltage operation of DNN accelerators reduces energy consumption significantly further but causes bit-level failures in the memory storing the quantized DNN weights. Furthermore, DNN accelerators have been shown to be vulnerable to adversarial attacks on voltage controllers or individual bits. In this paper, we show that a combination of robust fixed-point quantization, weight clipping, and either random bit error training (RandBET) or adversarial bit error training (AdvBET) significantly improves robustness against random or adversarial bit errors in quantized DNN weights. This not only leads to high energy savings from low-voltage operation and low-precision quantization, but also improves the security of DNN accelerators. Our approach generalizes across operating voltages and accelerators, as demonstrated on bit errors from profiled SRAM arrays, and achieves robustness against both targeted and untargeted bit-level attacks. Without losing more than 0.8%/2% in test accuracy, we can reduce energy consumption on CIFAR10 by 20%/30% for 8/4-bit quantization using RandBET. Allowing up to 320 adversarial bit errors, AdvBET reduces test error from above 90% (chance level) to 26.22% on CIFAR10.
Contributions on using embedded memory circuits as physically unclonable functions considering reliability issues
[eng] As we move towards the Internet-of-Things (IoT) era, hardware security is becoming a crucial
research topic because of the growing demand for electronic products that are remotely
connected through networks. Novel hardware security primitives based on
manufacturing process variability have been proposed to enhance the security of IoT
systems. As a trusted root that provides physical randomness, a physically unclonable
function (PUF) is an essential base for hardware security.
SRAM devices are becoming one of the most promising alternatives for the
implementation of embedded physically unclonable functions, as the start-up value of
each bit-cell depends largely on the variability related to the manufacturing process.
Not all bit-cells experience the same degree of variability, so some cells
randomly change their logical start-up value, while others always start up at the
same value. However, PUF applications, such as identification
and key generation, require a more constant logical start-up value to ensure high reliability
in the PUF response. For this reason, some kind of post-processing is needed to correct the
errors in the PUF response.
Unfortunately, the cells that have a more constant logic output are difficult to
detect in advance. This work characterizes by simulation the reproducibility of the start-up
value and proposes several metrics suitable for reliability estimation during the design
phases. The aim is to be able to predict by simulation the percentage of cells that will be
suitable for use as PUF generators. We evaluate the metric results and analyze the
reproducibility of the start-up values considering different external perturbation sources, such as
different power-supply ramp-up times, previous internal values in the bit-cell, and
different temperature scenarios. The characterization metrics can be exploited to
estimate the number of suitable SRAM cells for use in PUF implementations that can be
expected from a specific SRAM design.
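The abstract does not define its reliability metrics, so the Python sketch below shows only one simple, hypothetical measure of start-up reproducibility: the fraction of bit-cells that return the same value in every simulated power-up. The per-cell probabilities in `p_one`, the number of runs and the function names are assumptions made for illustration and do not come from the thesis.

```python
import random


def stable_fraction(power_ups):
    """Fraction of cells whose start-up value is identical in every power-up.

    power_ups is a list of runs; each run is a list with one 0/1 value per cell.
    """
    n_cells = len(power_ups[0])
    stable = sum(
        1 for c in range(n_cells)
        if len({run[c] for run in power_ups}) == 1
    )
    return stable / n_cells


def simulate_power_ups(p_one, n_runs, rng):
    """Draw n_runs start-up patterns; cell i starts at 1 with probability p_one[i]."""
    return [[1 if rng.random() < p else 0 for p in p_one] for _ in range(n_runs)]


rng = random.Random(1)
# Hypothetical cells: strongly skewed cells (0.0, 1.0) and weakly skewed ones.
p_one = [0.0, 1.0, 0.5, 0.98, 0.02, 1.0]
runs = simulate_power_ups(p_one, 50, rng)
print(stable_fraction(runs))
```

Repeating such a count under different ramp-up times, previous stored values or temperatures would give a rough estimate of how many cells stay usable as PUF bits across conditions.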