330 research outputs found

    Robust low-power digital circuit design in nano-CMOS technologies

    Get PDF
    Device scaling has resulted in large scale integrated, high performance, low-power, and low cost systems. However the move towards sub-100 nm technology nodes has increased variability in device characteristics due to large process variations. Variability has severe implications on digital circuit design by causing timing uncertainties in combinational circuits, degrading yield and reliability of memory elements, and increasing power density due to slow scaling of supply voltage. Conventional design methods add large pessimistic safety margins to mitigate increased variability, however, they incur large power and performance loss as the combination of worst cases occurs very rarely. In-situ monitoring of timing failures provides an opportunity to dynamically tune safety margins in proportion to on-chip variability that can significantly minimize power and performance losses. We demonstrated by simulations two delay sensor designs to detect timing failures in advance that can be coupled with different compensation techniques such as voltage scaling, body biasing, or frequency scaling to avoid actual timing failures. Our simulation results using 45 nm and 32 nm technology BSIM4 models indicate significant reduction in total power consumption under temperature and statistical variations. Future work involves using dual sensing to avoid useless voltage scaling that incurs a speed loss. SRAM cache is the first victim of increased process variations that requires handcrafted design to meet area, power, and performance requirements. We have proposed novel 6 transistors (6T), 7 transistors (7T), and 8 transistors (8T)-SRAM cells that enable variability tolerant and low-power SRAM cache designs. Increased sense-amplifier offset voltage due to device mismatch arising from high variability increases delay and power consumption of SRAM design. We have proposed two novel design techniques to reduce offset voltage dependent delays providing a high speed low-power SRAM design. Increasing leakage currents in nano-CMOS technologies pose a major challenge to a low-power reliable design. We have investigated novel segmented supply voltage architecture to reduce leakage power of the SRAM caches since they occupy bulk of the total chip area and power. Future work involves developing leakage reduction methods for the combination logic designs including SRAM peripherals

    Near-Threshold Computing

    Get PDF
    Valmistustekniikoiden kehittyessä IC-piireille saadaan mahtumaan yhä enemmän transistoreja. Monimutkaisemmat piirit mahdollistavat suurempien laskutoimitusmäärien suorittamisen aikayksikössä. Piirien aktiivisuuden lisääntyessä myös niiden energiankulutus lisääntyy, ja tämä puolestaan lisää piirin lämmöntuotantoa. Liiallinen lämpö rajoittaa piirien toimintaa. Tämän takia tarvitaan tekniikoita, joilla piirien energiankulutusta saadaan pienennettyä. Uudeksi tutkimuskohteeksi ovat tulleet pienet laitteet, jotka seuraavat esimerkiksi ihmiskehon toimintaa, rakennuksia tai siltoja. Tällaisten laitteiden on oltava energiankulutukseltaan pieniä, jotta ne voivat toimia pitkiä aikoja ilman akkujen lataamista. Near-Threshold Computing on tekniikka, jolla pyritään pienentämään integroitujen piirien energiankulutusta. Periaatteena on käyttää piireillä pienempää käyttöjännitettä kuin piirivalmistaja on niille alunperin suunnitellut. Tämä hidastaa ja haittaa piirin toimintaa. Jos kuitenkin laitteen toiminnassa pystyään hyväksymään huonompi laskentateho ja pienentynyt toimintavarmuus, voidaan saavuttaa säästöä energiankulutuksessa. Tässä diplomityössä tarkastellaan Near-Threshold Computing -tekniikkaa eri näkökulmista: aluksi perustuen kirjallisuudesta löytyviin aikaisempiin tutkimuksiin, ja myöhemmin tutkimalla Near-Threshold Computing -tekniikan soveltamista kahden tapaustutkimuksen kautta. Tapaustutkimuksissa tarkastellaan FO4-invertteriä sekä 6T SRAM -solua piirisimulaatioiden avulla. Näiden komponenttien käyttäytymisen Near-Threshold Computing –jännitteillä voidaan tulkita antavan kattavan kuvan suuresta osasta tavanomaisen IC-piirin pinta-alaa ja energiankulusta. Tapaustutkimuksissa käytetään 130 nm teknologiaa, ja niissä mallinnetaan todellisia piirivalmistusprosessin tuotteita ajamalla useita Monte Carlo -simulaatioita. Tämä valmistuskustannuksiltaan huokea teknologia yhdistettynä Near-Threshold Computing -tekniikkaan mahdollistaa matalan energiankulutuksen piirien valmistaminen järkevään hintaan. Tämän diplomityön tulokset näyttävät, että Near-Threshold Computing pienentää piirien energiankulutusta merkittävästi. Toisaalta, piirien nopeus heikkenee, ja yleisesti käytetty 6T SRAM -muistisolu muuttuu epäluotettavaksi. Pidemmät polut logiikkapiireissä sekä transistorien kasvattaminen muistisoluissa osoitetaan tehokkaiksi vastatoimiksi Near- Threshold Computing -tekniikan huonoja puolia vastaan. Tulokset antavat perusteita matalan energiankulutuksen IC-piirien suunnittelussa sille, kannattaako käyttää normaalia käyttöjännitettä, vai laskea sitä, jolloin piirin hidastuminen ja epävarmempi käyttäytyminen pitää ottaa huomioon.Siirretty Doriast

    Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency

    Get PDF
    As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts. Aggressive voltage underscaling of chips is one the effective low power paradigms of providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and innovates novel energy efficient approaches. Specifically, the properties of a low power security primitive have been improved and, higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage underscaled environment. And, novel power saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis

    Contributions on using embedded memory circuits as physically unclonable functions considering reliability issues

    Get PDF
    [eng] Moving towards Internet-of-Things (IoT) era, hardware security becomes a crucial research topic, because of the growing demand of electronic products that are remotely connected through networks. Novel hardware security primitives based on manufacturing process variability are proposed to enhance the security of the IoT systems. As a trusted root that provides physical randomness, a physically unclonable function is an essential base for hardware security. SRAM devices are becoming one of the most promising alternatives for the implementation of embedded physical unclonable functions as the start-up value of each bit-cell depends largely on the variability related with the manufacturing process. Not all bit-cells experience the same degree of variability, so it is possible that some cells randomly modify their logical starting value, while others will start-up always at the same value. However, physically unclonable function applications, such as identification and key generation, require more constant logical starting value to assure high reliability in PUF response. For this reason, some kind of post-processing is needed to correct the errors in the PUF response. Unfortunately, those cells that have more constant logic output are difficult to be detected in advance. This work characterizes by simulation the start-up value reproducibility proposing several metrics suitable for reliability estimation during design phases. The aim is to be able to predict by simulation the percentage of cells that will be suitable to be used as PUF generators. We evaluate the metrics results and analyze the start-up values reproducibility considering different external perturbation sources like several power supply ramp up times, previous internal values in the bit-cell, and different temperature scenarios. The characterization metrics can be exploited to estimate the number of suitable SRAM cells for use in PUF implementations that can be expected from a specific SRAM design.[cat] En l’era de la Internet de les coses (IoT), garantir la seguretat del hardware ha esdevingut un tema de recerca crucial, en especial a causa de la creixent demanda de productes electrònics que es connecten remotament a través de xarxes. Per millorar la seguretat dels sistemes IoT, s’han proposat noves solucions hardware basades en la variabilitat dels processos de fabricació. Les funcions físicament inclonables (PUF) constitueixen una font fiable d’aleatorietat física i són una base essencial per a la seguretat hardware. Les memòries SRAM s’estan convertint en una de les alternatives més prometedores per a la implementació de funcions físicament inclonables encastades. Això és així ja que el valor d’encesa de cada una de les cel·les que formen els bits de la memòria depèn en gran mesura de la variabilitat pròpia del procés de fabricació. No tots els bits tenen el mateix grau de variabilitat, així que algunes cel·les canvien el seu estat lògic d’encesa de forma aleatòria entre enceses, mentre que d’altres sempre assoleixen el mateix valor en totes les enceses. No obstant això, les funcions físicament inclonables, que s’utilitzen per generar claus d’identificació, requereixen un valor lògic d’encesa constant per tal d’assegurar una resposta fiable del PUF. Per aquest motiu, normalment es necessita algun tipus de postprocessament per corregir els possibles errors presents en la resposta del PUF. Malauradament, les cel·les que presenten una resposta més constant són difícils de detectar a priori. Aquest treball caracteritza per simulació la reproductibilitat del valor d’encesa de cel·les SRAM, i proposa diverses mètriques per estimar la fiabilitat de les cel·les durant les fases de disseny de la memòria. L'objectiu és ser capaç de predir per simulació el percentatge de cel·les que seran adequades per ser utilitzades com PUF. S’avaluen els resultats de diverses mètriques i s’analitza la reproductibilitat dels valors d’encesa de les cel·les considerant diverses fonts de pertorbacions externes, com diferents rampes de tensió per a l’encesa, els valors interns emmagatzemats prèviament en les cel·les, i diferents temperatures. Es proposa utilitzar aquestes mètriques per estimar el nombre de cel·les SRAM adients per ser implementades com a PUF en un disseny d‘SRAM específic.[spa] En la era de la Internet de las cosas (IoT), garantizar la seguridad del hardware se ha convertido en un tema de investigación crucial, en especial a causa de la creciente demanda de productos electrónicos que se conectan remotamente a través de redes. Para mejorar la seguridad de los sistemas IoT, se han propuesto nuevas soluciones hardware basadas en la variabilidad de los procesos de fabricación. Las funciones físicamente inclonables (PUF) constituyen una fuente fiable de aleatoriedad física y son una base esencial para la seguridad hardware. Las memorias SRAM se están convirtiendo en una de las alternativas más prometedoras para la implementación de funciones físicamente inclonables empotradas. Esto es así, puesto que el valor de encendido de cada una de las celdas que forman los bits de la memoria depende en gran medida de la variabilidad propia del proceso de fabricación. No todos los bits tienen el mismo grado de variabilidad. Así pues, algunas celdas cambian su estado lógico de encendido de forma aleatoria entre encendidos, mientras que otras siempre adquieren el mismo valor en todos los encendidos. Sin embargo, las funciones físicamente inclonables, que se utilizan para generar claves de identificación, requieren un valor lógico de encendido constante para asegurar una respuesta fiable del PUF. Por este motivo, normalmente se necesita algún tipo de posprocesado para corregir los posibles errores presentes en la respuesta del PUF. Desafortunadamente, las celdas que presentan una respuesta más constante son difíciles de detectar a priori. Este trabajo caracteriza por simulación la reproductibilidad del valor de encendido de celdas SRAM, y propone varias métricas para estimar la fiabilidad de las celdas durante las fases de diseño de la memoria. El objetivo es ser capaz de predecir por simulación el porcentaje de celdas que serán adecuadas para ser utilizadas como PUF. Se evalúan los resultados de varias métricas y se analiza la reproductibilidad de los valores de encendido de las celdas considerando varias fuentes de perturbaciones externas, como diferentes rampas de tensión para el encendido, los valores internos almacenados previamente en las celdas, y diferentes temperaturas. Se propone utilizar estas métricas para estimar el número de celdas SRAM adecuadas para ser implementadas como PUF en un diseño de SRAM específico

    Exploiting Application Behaviors for Resilient Static Random Access Memory Arrays in the Near-Threshold Computing Regime

    Get PDF
    Near-Threshold Computing embodies an intriguing choice for mobile processors due to the promise of superior energy efficiency, extending the battery life of these devices while reducing the peak power draw. However, process, voltage, and temperature variations cause a significantly high failure rate of Level One cache cells in the near-threshold regime a stark contrast to designs in the super-threshold regime, where fault sites are rare. This thesis work shows that faulty cells in the near-threshold regime are highly clustered in certain regions of the cache. In addition, popular mobile benchmarks are studied to investigate the impact of run-time workloads on timing faults manifestation. A technique to mitigate the run-time faults is proposed. This scheme maps frequently used data to healthy cache regions by exploiting the application cache behaviors. The results show up to 78% gain in performance over two other state-of-the-art techniques

    Statistical analysis and design of subthreshold operation memories

    Get PDF
    This thesis presents novel methods based on a combination of well-known statistical techniques for faster estimation of memory yield and their application in the design of energy-efficient subthreshold memories. The emergence of size-constrained Internet-of-Things (IoT) devices and proliferation of the wearable market has brought forward the challenge of achieving the maximum energy efficiency per operation in these battery operated devices. Achieving this sought-after minimum energy operation is possible under sub-threshold operation of the circuit. However, reliable memory operation is currently unattainable at these ultra-low operating voltages because of the memory circuit's vanishing noise margins which shrink further in the presence of random process variations. The statistical methods, presented in this thesis, make the yield optimization of the sub-threshold memories computationally feasible by reducing the SPICE simulation overhead. We present novel modifications to statistical sampling techniques that reduce the SPICE simulation overhead in estimating memory failure probability. These sampling scheme provides 40x reduction in finding most probable failure point and 10x reduction in estimating failure probability using the SPICE simulations compared to the existing proposals. We then provide a novel method to create surrogate models of the memory margins with better extrapolation capability than the traditional regression methods. These models, based on Gaussian process regression, encode the sensitivity of the memory margins with respect to each individual threshold variation source in a one-dimensional kernel. We find that our proposed additive kernel based models have 32% smaller out-of-sample error (that is, better extrapolation capability outside training set) than using the six-dimensional universal kernel like Radial Basis Function (RBF). The thesis also explores the topological modifications to the SRAM bitcell to achieve faster read operation at the sub-threshold operating voltages. We present a ten-transistor SRAM bitcell that achieves 2x faster read operation than the existing ten-transistor sub-threshold SRAM bitcells, while ensuring similar noise margins. The SRAM bitcell provides 70% reduction in dynamic energy at the cost of 42% increase in the leakage energy per read operation. Finally, we investigate the energy efficiency of the eDRAM gain-cells as an alternative to the SRAM bitcells in the size-constrained IoT devices. We find that reducing their write path leakage current is the only way to reduce the read energy at Minimum Energy operation Point (MEP). Further, we study the effect of transistor up-sizing under the presence of threshold voltage variations on the mean MEP read energy by performing statistical analysis based on the ANOVA test of the full-factorial experimental design.Esta tesis presenta nuevos métodos basados en una combinación de técnicas estadísticas conocidas para la estimación rápida del rendimiento de la memoria y su aplicación en el diseño de memorias de energia eficiente de sub-umbral. La aparición de los dispositivos para el Internet de las cosas (IOT) y la proliferación del mercado portátil ha presentado el reto de lograr la máxima eficiencia energética por operación de estos dispositivos operados con baterias. La eficiencia de energía es posible si se considera la operacion por debajo del umbral de los circuitos. Sin embargo, la operación confiable de memoria es actualmente inalcanzable en estos bajos niveles de voltaje debido a márgenes de ruido de fuga del circuito de memoria, los cuales se pueden reducir aún más en presencia de variaciones randomicas de procesos. Los métodos estadísticos, que se presentan en esta tesis, hacen que la optimización del rendimiento de las memorias por debajo del umbral computacionalmente factible mediante la simulación SPICE. Presentamos nuevas modificaciones a las técnicas de muestreo estadístico que reducen la sobrecarga de simulación SPICE en la estimación de la probabilidad de fallo de memoria. Estos esquemas de muestreo proporciona una reducción de 40 veces en la búsqueda de puntos de fallo más probable, y 10 veces la reducción en la estimación de la probabilidad de fallo mediante las simulaciones SPICE en comparación con otras propuestas existentes. A continuación, se proporciona un método novedoso para crear modelos sustitutos de los márgenes de memoria con una mejor capacidad de extrapolación que los métodos tradicionales de regresión. Estos modelos, basados en el proceso de regresión Gaussiano, codifican la sensibilidad de los márgenes de memoria con respecto a cada fuente de variación de umbral individual en un núcleo de una sola dimensión. Los modelos propuestos, basados en kernel aditivos, tienen un error 32% menor que el error out-of-sample (es decir, mejor capacidad de extrapolación fuera del conjunto de entrenamiento) en comparacion con el núcleo universal de seis dimensiones como la función de base radial (RBF). La tesis también explora las modificaciones topológicas a la celda binaria SRAM para alcanzar velocidades de lectura mas rapidas dentro en el contexto de operaciones en el umbral de tensiones de funcionamiento. Presentamos una celda binaria SRAM de diez transistores que consigue aumentar en 2 veces la operación de lectura en comparacion con las celdas sub-umbral de SRAM de diez transistores existentes, garantizando al mismo tiempo los márgenes de ruido similares. La celda binaria SRAM proporciona una reducción del 70% en energía dinámica a costa del aumento del 42% en la energía de fuga por las operaciones de lectura. Por último, se investiga la eficiencia energética de las células de ganancia eDRAM como una alternativa a los bitcells SRAM en los dispositivos de tamaño limitado IOT. Encontramos que la reducción de la corriente de fuga en el path de escritura es la única manera de reducir la energía de lectura en el Punto Mínimo de Energía (MEP). Además, se estudia el efecto del transistor de dimensionamiento en virtud de la presencia de variaciones de voltaje de umbral en la media de energia de lecture MEP mediante el análisis estadístico basado en la prueba de ANOVA del diseño experimental factorial completo.Postprint (published version

    Ultra-low Power FinFET SRAM Cell with improved stability suitable for low power applications

    Get PDF
    In this paper, a new 11T SRAM cell using FinFET technology has been proposed, the basic component of the cell is the 6T SRAM cell with 4 NMOS access transistors to improve the stability and also makes it a dual port memory cell. The proposed cell uses a header scheme in which one extra PMOS transistor is used which is biased at different voltages to improve the read and write stability thus, helps in reducing the leakage power and active power. The cell shows improvement in RSNM (Read Static Noise Margin) with LP8T by 2.39x at sub-threshold voltage 2.68x with D6T SRAM cell, 5.5x with TG8T. The WSNM (Write Static Noise Margin) and HM (Hold Margin) of the SRAM cell at 0.9V is 306mV and 384mV. At sub-threshold operation also it shows improvement. The Leakage power reduced by 0.125x with LP8T, 0.022x with D6T SRAM cell, TG8T and SE8T. Also, impact of process variation on cell stability is discussed

    Semiconductor Memory Applications in Radiation Environment, Hardware Security and Machine Learning System

    Get PDF
    abstract: Semiconductor memory is a key component of the computing systems. Beyond the conventional memory and data storage applications, in this dissertation, both mainstream and eNVM memory technologies are explored for radiation environment, hardware security system and machine learning applications. In the radiation environment, e.g. aerospace, the memory devices face different energetic particles. The strike of these energetic particles can generate electron-hole pairs (directly or indirectly) as they pass through the semiconductor device, resulting in photo-induced current, and may change the memory state. First, the trend of radiation effects of the mainstream memory technologies with technology node scaling is reviewed. Then, single event effects of the oxide based resistive switching random memory (RRAM), one of eNVM technologies, is investigated from the circuit-level to the system level. Physical Unclonable Function (PUF) has been widely investigated as a promising hardware security primitive, which employs the inherent randomness in a physical system (e.g. the intrinsic semiconductor manufacturing variability). In the dissertation, two RRAM-based PUF implementations are proposed for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs are evaluated with experiment and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated and optimization strategies are proposed to solve the non-ideal effects. Besides, the security resistance against modeling and machine learning attacks is analyzed as well. Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification and object localization and detection. Increasing efforts have been devoted to develop hardware accelerators. In this dissertation, two types of compute-in-memory (CIM) based hardware accelerator designs with SRAM and eNVM technologies are proposed for two binary neural networks, i.e. hybrid BNN (HBNN) and XNOR-BNN, respectively, which are explored for the hardware resource-limited platforms, e.g. edge devices.. These designs feature with high the throughput, scalability, low latency and high energy efficiency. Finally, we have successfully taped-out and validated the proposed designs with SRAM technology in TSMC 65 nm. Overall, this dissertation paves the paths for memory technologies’ new applications towards the secure and energy-efficient artificial intelligence system.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    -Memory Computing Based Reliable and High Speed Schmitt trigger 10T SRAM cell design

    Get PDF
    Static random access memories (SRAM) are useful building blocks in various applications, including cache memories, integrated data storage systems, and microprocessors. The von Neumann bottleneck difficulties are solved by in-memory computing. It eliminates unnecessary frequent data transfer between memory and processing units simultaneously. In this research, the replica-based 10T SRAM design for in-memory computing (IMC) is designed by adapting the word line control scheme in 14nm CMOS technology. In order to achieve high reading and writing capability, the Schmitt trigger inverter was used for energy-saving and stable use. To speed up the writing process of the design, a single transistor is inserted between the cross-coupled inverters. In addition, to increase the node capacity, the voltage boosting circuitry is emphasized. The adaptive word line control scheme was utilized by integrating the replica column based circuit. The Replica approach regulates signal flow through the core by using a dummy column and a dummy row in RAM. To demonstrate the viability of the suggested design, the simulated outcomes are contrasted with those of existing designs. The various performance metrics examined are Read Static Noise Margin (RSNM), Write (WSNM), Hold (HSNM), Read Access Delay (RAD), Write Access Delay (WAD), Read performance and Write performance the varying supply voltage is evaluated
    corecore