12 research outputs found

    Replacing 6T SRAMs with 3T1D DRAMs in the L1 data cache to combat process variability

    Get PDF
    With continued technology scaling, process variations will be especially detrimental to six-transistor static memory structures (6T SRAMs). A memory architecture using three-transistor, one-diode DRAM (3T1D) cells in the L1 data cache tolerates wide process variations with little performance degradation, making it a promising choice for on-chip cache structures for next-generation microprocessors.Peer ReviewedPostprint (published version

    Strategies to enhance the 3T1D-DRAM cell variability robustness beyond 22 nm

    Get PDF
    3T1D cell has been stated as a valid alternative to be implemented on L1 memory cache to substitute 6T, highly affected by device variability as technology dimensions are reduced. In this work, we have shown that 22 nm 3T1D memory cells present significant tolerance to high levels of device parameter fluctuation. Moreover, we have observed that when variability is considered the write access transistor becomes a significant detrimental element on the 3T1D cell performance. Furthermore, resizing and temperature control have been presented as some valid strategies in order to mitigate the 3T1D cell variability.Peer ReviewedPostprint (author's final draft

    Mitigation strategies of the variability in 3T1D cell memories scaled beyond 22nm

    Get PDF
    Best DCIS Paper Award 20123T1D cell has been stated as a valid alternative to be implemented on L1 memory cache to substitute 6T, highly affected by device variability. In this contribution, we have shown that 22nm 3T1D memory cells present significant tolerance to high levels of device parameter fluctuation. Moreover, we have observed that the variability of the write access transistor has turn into the more detrimental device for the 3T1D cell performance. Furthermore, resizing and temperature control have been presented as some strategies to mitigate the cell variability.Peer ReviewedAward-winningPreprin

    Cache memory design in the FinFET era

    Get PDF
    The major problem in the future technology scaling is the variations in process parameters that are interpreted as imperfections in the development process. Moreover, devices are more sensitive to the environmental changes of temperature and supply volt- age as well as to ageing. All these influences are manifested in the integrated circuits as increased power consumption, reduced maximal operating frequency and increased number of failures. These effects have been partially overcome with the introduction of the FinFET technology which have solved the problem of variability caused by Random Dopant Fluctuations. However, in the next ten years channel length is projected to shrink to 10nm where the variability source generated by Line Edge Roughness will dominate, and its effects on the threshold voltage variations will become critical. The embedded memories with their cells as the basic building unit are the most prone to these effects due to their the smallest dimensions. Because of that, memories should be designed with particular care in order to make possible further technology scaling. This thesis explores upcoming 10nm FinFETs and the existing issues in the cache memory design with this technology. More- over, it tries to present some original and novel techniques on the different level of design abstraction for mitigating the effects of process and environmental variability. At first original method for simulating variability of Tri-Gate Fin- FETs is presented using conventional HSPICE simulation environment and BSIM-CMG model cards. When that is accomplished, thorough characterisation of traditional SRAM cell circuits (6T and 8T) is performed. Possibility of using Independent Gate FinFETs for increasing cell stability has been explored, also. Gain Cells appeared in the recent past as an attractive alternative for in the cache memory design. This thesis partially explores this idea by presenting and performing detailed circuit analysis of the dynamic 3T gain cell for 10nm FinFETs. At the top of this work, thesis shows one micro-architecture optimisation of high-speed cache when it is implemented by 3T gain cells. We show how the cache coherency states can be used in order to reduce refresh energy of the memory as well as reduce memory ageing.El principal problema de l'escalat la tecnologia són les variacions en els paràmetres de disseny (imperfeccions) durant procés de fabricació. D'altra banda, els dispositius també són més sensibles als canvis ambientals de temperatura, la tensió d'alimentació, així com l'envelliment. Totes aquestes influències es manifesten en els circuits integrats com l'augment de consum d'energia, la reducció de la freqüència d'operació màxima i l'augment del nombre de xips descartats. Aquests efectes s'han superat parcialment amb la introducció de la tecnologia FinFET que ha resolt el problema de la variabilitat causada per les fluctuacions de dopants aleatòries. No obstant això, en els propers deu anys, l'ample del canal es preveu que es reduirà a 10nm, on la font de la variabilitat generada per les rugositats de les línies de material dominarà, i els seu efecte en les variacions de voltatge llindar augmentarà. Les memòries encastades amb les seves cel·les com la unitat bàsica de construcció són les més propenses a sofrir aquests efectes a causa de les seves dimensions més petites. A causa d'això, cal dissenyar les memòries amb una especial cura per tal de fer possible l'escalat de la tecnologia. Aquesta tesi explora la tecnologia de FinFETs de 10nm i els problemes existents en el disseny de memòries amb aquesta tecnologia. A més a més, presentem noves tècniques originals sobre diferents nivells d'abstracció del disseny per a la mitigació dels efectes les variacions tan de procés com ambientals. En primer lloc, presentem un mètode original per a la simulació de la variabilitat de Tri-Gate FinFETs usant entorn de simulació HSPICE convencional i models de tecnologia BSIMCMG. Després, es realitza la caracterització completa dels circuits de cel·les SRAM tradicionals (6T i 8T) conjuntament amb l'ús de Gate-independent FinFETs per augmentar l'estabilitat de la cèl·lula

    Review on suitable eDRAM configurations for next nano-metric electronics era

    Get PDF
    We summarize most of our studies focused on the main reliability issues that can threat the gain-cells eDRAM behavior when it is simulated at the nano-metric device range has been collected in this review. So, to outperform their memory cell counterparts, we explored different technological proposals and operational regimes where it can be located. The best memory cell performance is observed for the 3T1D-eDRAM cell when it is based on FinFET devices. Both device variability and SEU appear as key reliability issues for memory cells at sub-22nm technology node.Peer ReviewedPostprint (author's final draft

    Reliability in the face of variability in nanometer embedded memories

    Get PDF
    In this thesis, we have investigated the impact of parametric variations on the behaviour of one performance-critical processor structure - embedded memories. As variations manifest as a spread in power and performance, as a first step, we propose a novel modeling methodology that helps evaluate the impact of circuit-level optimizations on architecture-level design choices. Choices made at the design-stage ensure conflicting requirements from higher-levels are decoupled. We then complement such design-time optimizations with a runtime mechanism that takes advantage of adaptive body-biasing to lower power whilst improving performance in the presence of variability. Our proposal uses a novel fully-digital variation tracking hardware using embedded DRAM (eDRAM) cells to monitor run-time changes in cache latency and leakage. A special fine-grain body-bias generator uses the measurements to generate an optimal body-bias that is needed to meet the required yield targets. A novel variation-tolerant and soft-error hardened eDRAM cell is also proposed as an alternate candidate for replacing existing SRAM-based designs in latency critical memory structures. In the ultra low-power domain where reliable operation is limited by the minimum voltage of operation (Vddmin), we analyse the impact of failures on cache functional margin and functional yield. Towards this end, we have developed a fully automated tool (INFORMER) capable of estimating memory-wide metrics such as power, performance and yield accurately and rapidly. Using the developed tool, we then evaluate the #effectiveness of a new class of hybrid techniques in improving cache yield through failure prevention and correction. Having a holistic perspective of memory-wide metrics helps us arrive at design-choices optimized simultaneously for multiple metrics needed for maintaining lifetime requirements

    Statistical analysis and design of subthreshold operation memories

    Get PDF
    This thesis presents novel methods based on a combination of well-known statistical techniques for faster estimation of memory yield and their application in the design of energy-efficient subthreshold memories. The emergence of size-constrained Internet-of-Things (IoT) devices and proliferation of the wearable market has brought forward the challenge of achieving the maximum energy efficiency per operation in these battery operated devices. Achieving this sought-after minimum energy operation is possible under sub-threshold operation of the circuit. However, reliable memory operation is currently unattainable at these ultra-low operating voltages because of the memory circuit's vanishing noise margins which shrink further in the presence of random process variations. The statistical methods, presented in this thesis, make the yield optimization of the sub-threshold memories computationally feasible by reducing the SPICE simulation overhead. We present novel modifications to statistical sampling techniques that reduce the SPICE simulation overhead in estimating memory failure probability. These sampling scheme provides 40x reduction in finding most probable failure point and 10x reduction in estimating failure probability using the SPICE simulations compared to the existing proposals. We then provide a novel method to create surrogate models of the memory margins with better extrapolation capability than the traditional regression methods. These models, based on Gaussian process regression, encode the sensitivity of the memory margins with respect to each individual threshold variation source in a one-dimensional kernel. We find that our proposed additive kernel based models have 32% smaller out-of-sample error (that is, better extrapolation capability outside training set) than using the six-dimensional universal kernel like Radial Basis Function (RBF). The thesis also explores the topological modifications to the SRAM bitcell to achieve faster read operation at the sub-threshold operating voltages. We present a ten-transistor SRAM bitcell that achieves 2x faster read operation than the existing ten-transistor sub-threshold SRAM bitcells, while ensuring similar noise margins. The SRAM bitcell provides 70% reduction in dynamic energy at the cost of 42% increase in the leakage energy per read operation. Finally, we investigate the energy efficiency of the eDRAM gain-cells as an alternative to the SRAM bitcells in the size-constrained IoT devices. We find that reducing their write path leakage current is the only way to reduce the read energy at Minimum Energy operation Point (MEP). Further, we study the effect of transistor up-sizing under the presence of threshold voltage variations on the mean MEP read energy by performing statistical analysis based on the ANOVA test of the full-factorial experimental design.Esta tesis presenta nuevos métodos basados en una combinación de técnicas estadísticas conocidas para la estimación rápida del rendimiento de la memoria y su aplicación en el diseño de memorias de energia eficiente de sub-umbral. La aparición de los dispositivos para el Internet de las cosas (IOT) y la proliferación del mercado portátil ha presentado el reto de lograr la máxima eficiencia energética por operación de estos dispositivos operados con baterias. La eficiencia de energía es posible si se considera la operacion por debajo del umbral de los circuitos. Sin embargo, la operación confiable de memoria es actualmente inalcanzable en estos bajos niveles de voltaje debido a márgenes de ruido de fuga del circuito de memoria, los cuales se pueden reducir aún más en presencia de variaciones randomicas de procesos. Los métodos estadísticos, que se presentan en esta tesis, hacen que la optimización del rendimiento de las memorias por debajo del umbral computacionalmente factible mediante la simulación SPICE. Presentamos nuevas modificaciones a las técnicas de muestreo estadístico que reducen la sobrecarga de simulación SPICE en la estimación de la probabilidad de fallo de memoria. Estos esquemas de muestreo proporciona una reducción de 40 veces en la búsqueda de puntos de fallo más probable, y 10 veces la reducción en la estimación de la probabilidad de fallo mediante las simulaciones SPICE en comparación con otras propuestas existentes. A continuación, se proporciona un método novedoso para crear modelos sustitutos de los márgenes de memoria con una mejor capacidad de extrapolación que los métodos tradicionales de regresión. Estos modelos, basados en el proceso de regresión Gaussiano, codifican la sensibilidad de los márgenes de memoria con respecto a cada fuente de variación de umbral individual en un núcleo de una sola dimensión. Los modelos propuestos, basados en kernel aditivos, tienen un error 32% menor que el error out-of-sample (es decir, mejor capacidad de extrapolación fuera del conjunto de entrenamiento) en comparacion con el núcleo universal de seis dimensiones como la función de base radial (RBF). La tesis también explora las modificaciones topológicas a la celda binaria SRAM para alcanzar velocidades de lectura mas rapidas dentro en el contexto de operaciones en el umbral de tensiones de funcionamiento. Presentamos una celda binaria SRAM de diez transistores que consigue aumentar en 2 veces la operación de lectura en comparacion con las celdas sub-umbral de SRAM de diez transistores existentes, garantizando al mismo tiempo los márgenes de ruido similares. La celda binaria SRAM proporciona una reducción del 70% en energía dinámica a costa del aumento del 42% en la energía de fuga por las operaciones de lectura. Por último, se investiga la eficiencia energética de las células de ganancia eDRAM como una alternativa a los bitcells SRAM en los dispositivos de tamaño limitado IOT. Encontramos que la reducción de la corriente de fuga en el path de escritura es la única manera de reducir la energía de lectura en el Punto Mínimo de Energía (MEP). Además, se estudia el efecto del transistor de dimensionamiento en virtud de la presencia de variaciones de voltaje de umbral en la media de energia de lecture MEP mediante el análisis estadístico basado en la prueba de ANOVA del diseño experimental factorial completo.Postprint (published version

    Replica Technique for Adaptive Refresh Timing of Gain-Cell-Embedded DRAM

    Get PDF
    Gain cells have recently been shown to be a viable alternative to static random access memory in low-power applications due to their low leakage currents and high density. The primary component of power consumption in these arrays is the dynamic power consumed during periodic refresh operations. Refresh timing is traditionally set according to a worst-case evaluation of retention time under extreme process variations, and worst-case access statistics, leading to frequent power-hungry refresh cycles. In this brief we present a replica technique for automatically tracking the retention time of a gain-cell-embedded dynamic-random-access-memory macrocell according to process variations and operating statistics, thereby reducing the data retention power of the array. A 2-kb array was designed and fabricated in a mature 0.18-mu m CMOS process, appropriate for integration in ultralow power applications, such as biomedical sensors. Measurements show efficient retention time tracking across a range of supply voltages and access statistics, lowering the refresh frequency by more than 5x, as compared with traditional worst-case design

    State of the art of memories from requirement to necessity

    Get PDF
    openIl progresso tecnologico nel campo computazionale ha portato all'esigenze di sistemi di storage sempre più efficienti. A seconda dell'esigenza si preferisce migliorare alcuni aspetti rispetto ad altri, come la velocità, lo spazione, la densità o il consumo. In questo documento illustro memorie volatili e non volatili, con una maggior attenzione per le Random Access Memory, sia statiche che dinamiche, andando ad analizzare i pregi ed i difetti delle recenti innovazioni

    Tangle: Route-oriented dynamic voltage minimization for variation-afflicted, energy-efficient on-chip networks

    Full text link
    On-chip networks are especially vulnerable to within-die pa-rameter variations. Since they connect distant parts of the chip, they need to be designed to work under the most unfavorable parameter values in the chip. This results in energy-inefficient designs. To improve the energy efficiency of on-chip networks, this paper presents a novel approach that relies on monitoring the errors of messages as they traverse the network. Based on the observed errors of messages, the system dynamically decreases or increases the voltage (Vdd) of groups of network routers. With this approach, called Tangle, the differentVdd val-ues applied to different groups of network routers progressively converge to their lowest, variation-aware, error-free values — always keeping the network frequency unchanged. This saves substantial network energy. In a simulated 64-router network with 4 Vdd domains, Tangle reduces the network energy con-sumption by an average of 22 % with negligible performance impact. In a future network design with one Vdd domain per router, Tangle lowers the network Vdd by an average of 21%, reducing the network energy consumption by an average of 28 % with negligible performance impact. 1
    corecore