20 research outputs found

    Impact on performance and energy of the retention time and processor frequency in L1 macrocell-based data caches

    Full text link
    [EN] Cache memories dissipate an important amount of the energy budget in current microprocessors. This is mainly due to cache cells are typically implemented with six transistors. To tackle this design concern, recent research has focused on the proposal of new cache cells. An n-bit cache cell, namely macrocell, has been proposed in a previous work. This cell combines SRAM and eDRAM technologies with the aim of reducing energy consumption while maintaining the performance. The capacitance of eDRAM cells impacts on energy consumption and performance since these cells lose their state once the retention time expires. On such a case, data must be fetched from a lower level of the memory hierarchy, so negatively impacting on performance and energy consumption. As opposite, if the capacitance is too high, energy would be wasted without bringing performance benefits. This paper identifies the optimal capacitance for a given processor frequency. To this end, the tradeoff between performance and energy consumption of a macrocell-based cache has been evaluated varying the capacitance and frequency. Experimental results show that, compared to a conventional cache, performance losses are lower than 2% and energy savings are up to 55% for a cache with 10 fF capacitors and frequencies higher than 1 GHz. In addition, using trench capacitors, a 4-bit macrocell reduces by 29% the area of four conventional SRAM cells.This work was supported in part by Spanish CICYT under Grant TIN2009-14475-C04-01, by Consolider-Ingenio 2010 under Grant CSD2006-00046, and by European community’s Seventh Framework Programme (FP7/2007-2013) under Grant 289154.Valero Bresó, A.; Sahuquillo Borrás, J.; Lorente Garcés, VJ.; Petit Martí, SV.; López Rodríguez, PJ.; Duato Marín, JF. (2012). Impact on performance and energy of the retention time and processor frequency in L1 macrocell-based data caches. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 20(6):1108-1117. https://doi.org/10.1109/TVLSI.2011.2142202S1108111720

    Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches

    Full text link
    (c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other worksSRAM and DRAM have been the predominant technologies used to implement memory cells in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology (eDRAM), thus overcoming the speed limit of typical DRAM cells. In this paper, we propose a hybrid n-bit macrocell that implements one SRAM cell and n-1 eDRAM cells. This cell is aimed at being used in an n-way set-associative first-level data cache. Architectural mechanisms (e.g., special writeback policies) have been devised to completely avoid refresh logic. Performance, energy, and area have been analyzed in detail. Experimental results show that using typical eDRAM capacitors, and compared to a conventional cache, a 4-way set-associative hybrid cache reduces both energy consumption and area up to 54 and 29 percent, respectively, while having negligible impact on performance (less than 2 percent).This work was supported by the Spanish Ministerio de Ciencia e Innovacion (MICINN), and jointly financed with Plan E funds under Grant TIN2009-14475-C04-01 and by Consolider-Ingenio 2010 under Grant CSD2006-00046.Valero Bresó, A.; Petit Martí, SV.; Sahuquillo Borrás, J.; López Rodríguez, PJ.; Duato Marín, JF. (2012). Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches. IEEE Transactions on Computers. 61(9):1231-1242. https://doi.org/10.1109/TC.2011.138S1231124261

    Robustness Comparison of Emerging Devices for Portable Applications

    Get PDF
    Extensive development in portable devices imposes pressing need for designing VLSI circuits with ultralow power (ULP) consumption. Subthreshold operating region is found to be an attractive solution for achieving ultralow power. However, it limits the circuit speed due to use of parasitic leakage current as drive current. Maintaining power dissipation at ultralow level with enhanced speed will further broaden the application area of subthreshold circuits even towards the field programmable gate arrays and real-time portable domain. Operating the Si-MOSFET in subthreshold regions degrades the circuit performance in terms of speed and also increases the well-designed circuit parameter spreading due to process, voltage, and temperature variations. This may cause the subthreshold circuit failure at very low supply voltage. It is essential to examine the robustness of most emerging devices against PVT variations. Therefore, this paper investigates and compares the performance of most promising upcoming devices like CNFET and DG FinFET in subthreshold regions. Effect of PVT variation on performance of CNFET and DG FinFET has been explored and it is found that CNFET is more robust than DG-FinFET under subthreshold conditions against PVT variations

    Programmable mixed-signal circuits

    Get PDF
    A novel concept for programmable mixed-signal circuits is presented based on programmable transmission gates. For implementation, memristively switching devices are suggested as the most promising candidates for realization of fast and small-footprint signal routing switches with small resistance and capacity. As a proof-of-concept, LT Spice simulations of digital and analogue example circuits implemented by the new concept are demonstrated. It is discussed how important design parameters can be tuned in the circuity. Compared to competing technologies such as Field Programmable Analogue Arrays or Application-Specific Integrated Circuits, the presented concept allows for development of ultra-flexible, reconfigurable, and cheap embedded mixed-signal circuits for applications where only limited space is available or high bandwidth is required

    Identification and Rejuvenation of NBTI-Critical Logic Paths in Nanoscale Circuits

    Get PDF
    The Negative Bias Temperature Instability (NBTI) phenomenon is agreed to be one of the main reliability concerns in nanoscale circuits. It increases the threshold voltage of pMOS transistors, thus, slows down signal propagation along logic paths between flip-flops. NBTI may cause intermittent faults and, ultimately, the circuit’s permanent functional failures. In this paper, we propose an innovative NBTI mitigation approach by rejuvenating the nanoscale logic along NBTI-critical paths. The method is based on hierarchical identification of NBTI-critical paths and the generation of rejuvenation stimuli using an Evolutionary Algorithm. A new, fast, yet accurate model for computation of NBTI-induced delays at gate-level is developed. This model is based on intensive SPICE simulations of individual gates. The generated rejuvenation stimuli are used to drive those pMOS transistors to the recovery phase, which are the most critical for the NBTI-induced path delay. It is intended to apply the rejuvenation procedure to the circuit, as an execution overhead, periodically. Experimental results performed on a set of designs demonstrate reduction of NBTI-induced delays by up to two times with an execution overhead of 0.1 % or less. The proposed approach is aimed at extending the reliable lifetime of nanoelectronics

    A Computing-in-Memory-based One-Class Hyperdimensional Computing Model for Outlier Detection

    Full text link
    In this work, we present ODHD, an algorithm for outlier detection based on hyperdimensional computing (HDC), a non-classical learning paradigm. Along with the HDC-based algorithm, we propose IM-ODHD, a computing-in-memory (CiM) implementation based on hardware/software (HW/SW) codesign for improved latency and energy efficiency. The training and testing phases of ODHD may be performed with conventional CPU/GPU hardware or our IM-ODHD, SRAM-based CiM architecture using the proposed HW/SW codesign techniques. We evaluate the performance of ODHD on six datasets from different application domains using three metrics, namely accuracy, F1 score, and ROC-AUC, and compare it with multiple baseline methods such as OCSVM, isolation forest, and autoencoder. The experimental results indicate that ODHD outperforms all the baseline methods in terms of these three metrics on every dataset for both CPU/GPU and CiM implementations. Furthermore, we perform an extensive design space exploration to demonstrate the tradeoff between delay, energy efficiency, and performance of ODHD. We demonstrate that the HW/SW codesign implementation of the outlier detection on IM-ODHD is able to outperform the GPU-based implementation of ODHD by at least 331.5x/889x in terms of training/testing latency (and on average 14.0x/36.9x in terms of training/testing energy consumption

    A New Hybrid Fault-Tolerant Architecture for Digital CMOS Circuits and Systems

    Full text link
    International audienceThis paper presents a new hybrid fault-tolerant architecture for robustness improvement of digital CMOS circuits and systems. It targets all kinds of errors in combinational part of logic circuits and thus, can be combined with advanced SEU protection techniques for sequential elements while reducing the power consumption. The proposed architecture combines different types of redundancies: information redundancy for error detection, temporal redundancy for soft error correction and hardware redundancy for hard error correction. Moreover, it uses a pseudo-dynamic comparator for SET and timing errors detection. Besides, the proposed method also aims to reduce power consumption of fault-tolerant architectures while keeping a comparable area overhead compared to existing solutions. Results on the largest ISCAS'85 and ITC'99 benchmark circuits show that our approach has an area cost of about 3 % to 6 % with a power consumption saving of about 33 % compared to TMR architectures

    An hybrid eDRAM/SRAM macrocell to implement first-level data caches

    Full text link
    SRAM and DRAM cells have been the predominant technologies used to implement memory cells in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology, thus overcoming the speed limit of typical DRAM cells. In this paper we propose an n-bit macrocell that implements one static cell, and n-1 dynamic cells. This cell is aimed at being used in an n-way set-associative first-level data cache. Our study shows that in a four-way set-associative cache with this macrocell compared to an SRAM based with the same capacity, leakage is reduced by about 75% and area more than half with a minimal impact on performance. Architectural mechanisms have also been devised to avoid refresh logic. Experimental results show that no performance is lost when the retention time is larger than 50K processor cycles. In addition, the proposed delayed writeback policy that avoids refreshing performs a similar amount of writebacks than a conventional cache with the same organization, so no power wasting is incurred
    corecore