118 research outputs found

    TSV placement optimization for liquid cooled 3D-ICs with emerging NVMs

    Get PDF
    Three dimensional integrated circuits (3D-ICs) are a promising solution to the performance bottleneck in planar integrated circuits. One of the salient features of 3D-ICs is their ability to integrate heterogeneous technologies such as emerging non-volatile memories (NVMs) in a single chip. However, thermal management in 3D-ICs is a significant challenge, owing to the high heat flux (~ 250 W/cm2). Several research groups have focused either on run-time or design-time mechanisms to reduce the heat flux and did not consider 3D-ICs with heterogeneous stacks. The goal of this work is to achieve a balanced thermal gradient in 3D-ICs, while reducing the peak temperatures. In this research, placement algorithms for design-time optimization and choice of appropriate cooling mechanisms for run-time modulation of temperature are proposed. Specifically, an architectural framework which introduce weight-based simulated annealing (WSA) algorithm for thermal-aware placement of through silicon vias (TSVs) with inter-tier liquid cooling is proposed for design-time. In addition, integrating a dedicated stack of emerging NVMs such as RRAM, PCRAM and STTRAM, a run-time simulation framework is developed to analyze the thermal and performance impact of these NVMs in 3D-MPSoCs with inter-tier liquid cooling. Experimental results of WSA algorithm implemented on MCNC91 and GSRC benchmarks demonstrate up to 11 K reduction in the average temperature across the 3D-IC chip. In addition, power density arrangement in WSA improved the uniformity by 5%. Furthermore, simulation results of PARSEC benchmarks with NVM L2 cache demonstrates a temperature reduction of 12.5 K (RRAM) compared to SRAM in 3D-ICs. Especially, RRAM has proved to be thermally efficient replacement for SRAM with 34% lower energy delay product (EDP) and 9.7 K average temperature reduction

    Gestión de jerarquías de memoria híbridas a nivel de sistema

    Get PDF
    Tesis inédita de la Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadoras y Automática y de Ku Leuven, Arenberg Doctoral School, Faculty of Engineering Science, leída el 11/05/2017.In electronics and computer science, the term ‘memory’ generally refers to devices that are used to store information that we use in various appliances ranging from our PCs to all hand-held devices, smart appliances etc. Primary/main memory is used for storage systems that function at a high speed (i.e. RAM). The primary memory is often associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary memory but also other purposes in computers and other digital electronic devices. The secondary/auxiliary memory, in comparison provides program and data storage that is slower to access but offers larger capacity. Examples include external hard drives, portable flash drives, CDs, and DVDs. These devices and media must be either plugged in or inserted into a computer in order to be accessed by the system. Since secondary storage technology is not always connected to the computer, it is commonly used for backing up data. The term storage is often used to describe secondary memory. Secondary memory stores a large amount of data at lesser cost per byte than primary memory; this makes secondary storage about two orders of magnitude less expensive than primary storage. There are two main types of semiconductor memory: volatile and nonvolatile. Examples of non-volatile memory are ‘Flash’ memory (sometimes used as secondary, sometimes primary computer memory) and ROM/PROM/EPROM/EEPROM memory (used for firmware such as boot programs). Examples of volatile memory are primary memory (typically dynamic RAM, DRAM), and fast CPU cache memory (typically static RAM, SRAM, which is fast but energy-consuming and offer lower memory capacity per are a unit than DRAM). Non-volatile memory technologies in Si-based electronics date back to the 1990s. Flash memory is widely used in consumer electronic products such as cellphones and music players and NAND Flash-based solid-state disks (SSDs) are increasingly displacing hard disk drives as the primary storage device in laptops, desktops, and even data centers. The integration limit of Flash memories is approaching, and many new types of memory to replace conventional Flash memories have been proposed. The rapid increase of leakage currents in Silicon CMOS transistors with scaling poses a big challenge for the integration of SRAM memories. There is also the case of susceptibility to read/write failure with low power schemes. As a result of this, over the past decade, there has been an extensive pooling of time, resources and effort towards developing emerging memory technologies like Resistive RAM (ReRAM/RRAM), STT-MRAM, Domain Wall Memory and Phase Change Memory(PRAM). Emerging non-volatile memory technologies promise new memories to store more data at less cost than the expensive-to build silicon chips used by popular consumer gadgets including digital cameras, cell phones and portable music players. These new memory technologies combine the speed of static random-access memory (SRAM), the density of dynamic random-access memory (DRAM), and the non-volatility of Flash memory and so become very attractive as another possibility for future memory hierarchies. The research and information on these Non-Volatile Memory (NVM) technologies has matured over the last decade. These NVMs are now being explored thoroughly nowadays as viable replacements for conventional SRAM based memories even for the higher levels of the memory hierarchy. Many other new classes of emerging memory technologies such as transparent and plastic, three-dimensional(3-D), and quantum dot memory technologies have also gained tremendous popularity in recent years...En el campo de la informática, el término ‘memoria’ se refiere generalmente a dispositivos que son usados para almacenar información que posteriormente será usada en diversos dispositivos, desde computadoras personales (PC), móviles, dispositivos inteligentes, etc. La memoria principal del sistema se utiliza para almacenar los datos e instrucciones de los procesos que se encuentre en ejecución, por lo que se requiere que funcionen a alta velocidad (por ejemplo, DRAM). La memoria principal está implementada habitualmente mediante memorias semiconductoras direccionables, siendo DRAM y SRAM los principales exponentes. Por otro lado, la memoria auxiliar o secundaria proporciona almacenaje(para ficheros, por ejemplo); es más lenta pero ofrece una mayor capacidad. Ejemplos típicos de memoria secundaria son discos duros, memorias flash portables, CDs y DVDs. Debido a que estos dispositivos no necesitan estar conectados a la computadora de forma permanente, son muy utilizados para almacenar copias de seguridad. La memoria secundaria almacena una gran cantidad de datos aun coste menor por bit que la memoria principal, siendo habitualmente dos órdenes de magnitud más barata que la memoria primaria. Existen dos tipos de memorias de tipo semiconductor: volátiles y no volátiles. Ejemplos de memorias no volátiles son las memorias Flash (algunas veces usadas como memoria secundaria y otras veces como memoria principal) y memorias ROM/PROM/EPROM/EEPROM (usadas para firmware como programas de arranque). Ejemplos de memoria volátil son las memorias DRAM (RAM dinámica), actualmente la opción predominante a la hora de implementar la memoria principal, y las memorias SRAM (RAM estática) más rápida y costosa, utilizada para los diferentes niveles de cache. Las tecnologías de memorias no volátiles basadas en electrónica de silicio se remontan a la década de1990. Una variante de memoria de almacenaje por carga denominada como memoria Flash es mundialmente usada en productos electrónicos de consumo como telefonía móvil y reproductores de música mientras NAND Flash solid state disks(SSDs) están progresivamente desplazando a los dispositivos de disco duro como principal unidad de almacenamiento en computadoras portátiles, de escritorio e incluso en centros de datos. En la actualidad, hay varios factores que amenazan la actual predominancia de memorias semiconductoras basadas en cargas (capacitivas). Por un lado, se está alcanzando el límite de integración de las memorias Flash, lo que compromete su escalado en el medio plazo. Por otra parte, el fuerte incremento de las corrientes de fuga de los transistores de silicio CMOS actuales, supone un enorme desafío para la integración de memorias SRAM. Asimismo, estas memorias son cada vez más susceptibles a fallos de lectura/escritura en diseños de bajo consumo. Como resultado de estos problemas, que se agravan con cada nueva generación tecnológica, en los últimos años se han intensificado los esfuerzos para desarrollar nuevas tecnologías que reemplacen o al menos complementen a las actuales. Los transistores de efecto campo eléctrico ferroso (FeFET en sus siglas en inglés) se consideran una de las alternativas más prometedores para sustituir tanto a Flash (por su mayor densidad) como a DRAM (por su mayor velocidad), pero aún está en una fase muy inicial de su desarrollo. Hay otras tecnologías algo más maduras, en el ámbito de las memorias RAM resistivas, entre las que cabe destacar ReRAM (o RRAM), STT-RAM, Domain Wall Memory y Phase Change Memory (PRAM)...Depto. de Arquitectura de Computadores y AutomáticaFac. de InformáticaTRUEunpu

    Reconfigurable RRAM-based computing: A Case study for reliability enhancement

    Get PDF
    Emerging hybrid-CMOS nanoscale devices and architectures offer greater degree of integration and performance capabilities. However, the high power densities, hard error frequency, process variations, and device wearout affect the overall system reliability. Reactive design techniques, such as redundancy, account for component failures by mitigating them to prevent system failures. These techniques incur high area and power overhead. This research focuses on exploring hybrid CMOS/Resistive RAM (RRAM) architectures that enhance the system reliability by performing computation in RRAM cache whenever CMOS logic units fail, essentially masking the area overhead of redundant logic when not in use. The proposed designs are validated using the Gem5 performance simulator and McPAT power simulator running single-core SPEC2006 benchmarks and multi-core PARSEC benchmarks. The simulation results are used to evaluate the efficacy of reliability enhancement techniques using RRAM. The average runtime when using RRAM for functional unit replacement was between ~1.5 and ~2.5 times longer than the baseline for a single-core architecture, ~1.25 and ~2 times longer for an 8-core architecture, and ~1.2 and ~1.5 times longer for a 16-core architecture. Average energy consumption when using RRAM for functional unit replacement was between ~2 and ~5 times more than the baseline for a single-core architecture, and ~1.25 and ~2.75 times more for multi-core architectures. The performance degradation and energy consumption increase is justified by the prevention of system failure and enhanced reliability. Overall, the proposed architecture shows promise for use in multi-core systems. Average performance degradation decreases as more cores are used due to more total functional units being available, preventing a slow RRAM functional unit from becoming a bottleneck

    Evaluation of STT-MRAM main memory for HPC and real-time systems

    Get PDF
    It is questionable whether DRAM will continue to scale and will meet the needs of next-generation systems. Therefore, significant effort is invested in research and development of novel memory technologies. One of the candidates for nextgeneration memory is Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM is an emerging non-volatile memory with a lot of potential that could be exploited for various requirements of different computing systems. Being a novel technology, STT-MRAM devices are already approaching DRAM in terms of capacity, frequency and device size. Special STT-MRAM features such as intrinsic radiation hardness, non-volatility, zero stand-by power and capability to function in extreme temperatures also make it particularly suitable for aerospace, avionics and automotive applications. Despite of being a conceivable alternative for main memory technology, to this day, academic research of STT-MRAM main memory remains marginal. This is mainly due to the unavailability of publicly available detailed timing parameters of this novel technology, which are required to perform a cycle accurate main memory simulation. Some researchers adopt simplistic memory models to simulate main memory, but such models can introduce significant errors in the analysis of the overall system performance. Therefore, detailed timing parameters are a must-have for any evaluation or architecture exploration study of STT-MRAM main memory. These detailed parameters are not publicly available because STT-MRAM manufacturers are reluctant to release any delicate information on the technology. This thesis demonstrates an approach to perform a cycle accurate simulation of STT-MRAM main memory, being the first to release detailed timing parameters of this technology from academia, essentially enabling researchers to conduct reliable system level simulation of STT-MRAM using widely accepted existing simulation infrastructure. Our results show that, in HPC domain STT-MRAM provide performance comparable to DRAM. Results from the power estimation indicates that STT-MRAM power consumption increases significantly for Activation/Precharge power while Burst power increases moderately and Background power does not deviate much from DRAM. The thesis includes detailed STT-MRAM main memory timing parameters to the main repositories of DramSim2 and Ramulator, two of the most widely used and accepted state-of-the-art main memory simulators. The STT-MRAM timing parameters that has been originated as a part of this thesis, are till date the only reliable and publicly available timing information on this memory technology published from academia. Finally, the thesis analyzes the feasibility of using STT-MRAM in real-time embedded systems by investigating STT-MRAM main memory impact on average system performance and WCET. STT-MRAM's suitability for the real-time embedded systems is validated on benchmarks provided by the European Space Agency (ESA), EEMBC Autobench and MediaBench suite by analyzing performance and WCET impact. In quantitative terms, our results show that STT-MRAM main memory in real-time embedded systems provides performance and WCET comparable to conventional DRAM, while opening up opportunities to exploit various advantages.Es cuestionable si DRAM continuará escalando y cumplirá con las necesidades de los sistemas de la próxima generación. Por lo tanto, se invierte un esfuerzo significativo en la investigación y el desarrollo de nuevas tecnologías de memoria. Uno de los candidatos para la memoria de próxima generación es la Spin-Transfer Torque Magnetic Random Access Memory (STT-MRAM). STT-MRAM es una memoria no volátil emergente con un gran potencial que podría ser explotada para diversos requisitos de diferentes sistemas informáticos. Al ser una tecnología novedosa, los dispositivos STT-MRAM ya se están acercando a la DRAM en términos de capacidad, frecuencia y tamaño del dispositivo. Las características especiales de STTMRAM, como la dureza intrínseca a la radiación, la no volatilidad, la potencia de reserva cero y la capacidad de funcionar en temperaturas extremas, también lo hacen especialmente adecuado para aplicaciones aeroespaciales, de aviónica y automotriz. A pesar de ser una alternativa concebible para la tecnología de memoria principal, hasta la fecha, la investigación académica de la memoria principal de STT-MRAM sigue siendo marginal. Esto se debe principalmente a la falta de disponibilidad de los parámetros de tiempo detallados públicamente disponibles de esta nueva tecnología, que se requieren para realizar un ciclo de simulación de memoria principal precisa. Algunos investigadores adoptan modelos de memoria simplistas para simular la memoria principal, pero tales modelos pueden introducir errores significativos en el análisis del rendimiento general del sistema. Por lo tanto, los parámetros de tiempo detallados son indispensables para cualquier evaluación o estudio de exploración de la arquitectura de la memoria principal de STT-MRAM. Estos parámetros detallados no están disponibles públicamente porque los fabricantes de STT-MRAM son reacios a divulgar información delicada sobre la tecnología. Esta tesis demuestra un enfoque para realizar un ciclo de simulación precisa de la memoria principal de STT-MRAM, siendo el primero en lanzar parámetros de tiempo detallados de esta tecnología desde la academia, lo que esencialmente permite a los investigadores realizar una simulación confiable a nivel de sistema de STT-MRAM utilizando una simulación existente ampliamente aceptada infraestructura. Nuestros resultados muestran que, en el dominio HPC, STT-MRAM proporciona un rendimiento comparable al de la DRAM. Los resultados de la estimación de potencia indican que el consumo de potencia de STT-MRAM aumenta significativamente para la activation/Precharge power, mientras que la Burst power aumenta moderadamente y la Background power no se desvía mucho de la DRAM. La tesis incluye parámetros detallados de temporización memoria principal de STT-MRAM a los repositorios principales de DramSim2 y Ramulator, dos de los simuladores de memoria principal más avanzados y más utilizados y aceptados. Los parámetros de tiempo de STT-MRAM que se han originado como parte de esta tesis, son hasta la fecha la única información de tiempo confiable y disponible al público sobre esta tecnología de memoria publicada desde la academia. Finalmente, la tesis analiza la viabilidad de usar STT-MRAM en real-time embedded systems mediante la investigación del impacto de la memoria principal de STT-MRAM en el rendimiento promedio del sistema y WCET. La idoneidad de STTMRAM para los real-time embedded systems se valida en los applicaciones proporcionados por la European Space Agency (ESA), EEMBC Autobench y MediaBench, al analizar el rendimiento y el impacto de WCET. En términos cuantitativos, nuestros resultados muestran que la memoria principal de STT-MRAM en real-time embedded systems proporciona un desempeño WCET comparable al de una memoria DRAM convencional, al tiempo que abre oportunidades para explotar varias ventajas

    The Landscape of Compute-near-memory and Compute-in-memory: A Research and Commercial Overview

    Full text link
    In today's data-centric world, where data fuels numerous application domains, with machine learning at the forefront, handling the enormous volume of data efficiently in terms of time and energy presents a formidable challenge. Conventional computing systems and accelerators are continually being pushed to their limits to stay competitive. In this context, computing near-memory (CNM) and computing-in-memory (CIM) have emerged as potentially game-changing paradigms. This survey introduces the basics of CNM and CIM architectures, including their underlying technologies and working principles. We focus particularly on CIM and CNM architectures that have either been prototyped or commercialized. While surveying the evolving CIM and CNM landscape in academia and industry, we discuss the potential benefits in terms of performance, energy, and cost, along with the challenges associated with these cutting-edge computing paradigms
    corecore