1,762 research outputs found

    Discount Counting for Fast Flow Statistics on Flow Size and Flow Volume


    Microarchitectural techniques to reduce energy consumption in the memory hierarchy

    This thesis states that dynamic profiling of the memory reference stream can improve energy and performance in the memory hierarchy. The research presented in this thesis provides multiple instances of using lightweight hardware structures to profile the memory reference stream. The objective of this research is to develop microarchitectural techniques to reduce energy consumption at different levels of the memory hierarchy. Several simple and implementable techniques were developed as part of this research. One technique identifies and eliminates redundant refresh operations in DRAM, reducing DRAM refresh power. Another reduces leakage energy in L2 and higher-level caches for multiprocessor systems. The emphasis of this research has been on obtaining energy savings in caches using a simple hardware structure called the counting Bloom filter (CBF). CBFs have been used to predict L2 cache misses and obtain energy savings by not accessing the L2 cache on a predicted miss. A simple extension of this technique allows CBFs to perform way estimation in set-associative caches to reduce energy in cache lookups. Another CBF-based technique tracks addresses in a virtual cache and reduces false synonym lookups. Finally, this thesis presents a technique to reduce dynamic power consumption in level-one caches using significance compression. The significant energy and performance improvements demonstrated by these techniques suggest that this work will be of great value for designing the memory hierarchies of future computing platforms. Ph.D. Committee Chair: Lee, Hsien-Hsin S.; Committee Member: Chatterjee, Abhijit; Committee Member: Mukhopadhyay, Saibal; Committee Member: Pande, Santosh; Committee Member: Yalamanchili, Sudhaka
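The miss-prediction idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's hardware design: the counter count, the number of hash functions, and the hash construction are all assumptions chosen for illustration. Counters are incremented when a line is filled into the cache and decremented when it is evicted, so a zero counter guarantees the line is absent and the lookup can be skipped.

```python
class CountingBloomFilter:
    """Counting Bloom filter over cache-line addresses (illustrative sizes)."""

    def __init__(self, num_counters=1024, num_hashes=2):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, line_addr):
        # Hypothetical hash functions: hash a (seed, address) tuple per seed.
        n = len(self.counters)
        return [hash((seed, line_addr)) % n for seed in range(self.num_hashes)]

    def insert(self, line_addr):
        # Call when a line is filled into the cache.
        for i in self._indexes(line_addr):
            self.counters[i] += 1

    def remove(self, line_addr):
        # Call when a line is evicted; only valid for previously inserted lines.
        for i in self._indexes(line_addr):
            self.counters[i] -= 1

    def may_contain(self, line_addr):
        # False means "definitely not cached": the cache lookup can be skipped.
        # True means "possibly cached": false positives are allowed.
        return all(self.counters[i] > 0 for i in self._indexes(line_addr))


cbf = CountingBloomFilter()
cbf.insert(0x40)                    # line 0x40 is filled into the cache
hit_possible = cbf.may_contain(0x40)
cbf.remove(0x40)                    # line 0x40 is evicted
predicted_miss = not cbf.may_contain(0x40)
```

Because the filter never reports a cached line as absent, skipping the access on a predicted miss cannot affect correctness. A false positive only costs an access that would have happened anyway.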

    URSIM reference manual

    Technical report. Simulation has emerged as an important method for evaluating new ideas in both uniprocessor and multiprocessor architecture. Compared to building real hardware, simulation provides at least two advantages. First, it provides the flexibility to modify various architectural parameters and components and to analyze the benefits of such modifications. Second, simulation allows detailed statistics collection, providing a better understanding of the tradeoffs involved and facilitating further performance tuning.

    The Design of A High Capacity and Energy Efficient Phase Change Main Memory

    Higher energy efficiency has become essential in servers for reasons that range from heavy power and thermal constraints to environmental issues and financial savings. With main memory responsible for at least 30% of the energy consumed by a server, a low-power main memory is fundamental to achieving this energy efficiency. DRAM has been the technology of choice for main memory for the last three decades, primarily because it traditionally combined relatively low power, high performance, low cost and high density. However, with DRAM nearing its density limit, alternative low-power memory technologies, such as phase-change memory (PCM), have become a feasible replacement. PCM limitations, such as limited endurance and low write performance, preclude simple drop-in replacement and require new architectures and algorithms to be developed. A PCM main memory architecture (PMMA) is introduced in this dissertation, utilizing both DRAM and PCM, to create an energy-efficient main memory that is able to replace a DRAM-only memory. PMMA utilizes a number of techniques and architectural changes to achieve a level of performance that is on par with DRAM. PMMA achieves gains in energy-delay of up to 65%, with less than 5% performance loss and extremely high energy gains. To address the other major shortcoming of PCM, namely limited endurance, a novel, low-overhead wear-leveling algorithm that builds on PMMA is proposed. It increases the lifetime of PMMA to match the expected server lifetime, so that both server and memory subsystems become obsolete at about the same time. We also study how to better use the excess capacity traditionally available on PCM devices to obtain the highest lifetime possible. We show that under specific endurance distributions, the naive choice does not achieve the highest lifetime. We devise rules that empower the designer to select algorithms and parameters to achieve higher lifetime, or to simplify the design knowing the impact on the lifetime. The techniques presented also apply to other storage class memories (SCM) that suffer from limited endurance.

    Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches

    (c) 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
    SRAM and DRAM have been the predominant technologies used to implement memory cells in computer systems, each one having its advantages and shortcomings. SRAM cells are faster and require no refresh since reads are not destructive. In contrast, DRAM cells provide higher density and minimal leakage energy since there are no paths within the cell from Vdd to ground. Recently, DRAM cells have been embedded in logic-based technology (eDRAM), thus overcoming the speed limit of typical DRAM cells. In this paper, we propose a hybrid n-bit macrocell that implements one SRAM cell and n-1 eDRAM cells. This cell is aimed at being used in an n-way set-associative first-level data cache. Architectural mechanisms (e.g., special writeback policies) have been devised to completely avoid refresh logic. Performance, energy, and area have been analyzed in detail. Experimental results show that, using typical eDRAM capacitors and compared to a conventional cache, a 4-way set-associative hybrid cache reduces energy consumption and area by up to 54 and 29 percent, respectively, while having negligible impact on performance (less than 2 percent).
    This work was supported by the Spanish Ministerio de Ciencia e Innovacion (MICINN), and jointly financed with Plan E funds under Grant TIN2009-14475-C04-01 and by Consolider-Ingenio 2010 under Grant CSD2006-00046.
    Valero Bresó, A.; Petit Martí, SV.; Sahuquillo Borrás, J.; López Rodríguez, PJ.; Duato Marín, JF. (2012). Design, performance, and energy consumption of eDRAM/SRAM macrocells for L1 data caches. IEEE Transactions on Computers. 61(9):1231-1242. https://doi.org/10.1109/TC.2011.138

    Programming Persistent Memory

    Beginning and experienced programmers will use this comprehensive guide to persistent memory programming. You will understand how persistent memory brings together several new software/hardware requirements, and offers great promise for better performance and faster application startup times, a huge leap forward in byte-addressable capacity compared with current DRAM offerings. This revolutionary new technology gives applications significant performance and capacity improvements over existing technologies. It requires a new way of thinking and developing, which makes this highly disruptive to the IT/computing industry. The full spectrum of industry sectors that will benefit from this technology include, but are not limited to, in-memory and traditional databases, AI, analytics, HPC, virtualization, and big data. Programming Persistent Memory describes the technology and why it is exciting the industry. It covers the operating system and hardware requirements as well as how to create development environments using emulated or real persistent memory hardware. The book explains fundamental concepts; provides an introduction to persistent memory programming APIs for C, C++, JavaScript, and other languages; discusses RDMA with persistent memory; reviews security features; and presents many examples. Source code and examples that you can run on your own systems are included.
    What You’ll Learn: Understand what persistent memory is, what it does, and the value it brings to the industry. Become familiar with the operating system and hardware requirements to use persistent memory. Know the fundamentals of persistent memory programming: why it is different from current programming methods, and what developers need to keep in mind when programming for persistence. Look at persistent memory application development by example using the Persistent Memory Development Kit (PMDK). Design and optimize data structures for persistent memory. Study how real-world applications are modified to leverage persistent memory. Utilize the tools available for persistent memory programming, application performance profiling, and debugging.
    Who This Book Is For: C, C++, Java, and Python developers. The book will also be useful to software, cloud, and hardware architects across a broad spectrum of sectors, including cloud service providers, independent software vendors, high performance compute, artificial intelligence, data analytics, big data, etc.