Search CORE

1,647 research outputs found

Recommended from our members

Reliabilibity Aware Thermal Management of Real-time Multi-core Systems

Author: Xu Shikang
Publication venue: ScholarWorks@UMass Amherst
Publication date: 18/03/2015
Field of study

Continued scaling of CMOS technology has led to increasing working temperature of VLSI circuits. High temperature brings a greater probability of permanent errors (failure) in VLSI circuits, which is a critical threat for real-time systems. As the multi-core architecture is gaining in popularity, this research proposes an adaptive workload assignment approach for multi-core real-time systems to balance thermal stress among cores. While previously developed scheduling algorithms use temperature as the criterion, the proposed algorithm uses reliability of each core in the system to dynamically assign tasks to cores. The simulation results show that the proposed algorithm gains as large as 10% benefit in system reliability compared with commonly used static assignment while algorithms using temperature as criterion gain 4%. The reliability difference between cores, which indicates the imbalance of thermal stress on each core, is as large as 25 times smaller when proposed algorithm is applied

ScholarWorks@UMass Amherst

Recommended from our members

A Dynamic Reconfiguration Framework to Maximize Performance/Power in Asymmetric Multicore Processors

Author: Annamalai Arunachalam
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2013
Field of study

Recent trends in technology scaling have shifted the processing paradigm to multicores. Depending on the characteristics of the cores, the multicores can be either symmetric or asymmetric. Prior research has shown that Asymmetric Multicore Processors (AMPs) outperform their symmetric (SMP) counterparts within a given resource and power budget. But, due to the heterogeneity in core-types and time-varying workload behavior, thread-to-core assignment is always a challenge in AMPs. As the computational requirements vary significantly across different applications and with time, there is a need to dynamically allocate appropriate computational resources on demand to suit the applications’ current needs, in order to maximize the performance and minimize the energy consumption. Performance/power of the applications could be further increased by dynamically adapting the voltage and frequency of the cores to better fit the changing characteristics of the workloads. Not only can a core be forced to a low power mode when its activity level is low, but the power saved by doing so could be opportunistically re-budgeted to the other cores to boost the overall system throughput. To this end, we propose a novel solution that seamlessly combines heterogeneity with a Dynamic Reconfiguration Framework (DRF). The proposed dynamic reconfiguration framework is equipped with Dynamic Resource Allocation (DRA) and Voltage/Frequency Adaptation (DVFA) capabilities to adapt the core resources and operating conditions at runtime to the changing demands of the applications. As a proof of concept, we illustrate our proposed approach using a dual-core AMP and demonstrate significant performance/power benefits over various baselines

ScholarWorks@UMass Amherst

IMPROVING THE PERFORMANCE AND ENERGY EFFICIENCY OF EMERGING MEMORY SYSTEMS

Author: Guo Yuhua
Publication venue: VCU Scholars Compass
Publication date: 01/01/2018
Field of study

Modern main memory is primarily built using dynamic random access memory (DRAM) chips. As DRAM chip scales to higher density, there are mainly three problems that impede DRAM scalability and performance improvement. First, DRAM refresh overhead grows from negligible to severe, which limits DRAM scalability and causes performance degradation. Second, although memory capacity has increased dramatically in past decade, memory bandwidth has not kept pace with CPU performance scaling, which has led to the memory wall problem. Third, DRAM dissipates considerable power and has been reported to account for as much as 40% of the total system energy and this problem exacerbates as DRAM scales up. To address these problems, 1) we propose Rank-level Piggyback Caching (RPC) to alleviate DRAM refresh overhead by servicing memory requests and refresh operations in parallel; 2) we propose a high performance and bandwidth efficient approach, called SELF, to breaking the memory bandwidth wall by exploiting die-stacked DRAM as a part of memory; 3) we propose a cost-effective and energy-efficient architecture for hybrid memory systems composed of high bandwidth memory (HBM) and phase change memory (PCM), called Dual Role HBM (DR-HBM). In DR-HBM, hot pages are tracked at a cost-effective way and migrated to the HBM to improve performance, while cold pages are stored at the PCM to save energy

VCU Scholars Compass

Recommended from our members

Dynamic Processor Reconfiguration for Power, Performance and Reliability Management

Author: Srinivasan Sudarshan
Publication venue: ScholarWorks@UMass Amherst
Publication date: 10/11/2016
Field of study

Technology advancements allowed more transistors to be packed in a smaller area, while the improved performance helped in achieving higher clock frequencies. This, unfortunately led to a power density problem, forcing processor industry to lower the clock frequency and integrate multiple cores on the same die. Depending on core characteristics, the multiple cores in the die could be symmetric or asymmetric. Asymmetric multi-core processors (AMPs) have been proposed as an alternative to symmetric multi-cores to improve power efficiency. AMPs comprise of cores that implement the same ISA, but differ in performance and power characteristics due to varying sizes of micro-architectural resources. As the computational bottleneck of a workload shifts from one resource to another during its course of execution, reassigning it to another core (where it runs more efficiently), can improve the overall power efficiency. Thus achieving high power efficiency in AMPs requires (i) a diverse set of cores that are optimized for various program phases, (ii) runtime analysis to determine the best core to run on, and (iii) low overhead of re-assigning a thread to a different core type. Decisions to swap threads between AMPs are made at coarse grain granularity of millions of instructions, to mitigate the impact of thread migration overhead. But the computational needs of the program rapidly change during the course of its execution. The best core configuration for an application such that, both power consumption and performance are optimized, changes over time rapidly at fine granularity of thousands of instructions. This dissertation explores ways to design core micro-architecture such that high power efficiency could be achieved, if switching overhead could be lowered, enabling fine grain switching. To take advantage of power saving opportunities at fine grain granularity, this thesis explores reconfigurable/morphable architectures where core resources are reconfigured on demand to suit the needs of the executing application. At first, we explore reconfigurable architectures consisting of two kinds of cores: out-of-order (OOO) big cores and in-order (InO) small cores. The big cores provide higher performance while the small cores are more power efficient. In this proposed architecture, OOO core reconfigures into InO core at run time. Our proposed online management scheme decides to switch between these core types such that we obtain significant power benefits without impacting performance. We also observe that, resource requirements of applications can be quite diverse and consequently, resource bottlenecks or excesses can vary considerably. Thus, reconfiguration between just two core modes may not fully exploit power and performance improvement opportunities. We therefore, explore reconfigurable architectures consisting of diverse core types that not limited to big and little cores. A single core can reconfigure into multiple core modes where each mode has unique power and performance characteristics. Workload performance on a particular core mode depends on a large set of processor resources. Some workloads are highly memory intensive, some exhibit large instruction dependency, some experience high rates of branch mis-prediction, while other workloads exhibit large exploitable instruction level parallelism. A diverse set of core modes is needed, that could address shifting resource needs during various program phases of an application. Different trade-offs in power and performance could be achieved by reducing or expanding the size of various resource. Trade-offs for each core mode are also affected by operating voltage and frequency. We therefore, propose joint core resource resizing with dynamic voltage and frequency scaling (DVFS), which is important for applications whose performance is sensitive to changes in frequency. Thus, at fine granularity, the core should adapt to varying instruction window sizes, execution bandwidth and frequency to meet the demands of the workload at run-time to improve power efficiency. Many current processors employ DVFS aggressively to improve power efficiency and maximize performance. This dissertation studies the tradeoff in power efficiency in using fine grain DVFS and reconfigurable architectures mentioned above.We also explore another important problem due to continued scaling of devices which results in higher vulnerability to soft-errors. We consider dynamic core reconfiguration from the perspectives of both power efficiency and vulnerability to soft-errors. An online management scheme is proposed such that core reconfiguration upon a thread switch not only improves power efficiency but also does not increase the vulnerability to soft errors. In summary, we propose in this thesis several solutions for improving power efficiency by integrating heterogeneity within the core. We also address how popular power reduction techniques like DVFS are comparable to our approach. Finally, we address reliability challenges along with improving power efficiency

ScholarWorks@UMass Amherst

DELICIOUS: Deadline-Aware Approximate Computing in Cache-Conscious Multicore

Author: Agarwal Sukarn
Chakraborty Shounak
Gangopadhyay Rahul
McDonald-Maier Klaus
Saha Sangeet
Själander Magnus
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/12/2022
Field of study

Edinburgh Research Explorer

Huddersfield Research Portal

The Design of A High Capacity and Energy Efficient Phase Change Main Memory

Author: Ferreira Alexandre Peixoto
Publication venue
Publication date: 22/06/2011
Field of study

Higher energy-efficiency has become essential in servers for a variety of reasons that range from heavy power and thermal constraints, environmental issues and financial savings. With main memory responsible for at least 30% of the energy consumed by a server, a low power main memory is fundamental to achieving this energy efficiency DRAM has been the technology of choice for main memory for the last three decades primarily because it traditionally combined relatively low power, high performance, low cost and high density. However, with DRAM nearing its density limit, alternative low-power memory technologies, such as Phase-change memory (PCM), have become a feasible replacement. PCM limitations, such as limited endurance and low write performance, preclude simple drop-in replacement and require new architectures and algorithms to be developed. A PCM main memory architecture (PMMA) is introduced in this dissertation, utilizing both DRAM and PCM, to create an energy-efficient main memory that is able to replace a DRAM-only memory. PMMA utilizes a number of techniques and architectural changes to achieve a level of performance that is par with DRAM. PMMA achieves gains in energy-delay of up to 65%, with less than 5% of performance loss and extremely high energy gains. To address the other major shortcoming of PCM, namely limited endurance, a novel, low- overhead wear-leveling algorithm that builds on PMMA is proposed that increases the lifetime of PMMA to match the expected server lifetime so that both server and memory subsystems become obsolete at about the same time. We also study how to better use the excess capacity, traditionally available on PCM devices, to obtain the highest lifetime possible. We show that under specific endurance distributions, the naive choice does not achieve the highest lifetime. We devise rules that empower the designer to select algorithms and parameters to achieve higher lifetime or simplify the design knowing the impact on the lifetime. The techniques presented also apply to other storage class memories (SCM) memories that suffer from limited endurance

D-Scholarship@Pitt