2,170 research outputs found

    Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

    Full text link
    The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM

    Exploring performance and power properties of modern multicore chips via simple machine models

    Full text link
    Modern multicore chips show complex behavior with respect to performance and power. Starting with the Intel Sandy Bridge processor, it has become possible to directly measure the power dissipation of a CPU chip and correlate this data with the performance properties of the running code. Going beyond a simple bottleneck analysis, we employ the recently published Execution-Cache-Memory (ECM) model to describe the single- and multi-core performance of streaming kernels. The model refines the well-known roofline model, since it can predict the scaling and the saturation behavior of bandwidth-limited loop kernels on a multicore chip. The saturation point is especially relevant for considerations of energy consumption. From power dissipation measurements of benchmark programs with vastly different requirements to the hardware, we derive a simple, phenomenological power model for the Sandy Bridge processor. Together with the ECM model, we are able to explain many peculiarities in the performance and power behavior of multicore processors, and derive guidelines for energy-efficient execution of parallel programs. Finally, we show that the ECM and power models can be successfully used to describe the scaling and power behavior of a lattice-Boltzmann flow solver code.Comment: 23 pages, 10 figures. Typos corrected, DOI adde

    The "MIND" Scalable PIM Architecture

    Get PDF
    MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND architecture

    An Artificial Neural Networks based Temperature Prediction Framework for Network-on-Chip based Multicore Platform

    Get PDF
    Continuous improvement in silicon process technologies has made possible the integration of hundreds of cores on a single chip. However, power and heat have become dominant constraints in designing these massive multicore chips causing issues with reliability, timing variations and reduced lifetime of the chips. Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on the die. Typical DTM schemes only address core level thermal issues. However, the Network-on-chip (NoC) paradigm, which has emerged as an enabling methodology for integrating hundreds to thousands of cores on the same die can contribute significantly to the thermal issues. Moreover, the typical DTM is triggered reactively based on temperature measurements from on-chip thermal sensor requiring long reaction times whereas predictive DTM method estimates future temperature in advance, eliminating the chance of temperature overshoot. Artificial Neural Networks (ANNs) have been used in various domains for modeling and prediction with high accuracy due to its ability to learn and adapt. This thesis concentrates on designing an ANN prediction engine to predict the thermal profile of the cores and Network-on-Chip elements of the chip. This thermal profile of the chip is then used by the predictive DTM that combines both core level and network level DTM techniques. On-chip wireless interconnect which is recently envisioned to enable energy-efficient data exchange between cores in a multicore environment, will be used to provide a broadcast-capable medium to efficiently distribute thermal control messages to trigger and manage the DTM schemes

    Evaluation of temperature-performance trade-offs in wireless network-on-chip architectures

    Get PDF
    Continued scaling of device geometries according to Moore\u27s Law is enabling complete end-user systems on a single chip. Massive multicore processors are enablers for many information and communication technology (ICT) innovations spanning various domains, including healthcare, defense, and entertainment. In the design of high-performance massive multicore chips, power and heat are dominant constraints. Temperature hotspots witnessed in multicore systems exacerbate the problem of reliability in deep submicron technologies. Hence, there is a great need to explore holistic power and thermal optimization and management strategies for the massive multicore chips. High power consumption not only raises chip temperature and cooling cost, but also decreases chip reliability and performance. Thus, addressing thermal concerns at different stages of the design and operation is critical to the success of future generation systems. The performance of a multicore chip is also influenced by its overall communication infrastructure, which is predominantly a Network-on-Chip (NoC). The existing method of implementing a NoC with planar metal interconnects is deficient due to high latency, significant power consumption, and temperature hotspots arising out of long, multi-hop wireline links used in data exchange. On-chip wireless networks are envisioned as an enabling technology to design low power and high bandwidth massive multicore architectures. However, optimizing wireless NoCs for best performance does not necessarily guarantee a thermally optimal interconnection architecture. The wireless links being highly efficient attract very high traffic densities which in turn results in temperature hotspots. Therefore, while the wireless links result in better performance and energy-efficiency, they can also cause temperature hotspots and undermine the reliability of the system. Consequently, the location and utilization of the wireless links is an important factor in thermal optimization of high performance wireless Networks-on-Chip. Architectural innovation in conjunction with suitable power and thermal management strategies is the key for designing high performance yet energy-efficient massive multicore chips. This work contributes to exploration of various the design methodologies for establishing wireless NoC architectures that achieve the best trade-offs between temperature, performance and energy-efficiency. It further demonstrates that incorporating Dynamic Thermal Management (DTM) on a multicore chip designed with such temperature and performance optimized Wireless Network-on-Chip architectures improves thermal profile while simultaneously providing lower latency and reduced network energy dissipation compared to its conventional counterparts

    Temperature Evaluation of NoC Architectures and Dynamically Reconfigurable NoC

    Get PDF
    Advancements in the field of chip fabrication led to the integration of a large number of transistors in a small area, giving rise to the multi–core processor era. Massive multi–core processors facilitate innovation and research in the field of healthcare, defense, entertainment, meteorology and many others. Reduction in chip area and increase in the number of on–chip cores is accompanied by power and temperature issues. In high performance multi–core chips, power and heat are predominant constraints. High performance massive multicore systems suffer from thermal hotspots, exacerbating the problem of reliability in deep submicron technologies. High power consumption not only increases the chip temperature but also jeopardizes the integrity of the system. Hence, there is a need to explore holistic power and thermal optimization and management strategies for massive on–chip multi–core environments. In multi–core environments, the communication fabric plays a major role in deciding the efficiency of the system. In multi–core processor chips this communication infrastructure is predominantly a Network–on–Chip (NoC). Tradition NoC designs incorporate planar interconnects as a result these NoCs have long, multi–hop wireline links for data exchange. Due to the presence of multi–hop planar links such NoC architectures fall prey to high latency, significant power dissipation and temperature hotspots. Networks inspired from nature are envisioned as an enabling technology to achieve highly efficient and low power NoC designs. Adopting wireless technology in such architectures enhance their performance. Placement of wireless interconnects (WIs) alters the behavior of the network and hence a random deployment of WIs may not result in a thermally optimal solution. In such scenarios, the WIs being highly efficient would attract high traffic densities resulting in thermal hotspots. Hence, the location and utilization of the wireless links is a key factor in obtaining a thermal optimal highly efficient Network–on–chip. Optimization of the NoC framework alone is incapable of addressing the effects due to the runtime dynamics of the system. Minimal paths solely optimized for performance in the network may lead to excessive utilization of certain NoC components leading to thermal hotspots. Hence, architectural innovation in conjunction with suitable power and thermal management strategies is the key for designing high performance and energy–efficient multicore systems. This work contributes at exploring various wired and wireless NoC architectures that achieve best trade–offs between temperature, performance and energy–efficiency. It further proposes an adaptive routing scheme which factors in the thermal profile of the chip. The proposed routing mechanism dynamically reacts to the thermal profile of the chip and takes measures to avoid thermal hotspots, achieving a thermally efficient dynamically reconfigurable network on chip architecture
    • …
    corecore