79,299 research outputs found
Dynamic Power Management of High Performance Network on Chip
With increased density of modern System on Chip(SoC) communication between nodes has become a major problem. Network on Chip is a novel on chip communication paradigm to solve this by using highly scalable and efficient packet switched network. The addition of intelligent networking on the chip adds to the chip’s power consumption thus making management of communication power an interesting and challenging research problem. While VLSI techniques have evolved over time to enable power reduction in the circuit level, the highly dynamic nature of modern large SoC demand more than that. This dissertation explores some innovative dynamic solutions to manage the ever increasing communication power in the post sub-micron era.
Today’s highly integrated SoCs require great level of cross layer optimizations to provide maximum efficiency. This dissertation aims at the dynamic power management problem from top. Starting with a system level distribution and management down to microarchitecture enhancements were found necessary to deliver maximum power efficiency. A distributed power budget sharing technique is proposed. To efficiently satisfy the established power budget, a novel flow control and throttling technique is proposed. Finally power efficiency of underlying microarchitecture is explored and novel buffer and link management techniques are developed.
All of the proposed techniques yield improvement in power-performance efficiency of the NoC infrastructure
Evaluation of temperature-performance trade-offs in wireless network-on-chip architectures
Continued scaling of device geometries according to Moore\u27s Law is enabling complete end-user systems on a single chip. Massive multicore processors are enablers for many information and communication technology (ICT) innovations spanning various domains, including healthcare, defense, and entertainment. In the design of high-performance massive multicore chips, power and heat are dominant constraints. Temperature hotspots witnessed in multicore systems exacerbate the problem of reliability in deep submicron technologies. Hence, there is a great need to explore holistic power and thermal optimization and management strategies for the massive multicore chips. High power consumption not only raises chip temperature and cooling cost, but also decreases chip reliability and performance. Thus, addressing thermal concerns at different stages of the design and operation is critical to the success of future generation systems. The performance of a multicore chip is also influenced by its overall communication infrastructure, which is predominantly a Network-on-Chip (NoC). The existing method of implementing a NoC with planar metal interconnects is deficient due to high latency, significant power consumption, and temperature hotspots arising out of long, multi-hop wireline links used in data exchange. On-chip wireless networks are envisioned as an enabling technology to design low power and high bandwidth massive multicore architectures. However, optimizing wireless NoCs for best performance does not necessarily guarantee a thermally optimal interconnection architecture. The wireless links being highly efficient attract very high traffic densities which in turn results in temperature hotspots. Therefore, while the wireless links result in better performance and energy-efficiency, they can also cause temperature hotspots and undermine the reliability of the system. Consequently, the location and utilization of the wireless links is an important factor in thermal optimization of high performance wireless Networks-on-Chip. Architectural innovation in conjunction with suitable power and thermal management strategies is the key for designing high performance yet energy-efficient massive multicore chips. This work contributes to exploration of various the design methodologies for establishing wireless NoC architectures that achieve the best trade-offs between temperature, performance and energy-efficiency. It further demonstrates that incorporating Dynamic Thermal Management (DTM) on a multicore chip designed with such temperature and performance optimized Wireless Network-on-Chip architectures improves thermal profile while simultaneously providing lower latency and reduced network energy dissipation compared to its conventional counterparts
Electro-Thermal Codesign in Liquid Cooled 3D ICs: Pushing the Power-Performance Limits
The performance improvement of today's computer systems is usually accompanied by increased chip power consumption and system temperature. Modern CPUs dissipate an average of 70-100W power while spatial and temporal power variations result in hotspots with even higher power density (up to 300W/cm^2). The coming years will continue to witness a significant increase in CPU power dissipation due to advanced multi-core architectures and 3D integration technologies. Nowadays the problems of increased chip power density, leakage power and system temperatures have become major obstacles for further improvement in chip performance. The conventional air cooling based heat sink has been proved to be insufficient for three dimensional integrated circuits (3D-ICs). Hence better cooling solutions are necessary. Micro-fluidic cooling, which integrates micro-channel heat sinks into silicon substrates of the chip and uses liquid flow to remove heat inside the chip, is an effective active cooling scheme for 3D-ICs. While the micro-fluidic cooling provides excellent cooling to 3D-ICs, the associated overhead (cooling power consumed by the pump to inject the coolant through micro-channels) is significant. Moreover, the 3D-IC structure also imposes constraints on micro-channel locations (basically resource conflict with through-silicon-vias TSVs or other structures).
In this work, we investigate optimized micro-channel configurations that address the aforementioned considerations. We develop three micro-channel structures (hotspot optimized cooling configuration, bended micro-channel and hybrid cooling network) that can provide sufficient cooling to 3D-IC with minimum cooling power overhead, while at the same time, compatible with the existing electrical structure such as TSVs. These configurations can achieve up to 70% cooling power savings compared with the configuration without any optimization. Based on these configurations, we then develop a micro-fluidic cooling based dynamic thermal management approach that maintains the chip temperature through controlling the fluid flow rate (pressure drop) through micro-channels. These cooling configurations are designed after the electrical parts, and therefore, compatible with the current standard IC design flow.
Furthermore, the electrical, thermal, cooling and mechanical aspects of 3D-IC are interdependent. Hence the conventional design flow that designs the cooling configuration after electrical aspect is finished will result in inefficiencies. In order to overcome this problem, we then investigate electrical-thermal co-design methodology for 3D-ICs. Two co-design problems are explored: TSV assignment and micro-channel placement co-design, and gate sizing and fluidic cooling co-design. The experimental results show that the co-design enables a fundamental power-performance improvement over the conventional design flow which separates the electrical and cooling design. For example, the gate sizing and fluidic cooling co-design achieves 12% power savings under the same circuit timing constraint and 16% circuit speedup under the same power budget
An Artificial Neural Networks based Temperature Prediction Framework for Network-on-Chip based Multicore Platform
Continuous improvement in silicon process technologies has made possible the integration of hundreds of cores on a single chip. However, power and heat have become dominant constraints in designing these massive multicore chips causing issues with reliability, timing variations and reduced lifetime of the chips. Dynamic Thermal Management (DTM) is a solution to avoid high temperatures on the die. Typical DTM schemes only address core level thermal issues. However, the Network-on-chip (NoC) paradigm, which has emerged as an enabling methodology for integrating hundreds to thousands of cores on the same die can contribute significantly to the thermal issues. Moreover, the typical DTM is triggered reactively based on temperature measurements from on-chip thermal sensor requiring long reaction times whereas predictive DTM method estimates future temperature in advance, eliminating the chance of temperature overshoot. Artificial Neural Networks (ANNs) have been used in various domains for modeling and prediction with high accuracy due to its ability to learn and adapt. This thesis concentrates on designing an ANN prediction engine to predict the thermal profile of the cores and Network-on-Chip elements of the chip. This thermal profile of the chip is then used by the predictive DTM that combines both core level and network level DTM techniques. On-chip wireless interconnect which is recently envisioned to enable energy-efficient data exchange between cores in a multicore environment, will be used to provide a broadcast-capable medium to efficiently distribute thermal control messages to trigger and manage the DTM schemes
A verilog-hdl implementation of virtual channels in a network-on-chip router
As the feature size is continuously decreasing and integration density is increasing,
interconnections have become a dominating factor in determining the overall
quality of a chip. Due to the limited scalability of system bus, it cannot meet the
requirement of current System-on-Chip (SoC) implementations where only a limited
number of functional units can be supported. Long global wires also cause many
design problems, such as routing congestion, noise coupling, and difficult timing closure.
Network-on-Chip (NoC) architectures have been proposed to be an alternative
to solve the above problems by using a packet-based communication network. The
processing elements (PEs) communicate with each other by exchanging messages over
the network and these messages go through buffers in each router. Buffers are one of
the major resource used by the routers in virtual channel flow control.
In this thesis, we analyze two kinds of buffer allocation approaches, static and
dynamic buffer allocations. These approaches aim to increase throughput and minimize
latency by means of virtual channel flow control. In statically allocated buffer
architecture, size and organization are design time decisions and thus, do not perform
optimally for all traffic conditions. In addition, statically allocated virtual channel
consumes a waste of area and significant leakage power. However, dynamic buffer allocation
scheme claims that buffer utilization can be increased using dynamic virtual
channels. Dynamic virtual channel regulator (ViChaR), have been proposed to use
centralized buffer architecture which dynamically allocates virtual channels and buffer slots in real-time depending on traffic conditions. This ViChaR’s dynamic buffer management
scheme increases buffer utilization, but it also increases design complexity. In
this research, we reexamine performance, power consumption, and area of ViChaR’s
buffer architecture through implementation. We implement a generic router and a
ViChaR architecture using Verilog-HDL. These RTL codes are verified by dynamic
simulation, and synthesized by Design Compiler to get area and power consumption.
In addition, we get latency through Static Timing Analysis. The results show that a
ViChaR’s dynamic buffer management scheme increases the latency and power consumption
significantly even though it could increase buffer utilization. Therefore, we
need a novel design to achieve high buffer utilization without a loss
Combined Dynamic Thermal Management Exploiting Broadcast-Capable Wireless Network-on-Chip Architecture
With the continuous scaling of device dimensions, the number of cores on a single die is constantly increasing. This integration of hundreds of cores on a single die leads to high power dissipation and thermal issues in modern Integrated Circuits (ICs). This causes problems related to reliability, timing violations and lifetime of electronic devices. Dynamic Thermal Management (DTM) techniques have emerged as potential solutions that mitigate the increasing temperatures on a die. However, considering the scaling of system sizes and the adoption of the Network-on-Chip (NoC) paradigm to serve as the interconnection fabric exacerbates the problem as both cores and NoC elements contribute to the increased heat dissipation on the chip.
Typically, DTM techniques can either be proactive or reactive. Proactive DTM techniques, where the system has the ability to predict the thermal profile of the chip ahead of time are more desirable than reactive DTM techniques where the system utilizes thermal sensors to determine the current temperature of the chip.
Moreover, DTM techniques either address core or NoC level thermal issues separately. Hence, this thesis proposes a combined proactive DTM technique that integrates both core level and NoC level DTM techniques. The combined DTM mechanism includes a dynamic temperature-aware routing approach for the NoC level elements, and includes task reallocation heuristics for the core level elements.
On-chip wireless interconnects recently envisioned to enable energy-efficient data exchange between cores in a multicore chip will be used to provide a broadcast-capable medium to efficiently distribute thermal control messages to trigger and manage the DTM. Combining the proactive DTM technique with on-chip wireless interconnects, the on-chip temperature is restricted within target temperatures without significantly affecting the performance of the NoC based interconnection fabric of the multicore chip
Dynamic Power Management for Neuromorphic Many-Core Systems
This work presents a dynamic power management architecture for neuromorphic
many core systems such as SpiNNaker. A fast dynamic voltage and frequency
scaling (DVFS) technique is presented which allows the processing elements (PE)
to change their supply voltage and clock frequency individually and
autonomously within less than 100 ns. This is employed by the neuromorphic
simulation software flow, which defines the performance level (PL) of the PE
based on the actual workload within each simulation cycle. A test chip in 28 nm
SLP CMOS technology has been implemented. It includes 4 PEs which can be scaled
from 0.7 V to 1.0 V with frequencies from 125 MHz to 500 MHz at three distinct
PLs. By measurement of three neuromorphic benchmarks it is shown that the total
PE power consumption can be reduced by 75%, with 80% baseline power reduction
and a 50% reduction of energy per neuron and synapse computation, all while
maintaining temporary peak system performance to achieve biological real-time
operation of the system. A numerical model of this power management model is
derived which allows DVFS architecture exploration for neuromorphics. The
proposed technique is to be used for the second generation SpiNNaker
neuromorphic many core system
- …