2 research outputs found
Recommended from our members
Physics-Based Electromigration Modeling and Analysis and Optimization
Long-term reliability is a major concern in modern VLSI design. Literature has shown that reliability gets worse as technology advances. It is expected that the future VLSI systems would have shorter reliability-induced lifetime comparing with previous generations. Being one of the most serious reliability effects, electromigration (EM) is a physical phenomenon of the migration of metal atoms due to the momentum exchange between atoms and the conducting electrons. It can cause wire resistance change or open circuit and result in functional failure of the circuit. Power-ground networks are the most vulnerable part to EM effect among all the interconnect wires since the current flow on this part is the largest on the chip. With new generation oftechnology node and aggressive design strategies, more accurate and efficient EM models are required. However, traditional EM approaches are very conservative and cannot meet current aggressive design strategies. Besides circuit level, EM also need to be thoroughly studied in system level due to limited power and temperature budgets among cores on chip. This research focuses on developing physical level EM model for VLSI circuits and system level EM optimization for multi-core systems in order to overcome the aforementioned problems. Specifically, for physical level, we develop two EM immortality check methods and a power grid EM check method. Firstly, a voltage based EM immortality analysis has been developed. Immortality condition in nucleation phase can be determined fast and accurately for multi-segment interconnect wires. Secondly, a saturation volume based incubation phase immortality check method has been proposed. This method can further reduce the redundancy in VLSI circuit design by immortality check in multiphase. Furthermore, both immortality check methods are integrated into a new power grid EM check methodology (EMspice) as filter for EM analysis. These filters can accelerate the simulation by filtering out immortal trees so that we only need to do simulation on fewer trees that are mortal. Coupled EM simulation considering both hydrostatic stress and electronic current/voltage in the power grid network will be applied to these mortal trees. This tool can work seamlessly with commercial synthesis flow. Besides physical level reliability models, system level reliability optimization is also discussed in this research. A deep reinforcement learning based EM optimization has been proposed for multi-core system. Both long term reliability effect (hard error) and transient soft error are considered. Energy can be optimized with all the reliability and other constraints fast and accurately compared to existing reliability management techniques. Last but not least, a scheduling based reliability optimization method for multi-core systems has been proposed. NBTI, HCI and EM are considered jointly. Lifetime of the system can be improved significantly compared to traditional methods which mainly focus on utilization
Thermal and QoS-Aware Embedded Systems
While embedded systems such as smartphones and smart cars become essential parts of our lives, they face urgent thermal challenges. Extreme thermal conditions (i.e., both high and low temperatures) degrade system reliability, even risking safety; devices in the cold environments unexpectedly go offline, whereas extremely high device temperatures can cause device failures or battery explosions. These thermal limits become close to the norm because of ever-increasing chip power densities and
application complexities. Embedded systems in the wild, however, lack adaptive and effective solutions to overcome such thermal challenges. An adaptive thermal management solution must cope with various runtime thermal scenarios under a changing ambient temperature. An effective solution requires the understanding of the dynamic thermal behaviors of underlying hardware and application workloads to ensure thermal and application quality-of-service (QoS) requirements. This thesis proposes a suite of adaptive and effective thermal management solutions to address different aspects of real-world thermal challenges faced by modern embedded systems.
First, we present BPM, a battery-aware power management framework for mobile
devices to address the unexpected device shutoffs in cold environments. We develop BPM as a background service that characterizes and controls real-time battery behaviors to maintain operable conditions even in cold environments. We then propose eTEC, building on the thermoelectric cooling solution, which adaptively controls cooling and computational power to avoid mobile devices overheating. For the real-time embedded systems such as cars, we present RT-TRM, a thermal-aware resource management framework that monitors changing ambient temperatures and allocates system resources to individual tasks. Next, we target in-vehicle vision systems running on CPUs–GPU system-on-chips and develop CPU–GPU co-scheduling to tackle thermal imbalance across CPUs caused by GPU heat. We evaluate all of these solutions using representative mobile/automotive platforms and workloads, demonstrating their effectiveness in meeting thermal and QoS requirements.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/153350/1/ymoonlee_1.pd