72 research outputs found

    Cross-layer design of thermally-aware 2.5D systems

    Full text link
    Over the past decade, CMOS technology scaling has slowed down. To sustain the historic performance improvement predicted by Moore's Law, in the mid-2000s the computing industry moved to using manycore systems and exploiting parallelism. The on-chip power densities of manycore systems, however, continued to increase after the breakdown of Dennard's Scaling. This leads to the `dark silicon' problem, whereby not all cores can operate at the highest frequency or can be turned on simultaneously due to thermal constraints. As a result, we have not been able to take full advantage of the parallelism in manycore systems. One of the 'More than Moore' approaches that is being explored to address this problem is integration of diverse functional components onto a substrate using 2.5D integration technology. 2.5D integration provides opportunities to exploit chiplet placement flexibility to address the dark silicon problem and mitigate the thermal stress of today's high-performance systems. These opportunities can be leveraged to improve the overall performance of the manycore heterogeneous computing systems. Broadly, this thesis aims at designing thermally-aware 2.5D systems. More specifically, to address the dark silicon problem of manycore systems, we first propose a single-layer thermally-aware chiplet organization methodology for homogeneous 2.5D systems. The key idea is to strategically insert spacing between the chiplets of a 2.5D manycore system to lower the operating temperature, and thus reclaim dark silicon by allowing more active cores and/or higher operating frequency under a temperature threshold. We investigate manufacturing cost and thermal behavior of 2.5D systems, then formulate and solve an optimization problem that jointly maximizes performance and minimizes manufacturing cost. We then enhance our methodology by incorporating a cross-layer co-optimization approach. We jointly maximize performance and minimize manufacturing cost and operating temperature across logical, physical, and circuit layers. We propose a novel gas-station link design that enables pipelining in passive interposers. We then extend our thermally-aware optimization methodology for network routing and chiplet placement of heterogeneous 2.5D systems, which consist of central processing unit (CPU) chiplets, graphics processing unit (GPU) chiplets, accelerator chiplets, and/or memory stacks. We jointly minimize the total wirelength and the system temperature. Our enhanced methodology increases the thermal design power budget and thereby improves thermal-constraint performance of the system

    REDUCING POWER DURING MANUFACTURING TEST USING DIFFERENT ARCHITECTURES

    Get PDF
    Power during manufacturing test can be several times higher than power consumption in functional mode. Excessive power during test can cause IR drop, over-heating, and early aging of the chips. In this dissertation, three different architectures have been introduced to reduce test power in general cases as well as in certain scenarios, including field test. In the first architecture, scan chains are divided into several segments. Every segment needs a control bit to enable capture in a segment when new faults are detectable on that segment for that pattern. Otherwise, the segment should be disabled to reduce capture power. We group the control bits together into one or more control chains. To address the extra pin(s) required to shift data into the control chain(s) and significant post processing in the first architecture, we explored a second architecture. The second architecture stitches the control bits into the chains they control as EECBs (embedded enable capture bits) in between the segments. This allows an ATPG software tool to automatically generate the appropriate EECB values for each pattern to maintain the fault coverage. This also works in the presence of an on-chip decompressor. The last architecture focuses primarily on the self-test of a device in a 3D stacked IC when an existing FPGA in the stack can be programmed as a tester. We show that the energy expended during test is significantly less than would be required using low power patterns fed by an on-chip decompressor for the same very short scan chains

    Reliable Design of Three-Dimensional Integrated Circuits

    Get PDF

    Heterogeneous Chip Multiprocessor: Data Representation, Mixed-Signal Processing Tiles, and System Design

    Get PDF
    With the emergence of big data, the need for more computationally intensive processors that can handle the increased processing demand has risen. Conventional computing paradigms based on the Von Neumann model that separates computational and memory structures have become outdated and less efficient for this increased demand. As the speed and memory density of processors have increased significantly over the years, these models of computing, which rely on a constant stream of data between the processor and memory, see less gains due to finite bandwidth and latency. Moreover, in the presence of extreme scaling, these conventional systems, implemented in submicron integrated circuits, have become even more susceptible to process variability, static leakage current, and more. In this work, alternative paradigms, predicated on distributive processing with robust data representation and mixed-signal processing tiles, are explored for constructing more efficient and scalable computing systems in application specific integrated circuits (ASICs). The focus of this dissertation work has been on heterogeneous chip multi-processor (CMP) design and optimization across different levels of abstraction. On the level of data representation, a different modality of representation based on random pulse density modulation (RPDM) coding is explored for more efficient processing using stochastic computation. On the level of circuit description, mixed-signal integrated circuits that exploit charge-based computing for energy efficient fixed point arithmetic are designed. Consequently, 8 different chips that test and showcase these circuits were fabricated in submicron CMOS processes. Finally, on the architectural level of description, a compact instruction-set processor and controller that facilitates distributive computing on System-On-Chips (SoCs) is designed. In addition to this, a robust bufferless network architecture is designed with a network simulator, and I/O cells are designed for SoCs. The culmination of this thesis work has led to the design and fabrication of a heterogeneous chip multi- processor prototype comprised of over 12,000 VVM cores, warp/dewarp processors, cache, and additional processors, which can be applied towards energy efficient large-scale data processing

    Characterization and Acceleration of High Performance Compute Workloads

    Get PDF

    Characterization and Acceleration of High Performance Compute Workloads

    Get PDF

    Thermal, Power Delivery and Reliability Management for 3D ICS

    Get PDF
    Three-dimensional (3D) integration technology is promising to continuously improve the performance of electronic devices by vertically stacking multiple active layers and connecting them with Through-Silicon-Vias (TSVs). Meanwhile, the thermal and power integrity problems are exacerbated since the power flux in 3D integrated circuits (3D ICs) increases linearly with the number of stacked layers. Moreover, the TSV structure in 3D ICs introduces new reliability problems since TSVs are vulnerable to various failure mechanisms (e.g. electromigration) and the failure of power-ground TSVs will cause voltage drop thereby significantly degrading the performance of 3D ICs. To make things worse, the high temperature, thermal gradient and power load in 3D ICs accelerate the failure of TSVs. Therefore, in order to push the 3D integration technology to full commercialization, the thermal, power integrity and reliability problem should be properly addressed in both design-time and run-time. In 3D ICs, the heat flux will easily exceed the capability of the traditional air cooling. Therefore, several aggressive cooling methods are applied to remove heat from the 3D IC, which include micro-fluidic cooling, the phase change material based cooling etc. These cooling schemes are usually implemented close to the heat source to gain high heat removal capability, thus causing more challenges to the design of 3D ICs. Unfortunately, physical design tools for 3D ICs with those aggressive cooling methods are lack. In this thesis, we will focus on 3D ICs with micro-fluidic (MF) cooling. The physical design for this kind of 3D ICs involves complex trade-offs between the circuit performance, power delivery noise, and temperature. For example, both TSVs and micro-cavities for MF cooling are fabricated in the substrate region. Therefore, they will compete in space: the allocation of signal TSVs should avoid micro-cavities to realize a feasible design, thus enforcing more constraints to the physical placement of 3D ICs. Moreover, power delivery networks (PDNs) in 3D ICs are enabled by power-ground (P/G) TSVs. The number and distribution of P/G TSVs are also constrained by micro-cavities which will influence the power integrity of the 3D IC. In addition, the capability of MF cooling degrades downstream the flow of coolant thereby causing large in-layer temperature gradient. The spatial temperature variance will affect the reliability of 3D ICs. in order to avoid it, the gate/modules in 3D ICs should be placed properly. In order to address the trade-offs 3D ICs with MF cooling, different design-time methods for application specific ICs (ASICs) and field programmable gate arrays (FPGAs) are proposed, respectively. For 3D ASICs, we propose a co-design method that integrates the design of MF cooling heat sink and P/G TSVs to the physical placement for 3D ICs. Experiments on publicly available benchmarks show that using our method, we can achieve better results compared to the traditional sequential design flow. The case for 3D FPGAs is more complicated than ASICs since the routing and logic resources are fixed and the chip power and temperature is hard to estimate until the circuit is routed. Therefore, in this thesis, we first build a design space exploration (DSE) framework to study how MF cooling affects the design of 3D FPGAs. Following this, we utilize an existing 3D FPGA placement and routing tool to develop a cooling-aware placement framework for 3D FPGAs to reduce the temperature gradient. Since the activity of 3D ICs cannot be completely estimated at the design stage, the run-time management, besides design-time methods, is required to address the thermal, power and reliability problems in 3D ICs. However, the vertically stacked structure makes the run-time management for 3D ICs more complicated than 2D ICs. The major reason of this is that the power supply noise and temperature can be coupled across layers in 3D ICs. This means the activity of one layer may affect the performance and reliability of other layers through voltage/temperature coupling. As a result, we cannot perform run-time management for each layer (perhaps implemented with dierent chips) of 3D ICs separately as in 2D systems. Therefore, the space of control nodes will become larger and more complicated. To make things worse, the existing run-time management techniques have various drawbacks (e.g. large off-line characterization overhead, poor scalability etc. ), which needs more eort to improve. In this thesis, we propose a phase-driven Q-learning based run-time management technique which can tune the activity of the processor to maximize the 3D CPU performance subject to the reliability constraint

    Signal and power integrity co-simulation using the multi-layer finite difference method

    Get PDF
    Mixed signal system-on-package (SoP) technology is a key enabler for increasing functional integration, especially in mobile and wireless systems. Due to the presence of multiple dissimilar modules, each having unique power supply requirements, the design of the power distribution network (PDN) becomes critical. Typically, this PDN is designed as alternating layers of power and ground planes with signal interconnects routed in between or on top of the planes. The goal for the simulation of multi-layer power/ground planes, is the following: Given a stack-up and other geometrical information, it is required to find the network parameters (S/Y/Z) between port locations. Commercial packages have extremely complicated stack-ups, and the trend to increasing integration at the package level only points to increasing complexity. It is computationally intractable to solve these problems using these existing methods. The approach proposed in this thesis for obtaining the response of the PDN is the multi-layer finite difference method (M-FDM). A surface mesh / finite difference based approach is developed, which leads to a system matrix that is sparse and banded, and can be solved efficiently. The contributions of this research are the following: 1. The development of a PDN modeler for multi-layer packages and boards called the the multi-layer finite difference method. 2. The enhancement of M-FDM using multi-port connection networks to include the effect of fringe fields and gap coupling. 3. An adaptive triangular mesh based scheme called the multi-layer finite element method (MFEM) to address the limitations of M-FDM 4. The use of modal decomposition for the co-simulation of signal nets with the PDN. 5. The use of a robust GA-based optimizer for the selection and placement of decoupling capacitors in multi-layer geometries. 6. Implementation of these methods in a tool called MSDT 1.Ph.D.Committee Chair: Madhavan Swaminathan; Committee Member: Andrew F. Peterson; Committee Member: David C. Keezer; Committee Member: Saibal Mukhopadyay; Committee Member: Suresh Sitarama

    Circuit Design Obfuscation for Hardware Security

    Get PDF
    Nowadays, chip design and chip fabrication are normally conducted separately by independent companies. Most integrated circuit (IC) design companies are now adopting a fab-less model: they outsource the chip fabrication to offshore foundries while concentrating their effort and resource on the chip design. Although it is cost-effective, the outsourced design faces various security threats since the offshore foundries might not be trustworthy. Attacks on the outsourced IC design can take on many forms, such as piracy, counterfeiting, overproduction and malicious modification, which are referred to as IC supply chain attacks. In this work, we investigate several circuit design obfuscation techniques to prevent the IC supply chain attacks by untrusted foundries. Logic locking is a gate-level design obfuscation technique that's proposed to protect the outsourced IC designs from piracy and counterfeiting by untrusted foundries. A locked IC preserves the correct functionality only when a correct key is provided. Recently, the security of logic locking is threatened by a strong attack called SAT attack, which can decipher the correct key of most logic locking techniques within a few hours even for a reasonably large key-size. In this dissertation, we investigate design techniques to improve the security of logic locking in three directions. Firstly, we propose a new locking technique called Anti-SAT to thwart the SAT attack. The Anti-SAT can make the complexity of SAT attack grow exponentially in key-size, hence making the attack computationally infeasible. Secondly, we consider an approximate version of SAT attack and investigate its application on fault-tolerant hardware such as neural network chips. Countermeasure to this approximate SAT attack is proposed and validated with rigorous proof and experiments. Lastly, we explore new opportunities in obfuscating the parametric characteristics of a circuit design (e.g. timing) so that another layer of defense can be added to existing countermeasures. Split fabrication based on 3D integration technology is another approach to obfuscate the outsourced IC designs. 3D integration is a technology that integrates multiple 2D dies to create a single high-performance chip, referred to as 3D IC. With 3D integration, a designer can choose a portion of IC design at his discretion and send them to a trusted foundry for secure fabrication while outsourcing the rest to untrusted foundries for advanced fabrication technology. In this dissertation, we propose a security-aware physical design flow for interposer-based 3D IC (also known as 2.5D IC). The design flow consists of security-aware partitioning and placement phases, which aim at obfuscating the circuit while preventing potential attacks such as proximity attack. Simulation results show that our proposed design flow is effective for producing secure chip layouts against the IC supply chain attacks. The circuit design obfuscation techniques presented in this dissertation enable future chip designers to take security into consideration at an early phase while optimizing the chip's performance, power, and reliability
    corecore