225 research outputs found

    Voltage island-driven floorplanning.

    Get PDF
    Ma, Qiang.Thesis (M.Phil.)--Chinese University of Hong Kong, 2008.Includes bibliographical references (leaves 78-80).Abstracts in English and Chinese.Abstract --- p.iAcknowledgement --- p.ivChapter 1 --- Introduction --- p.1Chapter 1.1 --- Background --- p.1Chapter 1.2 --- Floorplanning --- p.2Chapter 1.3 --- Motivations --- p.4Chapter 1.4 --- Design Implementation of Voltage Islands --- p.5Chapter 1.5 --- Problem Formulation --- p.8Chapter 1.6 --- Progress on the Problem --- p.10Chapter 1.7 --- Contributions --- p.12Chapter 1.8 --- Thesis Organization --- p.14Chapter 2 --- Literature Review on MSV --- p.15Chapter 2.1 --- Introduction --- p.15Chapter 2.2 --- MSV at Post-floorplan/Post Placement Stage --- p.16Chapter 2.2.1 --- """Post-Placement Voltage Island Generation under Performance Requirement""" --- p.16Chapter 2.2.2 --- """Post-Placement Voltage Island Generation""" --- p.18Chapter 2.2.3 --- """Timing-Constrained and Voltage-Island-Aware Voltage Assignment""" --- p.19Chapter 2.2.4 --- """Voltage Island Generation under Performance Requirement for SoC Designs""" --- p.20Chapter 2.2.5 --- """An ILP Algorithm for Post-Floorplanning Voltage-Island Generation Considering Power-Network Planning""" --- p.21Chapter 2.3 --- MSV at Floorplan/Placement Stage --- p.22Chapter 2.3.1 --- """Architecting Voltage Islands in Core-based System-on-a- Chip Designs""" --- p.22Chapter 2.3.2 --- """Voltage Island Aware Floorplanning for Power and Timing Optimization""" --- p.23Chapter 2.4 --- Summary --- p.27Chapter 3 --- MSV Driven Floorplanning --- p.29Chapter 3.1 --- Introduction --- p.29Chapter 3.2 --- Problem Formulation --- p.32Chapter 3.3 --- Algorithm Overview --- p.33Chapter 3.4 --- Optimal Island Partitioning and Voltage Assignment --- p.33Chapter 3.4.1 --- Voltage Islands in Non-subtrees --- p.35Chapter 3.4.2 --- Proof of Optimality --- p.36Chapter 3.4.3 --- Handling Island with Power Down Mode --- p.37Chapter 3.4.4 --- Speedup in Implementation and Complexity --- p.38Chapter 3.4.5 --- Varying Background Chip-level Voltage --- p.39Chapter 3.5 --- Simulated Annealing --- p.39Chapter 3.5.1 --- Moves --- p.39Chapter 3.5.2 --- Cost Function --- p.40Chapter 3.6 --- Experimental Results --- p.40Chapter 3.6.1 --- Extension to Minimize Level Shifters --- p.45Chapter 3.6.2 --- Extension to Consider Power Network Routing --- p.46Chapter 3.7 --- Summary --- p.46Chapter 4 --- MSV Driven Floorplanning with Timing --- p.49Chapter 4.1 --- Introduction --- p.49Chapter 4.2 --- Problem Formulation --- p.52Chapter 4.3 --- Algorithm Overview --- p.56Chapter 4.4 --- Voltage Assignment Problem --- p.56Chapter 4.4.1 --- Lagrangian Relaxation --- p.58Chapter 4.4.2 --- Transformation into the Primal Minimum Cost Flow Problem --- p.60Chapter 4.4.3 --- Cost-Scaling Algorithm --- p.64Chapter 4.4.4 --- Solution Transformation --- p.66Chapter 4.5 --- Simulated Annealing --- p.69Chapter 4.5.1 --- Moves --- p.69Chapter 4.5.2 --- Speeding up heuristic --- p.69Chapter 4.5.3 --- Cost Function --- p.70Chapter 4.5.4 --- Annealing Schedule --- p.71Chapter 4.6 --- Experimental Results --- p.71Chapter 4.7 --- Summary --- p.72Chapter 5 --- Conclusion --- p.76Bibliography --- p.8

    Clustering-Based Simultaneous Task and Voltage Scheduling for NoC Systems

    Get PDF
    Network-on-Chip (NoC) is emerging as a promising communication structure, which is scalable with respect to chip complexity. Meanwhile, latest chip designs are increasingly leveraging multiple voltage-frequency domains for energy-efficiency improvement. In this work, we propose a simultaneous task and voltage scheduling algorithm for energy minimization in NoC based designs. The energy-latency tradeoff is handled by Lagrangian relaxation. The core algorithm is a clustering based approach which not only assigns voltage levels and starting time to each task (or Processing Element) but also naturally finds voltage-frequency clusters. Compared to a recent previous work, which performs task scheduling and voltage assignment sequentially, our method leads to an average of 20 percent energy reduction

    Thermal/performance trade-off in network-on-chip architectures

    Get PDF
    Multi-core architectures are a promising paradigm to exploit the huge integration density reached by high-performance systems. Indeed, integration density and technology scaling are causing undesirable operating temperatures, having net impact on reduced reliability and increased cooling costs. Dynamic Thermal Management (DTM) approaches have been proposed in literature to control temperature profile at run-time, while design-time approaches generally provide floorplan-driven solutions to cope with temperature constraints. Nevertheless, a suitable approach to collect performance, thermal and reliability metrics has not been proposed, yet. This work presents a novel methodology to jointly optimize temperature/performance trade-off in reliable high-performance parallel architectures with security constraints achieved by workload physical isolation on each core. The proposed methodology is based on a linear formal model relating temperature and duty-cycle on one side, and performance and duty-cycle on the other side. Extensive experimental results on real-world use-case scenarios show the goodness of the proposed model, suitable for design-time system-wide optimization to be used in conjunction with DTM technique

    Network-on-Chip

    Get PDF
    Addresses the Challenges Associated with System-on-Chip Integration Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design. This text comprises 12 chapters and covers: The evolution of NoC from SoC—its research and developmental challenges NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces The router design strategies followed in NoCs The evaluation mechanism of NoC architectures The application mapping strategies followed in NoCs Low-power design techniques specifically followed in NoCs The signal integrity and reliability issues of NoC The details of NoC testing strategies reported so far The problem of synthesizing application-specific NoCs Reconfigurable NoC design issues Direction of future research and development in the field of NoC Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, and researchers and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems

    NoC Topology Synthesis for Supporting Shutdown of Voltage Islands in SoCs

    Get PDF
    In many Systems on Chips (SoCs), the cores are clustered in to voltage islands. When cores in an island are unused, the entire island can be shutdown to reduce the leakage power consumption. However, today, the interconnect architecture is a bottleneck in allowing the shutdown of the islands. In this paper, we present a synthesis approach to obtain customized application-specific Networks on Chips (NoCs) that can support the shutdown of voltage islands. Our results on realistic SoC benchmarks show that the re- sulting NoC designs only have a negligible overhead in SoC active power consumption (average of 3%) and area (average of 0.5%) to support the shutdown of islands. The shutdown support provided can lead to a significant leakage and hence total power savings

    Emerging Run-Time Reconfigurable FPGA and CAD Tools

    Get PDF
    Field-programmable gate array (FPGA) is a post fabrication reconfigurable device to accelerate domain specific computing systems. It offers offer high operation speed and low power consumption. However, the design flexibility and performance of FPGAs are severely constrained by the costly on-chip memories, e.g. static random access memory (SRAM) and FLASH memory. The objective of my dissertation is to explore the opportunity and enable the use of the emerging resistance random access memory (ReRAM) in FPGA design. The emerging ReRAM technology features high storage density, low access power consumption, and CMOS compatibility, making it a promising candidate for FPGA implementation. In particular, ReRAM has advantages of the fast access and nonvolatility, enabling the on-chip storage and access of configuration data. In this dissertation, I first propose a novel three-dimensional stacking scheme, namely, high-density interleaved memory (HIM). The structure improves the density of ReRAM meanwhile effectively reducing the signal interference induced by sneak paths in crossbar arrays. To further enhance the access speed and design reliability, a fast sensing circuit is also presented which includes a new sense amplifier scheme and reference cell configuration. The proposed ReRAM FPGA leverages a similar architecture as conventional SRAM based FPGAs but utilizes ReRAM technology in all component designs. First, HIM is used to implement look-up table (LUT) and block random access memories (BRAMs) for func- tionality process. Second, a 2R1T, two ReRAM cells and one transistor, nonvolatile switch design is applied to construct connection blocks (CBs) and switch blocks (SBs) for signal transition. Furthermore, unified BRAM (uBRAM) based on the current BRAM architecture iv is introduced, offering both configuration and temporary data storage. The uBRAMs provides extremely high density effectively and enlarges the FPGA capacity, potentially saving multiple contexts of configuration. The fast configuration scheme from uBRAM to logic and routing components also makes fast run-time partial reconfiguration (PR) much easier, improving the flexibility and performance of the entire FPGA system. Finally, modern place and route tools are designed for homogeneous fabric of FPGA. The PR feature, however, requires the support of heterogeneous logic modules in order to differentiate PR modules from static ones and therefore maintain the signal integration. The existing approaches still reply on designers’ manual effort, which significantly prolongs design time and lowers design efficiency. In this dissertation, I integrate PR support into VPR – an academic place and route tool by introducing a B*-tree modular placer (BMP) and PR-aware router. As such, users are able to explore new architectures or map PR applications to a variety of FPGAs. More importantly, this enhanced feature can also support fast design automation, e.g. mapping IP core, loading pre-synthesizing logic modules, etc

    Heurísticas bioinspiradas para el problema de Floorplanning 3D térmico de dispositivos MPSoCs

    Get PDF
    Tesis inédita de la Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, leída el 20-06-2013Depto. de Arquitectura de Computadores y AutomáticaFac. de InformáticaTRUEunpu

    Physical Planning and Uncore Power Management for Multi-Core Processors

    Get PDF
    For the microprocessor technology of today and the foreseeable future, multi-core is a key engine that drives performance growth under very tight power dissipation constraints. While previous research has been mostly focused on individual processor cores, there is a compelling need for studying how to efficiently manage shared resources among cores, including physical space, on-chip communication and on-chip storage. In managing physical space, floorplanning is the first and most critical step that largely affects communication efficiency and cost-effectiveness of chip designs. We consider floorplanning with regularity constraints that requires identical processing/memory cores to form an array. Such regularity can greatly facilitate design modularity and therefore shorten design turn-around time. Very little attention has been paid to automatic floorplanning considering regularity constraints because manual floorplanning has difficulty handling the complexity as chip core count increases. In this dissertation work, we investigate the regularity constraints in a simulated-annealing based floorplanner for multi/many core processor designs. A simple and effective technique is proposed to encode the regularity constraints in sequence-pair, which is a classic format of data representation in automatic floorplanning. To the best of our knowledge, this is the first work on regularity-constrained floorplanning in the context of multi/many core processor designs. On-chip communication and shared last level cache (LLC) play a role that is at least as equally important as processor cores in terms of chip performance and power. This dissertation research studies dynamic voltage and frequency scaling for on-chip network and LLC, which forms a single uncore domain of voltage and frequency. This is in contrast to most previous works where the network and LLC are partitioned and associated with processor cores based on physical proximity. The single shared domain can largely avoid the interfacing overhead across domain boundaries and is practical and very useful for industrial products. Our goal is to minimize uncore energy dissipation with little, e.g., 5% or less, performance degradation. The first part of this study is to identify a metric that can reflect the chip performance determined by uncore voltage/frequency. The second part is about how to monitor this metric with low overhead and high fidelity. The last part is the control policy that decides uncore voltage/frequency based on monitoring results. Our approach is validated through full system simulations on public architecture benchmarks
    corecore