1,455 research outputs found

    Exploiting Properties of CMP Cache Traffic in Designing Hybrid Packet/Circuit Switched NoCs

    Get PDF
    Chip multiprocessors with few to tens of processing cores are already commercially available. Increased scaling of technology is making it feasible to integrate even more cores on a single chip. Providing the cores with fast access to data is vital to overall system performance. When a core requires access to a piece of data, the core's private cache memory is searched first. If a miss occurs, the data is looked up in the next level(s) of the memory hierarchy, where often one or more levels of cache are shared between two or more cores. Communication between the cores and the slices of the on-chip shared cache is carried through the network-on-chip(NoC). Interestingly, the cache and NoC mutually affect the operation of each other; communication over the NoC affects the access latency of cache data, while the cache organization generates the coherence and data messages, thus affecting the communication patterns and latency over the NoC. This thesis considers hybrid packet/circuit switched NoCs, i.e., packet switched NoCs enhanced with the ability to configure circuits. The communication and performance benefit that come from using circuits is predicated on amortizing the time cost incurred for configuring the circuits. To address this challenge, NoC designs are proposed that take advantage of properties of the cache traffic, namely temporal locality and predictability, to amortize or hide the circuit configuration time cost. First, a coarse-grained circuit configuration policy is proposed that exploits the temporal locality in the cache traffic to periodically configure circuits for the heavily communicating nodes. This allows the design of a locality-aware cache that promotes temporal communication locality through data placement, while designing suitable data replacement and migration policies. Next, a fine-grained configuration policy, called Déjà Vu switching, is proposed for leveraging predictability of data messages by initiating a circuit configuration as soon as a cache hit is detected and before the data becomes available. Its benefit is demonstrated for saving interconnect energy in multi-plane NoCs. Finally, a more proactive configuration policy is proposed for fast caches, where circuit reservations are initiated by request messages, which can greatly improve communication latency and system performance

    Exploration and Design of Power-Efficient Networked Many-Core Systems

    Get PDF
    Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.Siirretty Doriast

    The MANGO clockless network-on-chip: Concepts and implementation

    Get PDF

    Hierarchical Agent-based Adaptation for Self-Aware Embedded Computing Systems

    Get PDF
    Siirretty Doriast

    Dynamic Power Management of High Performance Network on Chip

    Get PDF
    With increased density of modern System on Chip(SoC) communication between nodes has become a major problem. Network on Chip is a novel on chip communication paradigm to solve this by using highly scalable and efficient packet switched network. The addition of intelligent networking on the chip adds to the chip’s power consumption thus making management of communication power an interesting and challenging research problem. While VLSI techniques have evolved over time to enable power reduction in the circuit level, the highly dynamic nature of modern large SoC demand more than that. This dissertation explores some innovative dynamic solutions to manage the ever increasing communication power in the post sub-micron era. Today’s highly integrated SoCs require great level of cross layer optimizations to provide maximum efficiency. This dissertation aims at the dynamic power management problem from top. Starting with a system level distribution and management down to microarchitecture enhancements were found necessary to deliver maximum power efficiency. A distributed power budget sharing technique is proposed. To efficiently satisfy the established power budget, a novel flow control and throttling technique is proposed. Finally power efficiency of underlying microarchitecture is explored and novel buffer and link management techniques are developed. All of the proposed techniques yield improvement in power-performance efficiency of the NoC infrastructure

    Energy-Efficient Interconnection Networks for High-Performance Computing

    Get PDF
    In recent years, energy has become one of the most important factors for de- signing and operating large scale computing systems. This is particularly true in high-performance computing, where systems often consist of thousands of nodes. Especially after the end of Dennard’s scaling, the demand for energy- proportionality in components, where energy is depending linearly on utilization, increases continuously. As the main contributor to the overall power consumption, processors have received the main attention so far. The increasing energy proportionality of processors, however, shifts the focus to other components such as interconnection networks. Their share of the overall power consumption is expected to increase to 20% or more while other components further increase their efficiency in the near future. Hence, it is crucial to improve energy proportionality in interconnection networks likewise to reduce overall power and energy consumption. To facilitate these attempts, this work provides comprehensive studies about energy saving in interconnection networks at different levels. First, interconnection networks differ fundamentally from other components in their underlying technology. To gain a deeper understanding of these differences and to identify targets for energy savings, this work provides a detailed power analysis of current network hardware. Furthermore, various applications at different scales are analyzed regarding their communication patterns and locality properties. The findings show that communication makes up only a small fraction of the execution time and networks are actually idling most of the time. Another observation is that point-to-point communication often only occurs within various small subsets of all participants, which indicates that a coordinated mapping could further decrease network traffic. Based on these studies, three different energy-saving policies are designed, which all differ in their implementation and focus. Then, these policies are evaluated in an event-based, power-aware network simulator. While two policies that operate completely local at link level, enable significant energy savings of more than 90% in most analyses, the hybrid one does not provide further benefits despite significant additional design effort. Additionally, these studies include network design parameters, such as transition time between different link configurations, as well as the three most common topologies in supercomputing systems. The final part of this work addresses the interactions of congestion management and energy-saving policies. Although both network management strategies aim for different goals and use opposite approaches, they complement each other and can increase energy efficiency in all studies as well as improve the performance overhead as opposed to plain energy saving

    Resource allocation and scalability in dynamic wavelength-routed optical networks.

    Get PDF
    This thesis investigates the potential benefits of dynamic operation of wavelength-routed optical networks (WRONs) compared to the static approach. It is widely believed that dynamic operation of WRONs would overcome the inefficiencies of the static allocation in improving resource use. By rapidly allocating resources only when and where required, dynamic networks could potentially provide the same service that static networks but at decreased cost, very attractive to network operators. This hypothesis, however, has not been verified. It is therefore the focus of this thesis to investigate whether dynamic operation of WRONs can save significant number of wavelengths compared to the static approach whilst maintaining acceptable levels of delay and scalability. Firstly, the wavelength-routed optical-burst-switching (WR-OBS) network architecture is selected as the dynamic architecture to be studied, due to its feasibility of implementation and its improved network performance. Then, the wavelength requirements of dynamic WR-OBS are evaluated by means of novel analysis and simulation and compared to that of static networks for uniform and non-uniform traffic demand. It is shown that dynamic WR-OBS saves wavelengths with respect to the static approach only at low loads and especially for sparsely connected networks and that wavelength conversion is a key capability to significantly increase the benefits of dynamic operation. The mean delay introduced by dynamic operation of WR-OBS is then assessed. The results show that the extra delay is not significant as to violate end-to-end limits of time-sensitive applications. Finally, the limiting scalability of WR-OBS as a function of the lightpath allocation algorithm computational complexity is studied. The trade-off between the request processing time and blocking probability is investigated and a new low-blocking and scalable lightpath allocation algorithm which improves the mentioned trade-off is proposed. The presented algorithms and results can be used in the analysis and design of dynamic WRONs

    Particle swarm optimization for routing and wavelength assignment in next generation WDM networks.

    Get PDF
    PhDAll-optical Wave Division Multiplexed (WDM) networking is a promising technology for long-haul backbone and large metropolitan optical networks in order to meet the non-diminishing bandwidth demands of future applications and services. Examples could include archival and recovery of data to/from Storage Area Networks (i.e. for banks), High bandwidth medical imaging (for remote operations), High Definition (HD) digital broadcast and streaming over the Internet, distributed orchestrated computing, and peak-demand short-term connectivity for Access Network providers and wireless network operators for backhaul surges. One desirable feature is fast and automatic provisioning. Connection (lightpath) provisioning in optically switched networks requires both route computation and a single wavelength to be assigned for the lightpath. This is called Routing and Wavelength Assignment (RWA). RWA can be classified as static RWA and dynamic RWA. Static RWA is an NP-hard (non-polynomial time hard) optimisation task. Dynamic RWA is even more challenging as connection requests arrive dynamically, on-the-fly and have random connection holding times. Traditionally, global-optimum mathematical search schemes like integer linear programming and graph colouring are used to find an optimal solution for NP-hard problems. However such schemes become unusable for connection provisioning in a dynamic environment, due to the computational complexity and time required to undertake the search. To perform dynamic provisioning, different heuristic and stochastic techniques are used. Particle Swarm Optimisation (PSO) is a population-based global optimisation scheme that belongs to the class of evolutionary search algorithms and has successfully been used to solve many NP-hard optimisation problems in both static and dynamic environments. In this thesis, a novel PSO based scheme is proposed to solve the static RWA case, which can achieve optimal/near-optimal solution. In order to reduce the risk of premature convergence of the swarm and to avoid selecting local optima, a search scheme is proposed to solve the static RWA, based on the position of swarm‘s global best particle and personal best position of each particle. To solve dynamic RWA problem, a PSO based scheme is proposed which can provision a connection within a fraction of a second. This feature is crucial to provisioning services like bandwidth on demand connectivity. To improve the convergence speed of the swarm towards an optimal/near-optimal solution, a novel chaotic factor is introduced into the PSO algorithm, i.e. CPSO, which helps the swarm reach a relatively good solution in fewer iterations. Experimental results for PSO/CPSO based dynamic RWA algorithms show that the proposed schemes perform better compared to other evolutionary techniques like genetic algorithms, ant colony optimization. This is both in terms of quality of solution and computation time. The proposed schemes also show significant improvements in blocking probability performance compared to traditional dynamic RWA schemes like SP-FF and SP-MU algorithms

    Network-on-Chip

    Get PDF
    Addresses the Challenges Associated with System-on-Chip Integration Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency, and explores Network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing for the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design. This text comprises 12 chapters and covers: The evolution of NoC from SoC—its research and developmental challenges NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces The router design strategies followed in NoCs The evaluation mechanism of NoC architectures The application mapping strategies followed in NoCs Low-power design techniques specifically followed in NoCs The signal integrity and reliability issues of NoC The details of NoC testing strategies reported so far The problem of synthesizing application-specific NoCs Reconfigurable NoC design issues Direction of future research and development in the field of NoC Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, and researchers and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems
    corecore