155 research outputs found

    Power-aware Manhattan routing on chip multiprocessors

    We consider the routing of communications on a chip multiprocessor (CMP). The goal is to find a valid routing, i.e., one in which the amount of data routed between two neighbouring cores does not exceed the maximum bandwidth, and such that the power dissipated by the communications is minimal. We take a system-level view: we assume that applications, given as task graphs, run on the CMP, with each task already assigned to a core, so that we have a set of communications to route between cores. We use a classical model in which the power dissipated by a communication link is the sum of a static part and a dynamic part, the latter depending on the frequency of the link; this frequency is adjustable and proportional to the bandwidth. The most widely used policy is XY routing: each communication is sent horizontally first, then vertically. However, if we allow ourselves to use Manhattan paths between the source and the destination, the dissipated power can be reduced considerably; moreover, it is sometimes possible to find a solution where none existed with XY routing. In this paper, we compare XY routing and routing along Manhattan paths, from both a theoretical and a practical point of view. We consider two variants of Manhattan-path routing: in single-path routing, only one path may be used for each communication, whereas multi-path routing allows a communication to be split and sent along several routes. We establish the NP-completeness of the problem of finding a Manhattan routing that minimizes the dissipated power, exhibit the least upper bound on the ratio between the power dissipated by an XY routing and that dissipated by a Manhattan routing, and finally run simulations to study the performance of our Manhattan routing heuristics.
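    The link power model and the XY policy described in this abstract can be made concrete with a minimal sketch. The exponent, constants, and grid-coordinate convention below are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch: link power = static part + dynamic part, with the dynamic part
# growing with link frequency and frequency proportional to routed bandwidth.
# ALPHA, P_STATIC and K_DYN are assumed, illustrative constants.

ALPHA = 3.0      # assumed convex exponent of the dynamic part
P_STATIC = 0.1   # assumed static power of an active link
K_DYN = 1.0      # assumed dynamic-power constant

def link_power(bandwidth):
    """Power of one link routing `bandwidth` data (frequency proportional to bandwidth)."""
    if bandwidth == 0:
        return 0.0                                  # an unused link dissipates nothing here
    return P_STATIC + K_DYN * bandwidth ** ALPHA    # static + dynamic part

def xy_path(src, dst):
    """XY routing: move horizontally first, then vertically."""
    (sx, sy), (dx, dy) = src, dst
    path = [(x, sy) for x in range(sx, dx, 1 if dx >= sx else -1)]
    path += [(dx, y) for y in range(sy, dy, 1 if dy >= sy else -1)]
    path.append((dx, dy))
    return path

def total_power(link_loads):
    """Total dissipated power for a routing given as {link: routed bandwidth}."""
    return sum(link_power(b) for b in link_loads.values())
```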

    Cache Equalizer: A Cache Pressure Aware Block Placement Scheme for Large-Scale Chip Multiprocessors

    This paper describes Cache Equalizer (CE), a novel distributed cache management scheme for large-scale chip multiprocessors (CMPs). Our work is motivated by the large asymmetry in cache set usage. CE decouples the physical locations of cache blocks from their addresses in order to reduce misses caused by destructive interference. Temporal pressure at the on-chip last-level cache is continuously collected at a group (comprising several cache sets) granularity and periodically recorded at the memory controller to guide the placement process. An incoming block is consequently placed at the cache group that exhibits the minimum pressure. CE provides Quality of Service (QoS) by robustly offering better performance than the baseline shared NUCA cache. Simulation results using a full-system simulator demonstrate that CE outperforms shared NUCA caches by an average of 15.5%, and by as much as 28.5%, for the benchmark programs we examined. Furthermore, our evaluations show that CE also outperforms related CMP cache designs.
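    As an illustration of the pressure-guided placement idea, here is a minimal sketch; the class and method names, counter granularity, and aging policy are assumptions, not the paper's design:

```python
# Minimal sketch of pressure-aware block placement: pressure is tracked per group
# of cache sets and an incoming block is placed in the least-pressured group.

class CacheEqualizerSketch:
    def __init__(self, num_groups):
        self.pressure = [0] * num_groups          # temporal pressure per group

    def record_access(self, group):
        self.pressure[group] += 1                 # collected continuously at the LLC

    def place_block(self):
        """Return the group with minimum pressure for an incoming block."""
        return min(range(len(self.pressure)), key=self.pressure.__getitem__)

    def new_epoch(self):
        """Periodically age the counters, e.g. when they are recorded at the memory controller."""
        self.pressure = [p // 2 for p in self.pressure]
```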

    Energy and performance-aware application mapping for inhomogeneous 3D networks-on-chip

    Three-dimensional Networks-on-Chip (3D NoCs) have emerged as an ideal solution to the communication demands and complexity of future high-density many-core architectures. However, the practicality of 3D NoC design faces several challenges, such as thermal issues, the high power consumption and area overhead of 3D routers, and the high complexity and cost of implementing vertical links. To mitigate the performance and manufacturing cost of 3D NoCs, inhomogeneous architectures have emerged that combine 2D and 3D routers in a 3D NoC, yielding lower area and energy consumption while maintaining the performance of homogeneous 3D NoCs. Due to the limited number of vertical links, application mapping onto inhomogeneous 3D NoCs can be complex; at the same time, the mapping has a great impact on the performance and energy consumption of the NoC. This paper presents an energy- and performance-aware application mapping algorithm for inhomogeneous 3D NoCs. The algorithm has been evaluated with various realistic traffic patterns and compared with existing mapping algorithms. Experimental results show that NoCs mapped with the proposed algorithm have lower energy consumption and significantly reduced packet delays compared to the existing algorithms, and average packet latency comparable to Branch-and-Bound.
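    The abstract does not spell out its cost function; a common way to reason about mapping energy is a hop-count model, sketched below. The constants and the separate weighting of vertical (TSV) hops are assumptions for illustration, not the proposed algorithm:

```python
# Minimal sketch under an assumed hop-count energy model: mapping energy is the
# sum over task-graph edges of traffic volume x per-hop energy x hop count,
# with vertical hops weighted differently from horizontal hops.

E_HOP_2D = 1.0   # assumed energy per horizontal hop, per unit of traffic
E_HOP_3D = 0.5   # assumed energy per vertical hop, per unit of traffic

def hops(a, b):
    """(horizontal, vertical) hop counts between two routers given as (x, y, layer)."""
    (ax, ay, az), (bx, by, bz) = a, b
    return abs(ax - bx) + abs(ay - by), abs(az - bz)

def mapping_energy(edges, mapping):
    """edges: {(src_task, dst_task): volume}; mapping: task -> router coordinate."""
    total = 0.0
    for (s, d), vol in edges.items():
        h2d, h3d = hops(mapping[s], mapping[d])
        total += vol * (E_HOP_2D * h2d + E_HOP_3D * h3d)
    return total
```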

    Temperature Evaluation of NoC Architectures and Dynamically Reconfigurable NoC

    Advancements in the field of chip fabrication have led to the integration of a large number of transistors in a small area, giving rise to the multi-core processor era. Massive multi-core processors facilitate innovation and research in fields such as healthcare, defense, entertainment, and meteorology. Reduction in chip area and the increase in the number of on-chip cores are accompanied by power and temperature issues. In high-performance multi-core chips, power and heat are predominant constraints: massive multi-core systems suffer from thermal hotspots, exacerbating the problem of reliability in deep-submicron technologies, while high power consumption not only increases the chip temperature but also jeopardizes the integrity of the system. Hence, there is a need to explore holistic power and thermal optimization and management strategies for massive on-chip multi-core environments. In multi-core environments, the communication fabric plays a major role in deciding the efficiency of the system, and in multi-core processor chips this communication infrastructure is predominantly a Network-on-Chip (NoC). Traditional NoC designs incorporate planar interconnects; as a result, these NoCs rely on long, multi-hop wireline links for data exchange, and such architectures suffer from high latency, significant power dissipation, and temperature hotspots. Networks inspired by nature are envisioned as an enabling technology for highly efficient, low-power NoC designs, and adopting wireless technology in such architectures enhances their performance. The placement of wireless interconnects (WIs) alters the behavior of the network, so a random deployment of WIs may not yield a thermally optimal solution: the WIs, being highly efficient, would attract high traffic densities, resulting in thermal hotspots. Hence, the location and utilization of the wireless links are key to obtaining a thermally optimal, highly efficient Network-on-Chip. Optimization of the NoC framework alone cannot address the effects of the runtime dynamics of the system; minimal paths optimized solely for performance may lead to excessive utilization of certain NoC components and hence to thermal hotspots. Architectural innovation in conjunction with suitable power and thermal management strategies is therefore the key to designing high-performance, energy-efficient multi-core systems. This work explores various wired and wireless NoC architectures that achieve the best trade-offs between temperature, performance, and energy efficiency. It further proposes an adaptive routing scheme that factors in the thermal profile of the chip: the proposed routing mechanism dynamically reacts to the thermal profile and takes measures to avoid thermal hotspots, achieving a thermally efficient, dynamically reconfigurable Network-on-Chip architecture.
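    As a rough illustration of thermally adaptive routing in general (not the specific scheme proposed in this work), one option is to choose, among the next hops that stay on a minimal path, the neighbour with the lowest current temperature:

```python
# Minimal sketch of temperature-aware adaptive routing on a 2D mesh: among the
# productive next hops toward the destination, pick the coolest neighbour so that
# traffic is steered away from emerging hotspots. Names and the tie-breaking
# policy are assumptions for illustration.

def next_hop(cur, dst, temperature):
    """cur, dst: (x, y) router coordinates; temperature: dict {(x, y): degrees}."""
    (cx, cy), (dx, dy) = cur, dst
    candidates = []
    if dx != cx:
        candidates.append((cx + (1 if dx > cx else -1), cy))   # productive X hop
    if dy != cy:
        candidates.append((cx, cy + (1 if dy > cy else -1)))   # productive Y hop
    if not candidates:
        return cur                                             # already at the destination
    return min(candidates, key=lambda r: temperature.get(r, 0.0))
```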

    Hardware/Software Co-design for Multicore Architectures

    Transferred from Doria

    Channel routing for integrated optics

    The increasing scope and range of applications of integrated optics necessitate the development of automated techniques for the physical design of optical systems. This paper presents an automated, planar channel routing technique for integrated optical waveguides. Integrated optics is a planar technology and lacks the inherent signal restoration capabilities of static CMOS. Therefore, signal loss minimization, as a function of waveguide crossings and bends, is the primary objective of this technique, in contrast to the track and wire-length minimization of traditional VLSI routing. Our optical channel router guarantees minimal waveguide crossings by drawing upon sorting-based techniques for waveguide routing. To further improve our solutions in terms of signal loss, we extend the router to reduce the number of bends produced during routing. Finally, we implement the optical channel routing technique and describe experimental results, comparing the costs of routing solutions with respect to waveguide crossings, bends, and channel height.
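    The connection between sorting and crossing-minimal channel routing can be illustrated with a small sketch; this shows only the general principle (inversion counting and adjacent swaps), not the paper's actual router:

```python
# Minimal sketch: if nets enter the channel in one order and must exit in a target
# permutation, the minimum number of pairwise crossings equals the permutation's
# inversion count, which a bubble-sort-style sequence of adjacent swaps realizes.

def min_crossings(perm):
    """Number of inversions of `perm` = minimum pairwise waveguide crossings."""
    return sum(1 for i in range(len(perm))
                 for j in range(i + 1, len(perm))
                 if perm[i] > perm[j])

def crossing_schedule(perm):
    """Adjacent-swap schedule whose length equals min_crossings(perm)."""
    perm = list(perm)
    swaps = []
    for _ in range(len(perm)):
        for i in range(len(perm) - 1):
            if perm[i] > perm[i + 1]:
                perm[i], perm[i + 1] = perm[i + 1], perm[i]
                swaps.append(i)        # nets currently in tracks i and i+1 cross here
    return swaps

# Example: min_crossings([2, 0, 1]) == 2 == len(crossing_schedule([2, 0, 1]))
```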