21 research outputs found

A NoC-based hybrid message-passing/shared-memory approach to CMP design

Author: Agarwal
Daemen
Forsell
Grecu
Karniadakis
Lorensen
Mario R. Casu
Massimo Ruo Roch
Maurizio Zamboni
Owens
Paulin
Radulescu
Sergio V. Tota
Snir
Tota
Publication venue: Elsevier
Publication date: 01/01/2011

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Towards a power consumption estimation model for routers over TCP and UDP protocols

Author: Mohammed Ali Qusay
Publication date: 01/01/2015

Due to the growing development of the information and communication technology (ICT) industry, the usage of routers has increased rapidly. These devices consume a definite amount of power, and limited focus on power estimation techniques, combined with the increased demand for networking devices, has led to an increase in energy consumption. As new high-capacity router components are installed, energy intake in system elements will rise, because higher-capacity networks consume a larger share of the energy. This study provides a power estimation model for different traffic settings over the TCP and UDP protocols; it is mainly concerned with the power consumption of the transport protocols. Isolating the power-consuming components within an electronic system is a very precise process that requires a deep understanding of the role of each component within the system and a thorough study of the component datasheet. The study started by simulating the protocol mechanisms, followed by protocol power measurements. A simple simulation was provided for the Xilinx Virtex-5; because simulating the whole system is very complicated due to the need for external devices, the simulation focused on wavelengths, frequencies, and traffic types. The study found that the estimated Stokes power was high when the 1480 nm, 1580 nm, and 1750 nm power sources increased, and that the consumed power differed when transmitting different types of traffic, such as CBR and HTTP, through UDP and TCP. The effect of applying different frequencies to the protocols was also observed. It is therefore believed that this study may improve power scenarios in networks and routers by applying different techniques to UDP and TCP.

Universiti Utara Malaysia: UUM eTheses

Exploration and Design of Power-Efficient Networked Many-Core Systems

Author: Rahmani-Sane Amir-Mohammad
Publication venue: Turku Centre for Computer Science
Publication date: 14/12/2012

Multiprocessing is a promising solution to meet the requirements of near-future applications. To get the full benefit from parallel processing, a many-core system needs an efficient on-chip communication architecture. Network-on-Chip (NoC) is a general-purpose communication concept that offers high throughput and reduced power consumption, and keeps complexity in check through a regular composition of basic building blocks. This thesis presents power-efficient communication approaches for networked many-core systems. We address a range of issues important for designing power-efficient many-core systems at two different levels: the network level and the router level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to the latest NoC architectures. The thesis concludes that carefully co-designed elements from different network levels enable considerable power savings for many-core systems.

UTUPub

A RECONFIGURABLE AND EXTENSIBLE EXPLORATION PLATFORM FOR FUTURE HETEROGENEOUS SYSTEMS

Author: Gagliardi Mirko
Publication date: 10/12/2018

Accelerator-based, or heterogeneous, computing has become increasingly important in a variety of scenarios, ranging from High-Performance Computing (HPC) to embedded systems. While some solutions use custom-made components, most of today’s systems rely on commodity high-end CPUs and/or GPU devices, which deliver adequate performance while ensuring programmability, productivity, and application portability. Unfortunately, pure general-purpose hardware is affected by inherently limited power efficiency, that is, low GFLOPS-per-Watt, now considered a primary metric. The many-core model and architectural customization can play a key role here, as they enable unprecedented levels of power efficiency compared to CPUs/GPUs. However, such paradigms are still immature and deeper exploration is indispensable. This dissertation investigates customizability and proposes novel solutions for heterogeneous architectures, focusing on mechanisms related to coherence and network-on-chip (NoC). First, the work presents a non-coherent scratchpad memory with a configurable bank remapping system to reduce bank conflicts. The experimental results show the benefits of both using a customizable hardware bank remapping function and non-coherent memories for some types of algorithms. Next, we demonstrate how a distributed synchronization master better suits many-cores than standard centralized solutions. This solution, inspired by the directory-based coherence mechanism, supports concurrent synchronizations without relying on memory transactions. The results collected for different NoC sizes provided indications about the area overheads incurred by our solution and demonstrated the benefits of using a dedicated hardware synchronization support. Finally, this dissertation proposes an advanced coherence subsystem, based on the sparse directory approach, with a selective coherence maintenance system which allows coherence to be deactivated for blocks that do not require it. Experimental results show that the use of a hybrid coherent and non-coherent architectural mechanism along with an extended coherence protocol can enhance performance. The above results were all collected by means of a modular and customizable heterogeneous many-core system developed to support the exploration of power-efficient high-performance computing architectures. The system is based on a NoC and a customizable GPU-like accelerator core, as well as a reconfigurable coherence subsystem, ensuring application-specific configuration capabilities. All the explored solutions were evaluated on this real heterogeneous system, which, along with the above methodological results, is part of the contribution of this dissertation. In fact, as a key benefit, the experimental platform enables users to integrate novel hardware/software solutions on a full-system scale, whereas existing platforms do not always support a comprehensive heterogeneous architecture exploration.
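The abstract does not say which remapping function the scratchpad uses, so the following is only a minimal sketch of why a configurable bank mapping can reduce bank conflicts. The bank count, the XOR-based mapping, and all function names are assumptions chosen for illustration, not details taken from the dissertation.

```python
from collections import Counter

NUM_BANKS = 16  # assumed bank count, for illustration only


def modulo_bank(addr):
    """Conventional mapping: the low address bits select the bank."""
    return addr % NUM_BANKS


def xor_bank(addr):
    """Hypothetical remapping: XOR the low bits with the next-higher bits."""
    return (addr ^ (addr // NUM_BANKS)) % NUM_BANKS


def worst_conflict(addresses, bank_of):
    """Largest number of simultaneous accesses that land on one bank."""
    return max(Counter(bank_of(a) for a in addresses).values())


# Sixteen lanes reading a column of a 16-wide array: stride equals NUM_BANKS.
strided = [lane * NUM_BANKS for lane in range(NUM_BANKS)]
print(worst_conflict(strided, modulo_bank))  # every access hits bank 0
print(worst_conflict(strided, xor_bank))     # accesses spread over all banks
```

In this toy access pattern the modulo mapping serializes all sixteen accesses on a single bank, while the XOR mapping places each access on a different bank; which mapping helps in practice depends on the algorithm's access pattern, which is why a configurable function is attractive.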

Università degli Studi di Napoli Federico II Open Archive

Développement d'architectures HW/SW tolérantes aux fautes et auto-calibrantes pour les technologies Intégrées 3D

Author: ANGHEL Lorena
PASCA Vladimir
Publication date: 01/01/2013

Despite the advantages of 3D integration, the test, yield, and reliability of Through-Silicon-Vias (TSVs) remain among the greatest challenges for 3D systems based on Networks-on-Chip (NoCs). In this thesis, an off-line test strategy is proposed for the TSV interconnects of the inter-die links of 3D NoCs. For the TSV Interconnect Built-In Self-Test (TSV-IBIST), a new test-vector generation strategy is proposed that detects structural faults (opens and shorts) and parametric faults (delay faults). Strategies for correcting transient and permanent TSV faults are also proposed at several abstraction levels: data link and network. At the data link level, techniques using error correction codes (ECC) and retransmission protect the vertical links. Error correction codes are also used for protection at the network level. TSV manufacturing and aging defects are repaired at the data link level with strategies based on redundancy and serialization. In the network, faulty inter-die links are unusable, and a fault-tolerant routing algorithm is proposed. Fault-tolerance techniques can be implemented at several levels. The results show that a multi-level strategy reaches very high reliability levels at a lower cost. Unfortunately, there is no single solution, and each strategy has its advantages and limitations. It is very difficult to evaluate the costs and the performance impact early in the design flow. Therefore, a fault-resilience exploration methodology is proposed for 3D mesh NoCs.

3D technology promises energy-efficient heterogeneous integrated systems, which may open the way to chips with thousands of cores. Silicon dies containing processing elements are stacked and connected by vertical wires called Through-Silicon-Vias. 
In 3D chips, interconnecting an increasing number of processing elements requires a scalable high-performance interconnect solution: the 3D Network-on-Chip. Despite the advantages of 3D integration, testing, reliability and yield remain the major challenges for 3D NoC-based systems. In this thesis, the TSV interconnect test issue is addressed by an off-line Interconnect Built-In Self-Test (IBIST) strategy that detects both structural faults (i.e. opens, shorts) and parametric faults (i.e. delays and delay due to crosstalk). The IBIST circuitry implements a novel algorithm based on the aggressor-victim scenario and alleviates limitations of existing strategies. The proposed Kth-aggressor fault (KAF) model assumes that the aggressors of a victim TSV are neighboring wires within a distance given by the aggressor order K. Using this model, TSV interconnect tests of inter-die 3D NoC links may be performed for different aggressor orders, reducing test times and circuitry complexity. In 3D NoCs, TSV permanent and transient faults can be mitigated at different abstraction levels. In this thesis, several error resilience schemes are proposed at data link and network levels. For transient faults, 3D NoC links can be protected using error correction codes (ECC) and retransmission schemes using error detection (Automatic Retransmission Query) and correction codes (i.e. Hybrid error correction and retransmission). For transients along a source-destination path, ECC codes can be implemented at network level (i.e. Network-level Forward Error Correction). Data link solutions also include TSV repair schemes for defects due to fabrication processes (i.e. TSV-Spare-and-Replace and Configurable Serial Links) and aging (i.e. Interconnect Built-In Self-Repair and Adaptive Serialization). At network level, the faulty inter-die links of 3D mesh NoCs are repaired by implementing a TSV fault-tolerant routing algorithm. 
Although single-level solutions can achieve the desired yield / reliability targets, error mitigation can be realized by a combination of approaches at several abstraction levels. To this end, multi-level error resilience strategies have been proposed. Experimental results show that there are cases where this multi-level strategy pays off both in terms of cost and performance. Unfortunately, no one-size-fits-all solution exists, as each strategy has its advantages and limitations. For system designers, it is very difficult to assess the costs and the performance impact of error resilience early in the design stages. Therefore, an error resilience exploration (ERX) methodology is proposed for 3D NoCs.
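The Kth-aggressor fault (KAF) model described in the abstract can be illustrated with a short sketch. The abstract only states that aggressors are neighboring wires within a distance given by the order K; the regular grid layout, the Chebyshev distance metric, and the function name below are assumptions made for illustration.

```python
from itertools import product


def aggressors(rows, cols, victim, k):
    """List the TSVs treated as aggressors of `victim` for aggressor order k.

    Assumptions (not from the thesis): TSVs sit on a rows x cols grid and
    distance is Chebyshev distance, so order 1 covers the 8 nearest wires.
    """
    vr, vc = victim
    return sorted(
        (r, c)
        for r, c in product(range(rows), range(cols))
        if (r, c) != victim and max(abs(r - vr), abs(c - vc)) <= k
    )


# A victim in the middle of a 5x5 TSV array, for two aggressor orders.
print(len(aggressors(5, 5, (2, 2), 1)))
print(len(aggressors(5, 5, (2, 2), 2)))
# A corner victim has fewer neighbors at the same order.
print(aggressors(4, 4, (0, 0), 1))
```

Under these assumptions, a lower aggressor order means fewer wires to toggle per victim, which is consistent with the abstract's point that the order trades coverage against test time and circuitry complexity.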

OpenGrey Repository

Jitsuyōteki sōgo ketsugōmō no tame no rironteki sekkei hōhōron

Author: Yasudo Ryōta
ヤスドリョウタ
安戸僚汰
Publication venue: 慶應義塾大学大学院理工学研究科

KeiO Academic Resource Archive

Green Wave : A Semi Custom Hardware Architecture for Reverse Time Migration

Author: Krüger Jens-Thomas
Publication date: 01/01/2012

Over the course of the last few decades the scientific community greatly benefited from steady advances in compute performance. Until the early 2000s this performance improvement was achieved through rising clock rates. This enabled plug-n-play performance improvements for all codes. In 2005 the stagnation of CPU clock rates drove the computing hardware manufacturers to attain future performance through explicit parallelism. Now the HPC community faces a new, even bigger challenge. So far performance gains were achieved through replication of general-purpose cores and nodes. Unfortunately, rising cluster sizes resulted in skyrocketing energy costs - a paradigm change in HPC architecture design is inevitable. In combination with the increasing costs of data movement, the HPC community started exploring alternatives like GPUs and large arrays of simple, low-power cores (e.g. BlueGene) to offer better performance per watt and greater scalability. As in general science, the seismic community faces large-scale, complex computational challenges that can be solved only to a limited extent with available compute capabilities. Such challenges include the physically correct modeling of subsurface rock layers. This thesis analyzes the requirements and performance of isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) wave propagation kernels as they appear in the Reverse Time Migration (RTM) imaging method. It finds that even with leading-edge, commercial off-the-shelf hardware, large-scale survey sizes cannot be imaged within reasonable time and power constraints. This thesis uses a novel architecture design method leveraging a hardware/software co-design approach, adopted from the mobile and embedded markets, for HPC. The methodology tailors an architecture design to a class of applications without the loss of generality found in full custom designs. 
This approach was first applied in the Green Flash project, which proved that the co-design approach has the potential for high energy efficiency gains. This thesis presents the novel Green Wave architecture that is derived from the Green Flash project. Rather than focusing on climate codes, like Green Flash, Green Wave chooses RTM wave propagation kernels as its target application. Thus, the goal of the application-driven, co-design Green Wave approach is to enable full programmability while allowing greater computational efficiency than general-purpose processors or GPUs by offering custom extensions to the processor's ISA and correctly sizing software-managed memories and an efficient on-chip network interconnect. The lowest-level building blocks of the Green Wave design are pre-verified IP components. This minimizes the amount of custom logic in the design, which in turn reduces verification costs and design uncertainty. In this thesis three Green Wave architecture designs derived from ISO, VTI and TTI kernel analysis are introduced. Further, a programming model is proposed that is capable of hiding all communication latencies. Green Wave's performance is benchmarked with production-strength, cycle-accurate hardware simulators and compared to leading on-market systems from Intel, AMD and NVidia. Based on a large-scale example survey, the results show that Green Wave has the potential for an energy efficiency improvement of 5x compared to x86 and 1.4x-4x compared to GPU-based clusters for ISO, VTI and TTI kernels.

Heidelberger Dokumentenserver

MLCAD: A Survey of Research in Machine Learning for CAD Keynote Paper

Author: Amrouch H.
Henkel J.
Lin Y.
Pan D. Z.
Rapp M.
Wolf M.
Yu B.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 29/11/2021

KITopen

Semi-Autonomous Small Unmanned Aircraft Systems for Sampling Tornadic Supercell Thunderstorms

Author: Elston Jack Steward
Publication venue: CU Scholar
Publication date: 01/01/2011

This work describes the development of a network-centric unmanned aircraft system (UAS) for in situ sampling of supercell thunderstorms. UAS have been identified as a well-suited platform for meteorological observations given their portability, endurance, and ability to mitigate atmospheric disturbances. They represent a unique tool for performing targeted sampling in regions of a supercell thunderstorm previously unreachable through other methods. Doppler radar can provide unique measurements of the wind field in and around supercell thunderstorms. In order to exploit this capability, a planner was developed that can optimize ingress trajectories for severe storm penetration. The resulting trajectories were examined to determine the feasibility of such a mission, and to optimize ingress in terms of flight time and exposure to precipitation. A network-centric architecture was developed to handle the large amount of distributed data produced during a storm sampling mission. Creation of this architecture was performed through a bottom-up design approach which reflects and enhances the interplay between networked communication and autonomous aircraft operation. The advantages of the approach are demonstrated through several field and hardware-in-the-loop experiments containing different hardware, networking protocols, and objectives. Results are provided from field experiments involving the resulting network-centric architecture. An airmass boundary was sampled in the Collaborative Colorado Nebraska Unmanned Aircraft Experiment (CoCoNUE). Utilizing lessons learned from CoCoNUE, a new concept of operations (CONOPS) and UAS were developed to perform in situ sampling of supercell thunderstorms. Deployment during the Verification of the Origins of Rotation in Tornadoes Experiment 2 (VORTEX2) resulted in the first ever sampling of the airmass associated with the rear flank downdraft of a tornadic supercell thunderstorm by a UAS. 
Hardware-in-the-loop simulation capability was added to the UAS to enable further assessment of the system and CONOPS. The simulation combines a full six degree-of-freedom aircraft dynamic model with wind and precipitation data from simulations of severe convective storms. Interfaces were written to involve as much of the system's field hardware as possible, including the creation of a simulated radar product server. A variety of simulations were conducted to evaluate different aspects of the CONOPS used for the 2010 VORTEX2 field campaign.

CU Scholar Institutional Repository

A Bio-inspired Load Balancing Technique for Wireless Sensor Networks

Author: Caliskanelli Ipek
Publication venue: University of York
Publication date: 31/03/2014

Wireless Sensor Networks (WSNs) consist of multiple distributed nodes, each with limited resources. With their strict resource constraints and application-specific characteristics, WSNs contain many challenging trade-offs. This thesis is concerned with load balancing in WSNs. We present an approach, inspired by bees’ pheromone propagation mechanism, that allows individual nodes to decide on the execution process locally to solve the trade-off between service availability and energy consumption. We explore the performance consequences of the pheromone-based load balancing approach using a system-level simulator. The effectiveness of the algorithm is evaluated on case studies based on sound sensors, against different scenarios of existing approaches, on a variety of network topologies. The performance of our approach is dependent on the values chosen for its parameters. As such, we utilise Simulated Annealing to discover optimal parameter configurations for the pheromone-based load balancing technique for any given network schema. Once the parameter values are automatically optimised for the given network topology, we investigate improving the pheromone-based load balancing approach using robotic agents. As cyber-physical systems benefit from the heterogeneity of the hardware components, we introduce the use of pheromone signalling-based robotic guidance that integrates the robotic agents into the existing load balancing approach by guiding the robots into the uncovered area of the sensor field. As such, we maximise the service availability using the robotic agents as well as the sensor nodes.
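The parameter search mentioned in the abstract can be sketched with a generic simulated annealing loop. Everything specific below is an assumption made for illustration: the two parameter names, the toy cost function standing in for a network simulation, the cooling schedule, and the step size are not taken from the thesis.

```python
import math
import random


def simulated_annealing(cost, start, neighbor, steps=2000, t0=1.0,
                        cooling=0.995, seed=0):
    """Minimise `cost`, returning the best state seen and its cost."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling  # geometric cooling schedule (assumed)
    return best, best_cost


def toy_cost(params):
    """Hypothetical stand-in for a simulator run: lower is better."""
    decay, threshold = params
    return (decay - 0.3) ** 2 + (threshold - 0.7) ** 2


def perturb(params, rng):
    """Nudge each parameter slightly, keeping it inside [0, 1]."""
    return tuple(min(1.0, max(0.0, p + rng.gauss(0.0, 0.05))) for p in params)


start = (0.9, 0.1)
best, best_cost = simulated_annealing(toy_cost, start, perturb)
print(best, best_cost)
```

In a real study the cost function would be replaced by a simulator run that scores a parameter configuration on a given topology, which makes each evaluation far more expensive than in this toy setting.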

White Rose E-theses Online