21 research outputs found

    Towards a power consumption estimation model for routers over TCP and UDP protocols

    Get PDF
    Due to the growing development in the information and communication technology (ICT) industry, the usage of routers has increased rapidly. Meanwhile, these devices that are produced and developed today consume a definite amount of power, Furthermore, with limited focus on power estimation techniques and the increased demands of networking devices, it led to an increase of the vitality consumption as a result. While new high capacity router components are installed, energy intake in system elements will be rising due to the higher capability network consuming larger component of the vitality. This study considers providing estimating power model in different traffic settings over TCP and UDP protocols, this study is mainly concerned about the transport protocols power consumption. Isolating the power consuming components within an electronic system is a very precise process that requires deep understanding of the role of each component within the system and a thorough study of the component datasheet. The study started by simulating the protocols mechanism then followed by protoclos power measurements, a simple simulation has been provided for Xilinx Virtex-5, it is very complicated to simulate the whole system due to the need of an external devices, so the simulation focused on wavelengths, frequencies and traffic types. This study found that the estimated power stokes was high when the 1480nm, 1580nm, and 1750nm power source increase. while there were differrence in the consumed power while transiting different types of traffic such as CBR and HTTP through UDP and TCP. The effect of different frequencies has been noticed also while applying different frequencies to the protocols. So it is believed that this study may enhance the power scenarios in the network and routers throug applying different techniques to UDP and TC

    Exploration and Design of Power-Efficient Networked Many-Core Systems

    Get PDF
    Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.Siirretty Doriast

    A RECONFIGURABLE AND EXTENSIBLE EXPLORATION PLATFORM FOR FUTURE HETEROGENEOUS SYSTEMS

    Get PDF
    Accelerator-based -or heterogeneous- computing has become increasingly important in a variety of scenarios, ranging from High-Performance Computing (HPC) to embedded systems. While most solutions use sometimes custom-made components, most of today’s systems rely on commodity highend CPUs and/or GPU devices, which deliver adequate performance while ensuring programmability, productivity, and application portability. Unfortunately, pure general-purpose hardware is affected by inherently limited power-efficiency, that is, low GFLOPS-per-Watt, now considered as a primary metric. The many-core model and architectural customization can play here a key role, as they enable unprecedented levels of power-efficiency compared to CPUs/GPUs. However, such paradigms are still immature and deeper exploration is indispensable. This dissertation investigates customizability and proposes novel solutions for heterogeneous architectures, focusing on mechanisms related to coherence and network-on-chip (NoC). First, the work presents a non-coherent scratchpad memory with a configurable bank remapping system to reduce bank conflicts. The experimental results show the benefits of both using a customizable hardware bank remapping function and non-coherent memories for some types of algorithms. Next, we demonstrate how a distributed synchronization master better suits many-cores than standard centralized solutions. This solution, inspired by the directory-based coherence mechanism, supports concurrent synchronizations without relying on memory transactions. The results collected for different NoC sizes provided indications about the area overheads incurred by our solution and demonstrated the benefits of using a dedicated hardware synchronization support. Finally, this dissertation proposes an advanced coherence subsystem, based on the sparse directory approach, with a selective coherence maintenance system which allows coherence to be deactivated for blocks that do not require it. Experimental results show that the use of a hybrid coherent and non-coherent architectural mechanism along with an extended coherence protocol can enhance performance. The above results were all collected by means of a modular and customizable heterogeneous many-core system developed to support the exploration of power-efficient high-performance computing architectures. The system is based on a NoC and a customizable GPU-like accelerator core, as well as a reconfigurable coherence subsystem, ensuring application-specific configuration capabilities. All the explored solutions were evaluated on this real heterogeneous system, which comes along with the above methodological results as part of the contribution in this dissertation. In fact, as a key benefit, the experimental platform enables users to integrate novel hardware/software solutions on a full-system scale, whereas existing platforms do not always support a comprehensive heterogeneous architecture exploration

    Développement d'architectures HW/SW tolérantes aux fautes et auto-calibrantes pour les technologies Intégrées 3D

    Get PDF
    Malgré les avantages de l'intégration 3D, le test, le rendement et la fiabilité des Through-Silicon-Vias (TSVs) restent parmi les plus grands défis pour les systèmes 3D à base de Réseaux-sur-Puce (Network-on-Chip - NoC). Dans cette thèse, une stratégie de test hors-ligne a été proposé pour les interconnections TSV des liens inter-die des NoCs 3D. Pour le TSV Interconnect Built-In Self-Test (TSV-IBIST) on propose une nouvelle stratégie pour générer des vecteurs de test qui permet la détection des fautes structuraux (open et short) et paramétriques (fautes de délaye). Des stratégies de correction des fautes transitoires et permanents sur les TSV sont aussi proposées aux plusieurs niveaux d'abstraction: data link et network. Au niveau data link, des techniques qui utilisent des codes de correction (ECC) et retransmission sont utilisées pour protégé les liens verticales. Des codes de correction sont aussi utilisés pour la protection au niveau network. Les défauts de fabrication ou vieillissement des TSVs sont réparé au niveau data link avec des stratégies à base de redondance et sérialisation. Dans le réseau, les liens inter-die défaillante ne sont pas utilisables et un algorithme de routage tolérant aux fautes est proposé. On peut implémenter des techniques de tolérance aux fautes sur plusieurs niveaux. Les résultats ont montré qu'une stratégie multi-level atteint des très hauts niveaux de fiabilité avec un cout plus bas. Malheureusement, il n'y as pas une solution unique et chaque stratégie a ses avantages et limitations. C'est très difficile d'évaluer tôt dans le design flow les couts et l'impact sur la performance. Donc, une méthodologie d'exploration de la résilience aux fautes est proposée pour les NoC 3D mesh.3D technology promises energy-efficient heterogeneous integrated systems, which may open the way to thousands cores chips. Silicon dies containing processing elements are stacked and connected by vertical wires called Through-Silicon-Vias. In 3D chips, interconnecting an increasing number of processing elements requires a scalable high-performance interconnect solution: the 3D Network-on-Chip. Despite the advantages of 3D integration, testing, reliability and yield remain the major challenges for 3D NoC-based systems. In this thesis, the TSV interconnect test issue is addressed by an off-line Interconnect Built-In Self-Test (IBIST) strategy that detects both structural (i.e. opens, shorts) and parametric faults (i.e. delays and delay due to crosstalk). The IBIST circuitry implements a novel algorithm based on the aggressor-victim scenario and alleviates limitations of existing strategies. The proposed Kth-aggressor fault (KAF) model assumes that the aggressors of a victim TSV are neighboring wires within a distance given by the aggressor order K. Using this model, TSV interconnect tests of inter-die 3D NoC links may be performed for different aggressor order, reducing test times and circuitry complexity. In 3D NoCs, TSV permanent and transient faults can be mitigated at different abstraction levels. In this thesis, several error resilience schemes are proposed at data link and network levels. For transient faults, 3D NoC links can be protected using error correction codes (ECC) and retransmission schemes using error detection (Automatic Retransmission Query) and correction codes (i.e. Hybrid error correction and retransmission).For transients along a source-destination path, ECC codes can be implemented at network level (i.e. Network-level Forward Error Correction). Data link solutions also include TSV repair schemes for faults due to fabrication processes (i.e. TSV-Spare-and-Replace and Configurable Serial Links) and aging (i.e. Interconnect Built-In Self-Repair and Adaptive Serialization) defects. At network-level, the faulty inter-die links of 3D mesh NoCs are repaired by implementing a TSV fault-tolerant routing algorithm. Although single-level solutions can achieve the desired yield / reliability targets, error mitigation can be realized by a combination of approaches at several abstraction levels. To this end, multi-level error resilience strategies have been proposed. Experimental results show that there are cases where this multi-layer strategy pays-off both in terms of cost and performance. Unfortunately, one-fits-all solution does not exist, as each strategy has its advantages and limitations. For system designers, it is very difficult to assess early in the design stages the costs and the impact on performance of error resilience. Therefore, an error resilience exploration (ERX) methodology is proposed for 3D NoCs.SAVOIE-SCD - Bib.électronique (730659901) / SudocGRENOBLE1/INP-Bib.électronique (384210012) / SudocGRENOBLE2/3-Bib.électronique (384219901) / SudocSudocFranceF

    Green Wave : A Semi Custom Hardware Architecture for Reverse Time Migration

    Get PDF
    Over the course of the last few decades the scientific community greatly benefited from steady advances in compute performance. Until the early 2000's this performance improvement was achieved through rising clock rates. This enabled plug-n-play performance improvements for all codes. In 2005 the stagnation of CPU clock rates drove the computing hardware manufactures to attain future performance through explicit parallelism. Now the HPC community faces a new, even bigger challenge. So far performance gains were achieved through replication of general-purpose cores and nodes. Unfortunately, rising cluster sizes resulted in skyrocketing energy costs - a paradigm change in HPC architecture design is inevitable. In combination with the increasing costs of data movement, the HPC community started exploring alternatives like GPUs and large arrays of simple, low-power cores (e.g. BlueGene) to offer the better performance per Watt and greatest scalability. As in general science, the seismic community faces large-scale, complex computational challenges that can only be limited solved with available compute capabilities. Such challenges include the physically correct modeling of subsurface rock layers. This thesis analyzes the requirements and performance of isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) wave propagation kernels as they appear in the Reverse Time Migration (RTM) imaging method. It finds that even with leading-edge, commercial off-the-shelf hardware, large-scale survey sizes cannot be imaged within reasonable time and power constraints. This thesis uses a novel architecture design method leveraging a hardware/software co-design approach, adopted from the mobile- and embedded market, for HPC. The methodology tailors an architecture design to a class of applications without loss of generality like in full custom designs. This approach was first applied in the Green Flash project, which proved that the co-design approach has the potential for high energy efficiency gains. This thesis presents the novel Green Wave architecture that is derived from the Green Flash project. Rather than focusing on climate codes, like Green Flash, Green Wave chooses RTM wave propagation kernels as its target application. Thus, the goal of the application-driven, co-design Green Wave approach, is to enable full programmability while allowing greater computational efficiency than general-purpose processors or GPUs by offering custom extensions to the processor's ISA and correctly sizing software-managed memories and an efficient on-chip network interconnect. The lowest level building blocks of the Green Wave design are pre-verified IP components. This minimizes the amount of custom logic in the design, which in turn reduces verification costs and design uncertainty. In this thesis three Green Wave architecture designs derived from ISO, VTI and TTI kernel analysis are introduced. Further, a programming model is proposed capable of hiding all communication latencies. With production-strength, cycle-accurate hardware simulators Green Wave's performance is benchmarked and its performance compared to leading on-market systems from Intel, AMD and NVidia. Based on a large-scale example survey, the results show that Green Wave has the potential of an energy efficiency improvement of 5x compared to x86 and 1.4x-4x to GPU-based clusters for ISO, VTI and TTI kernels

    MLCAD: A Survey of Research in Machine Learning for CAD Keynote Paper

    Get PDF

    A Bio-inspired Load Balancing Technique for Wireless Sensor Networks

    Get PDF
    Wireless Sensor Networks (WSNs) consist of multiple distributed nodes each with limited resources. With their strict resource constraints and application-specific characteristics, WSNs contain many challenging trade-offs. This thesis is concerned with the load balancing of Wireless Sensor Networks (WSNs). We present an approach, inspired by bees’ pheromone propagation mechanism, that allows individual nodes to decide on the execution process locally to solve the trade-off between service availability and energy consumption. We explore the performance consequences of the pheromone-based load balancing approach using a system-level simulator. The effectiveness of the algorithm is evaluated on case studies based on sound sensors with different scenarios of existing approaches on variety of different network topologies. The performance of our approach is dependant on the values chosen for its parameters. As such, we utilise the Simulated Annealing to discover optimal parameter configurations for pheromone-based load balancing technique for any given network schema. Once the parameter values are optimised for the given network topology automatically, we inspect improving the pheromone-based load balancing approach using robotic agents. As cyber-physical systems benefit from the heterogeneity of the hardware components, we introduce the use of pheromone signalling-based robotic guidance that integrates the robotic agents to the existing load balancing approach by guiding the robots into the uncovered area of the sensor field. As such, we maximise the service availability using the robotic agents as well as the sensor nodes
    corecore