289 research outputs found
A fuzzy logic based dynamic reconfiguration scheme for optimal energy and throughput in symmetric chip multiprocessors
Embedded systems architectures have traditionally often been investigated and designed in order to achieve a greater throughput combined with minimum energy consumption. With the advent of reconfigurable architectures it is now possible to support algorithms to find optimal solutions for an improved energy and throughput balance. As a result of ongoing research several online and offline techniques and algorithm have been proposed for hardware adaptation. This paper presents a novel coarse-grained reconfigurable symmetric chip multiprocessor (SCMP) architecture managed by a fuzzy logic engine that balances performance and energy consumption. The architecture incorporates reconfigurable level 1 (L1) caches, power gated cores and adaptive on-chip network routers to allow minimizing leakage energy effects for inactive components. A coarse grained architecture was selected as to be a focus for this study as it typically allows for fast reconfiguration as compared to the fine-grained architectures, thus making it more feasible to be used for runtime adaption schemes. The presented architecture is analyzed using a set of OpenMP based parallel benchmarks and the results show significant improvements in performance while maintaining minimum energy consumption
Fuzzy logic based energy and throughput aware design space exploration for MPSoCs
Multicore architectures were introduced to mitigate the issue of increase in power dissipation with clock frequency. Introduction of deeper pipelines, speculative threading etc. for single core systems were not able to bring much increase in performance as compared to their associated power overhead. However for multicore architectures performance scaling with number of cores has always been a challenge. The Amdahl's law shows that the theoretical maximum speedup of a multicore architecture is not even close to the multiple of number of cores. With less amount of code in parallel having more number of cores for an application might just contribute in greater power dissipation instead of bringing some performance advantage. Therefore there is a need of an adaptive multicore architecture that can be tailored for the application in use for higher energy efficiency. In this paper a fuzzy logic based design space exploration technique is presented that is targeted to optimize a multicore architecture according to the workload requirements in order to achieve optimum balance between throughput and energy of the system
Self-adaptivity of applications on network on chip multiprocessors: the case of fault-tolerant Kahn process networks
Technology scaling accompanied with higher operating frequencies and the ability to integrate more functionality in the same chip has been the driving force behind delivering higher performance computing systems at lower costs. Embedded computing systems, which have been riding the same wave of success, have evolved into complex architectures encompassing a high number of cores interconnected by an on-chip network (usually identified as Multiprocessor System-on-Chip). However these trends are hindered by issues that arise as technology scaling continues towards deep submicron scales. Firstly, growing complexity of these systems and the variability introduced by process technologies make it ever harder to perform a thorough optimization of the system at design time. Secondly, designers are faced with a reliability wall that emerges as age-related degradation reduces the lifetime of transistors, and as the probability of defects escaping post-manufacturing testing is increased. In this thesis, we take on these challenges within the context of streaming applications running in network-on-chip based parallel (not necessarily homogeneous) systems-on-chip that adopt the no-remote memory access model. In particular, this thesis tackles two main problems: (1) fault-aware online task remapping, (2) application-level self-adaptation for quality management. For the former, by viewing fault tolerance as a self-adaptation aspect, we adopt a cross-layer approach that aims at graceful performance degradation by addressing permanent faults in processing elements mostly at system-level, in particular by exploiting redundancy available in multi-core platforms. We propose an optimal solution based on an integer linear programming formulation (suitable for design time adoption) as well as heuristic-based solutions to be used at run-time. We assess the impact of our approach on the lifetime reliability. We propose two recovery schemes based on a checkpoint-and-rollback and a rollforward technique. For the latter, we propose two variants of a monitor-controller- adapter loop that adapts application-level parameters to meet performance goals. We demonstrate not only that fault tolerance and self-adaptivity can be achieved in embedded platforms, but also that it can be done without incurring large overheads. In addressing these problems, we present techniques which have been realized (depending on their characteristics) in the form of a design tool, a run-time library or a hardware core to be added to the basic architecture
Embedded electronic systems driven by run-time reconfigurable hardware
Abstract
This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology –available through SRAM-based FPGA/SoC devices– aimed at contributing to enhance the life quality of the human beings. This work does research on the conception of the system architecture and the reconfiguration engine that provides to the FPGA the capability of dynamic partial reconfiguration in order to synthesize, by means of hardware/software co-design, a given application partitioned in processing tasks which are multiplexed in time and space, optimizing thus its physical implementation –silicon area, processing time, complexity, flexibility, functional density, cost and power consumption– in comparison with other alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of such technology is evaluated through the prototyping of several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a high enough level of maturity for its exploitation in the industry.Resumen
Esta tesis doctoral abarca el diseño de sistemas electrĂłnicos embebidos basados en tecnologĂa hardware dinámicamente reconfigurable –disponible a travĂ©s de dispositivos lĂłgicos programables SRAM FPGA/SoC– que contribuyan a la mejora de la calidad de vida de la sociedad. Se investiga la arquitectura del sistema y del motor de reconfiguraciĂłn que proporcione a la FPGA la capacidad de reconfiguraciĂłn dinámica parcial de sus recursos programables, con objeto de sintetizar, mediante codiseño hardware/software, una determinada aplicaciĂłn particionada en tareas multiplexadas en tiempo y en espacio, optimizando asĂ su implementaciĂłn fĂsica –área de silicio, tiempo de procesado, complejidad, flexibilidad, densidad funcional, coste y potencia disipada– comparada con otras alternativas basadas en hardware estático (MCU, DSP, GPU, ASSP, ASIC, etc.). Se evalĂşa el flujo de diseño de dicha tecnologĂa a travĂ©s del prototipado de varias aplicaciones de ingenierĂa (sistemas de control, coprocesadores aritmĂ©ticos, procesadores de imagen, etc.), evidenciando un nivel de madurez viable ya para su explotaciĂłn en la industria.Resum
Aquesta tesi doctoral estĂ orientada al disseny de sistemes electrònics empotrats basats en tecnologia hardware dinĂ micament reconfigurable –disponible mitjançant dispositius lògics programables SRAM FPGA/SoC– que contribueixin a la millora de la qualitat de vida de la societat. S’investiga l’arquitectura del sistema i del motor de reconfiguraciĂł que proporcioni a la FPGA la capacitat de reconfiguraciĂł dinĂ mica parcial dels seus recursos programables, amb l’objectiu de sintetitzar, mitjançant codisseny hardware/software, una determinada aplicaciĂł particionada en tasques multiplexades en temps i en espai, optimizant aixĂ la seva implementaciĂł fĂsica –à rea de silici, temps de processat, complexitat, flexibilitat, densitat funcional, cost i potència dissipada– comparada amb altres alternatives basades en hardware estĂ tic (MCU, DSP, GPU, ASSP, ASIC, etc.). S’evalĂşa el fluxe de disseny d’aquesta tecnologia a travĂ©s del prototipat de varies aplicacions d’enginyeria (sistemes de control, coprocessadors aritmètics, processadors d’imatge, etc.), demostrant un nivell de maduresa viable ja per a la seva explotaciĂł a la indĂşstria
High-performance and hardware-aware computing: proceedings of the second International Workshop on New Frontiers in High-performance and Hardware-aware Computing (HipHaC\u2711), San Antonio, Texas, USA, February 2011 ; (in conjunction with HPCA-17)
High-performance system architectures are increasingly exploiting heterogeneity. The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full
hardware potential if all features on all levels are taken into account in a holistic approach
Software parametrization of feasible reconfigurable real-time systems under energy and dependency constraints
Enforcing temporal constraints is necessary to maintain the correctness of a realtime system. However, a real-time system may be enclosed by many factors and constraints that lead to different challenges to overcome. In other words, to achieve the real-time aspects, these systems face various challenges particularly in terms of architecture, reconfiguration property, energy consumption, and dependency constraints. Unfortunately, the characterization of real-time task deadlines is a relatively unexplored problem in the real-time community. Most of the literature seems to consider that the deadlines are somehow provided as hard assumptions, this can generate high costs relative to the development time if these deadlines are violated at runtime. In this context, the main aim of this thesis is to determine the effective temporal properties that will certainly be met at runtime under well-defined constraints. We went to overcome these challenges in a step-wise manner. Each time, we elected a well-defined subset of challenges to be solved. This thesis deals with reconfigurable real-time systems in mono-core and multi-core architectures. First, we propose a new scheduling strategy based on configuring feasible scheduling of software tasks of various types (periodic, sporadic, and aperiodic) and constraints (hard and soft) mono-core architecture. Then, the second contribution deals with reconfigurable real-time systems in mono-core under energy and resource sharing constraints. Finally, the main objective of the multi-core architecture is achieved in a third contribution.Das Erzwingen zeitlicher Beschränkungen ist notwendig,um die Korrektheit eines Echtzeitsystems aufrechtzuerhalten. Ein Echtzeitsystem kann jedoch von vielen Faktoren und Beschränkungen umgeben sein, die zu unterschiedlichen Herausforderungen führen, die es zu bewältigen gilt. Mit anderen Worten, um die zeitlichen Aspekte zu erreichen, können diese Systeme verschiedenen Herausforderungen gegenüberstehen, einschliesslich Architektur, Rekonfigurationseigenschaft, Energie und Abhängigkeitsbeschränkungen. Leider ist die Charakterisierung von Echtzeit-Aufgabenterminen ein relativ unerforschtes Problem in der Echtzeit-Community. Der grösste Teil der Literatur geht davon aus, dass die Fristen (Deadlines) irgendwie als harte Annahmen bereitgestellt werden, was im Verhältnis zur Entwicklungszeit hohe Kosten verursachen kann, wenn diese Fristen zur Laufzeit verletzt werden. In diesem Zusammenhang ist das Hauptziel dieser Arbeit, die effektiven zeitlichen Eigenschaften zu bestimmen, die zur Laufzeit unter wohldefinierten Randbedingungen mit Sicherheit erfüllt werden. Wir haben diese Herausforderungen schrittweise gemeistert. Jedes Mal haben wir eine wohldefinierte Teilmenge von Herausforderungen ausgewählt, die es zu lösen gilt. Zunächst schlagen wir eine neue Scheduling-Strategie vor, die auf der Konfiguration eines durchführbaren Scheduling von Software-Tasks verschiedener Typen (periodisch, sporadisch und aperiodisch) und Beschränkungen (hart und weich) einer Mono-Core-Architektur basiert. Der zweite Beitrag befasst sich dann mit rekonfigurierbaren Echtzeitsystemen in Mono-Core unter Energie und Ressourcenteilungsbeschränkungen. Abschliessend wird in einem dritten Beitrag das Verfahren auf Multi-Core-Architekturen erweitert
Software development of reconfigurable real-time systems : from specification to implementation
This thesis deals with reconfigurable real-time systems solving real-time tasks scheduling problems in a mono-core and multi-core architectures. The main focus in this thesis is on providing guidelines, methods, and tools for the synthesis of feasible reconfigurable real-time systems in a mono-processor and multi-processor architectures. The development of these systems faces various challenges particularly in terms of stability, energy consumption, response and blocking time. To address this problem, we propose in this work a new strategy of i) placement and scheduling of tasks to execute real-time applications on mono-core and multi-core architectures, ii) optimization step based on Mixed integer linear programming (MILP), and iii) guidance tool that assists designers to implement a feasible multi-core reconfigurable real-time from specification level to implementation level. We apply and simulate the contribution to a case study, and compare the proposed results with related works in order to show the originality of this methodology.Echtzeitsysteme laufen unter harten Bedingungen an ihre Ausführungszeit. Die Einhaltung der Echtzeit-Bedingungen bestimmt die Zuverlässigkeit und Genauigkeit dieser Systeme. Neben den Echtzeit-Bedingungen müssen rekonfigurierbare Echtzeitsysteme zusätzliche Rekonfigurations-Bedingungen erfüllen. Diese Arbeit beschäftigt sich mit rekonfigurierbaren Echtzeitsystemen in Mono- und Multicore-Architekturen. An die Entwicklung dieser Systeme sind verschiedene Anforderungen gestellt. Insbesondere muss die Rekonfigurierbarkeit beachtet werden. Dabei sind aber Echtzeit-Bedingungen und Ressourcenbeschränkungen weiterhin zu beachten. Darüber hinaus werden die Kosten für die Entwicklung dieser Systeme insbesondere durch falsche Designentscheidungen in den frühen Phasen der Entwicklung stark beeinträchtigt. Das Hauptziel in dieser Arbeit liegt deshalb auf der Bereitstellung von Handlungsempfehlungen, Methoden und Werkzeugen für die zielgerichtete Entwicklung von realisierbaren rekonfigurierbaren Echtzeitsystemen in Mono- und Multicore-Architekturen. Um diese Herausforderungen zu adressieren wird eine neue Strategie vorgeschlagen, die 1) die Funktionsallokation, 2) die Platzierung und das Scheduling von Tasks, 3) einen Optimierungsschritt auf der Basis von Mixed Integer Linear Programming (MILP) und 4) eine entscheidungsunterstützende Lösung umfasst, die den Designern hilft, eine realisierbare rekonfigurierbare Echtzeitlösung von der Spezifikationsebene bis zur Implementierungsebene zu entwickeln. Die vorgeschlagene Methodik wird auf eine Fallstudie angewendet und mit verwandten Arbeiten vergliche
On Dynamic Monitoring Methods for Networks-on-Chip
Rapid ongoing evolution of multiprocessors will lead to systems with hundreds of processing cores integrated in a single chip. An emerging challenge is the implementation of reliable and efficient interconnection between these cores as well as other components in the systems. Network-on-Chip is an interconnection approach which is intended to solve the performance bottleneck caused by traditional, poorly scalable communication structures such as buses. However, a large on-chip network involves issues related to congestion problems and system control, for instance. Additionally, faults can cause problems in multiprocessor systems. These faults can be transient faults, permanent manufacturing faults, or they can appear due to aging. To solve the emerging traffic management, controllability issues and to maintain system operation regardless of faults a monitoring system is needed. The monitoring system should be dynamically applicable to various purposes and it should fully cover the system under observation. In a large multiprocessor the distances between components can be relatively long. Therefore, the system should be designed so that the amount of energy-inefficient long-distance communication is minimized.
This thesis presents a dynamically clustered distributed monitoring structure. The monitoring is distributed so that no centralized control is required for basic tasks such as traffic management and task mapping. To enable extensive analysis of different Network-on-Chip architectures, an in-house SystemC based simulation environment was implemented. It allows transaction level analysis without time consuming circuit level implementations during early design phases of novel architectures and features.
The presented analysis shows that the dynamically clustered monitoring structure can be efficiently utilized for traffic management in faulty and congested Network-on-Chip-based multiprocessor systems. The monitoring structure can be also successfully applied for task mapping purposes. Furthermore, the analysis shows that the presented in-house simulation environment is flexible and practical tool for extensive Network-on-Chip architecture analysis.Siirretty Doriast
Reconfigurable hardware architecture of a shape recognition system based on specialized tiny neural networks with online training.
Neural networks are widely used in pattern recognition, security applications, and robot control. We propose a hardware architecture system using tiny neural networks (TNNs)specialized in image recognition. The generic TNN architecture allows for expandability by means of mapping several basic units(layers) and dynamic reconfiguration, depending on the application specific demands. One of the most important features of TNNs is their learning ability. Weight modification and architecture reconfiguration can be carried out at run-time. Our system performs objects identification by the interpretation of characteristics elements of their shapes. This is achieved by interconnecting several specialized TNNs. The results of several tests in different conditions are reported in this paper. The system accurately detects a test shape in most of the experiments performed. This paper also contains a detailed description of the system architecture and the processing steps. In order to validate the research, the system has been implemented and configured as a perceptron network with back-propagation learning, choosing as reference application the recognition of shapes. Simulation results show that this architecture has significant performance benefits
- …