19 research outputs found
Automated Debugging Methodology for FPGA-based Systems
Electronic devices make up a vital part of our lives. These are seen from mobiles, laptops, computers, home automation, etc. to name a few. The modern designs constitute billions of transistors. However, with this evolution, ensuring that the devices fulfill the designer’s expectation under variable conditions has also become a great challenge. This requires a lot of design time and effort. Whenever an error is encountered, the process is re-started. Hence, it is desired to minimize the number of spins required to achieve an error-free product, as each spin results in loss of time and effort.
Software-based simulation systems present the main technique to ensure the verification of the design before fabrication. However, few design errors (bugs) are likely to escape the simulation process. Such bugs subsequently appear during the post-silicon phase. Finding such bugs is time-consuming due to inherent invisibility of the hardware. Instead of software simulation of the design in the pre-silicon phase, post-silicon techniques permit the designers to verify the functionality through the physical implementations of the design. The main benefit of the methodology is that the implemented design in the post-silicon phase runs many order-of-magnitude faster than its counterpart in pre-silicon. This allows the designers to validate their design more exhaustively.
This thesis presents five main contributions to enable a fast and automated debugging solution for reconfigurable hardware. During the research work, we used an obstacle avoidance system for robotic vehicles as a use case to illustrate how to apply the proposed debugging solution in practical environments.
The first contribution presents a debugging system capable of providing a lossless trace of debugging data which permits a cycle-accurate replay. This methodology ensures capturing permanent as well as intermittent errors in the implemented design. The contribution also describes a solution to enhance hardware observability. It is proposed to utilize processor-configurable concentration networks, employ debug data compression to transmit the data more efficiently, and partially reconfiguring the debugging system at run-time to save the time required for design re-compilation as well as preserve the timing closure.
The second contribution presents a solution for communication-centric designs. Furthermore, solutions for designs with multi-clock domains are also discussed.
The third contribution presents a priority-based signal selection methodology to identify the signals which can be more helpful during the debugging process. A connectivity generation tool is also presented which can map the identified signals to the debugging system.
The fourth contribution presents an automated error detection solution which can help in capturing the permanent as well as intermittent errors without continuous monitoring of debugging data. The proposed solution works for designs even in the absence of golden reference.
The fifth contribution proposes to use artificial intelligence for post-silicon debugging. We presented a novel idea of using a recurrent neural network for debugging when a golden reference is present for training the network. Furthermore, the idea was also extended to designs where golden reference is not present
Memory hierarchy and data communication in heterogeneous reconfigurable SoCs
The miniaturization race in the hardware industry aiming at continuous increasing
of transistor density on a die does not bring respective application performance
improvements any more. One of the most promising alternatives is to
exploit a heterogeneous nature of common applications in hardware. Supported by
reconfigurable computation, which has already proved its efficiency in accelerating
data intensive applications, this concept promises a breakthrough in contemporary
technology development.
Memory organization in such heterogeneous reconfigurable architectures becomes
very critical. Two primary aspects introduce a sophisticated trade-off. On
the one hand, a memory subsystem should provide well organized distributed data
structure and guarantee the required data bandwidth. On the other hand, it should
hide the heterogeneous hardware structure from the end-user, in order to support
feasible high-level programmability of the system.
This thesis work explores the heterogeneous reconfigurable hardware architectures
and presents possible solutions to cope the problem of memory organization
and data structure. By the example of the MORPHEUS heterogeneous platform,
the discussion follows the complete design cycle, starting from decision making
and justification, until hardware realization. Particular emphasis is made on the
methods to support high system performance, meet application requirements, and
provide a user-friendly programmer interface.
As a result, the research introduces a complete heterogeneous platform enhanced
with a hierarchical memory organization, which copes with its task by
means of separating computation from communication, providing reconfigurable
engines with computation and configuration data, and unification of heterogeneous
computational devices using local storage buffers. It is distinguished from the
related solutions by distributed data-flow organization, specifically engineered
mechanisms to operate with data on local domains, particular communication infrastructure
based on Network-on-Chip, and thorough methods to prevent computation
and communication stalls. In addition, a novel advanced technique to accelerate
memory access was developed and implemented
CROSS-LAYER DESIGN, OPTIMIZATION AND PROTOTYPING OF NoCs FOR THE NEXT GENERATION OF HOMOGENEOUS MANY-CORE SYSTEMS
This thesis provides a whole set of design methods to enable and manage the
runtime heterogeneity of features-rich industry-ready Tile-Based Networkon-
Chips at different abstraction layers (Architecture Design, Network Assembling,
Testing of NoC, Runtime Operation). The key idea is to maintain
the functionalities of the original layers, and to improve the performance
of architectures by allowing, joint optimization and layer coordinations. In
general purpose systems, we address the microarchitectural challenges by codesigning
and co-optimizing feature-rich architectures. In application-specific
NoCs, we emphasize the event notification, so that the platform is continuously
under control. At the network assembly level, this thesis proposes a
Hold Time Robustness technique, to tackle the hold time issue in synchronous
NoCs. At the network architectural level, the choice of a suitable synchronization
paradigm requires a boost of synthesis flow as well as the coexistence
with the DVFS. On one hand this implies the coexistence of mesochronous
synchronizers in the network with dual-clock FIFOs at network boundaries.
On the other hand, dual-clock FIFOs may be placed across inter-switch links
hence removing the need for mesochronous synchronizers. This thesis will
study the implications of the above approaches both on the design flow and
on the performance and power quality metrics of the network. Once the manycore
system is composed together, the issue of testing it arises. This thesis
takes on this challenge and engineers various testing infrastructures. At the
upper abstraction layer, the thesis addresses the issue of managing the fully
operational system and proposes a congestion management technique named
HACS. Moreover, some of the ideas of this thesis will undergo an FPGA
prototyping. Finally, we provide some features for emerging technology by
characterizing the power consumption of Optical NoC Interfaces
Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC\u2710 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports ; 7551)
ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow\u27s challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability
Platform-based design, test and fast verification flow for mixed-signal systems on chip
This research is providing methodologies to enhance the design phase from architectural space exploration and system study to verification of the whole mixed-signal system. At the beginning of the work, some innovative digital IPs have been designed to develop efficient signal conditioning for sensor systems on-chip that has been included in commercial products. After this phase, the main focus has been addressed to the creation of a re-usable and versatile test of the device after the tape-out which is close to become one of the major cost factor for ICs companies, strongly linking it to model’s test-benches to avoid re-design phases and multi-environment scenarios, producing a very effective approach to a single, fast and reliable multi-level verification environment. All these works generated different publications in scientific literature.
The compound scenario concerning the development of sensor systems is presented in Chapter 1, together with an overview of the related market with a particular focus on the latest MEMS and MOEMS technology devices, and their applications in various segments.
Chapter 2 introduces the state of the art for sensor interfaces: the generic sensor interface concept (based on sharing the same electronics among similar applications achieving cost saving at the expense of area and performance loss) versus the Platform Based Design methodology, which overcomes the drawbacks of the classic solution by keeping the generality at the highest design layers and customizing the platform for a target sensor achieving optimized performances. An evolution of Platform Based Design achieved by implementation into silicon of the ISIF (Intelligent Sensor InterFace) platform is therefore presented. ISIF is a highly configurable mixed-signal chip which allows designers to perform an effective design space exploration and to evaluate directly on silicon the system performances avoiding the critical and time consuming analysis required by standard platform based approach.
In chapter 3 we describe the design of a smart sensor interface for conditioning next generation MOEMS. The adoption of a new, high performance and high integrated technology allow us to integrate not only a versatile platform but also a powerful ARM processor and various IPs providing the possibility to use the platform not only as a conditioning platform but also as a processing unit for the application. In this chapter a description of the various blocks is given, with a particular emphasis on the IP developed in order to grant the highest grade of flexibility with the minimum area occupation.
The architectural space evaluation and the application prototyping with ISIF has enabled an effective, rapid and low risk development of a new high performance platform achieving a flexible sensor system for MEMS and MOEMS monitoring and conditioning. The platform has been design to cover very challenging test-benches, like a laser-based projector device. In this way the platform will not only be able to effectively handle the sensor but also all the system that can be built around it, reducing the needed for further electronics and resulting in an efficient test bench for the algorithm developed to drive the system.
The high costs in ASIC development are mainly related to re-design phases because of missing complete top-level tests. Analog and digital parts design flows are separately verified. Starting from these considerations, in the last chapter a complete test environment for complex mixed-signal chips is presented. A semi-automatic VHDL-AMS flow to provide totally matching top-level is described and then, an evolution for fast self-checking test development for both model and real chip verification is proposed. By the introduction of a Python interface, the designer can easily perform interactive tests to cover all the features verification (e.g. calibration and trimming) into the design phase and check them all with the same environment on the real chip after the tape-out. This strategy has been tested on a consumer 3D-gyro for consumer application, in collaboration with SensorDynamics AG
Mixed-Criticality Systems on Commercial-Off-the-Shelf Multi-Processor Systems-on-Chip
Avionics and space industries are struggling with the adoption of technologies
like multi-processor system-on-chips (MPSoCs) due to strict safety requirements.
This thesis propose a new reference architecture for MPSoC-based mixed-criticality
systems (MCS) - i.e., systems integrating applications with different level of criticality - which are a common use case for aforementioned industries.
This thesis proposes a system architecture capable of granting partitioning -
which is, for short, the property of fault containment. It is based on the detection
of spatial and temporal interference, and has been named the online detection of
interference (ODIn) architecture.
Spatial partitioning requires that an application is not able to corrupt resources
used by a different application. In the architecture proposed in this thesis, spatial
partitioning is implemented using type-1 hypervisors, which allow definition of
resource partitions. An application running in a partition can only access resources
granted to that partition, therefore it cannot corrupt resources used by applications
running in other partitions.
Temporal partitioning requires that an application is not able to unexpectedly
change the execution time of other applications. In the proposed architecture, temporal partitioning has been solved using a bounded interference approach, composed of
an offline analysis phase and an online safety net.
The offline phase is based on a statistical profiling of a metric sensitive to
temporal interference’s, performed in nominal conditions, which allows definition of
a set of three thresholds:
1. the detection threshold TD;
2. the warning threshold TW ;
3. the α threshold.
Two rules of detection are defined using such thresholds:
Alarm rule When the value of the metric is above TD.
Warning rule When the value of the metric is in the warning region [TW ;TD] for
more than α consecutive times.
ODIn’s online safety-net exploits performance counters, available in many MPSoC architectures; such counters are configured at bootstrap to monitor the selected
metric(s), and to raise an interrupt request (IRQ) in case the metric value goes above
TD, implementing the alarm rule. The warning rule is implemented in a software detection module, which reads the value of performance counters when the monitored
task yields control to the scheduler and reset them if there is no detection.
ODIn also uses two additional detection mechanisms:
1. a control flow check technique, based on compile-time defined block signatures, is implemented through a set of watchdog processors, each monitoring
one partition.
2. a timeout is implemented through a system watchdog timer (SWDT), which is
able to send an external signal when the timeout is violated.
The recovery actions implemented in ODIn are:
• graceful degradation, to react to IRQs of WDPs monitoring non-critical applications or to warning rule violations; it temporarily stops non-critical applications
to grant resources to the critical application;
• hard recovery, to react to the SWDT, to the WDP of the critical application, or
to alarm rule violations; it causes a switch to a hot stand-by spare computer.
Experimental validation of ODIn was performed on two hardware platforms: the
ZedBoard - dual-core - and the Inventami board - quad-core.
A space benchmark and an avionic benchmark were implemented on both platforms, composed by different modules as showed in Table 1
Each version of the final application was evaluated through fault injection (FI)
campaigns, performed using a specifically designed FI system. There were three
types of FI campaigns:
1. HW FI, to emulate single event effects;
2. SW FI, to emulate bugs in non-critical applications;
3. artificial bug FI, to emulate a bug in non-critical applications introducing
unexpected interference on the critical application.
Experimental results show that ODIn is resilient to all considered types of faul
A Scalable and Adaptive Network on Chip for Many-Core Architectures
In this work, a scalable network on chip (NoC) for future many-core architectures is proposed and investigated. It supports different QoS mechanisms to ensure predictable communication. Self-optimization is introduced to adapt the energy footprint and the performance of the network to the communication requirements. A fault tolerance concept allows to deal with permanent errors. Moreover, a template-based automated evaluation and design methodology and a synthesis flow for NoCs is introduced
Approximate Computing Strategies for Low-Overhead Fault Tolerance in Safety-Critical Applications
This work studies the reliability of embedded systems with approximate computing on software and hardware designs. It presents approximate computing methods and proposes approximate fault tolerance techniques applied to programmable hardware and embedded software to provide reliability at low computational costs. The objective of this thesis is the development of fault tolerance techniques based on approximate computing and proving that approximate computing can be applied to most safety-critical systems. It starts with an experimental analysis of the reliability of embedded systems used at safety-critical projects. Results show that the reliability of single-core systems, and types of errors they are sensitive to, differ from multicore processing systems. The usage of an operating system and two different parallel programming APIs are also evaluated. Fault injection experiment results show that embedded Linux has a critical impact on the system’s reliability and the types of errors to which it is most sensitive. Traditional fault tolerance techniques and parallel variants of them are evaluated for their fault-masking capability on multicore systems. The work shows that parallel fault tolerance can indeed not only improve execution time but also fault-masking. Lastly, an approximate parallel fault tolerance technique is proposed, where the system abandons faulty execution tasks. This first approximate computing approach to fault tolerance in parallel processing systems was able to improve the reliability and the fault-masking capability of the techniques, significantly reducing errors that would cause system crashes. Inspired by the conflict between the improvements provided by approximate computing and the safety-critical systems requirements, this work presents an analysis of the applicability of approximate computing techniques on critical systems. The proposed techniques are tested under simulation, emulation, and laser fault injection experiments. Results show that approximate computing algorithms do have a particular behavior, different from traditional algorithms. The approximation techniques presented and proposed in this work are also used to develop fault tolerance techniques. Results show that those new approximate fault tolerance techniques are less costly than traditional ones and able to achieve almost the same level of error masking.Este trabalho estuda a confiabilidade de sistemas embarcados com computação aproximada em software e projetos de hardware. Ele apresenta mĂ©todos de computação aproximada e tĂ©cnicas aproximadas para tolerância a falhas em hardware programável e software embarcado que provĂŞem alta confiabilidade a baixos custos computacionais. O objetivo desta tese Ă© o desenvolvimento de tĂ©cnicas de tolerância a falhas baseadas em computação aproximada e provar que este paradigma pode ser usado em sistemas crĂticos. O texto começa com uma análise da confiabilidade de sistemas embarcados usados em sistemas de tolerância crĂtica. Os resultados mostram que a resiliĂŞncia de sistemas singlecore, e os tipos de erros aos quais eles sĂŁo mais sensĂveis, Ă© diferente dos multi-core. O uso de sistemas operacionais tambĂ©m Ă© analisado, assim como duas APIs de programação paralela. Experimentos de injeção de falhas mostram que o uso de Linux embarcado tem um forte impacto na confiabilidade do sistema. TĂ©cnicas tradicionais de tolerância a falhas e variações paralelas das mesmas sĂŁo avaliadas. O trabalho mostra que tĂ©cnicas de tolerância a falhas paralelas podem de fato melhorar nĂŁo apenas o tempo de execução da aplicação, mas tambĂ©m seu mascaramento de erros. Por fim, uma tĂ©cnica de tolerância a falhas paralela aproximada Ă© proposta, onde o sistema abandona instâncias de execuções que apresentam falhas. Esta primeira experiĂŞncia com computação aproximada foi capaz de melhorar a confiabilidade das tĂ©cnicas previamente apresentadas, reduzindo significativamente a ocorrĂŞncia de erros que provocam um crash total do sistema. Inspirado pelo conflito entre as melhorias trazidas pela computação aproximada e os requisitos dos sistemas crĂticos, este trabalho apresenta uma análise da aplicabilidade de computação aproximada nestes sistemas. As tĂ©cnicas propostas sĂŁo testadas sob experimentos de injeção de falhas por simulação, emulação e laser. Os resultados destes experimentos mostram que algoritmos aproximados possuem um comportamento particular que lhes Ă© inerente, diferente dos tradicionais. As tĂ©cnicas de aproximação apresentadas e propostas no trabalho sĂŁo tambĂ©m utilizadas para o desenvolvimento de tĂ©cnicas de tolerância a falhas aproximadas. Estas novas tĂ©cnicas possuem um custo menor que as tradicionais e sĂŁo capazes de atingir o mesmo nĂvel de mascaramento de erros
New Hardware Architecture for Low-Cost Functional Test Systems Applications to HDMI generation
English: Development of a new test hardware architecture for functional test systems. Development of a proof-of-concept prototype for HDMI generation.Castellano: Desarrollo de una nueva arquitectura para equipos de test destinados a máquinas de test funcional de PCBs. Desarrollo de un prototipo de demostración destinado a la generación de HDMI.Català : Desenvolupament d'una nova arquitectura per equips de test destinats a mà quines de test funcional de PCB. Desenvolupament d'un prototip de demostració destinat a generació d'HDM
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s points of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level such circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solution can be proposed to ef-fectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many core systems