
    Integration and validation of embedded flight software on space-qualified multicore architectures

    In recent decades, the importance of software in space missions has grown notably, reflecting the need to integrate advanced on-board functionality. With multicore processors lately being introduced to host critical high-performance applications, the complexity of validating software has risen significantly compared to single-core architectures. While avionics took a big step forward after the publication of the CAST-32A position paper, the ECSS-E-ST-40C software engineering standard used by the European Space Agency (ESA) still provides no validation guidance for multicore processors. Hence, standardising guidelines for developing software on such platforms is expected to become a recurring topic in the industry as it works to match the demands of future space exploration missions.

    Dynamic software randomisation: Lessons learned from an aerospace case study

    Timing Validation and Verification (V&V) is an important step in real-time system design, in which a system's timing behaviour is assessed via Worst-Case Execution Time (WCET) estimation and scheduling analysis. For WCET estimation, measurement-based timing analysis (MBTA) techniques are widely used and well established in industrial environments. However, the advent of complex processors makes it more difficult for the user to provide evidence that the software is tested under stress conditions representative of those at system operation. Measurement-Based Probabilistic Timing Analysis (MBPTA), a variant of MBTA followed by the PROXIMA European Project, facilitates formulating this representativeness argument. MBPTA requires certain properties to be applicable, which can be obtained by selectively injecting randomisation into the platform's timing behaviour by hardware or software means. In this paper, we assess the effectiveness of PROXIMA's dynamic software randomisation (DSR) with a space industrial case study executed on real, unmodified hardware and an industrial operating system. We present the challenges faced in its development to achieve MBPTA compliance, and the lessons learned from that process. Our results, obtained using a commercial timing analysis tool, indicate that DSR does not impact the average performance of the application while it enables the use of MBPTA, resulting in tighter pWCET estimates compared to current industrial practice. The research leading to these results received funding from the European Community's FP7 [FP7/2007-2013] under the PROXIMA Project (www.proxima-project.eu), grant agreement no. 611085. This work was also partially supported by the Spanish Ministry of Science and Innovation under grant TIN2015-65316-P and the HiPEAC Network of Excellence. Jaume Abella was partially supported by the Ministry of Economy and Competitiveness under Ramon y Cajal postdoctoral fellowship RYC-2013-14717.
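    The MBPTA argument above rests on collecting execution-time measurements and deriving a probabilistic WCET bound from them. As a minimal sketch of just the measurement-and-quantile step (our own illustration, not PROXIMA tooling; kernel_under_test is a placeholder for the software under analysis), the C program below times repeated runs and reports a high empirical percentile; real MBPTA would instead fit an extreme value distribution to such samples.

```c
/* Minimal sketch of the measurement step behind MBPTA-style analysis.
 * Times a placeholder kernel over many runs and reports a high empirical
 * quantile of the observed execution times. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define RUNS 1000

static volatile long sink;
static void kernel_under_test(void) {           /* placeholder workload */
    for (long i = 0; i < 100000; i++) sink += i;
}

static long long elapsed_ns(struct timespec a, struct timespec b) {
    return (b.tv_sec - a.tv_sec) * 1000000000LL + (b.tv_nsec - a.tv_nsec);
}

static int cmp_ll(const void *x, const void *y) {
    long long a = *(const long long *)x, b = *(const long long *)y;
    return (a > b) - (a < b);
}

int main(void) {
    static long long sample[RUNS];
    struct timespec t0, t1;

    for (int r = 0; r < RUNS; r++) {
        clock_gettime(CLOCK_MONOTONIC, &t0);
        kernel_under_test();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        sample[r] = elapsed_ns(t0, t1);
    }

    qsort(sample, RUNS, sizeof sample[0], cmp_ll);
    /* Empirical 99.9th percentile as a crude stand-in for a pWCET bound. */
    printf("p99.9 execution time: %lld ns\n", sample[(RUNS * 999) / 1000 - 1]);
    return 0;
}
```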

    New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs

    The relevance of electronics to the safety of everyday devices has grown inexorably, as an ever larger share of product functionality is assigned to electronic components. This comes along with a constant need for higher performance to fulfil those functional requirements while keeping power and cost low. In this scenario, industry is struggling to deliver technology that meets all the performance, power and price specifications, at the cost of increased vulnerability to several types of known faults and the appearance of new ones. To address the new and growing faults in such systems, designers have been applying traditional techniques from safety-critical applications, which in general yield suboptimal results. In fact, modern embedded architectures offer the possibility of optimising dependability properties by enabling the interaction of the hardware, firmware and software levels in the process; however, that potential has not yet been successfully realised. Advances at every level in that direction are much needed if flexible, robust, resilient and cost-effective fault tolerance is to be achieved. The work presented here focuses on the hardware level, with the background consideration of a potential integration into a holistic approach. The efforts in this thesis have focused on several issues: (i) introducing the additional fault models required to adequately represent physical effects emerging in modern manufacturing technologies, (ii) providing tools and methods to efficiently inject both the proposed models and classical ones, (iii) analysing the optimum method for assessing the robustness of systems through extensive fault injection and later correlation with higher-level layers, in an effort to cut development time and cost, (iv) providing new detection methodologies to cope with the challenges captured by the proposed fault models, (v) proposing mitigation strategies aimed at tackling such new threat scenarios, and (vi) devising an automated methodology for deploying many fault tolerance mechanisms in a systematic, robust way. The outcomes of the thesis constitute a suite of tools and methods to help designers of critical systems develop robust, validated, on-time designs tailored to their applications. Espinosa García, J. (2016). New Fault Detection, Mitigation and Injection Strategies for Current and Forthcoming Challenges of HW Embedded Designs [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/73146
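    To make the fault injection idea concrete, here is a minimal software-simulated sketch (our illustration, not a tool from the thesis): a random single-bit flip models a single event upset in a memory image, and a simple XOR checksum plays the role of the detector.

```c
/* Illustrative sketch: software-simulated fault injection. A random
 * single-bit flip models a single event upset (SEU) in a memory image;
 * an XOR checksum over the image serves as the detection mechanism. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>

#define MEM_WORDS 256

static uint32_t xor_checksum(const uint32_t *mem, size_t n) {
    uint32_t c = 0;
    for (size_t i = 0; i < n; i++) c ^= mem[i];
    return c;
}

/* Inject an SEU: flip one randomly chosen bit in the image. */
static void inject_seu(uint32_t *mem, size_t n) {
    size_t word = (size_t)rand() % n;
    int bit = rand() % 32;
    mem[word] ^= (uint32_t)1 << bit;
}

int main(void) {
    uint32_t mem[MEM_WORDS];
    srand((unsigned)time(NULL));
    for (size_t i = 0; i < MEM_WORDS; i++) mem[i] = (uint32_t)i * 2654435761u;

    uint32_t golden = xor_checksum(mem, MEM_WORDS);
    inject_seu(mem, MEM_WORDS);

    if (xor_checksum(mem, MEM_WORDS) != golden)
        puts("fault detected: checksum mismatch");
    else  /* unreachable: one bit flip always changes the XOR checksum */
        puts("fault escaped detection");
    return 0;
}
```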

    Validation of Abstract Side-Channel Models for Computer Architectures

    Observational models make the analysis of information flow properties tractable by providing an abstraction of side channels. We introduce a methodology and a tool, Scam-V, to validate observational models for modern computer architectures. We combine symbolic execution, relational analysis, and different program generation techniques to generate experiments and validate the models. An experiment consists of a randomly generated program together with two inputs that are observationally equivalent according to the model under test. Validation is done by checking the indistinguishability of the two inputs on real hardware, by executing the program and analyzing the side channel. We have evaluated our framework by validating models that abstract the data-cache side channel of a Raspberry Pi 3 board with a processor implementing the ARMv8-A architecture. Our results show that Scam-V can identify bugs in the implementation of the models and generate test programs which invalidate the models due to hidden microarchitectural behavior.
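    The validation step described above boils down to running a generated program with both inputs on real hardware and checking whether the side channel distinguishes them. The sketch below is a much-simplified stand-in (our own, not the Scam-V toolchain): a hypothetical victim whose memory-access stride, and hence data-cache behavior, depends on its input, plus a harness that compares mean execution times for two inputs a naive observational model might deem equivalent.

```c
/* Illustrative experiment harness: time the victim on real hardware for
 * two inputs and compare mean latencies. A clear gap between the means
 * is evidence against a model that calls the inputs indistinguishable. */
#include <stdio.h>
#include <time.h>

#define RUNS 10000
#define LINES 4096

static volatile unsigned char table[LINES * 64];

/* Hypothetical victim: its access stride (1 byte vs. one cache line)
 * depends on the input, so its data-cache behavior leaks the input. */
static unsigned victim(unsigned input) {
    unsigned acc = 0;
    for (unsigned i = 0; i < LINES; i++)
        acc += table[(i * (input ? 64u : 1u)) % (LINES * 64)];
    return acc;
}

static double mean_time_ns(unsigned input) {
    struct timespec a, b;
    double total = 0.0;
    for (int r = 0; r < RUNS; r++) {
        clock_gettime(CLOCK_MONOTONIC, &a);
        victim(input);
        clock_gettime(CLOCK_MONOTONIC, &b);
        total += (b.tv_sec - a.tv_sec) * 1e9 + (b.tv_nsec - a.tv_nsec);
    }
    return total / RUNS;
}

int main(void) {
    double t0 = mean_time_ns(0), t1 = mean_time_ns(1);
    printf("input A: %.0f ns, input B: %.0f ns\n", t0, t1);
    if (t1 > 1.05 * t0 || t0 > 1.05 * t1)
        puts("timing difference: observational model likely invalidated");
    else
        puts("inputs indistinguishable within measurement noise");
    return 0;
}
```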

    Memory management unit for hardware-assisted dynamic relocation in on-board satellite systems

    Satellite on-board systems spend their lives in hostile environments where radiation can cause critical hardware failures. One of the most radiation-sensitive elements is memory. So-called single event effects (SEEs) can corrupt or even irretrievably damage the cells that store data and program instructions. When one of these cells is corrupted, the program must not use it again during execution. To avoid rebuilding and re-uploading the code, a memory management unit can be used to transparently relocate the program to an error-free memory region. This article presents the design and implementation of a memory management unit that allows the dynamic relocation of on-board software. The unit provides a hardware mechanism for the automatic relocation of sections of code or data at run time, requiring software intervention only for initialization and configuration. It has been implemented on the LEON architecture, a reference for European Space Agency (ESA) missions. The proposed solution has been validated using as a baseline the boot and application software (ASW) of the instrument control unit of the Energetic Particle Detector of the Solar Orbiter mission. Processor synthesis on different FPGAs has shown resource usage and power consumption similar to those of a conventional memory management unit: the results vary within ±1–15% in resource usage and ±1–7% in power consumption, depending on the number of inputs assigned to the unit and the FPGA used. In terms of performance, the proposed and conventional memory management units show identical results.
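    A minimal software model of the relocation idea (our illustration; the article describes a hardware unit, and the table layout here is assumed): a table maps the logical base of each program section to the physical region backing it, and a faulty region is remapped to a spare region without rebuilding or re-uploading the image.

```c
/* Illustrative relocation table: logical section addresses translate to
 * physical regions, and a region hit by an SEE is remapped to a spare. */
#include <stdio.h>
#include <stdint.h>

#define NSECTIONS 2

struct reloc_entry {
    uint32_t logical_base;   /* address the program was linked at */
    uint32_t physical_base;  /* region actually backing the section */
    uint32_t size;
};

static struct reloc_entry table[NSECTIONS] = {
    {0x40000000, 0x60000000, 0x10000},  /* .text */
    {0x40010000, 0x60010000, 0x08000},  /* .data */
};

static uint32_t translate(uint32_t logical) {
    for (int i = 0; i < NSECTIONS; i++) {
        const struct reloc_entry *e = &table[i];
        if (logical >= e->logical_base && logical < e->logical_base + e->size)
            return e->physical_base + (logical - e->logical_base);
    }
    return logical;  /* unmapped addresses pass through unchanged */
}

/* On a detected SEE in a section's region, point it at a spare region. */
static void relocate_section(int idx, uint32_t spare_base) {
    table[idx].physical_base = spare_base;
}

int main(void) {
    printf("0x40000100 -> 0x%08x\n", (unsigned)translate(0x40000100));
    relocate_section(0, 0x70000000);  /* .text region went bad */
    printf("0x40000100 -> 0x%08x\n", (unsigned)translate(0x40000100));
    return 0;
}
```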

    Improving time predictability of shared hardware resources in real-time multicore systems: emphasis on the space domain

    Critical Real-Time Embedded Systems (CRTES) follow a verification and validation process for timing and functional correctness. This process includes timing analysis, which provides Worst-Case Execution Time (WCET) estimates as evidence that the execution time of the system, or parts of it, remains within the deadlines. A key design principle for CRTES is incremental qualification, whereby each software component can be subject to verification and validation independently of any other component, with obvious benefits for cost. At the timing level, this requires time composability, such that the timing behavior of a function is not affected by other functions. CRTES are experiencing unprecedented growth, with rising performance demands that have motivated the use of multicore architectures. Multicores can provide the required performance and bring the potential of integrating several software functions onto the same hardware. However, multicore contention in the access to shared hardware resources makes the execution time of a task depend on the other tasks running simultaneously. This dependence threatens time predictability and jeopardizes time composability. In this thesis we analyze and propose hardware solutions, applicable to current multicore designs for CRTES, that improve time predictability and time composability, focusing on the on-chip bus and the memory controller. At the hardware level, we propose new bus and memory controller designs that control and mitigate contention between cores and provide time composability by design, also in the context of mixed-criticality systems. At the analysis level, we propose contention prediction models that factor in the impact of contenders and need no hardware modifications. We also propose a set of Performance Monitoring Counters (PMCs) that provide evidence about the contention. We place special emphasis on the space domain, focusing on the Cobham Gaisler NGMP multicore processor, which is currently being assessed by the European Space Agency for its future missions.
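    To illustrate the flavor of a contention prediction model that needs no hardware changes, here is a deliberately simple sketch (our own, not the thesis' model): it bounds a task's execution time by adding, for each shared resource, the task's request count (as a PMC would report it) multiplied by the worst per-request delay one contender can inflict and by the number of contenders.

```c
/* Illustrative contention bound from PMC-style request counts: each of
 * the task's shared-resource requests may be delayed by at most one
 * worst-case request from every contending core. */
#include <stdio.h>

struct resource {
    const char *name;
    unsigned long requests;       /* task's requests, read from PMCs  */
    unsigned long worst_delay_cy; /* max delay one contender inflicts */
};

static unsigned long contention_bound_cycles(unsigned long solo_cycles,
                                             const struct resource *res,
                                             int nres, int contenders) {
    unsigned long bound = solo_cycles;
    for (int i = 0; i < nres; i++)
        bound += res[i].requests * (unsigned long)contenders
                                 * res[i].worst_delay_cy;
    return bound;
}

int main(void) {
    /* Hypothetical counts and latencies, in processor cycles. */
    struct resource res[] = {
        {"on-chip bus",       120000, 4},
        {"memory controller",  30000, 28},
    };
    unsigned long solo = 2000000;  /* cycles measured in isolation */
    printf("bound with 3 contenders: %lu cycles\n",
           contention_bound_cycles(solo, res, 2, 3));
    return 0;
}
```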

    Software support to strengthen measurement-based timing analysis

    In critical domains, the advent of high-performance (complex) hardware, used to deliver rising levels of guaranteed performance, complicates providing evidence of reliable software execution, especially on aspects related to the timing dimension. Caches and multicores are two hardware features with the potential to significantly reduce worst-case execution time (WCET) estimates, yet they pose new challenges for current-practice measurement-based timing analysis (MBTA) approaches. MBTA methods rely on some form of instrumentation, at either the hardware or the software level, of the target program or fragments thereof to collect execution-time measurement data. Instrumentation affects the timing and functional behavior of a program, resulting in the so-called probe effect: leaving the instrumentation code in the final executable can degrade average performance and may not even be admissible under stringent industrial qualification and certification standards; removing it before operation jeopardizes the results of timing analysis, since WCET estimates obtained on the instrumented version of the program are no longer valid, owing, for example, to the timing effects of different cache alignments. Measurement-Based Probabilistic Timing Analysis (MBPTA) is a variant of MBTA that aims to increase confidence in WCET estimates. MBPTA relieves the user from controlling hardware sources of jitter by implicitly controlling the impact of jittery resources on the measurements captured at analysis time: some hardware resources are randomized so that their execution times at analysis vary according to a probabilistic execution time distribution, which can be used to upper-bound the latencies during operation and give a WCET prediction with a certain probability. In this Master's thesis we present our approach to mitigating the impact of instrumentation code on cache behavior by reducing the instrumentation overhead while preserving and consolidating the results of timing analysis. We further propose a technique for multilevel-cache multicores that combines deterministic and probabilistic jitter-bounding approaches to reliably handle the execution-time variability generated by caches and by contention in accessing shared hardware resources.
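    As a minimal illustration of instrumentation designed to keep the probe effect small (our sketch, not the thesis' mechanism), the probe below performs a single counter read and a single store into a preallocated buffer on the measured path, deferring all formatting and I/O until after the run.

```c
/* Illustrative low-overhead software instrumentation: timestamps go into
 * a preallocated trace buffer; no I/O happens on the measured path. */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define TRACE_CAP 4096

static uint64_t trace_buf[TRACE_CAP];
static unsigned trace_len;

/* The probe kept on the measured path: one counter read, one store. */
static inline void probe(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    if (trace_len < TRACE_CAP)
        trace_buf[trace_len++] =
            (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

static void region_of_interest(void) {  /* placeholder workload */
    volatile long acc = 0;
    for (long i = 0; i < 1000000; i++) acc += i;
}

int main(void) {
    probe();               /* region entry */
    region_of_interest();
    probe();               /* region exit  */

    /* All formatting happens off the measured path, after the run. */
    printf("region took %llu ns\n",
           (unsigned long long)(trace_buf[1] - trace_buf[0]));
    return 0;
}
```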

    Reliable Software for Unreliable Hardware - A Cross-Layer Approach

    This thesis proposes a novel cross-layer reliability analysis, modeling, and optimization approach that leverages multiple layers of the system design abstraction (i.e. hardware, compiler, system software, and application program) to exploit the reliability-enhancing potential available at each system layer and to exchange this information across layers.

    Multicore adaptation of a robust partitioning kernel to the PowerPC architecture

    The increasingly common use of the Integrated Modular Avionics (IMA) architecture has reduced the weight, size and footprint of avionics systems by consolidating the execution of several software functions on a single processor. Alongside the accelerated adoption of the IMA architecture, multicore microprocessors have gained popularity owing to the stagnation of single-core clock frequencies. Multicore processors promise to further increase the integration of software functions, but they are still not accepted in avionics because of complexity issues that affect safety. The key technology enabling the IMA architecture is the robust partitioning kernel. A robust partitioning kernel isolates the independent applications of an IMA system to prevent fault propagation. This isolation is achieved through robust spatial and temporal partitioning. In this field, no operating system currently supports research on the safety assessment of multicore processors. In this thesis, we propose to adapt an existing robust partitioning kernel so that it supports deploying partitions in parallel on several cores of a multicore processor. We analysed the architecture of an existing kernel, named XtratuM, and adapted it to a multicore robust partitioning model. We then implemented this adaptation on a multicore PowerPC processor of the Freescale MPC8641 family. The resulting prototype is named XtratuM-PPC. Finally, we present a case study of the use of XtratuM-PPC with a multicore execution plan. During the adaptation and implementation phases, we identified a set of technical problems affecting the safety of robust partitioning kernels on multicore processors. These problems highlight the implementation complexity of this type of software system. Our work allows us to conclude that adapting an existing single-core robust partitioning kernel to a multicore architecture is feasible. However, the safety problems that arise with multicore processors remain unsolved. Our multicore robust partitioning kernel prototype is a starting point for resolving these problems in a real environment.
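    To make robust temporal partitioning concrete, here is a minimal sketch (our illustration, not XtratuM-PPC code) of an ARINC 653-style cyclic plan extended to two cores: a fixed major frame is divided into time windows, each granting one partition exclusive use of one core. The window table and partition names are assumptions for illustration.

```c
/* Illustrative cyclic schedule behind robust temporal partitioning:
 * a fixed major frame of per-core time windows, one partition each. */
#include <stdio.h>

struct window {
    unsigned start_ms, len_ms;  /* offset and length in the major frame */
    int core;                   /* core the partition runs on           */
    const char *partition;
};

/* Hypothetical two-core plan with a 100 ms major frame. */
static const struct window plan[] = {
    { 0, 50, 0, "flight_control"},
    {50, 50, 0, "display"},
    { 0, 30, 1, "io_server"},
    {30, 70, 1, "maintenance"},
};
#define NWIN (sizeof plan / sizeof plan[0])
#define MAJOR_FRAME_MS 100

/* Which partition owns a given core at time t (ms since boot)? */
static const char *scheduled(unsigned t_ms, int core) {
    unsigned off = t_ms % MAJOR_FRAME_MS;
    for (unsigned i = 0; i < NWIN; i++)
        if (plan[i].core == core &&
            off >= plan[i].start_ms && off < plan[i].start_ms + plan[i].len_ms)
            return plan[i].partition;
    return "idle";
}

int main(void) {
    for (unsigned t = 0; t < MAJOR_FRAME_MS; t += 25)
        printf("t=%3u ms: core0=%-14s core1=%s\n",
               t, scheduled(t, 0), scheduled(t, 1));
    return 0;
}
```

    Because the table is fixed offline, each partition's windows are independent of what the other partitions do, which is the temporal-isolation property the kernel must enforce on each core.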