1,792 research outputs found

    Novel Validation Techniques for Autonomous Vehicles

    Get PDF
    The automotive industry is facing challenges in producing electrical, connected, and autonomous vehicles. Even if these challenges are, from a technical point of view, independent from each other, the market and regulatory bodies require them to be developed and integrated simultaneously. The development of autonomous vehicles implies the development of highly dependable systems. This is a multidisciplinary activity involving knowledge from robotics, computer science, electrical and mechanical engineering, psychology, social studies, and ethics. Nowadays, many Advanced Driver Assistance Systems (ADAS), like Emergency Braking System, Lane Keep Assistant, and Park Assist, are available. Newer luxury cars can drive by themselves on highways or park automatically, but the end goal is to develop completely autonomous driving vehicles, able to go by themselves, without needing human interventions in any situation. The more vehicles become autonomous, the greater the difficulty in keeping them reliable. It enhances the challenges in terms of development processes since their misbehaviors can lead to catastrophic consequences and, differently from the past, there is no more a human driver to mitigate the effects of erroneous behaviors. Primary threats to dependability come from three sources: misuse from the drivers, design systematic errors, and random hardware failures. These safety threats are addressed under various aspects, considering the particular type of item to be designed. In particular, for the sake of this work, we analyze those related to Functional Safety (FuSa), viewed as the ability of a system to react on time and in the proper way to the external environment. From the technological point of view, these behaviors are implemented by electrical and electronic items. Various standards to achieve FuSa have been released over the years. The first, released in 1998, was the IEC 61508. Its last version is the one released in 2010. This standard defines mainly: • a Functional Safety Management System (FSMS); • methods to determine a Safety Integrated Level (SIL); • methods to determine the probability of failures. To adapt the IEC61508 to the automotive industry’s peculiarity, a newer standard, the ISO26262, was released in 2011 then updated in 2018. This standard provides guidelines about FSMS, called in this case Safety Lifecycle, describing how to develop software and hardware components suitable for functional safety. It also provides a different way to compute the SIL, called in this case Automotive SIL (ASIL), allowing us to consider the average driver’s abilities to control the vehicle in case of failures. Moreover, it describes a way to determine the probability of random hardware failures through Failure Mode, Effects, and Diagnostic Analysis (FMEDA). This dissertation contains contributions to three topics: • random hardware failures mitigation; • improvementoftheISO26262HazardAnalysisandRiskAssessment(HARA); • real-time verification of the embedded software. As the main contribution of this dissertation, I address the safety threats due to random hardware failures (RHFs). For this purpose, I propose a novel simulation-based approach to aid the Failure Mode, Effects, and Diagnostic Analysis (FMEDA) required by the ISO26262 standard. Thanks to a SPICE-level model of the item, and the adoption of fault injection techniques, it is possible to simulate its behaviors obtaining useful information to classify the various failure modes. The proposed approach evolved from a mere simulation of the item, allowing only an item-level failure mode classification up to a vehicle-level analysis. The propagation of the failure modes’ effects on the whole vehicle enables us to assess the impacts on the vehicle’s drivability, improving the quality of the classifications. It can be advantageous where it is difficult to predict how the item-level misbehaviors propagate to the vehicle level, as in the case of a virtual differential gear or the mobility system of a robot. It has been chosen since it can be considered similar to the novel light vehicles, such as electric scooters, that are becoming more and more popular. Moreover, my research group has complete access to its design since it is realized by our university’s DIANA students’ team. When a SPICE-level simulation is too long to be performed, or it is not possible to develop a complete model of the item due to intellectual property protection rules, it is possible to aid this process through behavioral models of the item. A simulation of this kind has been performed on a mobile robotic system. Behavioral models of the electronic components were used, alongside mechanical simulations, to assess the software failure mitigation capabilities. Another contribution has been obtained by modifying the main one. The idea was to make it possible to aid also the Hazard Analysis and Risk Assessment (HARA). This assessment is performed during the concept phase, so before starting to design the item implementation. Its goal is to determine the hazards involved in the item functionality and their associated levels of risk. The end goal of this phase is a list of safety goals. For each one of these safety goals, an ASIL has to be determined. Since HARA relies only on designers expertise and knowledge, it lacks in objectivity and repeatability. Thanks to the simulation results, it is possible to predict the effects of the failures on the vehicle’s drivability, allowing us to improve the severity and controllability assessment, thus improving the objectivity. Moreover, since simulation conditions can be stored, it is possible, at any time, to recheck the results and to add new scenarios, improving the repeatability. The third group of contributions is about the real-time verification of embedded software. Through Hardware-In-the-Loop (HIL), a software integration verification has been performed to test a fundamental automotive component, mixed-criticality applications, and multi-agent robots. The first of these contributions is about real-time tests on Body Control Modules (BCM). These modules manage various electronic accessories in the vehicle’s body, like power windows and mirrors, air conditioning, immobilizer, central locking. The main characteristics of BCMs are the communications with other embedded computers via the car’s vehicle bus (Controller Area Network) and to have a high number (hundreds) of low-speed I/Os. As the second contribution, I propose a methodology to assess the error recovery system’s effects on mixed-criticality applications regarding deadline misses. The system runs two tasks: a critical airplane longitudinal control and a non-critical image compression algorithm. I start by presenting the approach on a benchmark application containing an instrumented bug into the lower criticality task; then, we improved it by injecting random errors inside the lower criticality task’s memory space through a debugger. In the latter case, thanks to the HIL, it is possible to pause the time domain simulation when the debugger operates and resume it once the injection is complete. In this way, it is possible to interact with the target without interfering with the simulation results, combining a full control of the target with an accurate time-domain assessment. The last contribution of this third group is about a methodology to verify, on multi-agent robots, the synchronization between two agents in charge to move the end effector of a delta robot: the correct position and speed of the end effector at any time is strongly affected by a loss of synchronization. The last two contributions may seem unrelated to the automotive industry, but interest in these applications is gaining. Mixed-criticality systems allow reducing the number of ECUs inside cars (for cost reduction), while the multi-agent approach is helpful to improve the cooperation of the connected cars with respect to other vehicles and the infrastructure. The fourth contribution, contained in the appendix, is about a machine learning application to improve the social acceptance of autonomous vehicles. The idea is to improve the comfort of the passengers by recognizing their emotions. I started with the idea to modify the vehicle’s driving style based on a real-time emotions recognition system but, due to the difficulties of performing such operations in an experimental setup, I move to analyze them offline. The emotions are determined on volunteers’ facial expressions recorded while viewing 3D representa- tions showing different calibrations. Thanks to the passengers’ emotional responses, it is possible to choose the better calibration from the comfort point of view

    Novel Validation Techniques for Autonomous Vehicles

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    How to stretch system reliability exploiting mission constraints: A practical roadmap for industries

    Get PDF
    Reliability analysis can be committed to companies by customers willing to verify whether their products comply with the major international standards or simply to verify the design prior of market deployment. Nevertheless, these analyses may be required at the very preliminary stages of design or when the design is already in progress due to low organizational capabilities or simple delay in the project implementation process. The results sometime maybe be far from the market or customer target with a subsequent need to redesign the whole asset. Of course, not all the cases fall in the worst scenario and maybe with some additional consideration on mission definition it is possible to comply with the proposed reliability targets. In this paper the author will provide an overview on the approach which could be followed to achieve the reliability target even when the project is still on-going providing a practical case study

    A safety assessment framework for Automatic Dependent Surveillance Broadcast (ADS-B) and its potential impact on aviation safety

    Get PDF
    The limitations of the current civil aviation surveillance systems include a lack of coverage in some areas and low performance in terms of accuracy, integrity, continuity and availability particularly in high density traffic areas including airports, with a negative impact on capacity and safety. Automatic Dependent Surveillance Broadcast (ADS-B) technology has been proposed to address these limitations by enabling improved situational awareness for all stakeholders and enhanced airborne and ground surveillance, resulting in increased safety and capacity. In particular, its scalability and adaptability should facilitate its use in general aviation and in ground vehicles. This should, in principle, provide affordable, effective surveillance of all air and ground traffic, even on airport taxiways and runways, and in airspace where radar is ineffective or unavailable. The success of the progressive implementation of ADS-B has led to numerous programmes for its introduction in other parts of the World where the operational environment is considerably different from that of Australia. However, a number of critical issues must be addressed in order to benefit from ADS-B, including the development and execution of a safety case that addresses both its introduction into legacy and new systems’ operational concepts, the latter including the Single European Sky (SES) / Single European Sky ATM Research (SESAR) and the US’ Next Generation Air Transportation System (NexGEN). This requires amongst others, a good understanding of the limitations of existing surveillance systems, ADS-B architecture and system failures and its interfaces to the existing and future ATM systems. Research on ADS-B to date has not addressed in detail the important questions of limitations of existing systems and ADS-B failure modes including their characterisation, modelling and assessment of impact. The latter is particularly important due to the sole dependency of ADS-B on GNSS for information on aircraft state and its reliance on communication technologies such as Mode-S Extended Squitter, VHF Data Link Mode-4 (VDLM4) or Universal Access Transceiver (UAT), to broadcast the surveillance information to ground-based air traffic control (ATC) and other ADS-B equipped aircraft within a specified range, all of which increase complexity and the potential for failures. This thesis proposes a novel framework for the assessment of the ADS-B system performance to meet the level of safety required for ground and airborne surveillance operations. The framework integrates various methods for ADS-B performance assessment in terms of accuracy, integrity, continuity, availability and latency, and reliability assessment using probabilistic safety assessment methods; customized failure mode identification approach and fault tree analysis. Based on the framework, the thesis develops a failure mode register for ADS-B, identifies and quantifies the impact of a number of potential hazards for the ADS-B. Furthermore, this thesis identifies various anomalies in the onboard GNSS system that feeds aircraft navigation information into the ADS-B system. Finally, the thesis maps the ADS-B data availability and the quantified system performance to the envisioned airborne surveillance application’s requirements. The mapping exercise indicates that, the quantified ADS-B accuracy is sufficient for all applications while ADS-B integrity is insufficient to support the most stringent application: Airborne Separation (ASEP). In addition, some of the required performance parameters are unavailable from aircraft certified to DO-260 standard. Therefore, all aircraft must be certified to DO-260B standard to support the applications and perform continuous monitoring, to ensure consistency in the system performance of each aircraft.Open Acces

    Real-time trace decoding and monitoring for safety and security in embedded systems

    Get PDF
    Integrated circuits and systems can be found almost everywhere in today’s world. As their use increases, they need to be made safer and more perfor mant to meet current demands in processing power. FPGA integrated SoCs can provide the ideal trade-off between performance, adaptability, and energy usage. One of today’s vital challenges lies in updating existing fault tolerance techniques for these new systems while utilizing all available processing capa bilities, such as multi-core and heterogeneous processing units. Control-flow monitoring is one of the primary mechanisms described for error detection at the software architectural level for the highest grade of hazard level clas sifications (e.g., ASIL D) described in industry safety standards ISO-26262. Control-flow errors are also known to compose the majority of detected errors for ICs and embedded systems in safety-critical and risk-susceptible environ ments [5]. Software-based monitoring methods remain the most popular [6–8]. However, recent studies show that the overheads they impose make actual reliability gains negligible [9, 10]. This work proposes and demonstrates a new control flow checking method implemented in FPGA for multi-core embedded systems called control-flow trace checker (CFTC). CFTC uses existing trace and debug subsystems of modern processors to rebuild their execution states. It can iden tify any errors in real-time by comparing executed states to a set of permitted state transitions determined statically. This novel implementation weighs hardware resource trade-offs to target mul tiple independent tasks in multi-core embedded applications, as well as single core systems. The proposed system is entirely implemented in hardware and isolated from all monitored software components, requiring 2.4% of the target FPGA platform resources to protect an execution unit in its entirety. There fore, it avoids undesired overheads and maintains deterministic error detection latencies, which guarantees reliability improvements without impairing the target software system. Finally, CFTC is evaluated under different software i Resumo fault-injection scenarios, achieving detection rates of 100% of all control-flow errors to wrong destinations and 98% of all injected faults to program binaries. All detection times are further analyzed and precisely described by a model based on the monitor’s resources and speed and the software application’s control-flow structure and binary characteristics.Circuitos integrados estão presentes em quase todos sistemas complexos do mundo moderno. Conforme sua frequência de uso aumenta, eles precisam se tornar mais seguros e performantes para conseguir atender as novas demandas em potência de processamento. Sistemas em Chip integrados com FPGAs conseguem prover o balanço perfeito entre desempenho, adaptabilidade, e uso de energia. Um dos maiores desafios agora é a necessidade de atualizar técnicas de tolerância à falhas para estes novos sistemas, aproveitando os novos avanços em capacidade de processamento. Monitoramento de fluxo de controle é um dos principais mecanismos para a detecção de erros em nível de software para sistemas classificados como de alto risco (e.g. ASIL D), descrito em padrões de segurança como o ISO-26262. Estes erros são conhecidos por compor a maioria dos erros detectados em sistemas integrados [5]. Embora métodos de monitoramento baseados em software continuem sendo os mais populares [6–8], estudos recentes mostram que seus custos adicionais, em termos de performance e área, diminuem consideravelmente seus ganhos reais em confiabilidade [9, 10]. Propomos aqui um novo método de monitora mento de fluxo de controle implementado em FPGA para sistemas embarcados multi-core. Este método usa subsistemas de trace e execução de código para reconstruir o estado atual do processador, identificando erros através de com parações entre diferentes estados de execução da CPU. Propomos uma implementação que considera trade-offs no uso de recuros de sistema para monitorar múltiplas tarefas independetes. Nossa abordagem suporta o monitoramento de sistemas simples e também de sistemas multi-core multitarefa. Por fim, nossa técnica é totalmente implementada em hardware, evitando o uso de unidades de processamento de software que possa adicionar custos indesejáveis à aplicação em perda de confiabilidade. Propomos, assim, um mecanismo de verificação de fluxo de controle, escalável e extensível, para proteção de sistemas embarcados críticos e multi-core

    Real-Time Trace Decoding and Monitoring for Safety and Security in Embedded Systems

    Get PDF
    Integrated circuits and systems can be found almost everywhere in today’s world. As their use increases, they need to be made safer and more perfor mant to meet current demands in processing power. FPGA integrated SoCs can provide the ideal trade-off between performance, adaptability, and energy usage. One of today’s vital challenges lies in updating existing fault tolerance techniques for these new systems while utilizing all available processing capa bilities, such as multi-core and heterogeneous processing units. Control-flow monitoring is one of the primary mechanisms described for error detection at the software architectural level for the highest grade of hazard level clas sifications (e.g., ASIL D) described in industry safety standards ISO-26262. Control-flow errors are also known to compose the majority of detected errors for ICs and embedded systems in safety-critical and risk-susceptible environ ments [5]. Software-based monitoring methods remain the most popular [6–8]. However, recent studies show that the overheads they impose make actual reliability gains negligible [9, 10]. This work proposes and demonstrates a new control flow checking method implemented in FPGA for multi-core embedded systems called control-flow trace checker (CFTC). CFTC uses existing trace and debug subsystems of modern processors to rebuild their execution states. It can iden tify any errors in real-time by comparing executed states to a set of permitted state transitions determined statically. This novel implementation weighs hardware resource trade-offs to target mul tiple independent tasks in multi-core embedded applications, as well as single core systems. The proposed system is entirely implemented in hardware and isolated from all monitored software components, requiring 2.4% of the target FPGA platform resources to protect an execution unit in its entirety. There fore, it avoids undesired overheads and maintains deterministic error detection latencies, which guarantees reliability improvements without impairing the target software system. Finally, CFTC is evaluated under different software i Resumo fault-injection scenarios, achieving detection rates of 100% of all control-flow errors to wrong destinations and 98% of all injected faults to program binaries. All detection times are further analyzed and precisely described by a model based on the monitor’s resources and speed and the software application’s control-flow structure and binary characteristics.Circuitos integrados estão presentes em quase todos sistemas complexos do mundo moderno. Conforme sua frequência de uso aumenta, eles precisam se tornar mais seguros e performantes para conseguir atender as novas demandas em potência de processamento. Sistemas em Chip integrados com FPGAs conseguem prover o balanço perfeito entre desempenho, adaptabilidade, e uso de energia. Um dos maiores desafios agora é a necessidade de atualizar técnicas de tolerância à falhas para estes novos sistemas, aproveitando os novos avanços em capacidade de processamento. Monitoramento de fluxo de controle é um dos principais mecanismos para a detecção de erros em nível de software para sistemas classificados como de alto risco (e.g. ASIL D), descrito em padrões de segurança como o ISO-26262. Estes erros são conhecidos por compor a maioria dos erros detectados em sistemas integrados [5]. Embora métodos de monitoramento baseados em software continuem sendo os mais populares [6–8], estudos recentes mostram que seus custos adicionais, em termos de performance e área, diminuem consideravelmente seus ganhos reais em confiabilidade [9, 10]. Propomos aqui um novo método de monitora mento de fluxo de controle implementado em FPGA para sistemas embarcados multi-core. Este método usa subsistemas de trace e execução de código para reconstruir o estado atual do processador, identificando erros através de com parações entre diferentes estados de execução da CPU. Propomos uma implementação que considera trade-offs no uso de recuros de sistema para monitorar múltiplas tarefas independetes. Nossa abordagem suporta o monitoramento de sistemas simples e também de sistemas multi-core multitarefa. Por fim, nossa técnica é totalmente implementada em hardware, evitando o uso de unidades de processamento de software que possa adicionar custos indesejáveis à aplicação em perda de confiabilidade. Propomos, assim, um mecanismo de verificação de fluxo de controle, escalável e extensível, para proteção de sistemas embarcados críticos e multi-core

    Development and certification of mixed-criticality embedded systems based on probabilistic timing analysis

    Get PDF
    An increasing variety of emerging systems relentlessly replaces or augments the functionality of mechanical subsystems with embedded electronics. For quantity, complexity, and use, the safety of such subsystems is an increasingly important matter. Accordingly, those systems are subject to safety certification to demonstrate system's safety by rigorous development processes and hardware/software constraints. The massive augment in embedded processors' complexity renders the arduous certification task significantly harder to achieve. The focus of this thesis is to address the certification challenges in multicore architectures: despite their potential to integrate several applications on a single platform, their inherent complexity imperils their timing predictability and certification. Recently, the Measurement-Based Probabilistic Timing Analysis (MBPTA) technique emerged as an alternative to deal with hardware/software complexity. The innovation that MBPTA brings about is, however, a major step from current certification procedures and standards. The particular contributions of this Thesis include: (i) the definition of certification arguments for mixed-criticality integration upon multicore processors. In particular we propose a set of safety mechanisms and procedures as required to comply with functional safety standards. For timing predictability, (ii) we present a quantitative approach to assess the likelihood of execution-time exceedance events with respect to the risk reduction requirements on safety standards. To this end, we build upon the MBPTA approach and we present the design of a safety-related source of randomization (SoR), that plays a key role in the platform-level randomization needed by MBPTA. And (iii) we evaluate current certification guidance with respect to emerging high performance design trends like caches. Overall, this Thesis pushes the certification limits in the use of multicore and MBPTA technology in Critical Real-Time Embedded Systems (CRTES) and paves the way towards their adoption in industry.Una creciente variedad de sistemas emergentes reemplazan o aumentan la funcionalidad de subsistemas mecánicos con componentes electrónicos embebidos. El aumento en la cantidad y complejidad de dichos subsistemas electrónicos así como su cometido, hacen de su seguridad una cuestión de creciente importancia. Tanto es así que la comercialización de estos sistemas críticos está sujeta a rigurosos procesos de certificación donde se garantiza la seguridad del sistema mediante estrictas restricciones en el proceso de desarrollo y diseño de su hardware y software. Esta tesis trata de abordar los nuevos retos y dificultades dadas por la introducción de procesadores multi-núcleo en dichos sistemas críticos: aunque su mayor rendimiento despierta el interés de la industria para integrar múltiples aplicaciones en una sola plataforma, suponen una mayor complejidad. Su arquitectura desafía su análisis temporal mediante los métodos tradicionales y, asimismo, su certificación es cada vez más compleja y costosa. Con el fin de lidiar con estas limitaciones, recientemente se ha desarrollado una novedosa técnica de análisis temporal probabilístico basado en medidas (MBPTA). La innovación de esta técnica, sin embargo, supone un gran cambio cultural respecto a los estándares y procedimientos tradicionales de certificación. En esta línea, las contribuciones de esta tesis están agrupadas en tres ejes principales: (i) definición de argumentos de seguridad para la certificación de aplicaciones de criticidad-mixta sobre plataformas multi-núcleo. Se definen, en particular, mecanismos de seguridad, técnicas de diagnóstico y reacción de faltas acorde con el estándar IEC 61508 sobre una arquitectura multi-núcleo de referencia. Respecto al análisis temporal, (ii) presentamos la cuantificación de la probabilidad de exceder un límite temporal y su relación con los requisitos de reducción de riesgos derivados de los estándares de seguridad funcional. Con este fin, nos basamos en la técnica MBPTA y presentamos el diseño de una fuente de números aleatorios segura; un componente clave para conseguir las propiedades aleatorias requeridas por MBPTA a nivel de plataforma. Por último, (iii) extrapolamos las guías actuales para la certificación de arquitecturas multi-núcleo a una solución comercial de 8 núcleos y las evaluamos con respecto a las tendencias emergentes de diseño de alto rendimiento (caches). Con estas contribuciones, esta tesis trata de abordar los retos que el uso de procesadores multi-núcleo y MBPTA implican en el proceso de certificación de sistemas críticos de tiempo real y facilita, de esta forma, su adopción por la industria.Postprint (published version

    Mixed-Criticality Systems on Commercial-Off-the-Shelf Multi-Processor Systems-on-Chip

    Get PDF
    Avionics and space industries are struggling with the adoption of technologies like multi-processor system-on-chips (MPSoCs) due to strict safety requirements. This thesis propose a new reference architecture for MPSoC-based mixed-criticality systems (MCS) - i.e., systems integrating applications with different level of criticality - which are a common use case for aforementioned industries. This thesis proposes a system architecture capable of granting partitioning - which is, for short, the property of fault containment. It is based on the detection of spatial and temporal interference, and has been named the online detection of interference (ODIn) architecture. Spatial partitioning requires that an application is not able to corrupt resources used by a different application. In the architecture proposed in this thesis, spatial partitioning is implemented using type-1 hypervisors, which allow definition of resource partitions. An application running in a partition can only access resources granted to that partition, therefore it cannot corrupt resources used by applications running in other partitions. Temporal partitioning requires that an application is not able to unexpectedly change the execution time of other applications. In the proposed architecture, temporal partitioning has been solved using a bounded interference approach, composed of an offline analysis phase and an online safety net. The offline phase is based on a statistical profiling of a metric sensitive to temporal interference’s, performed in nominal conditions, which allows definition of a set of three thresholds: 1. the detection threshold TD; 2. the warning threshold TW ; 3. the α threshold. Two rules of detection are defined using such thresholds: Alarm rule When the value of the metric is above TD. Warning rule When the value of the metric is in the warning region [TW ;TD] for more than α consecutive times. ODIn’s online safety-net exploits performance counters, available in many MPSoC architectures; such counters are configured at bootstrap to monitor the selected metric(s), and to raise an interrupt request (IRQ) in case the metric value goes above TD, implementing the alarm rule. The warning rule is implemented in a software detection module, which reads the value of performance counters when the monitored task yields control to the scheduler and reset them if there is no detection. ODIn also uses two additional detection mechanisms: 1. a control flow check technique, based on compile-time defined block signatures, is implemented through a set of watchdog processors, each monitoring one partition. 2. a timeout is implemented through a system watchdog timer (SWDT), which is able to send an external signal when the timeout is violated. The recovery actions implemented in ODIn are: • graceful degradation, to react to IRQs of WDPs monitoring non-critical applications or to warning rule violations; it temporarily stops non-critical applications to grant resources to the critical application; • hard recovery, to react to the SWDT, to the WDP of the critical application, or to alarm rule violations; it causes a switch to a hot stand-by spare computer. Experimental validation of ODIn was performed on two hardware platforms: the ZedBoard - dual-core - and the Inventami board - quad-core. A space benchmark and an avionic benchmark were implemented on both platforms, composed by different modules as showed in Table 1 Each version of the final application was evaluated through fault injection (FI) campaigns, performed using a specifically designed FI system. There were three types of FI campaigns: 1. HW FI, to emulate single event effects; 2. SW FI, to emulate bugs in non-critical applications; 3. artificial bug FI, to emulate a bug in non-critical applications introducing unexpected interference on the critical application. Experimental results show that ODIn is resilient to all considered types of faul
    • …
    corecore