12 research outputs found

    Introduction to the special section on dependable network computing

    Dependable network computing is becoming a key part of our daily economic and social life. Every day, millions of users and businesses use the Internet infrastructure for real-time electronic commerce transactions, scheduling important events, and building relationships. While network traffic and the number of users are growing rapidly, the mean time to failure (MTTF) is surprisingly short; according to recent studies, the MTTF is 28 days in the majority of Internet backbone paths. This leads to a strong requirement for highly dependable networks, servers, and software systems. The challenge is to build interconnected systems, based on available technology, that are inexpensive, accessible, scalable, and dependable. This special section provides insights into a number of these exciting challenges.

    Application-level fault tolerance in real-time embedded systems

    Critical real-time embedded systems need to use fault tolerance techniques to cope with operation-time errors, in either hardware or software. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies in software, and may also employ some form of diversity by using different variants or versions for the same processing. This work proposes and evaluates a fault tolerance framework for supporting the development of dependable applications. The framework is built upon basic operating system services and middleware communications and brings flexible and transparent support for application threads. A case study involving radar filtering is described, and the framework's advantages and drawbacks are discussed. Funding: Fundação para a Ciência e a Tecnologia (FCT).
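
The redundancy-with-diversity idea in this abstract can be illustrated with a majority voter over several variants of the same computation. This is only a minimal sketch under stated assumptions: the function `vote` and the three `double_*` variants are hypothetical names introduced here, and the abstract does not say that the proposed framework uses this particular voting scheme.

```python
from collections import Counter


def vote(variants, x):
    """Run every variant on x and return the result a strict majority agrees on.

    Assumption: results are hashable and comparable for exact equality.
    A variant that raises is treated as a failed replica and skipped.
    """
    results = []
    for variant in variants:
        try:
            results.append(variant(x))
        except Exception:
            continue  # a crashed variant contributes no vote
    if not results:
        raise RuntimeError("all variants failed")
    value, count = Counter(results).most_common(1)[0]
    # Require agreement from more than half of ALL variants, not just survivors.
    if count * 2 <= len(variants):
        raise RuntimeError("no majority among variants")
    return value


# Hypothetical diverse implementations of the same function (doubling).
def double_by_multiplication(x):
    return x * 2


def double_by_addition(x):
    return x + x


def double_with_fault(x):
    return x * 3  # deliberately faulty variant


if __name__ == "__main__":
    variants = [double_by_multiplication, double_by_addition, double_with_fault]
    print(vote(variants, 4))
```

With three variants, a single wrong or crashed variant is masked; two disagreeing failures leave no majority and the voter reports an error instead of guessing.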

    A Development Process for the Design, Implementation and Code Generation of Fault Tolerant Reconfigurable Real Time Systems

    Implementing hard real-time systems is extremely difficult today because of safety and dynamic reconfiguration requirements. Whatever precautions are taken, the occurrence of faults in such systems is sometimes unavoidable, so developers have to account for faults from the design level onward. In this context, we note the need for techniques that ensure the dependability of real-time, distributed, dynamically reconfigurable systems. We focus on fault tolerance, that is, avoiding service failures in the presence of faults. In this paper, we define a development process for modeling and generating fault tolerance code for real-time systems using aspect-oriented programming. First, we integrate fault tolerance elements from the modeling step of a system onward, in order to take advantage of the analysis, proof, and verification features available at this stage using AADL and its Error Model Annex. Second, we extend an aspect-oriented language and adapt it to respect real-time requirements. Finally, we define a code generation process for both functional concerns and cross-cutting ones such as fault tolerance, and we propose an extension of an existing middleware. To validate our contribution, we use AADL and its annexes to design a landing gear system as an embedded distributed system.

    An integrated methodology for the performance and reliability evaluation of fault-tolerant systems

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (leaves 220-224). This thesis proposes a new methodology for the integrated performance and reliability evaluation of embedded fault-tolerant systems used in aircraft, space, tactical, and automotive applications. This methodology uses a behavioral model of the system dynamics, similar to the ones used by control engineers when designing the control system, but incorporates additional artifacts to model the failure behavior of the system components. These artifacts include component failure modes (and associated failure rates) and how those failure modes affect the dynamic behavior of the component. The methodology bases the system evaluation on the analysis of the dynamics of the different configurations the system can reach after component failures occur. For each of the possible system configurations, a performance evaluation of its dynamic behavior is carried out to check whether its properties, e.g., accuracy, overshoot, or settling time, which are called performance metrics, meet system requirements. Markov chains are used to model the stochastic process associated with the different configurations that a system can adopt when failures occur. Reliability and unreliability measures can be quantified, as well as probabilistic measures of performance, by merging the values of the performance metrics for each configuration and the system configuration probabilities yielded by the corresponding Markov model. This methodology is not only used for system evaluation, but also for guiding the design process and further optimization. Thus, within the context of the new methodology, we define new importance measures to rank the contributions of model parameters to system reliability and performance. 
In order to support this methodology, we developed a MATLAB/SIMULINK® tool, which also provides a common environment with a common language for control engineers and reliability engineers to develop fault-tolerant systems. We illustrate the use of the methodology and the capabilities of the tool with two case studies. The first one corresponds to the lateral-directional control system of an advanced fighter aircraft. This case study shows how the methodology can identify weak points in the system design and point out possible solutions to eliminate them; compare different architecture alternatives from different perspectives; and test different failure detection, isolation, and reconfiguration (FDIR) techniques. This case study also shows the effectiveness of the MATLAB/SIMULINK® tool in analyzing large and complex systems. The second case study compares two very different solutions to achieve fault tolerance in a steer-by-wire (SbW) system. The first solution is based on the replication of components and the introduction of failure detection, isolation, and reconfiguration mechanisms. In the second solution, a dissimilar backup mechanism called brake-actuated steering (BAS) is used to achieve fault tolerance rather than replicating each component within the system. This case study complements the flight control system one by showing how the performance and MATLAB/SIMULINK® tool can be used to compare very different architectural approaches to achieve fault tolerance and, therefore, how the methodology can be used to choose the best design in terms of performance and reliability. By Alejandro D. Domínguez-García. Ph.D.
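
The step of merging configuration probabilities from a Markov model with per-configuration performance metrics can be sketched in a few lines. This is an illustrative sketch only, not the thesis's MATLAB/SIMULINK® tool: it assumes a hypothetical duplex system without repair (states: both components working, one working, system failed), a made-up failure rate `LAMBDA`, made-up settling times, and a fixed-step Runge-Kutta integration chosen here for self-containment.

```python
import math


def transient_probabilities(Q, p0, t, steps=10_000):
    """Integrate the Markov-chain equations dp/dt = p Q with fixed-step RK4."""
    n = len(p0)

    def deriv(p):
        return [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]

    h = t / steps
    p = list(p0)
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([p[i] + 0.5 * h * k1[i] for i in range(n)])
        k3 = deriv([p[i] + 0.5 * h * k2[i] for i in range(n)])
        k4 = deriv([p[i] + h * k3[i] for i in range(n)])
        p = [p[i] + h / 6.0 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
             for i in range(n)]
    return p


LAMBDA = 1e-4  # hypothetical component failure rate, per hour

# Generator matrix: state 0 = both working, 1 = one working, 2 = system failed.
Q = [
    [-2 * LAMBDA, 2 * LAMBDA, 0.0],
    [0.0, -LAMBDA, LAMBDA],
    [0.0, 0.0, 0.0],
]

# Hypothetical performance metric (settling time, seconds) per operational state.
SETTLING_TIME = [1.0, 1.8]
REQUIREMENT = 1.5  # hypothetical requirement: settle within 1.5 s


def evaluate(t_hours):
    """Return state probabilities, reliability, and P(requirement is met)."""
    p = transient_probabilities(Q, [1.0, 0.0, 0.0], t_hours)
    reliability = p[0] + p[1]  # the system is operational in states 0 and 1
    prob_meeting = sum(p[i] for i, s in enumerate(SETTLING_TIME)
                       if s <= REQUIREMENT)
    return p, reliability, prob_meeting


if __name__ == "__main__":
    p, reliability, prob_meeting = evaluate(1000.0)
    print(f"reliability = {reliability:.6f}")
    print(f"P(settling time requirement met) = {prob_meeting:.6f}")
```

In this toy model the degraded configuration still counts toward reliability but not toward the probability of meeting the settling-time requirement, which is the kind of distinction a combined performance and reliability evaluation is meant to expose.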

    NASA Langley Scientific and Technical Information Output: 1998

    This document is a compilation of the scientific and technical information that the Langley Research Center has produced during the calendar year 1998. Included are citations for Technical Publications, Conference Publications, Technical Memorandums, Contractor Reports, Journal Articles and Book Publications, Meeting Presentations, Technical Talks, and Patents

    Operating system fault tolerance support for real-time embedded applications

    Doctoral thesis in Industrial Electronics (specialization in Industrial Informatics). Fault tolerance is a means of achieving high dependability for critical and high-availability systems. Despite the efforts to prevent and remove faults during the development of these systems, the application of fault tolerance is usually required because the hardware may fail during system operation and software faults are very hard to eliminate completely. One of the difficulties in implementing fault tolerance techniques is the lack of support from operating systems and middleware. In most fault-tolerant projects, the programmer has to develop a fault tolerance implementation for each application. This strong customization makes fault-tolerant software costly and difficult to implement and maintain. In particular, for small-scale embedded systems, the introduction of fault tolerance techniques may also have an impact on their restricted resources, such as processing power and memory size. The purpose of this research is to provide fault tolerance support for real-time applications in small-scale embedded systems. The main approach of this thesis is to develop and integrate a customizable and extendable fault tolerance framework into a real-time operating system, in order to fulfill the needs of a large range of dependable applications. Special attention is paid to allowing fault tolerance to coexist with real-time constraints. Using the proposed framework offers several advantages over ad hoc implementations, such as simplifying application-level programming and improving system configurability and maintainability. In addition, this thesis investigates the application of aspect-oriented techniques to the development of real-time embedded fault-tolerant software. Aspect-Oriented Programming (AOP) is employed to modularize all fault-tolerant source code, following the principle of separation of concerns, and to integrate the proposed framework into the operating system. Two case studies are used to evaluate the proposed implementation in terms of performance and resource costs. The results show that the overheads related to the framework application are acceptable and the ones related to the AOP implementation are negligible.
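
The separation-of-concerns idea can be illustrated by keeping the fault tolerance logic out of the application function entirely. The sketch below is an assumption-laden illustration, not the thesis's framework: it uses a Python decorator as a rough stand-in for aspect weaving, and it swaps in a recovery-block pattern (primary, alternates, acceptance test) that the abstract does not name. All identifiers (`recovery_block`, `mean_filter`, `median_filter`) are hypothetical.

```python
import functools


def recovery_block(*alternates, acceptance_test):
    """Wrap a primary function with alternates tried in order.

    A result is returned only if it passes acceptance_test; a variant that
    raises or produces an unacceptable result is skipped.
    """
    def decorate(primary):
        @functools.wraps(primary)
        def wrapper(*args, **kwargs):
            for candidate in (primary, *alternates):
                try:
                    result = candidate(*args, **kwargs)
                except Exception:
                    continue  # treat an exception as a detected error
                if acceptance_test(result):
                    return result
            raise RuntimeError("all variants failed the acceptance test")
        return wrapper
    return decorate


def median_filter(samples):
    """Alternate variant: the median is robust to a single outlier."""
    return sorted(samples)[len(samples) // 2]


# The application code below contains no fault tolerance logic of its own.
@recovery_block(median_filter, acceptance_test=lambda v: 0.0 <= v <= 100.0)
def mean_filter(samples):
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print(mean_filter([10.0, 12.0, 11.0]))    # primary result is accepted
    print(mean_filter([10.0, 12.0, 1000.0]))  # falls back to the median
```

The fault tolerance policy lives in one place and can be changed or removed without touching `mean_filter`, which is the property aspect-oriented modularization aims for.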

    Energy- and quality-aware scheduling of periodic tasks in embedded real-time systems

    Mobile devices are increasingly used to execute real-time applications. They provide ever more processing performance while becoming smaller and lighter. High processing performance, however, demands a great deal of energy, which conflicts with the small battery capacities that follow from the required size and weight of the devices. When a schedule is calculated, energy consumption therefore gains importance alongside the timely completion of real-time tasks, because the devices should operate independently of a power outlet for as long as possible. On the other hand, computation-intensive applications run on these devices, for which it is desirable to obtain the maximum quality achievable with the available processing performance. This work presents a system model that joins the design-to-time approach with the ability of modern processors to adapt their performance dynamically (processing performance and electrical power consumed). Design-to-time scheduling allows for energy savings or quality enhancements by dynamically selecting alternative implementations, which fulfill the same task with different execution times and different quality or energy consumption. The system model comprises, among other things, periodic tasks with hard real-time constraints, data dependencies, and alternative implementations, as well as processors with discrete performance levels. Scheduling is split into two phases. In the offline phase, a flexible plan is calculated that contains, for every combination of elapsed time and still-unscheduled task set that is possible at run time, the task to be scheduled next, the implementation to use, and, where applicable, the performance level to set. During the online phase, a scheduler interprets this flexible plan with negligible time and energy overhead. To calculate optimal flexible plans, an optimizer was developed that generates a sequence of flexible plans of monotonically increasing quality (lower energy consumption or higher quality) and therefore belongs to the class of anytime algorithms. A variant of dynamic programming determines globally optimal flexible plans, which serve, for example, as reference values for benchmarks. A variant of the optimizer based on simulated annealing finds good flexible plans faster for more extensive applications.
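
The two-phase idea (compute a flexible plan offline, look it up online) can be sketched with a small dynamic program. This is a heavily simplified illustration, not the thesis's optimizer: it assumes a fixed task order, integer time units, one non-periodic run, and energy charged per variant regardless of actual run time. Each variant stands for a hypothetical implementation and clock-level combination, and all numbers in `TASKS` and `DEADLINE` are made up.

```python
from functools import lru_cache

# Hypothetical task chain: each task lists (variant, worst-case duration, energy).
TASKS = [
    [("fast", 2, 5.0), ("slow", 4, 2.0)],
    [("fast", 1, 4.0), ("slow", 3, 1.5)],
    [("fast", 2, 6.0), ("slow", 5, 2.5)],
]
DEADLINE = 9
INF = float("inf")


@lru_cache(maxsize=None)
def best(i, elapsed):
    """Offline DP: (minimum energy, variant) to finish tasks i.. by DEADLINE."""
    if elapsed > DEADLINE:
        return INF, None
    if i == len(TASKS):
        return 0.0, None
    choice = (INF, None)
    for name, duration, energy in TASKS[i]:
        rest, _ = best(i + 1, elapsed + duration)
        if energy + rest < choice[0]:
            choice = (energy + rest, name)
    return choice


def build_plan():
    """Flexible plan: a variant for every (next task, elapsed time) pair."""
    return {(i, t): best(i, t)[1]
            for i in range(len(TASKS)) for t in range(DEADLINE + 1)}


def run_online(plan, actual=None):
    """Online phase: table lookups only.

    actual maps (task index, variant) to an actual duration that is at most
    the worst case; missing entries default to the worst case.
    """
    actual = actual or {}
    elapsed, energy, chosen = 0, 0.0, []
    for i, variants in enumerate(TASKS):
        name = plan[(i, elapsed)]
        if name is None:
            raise RuntimeError("no variant meets the deadline")
        _, wcet, cost = next(v for v in variants if v[0] == name)
        elapsed += actual.get((i, name), wcet)
        energy += cost
        chosen.append(name)
    return chosen, energy, elapsed


if __name__ == "__main__":
    plan = build_plan()
    print(run_online(plan))                      # worst-case run times
    print(run_online(plan, {(0, "slow"): 2}))    # task 0 finishes early
```

When the first task finishes early, the lookup at the new elapsed time selects a different combination of variants for the remaining tasks and lowers total energy, without any optimization work at run time.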