2,182 research outputs found

    Software dependability modeling using an industry-standard architecture description language

    Full text link
    Performing dependability evaluation along with other analyses at architectural level allows both making architectural tradeoffs and predicting the effects of architectural decisions on the dependability of an application. This paper gives guidelines for building architectural dependability models for software systems using the AADL (Architecture Analysis and Design Language). It presents reusable modeling patterns for fault-tolerant applications and shows how the presented patterns can be used in the context of a subsystem of a real-life application

    Developing a distributed electronic health-record store for India

    Get PDF
    The DIGHT project is addressing the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India

    Exception handling in the development of fault-tolerant component-based systems

    Get PDF
    Orientador: Cecilia Mary Fischer RubiraTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Mecanismos de tratamento de exceçÔes foram concebidos com o intuito de facilitar o gerenciamento da complexidade de sistemas de software tolerantes a falhas. Eles promovem uma separação textual explĂ­cita entre o cĂłdigo normal e o cĂłdigo que lida com situaçÔes anormais, afim de dar suporte a construção de programas que sĂŁo mais concisos fĂĄceis de evoluir e confĂĄveis. Diversas linguagens de programação modernas e a maioria dos modelos de componentes implementam mecanismos de tratamento de exceçÔes. Apesar de seus muitos benefĂ­cios, tratamento de exceçÔes pode ser a fonte de diversas falhas de projeto se usado de maneira indisciplinada. Estudos recentes mostram que desenvolvedores de sistemas de grande escala baseados em infra-estruturas de componentes tĂȘm hĂĄbitos, no tocante ao uso de tratamento de exceçÔes, que tornam suas aplicaçÔes vulnerĂĄveis a falhas e difĂ­ceis de se manter. Componentes de software criam novos desafios com os quais mecanismos de tratamento de exceçÔes tradicionais nĂŁo lidam, o que aumenta a probabilidade de que problemas ocorram. Alguns exemplos sĂŁo indisponibilidade de cĂłdigo fonte e incompatibilidades arquiteturais. Neste trabalho propomos duas tĂ©cnicas complementares centradas em tratamento de exceçÔes para a construção de sistemas tolerantes a falhas baseados em componentes. Ambas tĂȘm ĂȘnfase na estrutura do sistema como um meio para se reduzir o impacto de mecanismos de tolerĂąncia a falhas em sua complexidade total e o nĂșmero de falhas de projeto decorrentes dessa complexidade. A primeira Ă© uma abordagem para o projeto arquitetural dos mecanismos de recuperação de erros de um sistema. Ela trata do problema de verificar se uma arquitetura de software satisfaz certas propriedades relativas ao fluxo de exceçÔes entre componentes arquiteturais, por exemplo, se todas as exceçÔes lançadas no nĂ­vel arquitetural sĂŁo tratadas. A abordagem proposta lança de diversas ferramentas existentes para automatizar ao mĂĄximo esse processo. A segunda consiste em aplicar programação orientada a aspectos (AOP) afim de melhorar a modularização de cĂłdigo de tratamento de exceçÔes. Conduzimos um estudo aprofundado com o objetivo de melhorar o entendimento geral sobre o efeitos de AOP no cĂłdigo de tratamento de exceçÔes e identificar as situaçÔes onde seu uso Ă© vantajoso e onde nĂŁo Ă©Abstract: Exception handling mechanisms were conceived as a means to help managing the complexity of fault-tolerant software. They promote an explicit textual separation between normal code and the code that deals with abnormal situations, in order to support the construction of programs that are more concise, evolvable, and reliable. Several mainstream programming languages and most of the existing component models implement exception handling mechanisms. In spite of its many bene?ts, exception handling can be a source of many design faults if used in an ad hoc fashion. Recent studies show that developers of large-scale software systems based on component infrastructures have habits concerning the use of exception handling that make applications vulnerable to faults and hard to maintain. Software components introduce new challenges which are not addressed by traditional exception handling mechanisms and increase the chances of problems occurring. Examples include unavailability of source code and architectural mismatches. In this work, we propose two complementary techniques centered on exception handling for the construction of fault-tolerant component-based systems. Both of them emphasize system structure as a means to reduce the impactof fault tolerance mechanisms on the overall complexity of a software system and the number of design faults that stem from complexity. The ?rst one is an approach for the architectural design of a system?s error handling capabilities. It addresses the problem of verifying whether a software architecture satis?es certain properties of interest pertaining the ?ow of exceptions between architectural components, e.g., if all the exceptions signaled at the architectural level are eventually handled. The proposed approach is based on a set of existing tools that automate this process as much as possible. The second one consists in applying aspect-oriented programming (AOP) to better modularize exception handling code. We have conducted a through study aimed at improving our understanding of the efects of AOP on exception handling code and identifying the situations where its use is advantageous and the ones where it is notDoutoradoDoutor em CiĂȘncia da Computaçã

    Rigorous Design of Distributed Transactions

    No full text
    Database replication is traditionally envisaged as a way of increasing fault-tolerance and availability. It is advantageous to replicate the data when transaction workload is predominantly read-only. However, updating replicated data within a transactional framework is a complex affair due to failures and race conditions among conflicting transactions. This thesis investigates various mechanisms for the management of replicas in a large distributed system, formalizing and reasoning about the behavior of such systems using Event-B. We begin by studying current approaches for the management of replicated data and explore the use of broadcast primitives for processing transactions. Subsequently, we outline how a refinement based approach can be used for the development of a reliable replicated database system that ensures atomic commitment of distributed transactions using ordered broadcasts. Event-B is a formal technique that consists of describing rigorously the problem in an abstract model, introducing solutions or design details in refinement steps to obtain more concrete specifications, and verifying that the proposed solutions are correct. This technique requires the discharge of proof obligations for consistency checking and refinement checking. The B tools provide significant automated proof support for generation of the proof obligations and discharging them. The majority of the proof obligations are proved by the automatic prover of the tools. However, some complex proof obligations require interaction with the interactive prover. These proof obligations also help discover new system invariants. The proof obligations and the invariants help us to understand the complexity of the problem and the correctness of the solutions. They also provide a clear insight into the system and enhance our understanding of why a design decision should work. The objective of the research is to demonstrate a technique for the incremental construction of formal models of distributed systems and reasoning about them, to develop the technique for the discovery of gluing invariants due to prover failure to automatically discharge a proof obligation and to develop guidelines for verification of distributed algorithms using the technique of abstraction and refinement

    Study of fault-tolerant software technology

    Get PDF
    Presented is an overview of the current state of the art of fault-tolerant software and an analysis of quantitative techniques and models developed to assess its impact. It examines research efforts as well as experience gained from commercial application of these techniques. The paper also addresses the computer architecture and design implications on hardware, operating systems and programming languages (including Ada) of using fault-tolerant software in real-time aerospace applications. It concludes that fault-tolerant software has progressed beyond the pure research state. The paper also finds that, although not perfectly matched, newer architectural and language capabilities provide many of the notations and functions needed to effectively and efficiently implement software fault-tolerance

    EOS: A project to investigate the design and construction of real-time distributed embedded operating systems

    Get PDF
    The EOS project is investigating the design and construction of a family of real-time distributed embedded operating systems for reliable, distributed aerospace applications. Using the real-time programming techniques developed in co-operation with NASA in earlier research, the project staff is building a kernel for a multiple processor networked system. The first six months of the grant included a study of scheduling in an object-oriented system, the design philosophy of the kernel, and the architectural overview of the operating system. In this report, the operating system and kernel concepts are described. An environment for the experiments has been built and several of the key concepts of the system have been prototyped. The kernel and operating system is intended to support future experimental studies in multiprocessing, load-balancing, routing, software fault-tolerance, distributed data base design, and real-time processing

    Autonomous spacecraft maintenance study group

    Get PDF
    A plan to incorporate autonomous spacecraft maintenance (ASM) capabilities into Air Force spacecraft by 1989 is outlined. It includes the successful operation of the spacecraft without ground operator intervention for extended periods of time. Mechanisms, along with a fault tolerant data processing system (including a nonvolatile backup memory) and an autonomous navigation capability, are needed to replace the routine servicing that is presently performed by the ground system. The state of the art fault handling capabilities of various spacecraft and computers are described, and a set conceptual design requirements needed to achieve ASM is established. Implementations for near term technology development needed for an ASM proof of concept demonstration by 1985, and a research agenda addressing long range academic research for an advanced ASM system for 1990s are established
    • 

    corecore