1,064,361 research outputs found

    Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)

    Get PDF
    Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this paper, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors. Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging

    A Case Study in Testing Distributed Systems

    Get PDF
    This paper describes a case study in the testing of distributed systems. The software under test is a middleware system developed in Java. The full test lifecycle is examined including unit testing, integration testing, and system testing. Where possible, traditional tools and techniques are used to carry out the testing. One aspect where this is not possible is the testing of the low-level concurrency, which is often overlooked when testing commercial distributed systems, since the middleware or application server is already developed by a third-party and is assumed to operate correctly. This paper examines testing the middleware system itself, and therefore, a method for testing the concurrency properties of the system is used. The testing revealed a number of faults and design weaknesses, and showed that, with some adaptation, traditional tools and techniques go a long way in the testing of distributed applications

    Efficient testing based on logical architecture

    Get PDF
    The rapid increase of software-intensive systems' size and complexity makes it infeasible to exhaustively run testing on the low level of source code. Instead, the testing should be executed on the high level of system architecture, i.e., at a level where component or subsystems relate and interoperate or interact collectively with the system environment. Testing at this level is system testing, including hardware and software in union. Moreover, when integrating complex, distributed systems and providing support for conformance, interoperability and interoperation tests, we need to have an explicit test description. In this vision paper, we discuss (1) how to select tests from logical architecture, especially based on the dependencies within the system, and (2) how to represent the selected tests in explicit and readable manner, so that the software systems can be cost-e!ciently maintained and evolved over their entire life-cycle. In addition, we further study the relevance between di↵erent tests, based on which, we can optimise the test suites for e!cient testing, and propose optimal resource allocation strategies for cloud-based testing

    Towards quality-oriented architecture: Integration in a global context

    Get PDF
    This paper introduces an architectural framework for developing systems of systems, where the development plants are geographically distributed across different countries. The focus of our ongoing work is on architectural sustainability, in the sense of cost-effective longevity and endurance, and on quality assurance from the perspectives of integration in a global context. The core of our framework are different levels of abstraction, where state-of-the-art industrial development process is extended by the level of remote virtual system representation. Each abstraction level is associated with a different level of context-dependent architecture as well as the corresponding testing approaches

    Adaptive Resonance: An Emerging Neural Theory of Cognition

    Full text link
    Adaptive resonance is a theory of cognitive information processing which has been realized as a family of neural network models. In recent years, these models have evolved to incorporate new capabilities in the cognitive, neural, computational, and technological domains. Minimal models provide a conceptual framework, for formulating questions about the nature of cognition; an architectural framework, for mapping cognitive functions to cortical regions; a semantic framework, for precisely defining terms; and a computational framework, for testing hypotheses. These systems are here exemplified by the distributed ART (dART) model, which generalizes localist ART systems to allow arbitrarily distributed code representations, while retaining basic capabilities such as stable fast learning and scalability. Since each component is placed in the context of a unified real-time system, analysis can move from the level of neural processes, including learning laws and rules of synaptic transmission, to cognitive processes, including attention and consciousness. Local design is driven by global functional constraints, with each network synthesizing a dynamic balance of opposing tendencies. The self-contained working ART and dART models can also be transferred to technology, in areas that include remote sensing, sensor fusion, and content-addressable information retrieval from large databases.Office of Naval Research and the defense Advanced Research Projects Agency (N00014-95-1-0409, N00014-1-95-0657); National Institutes of Health (20-316-4304-5

    Advancements in Real-Time Simulation of Power and Energy Systems

    Get PDF
    Modern power and energy systems are characterized by the wide integration of distributed generation, storage and electric vehicles, adoption of ICT solutions, and interconnection of different energy carriers and consumer engagement, posing new challenges and creating new opportunities. Advanced testing and validation methods are needed to efficiently validate power equipment and controls in the contemporary complex environment and support the transition to a cleaner and sustainable energy system. Real-time hardware-in-the-loop (HIL) simulation has proven to be an effective method for validating and de-risking power system equipment in highly realistic, flexible, and repeatable conditions. Controller hardware-in-the-loop (CHIL) and power hardware-in-the-loop (PHIL) are the two main HIL simulation methods used in industry and academia that contribute to system-level testing enhancement by exploiting the flexibility of digital simulations in testing actual controllers and power equipment. This book addresses recent advances in real-time HIL simulation in several domains (also in new and promising areas), including technique improvements to promote its wider use. It is composed of 14 papers dealing with advances in HIL testing of power electronic converters, power system protection, modeling for real-time digital simulation, co-simulation, geographically distributed HIL, and multiphysics HIL, among other topics

    Aspect-oriented fault tolerance for real-time embedded systems

    Get PDF
    Real-time embedded systems for safety-critical applications have to introduce fault tolerance mechanisms in order to cope with hardware and software errors. Fault tolerance is usually applied by means of redundancy and diversity. Redundant hardware implies the establishment of a distributed system executing a set of fault tolerance strategies by software, and may also employ some form of diversity, by using different variants or versions for the same processing. This paper describes our approach to introduce fault tolerance in distributed embedded systems applications, using aspect-oriented programming (AOP). A real-time operating system sup-porting middleware thread communication was integrated to a fault tolerant framework. The introduction of fault tolerance in the system is performed by AOP at the application thread level. The advantages of this approach include higher modularization, less efforts for legacy systems evolution and better configurability for testing and product line development. This work has been tested and evaluated successfully in several fault tolerant configurations and presented no significant performance or memory footprint costs.Fundação para a Ciência e a Tecnologia (FCT
    • …
    corecore