
    A support architecture for reliable distributed computing systems

    The Clouds kernel design has gone through several design phases and is nearly complete. The object manager, the process manager, the storage manager, the communications manager, and the actions manager are examined.

    Computing fractal dimension in supertransient systems directly, fast and reliable

    Chaotic transients occur in many experiments, including those in fluids, in simulations of plane Couette flow, and in coupled map lattices, and they are a common phenomenon in dynamical systems. Superlong chaotic transients are caused by the presence of chaotic saddles whose stable sets have fractal dimensions that are close to the phase-space dimension. For many physical systems chaotic saddles have a large impact on laboratory measurements, and it is important to compute the dimension of such stable sets, including fractal basin boundaries, through a direct method. In this work, we present a new method to compute the dimension of stable sets of chaotic saddles directly, quickly, and reliably. Comment: 6 pages, 3 figures

    An approach to rollback recovery of collaborating mobile agents

    Fault tolerance is one of the main problems that must be resolved to improve the adoption of the agent computing paradigm. In this paper, we analyse the execution model of agent platforms and the significance of the faults affecting their constituent components for the reliable execution of agent-based applications, in order to develop a pragmatic framework for fault tolerance in agent systems. The developed framework deploys a communication-pairs independent checkpointing strategy to offer a low-cost, application-transparent model for reliable agent-based computing that covers all possible faults that might invalidate reliable agent execution, migration and communication, and that maintains the exactly-once execution property.

    Ultra Reliable Computing Systems

    For high-security and safety applications as well as general-purpose applications, it is necessary to have ultra reliable computing systems. This dissertation describes our system of self-testable and self-repairable digital devices, in particular EPLDs (Electrically Programmable Logic Devices). In addition to significantly improving the reliability of digital systems, our self-healing and reconfigurable system design with added repair capability can also provide higher yields, lower testing costs, and faster time-to-market for the semiconductor industry. The digital system in our approach is composed of blocks, which realize combinational and sequential circuits using GALs (Generic Array Logic Devices). We describe three techniques for fault locating and fault repairing in these devices. We evaluated these methods, and compared them with devices that have no self-repair capability, by simulating the self-repair algorithms. Our simulations show that the lifetime of a GAL-based EPLD that uses our multiple self-repairing methods is longer than the lifetime of a GAL-based EPLD that uses a single self-repair method or no self-repair method. Specifically, our work demonstrates that the lifetime of a GAL can be increased by adding extra columns in the AND array of a GAL and extra output ORs in a GAL. It also gives information on how many extra columns and extra ORs a GAL needs and which self-repairing method should be used to guarantee a given lifetime. Thus, we can estimate an ideal point, where the maximum reliability can be reached with the minimum cost.

    Plug in to grid computing

    This article discusses the potential benefits of grid computing for future power networks. It is also intended to alert the power system community to the concept of grid computing and to initiate a discussion of its potential applications in future power systems. Much like the Web, the grid can operate over the Internet or any other suitable computer networking technology. Grid computing offers an inexpensive and efficient means for participants to compete (but also cooperate) in providing a reliable, cheap, and sustainable electrical energy supply. It also provides a relatively inexpensive new technology allowing the output of embedded generators to be monitored and, when necessary, controlled. Basically, the ability of grid-enabled systems to interact autonomously is vital for small generators where manned operation is unlikely to be viable.

    An Algebraic Model For Quorum Systems

    Quorum systems are a key mathematical abstraction in distributed fault-tolerant computing for capturing trust assumptions. A quorum system is a collection of subsets of all processes, called quorums, with the property that each pair of quorums has a non-empty intersection. They can be found at the core of many reliable distributed systems, such as cloud computing platforms, distributed storage systems and blockchains. In this paper we give a new interpretation of quorum systems, starting with classical majority-based quorum systems and extending this to Byzantine quorum systems. We propose an algebraic representation of the theory underlying quorum systems making use of multivariate polynomial ideals, incorporating properties of these systems, and studying their algebraic varieties. To achieve this goal we will exploit properties of Boolean Groebner bases. The nice nature of Boolean Groebner bases allows us to avoid part of the combinatorial computations required to check consistency and availability of quorum systems. Our results provide a novel approach to test quorum system properties from both algebraic and algorithmic perspectives. Comment: 15 pages, 3 algorithms
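
    The definition in this abstract (every pair of quorums intersects) can be checked by brute force. The following is a minimal sketch of that check only, not the paper's algebraic method; the function names `is_quorum_system` and `majority_quorums` and the process labels are hypothetical, chosen for illustration.

```python
from itertools import combinations


def is_quorum_system(quorums):
    """Return True if every pair of quorums shares at least one process.

    Brute-force check of the pairwise-intersection property; this is an
    illustrative assumption, not the algebraic test proposed in the paper.
    """
    sets = [frozenset(q) for q in quorums]
    # An empty collection or an empty quorum cannot satisfy the definition.
    if not sets or any(not q for q in sets):
        return False
    return all(a & b for a, b in combinations(sets, 2))


def majority_quorums(processes):
    """Build the classical majority quorums: all subsets of size floor(n/2) + 1."""
    procs = sorted(processes)
    size = len(procs) // 2 + 1
    return [frozenset(c) for c in combinations(procs, size)]


if __name__ == "__main__":
    quorums = majority_quorums({"p1", "p2", "p3"})
    print(is_quorum_system(quorums))
    print(is_quorum_system([{"p1"}, {"p2"}]))
```

    Any two majorities of the same process set overlap, so the first call reports a valid quorum system, while the two disjoint singletons in the second call do not form one. The pairwise check grows quadratically with the number of quorums, which is the kind of combinatorial cost the abstract says its approach partly avoids.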

    A NASA initiative: Software engineering for reliable complex systems

    The objective is the development of methods, technology, and skills that will enable NASA to cost-effectively specify, build, and manage reliable software which can evolve and be maintained over an extended period. The need for such software is rooted in the increasing integration of software and computing components into NASA systems. Current NASA software engineering expertise was applied toward some of the largest reliable systems, including: shuttle launch; ground support; shuttle simulation; mission control; satellite tracking; and scientific data systems. Unfortunately, no theory exists for reliable complex software systems. NASA is seeking to fill this theoretical gap through a number of approaches. One such approach is to conduct research on theoretical foundations for managing complex software systems. It includes communication models, new and modified paradigms, and life-cycle models. Another approach is research in the theoretical foundations for reliable software development and validation. It focuses upon formal specifications, programming languages, software engineering systems, software reuse, formal verification, and software safety. Further approaches involve benchmarking a NASA software environment, experimentation within the NASA context, evolution of present NASA methodology, and transfer of technology to the space station software support environment.

    Reliable spin-based computing systems

    Scaling of logic devices has enabled tremendous improvement in computational efficiency. However, computational scaling beyond the electronics based on Moore's law requires the adoption of alternate state variables including spin. Spin based devices offer several advantages such as low device count and non-volatility, and have the potential to beat the energy-delay product of CMOS. However, thermal noise in these devices makes their switching delay a random variable. Deterministic von Neumann style computing requires them to operate at worst case delay (and low error-rate), thereby completely offsetting the energy-delay benefits of spin devices and making them non-competitive against CMOS. In this thesis, we show that, by exploiting inherent device characteristics and architectural-level techniques, it is possible to shape the system-level output error distribution, thereby enabling effective error compensation and reliable system behavior. In particular, we demonstrate that, for a simple binary classifier, a 33× improvement in accuracy over a conventional design can be achieved while tolerating a device error rate of 10%. This work paves a way towards the design of reliable spin-based systems using highly error prone, but energy-efficient spin devices.

    Welcome to EICS 2016

    [Extract] The ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS) is a yearly international conference devoted to engineering usable and reliable interactive computing systems. Research presented at EICS revolves around methods, processes, techniques and tools that support specifying, designing, developing, deploying and verifying interactive systems. This 8th ACM SIGCHI Symposium on Engineering Interactive Computing Systems (EICS'16) took place in Brussels, Belgium (21-24 June 2016) – at the heart of Europe...