30,065 research outputs found

    Software that Learns from its Own Failures

    All non-trivial software systems suffer from unanticipated production failures. However, those systems are passive with respect to failures and do not take advantage of them to improve their future behavior: they simply wait for failures to happen and then trigger hard-coded recovery strategies. Instead, I propose a new paradigm in which software systems learn from their own failures. By using an advanced monitoring system, they have constant awareness of their own state and health. They are designed to automatically explore alternative recovery strategies inferred from past successful and failed executions. Their recovery capabilities are assessed by self-injection of controlled failures; this process produces knowledge in anticipation of future unanticipated failures
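
    The idea of ranking recovery strategies by past outcomes and probing them with injected failures can be sketched roughly as follows. This is only an illustration under assumptions, not the system proposed in the abstract: the class `RecoveryLearner`, the strategy names, and the failure kinds are all hypothetical.

```python
from collections import defaultdict


class RecoveryLearner:
    """Hypothetical sketch: remembers which recovery strategy worked for which failure kind."""

    def __init__(self, strategies):
        # strategies maps a name to a callable(kind) -> bool (True means recovery succeeded).
        self.strategies = strategies
        # (failure kind, strategy name) -> [successes, attempts]
        self.stats = defaultdict(lambda: [0, 0])

    def _record(self, kind, name, ok):
        entry = self.stats[(kind, name)]
        entry[0] += int(ok)
        entry[1] += 1

    def success_rate(self, kind, name):
        # Laplace-smoothed estimate so untried strategies start at 0.5.
        successes, attempts = self.stats[(kind, name)]
        return (successes + 1) / (attempts + 2)

    def drill(self, kind):
        """Self-inject a controlled failure and try every strategy to gather knowledge."""
        for name, strategy in self.strategies.items():
            self._record(kind, name, strategy(kind))

    def recover(self, kind):
        """On a real failure, try strategies in order of estimated success rate."""
        ranked = sorted(self.strategies,
                        key=lambda n: self.success_rate(kind, n),
                        reverse=True)
        for name in ranked:
            ok = self.strategies[name](kind)
            self._record(kind, name, ok)
            if ok:
                return name
        return None


# Hypothetical strategies: each one only handles a single failure kind.
strategies = {
    "retry": lambda kind: kind == "timeout",
    "restart": lambda kind: kind == "crash",
}
learner = RecoveryLearner(strategies)
learner.drill("crash")             # controlled failure injection
chosen = learner.recover("crash")  # "restart" is now tried first
```

    In this sketch the drill is what lets the later recovery skip the strategy that is already known to fail for that failure kind.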

    Process membership in asynchronous environments

    The development of reliable distributed software is simplified by the ability to assume a fail-stop failure model. The emulation of such a model in an asynchronous distributed environment is discussed. The solution proposed, called Strong-GMP, can be supported through a highly efficient protocol, and was implemented as part of a distributed systems software project at Cornell University. The precise definition of the problem, the protocol, correctness proofs, and an analysis of costs are addressed

    Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration

    Testing in Continuous Integration (CI) involves test case prioritization, selection, and execution at each cycle. Selecting the most promising test cases to detect bugs is hard if there is uncertainty about the impact of committed code changes or if traceability links between code and tests are not available. This paper introduces Retecs, a new method for automatically learning test case selection and prioritization in CI with the goal of minimizing the round-trip time between code commits and developer feedback on failed test cases. The Retecs method uses reinforcement learning to select and prioritize test cases according to their duration, last execution, and failure history. In a constantly changing environment, where new test cases are created and obsolete test cases are deleted, the Retecs method learns to prioritize error-prone test cases higher under the guidance of a reward function and by observing previous CI cycles. By applying Retecs to data extracted from three industrial case studies, we show for the first time that reinforcement learning enables fruitful automatic adaptive test case selection and prioritization in CI and regression testing. Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. In Proceedings of 26th International Symposium on Software Testing and Analysis (ISSTA'17) (pp. 12--22). AC
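
    To make the reward-driven prioritization idea concrete, here is a deliberately simplified sketch. It is not the Retecs agent or its reward functions: it assumes a plain per-test score table updated with a failure reward, and the names `Prioritizer`, `schedule`, `update`, and the sample test names are hypothetical.

```python
import random


class Prioritizer:
    """Hypothetical value-table agent: one learned score per test case."""

    def __init__(self, learning_rate=0.5, epsilon=0.0, seed=0):
        self.learning_rate = learning_rate
        self.epsilon = epsilon          # probability of a random exploratory ordering
        self.rng = random.Random(seed)
        self.scores = {}                # test name -> estimated failure likelihood

    def schedule(self, durations, time_budget):
        """Order tests by learned score, then keep those that fit the time budget."""
        names = list(durations)
        for name in names:
            self.scores.setdefault(name, 0.0)   # newly created tests start at 0
        if self.rng.random() < self.epsilon:
            self.rng.shuffle(names)
        else:
            # Highest score first; shorter tests break ties.
            names.sort(key=lambda n: (-self.scores[n], durations[n]))
        selected, used = [], 0.0
        for name in names:
            if used + durations[name] <= time_budget:
                selected.append(name)
                used += durations[name]
        return selected

    def update(self, outcomes):
        """outcomes maps test name -> True if the test failed in this CI cycle."""
        for name, failed in outcomes.items():
            reward = 1.0 if failed else 0.0
            old = self.scores.get(name, 0.0)
            self.scores[name] = old + self.learning_rate * (reward - old)


def demo():
    # Hypothetical CI history in which only "checkout" keeps failing.
    durations = {"login": 2.0, "checkout": 5.0, "search": 1.0}
    agent = Prioritizer()
    for _ in range(3):
        chosen = agent.schedule(durations, time_budget=8.0)
        agent.update({name: name == "checkout" for name in chosen})
    # With a tighter budget, the failure-prone test is now scheduled first.
    return agent.schedule(durations, time_budget=6.0)
```

    The score update is an exponential moving average of the failure reward, so tests that stop failing gradually lose priority in later cycles.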

    Machine learning and its applications in reliability analysis systems

    In this thesis, we are interested in exploring some aspects of Machine Learning (ML) and its application in Reliability Analysis systems (RAs). We begin by investigating some ML paradigms and their techniques, go on to discuss the possible applications of ML in improving RAs performance, and lastly give guidelines for the architecture of learning RAs. Our survey of ML covers both Neural Network learning and Symbolic learning. In symbolic process learning, five types of learning and their applications are discussed: rote learning, learning from instruction, learning from analogy, learning from examples, and learning from observation and discovery. The Reliability Analysis systems (RAs) presented in this thesis are mainly designed for maintaining plant safety, supported by two functions: a risk analysis function, i.e., failure mode effect analysis (FMEA); and a diagnosis function, i.e., real-time fault location (RTFL). Three approaches to creating the RAs are discussed. According to the results of our survey, we suggest that the best current design of RAs is to embed model-based RAs, i.e., MORA (as software), in a neural network based computer system (as hardware). However, there are still some improvements that can be made through the application of Machine Learning. By implanting the 'learning element', the MORA will become a learning MORA (La MORA) system, a learning Reliability Analysis system with the power of automatic knowledge acquisition, inconsistency checking, and more. To conclude our thesis, we propose an architecture of La MORA

    Simplified Distributed Programming with Micro Objects

    Developing large-scale distributed applications can be a daunting task. Object-based environments have attempted to alleviate problems by providing distributed objects that look like local objects. We advocate that this approach has actually only made matters worse, as the developer needs to be aware of many intricate internal details in order to adequately handle partial failures. The result is an increase in application complexity. We present an alternative in which distribution transparency is lessened in favor of clearer semantics. In particular, we argue that a developer should always be offered the unambiguous semantics of local objects, and that distribution comes from copying those objects to where they are needed. We claim that it is often sufficient to provide only small, immutable objects, along with facilities to group objects into clusters. Comment: In Proceedings FOCLASA 2010, arXiv:1007.499

    Learning to Represent Haptic Feedback for Partially-Observable Tasks

    The sense of touch, being the earliest sensory system to develop in a human body [1], plays a critical part in our daily interaction with the environment. In order to successfully complete a task, many manipulation interactions require incorporating haptic feedback. However, manually designing a feedback mechanism can be extremely challenging. In this work, we consider manipulation tasks that need to incorporate tactile sensor feedback in order to modify a provided nominal plan. To incorporate partial observation, we present a new framework that models the task as a partially observable Markov decision process (POMDP) and learns an appropriate representation of haptic feedback which can serve as the state for a POMDP model. The model, which is parametrized by deep recurrent neural networks, utilizes variational Bayes methods to optimize the approximate posterior. Finally, we build on deep Q-learning to be able to select the optimal action in each state without access to a simulator. We test our model on a PR2 robot for multiple tasks of turning a knob until it clicks. Comment: IEEE International Conference on Robotics and Automation (ICRA), 201

    Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning

    Robots that navigate among pedestrians use collision avoidance algorithms to enable safe and efficient operation. Recent works present deep reinforcement learning as a framework to model the complex interactions and cooperation. However, they are implemented using key assumptions about other agents' behavior that deviate from reality as the number of agents in the environment increases. This work extends our previous approach to develop an algorithm that learns collision avoidance among a variety of types of dynamic agents without assuming they follow any particular behavior rules. This work also introduces a strategy using LSTM that enables the algorithm to use observations of an arbitrary number of other agents, instead of previous methods that have a fixed observation size. The proposed algorithm outperforms our previous approach in simulation as the number of agents increases, and the algorithm is demonstrated on a fully autonomous robotic vehicle traveling at human walking speed, without the use of a 3D Lidar

    Organisational culture in airworthiness management programs: Developing a measurement model

    All civil and private aircraft are required to comply with the airworthiness standards set by their national airworthiness authority and throughout their operational life must be in a condition of safe operation. Aviation accident data shows that over twenty percent of all fatal accidents in aviation are due to airworthiness issues, specifically aircraft mechanical failures. Ultimately it is the responsibility of each registered operator to ensure that their aircraft remain in a condition of safe operation, and this is done through both effective management of airworthiness activities and the effective program governance of safety outcomes. Typically, the projects within these airworthiness management programs are focused on acquiring, modifying and maintaining the aircraft as a capability supporting the business. Program governance provides the structure through which the goals and objectives of airworthiness programs are set along with the means of attaining them. Whilst the principal causes of failures in many programs can be traced to inadequate program governance, many of the failures in large scale projects can have their root causes in the organisational culture and more specifically in the organisational processes related to decision-making. This paper examines the primary theme of project and program based enterprises, and introduces a model for measuring organisational culture in airworthiness management programs using measures drawn from 211 respondents in Australian airline programs. The paper describes the theoretical perspectives applied to modifying an original model to specifically focus it on measuring the organisational culture of programs for managing airworthiness; identifying the most important factors needed to explain the relationship between the measures collected, and providing a description of the nature of these factors. 
    The paper concludes by identifying a model that best describes the organisational culture data collected from seven airworthiness management programs

    Principles of Antifragile Software

    The goal of this paper is to study and define the concept of "antifragile software". For this, I start from Taleb's statement that antifragile systems love errors, and discuss whether traditional software dependability fits into this class. The answer is somewhat negative, although adaptive fault tolerance is antifragile: the system learns something when an error happens, and always improves. Automatic runtime bug fixing is changing the code in response to errors; fault injection in production means injecting errors in business critical software. I claim that both correspond to antifragility. Finally, I hypothesize that antifragile development processes are better at producing antifragile software systems. Comment: see https://refuses.github.io

    Cooperating intelligent systems

    Some of the issues connected to the development of a bureaucratic system are discussed. Emphasis is on a layer multiagent approach to distributed artificial intelligence (DAI). The division of labor in a bureaucracy is considered. The bureaucratic model seems to be a fertile model for further examination since it allows for the growth and change of system components and system protocols and rules. The first part of implementing the system would be the construction of a frame based reasoner and the appropriate B-agents and E-agents. The agents themselves should act as objects and the E-objects in particular should have the capability of taking on a different role. No effort was made to address the problems of automated failure recovery, problem decomposition, or implementation. Instead what has been achieved is a framework that can be developed in several distinct ways, and which provides a core set of metaphors and issues for further research