129 research outputs found

    Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

    We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model that generated the data, despite noisy, incomplete, or imperfectly sampled data sources, rather than optimizing a purely numeric target function. Domain expertise and human knowledge about the target domain can guide this process and are typically captured in parameter settings. Often, domain expertise is subconscious and not expressed explicitly. Directly interacting with the learning algorithm makes it easier to utilize this knowledge effectively. Comment: 4 pages, presented at the Human in the Loop workshop at ICML 201

    Automatic Test Set Generation for Event-Driven Systems in the Absence of Specifications Combining Testing with Model Inference

    The growing dependency of human activities on software technologies creates a need for increasingly accurate testing techniques to ensure the quality and reliability of software components. A recent literature review of software testing methodologies reveals that several new approaches have been proposed, which differ in the way test inputs are generated to efficiently explore system behaviour. This paper is concerned with the challenge of automatically generating test input sets for Event-Driven Systems (EDS) for which neither source code nor specifications are available. We therefore propose a fully automatic technique that combines testing with model learning. It uses active learning to automatically infer a behavioural model of the System Under Test (SUT) with tests as queries, generates further tests based on the learned model to systematically explore unseen parts of the subject system, and uses passive learning to refine the current model hypothesis as soon as an inconsistency with the observed behaviour is found. Our passive learning algorithm uses the basic steps of Evidence-Driven State Merging (EDSM) and introduces an effective heuristic for choosing the pair of states to merge to obtain the target machine. Finally, the effectiveness of the proposed testing technique is demonstrated in the context of event-based functional testing of Android Graphical User Interface (GUI) applications and compared with that of existing baseline approaches
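
    As a rough illustration of the "basic steps" of EDSM mentioned above, the sketch below builds a prefix tree acceptor from labelled traces and scores a candidate pair of states by counting how many labelled states agree when the two subtrees are overlaid. It is a simplified sketch under stated assumptions, not the heuristic proposed in the paper: the names `Node`, `build_pta`, and `merge_score` are hypothetical, and the score compares the two subtrees in lockstep without performing the merge or handling the loops that a real merge can introduce.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """One state of a prefix tree acceptor (hypothetical helper type)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        # True = accepting, False = rejecting, None = no evidence.
        self.label: Optional[bool] = None


def build_pta(samples: List[Tuple[List[str], bool]]) -> Node:
    """Build a prefix tree from (trace, accepted) pairs."""
    root = Node()
    for trace, accepted in samples:
        node = root
        for event in trace:
            node = node.children.setdefault(event, Node())
        node.label = accepted  # only the end of the trace is labelled
    return root


def merge_score(a: Node, b: Node) -> Optional[int]:
    """Count agreeing labelled state pairs; None signals a conflict."""
    score = 0
    if a.label is not None and b.label is not None:
        if a.label != b.label:
            return None  # accept/reject clash: the pair cannot be merged
        score += 1
    for event, child_a in a.children.items():
        child_b = b.children.get(event)
        if child_b is None:
            continue  # no shared transition, so no evidence either way
        sub = merge_score(child_a, child_b)
        if sub is None:
            return None
        score += sub
    return score


samples = [
    (["a"], True),
    (["a", "a"], True),
    (["b"], False),
    (["a", "b"], False),
]
pta = build_pta(samples)
print(merge_score(pta, pta.children["a"]))                   # 2
print(merge_score(pta.children["a"], pta.children["b"]))     # None
```

    A learner in this style would typically pick the compatible pair with the highest score at each step; how that pair is chosen is exactly where the heuristic described in the abstract would differ from this sketch.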

    Program boosting: program synthesis via crowd-sourcing

    In this paper, we investigate an approach to program synthesis that is based on crowd-sourcing. With the help of crowd-sourcing, we aim to capture the "wisdom of the crowds" to find good if not perfect solutions to inherently tricky programming tasks, which elude even expert developers and lack an easy-to-formalize specification. We propose an approach we call program boosting, which involves crowd-sourcing imperfect solutions to a difficult programming problem from developers and then blending these programs together in a way that improves their correctness. We implement this approach in a system called CROWDBOOST and show in our experiments that interesting and highly non-trivial tasks such as writing regular expressions for URLs or email addresses can be effectively crowd-sourced. We demonstrate that carefully blending the crowd-sourced results together consistently produces a boost, yielding results that are better than any of the starting programs. Our experiments on 465 program pairs show consistent boosts in accuracy and demonstrate that program boosting can be performed at a relatively modest monetary cost

    Improving Software Model Inference by Combining State Merging and Markov Models

    Labelled-transition systems (LTS) are widely used by developers and testers to model software systems in terms of their sequential behaviour. They provide an overview of the behaviour of the system and its reaction to different inputs. LTS models are the foundation for various automated verification techniques such as model-checking and model-based testing. These techniques require up-to-date models to be meaningful. Unfortunately, software models are rare in practice. Because of the effort and time required to build these models manually, a software engineer would want to infer them automatically from traces (sequences of events or function calls). Many techniques have focused on inferring LTS models from given traces of system execution, where these traces are produced by running a system on a series of tests. State-merging is the foundation of some of the most successful LTS inference techniques. Passive inference approaches such as k-tail and Evidence-Driven State Merging (EDSM) can infer LTS models from these traces. However, the best-performing methods of inferring LTS models rely on the availability of negatives, i.e. traces that are not permitted from specific states, and such information is not usually available. The long-standing challenge for such inference approaches is constructing models well from very few traces and without negatives. Active inference techniques such as Query-driven State Merging (QSM) can learn LTSs from traces by posing queries, in the form of tests, to the system being learnt. QSM may infer inaccurate LTSs, since its performance relies on the availability of traces. The challenge for such inference approaches is inferring LTSs well from very few traces and with fewer queries. In this thesis, existing techniques are investigated with respect to the challenge of inferring LTS models from few positive traces. These techniques fail to find correct LTS models in cases of insufficient training data.
    This thesis focuses on finding better solutions to this problem by using evidence obtained from Markov models to bias the EDSM learner towards merging states that are more likely to correspond to the same state in a model. Markov models are used to capture the dependencies between event sequences in the collected traces. These dependencies record whether an event is permitted or prohibited to follow the short sequences that appear in the traces. This thesis proposes EDSM-Markov, a passive inference technique that aims to improve on the existing ones in the absence of negative traces and to prevent the over-generalization problem. The improvements obtained by the proposed learners are demonstrated by a series of experiments using randomly-generated labelled-transition systems and case studies. The results of these experiments show that EDSM-Markov can infer better LTSs than other techniques. This thesis also proposes modifications to the QSM learner to improve the accuracy of the inferred LTSs. The result is a new learner, named ModifiedQSM, which runs more tests against the system being inferred in order to avoid the over-generalization problem. The thesis further investigates using Markov models to reduce the number of queries consumed by the ModifiedQSM learner, which leads to a new LTS inference technique called MarkovQSM. The enhancements to the LTSs inferred by the ModifiedQSM and MarkovQSM learners are demonstrated by a series of experiments. The results demonstrate that ModifiedQSM can infer better LTSs than other techniques. Moreover, MarkovQSM significantly reduces the number of membership queries consumed compared to ModifiedQSM, with a very small loss of accuracy
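
    The idea of using a Markov model to capture which events follow short sequences in the traces can be sketched as below. This is a minimal illustration, not the thesis's EDSM-Markov implementation: the function names `train_followers` and `predict` are hypothetical, the context length `k` (assumed to be at least 1) is an arbitrary choice, and treating an event that was never observed after a known context as "prohibited" is an assumption of this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Context = Tuple[str, ...]


def train_followers(traces: List[List[str]], k: int = 1) -> Dict[Context, Set[str]]:
    """Record, for every length-k context seen in the traces,
    the set of events observed immediately after it."""
    followers: Dict[Context, Set[str]] = defaultdict(set)
    for trace in traces:
        for i in range(k, len(trace)):
            followers[tuple(trace[i - k:i])].add(trace[i])
    return dict(followers)


def predict(followers: Dict[Context, Set[str]],
            history: List[str], event: str, k: int = 1) -> Optional[bool]:
    """True: event was seen after this context (predicted permitted).
    False: context was seen, event never followed it (assumed prohibited).
    None: the context never occurred, so there is no evidence."""
    context = tuple(history[-k:])
    if len(context) < k or context not in followers:
        return None
    return event in followers[context]


traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
]
model = train_followers(traces, k=1)
print(predict(model, ["open"], "read"))    # True
print(predict(model, ["open"], "close"))   # False
print(predict(model, ["close"], "open"))   # None
```

    Predictions of this kind could then serve as extra evidence when deciding whether two states are likely to be the same, for example by penalising a merge that would allow an event the model predicts as prohibited; how the evidence is weighted is left open here.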

    Reverse Engineering Systems to Identify Flaws and Understand Behaviour

    Accurate system models are applicable to many software engineering tasks. Despite their utility, models are often neglected during development. It is therefore desirable to reverse engineer them from existing systems. One way to do this is to record traces of the system and infer a model by generalising from this behaviour. Unfortunately, the models inferred by current techniques often cannot represent how the data values associated with each action affect system behaviour. This raises the following questions. What kind of model do we need in order to show the interplay between behaviour and data? How can we infer such models from system traces? How can we infer functions to relate input data with subsequent outputs? How can we use our models once they have been inferred? To answer these questions, the first contribution of this thesis is a new model definition designed to show the relationship between data and behaviour. Secondly, I present a technique to infer such models from system traces, and define a preprocessing step to infer functions that relate system inputs and outputs. I then empirically evaluate the models produced by my technique and compare them to those produced by a state-of-the-art tool. Finally, I show how the inferred models can be used to analyse properties of the systems they represent. The results show that my technique infers models which are more accurate and intuitive than the current state of the art. My tool can also handle circumstances where the output of a system depends on data values not present in the traces, and can identify situations where the result of particular actions depends on specific data values. The models inferred by my tool can be used by existing verification tools to prove and refute properties of the underlying systems

    Model construction, evolution, and use in testing of software systems

    The ubiquity of software places emphasis on the need for techniques that allow us to ensure that software behaves as we expect it to behave. The most widely-used approach to ensuring software quality is unit testing, but this is arguably not a very efficient solution, since each test only checks that the software behaves as expected in one single scenario. There exist more advanced techniques, like property-based testing, model-checking, and formal verification, but they usually rely on properties, models, and specifications. One source of friction faced by testers that want to use these advanced techniques is that they require the use of abstraction and, as humans, we tend to find it more difficult to think of abstract specifications than to think of concrete examples. In this thesis, we study how to make it easier to create models that can be used for testing software. In particular, we research the creation of reusable models, ways of automating the generalisation of code and models, and ways of automating the generation of models from legacy unit tests and execution traces. As a result, we provide techniques for generating tests from state machine models, techniques for inferring parametrised state machines from code, and refactorings that automate the introduction of abstraction for property-based testing models and code in general. All these techniques are illustrated with concrete examples and with open-source implementations that are publicly available
