Experiments towards model-based testing using Plan 9: Labelled transition file systems, stacking file systems, on-the-fly coverage measuring
We report on experiments that we did on Plan 9/Inferno to gain more experience with the file-system-as-tool-interface approach. We reimplemented functionality that we earlier worked on in Unix, trying to use Plan 9 file system interfaces. The application domain for those experiments was model-based testing.
The idea we wanted to experiment with consists of building small, reusable pieces of functionality which are then composed to achieve the intended functionality. In particular we want to experiment with the idea of 'stacking' file servers (fs) on top of each other, where the upper fs acts as a 'filter' on the data and structure provided by the lower fs.
For this experiment we designed a file system interface (ltsfs) that gives fine-grained access to a labelled transition system, and made two implementations of it. We developed a small fs that, when 'stacked' on top of the ltsfs, extends it with additional files, and an application that uses the resulting file system.
The hope was that an interface like the one offered by ltsfs could be used as a general interface between (specification language specific) programs that give access to state spaces and (specification language independent) programs that use (walk) those state spaces like simulators, model checkers, or test derivation programs.
Initial results (obtained on a less-than-modern machine) suggest that, although the approach by itself is definitely feasible in principle, in practice the fine-grained access offered by ltsfs may involve many file (9p) transactions which may seriously affect performance. In Unix we used a more conservative approach where the access was less fine-grained which likely explains why there we did not suffer from this problem.
In addition we report on experiments to use acid to obtain coverage information that is updated on-the-fly while the program is running. This worked quite well. The main observation from those experiments is that the basic block notion of this approach, which has a more 'semantical' nature, differs from the more 'syntactical' nature of the basic block notion in Unix coverage measurement tools like tcov or gcov.
Consistent SDNs through Network State Fuzzing
The conventional wisdom is that a software-defined network (SDN) operates under the premise that the logically centralized control plane has an accurate representation of the actual data plane state. Nevertheless, bugs, misconfigurations, faults or attacks can introduce inconsistencies that undermine correct operation. Previous work in this area, however, lacks a holistic methodology to tackle this problem and thus addresses only certain parts of it. Yet, the consistency of the overall system is only as good as its least consistent part. Motivated by an analogy of network consistency checking with program testing, we propose to add active probe-based network state fuzzing to our consistency check repertoire. Hereby, our system, PAZZ, combines production traffic with active probes to continuously test if the actual forwarding path and decision elements (on the data plane) correspond to the expected ones (on the control plane). Our insight is that active traffic covers the inconsistency cases beyond the ones identified by passive traffic. A PAZZ prototype was built and evaluated on topologies of varying scale and complexity. Our results show that PAZZ requires minimal network resources to detect persistent data plane faults through fuzzing and localize them quickly
Consistent SDNs through Network State Fuzzing
The conventional wisdom is that a software-defined network (SDN) operates
under the premise that the logically centralized control plane has an accurate
representation of the actual data plane state. Unfortunately, bugs,
misconfigurations, faults or attacks can introduce inconsistencies that
undermine correct operation. Previous work in this area, however, lacks a
holistic methodology to tackle this problem and thus addresses only certain
parts of the problem. Yet, the consistency of the overall system is only as
good as its least consistent part. Motivated by an analogy of network
consistency checking with program testing, we propose to add active probe-based
network state fuzzing to our consistency check repertoire. Hereby, our system,
PAZZ, combines production traffic with active probes to periodically test if
the actual forwarding path and decision elements (on the data plane) correspond
to the expected ones (on the control plane). Our insight is that active traffic
covers the inconsistency cases beyond the ones identified by passive traffic.
A PAZZ prototype was built and evaluated on topologies of varying scale and
complexity. Our results show that PAZZ requires minimal network resources to
detect persistent data plane faults through fuzzing and localize them quickly
while outperforming baseline approaches. Comment: Added three extra relevant references; the arXiv version was later accepted
in IEEE Transactions on Network and Service Management (TNSM), 2019, with the
title "Towards Consistent SDNs: A Case for Network State Fuzzing"
JWalk: a tool for lazy, systematic testing of java classes by design introspection and user interaction
Popular software testing tools, such as JUnit, allow frequent retesting of modified code; yet the manually created test scripts are often seriously incomplete. A unit-testing tool called JWalk has therefore been developed to address the need for systematic unit testing within the context of agile methods. The tool operates directly on the compiled code for Java classes and uses a new lazy method for inducing the changing design of a class on the fly. This is achieved partly through introspection, using Java's reflection capability, and partly through interaction with the user, constructing and saving test oracles on the fly. Predictive rules reduce the number of oracle values that must be confirmed by the tester. Without human intervention, JWalk performs bounded exhaustive exploration of the class's method protocols and may be directed to explore the space of algebraic constructions, or the intended design state-space of the tested class. With some human interaction, JWalk performs up to the equivalent of fully automated state-based testing, from a specification that was acquired incrementally
Testing real-time systems using TINA
The paper presents a technique for model-based black-box conformance testing of real-time systems using the Time Petri Net Analyzer TINA. Such test suites are derived from a prioritized time Petri net composed of two concurrent sub-nets specifying respectively the expected behaviour of the system under test and its environment. We describe how the toolbox TINA has been extended to support automatic generation of time-optimal test suites. The result is optimal in the sense that the set of test cases in the test suite has the shortest possible accumulated execution time. Input/output conformance serves as the notion of implementation correctness, essentially timed trace inclusion taking environment assumptions into account. Test case selection is based either on manually formulated test purposes or on automatically applied coverage criteria specifying structural criteria of the model to be fulfilled by the test suite. We discuss how test purposes and coverage criteria are specified in the linear temporal logic SE-LTL, derive test sequences, and assign verdicts
FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing Coverage
In recent years, fuzz testing has proven itself to be one of the most
effective techniques for finding correctness bugs and security vulnerabilities
in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has
become popular thanks to its ease-of-use and bug-finding power. However, AFL
remains limited in the depth of program coverage it achieves, in particular
because it does not consider which parts of program inputs should not be
mutated in order to maintain deep program coverage. We propose an approach,
FairFuzz, that helps alleviate this limitation in two key steps. First,
FairFuzz automatically prioritizes inputs exercising rare parts of the program
under test. Second, it automatically adjusts the mutation of inputs so that the
mutated inputs are more likely to exercise these same rare parts of the
program. We conduct evaluation on real-world programs against state-of-the-art
versions of AFL, thoroughly repeating experiments to get good measures of
variability. We find that on certain benchmarks FairFuzz shows significant
coverage increases after 24 hours compared to state-of-the-art versions of AFL,
while on others it achieves high program coverage at a significantly faster
rate
Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
A common and natural intuition among software testers is that test cases need
to differ if a software system is to be tested properly and its quality
ensured. Consequently, much research has gone into formulating distance
measures for how test cases, their inputs and/or their outputs differ. However,
common to these proposals is that they are data type specific and/or calculate
the diversity only between pairs of test inputs, traces or outputs.
We propose a new metric to measure the diversity of sets of tests: the test
set diameter (TSDm). It extends our earlier, pairwise test diversity metrics
based on recent advances in information theory regarding the calculation of the
normalized compression distance (NCD) for multisets. An advantage is that TSDm
can be applied regardless of data type and on any test-related information, not
only the test inputs. A downside is the increased computational time compared
to competing approaches.
Our experiments on four different systems show that the test set diameter can
help select test sets with higher structural and fault coverage than random
selection even when only applied to test inputs. This can enable early test
design and selection, prior to even having a software system to test, and
complement other types of test automation and analysis. We argue that this
quantification of test set diversity creates a number of opportunities to
better understand software quality and provides practical ways to increase it. Comment: In submission
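The compression-based idea behind this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of zlib as the compressor, the function names, and the use of the single-step multiset formula NCD1(X) = (C(X) - min C(x)) / max C(X \ {x}) are all assumptions made here for illustration, and the sketch skips the recursive maximisation over subsets that a full multiset NCD would use.

```python
import zlib
from typing import Sequence


def csize(data: bytes) -> int:
    """Compressed length in bytes; zlib is an assumed stand-in compressor."""
    return len(zlib.compress(data, 9))


def ncd_pair(x: bytes, y: bytes) -> float:
    """Pairwise normalized compression distance between two byte strings."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)


def ncd1_multiset(items: Sequence[bytes]) -> float:
    """Single-step multiset NCD (hypothetical helper, see assumptions above).

    Numerator: compressed size of the whole set minus the smallest
    compressed size of any single element.
    Denominator: largest compressed size of the set with one element removed.
    """
    items = list(items)
    if len(items) < 2:
        raise ValueError("need at least two items")
    whole = csize(b"".join(items))
    smallest = min(csize(x) for x in items)
    largest_rest = max(
        csize(b"".join(items[:i] + items[i + 1:])) for i in range(len(items))
    )
    return (whole - smallest) / largest_rest
```

Under these assumptions, a set of near-identical test inputs compresses well as a whole and scores low, while a set of unrelated inputs scores close to 1, so candidate test sets can be ranked by this value before any system under test exists.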
Visualizing test diversity to support test optimisation
Diversity has been used as an effective criterion to optimise test suites for
cost-effective testing. Particularly, diversity-based (alternatively referred
to as similarity-based) techniques have the benefit of being generic and
applicable across different Systems Under Test (SUT), and have been used to
automatically select or prioritise large sets of test cases. However, it is a
challenge to feed diversity information back to developers and testers since
results are typically many-dimensional. Furthermore, the generality of
diversity-based approaches makes it harder to choose when and where to apply
them. In this paper we address these challenges by investigating: i) what the
trade-offs are in using different sources of diversity (e.g., diversity of test
requirements or test scripts) to optimise large test suites, and ii) how
visualisation of test diversity data can assist testers for test optimisation
and improvement. We perform a case study on three industrial projects and
present quantitative results on the fault detection capabilities and redundancy
levels of different sets of test cases. Our key result is that test similarity
maps, based on pair-wise diversity calculations, helped industrial
practitioners identify issues with their test repositories and decide on
actions to improve. We conclude that the visualisation of diversity information
can assist testers in their maintenance and optimisation activities