Unifying regression testing with mutation testing
Software testing is the most commonly used methodology for validating quality of software systems. Conceptually, testing is simple, but in practice, given the huge (practically infinite) space of inputs to test against, it requires solving a number of challenging problems, including evaluating and reusing tests efficiently and effectively as software evolves. While software testing research has seen much progress in recent years, many crucial bugs still evade state-of-the-art approaches and cause significant monetary losses and sometimes are responsible for loss of life. My thesis is that a unified, bi-dimensional, change-driven methodology can form the basis of novel techniques and tools that can make testing significantly more effective and efficient, and allow us to find more bugs at a reduced cost. We propose a novel unification of the following two dimensions of change: (1) real manual changes made by programmers, e.g., as commonly used to support more effective and efficient regression testing techniques; and (2) mechanically introduced changes to code or specifications, e.g., as originally conceived in mutation testing for evaluating quality of test suites. We believe such unification can lay the foundation of a scalable and highly effective methodology for testing and maintaining real software systems. The primary contribution of my thesis is two-fold. One, it introduces new techniques to address central problems in both regression testing (e.g., test prioritization) and mutation testing (e.g., selective mutation testing). Two, it introduces a new methodology that uses the foundations of regression testing to speed up mutation testing, and also uses the foundations of mutation testing to help with the fault localization problem raised in regression testing. The central ideas are embodied in a suite of prototype tools.
Rigorous experimental evaluation is used to validate the efficacy of the proposed techniques using a variety of real-world Java programs.
Electrical and Computer Engineerin
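One way to picture how regression-testing ideas might speed up mutation testing is coverage-based test selection: a mutant can only be killed by a test that executes the mutated statement, so the remaining tests need not be run against it. The sketch below is only an illustration of that general idea, not the thesis's tools; the function `select_tests` and the coverage data are hypothetical.

```python
from typing import Dict, List, Set


def select_tests(coverage: Dict[str, Set[int]], mutated_line: int) -> List[str]:
    """Return the tests whose recorded coverage includes the mutated line."""
    return sorted(name for name, lines in coverage.items() if mutated_line in lines)


# Hypothetical coverage data: test name -> line numbers the test executes.
coverage = {
    "test_add": {1, 2},
    "test_sub": {4, 5},
    "test_both": {1, 2, 4, 5},
}

# Only tests that reach line 5 can kill a mutant placed on line 5.
selected = select_tests(coverage, 5)
print(selected)
```

Here `test_add` is skipped for the mutant on line 5, which is the saving this kind of selection is meant to provide.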
Control flow graph visualization and its application to coverage and fault localization in Python
This report presents a software testing tool that creates visualizations of the Control Flow Graph (CFG) from Python source code. The CFG is a representation of a program that shows execution paths that may be taken by the machine. Similar techniques to the ones here could be applied to many other languages, but the CFGs in this tool are tailored to the Python language. As computers get faster, tools to help programmers be effective at work can become more complex and still give quick feedback, without causing an undue performance burden. This tool explores several approaches to giving feedback to developers through a visualization of the CFG. First, just the viewing of a CFG gives a different perspective on the code. A programmer could choose to juxtapose the CFG with complexity metrics during development, seeing increased complexity as graphs grow larger. Second, the tool implements a mechanism to provide code coverage to Python modules. This feature extends the visualization to show code coverage as a highlighted CFG. Test coverage requirements are calculated to check node, edge, edge-pair, and prime path coverage. From studying existing testing tools, it appears no existing tool for Python provides all these test coverage levels. Third, the tool provides an interface for adding custom highlighting of the CFG, used here to visualize fault localization. Seeing the most suspicious locations from fault localization techniques could be used to reduce debugging time. The results of running the tool on several popular Python packages, and on itself, show its performance is competitive with the most popular coverage tool when measuring branch coverage. It is slightly slower on statement coverage alone, but much faster against an unoptimized version and a logic coverage tool. This report also presents ideas for extensions to the tool. Among them is to incorporate program repair using fault localization and mutation operators.
Visualizing code as a CFG provides interesting ways to look at many software testing metrics.
Electrical and Computer Engineerin
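As a rough illustration of the coverage levels named in this abstract, the sketch below derives node, edge, and edge-pair requirements from a small hand-written CFG and measures how many of them one executed path satisfies. It is not the tool's implementation: the dictionary representation, the helper names, and the example graph are all assumptions, and prime path coverage is left out for brevity.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: node name -> list of successor node names.
CFG = Dict[str, List[str]]


def edges(cfg: CFG) -> Set[Tuple[str, ...]]:
    """Every (source, target) edge in the graph."""
    return {(u, v) for u, succs in cfg.items() for v in succs}


def edge_pairs(cfg: CFG) -> Set[Tuple[str, ...]]:
    """Every pair of consecutive edges, written as a three-node subpath."""
    return {(u, v, w) for (u, v) in edges(cfg) for w in cfg.get(v, [])}


def coverage(required: Set[Tuple[str, ...]], path: List[str], size: int) -> float:
    """Fraction of required subpaths with `size` nodes that occur in `path`."""
    seen = {tuple(path[i:i + size]) for i in range(len(path) - size + 1)}
    return len(required & seen) / len(required) if required else 1.0


# Hypothetical CFG of a single if/else statement.
cfg: CFG = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}
path = ["entry", "cond", "then", "exit"]  # one test taking the 'then' branch

node_cov = coverage({(n,) for n in cfg}, path, 1)
edge_cov = coverage(edges(cfg), path, 2)
pair_cov = coverage(edge_pairs(cfg), path, 3)
print(node_cov, edge_cov, pair_cov)
```

A single path through the `then` branch misses the `else` node and its edges, so the stricter criteria report progressively lower coverage for the same test.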
An Empirical Study of Fault Localization in Python Programs
Despite its massive popularity as a programming language, especially in novel
domains like data science programs, there is comparatively little research
about fault localization that targets Python. Even though it is plausible that
several findings about programming languages like C/C++ and Java -- the most
common choices for fault localization research -- carry over to other
languages, whether the dynamic nature of Python and how the language is used in
practice affect the capabilities of classic fault localization approaches
remain open questions to investigate.
This paper is the first large-scale empirical study of fault localization on
real-world Python programs and faults. Using Zou et al.'s recent large-scale
empirical study of fault localization in Java as the basis of our study, we
investigated the effectiveness (i.e., localization accuracy), efficiency (i.e.,
runtime performance), and other features (e.g., different entity granularities)
of seven well-known fault-localization techniques in four families
(spectrum-based, mutation-based, predicate switching, and stack-trace based) on
135 faults from 13 open-source Python projects from the BugsInPy curated
collection.
The results replicate for Python several results known about Java, and shed
light on whether Python's peculiarities affect the capabilities of fault
localization. The replication package that accompanies this paper includes
detailed data about our experiments, as well as the tool FauxPy that we
implemented to conduct the study.
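To make the spectrum-based family concrete, the sketch below ranks program lines with the Ochiai formula, a widely used spectrum-based suspiciousness measure. The abstract does not say which formulas the study evaluates, so the choice of Ochiai is an assumption, and the function below is hypothetical rather than part of FauxPy.

```python
import math
from typing import Dict, List, Set, Tuple


def ochiai(spectra: List[Tuple[Set[int], bool]]) -> Dict[int, float]:
    """Score each line given (covered lines, passed?) records, one per test."""
    total_failed = sum(1 for _, passed in spectra if not passed)
    lines = set().union(*(cov for cov, _ in spectra))
    scores: Dict[int, float] = {}
    for line in lines:
        ef = sum(1 for cov, passed in spectra if line in cov and not passed)
        ep = sum(1 for cov, passed in spectra if line in cov and passed)
        # Ochiai: ef / sqrt(total_failed * (ef + ep)); 0 when undefined.
        denom = math.sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    return scores


# Hypothetical spectra: one passing test and two failing tests.
spectra = [
    ({1, 2, 3}, True),
    ({1, 2, 4}, False),
    ({1, 4}, False),
]
scores = ochiai(spectra)
# Most suspicious first; ties broken by line number.
ranking = sorted(scores, key=lambda line: (-scores[line], line))
print(ranking)
```

Line 4 is executed by both failing tests and by no passing test, so it ranks first.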
FlakiMe: Laboratory-Controlled Test Flakiness Impact Assessment
Much research on software testing makes an implicit assumption that test failures are deterministic such that they always witness the presence of the same defects. However, this assumption is not always true because some test failures are due to so-called flaky tests, i.e., tests with non-deterministic outcomes. To help testing researchers better investigate flakiness, we introduce a test flakiness assessment and experimentation platform, called FlakiMe. FlakiMe supports the seeding of a (controllable) degree of flakiness into the behaviour of a given test suite. Thereby, FlakiMe equips researchers with ways to investigate the impact of test flakiness on their techniques under laboratory-controlled conditions. To demonstrate the application of FlakiMe, we use it to assess the impact of flakiness on mutation testing and program repair (the PRAPR and ARJA methods). These results indicate that 10% flakiness is sufficient to affect the mutation score, but the effect size is modest (2%-5%), while it reduces the number of patches produced for repair by 20% up to 100% of repair problems; a devastating impact on this application of testing. Our experiments with FlakiMe demonstrate that flakiness affects different testing applications in very different ways, thereby motivating the need for a laboratory-controllable flakiness impact assessment platform and approach such as FlakiMe.
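The idea of seeding a controllable degree of flakiness can be sketched as a wrapper that makes a deterministic test fail spuriously with a chosen probability. The code below is a toy model under that assumption, not FlakiMe's implementation or API; `make_flaky`, `mutation_score`, and the kill matrix are hypothetical. It shows one way spurious failures could distort a mutation score by being counted as kills.

```python
import random
from typing import Callable, List


def make_flaky(test: Callable[[], bool], rate: float,
               rng: random.Random) -> Callable[[], bool]:
    """Wrap a test (True = pass) so it fails spuriously with probability `rate`."""
    def wrapped() -> bool:
        if rng.random() < rate:
            return False  # seeded flaky failure
        return test()
    return wrapped


def mutation_score(kill_matrix: List[List[bool]], rate: float, seed: int = 0) -> float:
    """kill_matrix[m][t] is True when test t genuinely fails on mutant m."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    killed = 0
    for row in kill_matrix:
        tests = [make_flaky(lambda k=k: not k, rate, rng) for k in row]
        # A mutant counts as killed when any test fails, genuinely or not.
        if any(not t() for t in tests):
            killed += 1
    return killed / len(kill_matrix)


# Hypothetical kill matrix: 4 mutants, 2 tests; 2 mutants are genuinely killed.
kill_matrix = [
    [True, False],
    [False, False],
    [False, True],
    [False, False],
]
print(mutation_score(kill_matrix, rate=0.0))
print(mutation_score(kill_matrix, rate=0.1))
```

In this toy model a flaky failure can only add kills, so the score never drops below the flake-free value; a real study would also need to account for flaky failures on the unmutated program.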