Search CORE

108 research outputs found

Hashing Fuzzing: Introducing Input Diversity to Improve Crash Detection

Author: Clark D
Menendez HD
Publication venue
Publication date: 01/07/2021
Field of study

The utility of a test set of program inputs is strongly influenced by its diversity and its size. Syntax coverage has become a standard proxy for diversity. Although more sophisticated measures exist, such as proximity of a sample to a uniform distribution, methods to use them tend to be type dependent. We use r-wise hash functions to create a novel, semantics preserving, testability transformation for C programs that we call HashFuzz. Use of HashFuzz improves the diversity of test sets produced by instrumentation-based fuzzers. We evaluate the effect of the HashFuzz transformation on eight programs from the Google Fuzzer Test Suite using four state-of-the-art fuzzers that have been widely used in previous research. We demonstrate pronounced improvements in the performance of the test sets for the transformed programs across all the fuzzers that we used. These include strong improvements in diversity in every case, maintenance or small improvement in branch coverage -- up to 4.8% improvement in the best case, and significant improvement in unique crash detection numbers -- between 28% to 97% increases compared to test sets for untransformed programs

UCL Discovery

Seeding Contradiction: a fast method for generating full-coverage test suites

Author: Huang Li
Meyer Bertrand
Oriol Manuel
Publication venue
Publication date: 08/09/2023
Field of study

The regression test suite, a key resource for managing program evolution, needs to achieve 100% coverage, or very close, to be useful. Devising a test suite manually is unacceptably tedious, but existing automated methods are often inefficient. The method described in this article, ``Seeding Contradiction'', inserts incorrect instructions into every basic block of the program, enabling an SMT-based Hoare-style prover to generate a counterexample for every branch of the program and, from the collection of all such counterexamples, a test suite. The method is static, works fast, and achieves excellent coverage

arXiv.org e-Print Archive

Hashing fuzzing: introducing input diversity to improve crash detection

Author: Clark D.
Clark D.
Menéndez H.
Menéndez H.
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2022
Field of study

The utility of a test set of program inputs is strongly influenced by its diversity and its size. Syntax coverage has become a standard proxy for diversity. Although more sophisticated measures exist, such as proximity of a sample to a uniform distribution, methods to use them tend to be type dependent. We use r-wise hash functions to create a novel, semantics preserving, testability transformation for C programs that we call HashFuzz. Use of HashFuzz improves the diversity of test sets produced by instrumentation-based fuzzers. We evaluate the effect of the HashFuzz transformation on eight programs from the Google Fuzzer Test Suite using four state-of-the-art fuzzers that have been widely used in previous research. We demonstrate pronounced improvements in the performance of the test sets for the transformed programs across all the fuzzers that we used. These include strong improvements in diversity in every case, maintenance or small improvement in branch coverage – up to 4.8% improvement in the best case, and significant improvement in unique crash detection numbers – between 28% to 97% increases compared to test sets for untransformed program

Middlesex University Research Repository

Embedding and classifying test execution traces using neural networks

Author: Allamanis Miltiadis
Rajan Ajitha
Rooijackers Gwenyth
Tsimpourlas Foivos
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/06/2022
Field of study

Edinburgh Research Explorer

Recommended from our members

Regression testing experiments

Author: Sayre Kent
Publication venue: 'Oregon State University'
Publication date
Field of study

Software maintenance is an expensive part of the software lifecycle: estimates put its cost at up to two-thirds of the entire cost of software. Regression testing, which tests software after it has been modified to help assess and increase its reliability, is responsible for a large part of this cost. Thus, making regression testing more efficient and effective is worthwhile. This thesis performs two experiments with regression testing techniques. The first experiment involves two regression test selection techniques, Dejavu and Pythia. These techniques select a subset of tests from the original test suite to be rerun instead of the entire original test suite in an attempt to save valuable testing time. The experiment investigates the cost and benefit tradeoffs between these techniques. The data indicate that Dejavu can occasionally select smaller test suites than Pythia while Pythia often is more efficient at figuring out which test cases to select than Dejavu. The second experiment involves the investigation of program spectra as a tool to enhance regression testing. Program spectra characterize a program's behavior. The experiment investigates the applicability of program spectra to the detection of faults in modified software. The data indicate that certain types of spectra identify faults on a consistent basis. The data also reveal cost-benefit tradeoffs among spectra types

ScholarsArchive@OSU

Automatically Generating Test Cases for Safety-Critical Software via Symbolic Execution

Author: Braione Pietro
Briola Daniela
Denaro Giovanni
Kurian Elson
Publication venue
Publication date: 22/09/2022
Field of study

Automated test generation based on symbolic execution can be beneficial for systematically testing safety-critical software, to facilitate test engineers to pursue the strict testing requirements mandated by the certification standards, while controlling at the same time the costs of the testing process. At the same time, the development of safety-critical software is often constrained with programming languages or coding conventions that ban linguistic features which are believed to downgrade the safety of the programs, e.g., they do not allow dynamic memory allocation and variable-length arrays, limit the way in which loops are used, forbid recursion, and bound the complexity of control conditions. As a matter of facts, these linguistic features are also the main efficiency-blockers for the test generation approaches based on symbolic execution at the state of the art. This paper contributes new evidence of the effectiveness of generating test cases with symbolic execution for a significant class of industrial safety critical-systems. We specifically focus on Scade, a largely adopted model-based development language for safety-critical embedded software, and we report on a case study in which we exploited symbolic execution to automatically generate test cases for a set of safety-critical programs developed in Scade. To this end, we introduce a novel test generator that we developed in a recent industrial project on testing safety-critical railway software written in Scade, and we report on our experience of using this test generator for testing a set of Scade programs that belong to the development of an on-board signaling unit for high-speed rail. The results provide empirically evidence that symbolic execution is indeed a viable approach for generating high-quality test suites for the safety-critical programs considered in our case study

arXiv.org e-Print Archive

Hashing fuzzing: introducing input diversity to improve crash detection

Author: Clark David
Menéndez Héctor D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/07/2021
Field of study

The utility of a test set of program inputs is strongly influenced by its diversity and its size. Syntax coverage has become a standard proxy for diversity. Although more sophisticated measures exist, such as proximity of a sample to a uniform distribution, methods to use them tend to be type dependent. We use r-wise hash functions to create a novel, semantics preserving, testability transformation for C programs that we call HashFuzz. Use of HashFuzz improves the diversity of test sets produced by instrumentation-based fuzzers. We evaluate the effect of the HashFuzz transformation on eight programs from the Google Fuzzer Test Suite using four state-of-the-art fuzzers that have been widely used in previous research. We demonstrate pronounced improvements in the performance of the test sets for the transformed programs across all the fuzzers that we used. These include strong improvements in diversity in every case, maintenance or small improvement in branch coverage – up to 4.8% improvement in the best case, and significant improvement in unique crash detection numbers – between 28% to 97% increases compared to test sets for untransformed program

Middlesex University Research Repository

King's Research Portal

Speeding up Test Execution with Increased Cache Locality

Author: Baluda
Beck
Beizer
Bertolino
Beyls
Carr
Chang
Denning
Do
Ebert
Elbaum
Gay
Harrold
Hwu
Inozemtseva
Jones
Luk
McKinley
Nagappan
Nethercote
Poovey
Rajan
Ramirez
Rothermel
Slaney
Tikir
Wolf
Young
Publication venue: 'Wiley'
Publication date: 22/06/2018
Field of study

Crossref

Edinburgh Research Explorer

Detecting Trivial Mutant Equivalences via Compiler Optimisations

Author: Harman M
Jia Y
Kintis M
Le Traon Y
Malevris N
Papadakis M
Publication venue
Publication date: 01/01/2018
Field of study

Mutation testing realises the idea of fault-based testing, i.e., using artificial defects to guide the testing process. It is used to evaluate the adequacy of test suites and to guide test case generation. It is a potentially powerful form of testing, but it is well-known that its effectiveness is inhibited by the presence of equivalent mutants. We recently studied Trivial Compiler Equivalence (TCE) as a simple, fast and readily applicable technique for identifying equivalent mutants for C programs. In the present work, we augment our findings with further results for the Java programming language. TCE can remove a large portion of all mutants because they are determined to be either equivalent or duplicates of other mutants. In particular, TCE equivalent mutants account for 7.4% and 5.7% of all C and Java mutants, while duplicated mutants account for a further 21% of all C mutants and 5.4% Java mutants, on average. With respect to a benchmark ground truth suite (of known equivalent mutants), approximately 30% (for C) and 54% (for Java) are TCE equivalent. It is unsurprising that results differ between languages, since mutation characteristics are language-dependent. In the case of Java, our new results suggest that TCE may be particularly effective, finding almost half of all equivalent mutants

UCL Discovery