21 research outputs found
Identifying Bugs in Make and JVM-Oriented Builds
Incremental and parallel builds are crucial features of modern build systems.
Parallelism enables fast builds by running independent tasks simultaneously,
while incrementality saves time and computing resources by re-executing only
the build operations affected by a particular code change. Writing build
definitions that lead to error-free incremental and parallel builds is a
challenging task. This is mainly because developers are often unable to predict
the effects of build operations on the file system and how different build
operations interact with each other. Faulty build scripts may seriously degrade
the reliability of automated builds, as they cause build failures, and
non-deterministic and incorrect build results.
To reason about arbitrary build executions, we present buildfs, a
generally-applicable model that takes into account both the specification (as
declared in build scripts) and the actual behavior (low-level file system
operations) of build operations. We then formally define different types of
faults related to incremental and parallel builds in terms of the conditions
under which a file system operation violates the specification of a build
operation. Our testing approach, which relies on the proposed model, analyzes
the execution of a single full build, translates it into buildfs, and uncovers
faults by checking for the corresponding violations.
We evaluate the effectiveness, efficiency, and applicability of our approach
by examining hundreds of Make and Gradle projects. Notably, our method is the
first to handle Java-oriented build systems. The results indicate that our
approach is (1) able to uncover several important issues (245 issues found in
45 open-source projects have been confirmed and fixed by the upstream
developers), and (2) orders of magnitude faster than a state-of-the-art tool
for Make builds.
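The core check the abstract describes can be illustrated with a minimal sketch: compare each build task's declared inputs against the file system reads observed during a traced build, and flag reads of undeclared files (a classic missing-dependency fault that breaks incremental builds). The data layout and function name here are illustrative assumptions, not buildfs's actual API.

```python
# Hypothetical sketch: flag build tasks whose traced file-system reads are
# not covered by their declared inputs. A read of an undeclared file means
# an incremental build may skip re-running the task when that file changes.

def find_missing_inputs(tasks):
    """tasks: list of dicts with 'name', 'declared_inputs', 'traced_reads'."""
    faults = []
    for task in tasks:
        declared = set(task["declared_inputs"])
        for path in task["traced_reads"]:
            if path not in declared:
                # specification violated: actual behavior reads an
                # undeclared file
                faults.append((task["name"], path))
    return faults

tasks = [
    {"name": "compile", "declared_inputs": {"main.c"},
     "traced_reads": ["main.c", "util.h"]},  # util.h read but undeclared
]
print(find_missing_inputs(tasks))  # [('compile', 'util.h')]
```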
Regression test selection model: a comparison between ReTSE and pythia
As software systems change and evolve over time, regression tests have to be run to validate these changes. Regression testing is an expensive but essential activity in software maintenance. The purpose of this paper is to compare a new regression test selection model called ReTSE with Pythia. The ReTSE model uses decomposition slicing in order to identify the relevant regression tests. Decomposition slicing provides a technique that is capable of identifying the unchanged parts of a system. Pythia is a regression test selection technique based on textual differencing. Both techniques are compared using a Power program taken from Vokolos and Frankl's paper. The analysis of this comparison has shown promising results in reducing the number of tests to be run after changes are introduced.
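The general idea behind selection by differencing can be sketched in a few lines: given per-test line coverage and the set of lines changed between two versions, rerun only the tests whose coverage intersects the change. This is a simplified stand-in, not ReTSE's decomposition slicing or Pythia's actual textual-differencing algorithm.

```python
# Illustrative sketch of change-based regression test selection: rerun only
# the tests that cover at least one changed line of the program.

def select_tests(coverage, changed_lines):
    """coverage: test name -> set of covered line numbers."""
    return sorted(t for t, lines in coverage.items() if lines & changed_lines)

coverage = {"test_pow": {10, 11, 12}, "test_sign": {20, 21}}
print(select_tests(coverage, changed_lines={11}))  # ['test_pow']
```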
Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
Deep Neural Networks (DNNs) have been widely deployed in software to address
various tasks (e.g., autonomous driving, medical diagnosis). However, they
can also produce incorrect behaviors that result in financial losses and even
threaten human safety. To reveal incorrect behaviors in DNNs and repair
them, DNN developers often collect rich unlabeled datasets from the natural
world and label them to test the DNN models. However, properly labeling such
a large number of unlabeled inputs is a highly expensive and time-consuming task.
To address this problem, we propose NSS, Neuron Sensitivity guided test case
Selection, which reduces labeling time by selecting valuable test cases from
unlabeled datasets. NSS leverages the information of internal neurons induced
by test cases to select valuable test cases, which have high confidence in
causing the model to behave incorrectly. We evaluate NSS on four widely used
datasets and four well-designed DNN models, comparing it to SOTA baseline
methods. The results show that NSS performs well in assessing the test cases'
probability of fault triggering and model improvement capabilities.
Specifically, compared with baseline approaches, NSS obtains a higher fault
detection rate (e.g., when selecting 5% of test cases from the unlabeled
dataset in the MNIST & LeNet1 experiment, NSS obtains an 81.8% fault detection
rate, 20% higher than baselines).
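The selection step itself reduces to ranking unlabeled inputs by some internal-activation score and labeling only a top fraction within the budget. The scoring function below is a toy stand-in, not NSS's actual neuron-sensitivity metric, which the abstract does not specify.

```python
# Hedged sketch of sensitivity-guided test selection: rank unlabeled inputs
# by a per-input score (here a stand-in for neuron sensitivity) and keep only
# the top `budget` inputs for manual labeling.

def select_top_k(inputs, score, budget):
    """Return the `budget` highest-scoring inputs, best first."""
    ranked = sorted(inputs, key=score, reverse=True)
    return ranked[:budget]

# toy data: (input id, pretend sensitivity score)
inputs = [("a", 0.1), ("b", 0.9), ("c", 0.5)]
picked = select_top_k(inputs, score=lambda x: x[1], budget=2)
print([name for name, _ in picked])  # ['b', 'c']
```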
Feature Map Testing for Deep Neural Networks
Due to the widespread application of deep neural networks (DNNs) in
safety-critical tasks, deep learning testing has drawn increasing attention.
During the testing process, test cases that have been fuzzed or selected using
test metrics are fed into the model to find fault-inducing test units (e.g.,
neurons and feature maps, activating which will almost certainly result in a
model error) and report them to the DNN developer, who subsequently repairs
them (e.g., by retraining the model with test cases). Current test metrics,
however, are primarily concerned with neurons, which means that test cases
discovered either by guided fuzzing or by selection with these metrics
focus on detecting fault-inducing neurons while failing to detect
fault-inducing feature maps.
In this work, we propose DeepFeature, which tests DNNs at the feature map
level. When testing is conducted, DeepFeature scrutinizes every internal
feature map in the model and identifies vulnerabilities that can be repaired
to increase the model's overall performance. Exhaustive experiments
demonstrate that (1) DeepFeature is a strong tool for detecting the model's
vulnerable feature maps; (2) DeepFeature's test case selection has a high
fault detection rate and can detect more types of faults (compared to
coverage-guided selection techniques, the fault detection rate is increased by
49.32%); and (3) DeepFeature's fuzzer also outperforms current fuzzing
techniques and generates valuable test cases more efficiently.
Comment: 12 pages, 5 figures. arXiv admin note: text overlap with
arXiv:2307.1101
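Feature-map-level testing can be pictured with a minimal sketch: for each feature map, estimate how often inputs that strongly activate it coincide with model errors, and flag maps whose error ratio exceeds a threshold as vulnerable. The statistics and threshold here are toy assumptions, not DeepFeature's actual analysis.

```python
# Illustrative sketch of feature-map-level vulnerability detection: flag
# feature maps whose strongly-activating inputs produce model errors at a
# rate above `threshold`.

def vulnerable_maps(map_stats, threshold=0.5):
    """map_stats: feature-map id -> (error count, activation count)."""
    flagged = []
    for fmap, (errors, activations) in map_stats.items():
        if activations and errors / activations > threshold:
            flagged.append(fmap)
    return sorted(flagged)

stats = {"conv1.0": (8, 10),   # 80% of activating inputs cause errors
         "conv1.1": (1, 10)}   # 10% error rate: considered healthy
print(vulnerable_maps(stats))  # ['conv1.0']
```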
MicroFuzz: An Efficient Fuzzing Framework for Microservices
This paper presents a novel fuzzing framework, called MicroFuzz, specifically
designed for Microservices. Mocking-Assisted Seed Execution, Distributed
Tracing, Seed Refresh and Pipeline Parallelism approaches are adopted to
address the environmental complexities and dynamics of Microservices and
improve the efficiency of fuzzing. MicroFuzz has been successfully implemented
and deployed in Ant Group, a prominent FinTech company. Its performance has
been evaluated in three distinct industrial scenarios: normalized fuzzing,
iteration testing, and taint verification. Throughout five months of operation,
MicroFuzz has diligently analyzed a substantial codebase, consisting of 261
Apps with over 74.6 million lines of code (LOC). The framework's effectiveness
is evident in its detection of 5,718 potential quality or security risks, with
1,764 of them confirmed and fixed as actual security threats by software
specialists. Moreover, MicroFuzz significantly increased program coverage by
12.24% and increased detected program behaviors by 38.42% in the iteration testing.
Comment: Accepted by ICSE-SEIP 202
Regression testing framework for test case generation and prioritization
A regression test is a significant part of software testing. It is used to find the maximum number of faults in software applications. Test Case Prioritization (TCP) is an approach to prioritize and schedule test cases; it is used to detect faults at an earlier stage of testing. Code coverage is one of the features of a Regression Test (RT) that detects more faults in a software application. However, code coverage and fault detection reduce the performance of existing test case prioritization techniques by consuming a lot of time scanning the entire code. The process of generating test cases plays an important role in the prioritization of test cases. Existing automated generation and prioritization techniques produce insufficient test cases, which causes a low fault detection rate or consumes more computation time to detect more faults. Unified Modelling Language (UML) based test case generation techniques can extract test cases from UML diagrams while covering the maximum part of a module of an application. Therefore, UML-based test case generation can support a test case prioritization technique in finding a greater number of faults with shorter execution time. A multi-objective optimization technique can handle multiple objectives at once, supporting RT by generating more test cases, increasing the fault detection rate, and producing better results. The aim of this research is to develop a framework that detects the maximum number of faults with less execution time, thereby improving RT. The performance of RT can be improved by an efficient test case generation and prioritization method based on a multi-objective optimization technique that handles both test cases and the rate of fault detection. This framework consists of two important models: Test Case Generation (TCG) and TCP. The TCG model requires a UML use case diagram to extract test cases. A metaheuristic approach that uses tokens is employed to generate test cases.
TCP then receives the extracted test cases with faults as input to produce the prioritized set of test cases. The proposed research modifies the existing Hill Climbing based TCP by altering its test case swapping feature so that it detects faults in a reasonable execution time. The proposed framework intends to improve the performance of regression testing by generating and prioritizing test cases in order to find a greater number of faults in an application. Two case studies are conducted in the research in order to gather Test Cases (TC) and faults for multiple modules. The proposed framework yielded 92.2% Average Percentage of Fault Detection with less testing time compared to other artificial-intelligence-based TCP techniques. The findings prove that the proposed framework produces a sufficient number of TC and finds the maximum number of faults in less time.
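Hill-climbing prioritization with a swap move, in the spirit of the approach described above, can be sketched as follows: repeatedly swap two test cases and keep the swap only if it raises APFD (Average Percentage of Fault Detection). This is a simplified local search under assumed data structures, not the paper's exact modified algorithm.

```python
# Sketch of hill-climbing test case prioritization: greedily accept pairwise
# swaps that improve APFD, and stop when no swap improves the ordering.

def apfd(order, faults):
    """APFD = 1 - (sum of first-detection positions)/(n*m) + 1/(2n).
    faults: fault id -> set of tests that detect it. Higher is better."""
    n, m = len(order), len(faults)
    pos = {t: i + 1 for i, t in enumerate(order)}
    first = [min(pos[t] for t in tests) for tests in faults.values()]
    return 1 - sum(first) / (n * m) + 1 / (2 * n)

def hill_climb(order, faults):
    order = list(order)
    best = apfd(order, faults)
    improved = True
    while improved:
        improved = False
        for i in range(len(order)):
            for j in range(i + 1, len(order)):
                order[i], order[j] = order[j], order[i]  # try a swap
                score = apfd(order, faults)
                if score > best:
                    best, improved = score, True         # keep the swap
                else:
                    order[i], order[j] = order[j], order[i]  # undo it
    return order

# toy run: t3 detects two faults, so it should be scheduled first
faults = {"f1": {"t3"}, "f2": {"t3"}, "f3": {"t1"}}
print(hill_climb(["t1", "t2", "t3"], faults))  # ['t3', 't1', 't2']
```

Pairwise-swap hill climbing can stop at a local optimum; the multi-objective setting in the paper presumably trades this simplicity for broader search, but the swap-and-evaluate loop is the core mechanic.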