Test Case Purification for Improving Fault Localization
Finding and fixing bugs are time-consuming activities in software
development. Spectrum-based fault localization aims to identify the faulty
position in source code based on the execution trace of test cases. Failing
test cases and their assertions form test oracles for the failing behavior of
the system under analysis. In this paper, we propose a novel concept of
spectrum driven test case purification for improving fault localization. The
goal of test case purification is to separate existing test cases into small
fractions (called purified test cases) and to enhance the test oracles to
further localize faults. Combined with an existing fault localization
technique (e.g., Tarantula), test case purification yields a better ranking of
the program statements. Our experiments on 1800 faults in six open-source Java
programs show that test case purification can effectively improve existing
fault localization techniques.
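To make the spectrum-based ranking concrete, here is a minimal sketch of the standard Tarantula suspiciousness score, which rates a statement by how much more often failing tests execute it than passing tests do. It does not implement test case purification itself; the function name `tarantula`, the coverage layout, and the toy test and statement names are assumptions made for illustration.

```python
from typing import Dict, Set


def tarantula(coverage: Dict[str, Set[str]], failing: Set[str]) -> Dict[str, float]:
    """Compute Tarantula suspiciousness per statement.

    coverage maps a test name to the set of statements it executed;
    failing is the set of test names that failed (assumed to be keys of coverage).
    """
    total_failed = len(failing)
    total_passed = len(coverage) - total_failed
    statements: Set[str] = set().union(*coverage.values())

    scores: Dict[str, float] = {}
    for stmt in statements:
        failed_cov = sum(1 for t, cov in coverage.items() if stmt in cov and t in failing)
        passed_cov = sum(1 for t, cov in coverage.items() if stmt in cov and t not in failing)
        fail_ratio = failed_cov / total_failed if total_failed else 0.0
        pass_ratio = passed_cov / total_passed if total_passed else 0.0
        denom = fail_ratio + pass_ratio
        # A statement that no test covers in a meaningful way gets score 0.
        scores[stmt] = fail_ratio / denom if denom else 0.0
    return scores


# Hypothetical toy spectrum: t3 is the only failing test.
toy_coverage = {"t1": {"s1", "s2"}, "t2": {"s1"}, "t3": {"s1", "s3"}}
toy_scores = tarantula(toy_coverage, failing={"t3"})
ranking = sorted(toy_scores, key=lambda s: (-toy_scores[s], s))
print(ranking)  # ['s3', 's1', 's2']
```

In this toy spectrum, `s3` is executed only by the failing test and ranks first, while `s1`, executed by every test, sits in the middle.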
Finding Faulty Functions From the Traces of Field Failures
Corrective maintenance, which rectifies field faults, consumes 30-60% of software maintenance time. Literature indicates that 50% to 90% of the field failures are rediscoveries of previous faults, and that 20% of the code is responsible for 80% of the faults. Despite this, identifying the location of field failures in system code remains challenging and consumes a substantial share (30-40%) of corrective maintenance time. Prior fault discovery techniques for field traces require many pass-fail traces, discover only crashing failures, or identify coarse-grained code such as files as the source of faults. This thesis (which is in the integrated article format) first describes a novel technique (F007) that focuses on identifying finer-grained faulty code (faulty functions) from only the failing traces of deployed software. F007 works by training decision trees on the function-call-level failed traces of previous faults of a program. When a new failed trace arrives, F007 predicts a ranked list of faulty functions based on the probability of fault proneness obtained via the decision trees. Second, this thesis describes a novel strategy, F007-plus, that trains F007 on the failed traces of mutants (artificial faults) and previous faults. F007-plus helps F007 discover new faulty functions that could not otherwise be discovered because they were not faulty in the traces of previously known actual faults. F007 (including F007-plus) was evaluated on the Siemens suite, the Space program, four UNIX utilities, and a large commercial application of approximately 20 million LOC. F007 (including the use of F007-plus) was able to identify faulty functions in approximately 90% of the failed traces by reviewing less than approximately 10% of the code (i.e., by reviewing only the first few functions in the ranked list).
These results, in fact, lead to an emerging theory that a faulty function can be identified by using prior traces of at least one fault in that function. Thus, F007 and F007-plus can correctly identify faulty functions in the failed traces of the majority (80%-90%) of the field failures by using the knowledge of faults in a small percentage (20%) of functions.
Self-repair ability of evolved self-assembling systems in cellular automata
Self-repairing systems are those that are able to reconfigure themselves following disruptions to bring them back into a defined normal state. In this paper we explore the self-repair ability of some cellular automata-like systems, which differ from classical cellular automata by the introduction of a local diffusion process inspired by chemical signalling processes in biological development. The update rules in these systems are evolved using genetic programming to self-assemble towards a target pattern. In particular, we demonstrate that once the update rules have been evolved for self-assembly, many of those update rules also provide a self-repair ability without any additional evolutionary process aimed specifically at self-repair.
An Effective Data-Driven Approach for Localizing Deep Learning Faults
Deep Learning (DL) applications are being used to solve problems in critical
domains (e.g., autonomous driving or medical diagnosis systems). Thus,
developers need to debug their systems to ensure that the expected behavior is
delivered. However, it is hard and expensive to debug DNNs. When the failure
symptoms or unsatisfactory accuracies are reported after training, we lose the
traceability as to which part of the DNN program is responsible for the
failure. Even worse, sometimes, a deep learning program has different types of
bugs. To address the challenges of debugging DNN models, we propose a novel
data-driven approach that leverages model features to learn problem patterns.
Our approach extracts these features, which represent semantic information of
faults during DNN training. Our technique uses these features as a training
dataset to learn and infer DNN fault patterns. Also, our methodology
automatically links bug symptoms to their root causes, without the need for
manually crafted mappings, so that developers can take the necessary steps to
fix faults. We evaluate our approach using real-world and mutated models. Our
results demonstrate that our technique can effectively detect and diagnose
different bug types. Finally, our technique achieved better accuracy,
precision, and recall than prior work for mutated models. Also, our approach
achieved results comparable to the state of the art for real-world models in
terms of accuracy and performance.
Improving System Reliability for Cyber-Physical Systems
Cyber-physical systems (CPS) are systems featuring a tight combination of, and coordination between, the system's computational and physical elements. They range from critical infrastructure, such as power grids and transportation systems, to health and biomedical devices. System reliability, i.e., the ability of a system to perform its intended function under a given set of environmental and operational conditions for a given period of time, is a fundamental requirement of cyber-physical systems. An unreliable system often leads to disruption of service, financial cost and even loss of human life. An important and prevalent type of cyber-physical system meets the following criteria: it processes large amounts of data; employs software as a system component; runs online continuously; and has an operator in the loop because of the human judgment and accountability required for safety-critical systems. This thesis aims to improve system reliability for this type of cyber-physical system. To that end, I present a system evaluation approach called automated online evaluation (AOE), a data-centric runtime monitoring and reliability evaluation approach that works in parallel with the cyber-physical system. AOE continuously conducts automated evaluation along the workflow of the system using computational intelligence and self-tuning techniques, and provides operator-in-the-loop feedback on reliability improvement. For example, abnormal input and output data at or between the multiple stages of the system can be detected and flagged through data quality analysis. As a result, alerts can be sent to the operator-in-the-loop. The operator can then take actions and make changes to the system based on the alerts in order to achieve minimal system downtime and increased system reliability.
One technique used by the approach is data quality analysis using computational intelligence, which evaluates data quality in an automated and efficient way to make sure the running system performs reliably as expected. Another technique used by the approach is self-tuning, which automatically self-manages and self-configures the evaluation system so that it adapts to changes in the system and to feedback from the operator. To implement the proposed approach, I further present a system architecture called autonomic reliability improvement system (ARIS). This thesis investigates three hypotheses. First, I claim that automated online evaluation empowered by data quality analysis using computational intelligence can effectively improve system reliability for cyber-physical systems in the domain of interest indicated above. In order to prove this hypothesis, a prototype system needs to be developed and deployed in various cyber-physical systems, and reliability metrics are required to measure the system reliability improvement quantitatively. Second, I claim that self-tuning can effectively self-manage and self-configure the evaluation system based on changes in the system and feedback from the operator-in-the-loop to improve system reliability. Third, I claim that the approach is efficient: it should not have a large impact on overall system performance and should introduce only minimal extra overhead to the cyber-physical system. Performance metrics should be used to measure the efficiency and added overhead quantitatively. Additionally, in order to conduct efficient and cost-effective automated online evaluation for data-intensive CPS, which requires large volumes of data and devotes much of its processing time to I/O and data manipulation, this thesis presents COBRA, a cloud-based reliability assurance framework.
COBRA provides automated multi-stage runtime reliability evaluation along the CPS workflow using data relocation services, a cloud data store, data quality analysis and process scheduling with self-tuning to achieve scalability, elasticity and efficiency. Finally, in order to provide a generic way to compare and benchmark system reliability for CPS and to extend the approach described above, this thesis presents FARE, a reliability benchmark framework that employs a CPS reliability model and a set of methods and metrics for evaluation environment selection, failure analysis, and reliability estimation. The main contributions of this thesis include validation of the above hypotheses and empirical studies of the ARIS automated online evaluation system, the COBRA cloud-based reliability assurance framework for data-intensive CPS, and the FARE framework for benchmarking reliability of cyber-physical systems. This work has advanced the state of the art in CPS reliability research, expanded the body of knowledge in this field, and provided some useful studies for further research.
Automatically Repairing Programs Using Both Tests and Bug Reports
The success of automated program repair (APR) depends significantly on its
ability to localize the defects it is repairing. For fault localization (FL),
APR tools typically use either spectrum-based (SBFL) techniques that use test
executions or information-retrieval-based (IRFL) techniques that use bug
reports. These two approaches often complement each other, patching different
defects. No existing repair tool uses both SBFL and IRFL. We develop RAFL
(Rank-Aggregation-Based Fault Localization), a novel FL approach that combines
multiple FL techniques. We also develop Blues, a new IRFL technique that uses
bug reports and an unsupervised approach to localize defects. On a dataset of
818 real-world defects, SBIR (combined SBFL and Blues) consistently localizes
more bugs and ranks buggy statements higher than the two underlying techniques.
For example, SBIR correctly identifies a buggy statement as the most suspicious
for 18.1% of the defects, while SBFL does so for 10.9% and Blues for 3.1%. We
extend SimFix, a state-of-the-art APR tool, to use SBIR, SBFL, and Blues.
SimFix using SBIR patches 112 out of the 818 defects; 110 when using SBFL, and
55 when using Blues. The 112 patched defects include 55 defects patched
exclusively using SBFL, 7 patched exclusively using IRFL, 47 patched using both
SBFL and IRFL and 3 new defects. SimFix using Blues significantly outperforms
iFixR, the state-of-the-art IRFL-based APR tool. Overall, SimFix using our FL
techniques patches ten defects no prior tools could patch. By evaluating on a
benchmark of 818 defects, 442 previously unused in APR evaluations, we find
that prior evaluations on the overused Defects4J benchmark have led to overly
generous findings. Our paper is the first to (1) use combined FL for APR, (2)
apply a more rigorous methodology for measuring patch correctness, and (3)
evaluate on the new, substantially larger version of Defects4J.
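The idea of combining several fault localization rankings can be illustrated with a small rank-aggregation sketch. The abstract does not say which aggregation algorithm RAFL uses, so the Borda count below is a swapped-in, well-known method chosen purely for illustration. The function name `borda_aggregate` and the statement identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_aggregate(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Merge several ranked lists (most suspicious first) using a Borda count.

    With n distinct candidates overall, position p (0-based) in a ranking
    earns n - p points; a candidate missing from a ranking earns 0 there.
    Ties are broken alphabetically to keep the output deterministic.
    """
    candidates = set()
    for ranking in rankings:
        candidates.update(ranking)
    n = len(candidates)

    points: Dict[str, int] = defaultdict(int)
    for ranking in rankings:
        for position, item in enumerate(ranking):
            points[item] += n - position

    return sorted(candidates, key=lambda c: (-points[c], c))


# Hypothetical ranked statement lists from two localizers (assumed data).
sbfl_ranking = ["Foo.java:10", "Bar.java:7", "Baz.java:3"]
irfl_ranking = ["Bar.java:7", "Qux.java:1", "Foo.java:10"]
combined = borda_aggregate([sbfl_ranking, irfl_ranking])
print(combined)  # ['Bar.java:7', 'Foo.java:10', 'Qux.java:1', 'Baz.java:3']
```

Here `Bar.java:7` rises to the top because both input rankings place it high, even though only one of them ranks it first.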