195 research outputs found

    Comprehensive Evaluation of Association Measures for Fault Localization

    Get PDF
    To cite the data package, please use the following citation: Lucia, Lo, D., Lingxiao, J., & Budi, A. (2010). Data from: Comprehensive evaluation of association measures for fault localization. InK Repository at Singapore Management University. http://ink.library.smu.edu.sg/sis_research/1330</p

    Graph Mining for Software Fault Localization: An Edge Ranking based Approach

    Get PDF
    Fault localization is considered one of the most challenging activities in the software debugging process. It is vital to guarantee software reliability. Hence, there has been a great demand for automated methods that can pinpoint faults for software developers. Various fault localization techniques that are based on graph mining have been proposed in the literature. These techniques rely on detecting discriminative sub-graphs between failing and passing traces. However, these approaches may not be applicable when the fault does not appear in a discriminative pattern. On the other hand, many approaches focus on selecting potentially faulty program components (statements or predicates) and then ranking these components according to their degree of suspiciousness. One of the difficulties encountered by such approaches is to understand the context of fault occurrence. To address these issues, this paper introduces an approach that helps in analyzing the context of execution traces based on control flow graphs. The proposed approach uses the edge-ranking of basic blocks in software programs using Dstar that proved to be more effective than many fault localization techniques. The proposed method helps in detecting some types of faults that could not be previously detected by many other approaches. Using Siemens benchmark, experiments show the effectiveness of the proposed technique compared to some well-known approaches such as Dstar, Tarantula, SOBER, Cause Transition and Liblit05. The percentage of localized faulty versions versus the percentage of code examined is taken as a measure. For instance, when the percentage of examined code is 30%, the proposed technique can localize nearly 81% of the faulty versions, which outperforms the other four techniques

    Graph Mining for Software Fault Localization: An Edge Ranking based Approach

    Get PDF
    Fault localization is considered one of the most challenging activities in the software debugging process. It is vital to guarantee software reliability. Hence, there has been a great demand for automated methods that can pinpoint faults for software developers. Various fault localization techniques that are based on graph mining have been proposed in the literature. These techniques rely on detecting discriminative sub-graphs between failing and passing traces. However, these approaches may not be applicable when the fault does not appear in a discriminative pattern. On the other hand, many approaches focus on selecting potentially faulty program components (statements or predicates) and then ranking these components according to their degree of suspiciousness. One of the difficulties encountered by such approaches is to understand the context of fault occurrence. To address these issues, this paper introduces an approach that helps in analyzing the context of execution traces based on control flow graphs. The proposed approach uses the edge-ranking of basic blocks in software programs using Dstar that proved to be more effective than many fault localization techniques. The proposed method helps in detecting some types of faults that could not be previously detected by many other approaches. Using Siemens benchmark, experiments show the effectiveness of the proposed technique compared to some well-known approaches such as Dstar, Tarantula, SOBER, Cause Transition and Liblit05. The percentage of localized faulty versions versus the percentage of code examined is taken as a measure. For instance, when the percentage of examined code is 30%, the proposed technique can localize nearly 81% of the faulty versions, which outperforms the other four techniques

    Extended Comprehensive Study of Association Measures for Fault Localization

    Get PDF
    To cite the data package, please use the following citation: Lucia, L., Lo, D., Jiang, L., Thung, F., & Budi, A. (2014). Data from: Extended Comprehensive Study of Association Measures for Fault Localization. InK Repository at Singapore Management University. http://ink.library.smu.edu.sg/sis_research/1818</p

    Body of Knowledge for Graphics Processing Units (GPUs)

    Get PDF
    Graphics Processing Units (GPU) have emerged as a proven technology that enables high performance computing and parallel processing in a small form factor. GPUs enhance the traditional computer paradigm by permitting acceleration of complex mathematics and providing the capability to perform weighted calculations, such as those in artificial intelligence systems. Despite the performance enhancements provided by this type of microprocessor, there exist tradeoffs in regards to reliability and radiation susceptibility, which may impact mission success. This report provides an insight into GPU architecture and its potential applications in space and other similar markets. It also discusses reliability, qualification, and radiation considerations for testing GPUs

    Evidence-driven testing and debugging of software systems

    Get PDF
    Program debugging is the process of testing, exposing, reproducing, diagnosing and fixing software bugs. Many techniques have been proposed to aid developers during software testing and debugging. However, researchers have found that developers hardly use or adopt the proposed techniques in software practice. Evidently, this is because there is a gap between proposed methods and the state of software practice. Most methods fail to address the actual needs of software developers. In this dissertation, we pose the following scientific question: How can we bridge the gap between software practice and the state-of-the-art automated testing and debugging techniques? To address this challenge, we put forward the following thesis: Software testing and debugging should be driven by empirical evidence collected from software practice. In particular, we posit that the feedback from software practice should shape and guide (the automation) of testing and debugging activities. In this thesis, we focus on gathering evidence from software practice by conducting several empirical studies on software testing and debugging activities in the real-world. We then build tools and methods that are well-grounded and driven by the empirical evidence obtained from these experiments. Firstly, we conduct an empirical study on the state of debugging in practice using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies employed by developers while testing, diagnosing and repairing real bugs. Secondly, we evaluate the effectiveness of the state-of-the-art automated fault localization (AFL) methods on real bugs and programs. Thirdly, we conducted an experiment to evaluate the causes of invalid inputs in software practice. Lastly, we study how to learn input distributions from real-world sample inputs, using probabilistic grammars. To bridge the gap between software practice and the state of the art in software testing and debugging, we proffer the following empirical results and techniques: (1) We collect evidence on the state of practice in program debugging and indeed, we found that there is a chasm between (available) debugging tools and developer needs. We elicit the actual needs and concerns of developers when testing and diagnosing real faults and provide a benchmark (called DBGBench) to aid the automated evaluation of debugging and repair tools. (2) We provide empirical evidence on the effectiveness of several state-of-the-art AFL techniques (such as statistical debugging formulas and dynamic slicing). Building on the obtained empirical evidence, we provide a hybrid approach that outperforms the state-of-the-art AFL techniques. (3) We evaluate the prevalence and causes of invalid inputs in software practice, and we build on the lessons learned from this experiment to build a general-purpose algorithm (called ddmax) that automatically diagnoses and repairs real-world invalid inputs. (4) We provide a method to learn the distribution of input elements in software practice using probabilistic grammars and we further employ the learned distribution to drive the test generation of inputs that are similar (or dissimilar) to sample inputs found in the wild. In summary, we propose an evidence-driven approach to software testing and debugging, which is based on collecting empirical evidence from software practice to guide and direct software testing and debugging. In our evaluation, we found that our approach is effective in improving the effectiveness of several debugging activities in practice. In particular, using our evidence-driven approach, we elicit the actual debugging needs of developers, improve the effectiveness of several automated fault localization techniques, effectively debug and repair invalid inputs, and generate test inputs that are (dis)similar to real-world inputs. Our proposed methods are built on empirical evidence and they improve over the state-of-the-art techniques in testing and debugging.Software-Debugging bezeichnet das Testen, Aufspüren, Reproduzieren, Diagnostizieren und das Beheben von Fehlern in Programmen. Es wurden bereits viele Debugging-Techniken vorgestellt, die Softwareentwicklern beim Testen und Debuggen unterstützen. Dennoch hat sich in der Forschung gezeigt, dass Entwickler diese Techniken in der Praxis kaum anwenden oder adaptieren. Das könnte daran liegen, dass es einen großen Abstand zwischen den vorgestellten und in der Praxis tatsächlich genutzten Techniken gibt. Die meisten Techniken genügen den Anforderungen der Entwickler nicht. In dieser Dissertation stellen wir die folgende wissenschaftliche Frage: Wie können wir die Kluft zwischen Software-Praxis und den aktuellen wissenschaftlichen Techniken für automatisiertes Testen und Debugging schließen? Um diese Herausforderung anzugehen, stellen wir die folgende These auf: Das Testen und Debuggen von Software sollte von empirischen Daten, die in der Software-Praxis gesammelt wurden, vorangetrieben werden. Genauer gesagt postulieren wir, dass das Feedback aus der Software-Praxis die Automation des Testens und Debuggens formen und bestimmen sollte. In dieser Arbeit fokussieren wir uns auf das Sammeln von Daten aus der Software-Praxis, indem wir einige empirische Studien über das Testen und Debuggen von Software in der echten Welt durchführen. Auf Basis der gesammelten Daten entwickeln wir dann Werkzeuge, die sich auf die Daten der durchgeführten Experimente stützen. Als erstes führen wir eine empirische Studie über den Stand des Debuggens in der Praxis durch, wobei wir eine Umfrage und eine Humanstudie nutzen. In dieser Studie befragen wir Entwickler zu ihren Bedürfnissen, die sie beim Debuggen haben und beobachten die Werkzeuge und Strategien, die sie beim Diagnostizieren, Testen und Aufspüren echter Fehler einsetzen. Als nächstes bewerten wir die Effektivität der aktuellen Automated Fault Localization (AFL)- Methoden zum automatischen Aufspüren von echten Fehlern in echten Programmen. Unser dritter Schritt ist ein Experiment, um die Ursachen von defekten Eingaben in der Software-Praxis zu ermitteln. Zuletzt erforschen wir, wie Häufigkeitsverteilungen von Teileingaben mithilfe einer Grammatik von echten Beispiel-Eingaben aus der Praxis gelernt werden können. Um die Lücke zwischen Software-Praxis und der aktuellen Forschung über Testen und Debuggen von Software zu schließen, bieten wir die folgenden empirischen Ergebnisse und Techniken: (1) Wir sammeln aktuelle Forschungsergebnisse zum Stand des Software-Debuggens und finden in der Tat eine Diskrepanz zwischen (vorhandenen) Debugging-Werkzeugen und dem, was der Entwickler tatsächlich benötigt. Wir sammeln die tatsächlichen Bedürfnisse von Entwicklern beim Testen und Debuggen von Fehlern aus der echten Welt und entwickeln einen Benchmark (DbgBench), um das automatische Evaluieren von Debugging-Werkzeugen zu erleichtern. (2) Wir stellen empirische Daten zur Effektivität einiger aktueller AFL-Techniken vor (z.B. Statistical Debugging-Formeln und Dynamic Slicing). Auf diese Daten aufbauend, stellen wir einen hybriden Algorithmus vor, der die Leistung der aktuellen AFL-Techniken übertrifft. (3) Wir evaluieren die Häufigkeit und Ursachen von ungültigen Eingaben in der Softwarepraxis und stellen einen auf diesen Daten aufbauenden universell einsetzbaren Algorithmus (ddmax) vor, der automatisch defekte Eingaben diagnostiziert und behebt. (4) Wir stellen eine Methode vor, die Verteilung von Schnipseln von Eingaben in der Software-Praxis zu lernen, indem wir Grammatiken mit Wahrscheinlichkeiten nutzen. Die gelernten Verteilungen benutzen wir dann, um den Beispiel-Eingaben ähnliche (oder verschiedene) Eingaben zu erzeugen. Zusammenfassend stellen wir einen auf der Praxis beruhenden Ansatz zum Testen und Debuggen von Software vor, welcher auf empirischen Daten aus der Software-Praxis basiert, um das Testen und Debuggen zu unterstützen. In unserer Evaluierung haben wir festgestellt, dass unser Ansatz effektiv viele Debugging-Disziplinen in der Praxis verbessert. Genauer gesagt finden wir mit unserem Ansatz die genauen Bedürfnisse von Entwicklern, verbessern die Effektivität vieler AFL-Techniken, debuggen und beheben effektiv fehlerhafte Eingaben und generieren Test-Eingaben, die (un)ähnlich zu Eingaben aus der echten Welt sind. Unsere vorgestellten Methoden basieren auf empirischen Daten und verbessern die aktuellen Techniken des Testens und Debuggens

    On Test Selection, Prioritization, Bisection, and Guiding Bisection with Risk Models

    Get PDF
    The cost of software testing has become a burden for software companies in the era of rapid release and continuous integration. In the first part of our work, we evaluate the results of adopting multiple test selection and prioritization approaches for improving test effectiveness in of the test stages of our industrial partner Ericsson Inc. We confirm the existence of valuable information in the test execution history. In particular, the association between test failures provide the most value to the test selection and prioritization processes. More importantly, during this exercise, we encountered various challenges that are unseen or undiscussed in prior research. We document the challenges, our solutions and the lessons learned as an experience report. In the second part of our work, we explore batch testing in test execution environments and how it can help to reduce the test execution costs. One approach to reducing the test execution costs is to group changes into batches and test them at once. In this work, we study the impact of batch testing in reducing the number of test executions to deliver changes and find culprit commits. We factor test flakiness into our simulations and its impact on the optimal batch size. Moreover, we address another problem with batch testing, how to find a culprit commit when a batch fails. We introduce a novel technique where we guide bisection based on two risk models: a bug model and a test execution history model. We isolate the risky commits by testing them individually, while the less risky commits are tested in a single large batch. Our results show that batch testing can be improved by adopting our culprit prediction models. The results we present here have convinced Ericsson developers to implement our culprit risk predictions in the CulPred tool that will make their continuous integration pipeline more efficient

    A Framework for Detecting and Diagnosing Configuration Faults in Web Applications

    Get PDF
    Software portability is a key concern when target operational environments are highly configurable; variations in configuration settings can significantly impact software correctness. While portability is key for a wide range of software types, it is a significant challenge in web application development. The client configuration used to navigate and interact with web content is known to be an important factor in the subsequent quality of deployed web applications. With the widespread use of diverse, heterogeneous web client configurations, the results of web application deployment can vary unpredictably among users. Given existing approaches and limited development resources, attempting to develop web applications that are viewable, functional, and portable for the vast web configuration space is a significant undertaking. As a result, faults that only surface in precise configurations, termed configuration faults, have the potential to escape detection until web applications are fielded. This dissertation presents an automated, model-based framework that uses static analysis to detect and diagnose web configuration faults. This approach overcomes the limitations of current techniques by featuring an extensible model of the configuration space that enables efficient portability analysis across the vast array of client environments. The basic idea behind this approach is that source code fragments (i.e., HTML tags and CSS rules) embedded in web application source code adversely impact portability of web applications when they are unsupported in target client configurations; without proper support, the source code is either processed incorrectly or ignored, resulting in configuration faults. Using static analysis, configuration fault detection is performed by applying a model of the web application source against knowledge of support criteria; any unsupported source code detected is considered an index to potential configuration faults. In the effort to fully exploit this approach, improve practicality, and maximize fault detection efficiency, manual and automated approaches to knowledge acquisition have been implemented, variations of web application and client support knowledge models have been investigated, and visualization of configuration fault detection results has been explored. To optimize the automated acquisition of support knowledge, alternate learning strategies have been empirically investigated and provisions for capturing tag interaction have been integrated into the process
    corecore