122 research outputs found

    Evidence-driven testing and debugging of software systems

    Program debugging is the process of testing, exposing, reproducing, diagnosing and fixing software bugs. Many techniques have been proposed to aid developers during software testing and debugging. However, researchers have found that developers rarely use or adopt the proposed techniques in software practice. Evidently, this is because there is a gap between the proposed methods and the state of software practice. Most methods fail to address the actual needs of software developers. In this dissertation, we pose the following scientific question: How can we bridge the gap between software practice and the state-of-the-art automated testing and debugging techniques? To address this challenge, we put forward the following thesis: Software testing and debugging should be driven by empirical evidence collected from software practice. In particular, we posit that the feedback from software practice should shape and guide (the automation of) testing and debugging activities. In this thesis, we focus on gathering evidence from software practice by conducting several empirical studies on software testing and debugging activities in the real world. We then build tools and methods that are well-grounded and driven by the empirical evidence obtained from these experiments. Firstly, we conduct an empirical study on the state of debugging in practice using a survey and a human study. In this study, we ask developers about their debugging needs and observe the tools and strategies they employ while testing, diagnosing, and repairing real bugs. Secondly, we evaluate the effectiveness of the state-of-the-art automated fault localization (AFL) methods on real bugs and programs. Thirdly, we conduct an experiment to evaluate the causes of invalid inputs in software practice. Lastly, we study how to learn input distributions from real-world sample inputs, using probabilistic grammars. To bridge the gap between software practice and the state of the art in software testing and debugging, we proffer the following empirical results and techniques: (1) We collect evidence on the state of practice in program debugging and indeed, we find that there is a chasm between (available) debugging tools and developer needs. We elicit the actual needs and concerns of developers when testing and diagnosing real faults and provide a benchmark (called DBGBench) to aid the automated evaluation of debugging and repair tools. (2) We provide empirical evidence on the effectiveness of several state-of-the-art AFL techniques (such as statistical debugging formulas and dynamic slicing). Building on the obtained empirical evidence, we provide a hybrid approach that outperforms the state-of-the-art AFL techniques. (3) We evaluate the prevalence and causes of invalid inputs in software practice, and we build on the lessons learned from this experiment to develop a general-purpose algorithm (called ddmax) that automatically diagnoses and repairs real-world invalid inputs. (4) We provide a method to learn the distribution of input elements in software practice using probabilistic grammars, and we further employ the learned distribution to drive the generation of test inputs that are similar (or dissimilar) to sample inputs found in the wild. In summary, we propose an evidence-driven approach to software testing and debugging, which is based on collecting empirical evidence from software practice to guide and direct software testing and debugging.
In our evaluation, we find that our approach improves the effectiveness of several debugging activities in practice. In particular, using our evidence-driven approach, we elicit the actual debugging needs of developers, improve the effectiveness of several automated fault localization techniques, effectively debug and repair invalid inputs, and generate test inputs that are (dis)similar to real-world inputs. Our proposed methods are built on empirical evidence, and they improve over the state-of-the-art techniques in testing and debugging.
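To give a concrete flavor of the statistical debugging formulas mentioned above, the sketch below ranks program lines by the Ochiai suspiciousness metric, one widely used AFL formula. The function name and the coverage data layout are illustrative assumptions, not the dissertation's implementation.

```python
# Minimal sketch of one statistical fault localization formula (Ochiai).
# The data layout (coverage counts per line) is an assumption for illustration.
from math import sqrt

def ochiai(failed_cov, passed_cov, total_failed):
    """Rank program lines by suspiciousness.

    failed_cov[line] -> number of failing tests that executed `line`
    passed_cov[line] -> number of passing tests that executed `line`
    total_failed     -> total number of failing tests
    """
    scores = {}
    for line in set(failed_cov) | set(passed_cov):
        ef = failed_cov.get(line, 0)            # executed and failed
        ep = passed_cov.get(line, 0)            # executed and passed
        denom = sqrt(total_failed * (ef + ep))
        scores[line] = ef / denom if denom else 0.0
    # Most suspicious lines first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy usage: line 42 is executed by both failing tests and no passing test,
# so it is ranked above line 7.
ranking = ochiai({42: 2, 7: 1}, {7: 5}, total_failed=2)
print(ranking)   # [(42, 1.0), (7, ~0.29)]
```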

    Fundamental Approaches to Software Engineering

    computer software maintenance; computer software selection and evaluation; formal logic; formal methods; formal specification; programming languages; semantics; software engineering; specifications; verification

    THE SCALABLE AND ACCOUNTABLE BINARY CODE SEARCH AND ITS APPLICATIONS

    The past decade has witnessed an explosion of applications and devices. This big-data era challenges existing security technologies: new analysis techniques must scale to “big data”-sized codebases; they must become smart and proactive, using the data to understand what the vulnerable points are and where they are located; and effective protection must be provided for the dissemination and analysis of data involving sensitive information on an unprecedented scale. In this dissertation, I argue that code search techniques can boost existing security analysis techniques (vulnerability identification and memory analysis) in terms of scalability and accuracy. To demonstrate these benefits, I address two issues of code search by means of code analysis: scalability and accountability. I further demonstrate the benefit of code search by applying it to scalable vulnerability identification [57] and to cross-version memory analysis problems [55, 56]. Firstly, I address the scalability problem of code search by learning “higher-level” semantic features from code [57]. Instead of conducting fine-grained testing on a single device or program, it becomes much more crucial to achieve quick vulnerability scanning across devices or programs at a “big data” scale. However, discovering vulnerabilities in “big code” is like finding a needle in a haystack, even when dealing with known vulnerabilities. This new challenge demands a scalable code search approach. To this end, I leverage successful techniques from image search in the computer vision community and propose a novel code encoding method for scalable vulnerability search in binary code. The evaluation results show that this approach can achieve comparable or even better accuracy and efficiency than the baseline techniques. Secondly, I tackle the accountability issues left open in the vulnerability search problem by designing vulnerability-oriented raw features [58]. Similar code does not always represent a similar vulnerability, so feature engineering for code search should focus on semantic-level features rather than syntactic ones. I propose to extract conditional formulas as higher-level semantic features from the raw binary code to conduct the code search. A conditional formula explicitly captures two cardinal factors of a vulnerability: 1) erroneous data dependencies and 2) missing or invalid condition checks. As a result, binary code search on conditional formulas produces significantly higher accuracy and provides meaningful evidence for human analysts to further examine the search results. The evaluation results show that this approach can further improve the search accuracy of existing bug search techniques with very reasonable performance overhead. Finally, I demonstrate the potential of the code search technique in the memory analysis field and apply it to address the cross-version issue in memory forensics [55, 56]. Memory analysis techniques for COTS software usually rely on so-called “data structure profiles” for their binaries. Constructing such profiles requires expert knowledge about the internal workings of a specific software version, and it remains a cumbersome manual effort most of the time.
I propose to leverage the code search technique to enable a notion called “cross-version memory analysis”, which can update a profile for new versions of a piece of software by transferring knowledge from a model already trained on its old version. The evaluation results show that the code-search-based approach advances existing memory analysis methods by reducing manual effort while maintaining reasonable accuracy. With the help of collaborators, I further developed two plugins for the Volatility memory forensics framework [2], and show that each plugin can construct a localized profile to perform specific memory forensics tasks on the same memory dump, without the manual effort of creating the corresponding profile
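To illustrate the general shape of feature-based code search described above, here is a minimal sketch that encodes each binary function as a numeric feature vector and ranks candidates by cosine similarity to a query. The feature layout and function names are simplified, hypothetical assumptions rather than the dissertation's actual encoding.

```python
# Minimal sketch of feature-vector code search: rank candidate functions by
# similarity to a query (e.g. known-vulnerable) function. The feature layout
# below (basic blocks, calls, arithmetic ops, comparisons) is an assumption
# made only for illustration.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search(query_vec, corpus, top_k=3):
    """Return the top_k corpus entries most similar to the query vector."""
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return ranked[:top_k]

# Toy corpus with hypothetical function names.
corpus = {
    "firmware_a:parse_header":   np.array([12.0, 4.0, 9.0, 6.0]),
    "firmware_b:handle_request": np.array([20.0, 7.0, 3.0, 2.0]),
    "firmware_c:read_options":   np.array([18.0, 6.0, 2.0, 3.0]),
}
query = np.array([19.0, 7.0, 3.0, 2.0])   # features of a known-vulnerable function
for name, _ in search(query, corpus, top_k=2):
    print(name)   # closest candidates first
```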

    Ernst Denert Award for Software Engineering 2020

    This open access book provides an overview of the dissertations of the eleven nominees for the Ernst Denert Award for Software Engineering in 2020. The prize, kindly sponsored by the Gerlind & Ernst Denert Stiftung, is awarded for excellent work within the discipline of Software Engineering, which includes methods, tools and procedures for better and efficient development of high quality software. An essential requirement for the nominated work is its applicability and usability in industrial practice. The book contains eleven papers that describe the works by Jonathan Brachthäuser (EPFL Lausanne) entitled What You See Is What You Get: Practical Effect Handlers in Capability-Passing Style, Mojdeh Golagha’s (Fortiss, Munich) thesis How to Effectively Reduce Failure Analysis Time?, Nikolay Harutyunyan’s (FAU Erlangen-Nürnberg) work on Open Source Software Governance, Dominic Henze’s (TU Munich) research about Dynamically Scalable Fog Architectures, Anne Hess’s (Fraunhofer IESE, Kaiserslautern) work on Crossing Disciplinary Borders to Improve Requirements Communication, Istvan Koren’s (RWTH Aachen U) thesis DevOpsUse: A Community-Oriented Methodology for Societal Software Engineering, Yannic Noller’s (NU Singapore) work on Hybrid Differential Software Testing, Dominic Steinhofel’s (TU Darmstadt) thesis entitled Ever Change a Running System: Structured Software Reengineering Using Automatically Proven-Correct Transformation Rules, Peter Wägemann’s (FAU Erlangen-Nürnberg) work Static Worst-Case Analyses and Their Validation Techniques for Safety-Critical Systems, Michael von Wenckstern’s (RWTH Aachen U) research on Improving the Model-Based Systems Engineering Process, and Franz Zieris’s (FU Berlin) thesis on Understanding How Pair Programming Actually Works in Industry: Mechanisms, Patterns, and Dynamics – which actually won the award. The chapters describe key findings of the respective works, show their relevance and applicability to practice and industrial software engineering projects, and provide additional information and findings that have only been discovered afterwards, e.g. when applying the results in industry. This way, the book is not only interesting to other researchers, but also to industrial software professionals who would like to learn about the application of state-of-the-art methods in their daily work

    Working Notes from the 1992 AAAI Workshop on Automating Software Design. Theme: Domain Specific Software Design

    The goal of this workshop is to identify different architectural approaches to building domain-specific software design systems and to explore issues unique to domain-specific (vs. general-purpose) software design. Some general issues that cut across particular software design domains include: (1) knowledge representation, acquisition, and maintenance; (2) specialized software design techniques; and (3) user interaction and user interfaces

    Proceedings of the 22nd Conference on Formal Methods in Computer-Aided Design – FMCAD 2022

    The Conference on Formal Methods in Computer-Aided Design (FMCAD) is an annual conference on the theory and applications of formal methods in hardware and system verification. FMCAD provides a leading forum to researchers in academia and industry for presenting and discussing groundbreaking methods, technologies, theoretical results, and tools for reasoning formally about computing systems. FMCAD covers formal aspects of computer-aided system design including verification, specification, synthesis, and testing