87 research outputs found
Automated Fixing of Programs with Contracts
This paper describes AutoFix, an automatic debugging technique that can fix
faults in general-purpose software. To provide high-quality fix suggestions and
to enable automation of the whole debugging process, AutoFix relies on the
presence of simple specification elements in the form of contracts (such as
pre- and postconditions). Using contracts enhances the precision of dynamic
analysis techniques for fault detection and localization, and for validating
fixes. The only required user input to the AutoFix supporting tool is then a
faulty program annotated with contracts; the tool produces a collection of
validated fixes for the fault ranked according to an estimate of their
suitability.
In an extensive experimental evaluation, we applied AutoFix to over 200
faults in four code bases of different maturity and quality (of implementation
and of contracts). AutoFix successfully fixed 42% of the faults, producing, in
the majority of cases, corrections of quality comparable to those competent
programmers would write; the used computational resources were modest, with an
average time per fix below 20 minutes on commodity hardware. These figures
compare favorably to the state of the art in automated program fixing, and
demonstrate that the AutoFix approach is successfully applicable to reduce the
debugging burden in real-world scenarios.Comment: Minor changes after proofreadin
Recommended from our members
Systematic techniques for more effective fault localization and program repair
Debugging faulty code is a tedious process that is often quite expensive and can require much manual effort. Developers typically perform debugging in two key steps: (1) fault localization, i.e., identifying the location of faulty line(s) of code; and (2) program repair, i.e., modifying the code to remove the fault(s). Automating debugging to reduce its cost has been the focus of a number of research projects during the last decade, which have introduced a variety of techniques.
However, existing techniques suffer from two basic limitations. One, they lack accuracy to handle real programs. Two, they focus on automating only one of the two key steps, thereby leaving the other key step to the developer.
Our thesis is that an approach that integrates systematic search based on state-of-the-art constraint solvers with techniques to analyze artifacts that describe application specific properties and behaviors, provides the basis for developing more effective debugging techniques. We focus on faults in programs that operate on structurally complex inputs, such as heap-allocated data or relational databases.
Our approach lays the foundation for a unified framework for localization and repair of faults in programs. We embody our thesis in a suite of integrated techniques based on propositional satisfiability solving, correctness specifications analysis, test-spectra analysis, and rule-learning algorithms from machine learning, implement them as a prototype tool-set, and evaluate them using several subject programs.Electrical and Computer Engineerin
On the Effectiveness of Automatically Inferred Invariants in Detecting Regression Faults in Spreadsheets
Automatically inferred invariants have been found to be successful in detecting regression faults in traditional software, but their application has not been explored in the context of spreadsheets. In this paper, we investigate the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets. We conduct an exploratory empirical study on eight spreadsheets taken from VEnron and EUSES corpora. We apply automatic invariant inference to them, create tests based on the inferred invariants, and finally seed the sheets with faults. Results indicate that the effectiveness of the inferred invariants, in terms of accuracy of fault detection, largely varies from spreadsheet to spreadsheet. The effectiveness is found to be affected by the formulas and data contained in the spreadsheets, and also by the type of faults to be detected.Software EngineeringSoftware Technolog
Recommended from our members
Enhancing Usability and Explainability of Data Systems
The recent growth of data science expanded its reach to an ever-growing user base of nonexperts, increasing the need for usability, understandability, and explainability in these systems. Enhancing usability makes data systems accessible to people with different skills and backgrounds alike, leading to democratization of data systems. Furthermore, proper understanding of data and data-driven systems is necessary for the users to trust the function of the systems that learn from data. Finally, data systems should be transparent: when a data system behaves unexpectedly or malfunctions, the users deserve proper explanation of what caused the observed incident. Unfortunately, most existing data systems offer limited usability and support for explanations: these systems are usable only by experts with sound technical skills, and even expert users are hindered by the lack of transparency into the systems\u27 inner workings and functions. The aim of my thesis is to bridge the usability gap between nonexpert users and complex data systems, aid all sort of users, including the expert ones, in data and system understanding, and provide explanations that help reason about unexpected outcomes involving data systems. Specifically, my thesis has the following three goals: (1) enhancing usability of data systems for nonexperts, (2) enable data understanding that can assist users in a variety of tasks such as achieving trust in data-driven machine learning, gaining data understanding, and data cleaning, and (3) explaining causes of unexpected outcomes involving data and data systems.
For enhancing usability, we focus on example-driven user intent discovery. We develop systems based on example-driven interactions in two different settings: querying relational databases and personalized document summarization. Towards data understanding, we develop a new data-profiling primitive that can characterize tuples for which a machine-learned model is likely to produce untrustworthy predictions. We also develop an explanation framework to explain causes of such untrustworthy predictions. Additionally, this new data-profiling primitive enables interactive data cleaning. Finally, we develop two explanation frameworks, tailored to provide explanations in debugging data system components, including the data itself. The explanation frameworks focus on explaining the root cause of a concurrent application\u27s intermittent failure and exposing issues in the data that cause a data-driven system to malfunction
Modelling, Reverse Engineering, and Learning Software Variability
The society expects software to deliver the right functionality, in a short amount of time and with fewer resources, in every possible circumstance whatever are the hardware, the operating systems, the compilers, or the data fed as input. For fitting such a diversity of needs, it is common that software comes in many variants and is highly configurable through configuration options, runtime parameters, conditional compilation directives, menu preferences, configuration files, plugins, etc. As there is no one-size-fits-all solution, software variability ("the ability of a software system or artifact to be efficiently extended, changed, customized or configured for use in a particular context") has been studied the last two decades and is a discipline of its own. Though highly desirable, software variability also introduces an enormous complexity due to the combinatorial explosion of possible variants. For example, the Linux kernel has 15000+ options and most of them can have 3 values: "yes", "no", or "module". Variability is challenging for maintaining, verifying, and configuring software systems (Web applications, Web browsers, video tools, etc.). It is also a source of opportunities to better understand a domain, create reusable artefacts, deploy performance-wise optimal systems, or find specialized solutions to many kinds of problems. In many scenarios, a model of variability is either beneficial or mandatory to explore, observe, and reason about the space of possible variants. For instance, without a variability model, it is impossible to establish a sampling strategy that would satisfy the constraints among options and meet coverage or testing criteria. I address a central question in this HDR manuscript: How to model software variability? I detail several contributions related to modelling, reverse engineering, and learning software variability. I first contribute to support the persons in charge of manually specifying feature models, the de facto standard for modeling variability. I develop an algebra together with a language for supporting the composition, decomposition, diff, refactoring, and reasoning of feature models. I further establish the syntactic and semantic relationships between feature models and product comparison matrices, a large class of tabular data. I then empirically investigate how these feature models can be used to test in the large configurable systems with different sampling strategies. Along this effort, I report on the attempts and lessons learned when defining the "right" variability language. From a reverse engineering perspective, I contribute to synthesize variability information into models and from various kinds of artefacts. I develop foundations and methods for reverse engineering feature models from satisfiability formulae, product comparison matrices, dependencies files and architectural information, and from Web configurators. I also report on the degree of automation and show that the involvement of developers and domain experts is beneficial to obtain high-quality models. Thirdly, I contribute to learning constraints and non-functional properties (performance) of a variability-intensive system. I describe a systematic process "sampling, measuring, learning" that aims to enforce or augment a variability model, capturing variability knowledge that domain experts can hardly express. I show that supervised, statistical machine learning can be used to synthesize rules or build prediction models in an accurate and interpretable way. This process can even be applied to huge configuration space, such as the Linux kernel one. Despite a wide applicability and observed benefits, I show that each individual line of contributions has limitations. I defend the following answer: a supervised, iterative process (1) based on the combination of reverse engineering, modelling, and learning techniques; (2) capable of integrating multiple variability information (eg expert knowledge, legacy artefacts, dynamic observations). Finally, this work opens different perspectives related to so-called deep software variability, security, smart build of configurations, and (threats to) science
Fundamental Approaches to Software Engineering
This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications
Fundamental Approaches to Software Engineering
This open access book constitutes the proceedings of the 25th International Conference on Fundamental Approaches to Software Engineering, FASE 2022, which was held during April 4-5, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 17 regular papers presented in this volume were carefully reviewed and selected from 64 submissions. The proceedings also contain 3 contributions from the Test-Comp Competition. The papers deal with the foundations on which software engineering is built, including topics like software engineering as an engineering discipline, requirements engineering, software architectures, software quality, model-driven development, software processes, software evolution, AI-based software engineering, and the specification, design, and implementation of particular classes of systems, such as (self-)adaptive, collaborative, AI, embedded, distributed, mobile, pervasive, cyber-physical, or service-oriented applications
- …