36 research outputs found

    Poster: Debugging Inputs

    Get PDF
    Program failures are often caused by invalid inputs, for instance due to input corruption. To obtain a passing input, one needs to debug the data. In this paper, we present a generic technique called ddmax that (1) identifies which parts of the input data prevent processing, and (2) recovers as much of the (valuable) input data as possible. To the best of our knowledge, ddmax is the first approach that fixes faults in the input data without requiring program analysis. In our evaluation, ddmax repaired about 69% of input files and recovered about 78% of the data within one minute per input.

    Locating Faults with Program Slicing: An Empirical Analysis

    Get PDF
    Statistical fault localization is an easily deployed technique for quickly determining candidates for faulty code locations. If a human programmer has to search for the fault beyond the top candidate locations, though, more traditional techniques of following dependencies along dynamic slices may be better suited. In a large study of 457 bugs (369 single faults and 88 multiple faults) in 46 open-source C programs, we compare the effectiveness of statistical fault localization against dynamic slicing. For single faults, we find that dynamic slicing was eight percentage points more effective than the best-performing statistical debugging formula; for 66% of the bugs, dynamic slicing finds the fault earlier than the best-performing statistical debugging formula. In our evaluation, dynamic slicing is more effective for programs with a single fault, but statistical debugging performs better on multiple faults. The best results, however, are obtained by a hybrid approach: if programmers first examine at most the top five most suspicious locations from statistical debugging, and then switch to dynamic slices, they will on average need to examine 15% (30 lines) of the code. These findings hold for the 18 most effective statistical debugging formulas, and our results are independent of the number of faults (i.e., single or multiple faults) and error type (i.e., artificial or real errors).
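As a concrete example of the kind of statistical debugging formula the study compares, the following sketch ranks code lines by Ochiai suspiciousness. The coverage counts are invented for illustration; only the formula itself is standard.

```python
from math import sqrt

def ochiai(ef, ep, total_failed):
    """Ochiai suspiciousness: ef/ep are the numbers of failing/passing
    runs that executed the line; total_failed is the number of failing
    runs overall."""
    if ef == 0:
        return 0.0
    return ef / sqrt(total_failed * (ef + ep))

# Invented coverage data: line -> (failing runs hitting it, passing runs)
coverage = {
    "line 3": (4, 1),   # executed by all 4 failing runs, 1 passing run
    "line 7": (4, 4),
    "line 9": (1, 5),
}
# Rank lines most-suspicious-first, as a statistical debugger would:
ranking = sorted(coverage, key=lambda l: ochiai(*coverage[l], 4), reverse=True)
print(ranking)  # ['line 3', 'line 7', 'line 9']
```

A hybrid tool in the spirit of the paper would present the top entries of such a ranking first and fall back to dynamic slicing from there.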

    Directed Grammar-Based Test Generation - Replication Package

    No full text
    Artifact Structure: This artifact consists of a project archive `TSE-Project.tar.xz` and a results archive `TSE-Results.tar.xz`. To evaluate the artifact, first extract the project archive into a folder of your choice. It contains all scripts, Dockerfiles, and executables required to run the evaluation, as well as the generated graphs and tables from the evaluation inside the `project/out` directory. To build and run the Dockerfile, consult the "Building and testing the Dockerfile on your local machine" section of the README.md file that is also contained inside the project artifact. We included all intermediate results of our evaluation inside the `TSE-Results.tar.xz` archive. To evaluate those files (including the inputs generated during our evaluation), extract this archive into the `project/workingdir` directory. It requires about 50 GiB of disk space.
    Paper Abstract: Context: To effectively test complex software, it is important to generate goal-specific inputs, i.e., inputs that achieve a specific testing goal. For instance, developers may intend to target one or more testing goals during testing, such as generating complex inputs or triggering new or error-prone behaviors. Problem: However, most state-of-the-art test generators are not designed to target specific goals. Notably, grammar-based test generators, which (randomly) produce syntactically valid inputs via an input specification (i.e., a grammar), have a low probability of achieving an arbitrary testing goal. Aim: This work addresses this challenge by proposing an automated test generation approach (called FDLOOP) which iteratively learns relevant input properties from existing inputs to drive the generation of goal-specific inputs. Method: The main idea of our approach is to leverage test feedback to generate goal-specific inputs via a combination of evolutionary testing and grammar learning. FDLOOP automatically learns a mapping between input structures and a specific testing goal; such mappings allow it to generate inputs that target the goal at hand. Given a testing goal, FDLOOP iteratively selects, evolves, and learns the input distribution of goal-specific test inputs via test feedback and a probabilistic grammar. We concretize FDLOOP for four testing goals, namely unique code coverage, input-to-code complexity, program failures (exceptions), and long execution time. We evaluate FDLOOP using three well-known input formats (JSON, CSS, and JavaScript) and 20 open-source programs. Results: FDLOOP is up to 89% more effective than the baseline grammar-based test generators (i.e., random, probabilistic, and inverse-probabilistic methods), and it outperforms the closest state-of-the-art approach (EvoGfuzz) by up to 77%. In addition, we show that the main components of FDLOOP (i.e., the input mutator and the grammar mutator) contribute positively to the effectiveness of our approach. We also observed that FDLOOP is effective across varying parameter settings: the number of initial seed inputs, the number of generated inputs, and the number of input generations. Implications: Finally, our evaluation demonstrates that FDLOOP is effective for targeting a specific testing goal, i.e., revealing error-prone behaviors, generating complex inputs, or producing inputs with long execution time, and scales to multiple testing goals.
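The probabilistic-grammar half of such an approach can be sketched as weighted random expansion. The toy grammar, weights, and function names below are ours; in FDLOOP, the weights would be (re)learned each iteration from inputs that achieved the testing goal rather than being fixed.

```python
import random

# Toy probabilistic grammar: each nonterminal maps to weighted
# alternatives. The weights 0.7/0.3 etc. are invented "learned" biases.
GRAMMAR = {
    "<value>": [(["<digit>"], 0.3), (["[", "<value>", "]"], 0.7)],
    "<digit>": [(["0"], 0.5), (["1"], 0.5)],
}

def generate(symbol="<value>", depth=0, max_depth=10):
    """Expand `symbol` by weighted random choice among its alternatives."""
    if symbol not in GRAMMAR:
        return symbol                                  # terminal symbol
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:                             # force termination:
        alternatives = [min(alternatives,              # shortest alternative
                            key=lambda aw: len(aw[0]))]
    expansion = random.choices([a for a, _ in alternatives],
                               weights=[w for _, w in alternatives])[0]
    return "".join(generate(s, depth + 1, max_depth) for s in expansion)

print(generate())
```

Biasing the weights toward expansions observed in goal-achieving inputs is what steers generation toward, e.g., deeply nested or failure-triggering inputs.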

    Customized Software Environment for Remote Learning: Providing Students a Specialized Learning Experience

    No full text
    The Covid-19 pandemic has challenged educators across the world to move their teaching and mentoring from in-person to remote. During nonpandemic semesters at their institutes (e.g. universities), educators can directly provide students the software environment needed to support their learning - either in specialized computer laboratories (e.g. computational chemistry labs) or shared computer spaces. These labs are often supported by staff that maintains the operating systems (OS) and software. But how does one provide a specialized software environment for remote teaching? One solution is to provide students a customized operating system (e.g., Linux) that includes open-source software for supporting your teaching goals. However, such a solution should not require students to install the OS alongside their existing one (i.e. dual/multi-booting) or be used as a complete replacement. Such approaches are risky because of a) the students' possible lack of software expertise, b) the possible disruption of an existing software workflow that is needed in other classes or by other family members, and c) the importance of maintaining a working computer when isolated (e.g. societal restrictions). To illustrate possible solutions, we discuss our approach that used a customized Linux OS and a Docker container in a course that teaches computational chemistry and Python3

    Debugging Assumptions Artifact

    No full text
    Artifact for "Evaluating the Impact of Experimental Assumptions in Automated Fault Localization", accepted at ICSE 2023 Technical Track.

    Mining Input Grammars from Dynamic Control Flow

    Get PDF
    One of the key properties of a program is its input specification. Having a formal input specification can be critical in fields such as vulnerability analysis, reverse engineering, software testing, clone detection, or refactoring. Unfortunately, accurate input specifications for typical programs are often unavailable or out of date. In this paper, we present a general algorithm that takes a program and a small set of sample inputs and automatically infers a readable context-free grammar capturing the input language of the program. We infer the syntactic input structure only by observing accesses of input characters at different locations of the input parser. This works on all stack-based recursive-descent input parsers, including parser combinators, and works entirely without program-specific heuristics. Our Mimid prototype produced accurate and readable grammars for a variety of evaluation subjects, including complex languages such as JSON, TinyC, and JavaScript.
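The core observation, tracking which parser function reads which input characters, can be sketched as follows. The toy parser, names, and bookkeeping are ours for illustration, not the Mimid implementation.

```python
# Wrap the input so every character access is logged together with the
# parser function currently active; (function, index) pairs then map
# parse functions to grammar nonterminals and their input spans.
accesses = []   # list of (active parse function, input index)
stack = []      # call stack of parse function names

class TracedInput:
    def __init__(self, s):
        self.s = s
    def __getitem__(self, i):
        accesses.append((stack[-1], i))   # who read which character?
        return self.s[i]

def parse_list(inp, i):
    """Toy recursive-descent parser for inputs like "[1,0,1]"."""
    stack.append("parse_list")
    assert inp[i] == "["
    i += 1
    while inp[i] != "]":
        i = parse_digit(inp, i)
        if inp[i] == ",":
            i += 1
    stack.pop()
    return i + 1

def parse_digit(inp, i):
    stack.append("parse_digit")
    assert inp[i].isdigit()
    stack.pop()
    return i + 1

parse_list(TracedInput("[1,2]"), 0)
digit_positions = sorted({i for f, i in accesses if f == "parse_digit"})
print(digit_positions)  # input positions consumed by parse_digit: [1, 3]
```

From such traces one can infer that `parse_digit` governs the substrings "1" and "2" inside the span governed by `parse_list`, suggesting grammar rules like `<list> ::= "[" <digit> ("," <digit>)* "]"`; the real technique adds heuristics to resolve characters read by several nested functions.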