PRF: A Framework for Building Automatic Program Repair Prototypes for JVM-Based Languages
PRF is a Java-based framework that allows researchers to build prototypes of
test-based generate-and-validate automatic program repair techniques for JVM
languages by simply extending it with their patch generation plugins. The
framework also provides other useful components for constructing automatic
program repair tools, e.g., a fault localization component that provides
spectrum-based fault localization information at different levels of
granularity, a configurable and safe patch validation component that is 11+X
faster than vanilla testing, and a customizable post-processing component to
generate fix reports. A demo video of PRF is available at
https://bit.ly/3ehduSS.
Comment: Proceedings of the 28th ACM Joint European Software Engineering
Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE
'20
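
The spectrum-based fault localization mentioned in the PRF abstract can be illustrated with a minimal sketch. The abstract does not say which suspiciousness formula PRF uses, so the Ochiai metric below is an assumption, and the function name `ochiai_scores` and the coverage layout are hypothetical rather than PRF's API.

```python
import math
from typing import Dict, Set


def ochiai_scores(coverage: Dict[str, Set[str]], failing: Set[str]) -> Dict[str, float]:
    """Score program elements by suspiciousness using the Ochiai formula.

    coverage maps each test name to the set of elements (e.g. lines) it executes.
    failing is the subset of test names in coverage that failed.
    Ochiai(e) = ef / sqrt(total_failed * (ef + ep)), where ef and ep count the
    failing and passing tests that execute element e.
    """
    total_failed = len(failing)
    elements: Set[str] = set()
    for covered in coverage.values():
        elements |= covered

    scores: Dict[str, float] = {}
    for elem in elements:
        ef = sum(1 for t, cov in coverage.items() if elem in cov and t in failing)
        ep = sum(1 for t, cov in coverage.items() if elem in cov and t not in failing)
        denom = math.sqrt(total_failed * (ef + ep))
        # An element never executed by a failing test gets score 0.
        scores[elem] = ef / denom if denom > 0 else 0.0
    return scores


if __name__ == "__main__":
    # Hypothetical spectrum: t1 passes, t2 fails.
    spectrum = {"t1": {"L1", "L2"}, "t2": {"L1", "L3"}}
    ranked = sorted(ochiai_scores(spectrum, {"t2"}).items(), key=lambda kv: -kv[1])
    for element, score in ranked:
        print(element, round(score, 3))
```

The same computation applies at any granularity (statement, method, class) by changing what the element identifiers denote.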
How software engineering research aligns with design science: A review
Background: Assessing and communicating software engineering research can be
challenging. Design science is recognized as an appropriate research paradigm
for applied research but is seldom referred to in software engineering.
Applying the design science lens to software engineering research may improve
the assessment and communication of research contributions. Aim: The aim of
this study is 1) to understand whether the design science lens helps summarize
and assess software engineering research contributions, and 2) to characterize
different types of design science contributions in the software engineering
literature. Method: In previous research, we developed a visual abstract
template, summarizing the core constructs of the design science paradigm. In
this study, we use this template in a review of a set of 38 top software
engineering publications to extract and analyze their design science
contributions. Results: We identified five clusters of papers, classifying them
according to their alignment with the design science paradigm. Conclusions: The
design science lens helps emphasize the theoretical contribution of research
output---in terms of technological rules---and reflect on the practical
relevance, novelty, and rigor of the rules proposed by the research.
Comment: 32 pages, 10 figure
CONTRACTFIX: A Framework for Automatically Fixing Vulnerabilities in Smart Contracts
The increased adoption of smart contracts in many industries has made them an
attractive target for cybercriminals, leading to millions of dollars in losses.
Thus, deploying smart contracts with detected vulnerabilities (known to
developers) is not acceptable, and all detected vulnerabilities need to be
fixed, which incurs a high manual labor cost without effective tool support. To
fill this need, in this paper, we propose ContractFix, a novel framework that
automatically generates security patches for vulnerable smart contracts.
ContractFix is a general framework that can incorporate different fix patterns
for different types of vulnerabilities. Users can use it as a security fix-it
tool that automatically applies patches and verifies the patched contracts
before deploying the contracts. To address the unique challenges in fixing
smart contract vulnerabilities, given an input smart contract, ContractFix conducts
our proposed ensemble identification based on multiple static verification
tools to identify vulnerabilities that are amenable to automatic fixing. Then,
ContractFix generates patches using template-based fix patterns and conducts
program analysis (program dependency computation and pointer analysis) for
smart contracts to accurately infer and populate the parameter values for the
fix patterns. Finally, ContractFix performs static verification that guarantees
the patched contract is free of vulnerabilities. Our evaluations on real
vulnerable contracts demonstrate that ContractFix can successfully fix of the
detected vulnerabilities ( out of ) and preserve the expected
behaviors of the smart contracts
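
The ensemble identification step described in the ContractFix abstract can be sketched as follows. The abstract does not state how the reports of the individual verification tools are combined, so the vote threshold used here is an assumption, and the names `ensemble_findings`, `Finding`, and `min_votes` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical finding shape: (vulnerability type, source line).
Finding = Tuple[str, int]


def ensemble_findings(reports: Dict[str, Set[Finding]], min_votes: int = 2) -> List[Finding]:
    """Keep the findings reported by at least min_votes tools.

    reports maps a tool name to the set of findings that tool produced.
    Agreement between tools is used as a proxy for confidence; this voting
    rule is an illustrative assumption, not the paper's stated method.
    """
    votes = Counter(f for findings in reports.values() for f in findings)
    return sorted(f for f, count in votes.items() if count >= min_votes)


if __name__ == "__main__":
    # Made-up tool names and findings, for illustration only.
    reports = {
        "tool_a": {("reentrancy", 42), ("unchecked-call", 7)},
        "tool_b": {("reentrancy", 42)},
        "tool_c": {("reentrancy", 42), ("overflow", 13)},
    }
    print(ensemble_findings(reports))
```

Raising `min_votes` trades recall for precision: fewer findings survive, but each is backed by more tools.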
Frustrated with Code Quality Issues? LLMs can Help!
As software projects progress, quality of code assumes paramount importance
as it affects reliability, maintainability and security of software. For this
reason, static analysis tools are used in developer workflows to flag code
quality issues. However, developers need to spend extra effort to revise their
code to improve code quality based on the tool findings. In this work, we
investigate the use of (instruction-following) large language models (LLMs) to
assist developers in revising code to resolve code quality issues. We present a
tool, CORE (short for COde REvisions), architected using a pair of LLMs
organized as a duo comprised of a proposer and a ranker. Providers of static
analysis tools recommend ways to mitigate the tool warnings and developers
follow them to revise their code. The proposer LLM of CORE takes the
same set of recommendations and applies them to generate candidate code
revisions. The candidates which pass the static quality checks are retained.
However, the LLM may introduce subtle, unintended functionality changes which
may go undetected by the static analysis. The ranker LLM evaluates the
changes made by the proposer using a rubric that closely follows the acceptance
criteria that a developer would enforce. CORE uses the scores assigned by the
ranker LLM to rank the candidate revisions before presenting them to the
developer. CORE could revise 59.2% of Python files (across 52 quality checks) so
that they pass scrutiny by both a tool and a human reviewer. The ranker LLM is
able to reduce false positives by 25.8% in these cases. CORE produced revisions
that passed the static analysis tool in 76.8% of Java files (across 10 quality
checks), comparable to the 78.3% of a specialized program repair tool, with
significantly less engineering effort
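
The proposer-ranker flow described in the CORE abstract (generate candidates, retain those that pass the static checks, rank the rest) can be sketched as below. The function `revise` and its parameters are hypothetical stand-ins: in CORE the proposer and ranker are LLMs and the check is a static analysis tool, whereas here they are plain callables supplied by the caller.

```python
from typing import Callable, List


def revise(
    source: str,
    propose: Callable[[str, int], List[str]],
    passes_check: Callable[[str], bool],
    score: Callable[[str], float],
    n_candidates: int = 3,
) -> List[str]:
    """Return candidate revisions that pass the check, best-scored first.

    propose      stand-in for the proposer: returns up to n candidate revisions.
    passes_check stand-in for the static quality check on a candidate.
    score        stand-in for the ranker: higher means more acceptable.
    """
    candidates = propose(source, n_candidates)
    retained = [c for c in candidates if passes_check(c)]
    return sorted(retained, key=score, reverse=True)


if __name__ == "__main__":
    # Toy stand-ins: the "check" rejects comparisons written as `== None`.
    def toy_propose(src: str, n: int) -> List[str]:
        return ["a is None", "a == None", "not a"][:n]

    def toy_check(candidate: str) -> bool:
        return "== None" not in candidate

    def toy_score(candidate: str) -> float:
        return {"a is None": 0.9, "not a": 0.4}.get(candidate, 0.0)

    print(revise("a == None", toy_propose, toy_check, toy_score))
```

Filtering before ranking mirrors the order in the abstract: the ranker only scores candidates that already satisfy the static check, so its job is limited to catching unintended behavior changes.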
StaticFixer: From Static Analysis to Static Repair
Static analysis tools are traditionally used to detect and flag programs that
violate properties. We show that static analysis tools can also be used to
perturb programs that satisfy a property to construct variants that violate the
property. Using this insight we can construct paired data sets of unsafe-safe
program pairs, and learn strategies to automatically repair property
violations. We present a system called StaticFixer, which automatically repairs
information flow vulnerabilities using this approach. Since information flow
properties are non-local (both to check and repair), StaticFixer also introduces a
novel domain-specific language (DSL) and strategy learning algorithms for
synthesizing non-local repairs. We use StaticFixer to synthesize strategies for
repairing two types of information flow vulnerabilities, unvalidated dynamic
calls and cross-site scripting, and show that StaticFixer successfully repairs
several hundred vulnerabilities from open source JavaScript repositories,
outperforming neural baselines built using CodeT5 and Codex. Our
datasets can be downloaded from http://aka.ms/StaticFixer
Memory and resource leak defects and their repairs in Java projects
Despite huge software engineering efforts and programming language support,
resource and memory leaks are still a troublesome issue, even in memory-managed
languages such as Java. Understanding the properties of leak-inducing defects,
how the leaks manifest, and how they are repaired is an essential prerequisite
for designing better approaches for avoidance, diagnosis, and repair of
leak-related bugs.
We conduct a detailed empirical study on 491 issues from 15 large open-source
Java projects. The study proposes taxonomies for the leak types, for the
defects causing them, and for the repair actions. We investigate, under several
aspects, the distributions within each taxonomy and the relationships between
them. We find that manual code inspection and manual runtime detection are
still the main methods for leak detection. We find that most of the errors
manifest on error-free execution paths, and developers repair the leak defects
in a shorter time than non-leak defects. We also identify 13 recurring code
transformations in the repair patches. Based on our findings, we draw a variety
of implications on how developers can avoid, detect, isolate and repair
leak-related bugs
A Survey on Automated Program Repair Techniques
With the rapid development and widespread adoption of software,
modern society increasingly relies on software systems. However, the problems
exposed by software have also come to the fore. Software defects have become an
important factor troubling developers. In this context, Automated Program
Repair (APR) techniques have emerged, aiming to automatically fix software
defects and reduce manual debugging work. In particular, benefiting
from the advances in deep learning, numerous learning-based APR techniques have
emerged in recent years, which also bring new opportunities for APR research.
To give researchers a quick overview of APR techniques' complete development
and future opportunities, we revisit the evolution of APR techniques and
discuss in depth the latest advances in APR research. In this paper, the
development of APR techniques is introduced in terms of four different patch
generation schemes: search-based, constraint-based, template-based, and
learning-based. Moreover, we propose a uniform set of criteria to review and
compare each APR tool, summarize the advantages and disadvantages of APR
techniques, and discuss the current state of APR development. Furthermore, we
introduce the research on the related technical areas of APR that have also
provided a strong motivation to advance APR development. Finally, we analyze
current challenges and future directions, especially highlighting the critical
opportunities that large language models bring to APR research.
Comment: This paper's earlier version was submitted to CSUR in August 202