12,698 research outputs found
Towards Automated Performance Bug Identification in Python
Context: Software performance is a critical non-functional requirement,
appearing in many fields such as mission critical applications, financial, and
real time systems. In this work we focused on early detection of performance
bugs; our software under study was a real time system used in the
advertisement/marketing domain.
Goal: Find a simple and easy to implement solution, predicting performance
bugs.
Method: We built several models using four machine learning methods, commonly
used for defect prediction: C4.5 Decision Trees, Na\"{\i}ve Bayes, Bayesian
Networks, and Logistic Regression.
Results: Our empirical results show that a C4.5 model, using lines of code
changed, file's age and size as explanatory variables, can be used to predict
performance bugs (recall=0.73, accuracy=0.85, and precision=0.96). We show that
reducing the number of changes delivered on a commit, can decrease the chance
of performance bug injection.
Conclusions: We believe that our approach can help practitioners to eliminate
performance bugs early in the development cycle. Our results are also of
interest to theoreticians, establishing a link between functional bugs and
(non-functional) performance bugs, and explicitly showing that attributes used
for prediction of functional bugs can be used for prediction of performance
bugs
FixMiner: Mining Relevant Fix Patterns for Automated Program Repair
Patching is a common activity in software development. It is generally
performed on a source code base to address bugs or add new functionalities. In
this context, given the recurrence of bugs across projects, the associated
similar patches can be leveraged to extract generic fix actions. While the
literature includes various approaches leveraging similarity among patches to
guide program repair, these approaches often do not yield fix patterns that are
tractable and reusable as actionable input to APR systems. In this paper, we
propose a systematic and automated approach to mining relevant and actionable
fix patterns based on an iterative clustering strategy applied to atomic
changes within patches. The goal of FixMiner is thus to infer separate and
reusable fix patterns that can be leveraged in other patch generation systems.
Our technique, FixMiner, leverages Rich Edit Script which is a specialized tree
structure of the edit scripts that captures the AST-level context of the code
changes. FixMiner uses different tree representations of Rich Edit Scripts for
each round of clustering to identify similar changes. These are abstract syntax
trees, edit actions trees, and code context trees. We have evaluated FixMiner
on thousands of software patches collected from open source projects.
Preliminary results show that we are able to mine accurate patterns,
efficiently exploiting change information in Rich Edit Scripts. We further
integrated the mined patterns to an automated program repair prototype,
PARFixMiner, with which we are able to correctly fix 26 bugs of the Defects4J
benchmark. Beyond this quantitative performance, we show that the mined fix
patterns are sufficiently relevant to produce patches with a high probability
of correctness: 81% of PARFixMiner's generated plausible patches are correct.Comment: 31 pages, 11 figure
VFCFinder: Seamlessly Pairing Security Advisories and Patches
Security advisories are the primary channel of communication for discovered
vulnerabilities in open-source software, but they often lack crucial
information. Specifically, 63% of vulnerability database reports are missing
their patch links, also referred to as vulnerability fixing commits (VFCs).
This paper introduces VFCFinder, a tool that generates the top-five ranked set
of VFCs for a given security advisory using Natural Language Programming
Language (NL-PL) models. VFCFinder yields a 96.6% recall for finding the
correct VFC within the Top-5 commits, and an 80.0% recall for the Top-1 ranked
commit. VFCFinder generalizes to nine different programming languages and
outperforms state-of-the-art approaches by 36 percentage points in terms of
Top-1 recall. As a practical contribution, we used VFCFinder to backfill over
300 missing VFCs in the GitHub Security Advisory (GHSA) database. All of the
VFCs were accepted and merged into the GHSA database. In addition to
demonstrating a practical pairing of security advisories to VFCs, our general
open-source implementation will allow vulnerability database maintainers to
drastically improve data quality, supporting efforts to secure the software
supply chain
- …