Towards Automatic Generation of Short Summaries of Commits
Committing to a version control system means submitting a software change to
the system. Each commit can have a message to describe the submission. Several
approaches have been proposed to automatically generate the content of such
messages. However, the quality of the automatically generated messages falls
far short of what humans write. In studying the differences between
auto-generated and human-written messages, we found that 82% of the
human-written messages have only one sentence, while the automatically
generated messages often have multiple lines. Furthermore, we found that the
commit messages often begin with a verb followed by a direct object. This
finding inspired us to use a "verb+object" format in this paper to generate
short commit summaries. We split the approach into two parts: verb generation
and object generation. As a first step, we trained a classifier to map a
diff to a verb. We are seeking feedback from the community before we continue
to work on generating direct objects for the commits.
Comment: 4 pages, accepted in ICPC 2017 ERA Trac
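The verb-generation step described in this abstract can be illustrated with a small sketch. The abstract does not say which classifier or features the authors used, so everything here is an assumption made for illustration: the multinomial naive Bayes model, the tokenization of changed lines, the names `tokenize` and `VerbClassifier`, and the toy training diffs are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(diff):
    """Return lowercased word tokens from the changed lines of a unified diff."""
    tokens = []
    for line in diff.splitlines():
        # Keep only added/removed lines; skip the '+++'/'---' file headers.
        if line.startswith(("+++", "---")) or not line.startswith(("+", "-")):
            continue
        # Marker token recording whether the line was added or removed.
        tokens.append("<add>" if line[0] == "+" else "<del>")
        tokens.extend(re.findall(r"[a-z_]+", line[1:].lower()))
    return tokens


class VerbClassifier:
    """Multinomial naive Bayes with add-one smoothing (an illustrative choice,
    not the model from the paper)."""

    def fit(self, diffs, verbs):
        self.class_counts = Counter(verbs)
        self.word_counts = defaultdict(Counter)
        for diff, verb in zip(diffs, verbs):
            self.word_counts[verb].update(tokenize(diff))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, diff):
        tokens = tokenize(diff)
        total = sum(self.class_counts.values())

        def score(verb):
            counts = self.word_counts[verb]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.class_counts[verb] / total)
            for token in tokens:
                log_prob += math.log((counts[token] + 1) / denom)
            return log_prob

        return max(self.class_counts, key=score)


# Toy, made-up training data: (diff, leading verb of the commit message).
train = [
    ("+def parse(x):\n+    return int(x)", "add"),
    ("+import logging\n+logging.info('start')", "add"),
    ("-def old_helper():\n-    pass", "remove"),
    ("-import unused_module", "remove"),
]
clf = VerbClassifier().fit([d for d, _ in train], [v for _, v in train])
print(clf.predict("+def render(y):\n+    return str(y)"))
```

A real system would need far more training data and a richer verb set; the sketch only shows the shape of the "diff in, verb out" classification task.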
Explainable Software Bot Contributions: Case Study of Automated Bug Fixes
In a software project, especially in open source, a contribution is a valuable
piece of work made to the project: writing code, reporting bugs, translating,
improving documentation, creating graphics, etc. We are now at the beginning of
an exciting era where software bots will make contributions that are of a
similar nature to those made by humans. Dry contributions, with no explanation,
are often ignored or rejected because the contribution is not understandable
per se, because it is not put into a larger context, or because it is not
grounded on idioms shared by the core community of developers. We have been operating a
program repair bot called Repairnator for 2 years and noticed the problem of
"dry patches": a patch that does not say which bug it fixes, or that does not
explain the effects of the patch on the system. We envision program repair
systems that produce an "explainable bug fix": an integrated package of at
least 1) a patch, 2) its explanation in natural or controlled language, and 3)
a highlight of the behavioral difference with examples. In this paper, we
generalize and suggest that software bot contributions must be explainable and
that they must be put into the context of the global software development
conversation.
Untangling Fine-Grained Code Changes
After working for some time, developers commit their code changes to a
version control system. When doing so, they often bundle unrelated changes
(e.g., bug fix and refactoring) in a single commit, thus creating a so-called
tangled commit. Sharing tangled commits is problematic because it makes review,
reversion, and integration of these commits harder and historical analyses of
the project less reliable. Researchers have worked at untangling existing
commits, i.e., finding which part of a commit relates to which task. In this
paper, we contribute to this line of work in two ways: (1) A publicly available
dataset of untangled code changes, created with the help of two developers who
accurately split their code changes into self-contained tasks over a period of
four months; (2) a novel approach, EpiceaUntangler, to help developers share
untangled commits (a.k.a. atomic commits) by using fine-grained code change
information. EpiceaUntangler is based and tested on the publicly available
dataset, and further evaluated by deploying it to 7 developers, who used it for
2 weeks. We recorded a median success rate of 91% and an average of 75% in
automatically creating clusters of untangled fine-grained code changes.
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes
We propose a model to automatically describe changes introduced in the source
code of a program using natural language. Our method receives as input a set of
code commits, each containing both the modifications and the message introduced
by a user. These two modalities are used to train an encoder-decoder
architecture. We evaluated our approach on twelve real-world open-source
projects from four different programming languages. Quantitative and
qualitative results showed that the proposed approach can generate feasible and
semantically sound descriptions not only in standard in-project settings, but
also in a cross-project setting.
Comment: Accepted at ACL 201