Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning
Code summarization generates a brief natural language description given a
source code snippet, while code retrieval fetches relevant source code given a
natural language query. Since both tasks aim to model the association between
natural language and programming language, recent studies have combined these
two tasks to improve their performance. However, researchers have not yet been able
to effectively leverage the intrinsic connection between the two tasks as they
train these tasks in a separate or pipeline manner, which means their
performance cannot be well balanced. In this paper, we propose a novel
end-to-end model for the two tasks by introducing an additional code generation
task. More specifically, we explicitly exploit the probabilistic correlation
between code summarization and code generation with dual learning, and utilize
the two encoders for code summarization and code generation to train the code
retrieval task via multi-task learning. We have carried out extensive
experiments on an existing dataset of SQL and Python, and results show that our
model can significantly improve the results of the code retrieval task over
state-of-the-art models, as well as achieve competitive performance in terms of
BLEU score for the code summarization task.
Comment: Published at The Web Conference (WWW) 2020, full paper
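The "probabilistic correlation" mentioned above refers to the fact that the joint probability of a code snippet and its summary can be factorized in either direction: P(code) P(summary | code) = P(summary) P(code | summary). A minimal sketch of how such a constraint is commonly turned into a training penalty in dual supervised learning follows. The abstract does not state the exact form used in the paper, so the squared-gap penalty and all names here are assumptions for illustration only.

```python
import math


def duality_penalty(log_p_code, log_p_summary_given_code,
                    log_p_summary, log_p_code_given_summary):
    """Squared gap between the two factorizations of log P(code, summary).

    Hypothetical helper: the penalty is zero when the summarization model
    and the generation model agree on the joint probability.
    """
    forward = log_p_code + log_p_summary_given_code    # code -> summary
    backward = log_p_summary + log_p_code_given_summary  # summary -> code
    return (forward - backward) ** 2


# Consistent toy probabilities: 0.2 * 0.5 == 0.25 * 0.4, so the gap vanishes.
consistent = duality_penalty(math.log(0.2), math.log(0.5),
                             math.log(0.25), math.log(0.4))

# Inconsistent toy probabilities: the two models disagree, so the penalty is positive.
inconsistent = duality_penalty(math.log(0.2), math.log(0.5),
                               math.log(0.25), math.log(0.8))
```

In a training loop, a term like this would typically be added to the two task losses with a weighting coefficient, which is a further assumption not taken from the abstract.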
Code Structure Guided Transformer for Source Code Summarization
Code summaries help developers comprehend programs and reduce their time to
infer the program functionalities during software maintenance. Recent efforts
resort to deep learning techniques such as sequence-to-sequence models for
generating accurate code summaries, among which Transformer-based approaches
have achieved promising performance. However, effectively integrating the code
structure information into the Transformer is under-explored in this task
domain. In this paper, we propose a novel approach named SG-Trans to
incorporate code structural properties into Transformer. Specifically, we
inject the local symbolic information (e.g., code tokens and statements) and
global syntactic structure (e.g., data flow graph) into the self-attention
module of Transformer as inductive bias. To further capture the hierarchical
characteristics of code, the local information and global structure are
distributed across the attention heads of the lower and higher layers of the
Transformer, respectively. Extensive evaluation shows the superior performance of SG-Trans
over the state-of-the-art approaches. Compared with the best-performing
baseline, SG-Trans improves the METEOR score, a metric widely used for
measuring generation quality, by 1.4% and 2.0% on two benchmark datasets,
respectively.
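As a rough illustration of injecting structure into self-attention, the sketch below restricts each token's attention to tokens in the same statement by masking the attention scores before the softmax. This is a generic masked-attention example, not the SG-Trans implementation: the statement-membership mask, the function names, and the toy inputs are all assumptions.

```python
import math


def same_statement_mask(statement_ids):
    """Hypothetical 'local' mask: token i may attend to token j only if
    both belong to the same statement."""
    return [[a == b for b in statement_ids] for a in statement_ids]


def masked_attention_weights(scores, mask):
    """Row-wise softmax over `scores`, with masked-out positions forced to 0."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:
            # No visible position: return an all-zero row instead of dividing by 0.
            weights.append([0.0] * len(row_scores))
            continue
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Five tokens: the first two form statement 0, the last three statement 1.
statement_ids = [0, 0, 1, 1, 1]
scores = [[0.0] * 5 for _ in range(5)]  # uniform raw attention scores
weights = masked_attention_weights(scores, same_statement_mask(statement_ids))
```

With uniform scores, each token spreads its attention evenly over the tokens of its own statement and gives zero weight to the rest. A "global" head could use a different mask, for example one derived from data-flow edges.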
Branch coverage prediction in automated testing
This is the peer reviewed version which has been published in final form at [DOI]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions. Software testing is crucial in continuous integration (CI). Ideally, at every commit, all the test cases should be executed and new test cases should be generated for the new source code. This is especially true in a Continuous Test Generation (CTG) environment, where the automatic generation of test cases is integrated into the continuous integration pipeline. In this context, developers want to achieve a certain minimum level of coverage for every software build. However, executing all the test cases and generating new ones for all the classes at every commit is not feasible. As a consequence, developers have to select which subset of classes has to be tested and/or targeted by test-case generation. We argue that knowing a priori the branch coverage that can be achieved with test-data generation tools can help developers make informed decisions about those issues. In this paper, we investigate the possibility of using source-code metrics to predict the coverage achieved by test-data generation tools. We use four different categories of source-code features and assess the prediction on a large data set involving more than 3,000 Java classes. We compare different machine learning algorithms and conduct a fine-grained feature analysis aimed at investigating the factors that most impact the prediction accuracy. Moreover, we extend our investigation to four different search budgets. Our evaluation shows that the best model achieves an average MAE of 0.15 and 0.21 on EVOSUITE and RANDOOP, respectively, under nested cross-validation over the different budgets. Finally, the discussion of the results demonstrates the relevance of coupling-related features for the prediction accuracy.
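For readers unfamiliar with the reported metric, the mean absolute error (MAE) is the average absolute difference between predicted and actual values. The sketch below computes it for a handful of made-up branch-coverage values. It assumes coverage is expressed as a fraction between 0 and 1, and the numbers are invented for illustration, not taken from the study.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical branch coverage per class (fraction of branches covered).
actual_coverage = [0.90, 0.40, 0.75, 0.10]
predicted_coverage = [0.80, 0.55, 0.70, 0.30]

mae = mean_absolute_error(actual_coverage, predicted_coverage)
```

Under this reading, an MAE of 0.15 would mean that predictions are off by 15 percentage points of coverage on average.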
Automatic sentence annotation for more useful bug report summarization
Bug reports are a useful software artifact, and software developers refer to them for various information needs. As bug reports can become long, their users may need to spend a lot of time reading them. Previous studies developed summarizers whose quality was assessed against human-created gold-standard summaries. We believe creating such summaries for evaluating summarizers is not a good practice. First, we have observed a high level of disagreement between the annotated summaries. Second, the number of annotators involved is lower than the established minimum for the creation of a stable annotated summary. Finally, the traditional fixed threshold of 25% of the bug report word count does not adequately serve the different information needs. Consequently, we developed an automatic sentence annotation method to identify content in bug report comments, which allows bug report users to customize a view for their task-dependent information needs.
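The fixed-threshold practice criticized above can be sketched as a simple word budget: sentences are taken in ranked order until 25% of the report's word count is used. The function name, the greedy skip-if-too-long policy, and the toy data are assumptions for illustration, and the sketch does not represent the annotation method the authors propose.

```python
def select_within_budget(ranked_sentences, total_words, fraction=0.25):
    """Greedily keep ranked sentences while staying within the word budget.

    Assumption: a sentence that would exceed the budget is skipped, and
    later (shorter) sentences may still be added.
    """
    budget = int(total_words * fraction)
    chosen, used = [], 0
    for sentence in ranked_sentences:
        length = len(sentence.split())
        if used + length > budget:
            continue
        chosen.append(sentence)
        used += length
    return chosen


# Toy example: a 20-word report gives a budget of 5 words.
ranked = ["crash on startup", "stack trace attached below", "fixed upstream"]
summary = select_within_budget(ranked, total_words=20)
```

Here the second sentence is dropped because it would push the summary past the five-word budget, regardless of how relevant it is to a given reader's task, which is the kind of rigidity the abstract objects to.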