Code Prediction by Feeding Trees to Transformers
We advance the state-of-the-art in the accuracy of code prediction (next
token prediction) used in autocomplete systems. First, we report that using the
recently proposed Transformer architecture even out-of-the-box outperforms
previous neural and non-neural systems for code prediction. We then show that
by making the Transformer architecture aware of the syntactic structure of
code, we further increase the margin by which a Transformer-based system
outperforms previous systems. With this, it outperforms the accuracy of an
RNN-based system (similar to Hellendoorn et al., 2018) by 18.3%, the Deep3
system (Raychev et al., 2016) by 14.1%, and an adaptation of Code2Seq (Alon et
al., 2018) for code prediction by 14.4%.
We present in the paper several ways of communicating the code structure to
the Transformer, which is fundamentally built for processing sequence data. We
provide a comprehensive experimental evaluation of our proposal, along with
alternative design choices, on a standard Python dataset, as well as on a
Facebook internal Python corpus. Our code and data preparation pipeline will be
made available as open source.
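One simple way to communicate syntactic structure to a sequence model, in the spirit of the approach described above, is to linearize the AST into a token sequence via a pre-order traversal. The sketch below (the `flatten_ast` helper is hypothetical, not from the paper, and the paper's actual encodings are more elaborate) illustrates the basic idea using Python's standard `ast` module:

```python
import ast

def flatten_ast(source):
    """Serialize a Python AST into a flat token sequence (pre-order).

    A minimal sketch of one way to expose syntactic structure to a
    sequence model such as a Transformer; the paper explores several
    richer encodings.
    """
    tree = ast.parse(source)
    tokens = []

    def visit(node):
        # Emit the node type; for leaves, also emit the concrete value.
        tokens.append(type(node).__name__)
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return tokens

tokens = flatten_ast("x = 1 + 2")
# Pre-order walk: Module, Assign, Name, 'x', Store, BinOp, ...
```

A code-prediction model would then be trained to predict the next token of such a sequence, so that syntactic context (e.g. being inside a `BinOp`) is visible to the model rather than only the raw source tokens.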
Code Completion by Modeling Flattened Abstract Syntax Trees as Graphs
Code completion has become an essential component of integrated development
environments. Contemporary code completion methods rely on the abstract syntax
tree (AST) to generate syntactically correct code. However, they cannot fully
capture the sequential and repetitive patterns of writing code and the
structural information of the AST. To alleviate these problems, we propose a
new code completion approach named CCAG, which models the flattened sequence of
a partial AST as an AST graph. CCAG uses our proposed AST Graph Attention Block
to capture different dependencies in the AST graph for representation learning
in code completion. The sub-tasks of code completion are optimized via
multi-task learning in CCAG, and the task balance is automatically achieved
using uncertainty without the need to tune task weights. The experimental
results show that CCAG outperforms state-of-the-art approaches and can
provide intelligent code completion.
Comment: Accepted at AAAI 2021. This version contains the appendix for the
derivation of Eq. 1.
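The core idea of modeling a flattened AST as a graph can be sketched as follows: nodes are the AST nodes in flattened (pre-order) order, and edges capture both structure (parent-child) and sequence (adjacency in the flattened order). The `ast_to_graph` helper and the two edge types below are illustrative assumptions, not CCAG's exact construction:

```python
import ast

def ast_to_graph(source):
    """Build a toy graph over a flattened Python AST.

    A hedged sketch loosely following the idea described above:
    "child" edges encode the tree structure, "next" edges encode the
    sequential order of the flattened AST. CCAG's actual graph and
    its attention mechanism differ in detail.
    """
    tree = ast.parse(source)
    nodes = []            # node-type labels, in pre-order
    edges = []            # (src_index, dst_index, edge_type)

    def visit(node, parent_idx):
        idx = len(nodes)
        nodes.append(type(node).__name__)
        if parent_idx is not None:
            edges.append((parent_idx, idx, "child"))  # structural edge
        for child in ast.iter_child_nodes(node):
            visit(child, idx)

    visit(tree, None)
    # Sequential edges over the flattened node order.
    for i in range(len(nodes) - 1):
        edges.append((i, i + 1, "next"))
    return nodes, edges

nodes, edges = ast_to_graph("x = 1")
```

A graph neural layer (in CCAG, the proposed AST Graph Attention Block) can then aggregate information along both edge types, so the representation of each node reflects its syntactic parent as well as its neighbors in writing order.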