Increasing Code Completion Accuracy in Pythia Models for Non-Standard Python Libraries
Contemporary software development with modern programming languages leverages Integrated Development Environments, smart text editors, and similar tooling with code completion capabilities to increase the efficiency of software developers. Recent code completion research has shown that combining natural language processing with recurrent neural networks configured with long short-term memory can improve the accuracy of code completion predictions over prior models. It is well known that the accuracy of predictive systems trained on data is correlated with the quality and quantity of the training data. This dissertation demonstrates that expanding the training data set to include more references to specific Python third-party modules improves the quality of the predictions for those modules without degrading the quality of predictions for the originally represented modules.
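The dissertation's data-expansion idea can be illustrated with a minimal sketch: selecting source files that reference specific third-party modules, so they can be added to the training corpus. The function name and module set are illustrative, not from the dissertation.

```python
import ast

def imports_any(source, modules):
    """Return True if the Python source imports any of the given top-level modules."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import numpy.linalg` counts as a reference to `numpy`
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return bool(found & set(modules))

# A corpus builder could keep only files referencing the target libraries:
print(imports_any("import numpy as np\nx = np.ones(3)", {"numpy", "pandas"}))  # True
print(imports_any("print(1)", {"numpy", "pandas"}))                            # False
```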
Pythia: AI-assisted Code Completion System
In this paper, we propose a novel end-to-end approach for AI-assisted code
completion called Pythia. It generates ranked lists of method and API
recommendations which can be used by software developers at edit time. The
system is currently deployed as part of the IntelliCode extension in the Visual
Studio Code IDE. Pythia exploits state-of-the-art large-scale deep learning models
trained on code contexts extracted from abstract syntax trees. It is designed
to work at high throughput, predicting the best matching code completions on
the order of 100 ms.
We describe the architecture of the system, perform comparisons to a
frequency-based approach and an invocation-based Markov Chain language model, and
discuss challenges serving Pythia models on lightweight client devices.
The offline evaluation results obtained on 2700 Python open-source GitHub
repositories show a top-5 accuracy of 92\%, surpassing the baseline models by
20\% averaged over classes, for both intra- and cross-project settings.
Comment: Published in Proceedings of the 25th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining (KDD '19).
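The top-5 accuracy reported above has a simple operational meaning: a prediction counts as correct when the true completion appears among the first five ranked candidates. A minimal sketch, with illustrative candidate lists:

```python
def top_k_accuracy(ranked_predictions, targets, k=5):
    """Fraction of cases where the true token is among the top-k ranked candidates.

    ranked_predictions: list of candidate lists, best-first.
    targets: the ground-truth completion for each case.
    """
    hits = sum(1 for ranked, true in zip(ranked_predictions, targets)
               if true in ranked[:k])
    return hits / len(targets)

# Two toy completion sites: a list method call and a dict method call.
preds = [["append", "extend", "insert", "pop", "sort"],
         ["items", "keys", "values", "get", "update"]]
gold = ["append", "get"]
print(top_k_accuracy(preds, gold, k=5))  # 1.0 — both true tokens are in the top 5
```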
Code Prediction by Feeding Trees to Transformers
We advance the state-of-the-art in the accuracy of code prediction (next
token prediction) used in autocomplete systems. First, we report that using the
recently proposed Transformer architecture even out-of-the-box outperforms
previous neural and non-neural systems for code prediction. We then show that
by making the Transformer architecture aware of the syntactic structure of
code, we further increase the margin by which a Transformer-based system
outperforms previous systems. With this, it outperforms the accuracy of an
RNN-based system (similar to Hellendoorn et al., 2018) by 18.3\%, the Deep3
system (Raychev et al., 2016) by 14.1\%, and an adaptation of Code2Seq (Alon et
al., 2018) for code prediction by 14.4\%.
We present in the paper several ways of communicating the code structure to
the Transformer, which is fundamentally built for processing sequence data. We
provide a comprehensive experimental evaluation of our proposal, along with
alternative design choices, on a standard Python dataset, as well as on a
Facebook internal Python corpus. Our code and data preparation pipeline will be
made available as open source.
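One simple way to communicate tree structure to a sequence model, in the spirit of the abstract above, is to linearize the AST into a depth-first token sequence whose brackets encode parent-child structure. This is a minimal illustration of the general idea, not the paper's exact encoding:

```python
import ast

def linearize(node):
    """Depth-first traversal of a Python AST, emitting node-type tokens
    with '(' / ')' markers so the tree shape is recoverable from the sequence."""
    name = type(node).__name__
    children = list(ast.iter_child_nodes(node))
    if not children:
        return [name]
    tokens = [name, "("]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens

# The resulting token stream can be fed to any sequence model, e.g. a Transformer.
print(linearize(ast.parse("x = f(1)")))
```

A sequence model trained on such streams sees the syntax explicitly, rather than having to infer it from raw source tokens.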
Introduction of an Assistance System to Support Domain Experts in Programming Low-code to Leverage Industry 5.0
The rapid technological leaps of Industry 4.0 increase the pressure and
demands on humans working in automation, which is one of the main motivators of
Industry 5.0. In particular, automation software development for mechatronic
systems becomes increasingly challenging, as both domain knowledge and
programming skills are required for high-quality, maintainable software.
Especially for small companies in automation and robotics without dedicated
software engineering departments, domain-specific low-code platforms become
indispensable: they enable domain experts to develop code intuitively using
visual programming languages, e.g., for tasks such as retrofitting mobile
machines. However, for extensive functionalities, visual programs may become
overwhelming due to the scaling-up problem. In addition, the ever-shortening
time-to-market increases the time pressure on programmers. Thus, an assistance
system concept is introduced that can be implemented by low-code platform
suppliers based on combining data mining and static code analysis. Domain
experts are supported in developing low-code by targeted recommendations,
metric-based complexity measurement, and reducing complexity by encapsulating
functionalities. The concept is implemented for the industrial low-code
platform HAWE eDesign to program hydraulic components in mobile machines, and
its benefits are confirmed in a user study and an industrial expert workshop.
Comment: 8 pages, https://ieeexplore.ieee.org/abstract/document/983945
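The "metric-based complexity measurement" mentioned above can be illustrated with a classic example: McCabe-style cyclomatic complexity computed over a program's block graph. The graph format and block names are illustrative, not the paper's actual metric or platform model:

```python
def cyclomatic_complexity(edges, nodes):
    """McCabe's metric M = E - N + 2 for a single connected program graph.
    Higher values indicate more independent paths, i.e. more complex programs."""
    return len(edges) - len(nodes) + 2

# A toy visual program: one branch deciding whether to open or close a valve.
nodes = ["start", "check_pressure", "open_valve", "close_valve", "end"]
edges = [("start", "check_pressure"),
         ("check_pressure", "open_valve"),
         ("check_pressure", "close_valve"),
         ("open_valve", "end"),
         ("close_valve", "end")]
print(cyclomatic_complexity(edges, nodes))  # 2 — one decision point
```

An assistance system could surface such a score and suggest encapsulating a sub-graph once it exceeds a threshold.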
Code Completion by Modeling Flattened Abstract Syntax Trees as Graphs
Code completion has become an essential component of integrated development
environments. Contemporary code completion methods rely on the abstract syntax
tree (AST) to generate syntactically correct code. However, they cannot fully
capture the sequential and repetitive patterns of writing code and the
structural information of the AST. To alleviate these problems, we propose a
new code completion approach named CCAG, which models the flattened sequence of
a partial AST as an AST graph. CCAG uses our proposed AST Graph Attention Block
to capture different dependencies in the AST graph for representation learning
in code completion. The sub-tasks of code completion are optimized via
multi-task learning in CCAG, and the task balance is automatically achieved
using uncertainty without the need to tune task weights. The experimental
results show that CCAG outperforms state-of-the-art approaches and is able to
provide intelligent code completion.
Comment: Accepted at AAAI 2021. This version contains the appendix for the
derivation of Eq. 1.
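The abstract's "task balance ... using uncertainty" suggests homoscedastic-uncertainty weighting in the style of Kendall et al. (2018), where each task loss is scaled by a learned log-variance instead of a hand-tuned weight. A minimal sketch under that assumption; whether CCAG uses exactly this form is not stated in the abstract:

```python
import math

def weighted_total_loss(task_losses, log_vars):
    """Combine per-task losses L_i with learned log-variances s_i:
    total = sum(exp(-s_i) * L_i + s_i).
    Large s_i down-weights a noisy task; the +s_i term penalizes
    inflating s_i arbitrarily."""
    return sum(math.exp(-s) * L + s for L, s in zip(task_losses, log_vars))

# e.g. two code-completion sub-tasks: node-type and node-value prediction.
# With s_i = 0 the losses are simply summed.
print(weighted_total_loss([2.0, 0.5], [0.0, 0.0]))  # 2.5
```

In training, the `log_vars` would be optimized jointly with the model parameters, so the balance between sub-tasks emerges from the data.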