Search CORE

5 research outputs found

Code Prediction by Feeding Trees to Transformers

Author: Chandra Satish
Kim Seohyun
Tian Yuchi
Zhao Jinman
Publication venue
Publication date: 02/07/2020
Field of study

We advance the state-of-the-art in the accuracy of code prediction (next token prediction) used in autocomplete systems. First, we report that using the recently proposed Transformer architecture even out-of-the-box outperforms previous neural and non-neural systems for code prediction. We then show that by making the Transformer architecture aware of the syntactic structure of code, we further increase the margin by which a Transformer-based system outperforms previous systems. With this, it outperforms the accuracy of an RNN-based system (similar to Hellendoorn et al. 2018) by 18.3\%, the Deep3 system (Raychev et al 2016) by 14.1\%, and an adaptation of Code2Seq (Alon et al., 2018) for code prediction by 14.4\%. We present in the paper several ways of communicating the code structure to the Transformer, which is fundamentally built for processing sequence data. We provide a comprehensive experimental evaluation of our proposal, along with alternative design choices, on a standard Python dataset, as well as on a Facebook internal Python corpus. Our code and data preparation pipeline will be available in open source

arXiv.org e-Print Archive

Generating Effective Sentence Representations: Deep Learning and Reinforcement Learning Approaches

Author: Ahmed Mahtab
Publication venue: Scholarship@Western
Publication date: 28/04/2021
Field of study

Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Many Natural Language applications are powered by machine learning models performing a large variety of underlying tasks. Recently, deep learning approaches have obtained very high performance across many NLP tasks. In order to achieve this high level of performance, it is crucial for computers to have an appropriate representation of sentences. The tasks addressed in the thesis are best approached having shallow semantic representations. These representations are vectors that are then embedded in a semantic space. We present a variety of novel approaches in deep learning applied to NLP for generating effective sentence representations in this space. These semantic representations can either be general or task-specific. We focus on learning task-specific sentence representations, where often these tasks have a good amount of overlap. We design a set of general purpose and task specific sentence encoders combining both word-level semantic knowledge and word- and sentence-level syntactic information. As a method for the former, we perform an intelligent amalgamation of word vectors using modern deep learning modules. For the latter, we use word-level knowledge, such as parts of speech, spelling, and suffix features, and sentence-level information drawn from natural language parse trees which provide the hierarchical structure of a sentence together with grammatical relations between the words. Further expertise is added with reinforcement learning which guides a machine learning model through a reward-penalty game. Rather than just striving for good performance, we always try to design models that are more transparent and explainable. We provide an intuitive explanation about the design of each model and how the model is making a decision. Our extensive experiments show that these models achieve competitive performance compared with the currently available state-of-the-art generalized and task-specific sentence encoders. All but one of the tasks dealt with English language texts. The multilingual semantic similarity task required creating a multilingual corpus for which we provide a novel semi-supervised approach to make artificial negative samples in the presence of just positive samples

Scholarship@Western