12 research outputs found
Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands
To understand diverse natural language commands, virtual assistants today are
trained with numerous labor-intensive, manually annotated sentences. This paper
presents a methodology and the Genie toolkit that can handle new compound
commands with significantly less manual effort. We advocate formalizing the
capability of virtual assistants with a Virtual Assistant Programming Language
(VAPL) and using a neural semantic parser to translate natural language into
VAPL code. Genie needs only a small realistic set of input sentences for
validating the neural model. Developers write templates to synthesize data;
Genie uses crowdsourced paraphrases and data augmentation, along with the
synthesized data, to train a semantic parser. We also propose design principles
that make VAPL languages amenable to natural language translation. We apply
these principles to revise ThingTalk, the language used by the Almond virtual
assistant. We use Genie to build the first semantic parser that can support
compound virtual assistants commands with unquoted free-form parameters. Genie
achieves a 62% accuracy on realistic user inputs. We demonstrate Genie's
generality by showing a 19% and 31% improvement over the previous state of the
art on a music skill, aggregate functions, and access control.Comment: To appear in PLDI 201
Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit
Code intelligence leverages machine learning techniques to extract knowledge
from extensive code corpora, with the aim of developing intelligent tools to
improve the quality and productivity of computer programming. Currently, there
is already a thriving research community focusing on code intelligence, with
efforts ranging from software engineering, machine learning, data mining,
natural language processing, and programming languages. In this paper, we
conduct a comprehensive literature review on deep learning for code
intelligence, from the aspects of code representation learning, deep learning
techniques, and application tasks. We also benchmark several state-of-the-art
neural models for code intelligence, and provide an open-source toolkit
tailored for the rapid prototyping of deep-learning-based code intelligence
models. In particular, we inspect the existing code intelligence models under
the basis of code representation learning, and provide a comprehensive overview
to enhance comprehension of the present state of code intelligence.
Furthermore, we publicly release the source code and data resources to provide
the community with a ready-to-use benchmark, which can facilitate the
evaluation and comparison of existing and future code intelligence models
(https://xcodemind.github.io). At last, we also point out several challenging
and promising directions for future research