51 research outputs found
Relaxations for inference in restricted Boltzmann machines
We propose a relaxation-based approximate inference algorithm that samples
near-MAP configurations of a binary pairwise Markov random field. We experiment
on MAP inference tasks in several restricted Boltzmann machines. We also use
our underlying sampler to estimate the log-partition function of restricted
Boltzmann machines and compare against other sampling-based methods.Comment: ICLR 2014 workshop track submissio
Naturalizing a Programming Language via Interactive Learning
Our goal is to create a convenient natural language interface for performing
well-specified but complex actions such as analyzing data, manipulating text,
and querying databases. However, existing natural language interfaces for such
tasks are quite primitive compared to the power one wields with a programming
language. To bridge this gap, we start with a core programming language and
allow users to "naturalize" the core language incrementally by defining
alternative, more natural syntax and increasingly complex concepts in terms of
compositions of simpler ones. In a voxel world, we show that a community of
users can simultaneously teach a common system a diverse language and use it to
build hundreds of complex voxel structures. Over the course of three days,
these users went from using only the core language to using the naturalized
language in 85.9% of the last 10K utterances. Comment: 10 pages, ACL 2017
Simple Recurrent Units for Highly Parallelizable Recurrence
Common recurrent neural architectures scale poorly due to the intrinsic
difficulty in parallelizing their state computations. In this work, we propose
the Simple Recurrent Unit (SRU), a light recurrent unit that balances model
capacity and scalability. SRU is designed to provide expressive recurrence,
enable highly parallelized implementation, and comes with careful
initialization to facilitate training of deep models. We demonstrate the
effectiveness of SRU on multiple NLP tasks. SRU achieves 5--9x speed-up over
cuDNN-optimized LSTM on classification and question answering datasets, and
delivers stronger results than LSTM and convolutional models. We also obtain an
average of 0.7 BLEU improvement over the Transformer model on translation by
incorporating SRU into the architecture. Comment: EMNLP
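The parallelization idea behind SRU can be illustrated with a minimal NumPy sketch: the expensive matrix multiplications are hoisted out of the time loop and computed for all timesteps at once, leaving only cheap elementwise operations sequential. This is a simplified illustration, not the paper's implementation; it omits the peephole vectors and the fused CUDA kernels described in the paper, and the parameter names are assumptions.

```python
import numpy as np

def sru_cell(x, Wf, bf, W, Wr, br):
    """Simplified SRU-style recurrence (sketch, not the official implementation).

    x: (T, d) input sequence. All matmuls are independent of the recurrent
    state, so they are batched over time; only elementwise updates remain
    sequential, which is what makes the layer highly parallelizable.
    """
    T, d = x.shape
    # heavy matmuls computed for every timestep up front
    fx, wx, rx = x @ Wf + bf, x @ W, x @ Wr + br
    c = np.zeros(d)
    hs = []
    for t in range(T):  # only cheap elementwise ops run step by step
        f = 1.0 / (1.0 + np.exp(-fx[t]))        # forget gate
        c = f * c + (1.0 - f) * wx[t]           # cell state update
        r = 1.0 / (1.0 + np.exp(-rx[t]))        # reset gate
        hs.append(r * c + (1.0 - r) * x[t])     # highway-style output
    return np.stack(hs)
```

In practice the production SRU fuses the elementwise loop into a single CUDA kernel, which is where the reported 5--9x speed-up over cuDNN LSTM comes from.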
Natural Language to Code Translation with Execution
Generative models of code, pretrained on large corpora of programs, have
shown great success in translating natural language to code (Chen et al., 2021;
Austin et al., 2021; Li et al., 2022, inter alia). While these models do not
explicitly incorporate program semantics (i.e., execution results) during
training, they are able to generate correct solutions for many problems.
However, choosing a single correct program from a generated set for each
problem remains challenging. In this work, we introduce execution result-based
minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it
improves the few-shot performance of pretrained code models on
natural-language-to-code tasks. We select output programs from a generated
candidate set by marginalizing over program implementations that share the same
semantics. Because exact equivalence is intractable, we execute each program on
a small number of test inputs to approximate semantic equivalence. Across
datasets, execution or simulated execution significantly outperforms the
methods that do not involve program semantics. We find that MBR-EXEC
consistently improves over all execution-unaware selection methods, suggesting
it as an effective approach for natural language to code translation. We
open-source our code at github.com/facebookresearch/mbr-exec and data at
dl.fbaipublicfiles.com/mbr-exec/mbr-exec-release.zip Comment: EMNLP 2022
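The selection rule described above can be sketched as follows: candidates are executed on a few test inputs, grouped by their execution signatures, and the program that agrees with the most other candidates is chosen (minimum Bayes risk under a semantic-equivalence match utility). This is a minimal illustration under assumptions; the `run` helper and the exact utility are placeholders, not the released implementation.

```python
from collections import Counter

def mbr_exec_select(programs, test_inputs, run):
    """Pick the candidate whose execution behaviour agrees with the most
    other candidates.

    run(prog, inp) -> output is a hypothetical sandboxed executor; the real
    system must handle timeouts and crashes, which this sketch ignores.
    """
    # execution signature: outputs on all test inputs, used as a proxy
    # for semantic equivalence (exact equivalence is intractable)
    sigs = [tuple(run(p, inp) for inp in test_inputs) for p in programs]
    counts = Counter(sigs)
    # MBR with a 0/1 equivalence utility reduces to majority voting over
    # signatures; ties break toward the earlier candidate
    best = max(range(len(programs)), key=lambda i: counts[sigs[i]])
    return programs[best]
```

Marginalizing over implementations this way means a correct program can win even if no single candidate dominates the model's output distribution.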
LEVER: Learning to Verify Language-to-Code Generation with Execution
The advent of pre-trained code language models (CodeLMs) has led to
significant progress in language-to-code generation. State-of-the-art
approaches in this area combine CodeLM decoding with sample pruning and
reranking using test cases or heuristics based on the execution results.
However, it is challenging to obtain test cases for many real-world
language-to-code applications, and heuristics cannot fully capture semantic
features of the execution results, such as data type and value range, which
often indicate the correctness of the program. In this work, we propose LEVER,
a simple approach to improve language-to-code generation by learning to verify
the generated programs with their execution results. Specifically, we train
verifiers to determine whether a program sampled from the CodeLM is correct or
not based on the natural language input, the program itself and its execution
results. The sampled programs are reranked by combining the verification score
with the CodeLM generation probability, and marginalizing over programs with
the same execution results. On four datasets across the domains of table QA,
math QA and basic Python programming, LEVER consistently improves over the base
CodeLMs (4.6% to 10.9% with code-davinci-002) and achieves new state-of-the-art
results on all of them. Comment: 23 pages
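The reranking step described above can be sketched in a few lines: each sampled program is scored by combining its generation probability with the verifier's probability of correctness, scores are summed over programs that share the same execution result, and a program from the top-scoring group is returned. This is a minimal sketch under assumptions; the function and argument names are illustrative, not LEVER's actual API.

```python
from collections import defaultdict

def lever_rerank(candidates, lm_probs, verifier_probs, exec_results):
    """Rerank sampled programs by LM probability * verifier probability,
    marginalizing over programs with identical execution results.

    exec_results[i] must be hashable (e.g. a string rendering of the
    program's output); training the verifier itself is out of scope here.
    """
    group_score = defaultdict(float)
    for p_lm, p_v, res in zip(lm_probs, verifier_probs, exec_results):
        group_score[res] += p_lm * p_v  # marginalize over same-result programs
    best_res = max(group_score, key=group_score.get)
    # return the highest individually scored program in the winning group
    best = max(
        (i for i, r in enumerate(exec_results) if r == best_res),
        key=lambda i: lm_probs[i] * verifier_probs[i],
    )
    return candidates[best]
```

Summing scores within an execution-result group lets several mediocre but semantically identical samples outvote a single high-probability but wrong program.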
Lassie: HOL4 Tactics by Example
Proof engineering efforts using interactive theorem proving have yielded
several impressive projects in software systems and mathematics. A key obstacle
to such efforts is the requirement that the domain expert is also an expert in
the low-level details in constructing the proof in a theorem prover. In
particular, the user needs to select a sequence of tactics that lead to a
successful proof, a task that in general requires knowledge of the exact names
and use of a large set of tactics.
We present Lassie, a tactic framework for the HOL4 theorem prover that allows
individual users to define their own tactic language by example and give
frequently used tactics or tactic combinations easier-to-remember names. The
core of Lassie is an extensible semantic parser, which allows the user to
interactively extend the tactic language through a process of definitional
generalization. Defining tactics in Lassie thus does not require any knowledge
in implementing custom tactics, while proofs written in Lassie retain the
correctness guarantees provided by the HOL4 system. We show through case
studies how Lassie can be used in small and larger proofs by novice and more
experienced interactive theorem prover users, and how we envision it to ease
the learning curve in a HOL4 tutorial.