Learning to refer informatively by amortizing pragmatic reasoning
A hallmark of human language is the ability to effectively and efficiently
convey contextually relevant information. One theory for how humans reason
about language is presented in the Rational Speech Acts (RSA) framework, which
captures pragmatic phenomena via a process of recursive social reasoning
(Goodman & Frank, 2016). However, RSA represents ideal reasoning in an
unconstrained setting. We explore the idea that speakers might learn to
amortize the cost of RSA computation over time by directly optimizing for
successful communication with an internal listener model. In simulations with
grounded neural speakers and listeners across two communication game datasets
representing synthetic and human-generated data, we find that our amortized
model is able to quickly generate language that is effective and concise across
a range of contexts, without the need for explicit pragmatic reasoning. Comment: Accepted to CogSci 202
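For orientation, the recursive reasoning that RSA describes can be sketched on a toy reference game. This is a minimal illustration of exact RSA (literal listener, pragmatic speaker, pragmatic listener), not the amortized neural model from the abstract. The objects, utterances, the uniform prior, and the `alpha` rationality parameter are all assumptions chosen for the example.

```python
# Minimal RSA sketch on a toy reference game.
# All names and data here are illustrative assumptions, not taken from the paper.
OBJECTS = ["blue_circle", "blue_square", "green_square"]
UTTERANCES = ["blue", "green", "circle", "square"]


def is_true(utterance, obj):
    """An utterance is literally true of an object if it names one of its features."""
    return utterance in obj.split("_")


def normalize(scores):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


def literal_listener(utterance):
    """L0(o | u): uniform over the objects the utterance is literally true of."""
    return normalize({o: 1.0 if is_true(utterance, o) else 0.0 for o in OBJECTS})


def pragmatic_speaker(obj, alpha=1.0):
    """S1(u | o), proportional to L0(o | u) ** alpha (no utterance cost)."""
    return normalize({u: literal_listener(u)[obj] ** alpha for u in UTTERANCES})


def pragmatic_listener(utterance, alpha=1.0):
    """L1(o | u), proportional to S1(u | o) under a uniform prior over objects."""
    return normalize({o: pragmatic_speaker(o, alpha)[utterance] for o in OBJECTS})


if __name__ == "__main__":
    print("L0 given 'blue':", literal_listener("blue"))
    print("L1 given 'blue':", pragmatic_listener("blue"))
```

In this toy game the literal listener splits evenly between the two blue objects on hearing "blue", while the pragmatic listener leans toward the blue square, because a speaker who meant the blue circle would more likely have said "circle". Each level recomputes the level below it for every alternative, which is the cost that the abstract proposes to amortize.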
Generating Pragmatic Examples to Train Neural Program Synthesizers
Programming-by-example is the task of synthesizing a program that is
consistent with a set of user-provided input-output examples. As examples are
often an under-specification of one's intent, a good synthesizer must choose
the intended program from the many that are consistent with the given set of
examples. Prior work frames program synthesis as a cooperative game between a
listener (that synthesizes programs) and a speaker (a user choosing examples),
and shows that models of computational pragmatic inference are effective in
choosing the user intended programs. However, these models require
counterfactual reasoning over a large set of programs and examples, which is
infeasible in realistic program spaces. In this paper, we propose a novel way
to amortize this search with neural networks. We sample pairs of programs and
examples via self-play between listener and speaker models, and use pragmatic
inference to choose informative training examples from this sample. We then use
the informative dataset to train models to improve the synthesizer's ability to
disambiguate user-provided examples without human supervision. We validate our
method on the challenging task of synthesizing regular expressions from example
strings, and find that our method (1) outperforms models trained without
choosing pragmatic examples by 23% (a 51% relative increase), and (2) matches the
performance of supervised learning on a dataset of pragmatic examples provided
by humans, despite using no human data in training.
From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought
How does language inform our downstream thinking? In particular, how do
humans make meaning from language -- and how can we leverage a theory of
linguistic meaning to build machines that think in more human-like ways? In
this paper, we propose rational meaning construction, a computational
framework for language-informed thinking that combines neural models of
language with probabilistic models for rational inference. We frame linguistic
meaning as a context-sensitive mapping from natural language into a
probabilistic language of thought (PLoT) -- a general-purpose symbolic
substrate for probabilistic, generative world modeling. Our architecture
integrates two powerful computational tools that have not previously come
together: we model thinking with probabilistic programs, an expressive
representation for flexible commonsense reasoning; and we model meaning
construction with large language models (LLMs), which support
broad-coverage translation from natural language utterances to code expressions
in a probabilistic programming language. We illustrate our framework in action
through examples covering four core domains from cognitive science:
probabilistic reasoning, logical and relational reasoning, visual and physical
reasoning, and social reasoning about agents and their plans. In each, we show
that LLMs can generate context-sensitive translations that capture
pragmatically-appropriate linguistic meanings, while Bayesian inference with
the generated programs supports coherent and robust commonsense reasoning. We
extend our framework to integrate cognitively-motivated symbolic modules to
provide a unified commonsense thinking interface from language. Finally, we
explore how language can drive the construction of world models themselves.
3rd International Conference on Sustainable Futures: Environmental, Technological, Social and Economic Matters (ICSF 2022), 24-27 May 2022, Kryvyi Rih, Ukraine
Proceedings of the 3rd International Conference on Sustainable Futures: Environmental, Technological, Social and Economic Matters (ICSF 2022), 24-27 May 2022, Kryvyi Rih, Ukraine.