13,572 research outputs found
Learning with Latent Language
The named concepts and compositional operators present in natural language
provide a rich source of information about the kinds of abstractions humans use
to navigate the world. Can this linguistic background knowledge improve the
generality and efficiency of learned classifiers and control policies? This
paper aims to show that using the space of natural language strings as a
parameter space is an effective way to capture natural task structure. In a
pretraining phase, we learn a language interpretation model that transforms
inputs (e.g. images) into outputs (e.g. labels) given natural language
descriptions. To learn a new concept (e.g. a classifier), we search directly in
the space of descriptions to minimize the interpreter's loss on training
examples. Crucially, our models do not require language data to learn these
concepts: language is used only in pretraining to impose structure on
subsequent learning. Results on image classification, text editing, and
reinforcement learning show that, in all settings, models with a linguistic
parameterization outperform those without
Multimodal Grounding for Language Processing
This survey discusses how recent developments in multimodal processing
facilitate conceptual grounding of language. We categorize the information flow
in multimodal processing with respect to cognitive models of human information
processing and analyze different methods for combining multimodal
representations. Based on this methodological inventory, we discuss the benefit
of multimodal grounding for a variety of language processing tasks and the
challenges that arise. We particularly focus on multimodal grounding of verbs
which play a crucial role for the compositional power of language.Comment: The paper has been published in the Proceedings of the 27 Conference
of Computational Linguistics. Please refer to this version for citations:
https://www.aclweb.org/anthology/papers/C/C18/C18-1197
- …