202 research outputs found
Structural Inductive Biases in Emergent Communication
In order to communicate, humans flatten a complex representation of ideas and
their attributes into a single word or a sentence. We investigate the impact of
representation learning in artificial agents by developing graph referential
games. We empirically show that agents parametrized by graph neural networks
develop a more compositional language compared to bag-of-words and sequence
models, which allows them to systematically generalize to new combinations of
familiar features.
Comment: The first two authors contributed equally. Poster presented at CogSci
202
Emergent Language Generalization and Acquisition Speed are not tied to Compositionality
Studies of discrete languages emerging when neural agents communicate to
solve a joint task often look for evidence of compositional structure. This
stems from the expectation that such a structure would allow languages to be
acquired faster by the agents and enable them to generalize better. We argue
that these beneficial properties are only loosely connected to
compositionality. In two experiments, we demonstrate that, depending on the
task, non-compositional languages might show equal, or better, generalization
performance and acquisition speed than compositional ones. Further research in
the area should be clearer about what benefits are expected from
compositionality, and how the latter would lead to them.
Mathematically Modeling the Lexicon Entropy of Emergent Language
We formulate a stochastic process, FiLex, as a mathematical model of lexicon
entropy in deep learning-based emergent language systems. Defining a model
mathematically allows it to generate clear predictions which can be directly
and decisively tested. We empirically verify across four different environments
that FiLex predicts the correct correlation between hyperparameters (training
steps, lexicon size, learning rate, rollout buffer size, and Gumbel-Softmax
temperature) and the emergent language's entropy in 20 out of 20
environment-hyperparameter combinations. Furthermore, our experiments reveal
that different environments show diverse relationships between their
hyperparameters and entropy which demonstrates the need for a model which can
make well-defined predictions at a precise level of granularity.
Comment: 12 pages, 3 figures; added link to GitHub repo
Towards Graph Representation Learning in Emergent Communication
Recent findings in neuroscience suggest that the human brain represents information in a geometric structure (for instance, through conceptual spaces). In order to communicate, we flatten the complex representation of entities and their attributes into a single word or a sentence. In this paper we use graph convolutional networks to support the evolution of language and cooperation in multi-agent systems. Motivated by an image-based referential game, we propose a graph referential game with varying degrees of complexity, and we provide strong baseline models that exhibit desirable properties in terms of language emergence and cooperation. We show that the emerged communication protocol is robust, that the agents uncover the true factors of variation in the game, and that they learn to generalize beyond the samples encountered during training
What Makes a Language Easy to Deep-Learn?
Neural networks drive the success of natural language processing. A
fundamental property of language is its compositional structure, allowing
humans to produce forms for new meanings systematically. However, unlike
humans, neural networks notoriously struggle with systematic generalization,
and do not necessarily benefit from compositional structure in emergent
communication simulations. This poses a problem for using neural networks to
simulate human language learning and evolution, and suggests crucial
differences in the biases of the different learning systems. Here, we directly
test how neural networks compare to humans in learning and generalizing
different input languages that vary in their degree of structure. We evaluate
the memorization and generalization capabilities of a pre-trained language
model GPT-3.5 (analogous to an adult second language learner) and recurrent
neural networks trained from scratch (analogous to a child first language
learner). Our results show striking similarities between deep neural networks
and adult human learners, with more structured linguistic input leading to more
systematic generalization and to better convergence between neural networks and
humans. These findings suggest that all the learning systems are sensitive to
the structure of languages in similar ways with compositionality being
advantageous for learning. Our findings draw a clear prediction regarding
children's learning biases, as well as highlight the challenges of automated
processing of languages spoken by small communities. Notably, the similarity
between humans and machines opens new avenues for research on language learning
and evolution.
Comment: 32 pages, major update: improved text, added new analyses, added
supplementary material