Search CORE

613 research outputs found

On the Realization of Compositionality in Neural Networks

Author: Baan Joris
Baumgärtner Tim
Bruni Elia
Hupkes Dieuwke
Leible Jana
Nikolaus Mitja
Rau David
Ulmer Dennis
Publication date: 01/01/2019

We present a detailed comparison of two types of sequence-to-sequence models trained to conduct a compositional task. The models are architecturally identical at inference time, but differ in the way that they are trained: our baseline model is trained with a task-success signal only, while the other model receives additional supervision on its attention mechanism (Attentive Guidance), which has been shown to be an effective method for encouraging more compositional solutions (Hupkes et al., 2019). We first confirm that the models with attentive guidance indeed infer more compositional solutions than the baseline, by training them on the lookup table task presented by Liška et al. (2019). We then do an in-depth analysis of the structural differences between the two model types, focusing in particular on the organisation of the parameter space and the hidden layer activations, and find noticeable differences in both these aspects. Guided networks focus more on the components of the input rather than the sequence as a whole and develop small functional groups of neurons with specific purposes that use their gates more selectively. Results from parameter heat maps, component swapping and graph analysis also indicate that guided networks exhibit a more modular structure with a small number of specialized, strongly connected neurons.

Comment: To appear at BlackboxNLP 2019, AC

arXiv.org e-Print Archive

TUbiblio

Crossref

On the Realization of Compositionality in Neural Networks

Author: Baan J.
Baumgärtner T.
Bruni E.
Hupkes D.
Leible J.
Nikolaus M.
Rau D.
Ulmer D.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Transcoding compositionally: using attention to find more generalizable solutions

Author: Bruni Elia
Dankers Verna
Hupkes Dieuwke
Korrel Kris
Publication date: 01/01/2019

While sequence-to-sequence models have shown remarkable generalization power across several natural language tasks, the solutions they construct are argued to be less compositional than human-like generalization. In this paper, we present seq2attn, a new architecture that is specifically designed to exploit attention to find compositional patterns in the input. In seq2attn, the two standard components of an encoder-decoder model are connected via a transcoder that modulates the information flow between them. We show that seq2attn can successfully generalize, without requiring any additional supervision, on two tasks which are specifically constructed to challenge the compositional skills of neural networks. The solutions found by the model are highly interpretable, allowing easy analysis of both the types of solutions that are found and potential causes for mistakes. We exploit this opportunity to introduce a new paradigm to test compositionality that studies the extent to which a model overgeneralizes when confronted with exceptions. We show that seq2attn exhibits such overgeneralization to a larger degree than a standard sequence-to-sequence model.

Comment: to appear at BlackboxNLP 2019, AC

arXiv.org e-Print Archive

Crossref

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Inspecting post-16 dance: with guidance on self-evaluation

Publication venue: Office for Standards in Education
Publication date: 01/01/2002

Digital Education Resource Archive

Layer-wise Representation Fusion for Compositional Generalization

Author: Chen Yidong
Fu Biao
Lai Zhaohong
Lin Lei
Liu Shan
Rao Wenhao
Shi Xiaodong
Wang Binling
Ye Peigen
Zheng Yafang
Publication date: 20/07/2023

Despite successes across a broad range of applications, the solutions constructed by sequence-to-sequence models are argued to be less compositional than human-like generalization. There is mounting evidence that one of the reasons hindering compositional generalization is that the representations of the uppermost encoder and decoder layers are entangled. In other words, the syntactic and semantic representations of sequences are twisted inappropriately. However, most previous studies mainly concentrate on enhancing token-level semantic information to alleviate the representation entanglement problem, rather than composing and using the syntactic and semantic representations of sequences appropriately as humans do. In addition, we explain why the entanglement problem exists from the perspective of recent studies about training deeper Transformers: it is mainly owing to the "shallow" residual connections and their simple, one-step operations, which fail to fuse previous layers' information effectively. Starting from this finding and inspired by humans' strategies, we propose FuSion (Fusing Syntactic and Semantic Representations), an extension to sequence-to-sequence models that learns to fuse previous layers' information back into the encoding and decoding process appropriately by introducing a fuse-attention module at each encoder and decoder layer. FuSion achieves competitive and even state-of-the-art results on two realistic benchmarks, which empirically demonstrates the effectiveness of our proposal.

Comment: work in progress. arXiv admin note: substantial text overlap with arXiv:2305.1216

arXiv.org e-Print Archive

Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio

Author: Basu Sanjay
Dubovoy Dmitry
Moorthy Akshay
Roush Allen
Publication date: 28/06/2023

Despite rapid advancement in the field of Constrained Natural Language Generation, little time has been spent on exploring the potential of language models which have had their vocabularies lexically, semantically, and/or phonetically constrained. We find that most language models generate compelling text even under significant constraints. We present a simple and universally applicable technique for modifying the output of a language model by compositionally applying filter functions to the language model's vocabulary before a unit of text is generated. This approach is plug-and-play and requires no modification to the model. To showcase the value of this technique, we present an easy-to-use AI writing assistant called Constrained Text Generation Studio (CTGS). CTGS allows users to generate or choose from text with any combination of a wide variety of constraints, such as banning a particular letter, forcing the generated words to have a certain number of syllables, and/or forcing the words to be partial anagrams of another word. We introduce a novel dataset of prose that omits the letter e. We show that our method results in strictly superior performance compared to fine-tuning alone on this dataset. We also present a Huggingface space web-app presenting this technique called Gadsby. The code is available to the public here: https://github.com/Hellisotherpeople/Constrained-Text-Generation-Studio

Comment: Published in the proceedings of the 2nd Workshop on When Creative AI Meets Conversational AI (CAI2), COLING 2022, 6 pages, System Demonstration Pape

arXiv.org e-Print Archive
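
The abstract above describes composing filter functions over a model's vocabulary before each unit of text is generated. A minimal sketch of that idea follows, under stated assumptions: the toy vocabulary, the scores, and every function name (`ban_letter`, `max_length`, `compose`, `allowed_token_ids`, `mask_scores`) are hypothetical illustrations and are not taken from the CTGS code base. It shows only the general pattern of filtering candidate tokens and masking the rest before picking the next one.

```python
from typing import Callable, Iterable, List

# A filter takes a candidate token and says whether it may be generated.
TokenFilter = Callable[[str], bool]


def ban_letter(letter: str) -> TokenFilter:
    """Hypothetical filter: reject any token containing `letter`."""
    letter = letter.lower()
    return lambda token: letter not in token.lower()


def max_length(n: int) -> TokenFilter:
    """Hypothetical filter: reject tokens longer than `n` characters."""
    return lambda token: len(token) <= n


def compose(*filters: TokenFilter) -> TokenFilter:
    """Combine filters so a token must pass all of them."""
    return lambda token: all(f(token) for f in filters)


def allowed_token_ids(vocab: List[str], keep: TokenFilter) -> List[int]:
    """Indices of vocabulary entries that satisfy the combined filter."""
    return [i for i, tok in enumerate(vocab) if keep(tok)]


def mask_scores(scores: List[float], allowed: Iterable[int]) -> List[float]:
    """Set the score of every disallowed token to -inf; the model is untouched."""
    allowed_set = set(allowed)
    return [s if i in allowed_set else float("-inf") for i, s in enumerate(scores)]


# Toy data standing in for a real model's vocabulary and next-token scores.
vocab = ["the", "a", "cat", "bird", "sat", "on", "mat", "elephant"]
scores = [2.0, 1.5, 0.3, 0.9, 0.1, 0.4, 0.2, 3.0]

keep = compose(ban_letter("e"), max_length(4))
allowed = allowed_token_ids(vocab, keep)
masked = mask_scores(scores, allowed)

# Greedy choice among the tokens that survived the filters.
best = vocab[max(range(len(vocab)), key=masked.__getitem__)]
print(allowed, best)
```

Because the filters act only on the candidate list and the scores, the same pattern could in principle sit in front of any model that exposes per-token scores, which matches the plug-and-play claim in the abstract.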

Building Machines That Learn and Think Like People

Author: Gershman Samuel J.
Lake Brenden M.
Tenenbaum Joshua B.
Ullman Tomer D.
Publication date: 01/04/2016

Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

Comment: In press at Behavioral and Brain Sciences. Open call for commentary proposals (until Nov. 22, 2016). https://www.cambridge.org/core/journals/behavioral-and-brain-sciences/information/calls-for-commentary/open-calls-for-commentar

arXiv.org e-Print Archive

DSpace@MIT