25 research outputs found

The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization

Author: Csordás Róbert
Irie Kazuki
Schmidhuber Jürgen
Publication date: 05/05/2022

Despite progress across a broad range of applications, Transformers have limited success in systematic generalization. The situation is especially frustrating in the case of algorithmic tasks, where they often fail to find intuitive solutions that route relevant information to the right node/operation at the right time in the grid represented by Transformer columns. To facilitate the learning of useful control flow, we propose two modifications to the Transformer architecture, copy gate and geometric attention. Our novel Neural Data Router (NDR) achieves 100% length generalization accuracy on the classic compositional table lookup task, as well as near-perfect accuracy on the simple arithmetic task and a new variant of ListOps testing for generalization across computational depths. NDR's attention and gating patterns tend to be interpretable as an intuitive form of neural routing. Our code is public. Comment: Accepted to ICLR 202

arXiv.org e-Print Archive

Core Challenges in Embodied Vision-Language Planning

Author: Francis Jonathan
Kitamura Nariaki
Labelle Felix
Lu Xiaopeng
Navarro Ingrid
Oh Jean
Publication date: 27/07/2021

Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment. Comment: 35 page

arXiv.org e-Print Archive

Improving Generalization for Multimodal Fake News Detection

Author: Ewerth Ralph
Hakimov Sherzod
Huang Zi (Helen)
Kompatsiaris Ioannis (Yiannis)
Luo Jiebo
Mezaris Vasileios
Müller-Budack Eric
Papadopoulos Symeon
Popescu Adrian
Sebe Nicu
Tahmasebi Sahar
Yao Angela
Publication venue: New York, NY : Association for Computing Machinery
Publication date: 01/01/2023

The increasing proliferation of misinformation and its alarming impact have motivated both industry and academia to develop approaches for fake news detection. However, state-of-the-art approaches are usually trained on datasets of smaller size or with a limited set of specific topics. As a consequence, these models lack generalization capabilities and are not applicable to real-world data. In this paper, we propose three models that adopt and fine-tune state-of-the-art multimodal transformers for multimodal fake news detection. We conduct an in-depth analysis by manipulating the input data to explore the models' performance in realistic use cases on social media. Our study across multiple models demonstrates that these systems suffer significant performance drops against manipulated data. To reduce the bias and improve model generalization, we suggest training data augmentation to conduct more meaningful experiments for fake news detection on social media. The proposed data augmentation techniques enable models to generalize better and yield improved state-of-the-art results.

Institutionelles Repositorium der Leibniz Universität Hannover

Improving Generalization for Multimodal Fake News Detection

Author: Ewerth Ralph
Hakimov Sherzod
Müller-Budack Eric
Tahmasebi Sahar
Publication venue: New York, NY : Association for Computing Machinery
Publication date: 01/01/2023

Institutionelles Repositorium der Leibniz Universität Hannover

Improving Generalization for Multimodal Fake News Detection

Author: Ewerth Ralph
Hakimov Sherzod
Müller-Budack Eric
Tahmasebi Sahar
Publication venue: Association for Computing Machinery (ACM)
Publication date: 29/05/2023

arXiv.org e-Print Archive

Interactive Symbol Grounding with Complex Referential Expressions

Author: Lascarides Alex
Rubavicius Rimvydas
Publication date: 01/07/2022

Edinburgh Research Explorer