The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
Despite progress across a broad range of applications, Transformers have
limited success in systematic generalization. The situation is especially
frustrating in the case of algorithmic tasks, where they often fail to find
intuitive solutions that route relevant information to the right node/operation
at the right time in the grid represented by Transformer columns. To facilitate
the learning of useful control flow, we propose two modifications to the
Transformer architecture, copy gate and geometric attention. Our novel Neural
Data Router (NDR) achieves 100% length generalization accuracy on the classic
compositional table lookup task, as well as near-perfect accuracy on the simple
arithmetic task and a new variant of ListOps testing for generalization across
computational depths. NDR's attention and gating patterns tend to be
interpretable as an intuitive form of neural routing. Our code is public. Comment: Accepted to ICLR 202
Core Challenges in Embodied Vision-Language Planning
Recent advances in the areas of multimodal machine learning and artificial
intelligence (AI) have led to the development of challenging tasks at the
intersection of Computer Vision, Natural Language Processing, and Embodied AI.
Whereas many approaches and previous survey pursuits have characterised one or
two of these dimensions, there has not been a holistic analysis at the center
of all three. Moreover, even when combinations of these topics are considered,
more focus is placed on describing, e.g., current architectural methods, as
opposed to also illustrating high-level challenges and opportunities for the
field. In this survey paper, we discuss Embodied Vision-Language Planning
(EVLP) tasks, a family of prominent embodied navigation and manipulation
problems that jointly use computer vision and natural language. We propose a
taxonomy to unify these tasks and provide an in-depth analysis and comparison
of the new and current algorithmic approaches, metrics, simulated environments,
as well as the datasets used for EVLP tasks. Finally, we present the core
challenges that we believe new EVLP works should seek to address, and we
advocate for task construction that enables model generalizability and furthers
real-world deployment. Comment: 35 page
Improving Generalization for Multimodal Fake News Detection
The increasing proliferation of misinformation and its alarming impact have
motivated both industry and academia to develop approaches for fake news
detection. However, state-of-the-art approaches are usually trained on datasets
of smaller size or with a limited set of specific topics. As a consequence,
these models lack generalization capabilities and are not applicable to
real-world data. In this paper, we propose three models that adopt and
fine-tune state-of-the-art multimodal transformers for multimodal fake news
detection. We conduct an in-depth analysis by manipulating the input data to
explore the models' performance in realistic use cases on social media. Our
study across multiple models demonstrates that these systems suffer significant
performance drops against manipulated data. To reduce the bias and improve
model generalization, we suggest training data augmentation to conduct more
meaningful experiments for fake news detection on social media. The proposed
data augmentation techniques enable models to generalize better and yield
improved state-of-the-art results. Comment: This paper has been accepted for ICMR 202
EU/Horizon 2020/812997, BMBF/16KIS151