Attending to Entities for Better Text Understanding
Recent progress in NLP has witnessed the development of large-scale pre-trained
language models (GPT, BERT, XLNet, etc.) based on the Transformer (Vaswani et al.
2017), and on a range of end tasks, such models have achieved state-of-the-art
results, approaching human performance. This demonstrates the power of the
stacked self-attention architecture when paired with a sufficient number of
layers and a large amount of pre-training data. However, on tasks that require
complex and long-distance reasoning where surface-level cues are not enough,
there is still a large gap between the pre-trained models and human
performance. Strubell et al. (2018) recently showed that it is possible to
inject knowledge of syntactic structure into a model through supervised
self-attention. We conjecture that a similar injection of semantic knowledge,
in particular coreference information, into an existing model would improve
performance on such complex problems. On the LAMBADA (Paperno et al. 2016)
task, we show that a model trained from scratch with coreference as auxiliary
supervision for self-attention outperforms the largest GPT-2 model, setting a
new state of the art while containing only a tiny fraction of the parameters
of GPT-2. We also conduct a thorough analysis of different variants of
model architectures and supervision configurations, suggesting future
directions for applying similar techniques to other problems.
Comment: Accepted at AAAI 202
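As a rough illustration of the idea described above (auxiliary coreference supervision on a self-attention head), the sketch below adds a cross-entropy term that pushes one attention map toward annotated antecedent positions. This is not the paper's code; the function name, coref_target, lambda_coref, and the toy dimensions are all hypothetical.

    import torch
    import torch.nn.functional as F

    def attention_with_coref_loss(q, k, v, coref_target, lambda_coref=0.1):
        """Single-head self-attention whose attention map receives auxiliary
        coreference supervision (illustrative sketch, not the paper's code).

        q, k, v:       (seq_len, d) query/key/value matrices
        coref_target:  (seq_len,) index of the annotated antecedent for each
                       position (a token with no antecedent can point to itself)
        """
        d = q.size(-1)
        scores = q @ k.t() / d ** 0.5            # (seq_len, seq_len) attention logits
        attn = F.softmax(scores, dim=-1)         # attention distribution per token
        context = attn @ v                       # usual attention output

        # Auxiliary loss: encourage each row of the attention map to peak on
        # the annotated antecedent, i.e. supervised self-attention.
        coref_loss = F.cross_entropy(scores, coref_target)
        return context, lambda_coref * coref_loss

    # Toy usage with random tensors
    seq_len, d = 6, 16
    q, k, v = (torch.randn(seq_len, d) for _ in range(3))
    coref_target = torch.tensor([0, 0, 1, 3, 1, 5])   # hypothetical antecedent indices
    context, aux_loss = attention_with_coref_loss(q, k, v, coref_target)

In a setup like this, the auxiliary term would simply be added to the main language-modeling loss during training; the abstract's "supervision configurations" presumably vary which heads and layers receive such a term.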
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Rich and dense human labeled datasets are among the main enabling factors for
the recent advance on vision-language understanding. Many seemingly distant
annotations (e.g., semantic segmentation and visual question answering (VQA))
are inherently connected in that they reveal different levels and perspectives
of human understanding of the same visual scenes, and even the same set
of images (e.g., of COCO). The popularity of COCO implicitly correlates those
annotations and tasks. Explicitly linking them up may significantly benefit both individual
tasks and the unified vision and language modeling. We present the preliminary
work of linking the instance segmentations provided by COCO to the questions
and answers (QAs) in the VQA dataset, and name the collected links visual
questions and segmentation answers (VQS). They transfer human supervision
between the previously separate tasks, offer more effective leverage to
existing problems, and also open the door for new research problems and models.
We study two applications of the VQS data in this paper: supervised attention
for VQA and a novel question-focused semantic segmentation task. For the
former, we obtain state-of-the-art results on the VQA real multiple-choice task
by simply augmenting the multilayer perceptrons with some attention features
that are learned using the segmentation-QA links as explicit supervision. To
put the latter in perspective, we study two plausible methods and compare them
to an oracle method assuming that the instance segmentations are given at the
test stage.
Comment: To appear at ICCV 201
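To make the "supervised attention" application above concrete, here is a minimal sketch of a VQA head whose spatial attention is trained against region-level masks derived from linked instance segmentations. It only illustrates the general pattern; the class name, feature dimensions, the 49-region grid, and the KL-based attention loss are assumptions, not details from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SupervisedAttentionVQA(nn.Module):
        """Toy VQA head whose spatial attention is supervised by region-level
        masks derived from instance segmentations (illustrative sketch only)."""

        def __init__(self, img_dim=2048, q_dim=1024, n_answers=1000):
            super().__init__()
            self.att = nn.Linear(img_dim + q_dim, 1)          # scores each region
            self.clf = nn.Sequential(
                nn.Linear(img_dim + q_dim, 1024), nn.ReLU(),
                nn.Linear(1024, n_answers),
            )

        def forward(self, img_feats, q_feat, mask_target=None):
            # img_feats: (B, R, img_dim) region features; q_feat: (B, q_dim)
            B, R, _ = img_feats.shape
            q_tiled = q_feat.unsqueeze(1).expand(B, R, -1)
            scores = self.att(torch.cat([img_feats, q_tiled], dim=-1)).squeeze(-1)
            attn = F.softmax(scores, dim=-1)                  # (B, R)

            attended = (attn.unsqueeze(-1) * img_feats).sum(dim=1)
            logits = self.clf(torch.cat([attended, q_feat], dim=-1))

            att_loss = None
            if mask_target is not None:
                # mask_target: (B, R) region-level mask from the linked
                # segmentation, normalized to a distribution; the attention
                # map is pulled toward it as explicit supervision.
                att_loss = F.kl_div(F.log_softmax(scores, dim=-1),
                                    mask_target, reduction='batchmean')
            return logits, att_loss

    # Toy usage
    model = SupervisedAttentionVQA()
    img = torch.randn(2, 49, 2048)
    q = torch.randn(2, 1024)
    mask = torch.full((2, 49), 1.0 / 49)   # uniform placeholder "mask" distribution
    logits, att_loss = model(img, q, mask)

The attention loss would be combined with the usual answer-classification loss; at test time the segmentation masks are not needed, which matches the supervised-attention setting the abstract describes.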
Augmented Memory for Conference Attendees
Human memory at its best can perform astonishing feats: the tiniest snippet of information can trigger whole chains of associations, ending at an item long believed forgotten. While modern information systems excel at the systematic manipulation of structured or semi-structured information, or even of vast repositories of unstructured textual information, they are still far from these capabilities.
Meta-path Augmented Response Generation
We propose a chatbot, Mocha, that makes good use of relevant entities when
generating responses. Augmented with meta-path information, Mocha is able to
mention proper entities following the conversation flow.
Comment: AAAI 201
The Structure of Episodic Memory: Ganeri's "Mental Time Travel and Attention"
We offer a framework for assessing what the structure of episodic memory might be if one accepts the Buddhist denial of persisting selves. This paper is a response to Jonardon Ganeri's paper "Mental Time Travel and Attention", which explores Buddhaghosa's ideas about memory. (It will eventually be published with a reply by Ganeri.)