Table-to-text Generation by Structure-aware Seq2seq Learning
Table-to-text generation aims to generate a description for a factual table
which can be viewed as a set of field-value records. To encode both the content
and the structure of a table, we propose a novel structure-aware seq2seq
architecture which consists of field-gating encoder and description generator
with dual attention. In the encoding phase, we update the cell memory of the
LSTM unit by a field gate and its corresponding field value in order to
incorporate field information into table representation. In the decoding phase,
dual attention mechanism which contains word level attention and field level
attention is proposed to model the semantic relevance between the generated
description and the table. We conduct experiments on the \texttt{WIKIBIO}
dataset which contains over 700k biographies and corresponding infoboxes from
Wikipedia. The attention visualizations and case studies show that our model is
capable of generating coherent and informative descriptions based on the
comprehensive understanding of both the content and the structure of a table.
Automatic evaluations also show that our model outperforms the baselines by a
large margin. Code for this work is available at
https://github.com/tyliupku/wiki2bio.
Comment: Accepted by AAAI 2018.
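The field-gating idea above can be sketched as an extra gate on the LSTM cell state that injects a field embedding directly into memory. The following is a minimal numpy illustration, not the paper's implementation; the weight names and toy dimensions are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def field_gated_lstm_step(x, z, h_prev, c_prev, W):
    """One LSTM step with an extra field gate.

    x: word embedding, z: field embedding, W: dict of weight
    matrices (hypothetical names). The field gate l injects field
    information into the cell state: c = f*c_prev + i*c_hat + l*z.
    """
    concat = np.concatenate([x, h_prev])
    i = sigmoid(W["i"] @ concat)                   # input gate
    f = sigmoid(W["f"] @ concat)                   # forget gate
    o = sigmoid(W["o"] @ concat)                   # output gate
    c_hat = np.tanh(W["c"] @ concat)               # candidate cell state
    l = sigmoid(W["l"] @ np.concatenate([x, z]))   # field gate
    c = f * c_prev + i * c_hat + l * z             # field value flows into memory
    h = o * np.tanh(c)
    return h, c

# toy dimensions: embedding size 4, hidden size 4 (kept equal for simplicity)
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(4, 8)) for k in "ifoc"}
W["l"] = rng.normal(scale=0.1, size=(4, 8))
h, c = field_gated_lstm_step(rng.normal(size=4), rng.normal(size=4),
                             np.zeros(4), np.zeros(4), W)
```

The only change from a vanilla LSTM is the `l * z` term, which is how the field name and position reach the table representation.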
The Natural Language Decathlon: Multitask Learning as Question Answering
Deep learning has improved performance on many natural language processing
(NLP) tasks individually. However, general NLP models cannot emerge within a
paradigm that focuses on the particularities of a single metric, dataset, and
task. We introduce the Natural Language Decathlon (decaNLP), a challenge that
spans ten tasks: question answering, machine translation, summarization,
natural language inference, sentiment analysis, semantic role labeling,
zero-shot relation extraction, goal-oriented dialogue, semantic parsing, and
commonsense pronoun resolution. We cast all tasks as question answering over a
context. Furthermore, we present a new Multitask Question Answering Network
(MQAN) that jointly learns all tasks in decaNLP without any task-specific
modules or parameters in the multitask setting. MQAN shows improvements in transfer
learning for machine translation and named entity recognition, domain
adaptation for sentiment analysis and natural language inference, and zero-shot
capabilities for text classification. We demonstrate that the MQAN's
multi-pointer-generator decoder is key to this success and performance further
improves with an anti-curriculum training strategy. Though designed for
decaNLP, MQAN also achieves state-of-the-art results on the WikiSQL semantic
parsing task in the single-task setting. We also release code for procuring and
processing data, training and evaluating models, and reproducing all
experiments for decaNLP.
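The "all tasks as question answering over a context" framing can be made concrete with a few triples. These examples are illustrative stand-ins, not taken from the actual decaNLP data:

```python
# Each decaNLP task is reduced to a (question, context, answer) triple;
# the task identity lives entirely in the natural-language question.
examples = [
    {"question": "What is the summary?",
     "context": "Harry Potter is a series of fantasy novels...",
     "answer": "A fantasy novel series."},
    {"question": "What is the translation from English to German?",
     "context": "Hello, world.",
     "answer": "Hallo, Welt."},
    {"question": "Is this sentence positive or negative?",
     "context": "A wonderful, heartfelt film.",
     "answer": "positive"},
]

def as_qa(task_question, context):
    """Wrap any task input in the shared QA format."""
    return {"question": task_question, "context": context}
```

Because every task shares this interface, a single model with no task-specific heads can be trained on the union of all ten datasets.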
ConveRT: Efficient and Accurate Conversational Representations from Transformers
General-purpose pretrained sentence encoders such as BERT are not ideal for
real-world conversational AI applications; they are computationally heavy,
slow, and expensive to train. We propose ConveRT (Conversational
Representations from Transformers), a pretraining framework for conversational
tasks satisfying all the following requirements: it is effective, affordable,
and quick to train. We pretrain using a retrieval-based response selection
task, effectively leveraging quantization and subword-level parameterization in
the dual encoder to build a lightweight memory- and energy-efficient model. We
show that ConveRT achieves state-of-the-art performance across widely
established response selection tasks. We also demonstrate that the use of
extended dialog history as context yields further performance gains. Finally,
we show that pretrained representations from the proposed encoder can be
transferred to the intent classification task, yielding strong results across
three diverse data sets. ConveRT trains substantially faster than standard
sentence encoders or previous state-of-the-art dual encoders. With its reduced
size and superior performance, we believe this model promises wider portability
and scalability for Conversational AI applications.
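At inference time, a retrieval-based dual encoder reduces to scoring precomputed candidate vectors against the context vector. A minimal numpy sketch of that selection step, with the encoders stubbed out (ConveRT's actual subword-level, quantized encoders are far more involved):

```python
import numpy as np

def select_response(context_vec, response_vecs):
    """Dual-encoder retrieval: score each candidate response by its
    dot product with the context encoding, return the best index."""
    scores = response_vecs @ context_vec
    return int(np.argmax(scores)), scores

# toy encodings; in practice these come from the two trained encoders
context = np.array([1.0, 0.0, 1.0])
candidates = np.array([
    [0.9, 0.1, 0.8],    # relevant response
    [-0.5, 1.0, -0.2],  # off-topic response
])
best, scores = select_response(context, candidates)
```

Since candidate vectors can be precomputed and quantized, serving cost is dominated by one context encoding plus a matrix-vector product, which is what makes the model lightweight.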
Knowledgeable Dialogue Reading Comprehension on Key Turns
Multi-choice machine reading comprehension (MRC) requires models to choose
the correct answer from candidate options given a passage and a question. Our
research focuses on dialogue-based MRC, where the passages are multi-turn
dialogues. This setting poses two challenges: the answer selection decision is
made without the support of latently helpful commonsense knowledge, and the
multi-turn context may hide considerable irrelevant information. This work thus
makes the
first attempt to tackle those two challenges by extracting substantially
important turns and utilizing external knowledge to enhance the representation
of context. In this paper, the relevance of each turn to the question are
calculated to choose key turns. Besides, terms related to the context and the
question in a knowledge graph are extracted as external knowledge. The original
context, question and external knowledge are encoded with the pre-trained
language model, then the language representation and key turns are combined
together with a will-designed mechanism to predict the answer. Experimental
results on a DREAM dataset show that our proposed model achieves great
improvements on baselines
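Key-turn selection can be sketched as ranking turns by similarity to the question. The cosine ranking below is a toy stand-in; the paper computes relevance with a learned encoder, and the vectors here are fabricated:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def key_turns(turn_vecs, question_vec, k=2):
    """Rank dialogue turns by cosine relevance to the question and
    keep the indices of the top-k as 'key turns'."""
    sims = [cosine(t, question_vec) for t in turn_vecs]
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return order[:k]

q = np.array([1.0, 1.0, 0.0])
turns = [np.array([1.0, 0.9, 0.0]),   # highly relevant turn
         np.array([0.0, 0.0, 1.0]),   # orthogonal (irrelevant) turn
         np.array([0.5, 0.6, 0.1])]   # somewhat relevant turn
picked = key_turns(turns, q, k=2)
```

Discarding the low-relevance turns is what addresses the second challenge above: the irrelevant parts of a long multi-turn context never reach the answer predictor.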
Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots
The challenges of building knowledge-grounded retrieval-based chatbots lie in
how to ground a conversation on its background knowledge and how to match
response candidates with both context and knowledge simultaneously. This paper
proposes a method named Filtering before Iteratively REferring (FIRE) for this
task. In this method, a context filter and a knowledge filter are first built,
which derive knowledge-aware context representations and context-aware
knowledge representations respectively by global and bidirectional attention.
In addition, entries irrelevant to the conversation are discarded by the
knowledge filter. After that, iteratively referring is performed between
context and response representations as well as between knowledge and response
representations, in order to collect deep matching features for scoring
response candidates. Experimental results show that FIRE outperforms previous
methods by margins larger than 2.8% and 4.1% on the PERSONA-CHAT dataset with
original and revised personas respectively, and margins larger than 3.1% on the
CMU_DoG dataset in terms of top-1 accuracy. We also show that FIRE is more
interpretable by visualizing the knowledge grounding process.
Comment: Accepted by EMNLP 2020 Findings.
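The filtering stage described above can be approximated with cross-attention in both directions plus a relevance cutoff. This is a toy sketch; the dimensions, the dot-product similarity, and the thresholding rule are all assumptions rather than FIRE's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def filter_step(context, knowledge, threshold=0.1):
    """Bidirectional attention between context and knowledge vectors,
    then drop knowledge entries whose best match to any context
    vector falls below a threshold."""
    sim = context @ knowledge.T                    # (Lc, Lk) similarities
    ctx_aware_k = softmax(sim, axis=0).T @ context  # knowledge attends to context
    k_aware_ctx = softmax(sim, axis=1) @ knowledge  # context attends to knowledge
    keep = sim.max(axis=0) > threshold             # filter irrelevant entries
    return k_aware_ctx, ctx_aware_k[keep], keep

ctx = np.array([[1.0, 0.0], [0.8, 0.2]])
kn = np.array([[0.9, 0.1],      # relevant knowledge entry
               [-1.0, -1.0]])   # irrelevant knowledge entry
c_rep, k_rep, keep = filter_step(ctx, kn)
```

The surviving, mutually aware representations are what the subsequent iterative referring stage would match against response candidates.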
ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language
Person search by natural language aims at retrieving a specific person in a
large-scale image pool that matches the given textual descriptions. While most
of the current methods treat the task as a holistic visual and textual feature
matching one, we approach it from an attribute-aligning perspective that allows
grounding specific attribute phrases to the corresponding visual regions. We
achieve performance gains through robust feature learning, in which the
referred identity can be accurately grounded by multiple visual attribute cues.
Concretely, our Visual-Textual Attribute Alignment model (dubbed ViTAA) learns
to disentangle the feature space of a person into attribute-specific subspaces
using a lightweight auxiliary attribute segmentation branch. It then aligns
these visual features with the
textual attributes parsed from the sentences by using a novel contrastive
learning loss. Upon that, we validate our ViTAA framework through extensive
experiments on tasks of person search by natural language and by
attribute-phrase queries, on which our system achieves state-of-the-art
performances. Code will be publicly available upon publication.
Comment: ECCV 2020, 18 pages, 6 figures.
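A generic contrastive alignment loss of the kind described above can be written as an InfoNCE-style objective over matched visual/textual attribute pairs. This is a stand-in for illustration; ViTAA's actual loss differs in its details:

```python
import numpy as np

def contrastive_align_loss(visual, textual, temperature=0.1):
    """InfoNCE-style loss for aligning visual attribute features with
    their textual counterparts: row i of `visual` is the positive
    pair for row i of `textual`; all other rows are negatives."""
    v = visual / np.linalg.norm(visual, axis=1, keepdims=True)
    t = textual / np.linalg.norm(textual, axis=1, keepdims=True)
    logits = (v @ t.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # -log p(positive)

aligned = np.eye(3)                                 # perfectly matched pairs
loss_good = contrastive_align_loss(aligned, aligned)
loss_bad = contrastive_align_loss(aligned, np.roll(aligned, 1, axis=0))
```

Pulling matched attribute pairs together while pushing mismatched ones apart is what grounds each attribute phrase to its visual region.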
Term Definitions Help Hypernymy Detection
Existing methods of hypernymy detection mainly rely on statistics over a big
corpus, either mining some co-occurring patterns like "animals such as cats" or
embedding words of interest into context-aware vectors. These approaches are
therefore limited by the availability of a large enough corpus that can cover
all terms of interest and provide sufficient contextual information to
represent their meaning. In this work, we propose a new paradigm, HyperDef, for
hypernymy detection -- expressing word meaning by encoding word definitions,
along with context driven representation. This has two main benefits: (i)
Definitional sentences express (sense-specific) corpus-independent meanings of
words, hence definition-driven approaches enable strong generalization -- once
trained, the model is expected to work well in open-domain testbeds; (ii)
Global context from a large corpus and definitions provide complementary
information for words. Consequently, our model, HyperDef, once trained on
task-agnostic data, achieves state-of-the-art results on multiple benchmarks.
Comment: *SEM'2018 camera-ready.
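The core idea, combining a corpus-driven context embedding with an encoding of the term's definition, can be sketched very simply. The concatenation and the linear scorer below are toy stand-ins for HyperDef's learned encoder and classifier:

```python
import numpy as np

def term_representation(context_vec, definition_vec):
    """Combine a corpus-driven context embedding with an encoding of
    the term's dictionary definition, here by concatenation."""
    return np.concatenate([context_vec, definition_vec])

def hypernymy_score(hypo_rep, hyper_rep, w):
    """Toy detector: a linear score over the element-wise product of
    the candidate hyponym and hypernym representations."""
    return float(w @ (hypo_rep * hyper_rep))

# fabricated 2-d context and definition vectors for two terms
cat = term_representation(np.array([0.2, 0.9]), np.array([0.5, 0.5]))
animal = term_representation(np.array([0.3, 0.8]), np.array([0.6, 0.4]))
score = hypernymy_score(cat, animal, np.ones(4))
```

Because the definition half of the representation is corpus-independent, a detector trained this way can generalize to terms that were rare or absent in the training corpus.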
Question Generation from SQL Queries Improves Neural Semantic Parsing
We study how to learn a semantic parser of state-of-the-art accuracy with
less supervised training data. We conduct our study on WikiSQL, the largest
hand-annotated semantic parsing dataset to date. First, we demonstrate that
question generation is an effective method that empowers us to learn a
state-of-the-art neural network based semantic parser with thirty percent of
the supervised training data. Second, we show that applying question generation
to the full supervised training data further improves the state-of-the-art
model. In addition, we observe that there is a logarithmic relationship between
the accuracy of a semantic parser and the amount of training data.
Comment: The paper will be presented at EMNLP 2018.
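The observed logarithmic relationship amounts to fitting accuracy as a linear function of the log of the training-set size. The data points below are fabricated purely to illustrate the fit; the paper's actual numbers differ:

```python
import numpy as np

# hypothetical (training size, accuracy) pairs with roughly constant
# accuracy gain per doubling of the data, i.e. acc ~ a*log(n) + b
n = np.array([5_000, 10_000, 20_000, 40_000, 56_000])
acc = np.array([0.60, 0.66, 0.72, 0.78, 0.81])

# least-squares line in log space recovers the logarithmic trend
a, b = np.polyfit(np.log(n), acc, 1)
predicted = a * np.log(n) + b
```

Under such a trend, each doubling of supervised data buys a roughly constant accuracy gain, which is why question generation that stretches a fixed annotation budget is so effective.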
Improving Response Selection in Multi-Turn Dialogue Systems by Incorporating Domain Knowledge
Building systems that can communicate with humans is a core problem in
Artificial Intelligence. This work proposes a novel neural network architecture
for response selection in an end-to-end multi-turn conversational dialogue
setting. The architecture applies context level attention and incorporates
additional external knowledge provided by descriptions of domain-specific
words. It uses a bi-directional Gated Recurrent Unit (GRU) for encoding context
and responses and learns to attend over the context words given the latent
response representation and vice versa. In addition, it incorporates external
domain specific information using another GRU for encoding the domain keyword
descriptions. This allows better representation of domain-specific keywords in
responses and hence improves the overall performance. Experimental results show
that our model outperforms all other state-of-the-art methods for response
selection in multi-turn conversations.
Comment: Published as a conference paper at CoNLL 2018.
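The "attend over the context words given the latent response representation" step can be sketched as plain soft attention. This stand-in omits the GRU encoders and the domain-keyword channel; all vectors are fabricated:

```python
import numpy as np

def attend(keys, query):
    """Soft attention: weight each key vector by the softmax of its
    similarity to the query, return the weighted sum and weights."""
    scores = keys @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ keys, w

context_words = np.array([[1.0, 0.0],   # e.g. a domain keyword
                          [0.0, 1.0]])  # a generic word
response_vec = np.array([1.0, 0.1])     # latent response representation
ctx_summary, weights = attend(context_words, response_vec)
```

Running the same operation in the opposite direction (response words attended by the context summary) gives the "vice versa" half, and a third attention over encoded keyword descriptions injects the domain knowledge.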
Microblog Hashtag Generation via Encoding Conversation Contexts
Automatic hashtag annotation plays an important role in content understanding
for microblog posts. To date, progress made in this field has been restricted
to phrase selection from limited candidates, or word-level hashtag discovery
using topic models. Unlike previous work, which treats hashtags as inseparable
units, ours is the first effort to annotate hashtags with a sequence generation
framework that views a hashtag as a short sequence of words. Moreover, to
address the data sparsity issue in processing short
microblog posts, we propose to jointly model the target posts and the
conversation contexts initiated by them with bidirectional attention. Extensive
experimental results on two large-scale datasets, newly collected from English
Twitter and Chinese Weibo, show that our model significantly outperforms
state-of-the-art models based on classification. Further studies demonstrate
our model's ability to generate rare and even unseen hashtags, which most
existing methods cannot do.
Comment: NAACL 2019 (10 pages).
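Treating a hashtag as a word sequence rather than an atomic label is what allows unseen hashtags to be composed from known words. A toy greedy decoder over a tiny vocabulary makes the point; the probabilities are fabricated:

```python
# a hypothetical 4-word decoder vocabulary
vocab = ["world", "cup", "final", "<eos>"]

def greedy_decode(step_probs):
    """Pick the argmax word at each step until <eos>, then join the
    words into a hashtag string."""
    out = []
    for probs in step_probs:
        word = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
        if word == "<eos>":
            break
        out.append(word)
    return "#" + "".join(out)

steps = [[0.7, 0.1, 0.1, 0.1],    # -> "world"
         [0.1, 0.8, 0.05, 0.05],  # -> "cup"
         [0.05, 0.05, 0.1, 0.8]]  # -> <eos>
tag = greedy_decode(steps)
```

A classifier over a fixed hashtag inventory could never emit "#worldcup" unless that exact tag appeared in training, whereas a word-level generator only needs to have seen the component words.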