1,608 research outputs found
Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus
In Web search, entity-seeking queries often trigger a special Question
Answering (QA) system. It may use a parser to interpret the question to a
structured query, execute that on a knowledge graph (KG), and return direct
entity responses. QA systems based on precise parsing tend to be brittle: minor
syntax variations may dramatically change the response. Moreover, KG coverage
is patchy. At the other extreme, a large corpus may provide broader coverage,
but in an unstructured, unreliable form. We present AQQUCN, a QA system that
gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of
query syntax, between well-formed questions to short `telegraphic' keyword
sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals
from KGs and large corpora to directly rank KG entities, rather than commit to
one semantic interpretation of the query. AQQUCN models the ideal
interpretation as an unobservable or latent variable. Interpretations and
candidate entity responses are scored as pairs, by combining signals from
multiple convolutional networks that operate collectively on the query, KG and
corpus. On four public query workloads, amounting to over 8,000 queries with
diverse query syntax, we see 5--16% absolute improvement in mean average
precision (MAP), compared to the entity ranking performance of recent systems.
Our system is also competitive at entity set retrieval, almost doubling F1
scores for challenging short queries.Comment: Accepted to Information Retrieval Journa
Graph Neural Networks for Natural Language Processing: A Survey
Deep learning has become the dominant approach in coping with various tasks
in Natural LanguageProcessing (NLP). Although text inputs are typically
represented as a sequence of tokens, there isa rich variety of NLP problems
that can be best expressed with a graph structure. As a result, thereis a surge
of interests in developing new deep learning techniques on graphs for a large
numberof NLP tasks. In this survey, we present a comprehensive overview onGraph
Neural Networks(GNNs) for Natural Language Processing. We propose a new
taxonomy of GNNs for NLP, whichsystematically organizes existing research of
GNNs for NLP along three axes: graph construction,graph representation
learning, and graph based encoder-decoder models. We further introducea large
number of NLP applications that are exploiting the power of GNNs and summarize
thecorresponding benchmark datasets, evaluation metrics, and open-source codes.
Finally, we discussvarious outstanding challenges for making the full use of
GNNs for NLP as well as future researchdirections. To the best of our
knowledge, this is the first comprehensive overview of Graph NeuralNetworks for
Natural Language Processing.Comment: 127 page
Bipartite Flat-Graph Network for Nested Named Entity Recognition
In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for
nested named entity recognition (NER), which contains two subgraph modules: a
flat NER module for outermost entities and a graph module for all the entities
located in inner layers. Bidirectional LSTM (BiLSTM) and graph convolutional
network (GCN) are adopted to jointly learn flat entities and their inner
dependencies. Different from previous models, which only consider the
unidirectional delivery of information from innermost layers to outer ones (or
outside-to-inside), our model effectively captures the bidirectional
interaction between them. We first use the entities recognized by the flat NER
module to construct an entity graph, which is fed to the next graph module. The
richer representation learned from graph module carries the dependencies of
inner entities and can be exploited to improve outermost entity predictions.
Experimental results on three standard nested NER datasets demonstrate that our
BiFlaG outperforms previous state-of-the-art models.Comment: Accepted by ACL202
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
We propose a distance supervised relation extraction approach for
long-tailed, imbalanced data which is prevalent in real-world settings. Here,
the challenge is to learn accurate "few-shot" models for classes existing at
the tail of the class distribution, for which little data is available.
Inspired by the rich semantic correlations between classes at the long tail and
those at the head, we take advantage of the knowledge from data-rich classes at
the head of the distribution to boost the performance of the data-poor classes
at the tail. First, we propose to leverage implicit relational knowledge among
class labels from knowledge graph embeddings and learn explicit relational
knowledge using graph convolution networks. Second, we integrate that
relational knowledge into relation extraction model by coarse-to-fine
knowledge-aware attention mechanism. We demonstrate our results for a
large-scale benchmark dataset which show that our approach significantly
outperforms other baselines, especially for long-tail relations.Comment: To be published in NAACL 201
Cross Temporal Recurrent Networks for Ranking Question Answer Pairs
Temporal gates play a significant role in modern recurrent-based neural
encoders, enabling fine-grained control over recursive compositional operations
over time. In recurrent models such as the long short-term memory (LSTM),
temporal gates control the amount of information retained or discarded over
time, not only playing an important role in influencing the learned
representations but also serving as a protection against vanishing gradients.
This paper explores the idea of learning temporal gates for sequence pairs
(question and answer), jointly influencing the learned representations in a
pairwise manner. In our approach, temporal gates are learned via 1D
convolutional layers and then subsequently cross applied across question and
answer for joint learning. Empirically, we show that this conceptually simple
sharing of temporal gates can lead to competitive performance across multiple
benchmarks. Intuitively, what our network achieves can be interpreted as
learning representations of question and answer pairs that are aware of what
each other is remembering or forgetting, i.e., pairwise temporal gating. Via
extensive experiments, we show that our proposed model achieves
state-of-the-art performance on two community-based QA datasets and competitive
performance on one factoid-based QA dataset.Comment: Accepted to AAAI201
On the Robustness of Aspect-based Sentiment Analysis: Rethinking Model, Data, and Training
Aspect-based sentiment analysis (ABSA) aims at automatically inferring the
specific sentiment polarities toward certain aspects of products or services
behind the social media texts or reviews, which has been a fundamental
application to the real-world society. Since the early 2010s, ABSA has achieved
extraordinarily high accuracy with various deep neural models. However,
existing ABSA models with strong in-house performances may fail to generalize
to some challenging cases where the contexts are variable, i.e., low robustness
to real-world environments. In this study, we propose to enhance the ABSA
robustness by systematically rethinking the bottlenecks from all possible
angles, including model, data, and training. First, we strengthen the current
best-robust syntax-aware models by further incorporating the rich external
syntactic dependencies and the labels with aspect simultaneously with a
universal-syntax graph convolutional network. In the corpus perspective, we
propose to automatically induce high-quality synthetic training data with
various types, allowing models to learn sufficient inductive bias for better
robustness. Last, we based on the rich pseudo data perform adversarial training
to enhance the resistance to the context perturbation and meanwhile employ
contrastive learning to reinforce the representations of instances with
contrastive sentiments. Extensive robustness evaluations are conducted. The
results demonstrate that our enhanced syntax-aware model achieves better
robustness performances than all the state-of-the-art baselines. By
additionally incorporating our synthetic corpus, the robust testing results are
pushed with around 10% accuracy, which are then further improved by installing
the advanced training strategies. In-depth analyses are presented for revealing
the factors influencing the ABSA robustness.Comment: Accepted in ACM Transactions on Information System
- …