109,685 research outputs found
Dual Pointer Network for Fast Extraction of Multiple Relations in a Sentence
Relation extraction is a type of information extraction task that recognizes
semantic relationships between entities in a sentence. Many previous studies
have focused on extracting only one semantic relation between two entities in a
single sentence. However, multiple entities in a sentence are associated
through various relations. To address this issue, we propose a relation
extraction model based on a dual pointer network with a multi-head attention
mechanism. The proposed model finds n-to-1 subject-object relations using a
forward object decoder. Then, it finds 1-to-n subject-object relations using a
backward subject decoder. Our experiments confirmed that the proposed model
outperformed previous models, achieving F1-scores of 80.8% on the ACE-2005 corpus
and 78.3% on the NYT corpus.
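The two-direction decoding described above can be illustrated with a toy sketch (made-up pointer scores and token names, not the proposed network; in the real model the scores come from pointer decoders with multi-head attention):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

# Toy sentence; index 3 is a sentinel "no relation" slot.
tokens = ["Jobs", "founded", "Apple", "<none>"]

# Hypothetical pointer scores from the two decoders
# (rows: one decoding step, cols: candidate tokens to point at).
forward_scores  = np.array([[0.1, 0.2, 3.0, 0.0]])   # subject "Jobs" -> which object?
backward_scores = np.array([[3.0, 0.1, 0.2, 0.0]])   # object "Apple" -> which subject?

# Forward object decoder: for a subject, point at its object token.
obj = tokens[int(softmax(forward_scores)[0].argmax())]
# Backward subject decoder: for an object, point at its subject token.
subj = tokens[int(softmax(backward_scores)[0].argmax())]

print(subj, obj)  # prints: Jobs Apple
```

Because each decoder emits a full distribution per entity, one subject can point at several objects (n-to-1) and one object at several subjects (1-to-n), which is what lets multiple relations be read off a single sentence.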
OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference
In this paper, we consider advancing web-scale knowledge extraction and
alignment by integrating OpenIE extractions in the form of (subject, predicate,
object) triples with Knowledge Bases (KB). Traditional techniques from
universal schema and from schema mapping fall at two extremes: either they
perform instance-level inference relying on embeddings of (subject, object)
pairs, and thus cannot handle pairs absent from any existing triples; or they
perform predicate-level mapping and completely ignore background evidence from
individual entities, and thus cannot achieve satisfactory quality. We propose OpenKI
to handle sparsity of OpenIE extractions by performing instance-level
inference: for each entity, we encode the rich information in its neighborhood
in both KB and OpenIE extractions, and leverage this information in relation
inference by exploring different methods of aggregation and attention. In order
to handle unseen entities, our model is designed without creating
entity-specific parameters. Extensive experiments show that this method not
only significantly improves over the state of the art for conventional OpenIE
extractions like ReVerb, but also boosts performance on OpenIE from
semi-structured data, where new entity pairs are abundant and data are fairly
sparse.
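The core idea, encoding an entity purely through attention over the predicates in its neighborhood so that unseen entities need no dedicated parameters, can be sketched like this (the 4-dimensional embeddings and predicate names are invented for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical embeddings for predicates observed around one entity.
# No entity-specific parameters: the entity is represented only through
# its neighborhood, so unseen entities are handled the same way.
neighbor_preds = np.array([
    [1.0, 0.0, 0.0, 0.0],   # e.g. "born in"
    [0.9, 0.1, 0.0, 0.0],   # e.g. "birthplace"
    [0.0, 0.0, 1.0, 0.0],   # e.g. "works for"
])
query = np.array([1.0, 0.0, 0.0, 0.0])  # target relation to score

# Attention over the neighborhood: predicates similar to the target
# relation receive higher weight in the aggregation.
weights = softmax(neighbor_preds @ query)
entity_repr = weights @ neighbor_preds   # aggregated neighborhood encoding
score = float(entity_repr @ query)       # relation-inference score
```

Mean or max pooling are alternative aggregators; the attention variant lets the relevant neighbors dominate when OpenIE extractions are sparse.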
Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction
In this paper, we consider the problem of open information extraction (OIE)
for extracting entity- and relation-level intermediate structures from sentences
in the open domain. We focus on four types of valuable intermediate structures
(Relation, Attribute, Description, and Concept), and propose a unified
knowledge expression form, SAOKE, to express them. We publicly release a data
set which contains more than forty thousand sentences and the corresponding
facts in the SAOKE format labeled by crowd-sourcing. To our knowledge, this is
the largest publicly available human labeled data set for open information
extraction tasks. Using this labeled SAOKE data set, we train an end-to-end
neural model, called Logician, based on the sequence-to-sequence paradigm to
transform sentences into facts. Unlike existing algorithms, which generally
extract each fact in isolation without considering other possible facts,
Logician performs a global optimization over all facts involved in a sentence:
facts not only compete with each other to attract the attention of words, but
also cooperate to share words. An experimental study on various types of
open-domain relation extraction tasks shows that Logician consistently
outperforms other state-of-the-art algorithms. The experiments confirm the
soundness of the SAOKE format, the value of the SAOKE data set, the
effectiveness of the proposed Logician model, and the feasibility of applying
the end-to-end learning paradigm on supervised data sets to the challenging
task of open information extraction.
Faithful to the Original: Fact Aware Neural Abstractive Summarization
Unlike extractive summarization, abstractive summarization has to fuse
different parts of the source text, which makes it prone to creating fake facts. Our
preliminary study reveals nearly 30% of the outputs from a state-of-the-art
neural summarization system suffer from this problem. While previous
abstractive summarization approaches usually focus on the improvement of
informativeness, we argue that faithfulness is also a vital prerequisite for a
practical abstractive summarization system. To avoid generating fake facts in a
summary, we leverage open information extraction and dependency parse
technologies to extract actual fact descriptions from the source text. The
dual-attention sequence-to-sequence framework is then proposed to force the
generation to be conditioned on both the source text and the extracted fact
descriptions. Experiments on the Gigaword benchmark dataset demonstrate that
our model can greatly reduce fake summaries by 80%. Notably, the fact
descriptions also bring significant improvement on informativeness since they
often condense the meaning of the source text.
Comment: 8 pages, 3 figures, AAAI 201
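The dual-attention idea, attending separately over the source text and the extracted fact descriptions and then fusing the two context vectors, can be sketched as follows (the scalar sigmoid gate and all dimensions are assumptions for illustration, not the paper's exact fusion):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 8
source_states = rng.normal(size=(5, d))   # encoder states of the source text
fact_states   = rng.normal(size=(3, d))   # encoder states of extracted fact descriptions
dec_state     = rng.normal(size=d)        # current decoder state

# Two independent attention reads, one per encoder.
ctx_src  = softmax(source_states @ dec_state) @ source_states
ctx_fact = softmax(fact_states @ dec_state) @ fact_states

# Gated fusion so each generation step conditions on both streams;
# a single scalar gate is the simplest choice, used here as a stand-in.
gate = 1.0 / (1.0 + np.exp(-(ctx_src @ ctx_fact)))
context = gate * ctx_src + (1.0 - gate) * ctx_fact
```

Conditioning on the fact stream gives the decoder a second, already-verified view of the source content, which is what discourages hallucinated facts.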
PARN: Position-Aware Relation Networks for Few-Shot Learning
Few-shot learning poses the challenge that a classifier must quickly adapt
to new classes that do not appear in the training set, given only a few labeled
examples of each new class. This paper proposes a position-aware relation
network (PARN) to learn a more flexible and robust metric ability for few-shot
learning. Relation networks (RNs), a class of architectures for relational
reasoning, can acquire a deep metric ability for images by just being designed
as a simple convolutional neural network (CNN) [23]. However, due to the
inherent local connectivity of CNN, the CNN-based relation network (RN) can be
sensitive to the spatial position relationship of semantic objects in two
compared images. To address this problem, we introduce a deformable feature
extractor (DFE) to extract more efficient features, and design a dual
correlation attention mechanism (DCA) to deal with its inherent local
connectivity. Our proposed approach successfully extends the potential of RNs
to be position-aware of semantic objects by introducing only a small number of
parameters. We evaluate our approach on two major benchmark datasets, i.e.,
Omniglot and Mini-Imagenet, and on both of the datasets our approach achieves
state-of-the-art performance with the setting of using a shallow feature
extraction network. Notably, our 5-way 1-shot result on Omniglot
even outperforms previous 5-way 5-shot results.
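The correlation-attention intuition, comparing every spatial position of one image's feature map against every position of the other so the metric is less sensitive to where objects sit, can be sketched as follows (a bare NumPy illustration with random features; the actual DCA also involves deformable feature extraction and learned parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
h = w = 4
c = 8
feat_a = rng.normal(size=(h * w, c))  # query-image features, flattened spatial grid
feat_b = rng.normal(size=(h * w, c))  # support-image features

# Correlation between every position in A and every position in B:
# attending over B lets each A position match the semantic object
# wherever it sits spatially, instead of comparing fixed locations.
corr = feat_a @ feat_b.T                       # (h*w, h*w)
attended_b = softmax(corr, axis=1) @ feat_b    # B re-aligned to A's positions

# A position-tolerant similarity between the two images.
relation_score = float((feat_a * attended_b).sum())
```

A plain CNN comparator would instead match position i of A against position i of B, which is exactly the locality sensitivity the abstract describes.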
The Natural Language Decathlon: Multitask Learning as Question Answering
Deep learning has improved performance on many natural language processing
(NLP) tasks individually. However, general NLP models cannot emerge within a
paradigm that focuses on the particularities of a single metric, dataset, and
task. We introduce the Natural Language Decathlon (decaNLP), a challenge that
spans ten tasks: question answering, machine translation, summarization,
natural language inference, sentiment analysis, semantic role labeling,
zero-shot relation extraction, goal-oriented dialogue, semantic parsing, and
commonsense pronoun resolution. We cast all tasks as question answering over a
context. Furthermore, we present a new Multitask Question Answering Network
(MQAN) that jointly learns all tasks in decaNLP without any task-specific modules or
parameters in the multitask setting. MQAN shows improvements in transfer
learning for machine translation and named entity recognition, domain
adaptation for sentiment analysis and natural language inference, and zero-shot
capabilities for text classification. We demonstrate that the MQAN's
multi-pointer-generator decoder is key to this success and performance further
improves with an anti-curriculum training strategy. Though designed for
decaNLP, MQAN also achieves state-of-the-art results on the WikiSQL semantic
parsing task in the single-task setting. We also release code for procuring and
processing data, training and evaluating models, and reproducing all
experiments for decaNLP.
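Casting every task as question answering over a context can be illustrated with a few made-up examples (these are not drawn from the actual decaNLP data):

```python
# Each task becomes a (question, context, answer) triple, so a single
# QA model sees every task through the same interface.
examples = [
    # summarization
    {"question": "What is the summary?",
     "context": "The council voted 7-2 to approve the new park budget.",
     "answer": "Council approves park budget."},
    # sentiment analysis
    {"question": "Is this sentence positive or negative?",
     "context": "A delightful, warm-hearted film.",
     "answer": "positive"},
    # zero-shot relation extraction
    {"question": "Who is the employer of John?",
     "context": "John works for Acme Corp.",
     "answer": "Acme Corp"},
]

# The model never needs a task identifier; the question itself tells it
# what to do, which is what enables the zero-shot behavior described above.
inputs = [(ex["question"], ex["context"]) for ex in examples]
```

Because the task is specified in natural language rather than by a dedicated output head, a new task can be posed at test time simply by writing a new question.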
Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction
This paper addresses the problem of key phrase extraction from sentences.
Existing state-of-the-art supervised methods require large amounts of annotated
data to achieve good performance and generalization. Collecting labeled data
is, however, often expensive. In this paper, we redefine the problem as
question-answer extraction, and present SAMIE: Self-Asking Model for
Information Extraction, a semi-supervised model which dually learns to ask and
to answer questions by itself. Briefly, given a sentence and an answer,
the model needs to choose the most appropriate question; meanwhile,
for the same sentence and the question selected in the previous
step, the model predicts an answer. The model can support few-shot
learning with very limited supervision. It can also be used to perform
clustering analysis when no supervision is provided. Experimental results show
that the proposed method outperforms typical supervised methods especially when
given little labeled data.
Comment: 7 pages, 5 figures, submission to IJCAI1
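The ask/answer duality can be sketched with a toy scoring table (hypothetical scores standing in for the learned model):

```python
# Hypothetical question/answer compatibility scores for one sentence.
questions = ["who", "what", "when"]
answers = ["Alice", "a report", "Monday"]

score = {
    "who":  {"Alice": 0.9, "a report": 0.1, "Monday": 0.0},
    "what": {"Alice": 0.1, "a report": 0.8, "Monday": 0.1},
    "when": {"Alice": 0.0, "a report": 0.1, "Monday": 0.9},
}

def ask(answer):
    # Choose the question that best explains the given answer.
    return max(questions, key=lambda q: score[q][answer])

def answer_q(question):
    # Answer the chosen question on the same sentence.
    return max(answers, key=lambda a: score[question][a])

# Dual consistency: asking about an answer and then answering that question
# should return the original answer -- a signal that needs no extra labels,
# which is what makes the semi-supervised setting workable.
assert all(answer_q(ask(a)) == a for a in answers)
```

When only a few labeled pairs exist, this cycle-consistency check supplies training signal for the unlabeled sentences.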
Dual-Glance Model for Deciphering Social Relationships
Since the beginning of early civilizations, the social relationships
individuals form have been the basis of social structure in our daily
life. In the computer vision literature, much progress has been made in scene
understanding, such as object detection and scene parsing. Recent research
focuses on the relationships between objects based on their functionality and
geometric relations. In this work, we study the problem of social
relationship recognition in still images. We propose a dual-glance model
for social relationship recognition, where the first glance fixates on the
pair of individuals of interest and the second glance deploys an attention
mechanism to explore contextual cues. We have also collected a new large-scale
People in Social Context (PISC) dataset, which comprises 22,670 images and
76,568 annotated samples covering 9 types of social relationships. We provide benchmark
results on the PISC dataset, and qualitatively demonstrate the efficacy of the
proposed model.
Comment: IEEE International Conference on Computer Vision (ICCV), 201
Using Context Information to Enhance Simple Question Answering
With the rapid development of knowledge bases (KBs), question
answering (QA) based on KBs has become a hot research issue. In this paper, we
propose two frameworks (a pipeline framework and an end-to-end framework) that
focus on answering single-relation factoid questions. In both frameworks, we
study the effect of context information, such as an entity's notable type and
out-degree, on the quality of QA. In the end-to-end framework, we combine
char-level encoding and self-attention mechanisms, using weight sharing and
multi-task strategies to enhance the accuracy of QA. Experimental results show
that context information yields better simple-QA results in both the pipeline
framework and the end-to-end framework. In addition, we find that the
end-to-end framework achieves accuracy competitive with state-of-the-art
approaches while taking much less time.
Comment: under review, World Wide Web Journal
Medical Time Series Classification with Hierarchical Attention-based Temporal Convolutional Networks: A Case Study of Myotonic Dystrophy Diagnosis
Myotonia, which refers to delayed muscle relaxation after contraction, is the
main symptom of myotonic dystrophy patients. We propose a hierarchical
attention-based temporal convolutional network (HA-TCN) for myotonic dystrophy
diagnosis from handgrip time series data, and introduce mechanisms that enable
model explainability. We compare the performance of the HA-TCN model against
that of benchmark TCN models, LSTM models with and without attention
mechanisms, and SVM approaches with handcrafted features. In terms of
classification accuracy and F1 score, we found that all deep learning models
have similar levels of performance, and they all outperform SVM. Further, the HA-TCN
model outperforms its TCN counterpart with regards to computational efficiency
regardless of network depth, and in terms of performance particularly when the
number of hidden layers is small. Lastly, HA-TCN models can consistently
identify relevant time series segments in the relaxation phase of the handgrip
time series, and exhibit increased robustness to noise when compared to
attention-based LSTM models.
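The attention mechanism that makes the model explainable, weighting per-timestep TCN features so the weights themselves reveal which segments drove the diagnosis, can be sketched as follows (random stand-in features and attention vector, not the trained HA-TCN):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
T, d = 50, 16
tcn_feats = rng.normal(size=(T, d))   # per-timestep outputs of a TCN stack
attn_vec  = rng.normal(size=d)        # learned attention vector (random here)

# Attention pooling over time: the weights indicate which timesteps
# (e.g. the relaxation phase of the handgrip) drive the prediction,
# which is what makes the decision inspectable.
weights = softmax(tcn_feats @ attn_vec)
clip_repr = weights @ tcn_feats       # weighted summary fed to the classifier

top_step = int(weights.argmax())      # most relevant timestep
```

A hierarchical variant applies the same pooling at each TCN level and then over levels, so relevance can be read off at multiple temporal resolutions.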