A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations
We introduce an attention-based Bi-LSTM for Chinese implicit discourse
relations and demonstrate that modeling argument pairs as a joint sequence can
outperform word order-agnostic approaches. Our model benefits from a partial
sampling scheme and is conceptually simple, yet achieves state-of-the-art
performance on the Chinese Discourse Treebank. We also visualize its attention
activity to illustrate the model's ability to selectively focus on the relevant
parts of an input sequence.
Comment: To appear at ACL 2017, code available at
https://github.com/sronnqvist/discourse-ablst
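The attention pooling this abstract alludes to can be sketched in plain Python: score each recurrent hidden state against a query, normalize with softmax, and take the weighted sum. This is an illustrative toy with invented dimensions and weights, not the paper's actual parameterization.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Dot-product attention: score each hidden state against a query,
    normalize with softmax, and return the weights plus the weighted
    sum of the hidden states (the attended representation)."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, hidden_states))
              for d in range(dim)]
    return weights, pooled

# Toy example: three 2-d "Bi-LSTM" states; the third aligns best with the query.
states = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
weights, pooled = attention_pool(states, [1.0, 1.0])
```

The weights are what the paper visualizes: they show which parts of the joint argument sequence the model attends to.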
Neural Article Pair Modeling for Wikipedia Sub-article Matching
Nowadays, editors tend to separate different subtopics of a long Wikipedia
article into multiple sub-articles. This separation seeks to improve human
readability. However, it also has a deleterious effect on many Wikipedia-based
tasks that rely on the article-as-concept assumption, which requires each
entity (or concept) to be described solely by one article. This underlying
assumption significantly simplifies knowledge representation and extraction,
and it is vital to many existing technologies such as automated knowledge base
construction, cross-lingual knowledge alignment, semantic search and data
lineage of Wikipedia entities. In this paper we provide an approach to match
the scattered sub-articles back to their corresponding main-articles, with the
intent of facilitating automated Wikipedia curation and processing. The
proposed model adopts a hierarchical learning structure that combines multiple
variants of neural document pair encoders with a comprehensive set of explicit
features. A large crowdsourced dataset is created to support the evaluation and
feature extraction for the task. Based on this large dataset, the proposed
model achieves promising cross-validation results and significantly
outperforms previous approaches. Large-scale serving on the entire English
Wikipedia also demonstrates the practicability and scalability of the proposed
model by effectively extracting a vast collection of newly paired main and
sub-articles.
Comment: ECML-PKDD 2018. 16 pages, 4 figures.
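The final scoring step the abstract describes, combining neural document-pair encodings with explicit features, is at its simplest a linear layer over their concatenation. A minimal sketch with invented values, not the paper's hierarchical architecture:

```python
def score_pair(neural_encoding, explicit_features, weights, bias):
    """Concatenate a neural document-pair encoding with hand-crafted
    explicit features and score the result with a single linear layer."""
    combined = list(neural_encoding) + list(explicit_features)
    return sum(w * f for w, f in zip(weights, combined)) + bias

# Invented values: a 2-d encoder output plus two explicit features
# (say, a title-overlap score and a link-structure feature).
score = score_pair([0.4, -0.2], [1.0, 0.0], [1.0, 1.0, 0.5, 0.5], -0.1)
```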
Structure Regularized Neural Network for Entity Relation Classification for Chinese Literature Text
Relation classification is an important semantic processing task in the field
of natural language processing. In this paper, we propose the task of relation
classification for Chinese literature text. A new dataset of Chinese literature
text is constructed to facilitate the study in this task. We present a novel
model, named Structure Regularized Bidirectional Recurrent Convolutional Neural
Network (SR-BRCNN), to identify the relation between entities. The proposed
model learns relation representations along the shortest dependency path (SDP)
extracted from the structure regularized dependency tree, which has the
benefit of reducing the complexity of the whole model. Experimental results
show that the proposed method significantly improves the F1 score by 10.3, and
outperforms the state-of-the-art approaches on Chinese literature text.
Comment: Accepted at NAACL HLT 2018. arXiv admin note: substantial text overlap with arXiv:1711.0250
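Extracting the shortest dependency path (SDP) between two entities reduces to a shortest-path search over the dependency tree treated as an undirected graph. A minimal sketch, using a hypothetical head-array encoding of the tree (not the paper's actual pipeline):

```python
from collections import deque

def shortest_dependency_path(heads, start, end):
    """Shortest path between two token indices in a dependency tree,
    given as a head array (heads[i] is the parent of token i, -1 for
    the root). Edges are treated as undirected, as is usual for SDP."""
    adj = {i: set() for i in range(len(heads))}
    for i, h in enumerate(heads):
        if h >= 0:
            adj[i].add(h)
            adj[h].add(i)
    prev = {start: None}       # breadth-first search with back-pointers
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:        # reconstruct the path from the back-pointers
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in adj[node]:
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    return None

# Toy tree: token 2 is the root; 1 depends on 2, 0 on 1, 3 on 2, 4 on 3.
heads = [1, 2, -1, 2, 3]
path = shortest_dependency_path(heads, 0, 4)
```

The model then learns relation representations along this path rather than over the full sentence, which is where the complexity reduction comes from.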
One-shot Learning for Question-Answering in Gaokao History Challenge
Answering questions from university admission exams (Gaokao in Chinese) is a
challenging AI task since it requires effective representation to capture
complicated semantic relations between questions and answers. In this work, we
propose a hybrid neural model for the deep question-answering task posed by
history examinations. Our model employs a cooperative gated neural network to
retrieve answers with the assistance of extra labels given by a Neural Turing
Machine labeler. An empirical study shows that the labeler works well with only
a small
training dataset and the gated mechanism is good at fetching the semantic
representation of lengthy answers. Experiments on question answering
demonstrate that the proposed model obtains substantial performance gains over
various neural baselines in terms of multiple evaluation metrics.
Comment: Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018).
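A gating mechanism of the kind credited here with fetching answer representations can be sketched as an element-wise sigmoid gate blending two vectors. The gate parameterization below is invented for illustration, not the paper's cooperative gated network:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_merge(answer_vec, label_vec, gate_weights):
    """Element-wise gate deciding, per dimension, how much of the answer
    representation versus the auxiliary label signal to let through."""
    gates = [sigmoid(w * (a + l))
             for w, a, l in zip(gate_weights, answer_vec, label_vec)]
    return [g * a + (1.0 - g) * l
            for g, a, l in zip(gates, answer_vec, label_vec)]

# With a strongly positive pre-activation, the gate opens (close to 1)
# and the answer representation dominates both dimensions.
merged = gated_merge([1.0, -1.0], [0.0, 2.0], [5.0, 5.0])
```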
Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph
We argue that the semantic meaning of a sentence or clause cannot be
interpreted independently of the rest of its paragraph, or independently of
all discourse relations and the overall paragraph-level discourse structure.
With the goal of improving implicit discourse relation classification, we
introduce a paragraph-level neural network that models inter-dependencies
between discourse units as well as discourse relation continuity and patterns,
and predicts a sequence of discourse relations in a paragraph. Experimental
results show that our model outperforms the previous state-of-the-art systems
on the benchmark PDTB corpus.
Comment: Accepted by NAACL 2018.
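Decoding a sequence of discourse relations while rewarding relation continuity is a structured prediction problem; one common realization is Viterbi decoding over per-unit scores plus label-transition scores. A toy sketch with invented scores, not necessarily the paper's decoder:

```python
def viterbi(unary, trans):
    """Decode the best label sequence given per-step unary scores
    unary[t][y] and label-transition scores trans[y_prev][y]."""
    n_steps, n_labels = len(unary), len(unary[0])
    best = [list(unary[0])]          # best[t][y]: best score ending in y at t
    back = []                        # back-pointers for path recovery
    for t in range(1, n_steps):
        row, ptrs = [], []
        for y in range(n_labels):
            cands = [best[t - 1][yp] + trans[yp][y] + unary[t][y]
                     for yp in range(n_labels)]
            row.append(max(cands))
            ptrs.append(max(range(n_labels), key=lambda yp: cands[yp]))
        best.append(row)
        back.append(ptrs)
    y = max(range(n_labels), key=lambda lab: best[-1][lab])
    seq = [y]
    for ptrs in reversed(back):
        y = ptrs[y]
        seq.append(y)
    return seq[::-1]

# Two relation labels; transitions reward staying in the same relation
# (continuity). The middle step's unary score slightly prefers label 1,
# but continuity pulls the whole sequence to label 0.
unary = [[2.0, 0.0], [0.5, 0.6], [2.0, 0.0]]
trans = [[1.0, -1.0], [-1.0, 1.0]]
seq = viterbi(unary, trans)
```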
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Past work in relation extraction has focused on binary relations in single
sentences. Recent NLP inroads in high-value domains have sparked interest in
the more general setting of extracting n-ary relations that span multiple
sentences. In this paper, we explore a general relation extraction framework
based on graph long short-term memory networks (graph LSTMs) that can be easily
extended to cross-sentence n-ary relation extraction. The graph formulation
provides a unified way of exploring different LSTM approaches and incorporating
various intra-sentential and inter-sentential dependencies, such as sequential,
syntactic, and discourse relations. A robust contextual representation is
learned for the entities, which serves as input to the relation classifier.
This simplifies handling of relations with arbitrary arity, and enables
multi-task learning with related relations. We evaluate this framework in two
important precision medicine settings, demonstrating its effectiveness with
both conventional supervised learning and distant supervision. Cross-sentence
extraction produced larger knowledge bases, and multi-task learning
significantly improved extraction accuracy. A thorough analysis of various LSTM
approaches yielded useful insight into the impact of linguistic analysis on
extraction accuracy.
Comment: Conditionally accepted by TACL in December 2016; published in April 2017; presented at ACL in August 2017.
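The document-graph formulation can be illustrated by building a token graph with typed edges. The sketch below includes only sequential and syntactic edges (discourse edges omitted), with an invented encoding of the dependency parses:

```python
def build_document_graph(sentences, dep_edges):
    """Build a cross-sentence document graph: sequential edges between
    adjacent tokens of the flattened document, plus typed syntactic
    dependency edges. dep_edges maps a sentence index to (head, dependent)
    pairs in sentence-local token offsets."""
    offsets, tokens = [], []
    for sent in sentences:
        offsets.append(len(tokens))
        tokens.extend(sent)
    edges = []
    for i in range(len(tokens) - 1):
        edges.append((i, i + 1, "next"))   # includes cross-sentence links
    for s, pairs in dep_edges.items():
        base = offsets[s]
        for head, dep in pairs:
            edges.append((base + head, base + dep, "dep"))
    return tokens, edges

# Two toy sentences in the precision-medicine spirit of the paper.
sents = [["drugs", "inhibit", "EGFR"], ["mutations", "confer", "resistance"]]
deps = {0: [(1, 0), (1, 2)], 1: [(1, 0), (1, 2)]}
tokens, edges = build_document_graph(sents, deps)
```

A graph LSTM then propagates state along these typed edges, so entities in different sentences share context through both the sequential and the syntactic paths.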
Incorporating Relevant Knowledge in Context Modeling and Response Generation
To sustain engaging conversation, it is critical for chatbots to make good
use of relevant knowledge. Equipped with a knowledge base, chatbots are able to
extract conversation-related attributes and entities to facilitate context
modeling and response generation. In this work, we distinguish the uses of
attribute and entity and incorporate them into the encoder-decoder architecture
in different manners. Based on the augmented architecture, our chatbot, namely
Mike, is able to generate responses by referring to proper entities from the
collected knowledge. To validate the proposed approach, we build a movie
conversation corpus on which the proposed approach significantly outperforms
four other knowledge-grounded models.
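The knowledge-extraction step the abstract describes, pulling conversation-related entities and their attributes from a knowledge base, can be sketched with a hypothetical mini KB (entity names and attribute schema invented for illustration):

```python
# Hypothetical mini knowledge base of movie attributes (illustration only).
KB = {
    "Inception": {"director": "Christopher Nolan", "genre": "sci-fi"},
    "Heat": {"director": "Michael Mann", "genre": "crime"},
}

def extract_conversation_knowledge(context_tokens):
    """Return the entities mentioned in the context together with their
    attributes, ready to condition response generation."""
    entities = [tok for tok in context_tokens if tok in KB]
    attributes = {ent: KB[ent] for ent in entities}
    return entities, attributes

entities, attributes = extract_conversation_knowledge(["I", "loved", "Inception"])
```

The paper's contribution lies in feeding attributes and entities into the encoder-decoder in different manners; this sketch only shows the lookup that makes both available.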
Memorizing All for Implicit Discourse Relation Recognition
Implicit discourse relation recognition is a challenging task due to the
absence of the informative clues that explicit connectives provide. Predicting
these relations requires a deep understanding of the semantic meaning of
sentence pairs. Because an implicit discourse relation recognizer must handle
the semantic similarity of the given sentence pairs while also coping with
severe data sparsity, it stands to benefit from mastering the entire training
data. In this paper, we therefore propose a novel memory mechanism to tackle
these challenges and further improve performance. The memory mechanism
memorizes information by pairing the representations and discourse relations
of all training instances, which directly addresses the data-hunger of current
implicit discourse relation recognizers. Our experiments show that our full
model, which memorizes the entire training set, reaches a new state of the art
against strong baselines and, for the first time, exceeds the milestone of 60%
accuracy on the 4-way task.
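Pairing representations with discourse relations for every training instance behaves like a key-value memory. A minimal nearest-neighbor sketch; cosine retrieval is an assumption here, not necessarily the paper's addressing scheme:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

class TrainingMemory:
    """Store a (representation, relation) pair for every training
    instance; retrieval returns the relations of the closest keys."""
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, representation, relation):
        self.keys.append(representation)
        self.values.append(relation)

    def read(self, query, k=1):
        ranked = sorted(range(len(self.keys)),
                        key=lambda i: cosine(self.keys[i], query),
                        reverse=True)
        return [self.values[i] for i in ranked[:k]]

# Toy 2-d representations for two PDTB-style relation classes.
mem = TrainingMemory()
mem.write([1.0, 0.0], "Comparison")
mem.write([0.0, 1.0], "Contingency")
nearest = mem.read([0.9, 0.1], k=1)
```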
SRL4ORL: Improving Opinion Role Labeling using Multi-task Learning with Semantic Role Labeling
For over a decade, machine learning has been used to extract
opinion-holder-target structures from text to answer the question "Who
expressed what kind of sentiment towards what?". Recent neural approaches do
not outperform the state-of-the-art feature-based models for Opinion Role
Labeling (ORL). We suspect this is due to the scarcity of labeled training data
and address this issue using different multi-task learning (MTL) techniques
with a related task which has substantially more data, i.e. Semantic Role
Labeling (SRL). We show that two MTL models improve significantly over the
single-task model for labeling of both holders and targets, on the development
and the test sets. We found that the vanilla MTL model, which makes predictions
using only shared ORL and SRL features, performs best. With deeper analysis we
determine what works and what might be done to further improve ORL.
Comment: Published in NAACL 2018.
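The "vanilla" MTL setup the authors find best, predictions made from features shared between ORL and SRL, is hard parameter sharing: one shared encoder, separate task heads. A toy sketch with invented weights, not the paper's architecture:

```python
def shared_encoder(features):
    """Stand-in for the shared layers: both tasks see the exact same
    transformation of the input."""
    return [2.0 * f + 1.0 for f in features]

def task_head(weights, encoded):
    """Per-task linear scorer on top of the shared representation."""
    return sum(w * e for w, e in zip(weights, encoded))

# Hard parameter sharing: one encoder, separate heads for ORL and SRL.
# Gradients from both tasks update the shared encoder, which is how the
# data-rich SRL task helps the data-poor ORL task.
x = [1.0, 2.0]
shared = shared_encoder(x)
orl_score = task_head([0.5, -0.2], shared)
srl_score = task_head([0.1, 0.3], shared)
```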