16,753 research outputs found
Method of Tibetan Person Knowledge Extraction
Person knowledge extraction is the foundation of Tibetan knowledge graph
construction. It supports Tibetan question answering systems,
information retrieval, information extraction and other research, and
promotes national unity and social stability. This paper proposes an SVM- and
template-based approach to Tibetan person knowledge extraction. After
constructing the training corpus, we build the templates based on shallow
parsing analysis of Tibetan syntactic and semantic features and verbs. Using the
training corpus, we design a hierarchical SVM classifier to perform entity
knowledge extraction. Finally, experimental results show that the method
improves Tibetan person knowledge extraction.
Comment: 6 pages
Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification
Relation classification is an important semantic processing task in the field
of natural language processing (NLP). In this paper, we present a novel model,
the Structure Regularized Bidirectional Recurrent Convolutional Neural
Network (SR-BRCNN), to classify the relation between two entities in a sentence,
and a new dataset of Chinese Sanwen for named entity recognition and relation
classification. Some state-of-the-art systems concentrate on modeling the
shortest dependency path (SDP) between two entities using convolutional or
recurrent neural networks. We further explore how to make full use of the
dependency relation information in the SDP and how to improve the model through
structure regularization. We propose a structure regularized model that
learns relation representations along the SDP extracted from the forest formed
by the structure regularized dependency tree, which reduces the
complexity of the whole model and helps improve the score by 10.3.
Experimental results show that our method outperforms the state-of-the-art
approaches on the Chinese Sanwen task and performs as well on the SemEval-2010
Task 8 dataset. The Chinese Sanwen corpus developed and used in this paper
will be released in the future.
Comment: arXiv admin note: text overlap with arXiv:1411.6243 by other authors
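The shortest dependency path mentioned in this abstract can be illustrated with a minimal sketch. The edge format, the function name `shortest_dependency_path`, and the toy parse are assumptions made for illustration; the sketch covers only path extraction over an ordinary dependency tree, not structure regularization or the neural model.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the token path between two words in an undirected view of a dependency tree.

    `edges` is a list of (head, dependent) pairs; this toy format is an assumption.
    """
    # Build an undirected adjacency list so the path can pass through a common ancestor.
    neighbors = {}
    for head, dependent in edges:
        neighbors.setdefault(head, []).append(dependent)
        neighbors.setdefault(dependent, []).append(head)

    # Breadth-first search, remembering how each node was reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbors.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # the two words are not connected


# Hypothetical parse of "The burst was caused by pressure".
edges = [("caused", "burst"), ("burst", "The"), ("caused", "was"),
         ("caused", "pressure"), ("pressure", "by")]
path = shortest_dependency_path(edges, "burst", "pressure")
```

Treating the tree as undirected is a simplification; a model that distinguishes the two directions along the path would need to keep the head/dependent orientation of each edge.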
A Question Answering Approach to Emotion Cause Extraction
Emotion cause extraction aims to identify the reasons behind a certain
emotion expressed in text. It is a much more difficult task than emotion
classification. Inspired by recent advances in using deep memory networks for
question answering (QA), we propose a new approach which considers emotion
cause identification as a reading comprehension task in QA. Inspired by
convolutional neural networks, we propose a new mechanism to store relevant
context in different memory slots to model context information. Our proposed
approach can extract both word level sequence features and lexical features.
Performance evaluation shows that our method achieves the state-of-the-art
performance on a recently released emotion cause dataset, outperforming a
number of competitive baselines by at least 3.01% in F-measure.
Comment: Accepted by EMNLP 201
A Biomedical Information Extraction Primer for NLP Researchers
Biomedical Information Extraction is an exciting field at the crossroads of
Natural Language Processing, Biology and Medicine. It encompasses a variety of
different tasks that require application of state-of-the-art NLP techniques,
such as NER and Relation Extraction. This paper provides an overview of the
problems in the field and discusses some of the techniques used for solving
them.
Relation Extraction : A Survey
With the advent of the Internet, a large amount of digital text is generated
every day in the form of news articles, research publications, blogs, question
answering forums and social media. It is important to develop techniques for
extracting information automatically from these documents, as a lot of important
information is hidden within them. This extracted information can be used to
improve access to and management of knowledge hidden in large text corpora.
Several applications, such as Question Answering and Information Retrieval, would
benefit from this information. Entities such as persons and organizations form
the most basic unit of information. Occurrences of entities in a sentence
are often linked through well-defined relations; e.g., occurrences of a person
and an organization in a sentence may be linked through relations such as
"employed at". The task of Relation Extraction (RE) is to identify such relations
automatically. In this paper, we survey several important supervised,
semi-supervised and unsupervised RE techniques. We also cover the paradigms of
Open Information Extraction (OIE) and Distant Supervision. Finally, we describe
some of the recent trends in RE techniques and possible future research
directions. This survey would be useful for three kinds of readers: i)
newcomers to the field who want to quickly learn about RE; ii) researchers who
want to know how the various RE techniques evolved over time and what the
possible future research directions are; and iii) practitioners who just need to
know which RE technique works best in various settings.
Emotional Contribution Analysis of Online Reviews
In response to the constant increase in population and tourism worldwide,
there is a need for cross-language market research tools
that are more cost- and time-effective than surveys or interviews. Focusing on
the Chinese tourism boom and the hotel industry in Japan, we extracted the
keywords most influential in emotional judgement from Chinese online reviews of
Japanese hotels on the portal site Ctrip. Using an entropy-based mathematical
model and a machine learning algorithm, we determined the words that most
closely represent the demands and emotions of this customer base.
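The abstract does not specify its entropy-based model, so the following is only a generic sketch of how Shannon entropy might rank words by how unevenly they occur across review classes. The function name `class_entropy`, the two-class setup, and all counts are assumptions for illustration and are not taken from the paper.

```python
import math


def class_entropy(counts):
    """Shannon entropy (in bits) of a word's occurrence counts across classes."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probabilities = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probabilities)


# Hypothetical counts: (occurrences in positive reviews, occurrences in negative reviews).
word_counts = {"clean": (40, 2), "room": (30, 30), "noisy": (1, 25)}

# Lower entropy means the word is concentrated in one class,
# which this sketch treats as a sign that it carries emotional weight.
ranked = sorted(word_counts, key=lambda w: class_entropy(word_counts[w]))
```

Under this assumption, a word that appears equally often in both classes (entropy of 1 bit) ranks last, while words tied almost exclusively to one class rank first.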
An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering
Relation detection plays a crucial role in Knowledge Base Question Answering
(KBQA) because of the high variance of relation expression in the question.
Traditional deep learning methods follow an encoding-comparing paradigm, where
the question and the candidate relation are represented as vectors to compare
their semantic similarity. Max- or average-pooling operations, which compress
the sequence of words into fixed-dimensional vectors, become the information
bottleneck. In this paper, we propose to learn attention-based word-level
interactions between questions and relations to alleviate the bottleneck issue.
As in the traditional models, the question and relation are first
represented as sequences of vectors. Then, instead of merging each sequence into
a single vector with a pooling operation, soft alignments between words from the
question and the relation are learned. The aligned words are subsequently
compared using a convolutional neural network (CNN), and the comparison results
are finally merged. By performing the comparison on low-level
representations, the attention-based word-level interaction model (ABWIM)
relieves the information loss caused by merging the sequence into a
fixed-dimensional vector before the comparison. The experimental results of
relation detection on both the SimpleQuestions and WebQuestions datasets show that
ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
Comment: Paper submitted to Neurocomputing at 11.12.201
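The soft alignment step described in this abstract can be sketched in a few lines. Dot-product scoring, the function names, and the toy 2-dimensional embeddings are assumptions for illustration; the abstract does not state the scoring function, and the CNN comparison and merging steps are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_align(question, relation):
    """For each question word vector, return an attention-weighted mix of relation word vectors."""
    dim = len(relation[0])
    aligned = []
    for q in question:
        # Attention weights over relation words (dot-product scoring is an assumption).
        weights = softmax([dot(q, r) for r in relation])
        mixed = [sum(w * r[i] for w, r in zip(weights, relation)) for i in range(dim)]
        aligned.append(mixed)
    return aligned


# Toy 2-dimensional embeddings (hypothetical values).
question = [[1.0, 0.0], [0.0, 1.0]]
relation = [[4.0, 0.0], [0.0, 4.0]]
aligned = soft_align(question, relation)
```

Each question word keeps its own aligned vector, so no pooling into a single fixed-dimensional vector happens before the comparison step.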
Learning Chinese Word Representations From Glyphs Of Characters
In this paper, we propose new methods to learn Chinese word representations.
Chinese characters are composed of graphical components, which carry rich
semantics. It is common for a Chinese learner to comprehend the meaning of a
word from these graphical components. As a result, we propose models that
enhance word representations with character glyphs. The character glyph features
are learned directly from the bitmaps of characters by a convolutional
auto-encoder (convAE), and the glyph features improve Chinese word
representations that are already enhanced by character embeddings. Another
contribution of this paper is that we created several evaluation datasets in
traditional Chinese and made them public.
An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition
In this paper, an effective unconstrained correlation filter called
Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to
robust face recognition. Compared with the conventional correlation filters in
Class-dependence Feature Analysis (CFA), UOOTF improves the overall performance
for unseen patterns by removing the hard constraints on the origin correlation
outputs during the filter design. To handle non-linearly separable
distributions between different classes, we further develop a non-linear
extension of UOOTF based on the kernel technique. The kernel extension of
UOOTF allows for higher flexibility of the decision boundary due to a wider
range of non-linearity properties. Experimental results demonstrate the
effectiveness of the proposed unconstrained correlation filter and its
kernelization in the task of face recognition.
Automatic Severity Classification of Coronary Artery Disease via Recurrent Capsule Network
Coronary artery disease (CAD) is one of the leading causes of cardiovascular
disease deaths. CAD progresses rapidly and, if not diagnosed and treated
at an early stage, may eventually lead to irreversible death of the heart
muscle. Invasive coronary arteriography is the gold standard technique
for CAD diagnosis. Coronary arteriography texts describe in detail which part
has stenosis and how severe the stenosis is. It is crucial to classify the
severity of CAD. In this paper, we employ a recurrent capsule
network (RCN) to extract semantic relations between clinical named entities in
Chinese coronary arteriography texts, through which we can automatically find
the maximal stenosis for each lumen and infer how severe CAD is
according to the improved Gensini method. Experimental results on the corpus
collected from Shanghai Shuguang Hospital show that our proposed method
achieves an accuracy of 97.0% in the severity classification of CAD.
Comment: 8 pages, 5 figures
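The step of finding the maximal stenosis for each lumen can be sketched as a simple aggregation over extracted relations. The `(lumen, stenosis percent)` pair format, the function name, and the sample values are assumptions for illustration; the relation extraction itself and the improved Gensini scoring are not shown, since the abstract gives no details of either.

```python
def max_stenosis_per_lumen(relations):
    """Map each lumen to the largest stenosis percentage reported for it."""
    worst = {}
    for lumen, stenosis_percent in relations:
        worst[lumen] = max(worst.get(lumen, 0), stenosis_percent)
    return worst


# Hypothetical (lumen, stenosis %) pairs standing in for relations extracted from a report.
relations = [("LAD", 50), ("LAD", 90), ("RCA", 30), ("LCX", 75), ("RCA", 20)]
worst = max_stenosis_per_lumen(relations)
```

A severity score would then be computed from these per-lumen maxima; the weighting used for that step is left out here because it is not stated in the abstract.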