16,753 research outputs found

Method of Tibetan Person Knowledge Extraction

Author: Sun Yuan; Zhu Zhen
Publication date: 11/04/2016

Person knowledge extraction is the foundation of Tibetan knowledge graph construction; it supports Tibetan question answering systems, information retrieval, information extraction, and other research, and promotes national unity and social stability. This paper proposes an SVM- and template-based approach to Tibetan person knowledge extraction. After constructing a training corpus, we build templates based on shallow parsing of Tibetan syntactic features, semantic features, and verbs. Using the training corpus, we design a hierarchical SVM classifier to carry out the entity knowledge extraction. Finally, experimental results show that the method improves Tibetan person knowledge extraction. Comment: 6 pages

arXiv.org e-Print Archive

Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification

Author: Wen Ji
Publication date: 06/11/2017

Relation classification is an important semantic processing task in natural language processing (NLP). In this paper, we present a novel model, the Structure Regularized Bidirectional Recurrent Convolutional Neural Network (SR-BRCNN), to classify the relation between two entities in a sentence, together with a new Chinese Sanwen dataset for named entity recognition and relation classification. Some state-of-the-art systems concentrate on modeling the shortest dependency path (SDP) between two entities using convolutional or recurrent neural networks. We further explore how to make full use of the dependency relation information in the SDP and how to improve the model through structure regularization. We propose a structure regularized model that learns relation representations along the SDP extracted from the forest formed by the structure regularized dependency tree, which reduces the complexity of the whole model and helps improve the $F_{1}$ score by 10.3. Experimental results show that our method outperforms the state-of-the-art approaches on the Chinese Sanwen task and performs as well on the SemEval-2010 Task 8 dataset. (Footnote: The Chinese Sanwen corpus developed and used in this paper will be released in the future.) Comment: arXiv admin note: text overlap with arXiv:1411.6243 by other authors

arXiv.org e-Print Archive
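
The abstract above builds on the shortest dependency path (SDP) between two entities. As a rough illustration of that one ingredient only (not the SR-BRCNN model or its structure regularization), the sketch below finds the SDP with a breadth-first search over an undirected view of a dependency tree. The function name, the edge list, and the toy parse are all hypothetical, and tokens are assumed to be unique strings.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the token path from source to target, or None if unconnected."""
    # Build an undirected adjacency map from (head, dependent) pairs.
    neighbours = {}
    for head, dependent in edges:
        neighbours.setdefault(head, []).append(dependent)
        neighbours.setdefault(dependent, []).append(head)

    # Breadth-first search, remembering each token's predecessor.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)

    if target not in previous:
        return None

    # Walk back from target to source, then reverse.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]


# Hypothetical parse of "The burst was caused by pressure" (illustrative only).
edges = [
    ("caused", "burst"),
    ("burst", "The"),
    ("caused", "was"),
    ("caused", "pressure"),
    ("pressure", "by"),
]
sdp = shortest_dependency_path(edges, "burst", "pressure")
print(sdp)
```

In a real pipeline the edges would come from a dependency parser and tokens would be identified by position rather than by surface string.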

A Question Answering Approach to Emotion Cause Extraction

Author: Du Jiachen; Gui Lin; He Yulan; Hu Jiannan; Lu Qin; Xu Ruifeng
Publication date: 23/09/2017

Emotion cause extraction aims to identify the reasons behind a certain emotion expressed in text. It is a much more difficult task than emotion classification. Inspired by recent advances in using deep memory networks for question answering (QA), we propose a new approach which considers emotion cause identification as a reading comprehension task in QA. Inspired by convolutional neural networks, we propose a new mechanism to store relevant context in different memory slots to model context information. Our proposed approach can extract both word-level sequence features and lexical features. Performance evaluation shows that our method achieves state-of-the-art performance on a recently released emotion cause dataset, outperforming a number of competitive baselines by at least 3.01% in F-measure. Comment: Accepted by EMNLP 201

arXiv.org e-Print Archive

A Biomedical Information Extraction Primer for NLP Researchers

Author: Nair Surag
Publication date: 10/05/2017

Biomedical Information Extraction is an exciting field at the crossroads of Natural Language Processing, Biology and Medicine. It encompasses a variety of tasks that require the application of state-of-the-art NLP techniques, such as NER and Relation Extraction. This paper provides an overview of the problems in the field and discusses some of the techniques used for solving them.

arXiv.org e-Print Archive

Relation Extraction: A Survey

Author: Bhattacharyya Pushpak; Palshikar Girish K.; Pawar Sachin
Publication date: 14/12/2017

With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access to and management of knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities like persons and organizations form the most basic unit of the information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of a person and an organization in a sentence may be linked through relations such as "employed at". The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some of the recent trends in RE techniques and possible future research directions. This survey would be useful for three kinds of readers: i) newcomers to the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.

arXiv.org e-Print Archive

Emotional Contribution Analysis of Online Reviews

Author: Carreón Elisa Claire Alemán; Hiraoka Toru; Hirota Masaharu; Ito Takao; Kumano Minoru; Nonaka Hirofumi
Publication venue: ALife Robotics Corporation Ltd.
Publication date: 01/05/2019

In response to the constant increase in population and tourism worldwide, there is a need for cross-language market research tools that are more cost- and time-effective than surveys or interviews. Focusing on the Chinese tourism boom and the hotel industry in Japan, we extracted the most influential keywords in emotional judgement from Chinese online reviews of Japanese hotels on the portal site Ctrip. Using an entropy-based mathematical model and a machine learning algorithm, we determined the words that most closely represent the demands and emotions of this customer base.

arXiv.org e-Print Archive

An Attention-Based Word-Level Interaction Model: Relation Detection for Knowledge Base Question Answering

Author: Fu Kun; Huang Tinglei; Liang Xiao; Xu Guandong; Zhang Hongzhi
Publication date: 30/01/2018

Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in the question. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. The max- or average-pooling operation, which compresses the sequence of words into fixed-dimensional vectors, becomes an information bottleneck. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate the bottleneck issue. As in the traditional models, the question and relation are first represented as sequences of vectors. Then, instead of merging the sequence into a single vector with a pooling operation, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with a convolutional neural network (CNN), and the comparison results are finally merged. By performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) relieves the information loss caused by merging the sequence into a fixed-dimensional vector before the comparison. The experimental results of relation detection on both the SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness. Comment: Paper submitted to Neurocomputing at 11.12.201

arXiv.org e-Print Archive
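
The soft alignment idea in the ABWIM abstract above can be illustrated with a small sketch: instead of pooling the question into one vector, each relation word attends over all question words and receives its own weighted mix of them. The sketch is a minimal illustration under stated assumptions, not the paper's model: the dot-product scoring, the direction of attention (relation words attending to question words), the function names, and the toy vectors are all hypothetical, and the CNN comparison step is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_align(question, relation):
    """For every relation word, attend over all question words.

    Returns (weights, aligned): weights[i][j] is the attention that relation
    word i pays to question word j, and aligned[i] is the resulting weighted
    average of the question word vectors.
    """
    dim = len(question[0])
    weights, aligned = [], []
    for r in relation:
        # Assumed scoring function: plain dot product.
        w = softmax([dot(r, q) for q in question])
        weights.append(w)
        aligned.append(
            [sum(wj * q[k] for wj, q in zip(w, question)) for k in range(dim)]
        )
    return weights, aligned


# Toy 2-dimensional "embeddings" (made-up values, not learned).
question = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relation = [[4.0, 0.0], [0.0, 4.0]]
weights, aligned = soft_align(question, relation)
for w, a in zip(weights, aligned):
    print([round(x, 3) for x in w], [round(x, 3) for x in a])
```

Each relation word keeps its own aligned vector, so the comparison can happen word by word before any merging, which is the point the abstract makes against early pooling.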

Learning Chinese Word Representations From Glyphs Of Characters

Author: Lee Hung-Yi; Su Tzu-Ray
Publication date: 15/08/2017

In this paper, we propose new methods to learn Chinese word representations. Chinese characters are composed of graphical components, which carry rich semantics. It is common for a Chinese learner to comprehend the meaning of a word from these graphical components. As a result, we propose models that enhance word representations with character glyphs. The character glyph features are learned directly from the bitmaps of characters by a convolutional auto-encoder (convAE), and the glyph features improve Chinese word representations that are already enhanced by character embeddings. Another contribution of this paper is that we created several evaluation datasets in traditional Chinese and made them public.

arXiv.org e-Print Archive

An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition

Author: Li Cuihua; Wang Hanzi; Yan Yan; Yang Chenhui; Zhong Bineng
Publication date: 24/03/2016

In this paper, an effective unconstrained correlation filter called Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to robust face recognition. Compared with the conventional correlation filters in Class-dependence Feature Analysis (CFA), UOOTF improves the overall performance for unseen patterns by removing the hard constraints on the origin correlation outputs during the filter design. To handle non-linearly separable distributions between different classes, we further develop a non-linear extension of UOOTF based on the kernel technique. The kernel extension of UOOTF allows for higher flexibility of the decision boundary due to a wider range of non-linearity properties. Experimental results demonstrate the effectiveness of the proposed unconstrained correlation filter and its kernelization in the task of face recognition.

arXiv.org e-Print Archive

Automatic Severity Classification of Coronary Artery Disease via Recurrent Capsule Network

Author: Gao Daqi; Gao Ju; Qiu Jiahui; Ruan Tong; Wang Qi; Zhou Yangming
Publication date: 27/11/2018

Coronary artery disease (CAD) is one of the leading causes of cardiovascular disease deaths. CAD progresses rapidly and, if not diagnosed and treated at an early stage, may eventually lead to irreversible death of the heart muscle. Invasive coronary arteriography is the gold standard technique for CAD diagnosis. Coronary arteriography texts describe in detail which part has stenosis and how severe the stenosis is. It is crucial to conduct the severity classification of CAD. In this paper, we employ a recurrent capsule network (RCN) to extract semantic relations between clinical named entities in Chinese coronary arteriography texts, through which we can automatically find the maximal stenosis for each lumen and infer how severe CAD is according to the improved method of Gensini. Experimental results on the corpus collected from Shanghai Shuguang Hospital show that our proposed method achieves an accuracy of 97.0% in the severity classification of CAD. Comment: 8 pages, 5 figures

arXiv.org e-Print Archive
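
The abstract above mentions finding the maximal stenosis for each lumen once relations have been extracted. That aggregation step alone might look like the sketch below, assuming the extracted relations arrive as (lumen, stenosis percentage) pairs. The function name, the input format, and the sample values are hypothetical; the sketch does not implement the recurrent capsule network or the Gensini-based scoring, whose details the abstract does not give.

```python
def max_stenosis_per_lumen(relations):
    """Map each lumen to the largest stenosis percentage extracted for it."""
    worst = {}
    for lumen, stenosis_percent in relations:
        worst[lumen] = max(worst.get(lumen, 0.0), stenosis_percent)
    return worst


# Made-up (lumen, stenosis %) pairs standing in for extracted relations.
extracted = [
    ("LAD", 50.0),
    ("RCA", 30.0),
    ("LAD", 80.0),
    ("LCX", 0.0),
]
worst = max_stenosis_per_lumen(extracted)
print(worst)
```

A severity score would then be computed from these per-lumen maxima by whatever scoring table the method prescribes.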