Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning
Named entity recognition and other information extraction tasks frequently
use linguistic features such as part-of-speech tags or chunking. For languages
where word boundaries are not readily identified in text, word segmentation is
a key first step in generating features for an NER system. While using word
boundary tags as features is helpful, the signals that aid in identifying
these boundaries may provide richer information for an NER system. New
state-of-the-art word segmentation systems use neural models to learn
representations for predicting word boundaries. We show that these same
representations, jointly trained with an NER system, yield significant
improvements in NER for Chinese social media. In our experiments, jointly
training NER and word segmentation with an LSTM-CRF model yields nearly 5%
absolute improvement over previously published results.
Comment: This is the camera-ready version of our ACL'16 paper. We also added
supplementary material containing the results of our systems on a cleaner
dataset (much higher F1 scores). For more information, please refer to the repo
https://github.com/hltcoe/golden-hors
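As a rough illustration of the joint training idea described above, the sketch below shares one character-level BiLSTM encoder between a word-segmentation head and an NER head and sums the two losses. It is a minimal sketch, not the authors' implementation: the published model uses an LSTM-CRF, whereas here the CRF layer is replaced by per-token softmax losses, and all dimensions, tag counts, and data are illustrative.

# Minimal sketch: one character encoder shared by a word-segmentation head
# and an NER head, trained on the sum of both losses. Illustrative only;
# the published model adds a CRF layer on top of the LSTM.
import torch
import torch.nn as nn

class JointSegNER(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=200, n_seg_tags=4, n_ner_tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden // 2, bidirectional=True, batch_first=True)
        self.seg_head = nn.Linear(hidden, n_seg_tags)   # BMES boundary tags
        self.ner_head = nn.Linear(hidden, n_ner_tags)   # BIO entity tags

    def forward(self, char_ids):
        h, _ = self.encoder(self.embed(char_ids))       # (batch, seq, hidden)
        return self.seg_head(h), self.ner_head(h)

model = JointSegNER(vocab_size=5000)
loss_fn = nn.CrossEntropyLoss()
chars = torch.randint(0, 5000, (2, 20))                 # toy batch
seg_gold = torch.randint(0, 4, (2, 20))
ner_gold = torch.randint(0, 9, (2, 20))
seg_logits, ner_logits = model(chars)
loss = loss_fn(seg_logits.view(-1, 4), seg_gold.view(-1)) \
     + loss_fn(ner_logits.view(-1, 9), ner_gold.view(-1))
loss.backward()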
Overview of the Ugglan Entity Discovery and Linking System
Ugglan is a system designed to discover named entities and link them to
unique identifiers in a knowledge base. It is based on a combination of a name
and nominal dictionary derived from Wikipedia and Wikidata, a named entity
recognition (NER) module using fixed ordinally-forgetting encoding (FOFE)
trained on the TAC EDL data from 2014-2016, a candidate generation module from
the Wikipedia link graph across multiple editions, a PageRank link and
cooccurrence graph disambiguator, and finally a reranker trained on the TAC EDL
2015-2016 data.
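As a loose illustration of the graph-based disambiguation step, the toy sketch below runs PageRank over a small candidate co-occurrence graph and picks the highest-scoring candidate for a mention. The graph, node names, and damping factor are invented for illustration and do not come from the Ugglan system.

# Toy PageRank disambiguation over a candidate co-occurrence graph.
# The graph, mentions, and candidates below are invented examples.
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1.0
    trans = adj / out                       # row-stochastic transition matrix
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        scores = (1 - damping) / n + damping * trans.T @ scores
    return scores

# Two candidates for the mention "Paris", plus a context entity; edges
# encode hypothetical Wikipedia link co-occurrence.
nodes = ["Paris_(city)", "Paris_(mythology)", "France"]
adj = np.array([[0, 0, 1],
                [0, 0, 0],
                [1, 0, 0]], dtype=float)
scores = pagerank(adj)
best = nodes[int(np.argmax(scores[:2]))]    # pick the best candidate for "Paris"
print(best, scores)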
CAN-NER: Convolutional Attention Network for Chinese Named Entity Recognition
Named entity recognition (NER) in Chinese is essential but difficult because
of the lack of natural delimiters. Therefore, Chinese Word Segmentation (CWS)
is usually considered as the first step for Chinese NER. However, models based
on word-level embeddings and lexicon features often suffer from segmentation
errors and out-of-vocabulary (OOV) words. In this paper, we investigate a
Convolutional Attention Network called CAN for Chinese NER, which consists of a
character-based convolutional neural network (CNN) with local-attention layer
and a gated recurrent unit (GRU) with global self-attention layer to capture
the information from adjacent characters and sentence contexts. Also, compared
with other models, ours is more practical because it does not depend on any
external resources such as lexicons and uses only small character embeddings.
Extensive experimental results show that our approach outperforms
state-of-the-art methods without word embedding and external lexicon resources
on different domain datasets including Weibo, MSRA and Chinese Resume NER
dataset.
Comment: This paper was accepted by NAACL-HLT 2019. The code is available at
https://github.com/microsoft/vert-papers/tree/master/papers/CAN-NE
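The following is a simplified sketch in the spirit of the architecture described above: a character CNN for adjacent-character context, a bidirectional GRU, and a dot-product self-attention layer over the sentence. The local-attention and CRF components of the published model are omitted, and all sizes are illustrative.

# Simplified character-level sketch: CNN over characters for local context,
# a bidirectional GRU, and global dot-product self-attention, followed by a
# per-character tag classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCAN(nn.Module):
    def __init__(self, vocab, emb=64, hid=128, tags=9):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, emb, kernel_size=3, padding=1)  # adjacent chars
        self.gru = nn.GRU(emb, hid // 2, bidirectional=True, batch_first=True)
        self.out = nn.Linear(hid, tags)

    def forward(self, chars):
        x = self.embed(chars)                                  # (B, T, emb)
        x = F.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        h, _ = self.gru(x)                                      # (B, T, hid)
        attn = torch.softmax(h @ h.transpose(1, 2) / h.size(-1) ** 0.5, dim=-1)
        return self.out(attn @ h)                               # (B, T, tags)

logits = TinyCAN(vocab=6000)(torch.randint(0, 6000, (2, 30)))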
Multi-Source Cross-Lingual Model Transfer: Learning What to Share
Modern NLP applications have enjoyed a great boost from neural network
models. Such deep neural models, however, are not applicable to most human
languages due to the lack of annotated training data for various NLP tasks.
Cross-lingual transfer learning (CLTL) is a viable method for building NLP
models for a low-resource target language by leveraging labeled data from other
(source) languages. In this work, we focus on the multilingual transfer setting
where training data in multiple source languages is leveraged to further boost
target language performance.
Unlike most existing methods that rely only on language-invariant features
for CLTL, our approach coherently utilizes both language-invariant and
language-specific features at instance level. Our model leverages adversarial
networks to learn language-invariant features, and mixture-of-experts models to
dynamically exploit the similarity between the target language and each
individual source language. This enables our model to learn effectively what to
share between various languages in the multilingual setup. Moreover, when
coupled with unsupervised multilingual embeddings, our model can operate in a
zero-resource setting where neither target language training data nor
cross-lingual resources are available. Our model achieves significant
performance gains over prior art, as shown in an extensive set of experiments
over multiple text classification and sequence tagging tasks including a
large-scale industry dataset.
Comment: ACL 201
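As an illustration of the per-instance mixing idea, the toy sketch below weights each source-language expert's prediction by the similarity between the target instance and a per-language embedding. In the paper the gate and experts are trained jointly alongside adversarially learned language-invariant features; here everything is hand-wired and the vectors are made up.

# Toy mixture-of-experts combination for multi-source transfer: each source
# language has an "expert" prediction for a target-language instance, and a
# gate mixes them per instance.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical instance representation and per-source-language embeddings.
instance = np.array([0.2, 0.9, -0.1])
source_embs = {"es": np.array([0.1, 1.0, 0.0]),
               "fr": np.array([0.3, 0.8, -0.2]),
               "de": np.array([-0.5, 0.1, 0.9])}
# Each expert's class-probability prediction for this instance (3 classes).
expert_preds = {"es": np.array([0.7, 0.2, 0.1]),
                "fr": np.array([0.6, 0.3, 0.1]),
                "de": np.array([0.2, 0.3, 0.5])}

gate = softmax(np.array([instance @ source_embs[s] for s in source_embs]))
mixed = sum(w * expert_preds[s] for w, s in zip(gate, source_embs))
print(dict(zip(source_embs, gate.round(2))), mixed.round(2))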
A Multi-task Learning Approach for Named Entity Recognition using Local Detection
Named entity recognition (NER) systems that perform well require task-related
and manually annotated datasets. However, they are expensive to develop, and
are thus limited in size. As there already exists a large number of NER
datasets that share a certain degree of relationship but differ in content, it
is important to explore the question of whether such datasets can be combined
as a simple method for improving NER performance. To investigate this, we
developed a novel locally detecting multi-task model using feed-forward neural networks (FFNNs). The model
relies on encoding variable-length sequences of words into theoretically
lossless and unique fixed-size representations. We applied this method to
several well-known NER tasks and compared the results of our model to baseline
models as well as other published results. As a result, we observed competitive
performance in nearly all of the tasks.
Comment: 8 pages, 1 figure, 5 tables (Rejected by ACL 2018 with score 3-4-4)
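A minimal sketch of the dataset-combination idea, assuming each fragment is already encoded into a fixed-size vector (e.g. via FOFE): a shared feed-forward encoder feeds one output layer per dataset, so datasets with different tag sets can be trained together. Names and dimensions are illustrative, not the paper's.

# Shared feed-forward encoder with one classification head per NER dataset.
import torch
import torch.nn as nn

class MultiTaskNER(nn.Module):
    def __init__(self, input_dim, hidden, tagsets):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        # one head per dataset, e.g. {"conll03": 9, "ontonotes": 37}
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, n)
                                    for name, n in tagsets.items()})

    def forward(self, x, dataset):
        return self.heads[dataset](self.shared(x))

model = MultiTaskNER(input_dim=300, hidden=256,
                     tagsets={"conll03": 9, "ontonotes": 37})
fragment_vec = torch.randn(4, 300)        # toy fixed-size fragment encodings
logits = model(fragment_vec, "conll03")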
Effective Context and Fragment Feature Usage for Named Entity Recognition
In this paper, we explore a new approach to named entity recognition (NER)
with the goal of learning from context and fragment features more effectively,
contributing to the improvement of overall recognition performance. We use the
recent fixed-size ordinally forgetting encoding (FOFE) method to fully encode
each sentence fragment and its left-right contexts into a fixed-size
representation. Next, we organize the context and fragment features into
groups, and feed each feature group to dedicated fully-connected layers.
Finally, we merge each group's final dedicated layers and add a shared layer
leading to a single output. Our experimental results show that, given
only tokenized text and trained word embeddings, our system outperforms our
baseline models and is competitive with the state of the art on various
well-known NER tasks.
Comment: 7 pages, 1 figure, 7 tables (Rejected by EMNLP 2018 with score
3-4-4). arXiv admin note: text overlap with arXiv:1904.0330
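For reference, a minimal sketch of the FOFE encoding itself: z_t = alpha * z_{t-1} + e_t, where e_t is the one-hot vector of the t-th token and alpha is a forgetting factor in (0, 1), so a variable-length sequence maps to a fixed-size vector that is unique for suitable alpha. The vocabulary and alpha below are illustrative.

# Minimal FOFE (fixed-size ordinally forgetting encoding) of a token sequence.
import numpy as np

def fofe(token_ids, vocab_size, alpha=0.7):
    z = np.zeros(vocab_size)
    for t in token_ids:
        z = alpha * z + np.eye(vocab_size)[t]   # z_t = alpha * z_{t-1} + e_t
    return z

vocab = {"the": 0, "cat": 1, "sat": 2}
left_context = fofe([vocab["the"], vocab["cat"]], len(vocab))  # order-aware
fragment = fofe([vocab["sat"]], len(vocab))
print(left_context, fragment)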
Neural Entity Reasoner for Global Consistency in NER
We propose the Neural Entity Reasoner (NE-Reasoner), a framework that
introduces global consistency of recognized entities into the Neural Reasoner
for the Named Entity Recognition (NER) task. Given an input sentence, the
NE-Reasoner layer can reason over multiple entities to increase the global
consistency of the output labels, which are then converted into entities that
serve as input to the next layer. NE-Reasoner inherits and extends several
features of the Neural Reasoner: 1) a symbolic memory, allowing it to exchange
entities between layers; 2) a specific interaction-pooling mechanism, allowing
it to connect each local word to multiple global entities; and 3) a deep
architecture, allowing it to bootstrap the recognized entity set from coarse to
fine. Like human beings, NE-Reasoner is able to accommodate ambiguous words and
named entities that it has rarely or never encountered before. Despite the
symbolic information introduced, NE-Reasoner can still be trained effectively
in an end-to-end manner via a parameter-sharing strategy. NE-Reasoner
outperforms conventional NER models in most cases on both English and Chinese
NER datasets; for example, it achieves state-of-the-art results on the
CoNLL-2003 English NER dataset.
Comment: 8 pages, 3 figures, submitted to AAAI201
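A very simplified view of the interaction-pooling idea, with toy vectors: each word representation attends over a memory of entity vectors cached from the previous layer, and the pooled entity information is concatenated back before re-tagging. This illustrates the mechanism as described in the abstract, not the authors' implementation.

# Each word attends over a symbolic memory of previously recognized entities;
# the pooled result is appended to the word representation for the next layer.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

word_reps = np.random.randn(6, 8)          # 6 words, dim 8 (e.g. from an LSTM)
entity_memory = np.random.randn(2, 8)      # entities cached from layer l-1

augmented = []
for w in word_reps:
    attn = softmax(entity_memory @ w)              # similarity to cached entities
    pooled = attn @ entity_memory                  # interaction pooling
    augmented.append(np.concatenate([w, pooled]))  # fed to the layer-l tagger
augmented = np.stack(augmented)                    # (6, 16)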
Adversarial Learning for Chinese NER from Crowd Annotations
To quickly obtain new labeled data, crowdsourcing can be chosen as a cheaper
and faster alternative. In exchange, however, crowd annotations from
non-experts may be of lower quality than those from experts.
In this paper, we propose an approach to performing crowd annotation learning
for Chinese Named Entity Recognition (NER) to make full use of the noisy
sequence labels from multiple annotators. Inspired by adversarial learning, our
approach uses a common Bi-LSTM and a private Bi-LSTM for representing
annotator-generic and -specific information. The annotator-generic information
is the common knowledge about entities that is easily mastered by the crowd. Finally, we
build our Chinese NE tagger based on the LSTM-CRF model. In our experiments, we
create two data sets for Chinese NER tasks from two domains. The experimental
results show that our system achieves better scores than strong baseline
systems.
Comment: 8 pages, AAAI-201
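The sketch below illustrates the common/private split with a gradient-reversal discriminator: the common BiLSTM's features are pushed to be annotator-indistinguishable while both encoders feed the tagger. It is a simplified sketch, using a softmax tagger in place of the paper's CRF; dimensions and the annotator count are illustrative.

# Common BiLSTM (adversarially annotator-invariant via gradient reversal)
# plus a private BiLSTM; both feed a per-token tag classifier.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad            # reverse gradients into the common encoder

class CrowdNER(nn.Module):
    def __init__(self, vocab, emb=100, hid=100, tags=9, annotators=5):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.common = nn.LSTM(emb, hid // 2, bidirectional=True, batch_first=True)
        self.private = nn.LSTM(emb, hid // 2, bidirectional=True, batch_first=True)
        self.tagger = nn.Linear(2 * hid, tags)            # softmax tagger (CRF in paper)
        self.discriminator = nn.Linear(hid, annotators)   # predicts the annotator

    def forward(self, words):
        x = self.embed(words)
        c, _ = self.common(x)
        p, _ = self.private(x)
        tag_logits = self.tagger(torch.cat([c, p], dim=-1))
        ann_logits = self.discriminator(GradReverse.apply(c).mean(dim=1))
        return tag_logits, ann_logits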
A Knowledge Graph Based Solution for Entity Discovery and Linking in Open-Domain Questions
Named entity discovery and linking is a fundamental and core component of
question answering. In the Question Entity Discovery and Linking (QEDL)
problem, traditional methods are challenged because multiple entities in one
short question are difficult to discover entirely, and the incomplete
information in short text makes entity linking hard to implement. To overcome
these difficulties, we proposed a knowledge graph based solution for QEDL and
developed a system consisting of a Question Entity Discovery (QED) module and
an Entity Linking (EL) module. The QED module is a trade-off and ensemble of
two methods: one is based on knowledge graph retrieval, which extracts more
entities from questions and guarantees the recall rate; the other is based on
a Conditional Random Field (CRF), which improves the precision rate. The EL
module is treated as a ranking problem, and a Learning to Rank (LTR) method
with features such as semantic similarity, text similarity, and entity
popularity is utilized to extract and make full use of the information in
short texts. On the official dataset of a shared QEDL evaluation task, our
approach obtained an F1 score of 64.44% for QED and an accuracy of 64.86% for
EL, ranking second place and indicating its practical use for the QEDL problem.
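As a toy illustration of the EL ranking step, the snippet below scores link candidates with the kinds of features named above (semantic similarity, text similarity, entity popularity) under hand-set weights; in the paper these weights are learned with an LTR model, and the candidate values here are invented.

# Rank hypothetical link candidates by a weighted combination of features.
def score(candidate):
    weights = {"semantic_sim": 0.5, "text_sim": 0.3, "popularity": 0.2}  # illustrative
    return sum(weights[k] * candidate[k] for k in weights)

candidates = [
    {"id": "entity_A", "semantic_sim": 0.82, "text_sim": 0.91, "popularity": 0.88},
    {"id": "entity_B", "semantic_sim": 0.40, "text_sim": 0.95, "popularity": 0.05},
]
best = max(candidates, key=score)
print(best["id"])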
Exploring Lexical, Syntactic, and Semantic Features for Chinese Textual Entailment in NTCIR RITE Evaluation Tasks
We computed linguistic information at the lexical, syntactic, and semantic
levels for Recognizing Inference in Text (RITE) tasks for both traditional and
simplified Chinese in NTCIR-9 and NTCIR-10. Techniques for syntactic parsing,
named-entity recognition, and near synonym recognition were employed, and
features like counts of common words, statement lengths, negation words, and
antonyms were considered to judge the entailment relationships of two
statements, while we explored both heuristics-based functions and
machine-learning approaches. The reported systems showed robustness by
simultaneously achieving second positions in the binary-classification subtasks
for both simplified and traditional Chinese in NTCIR-10 RITE-2. We conducted
more experiments with the test data of NTCIR-9 RITE, with good results. We also
extended our work to search for better configurations of our classifiers and
investigated contributions of individual features. This extended work showed
interesting results and should encourage further discussion.
Comment: 20 pages, 1 figure, 26 tables, Journal article in Soft Computing
(Springer). Soft Computing, online. Springer, Germany, 201
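As a small illustration of the shallow lexical features listed above (common-word counts, statement lengths, negation words), the snippet below combines them with a simple heuristic threshold; the word list and threshold are illustrative, not the paper's configuration.

# Toy lexical features and a heuristic entailment decision for a text/hypothesis pair.
NEGATIONS = {"not", "no", "never"}

def features(t, h):
    t_set, h_set = set(t.split()), set(h.split())
    return {
        "common_words": len(t_set & h_set),
        "len_diff": abs(len(t.split()) - len(h.split())),
        "negation_mismatch": len(NEGATIONS & t_set) != len(NEGATIONS & h_set),
    }

def entails(t, h, min_overlap=0.7):
    f = features(t, h)
    overlap = f["common_words"] / max(len(set(h.split())), 1)
    return overlap >= min_overlap and not f["negation_mismatch"]

print(entails("the cat sat on the mat", "the cat sat"))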