6,277 research outputs found
MLBiNet: A Cross-Sentence Collective Event Detection Network
We consider the problem of collectively detecting multiple events,
particularly in cross-sentence settings. The key to dealing with the problem is
to encode semantic information and model event inter-dependency at a
document-level. In this paper, we reformulate it as a Seq2Seq task and propose
a Multi-Layer Bidirectional Network (MLBiNet) to capture the document-level
association of events and semantic information simultaneously. Specifically, a
bidirectional decoder is firstly devised to model event inter-dependency within
a sentence when decoding the event tag vector sequence. Secondly, an
information aggregation module is employed to aggregate sentence-level semantic
and event tag information. Finally, we stack multiple bidirectional decoders
and feed cross-sentence information, forming a multi-layer bidirectional
tagging architecture to iteratively propagate information across sentences. We
show that our approach provides significant improvement in performance compared
to the current state-of-the-art results.Comment: Accepted by ACL 202
Text Classification: A Review, Empirical, and Experimental Evaluation
The explosive and widespread growth of data necessitates the use of text
classification to extract crucial information from vast amounts of data.
Consequently, there has been a surge of research in both classical and deep
learning text classification methods. Despite the numerous methods proposed in
the literature, there is still a pressing need for a comprehensive and
up-to-date survey. Existing survey papers categorize algorithms for text
classification into broad classes, which can lead to the misclassification of
unrelated algorithms and incorrect assessments of their qualities and behaviors
using the same metrics. To address these limitations, our paper introduces a
novel methodological taxonomy that classifies algorithms hierarchically into
fine-grained classes and specific techniques. The taxonomy includes methodology
categories, methodology techniques, and methodology sub-techniques. Our study
is the first survey to utilize this methodological taxonomy for classifying
algorithms for text classification. Furthermore, our study also conducts
empirical evaluation and experimental comparisons and rankings of different
algorithms that employ the same specific sub-technique, different
sub-techniques within the same technique, different techniques within the same
category, and categorie
Event Extraction: A Survey
Extracting the reported events from text is one of the key research themes in
natural language processing. This process includes several tasks such as event
detection, argument extraction, role labeling. As one of the most important
topics in natural language processing and natural language understanding, the
applications of event extraction spans across a wide range of domains such as
newswire, biomedical domain, history and humanity, and cyber security. This
report presents a comprehensive survey for event detection from textual
documents. In this report, we provide the task definition, the evaluation
method, as well as the benchmark datasets and a taxonomy of methodologies for
event extraction. We also present our vision of future research direction in
event detection.Comment: 20 page
PersoNER: Persian named-entity recognition
© 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To abridge this gap, in this paper we target the Persian language that is spoken by a population of over a hundred million people world-wide. We first present and provide ArmanPerosNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNNL scores while outperforming two alternatives based on a CRF and a recurrent neural network
- …