Span Model for Open Information Extraction on Accurate Corpus
Open information extraction (Open IE) is a challenging task, especially because of
its brittle data basis. Most Open IE systems have to be trained on
automatically built corpora and evaluated on an inaccurate test set. In this work,
we first alleviate this difficulty on both the training and the test side.
For the former, we propose an improved model design that exploits the training
dataset more fully. For the latter, we present our accurately
re-annotated benchmark test set (Re-OIE6), built according to a series of linguistic
observations and analyses. Then, we introduce a span model in place of the previously
adopted sequence labeling formulation for n-ary Open IE. Our newly introduced
model achieves new state-of-the-art performance on both benchmark evaluation
datasets.
Comment: this paper has been accepted by AAAI 202
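As a rough illustration of how a span formulation differs from per-token sequence labeling, the sketch below enumerates candidate token spans that a model could then score as predicates or arguments. It is not the authors' implementation: the helper name `enumerate_spans`, the `max_width` limit, and the example sentence are assumptions made for illustration.

```python
from typing import List, Tuple


def enumerate_spans(tokens: List[str], max_width: int) -> List[Tuple[int, int]]:
    """Return every (start, end) span, end inclusive, of at most max_width tokens.

    Hypothetical helper: a span model would score each candidate span,
    whereas sequence labeling assigns one tag per token.
    """
    spans = []
    for start in range(len(tokens)):
        # The last valid end index is bounded by the width limit and the sentence end.
        for end in range(start, min(start + max_width, len(tokens))):
            spans.append((start, end))
    return spans


# Illustrative sentence (assumption, not taken from the paper).
tokens = ["Obama", "was", "born", "in", "Hawaii"]
spans = enumerate_spans(tokens, max_width=3)
```

The number of candidates grows with sentence length times `max_width`, which is why span-based approaches typically cap the span width.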
Transition-based Semantic Role Labeling with Pointer Networks
Semantic role labeling (SRL) focuses on recognizing the predicate-argument
structure of a sentence and plays a critical role in many natural language
processing tasks such as machine translation and question answering.
Practically none of the available methods performs full SRL, since they rely on
pre-identified predicates, and most of them follow a pipeline strategy, using
specific models to undertake one or several SRL subtasks. In addition,
previous approaches depend strongly on syntactic information to
achieve state-of-the-art performance, even though syntactic trees are equally
hard to produce. These simplifications and requirements make the majority of
SRL systems impractical for real-world applications. In this article, we
propose the first transition-based SRL approach that is capable of completely
processing an input sentence in a single left-to-right pass, neither
leveraging syntactic information nor resorting to additional modules. Thanks to
our implementation based on Pointer Networks, full SRL can be accurately and
efficiently done in , achieving the best performance to date on the
majority of languages from the CoNLL-2009 shared task.
Comment: Final peer-reviewed manuscript accepted for publication in
Knowledge-Based System
Unifying context with labeled property graph: A pipeline-based system for comprehensive text representation in NLP
Extracting valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses this challenge by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. The proposed approach leverages labeled property graphs (LPG) to encode contextual information, facilitating the integration of diverse linguistic elements into a unified representation. The proposed system enables efficient graph-based querying and manipulation by addressing the crucial aspect of comprehensive context modeling and fine-grained semantics. The effectiveness of the proposed system is demonstrated through the implementation of NLP components that operate on LPG-based representations. Additionally, the proposed approach introduces specialized patterns and algorithms to enhance specific NLP tasks, including nominal mention detection, named entity disambiguation, event enrichments, event participant detection, and temporal link detection. The evaluation of the proposed approach, using the MEANTIME corpus comprising manually annotated documents, provides encouraging results and valuable insights into the system's strengths. The proposed pipeline-based framework serves as a solid foundation for future research, aiming to refine and optimize LPG-based graph structures to generate comprehensive and semantically rich text representations, addressing the challenges associated with efficient information extraction and analysis in NLP.
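To make the labeled property graph idea concrete, the following is a minimal in-memory sketch, assuming nodes that carry a set of labels plus key-value properties and directed, typed edges. The class names, the labels (`Token`, `Event`), the relation type (`PARTICIPATES_IN`), and the example data are all hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Set


@dataclass
class Node:
    node_id: int
    labels: Set[str]
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: int
    target: int
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Tiny labeled property graph: labeled nodes, typed directed edges."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: int, labels: Iterable[str], **properties: Any) -> Node:
        node = Node(node_id, set(labels), dict(properties))
        self.nodes[node_id] = node
        return node

    def add_edge(self, source: int, target: int, rel_type: str, **properties: Any) -> Edge:
        # Reject edges whose endpoints have not been added yet.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        edge = Edge(source, target, rel_type, dict(properties))
        self.edges.append(edge)
        return edge

    def with_label(self, label: str) -> List[Node]:
        return [n for n in self.nodes.values() if label in n.labels]

    def neighbors(self, node_id: int, rel_type: str) -> List[Node]:
        # Follow outgoing edges of the given type.
        return [self.nodes[e.target] for e in self.edges
                if e.source == node_id and e.rel_type == rel_type]


# Hypothetical example: one token mention linked to an event node.
graph = PropertyGraph()
graph.add_node(1, ["Token"], text="acquired", index=2)
graph.add_node(2, ["Token", "NamedEntity"], text="Google", index=0)
graph.add_node(3, ["Event"], lemma="acquire")
graph.add_edge(2, 3, "PARTICIPATES_IN", role="agent")
participants_of = graph.neighbors(2, "PARTICIPATES_IN")
```

Because linguistic layers become labels and relation types on one graph, a query such as "all named entities participating in an event" reduces to a label filter followed by an edge traversal.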
Graph Neural Networks for Natural Language Processing: A Survey
Deep learning has become the dominant approach in coping with various tasks
in Natural Language Processing (NLP). Although text inputs are typically
represented as a sequence of tokens, there is a rich variety of NLP problems
that can be best expressed with a graph structure. As a result, there is a surge
of interest in developing new deep learning techniques on graphs for a large
number of NLP tasks. In this survey, we present a comprehensive overview of Graph
Neural Networks (GNNs) for Natural Language Processing. We propose a new
taxonomy of GNNs for NLP, which systematically organizes existing research on
GNNs for NLP along three axes: graph construction, graph representation
learning, and graph-based encoder-decoder models. We further introduce a large
number of NLP applications that exploit the power of GNNs and summarize
the corresponding benchmark datasets, evaluation metrics, and open-source code.
Finally, we discuss various outstanding challenges for making full use of
GNNs for NLP as well as future research directions. To the best of our
knowledge, this is the first comprehensive overview of Graph Neural Networks for
Natural Language Processing.
Comment: 127 page