
    Span Model for Open Information Extraction on Accurate Corpus

    Open information extraction (Open IE) is a challenging task, especially due to its brittle data basis. Most Open IE systems have to be trained on automatically built corpora and evaluated on inaccurate test sets. In this work, we first alleviate this difficulty on both the training and the test side. For the former, we propose an improved model design that more fully exploits the training dataset. For the latter, we present an accurately re-annotated benchmark test set (Re-OIE2016), built according to a series of linguistic observations and analyses. We then introduce a span model in place of the previously adopted sequence-labeling formulation for n-ary Open IE. Our newly introduced model achieves new state-of-the-art performance on both benchmark evaluation datasets. Comment: this paper has been accepted by AAAI 2020.
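    As a rough illustration of the span formulation this abstract contrasts with sequence labeling, the sketch below enumerates candidate spans and assigns each a label. The label set, scoring heuristic, and function names are invented stand-ins for the paper's learned span scorer, not its actual design.

        # Minimal span-based extraction sketch; the real model scores spans
        # with learned span embeddings, not this hand-written heuristic.
        LABELS = ["PREDICATE", "ARGUMENT", "NONE"]

        def enumerate_spans(tokens, max_len=4):
            """All candidate spans (i, j), inclusive bounds, up to max_len tokens."""
            n = len(tokens)
            return [(i, j) for i in range(n) for j in range(i, min(i + max_len, n))]

        def score_span(tokens, span):
            """Stand-in for a learned scorer: one score per label."""
            i, j = span
            scores = dict.fromkeys(LABELS, 0.0)
            # Toy heuristic: a single verb-like token looks like a predicate.
            if i == j and tokens[i] in {"founded", "acquired", "is"}:
                scores["PREDICATE"] = 1.0
            scores["NONE"] = 0.1
            return scores

        def extract(tokens):
            """Keep every span whose best-scoring label is not NONE."""
            spans = []
            for span in enumerate_spans(tokens):
                scores = score_span(tokens, span)
                best = max(scores, key=scores.get)
                if best != "NONE":
                    spans.append((span, best))
            return spans

        print(extract("Marie Curie founded the Radium Institute".split()))
        # -> [((2, 2), 'PREDICATE')]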

    Transition-based Semantic Role Labeling with Pointer Networks

    Semantic role labeling (SRL) focuses on recognizing the predicate-argument structure of a sentence and plays a critical role in many natural language processing tasks, such as machine translation and question answering. Practically all available methods fall short of full SRL, since they rely on pre-identified predicates, and most of them follow a pipeline strategy, using specific models to undertake one or several SRL subtasks. In addition, previous approaches depend strongly on syntactic information to achieve state-of-the-art performance, despite syntactic trees being equally hard to produce. These simplifications and requirements make the majority of SRL systems impractical for real-world applications. In this article, we propose the first transition-based SRL approach capable of completely processing an input sentence in a single left-to-right pass, neither leveraging syntactic information nor resorting to additional modules. Thanks to our implementation based on Pointer Networks, full SRL can be done accurately and efficiently in O(n^2) time, achieving the best performance to date on the majority of languages from the CoNLL-2009 shared task. Comment: final peer-reviewed manuscript accepted for publication in Knowledge-Based Systems.
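    A hedged sketch of where the O(n^2) bound comes from: one left-to-right pass over predicates, with each predicate scoring every token as a candidate argument. The scoring function below is a toy stand-in for the learned pointer-network attention, and all names are illustrative.

        def score(pred_idx, cand_idx):
            """Toy pointer score; the real system uses attention over encoder states."""
            if cand_idx == pred_idx:
                return float("-inf")
            return -abs(cand_idx - pred_idx)   # prefer nearby words

        def attach_arguments(tokens, predicate_idxs, k=2):
            """Single left-to-right pass: each predicate 'points' at its arguments.
            Outer loop over predicates, inner scan over all n tokens -> O(n^2)."""
            structure = {}
            for p in predicate_idxs:
                ranked = sorted(range(len(tokens)),
                                key=lambda c: score(p, c), reverse=True)
                structure[tokens[p]] = [tokens[c] for c in ranked[:k]]
            return structure

        tokens = "The company acquired a startup yesterday".split()
        print(attach_arguments(tokens, predicate_idxs=[2]))
        # -> {'acquired': ['company', 'a']} (toy output, not real SRL labels)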

    Unifying context with labeled property graph: A pipeline-based system for comprehensive text representation in NLP

    Extracting valuable insights from vast amounts of unstructured digital text presents significant challenges across diverse domains. This research addresses the challenge by proposing a novel pipeline-based system that generates domain-agnostic and task-agnostic text representations. The approach leverages labeled property graphs (LPG) to encode contextual information, facilitating the integration of diverse linguistic elements into a unified representation. By addressing the crucial aspects of comprehensive context modeling and fine-grained semantics, the system enables efficient graph-based querying and manipulation. Its effectiveness is demonstrated through the implementation of NLP components that operate on LPG-based representations. The approach additionally introduces specialized patterns and algorithms to enhance specific NLP tasks, including nominal mention detection, named entity disambiguation, event enrichment, event participant detection, and temporal link detection. An evaluation on the MEANTIME corpus of manually annotated documents yields encouraging results and valuable insights into the system's strengths. The pipeline-based framework serves as a solid foundation for future research aiming to refine and optimize LPG-based graph structures, generating comprehensive and semantically rich text representations that address the challenges of efficient information extraction and analysis in NLP.
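    For concreteness, here is a minimal sketch of what an LPG-encoded sentence fragment could look like. The node labels, edge types, and property names are illustrative assumptions rather than the paper's actual schema, and networkx stands in for a dedicated graph store such as Neo4j.

        import networkx as nx

        g = nx.MultiDiGraph()

        # Nodes carry a label plus arbitrary key-value properties.
        g.add_node("t1", label="Token", text="Curie", pos="PROPN", index=1)
        g.add_node("e1", label="Entity", text="Curie", type="PERSON")
        g.add_node("ev1", label="Event", trigger="founded")

        # Edges are labeled relationships that may carry properties too.
        g.add_edge("e1", "t1", key="MENTIONED_AT", confidence=1.0)
        g.add_edge("ev1", "e1", key="HAS_PARTICIPANT", role="agent")

        # Graph-based querying: list each event's participants and roles.
        for u, v, rel, attrs in g.edges(keys=True, data=True):
            if rel == "HAS_PARTICIPANT":
                print(g.nodes[u]["trigger"], "->", g.nodes[v]["text"], attrs["role"])
        # -> founded -> Curie agent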

    Graph Neural Networks for Natural Language Processing: A Survey

    Deep learning has become the dominant approach for coping with various tasks in Natural Language Processing (NLP). Although text inputs are typically represented as sequences of tokens, there is a rich variety of NLP problems that are best expressed with a graph structure. As a result, there is a surge of interest in developing new deep learning techniques on graphs for a large number of NLP tasks. In this survey, we present a comprehensive overview of Graph Neural Networks (GNNs) for Natural Language Processing. We propose a new taxonomy of GNNs for NLP, which systematically organizes existing research on GNNs for NLP along three axes: graph construction, graph representation learning, and graph-based encoder-decoder models. We further introduce a large number of NLP applications that exploit the power of GNNs and summarize the corresponding benchmark datasets, evaluation metrics, and open-source code. Finally, we discuss various outstanding challenges for making full use of GNNs for NLP, as well as future research directions. To the best of our knowledge, this is the first comprehensive overview of Graph Neural Networks for Natural Language Processing. Comment: 127 pages.
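    As a toy illustration of the survey's first two axes, the sketch below builds a graph from hand-written dependency arcs (graph construction) and runs one round of mean-aggregation message passing (graph representation learning). Real systems would use a parser and learned GNN layers such as GCN or GAT; this only shows the data flow.

        import numpy as np

        tokens = ["She", "reads", "books"]
        # Hand-written, bidirectional dependency arcs (graph construction).
        edges = [(0, 1), (1, 0), (2, 1), (1, 2)]

        rng = np.random.default_rng(0)
        h = rng.normal(size=(len(tokens), 8))   # initial token embeddings

        def message_pass(h, edges):
            """One unparameterized GNN step: h_i' = mean of h_i and its neighbors."""
            out = h.copy()
            for i in range(len(h)):
                neigh = [src for src, dst in edges if dst == i]
                if neigh:
                    out[i] = (h[i] + h[neigh].sum(axis=0)) / (1 + len(neigh))
            return out

        h = message_pass(h, edges)
        print(h.shape)   # (3, 8): one contextualized vector per token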