27 research outputs found

    A Hierarchical Framework for Relation Extraction with Reinforcement Learning

    Full text link
    Most existing methods determine relation types only after all the entities have been recognized, thus the interaction between relation types and entity mentions is not fully modeled. This paper presents a novel paradigm to deal with relation extraction by regarding the related entities as the arguments of a relation. We apply a hierarchical reinforcement learning (HRL) framework in this paradigm to enhance the interaction between entity mentions and relation types. The whole extraction process is decomposed into a hierarchy of two-level RL policies for relation detection and entity extraction respectively, so that it is more feasible and natural to deal with overlapping relations. Our model was evaluated on public datasets collected via distant supervision, and results show that it gains better performance than existing methods and is more powerful for extracting overlapping relations.Comment: To appear in AAAI 1

    Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

    Full text link
    Most existing event extraction (EE) methods merely extract event arguments within the sentence scope. However, such sentence-level EE methods struggle to handle soaring amounts of documents from emerging applications, such as finance, legislation, health, etc., where event arguments always scatter across different sentences, and even multiple such event mentions frequently co-exist in the same document. To address these challenges, we propose a novel end-to-end model, Doc2EDAG, which can generate an entity-based directed acyclic graph to fulfill the document-level EE (DEE) effectively. Moreover, we reformalize a DEE task with the no-trigger-words design to ease the document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we build a large-scale real-world dataset consisting of Chinese financial announcements with the challenges mentioned above. Extensive experiments with comprehensive analyses illustrate the superiority of Doc2EDAG over state-of-the-art methods. Data and codes can be found at https://github.com/dolphin-zs/Doc2EDAG.Comment: Accepted by EMNLP 201

    A Boundary Determined Neural Model for Relation Extraction

    Get PDF
    Existing models extract entity relations only after two entity spans have been precisely extracted that influenced the performance of relation extraction. Compared with recognizing entity spans, because the boundary has a small granularity and a less ambiguity, it can be detected precisely and incorporated to learn better representation. Motivated by the strengths of boundary, we propose a boundary determined neural (BDN) model, which leverages boundaries as task-related cues to predict the relation labels. Our model can predict high-quality relation instance via the pairs of boundaries, which can relieve error propagation problem. Moreover, our model fuses with boundary-relevant information encoding to represent distributed representation to improve the ability of capturing semantic and dependency information, which can increase the discriminability of neural network. Experiments show that our model achieves state-of-the-art performances on ACE05 corpus

    Contrastive Triple Extraction with Generative Transformer

    Full text link
    Triple extraction is an essential task in information extraction for natural language processing and knowledge graph construction. In this paper, we revisit the end-to-end triple extraction task for sequence generation. Since generative triple extraction may struggle to capture long-term dependencies and generate unfaithful triples, we introduce a novel model, contrastive triple extraction with a generative transformer. Specifically, we introduce a single shared transformer module for encoder-decoder-based generation. To generate faithful results, we propose a novel triplet contrastive training object. Moreover, we introduce two mechanisms to further improve model performance (i.e., batch-wise dynamic attention-masking and triple-wise calibration). Experimental results on three datasets (i.e., NYT, WebNLG, and MIE) show that our approach achieves better performance than that of baselines.Comment: Accepted by AAAI 202
    corecore