Coreference based event-argument relation extraction on biomedical text
This paper presents a new approach that exploits coreference information to extract event-argument (E-A) relations from biomedical documents. The approach has two advantages: (1) it can extract a large number of valuable E-A relations based on the concept of salience in discourse; (2) it enables us to identify E-A relations across sentence boundaries (cross-links) using the transitivity of coreference relations. We propose two coreference-based models: a pipeline based on Support Vector Machine (SVM) classifiers, and a joint Markov Logic Network (MLN). We show the effectiveness of these models on a biomedical event corpus. Both models outperform systems that do not use coreference information. When the two proposed models are compared with each other, the joint MLN outperforms the pipeline SVM when gold coreference information is used.
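As a rough illustration of how transitivity of coreference can yield cross-sentence E-A links, the sketch below groups mentions with a union-find structure and then copies each event's argument to every coreferent mention. It is not the paper's SVM pipeline or MLN model; the function names, the `sN:` mention identifiers, and the toy data are all hypothetical.

```python
from collections import defaultdict


def coreference_clusters(links):
    """Group mentions into clusters via the transitive closure of pairwise links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for mention in list(parent):
        clusters[find(mention)].add(mention)
    return list(clusters.values())


def propagate_arguments(event_args, links):
    """Extend each (event, argument) pair to all mentions coreferent with the argument."""
    cluster_of = {}
    for cluster in coreference_clusters(links):
        for mention in cluster:
            cluster_of[mention] = cluster

    expanded = set()
    for event, arg in event_args:
        for mention in cluster_of.get(arg, {arg}):
            expanded.add((event, mention))
    return expanded


# Hypothetical toy input: mentions are prefixed with their sentence number.
links = [("s3:it", "s2:the protein"), ("s2:the protein", "s1:p53")]
event_args = {("s3:binding", "s3:it")}
cross_links = propagate_arguments(event_args, links)
```

Here the event in sentence 3 is linked to the mention in sentence 1 only because the two pairwise coreference links chain together, which is the cross-link case the abstract describes.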
How to Train Your Agent to Read and Write
Reading and writing research papers is one of the most privileged abilities
that a qualified researcher should master. However, it is difficult for new
researchers (e.g., students) to fully grasp this ability. It would be
fascinating if we could train an intelligent agent to help people read and
summarize papers, and perhaps even discover and exploit the potential knowledge
clues to write novel papers. Although there have been existing works focusing
on summarizing (i.e., reading) the knowledge in a given text or
generating (i.e., writing) a text based on the given knowledge, the
ability of simultaneously reading and writing is still under development.
Typically, this requires an agent to fully understand the knowledge from the
given text materials and generate correct and fluent novel paragraphs, which is
very challenging in practice. In this paper, we propose a Deep ReAder-Writer
(DRAW) network, which consists of a Reader that can extract knowledge
graphs (KGs) from input paragraphs and discover potential knowledge, a
graph-to-text Writer that generates a novel paragraph, and a
Reviewer that reviews the generated paragraph from three different
aspects. Extensive experiments show that our DRAW network outperforms
considered baselines and several state-of-the-art methods on AGENDA and
M-AGENDA datasets. Our code and supplementary materials are released at
https://github.com/menggehe/DRAW
New Resources and Perspectives for Biomedical Event Extraction
Event extraction is a major focus of recent work in biomedical information extraction. Despite substantial advances, many challenges still remain for reliable automatic extraction of events from text. We introduce a new biomedical event extraction resource consisting of analyses automatically created by systems participating in the recent BioNLP Shared Task (ST) 2011. In providing for the first time the outputs of a broad set of state-of-the-art event extraction systems, this resource opens many new opportunities for studying aspects of event extraction, from the identification of common errors to the study of effective approaches to combining the strengths of systems. We demonstrate these opportunities through a multi-system analysis on three BioNLP ST 2011 main tasks, focusing on events that none of the systems can successfully extract. We further argue for new perspectives to the performance evaluation of domain event extraction systems, considering a document-level, “off-the-page” representation and evaluation to complement the mention-level evaluations pursued in most recent work.
Neural Relation Extraction Within and Across Sentence Boundaries
Past work in relation extraction mostly focuses on binary relations between
entity pairs within a single sentence. Recently, the NLP community has gained
interest in relation extraction between entity pairs spanning multiple sentences. In
this paper, we propose a novel architecture for this task: inter-sentential
dependency-based neural networks (iDepNN). iDepNN models the shortest and
augmented dependency paths via recurrent and recursive neural networks to
extract relationships within (intra-) and across (inter-) sentence boundaries.
Compared to SVM and neural network baselines, iDepNN is more robust to false
positives in relationships spanning sentences.
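To make the idea of a dependency path that crosses a sentence boundary concrete, here is a minimal sketch that joins per-sentence dependency trees and runs a breadth-first search between two entity tokens. The abstract does not say how the trees are joined; linking the roots of adjacent sentences is an assumption made here for illustration. The function names and the toy parse are hypothetical, and the neural models themselves are not shown.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search over an undirected view of the dependency edges."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # the two tokens are not connected


def inter_sentential_path(sentences, source, target):
    """Join per-sentence trees and return the path between two tokens.

    `sentences` is a list of (root, edges) pairs. Roots of adjacent sentences
    are linked, which is an assumption of this sketch.
    """
    edges = []
    for root, sentence_edges in sentences:
        edges.extend(sentence_edges)
    for (root_a, _), (root_b, _) in zip(sentences, sentences[1:]):
        edges.append((root_a, root_b))
    return shortest_path(edges, source, target)


# Hypothetical toy parse of two adjacent sentences, as (head, dependent) pairs.
sentences = [
    ("s1:founded", [("s1:founded", "s1:Allen"), ("s1:founded", "s1:company")]),
    ("s2:hired", [("s2:hired", "s2:He"), ("s2:hired", "s2:Smith")]),
]
path = inter_sentential_path(sentences, "s1:Allen", "s2:Smith")
```

The resulting token sequence is the kind of input a path-based model could consume; without the root-to-root link, the search would find no path between the two sentences.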
We evaluate our models on four datasets from the newswire (MUC6) and medical
(BioNLP shared task) domains, where they achieve state-of-the-art performance and show
a better balance of precision and recall for inter-sentential relationships. We
perform better than 11 teams participating in the BioNLP shared task 2016 and
achieve a gain of 5.2% (0.587 vs 0.558) in F1 over the winning team. We also
release the cross-sentence annotations for MUC6. Comment: AAAI201
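The 5.2% figure appears to be a relative gain rather than an absolute difference in F1. The short check below recomputes both from the two scores quoted in the abstract; the variable names are illustrative only.

```python
# F1 scores quoted in the abstract.
f1_idepnn = 0.587
f1_winning_team = 0.558

# Absolute difference, in F1 points.
absolute_gain_points = round((f1_idepnn - f1_winning_team) * 100, 1)

# Relative improvement over the winning team's score, in percent.
relative_gain_percent = round((f1_idepnn - f1_winning_team) / f1_winning_team * 100, 1)

print(absolute_gain_points, relative_gain_percent)  # prints: 2.9 5.2
```

Under this reading, the absolute improvement is 2.9 F1 points, which corresponds to the quoted 5.2% when expressed relative to 0.558.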
Boosting automatic event extraction from the literature using domain adaptation and coreference resolution
Motivation: In recent years, several biomedical event extraction (EE) systems have been developed. However, the nature of the annotated training corpora, as well as the training process itself, can limit the performance levels of the trained EE systems. In particular, most event-annotated corpora do not deal adequately with coreference. This impairs the trained systems' ability to recognize biomedical entities, thus affecting their performance in extracting events accurately. Additionally, the fact that most EE systems are trained on a single annotated corpus further restricts their coverage