Extraction of chemical-induced diseases using prior knowledge and textual information
We describe our approach to the chemical-disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: automatic disease named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER subtask, we used our concept recognition tool Peregrine, in combination with several optimization steps. For the CID subtask, our system, which we named RELigator, was trained on a rich feature set, comprising features derived from a graph database containing prior knowledge about chemicals and diseases, and linguistic and statistical features derived from the abstracts in the CDR training corpus. We describe the systems that were developed and present evaluation results for both subtasks on the CDR test set. For DNER, our Peregrine system reached an F-score of 0.757. For CID, the system achieved an F-score of 0.526, which ranked second among 18 participating teams. Several post-challenge modifications of the systems resulted in substantially improved F-scores (0.828 for DNER and 0.602 for CID).
Combining Context and Knowledge Representations for Chemical-Disease Relation Extraction
Automatically extracting the relationships between chemicals and diseases is highly important to various areas of biomedical research and health care. Biomedical experts have built many large-scale knowledge bases (KBs) to advance the development of biomedical research. KBs contain huge amounts of structured information about entities and relationships, and therefore play a pivotal role in chemical-disease relation (CDR) extraction. However, previous research has paid little attention to the prior knowledge contained in KBs. This paper proposes a neural network-based attention model (NAM) for CDR extraction, which makes full use of context information in documents and prior knowledge in KBs. For a pair of entities in a document, an attention mechanism is employed to select important context words with respect to the relation representations learned from KBs. Experiments on the BioCreative V CDR dataset show that combining context and knowledge representations through the attention mechanism can significantly improve CDR extraction performance while achieving results comparable to state-of-the-art systems.

Comment: Published in IEEE/ACM Transactions on Computational Biology and Bioinformatics, 11 pages, 5 figures
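The core idea of the abstract above, attending over context words according to their relevance to a relation representation learned from a KB, can be illustrated with a minimal numpy sketch. The function name, toy dimensions, and the simple dot-product scoring are illustrative assumptions, not details from the paper:

```python
import numpy as np

def kb_guided_attention(context, relation_vec):
    """Weight context word vectors by their relevance to a KB relation embedding.

    context:      (n_words, d) matrix of context word representations
    relation_vec: (d,) relation representation learned from the knowledge base
    Returns a single (d,) attention-pooled context representation.
    """
    scores = context @ relation_vec          # relevance of each word to the relation
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ context                 # weighted sum over context words
```

In a full model, this pooled vector would be combined with the entity-pair representation before classification; here the scoring is a plain dot product, whereas the paper's attention may use a learned compatibility function.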
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Entities, as the essential elements in relation extraction tasks, exhibit a certain structure. In this work, we formulate such structure as distinctive dependencies between mention pairs. We then propose SSAN, which incorporates these structural dependencies within the standard self-attention mechanism and throughout the overall encoding stage. Specifically, we design two alternative transformation modules inside each self-attention building block to produce attentive biases that adaptively regularize its attention flow. Our experiments demonstrate the usefulness of the proposed entity structure and the effectiveness of SSAN. It significantly outperforms competitive baselines, achieving new state-of-the-art results on three popular document-level relation extraction datasets. We further provide ablation and visualization studies to show how the entity structure guides the model toward better relation extraction. Our code is publicly available.

Comment: Accepted to AAAI 202
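The mechanism described above, injecting structure-dependent attentive biases into self-attention, can be sketched in a few lines of numpy. This is a simplified single-head illustration under stated assumptions: the bias matrix is given directly rather than produced by the paper's learned transformation modules, and queries/keys are the raw token vectors:

```python
import numpy as np

def structured_self_attention(x, bias):
    """Scaled dot-product self-attention with additive structural biases.

    x:    (n, d) token representations
    bias: (n, n) attentive biases encoding mention dependencies
          (e.g., larger values where two tokens mention the same entity)
    Returns (n, d) updated token representations.
    """
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d) + bias          # structure regularizes attention flow
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x
```

With a zero bias this reduces to ordinary self-attention; a large positive bias toward a column forces every token to attend to that position, which is the sense in which the biases steer attention flow.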