Deep learning for extracting protein-protein interactions from biomedical literature
State-of-the-art methods for protein-protein interaction (PPI) extraction are
primarily feature-based or kernel-based, leveraging lexical and syntactic
information. How to incorporate such knowledge into recent deep learning
methods remains an open question. In this paper, we propose a multichannel
dependency-based convolutional neural network model (McDepCNN). It applies one
channel to the embedding vector of each word in the sentence, and another
channel to the embedding vector of the head of the corresponding word.
Therefore, the model can use richer information obtained from different
channels. Experiments on two public benchmarking datasets, AIMed and BioInfer,
demonstrate that McDepCNN compares favorably to the state-of-the-art
rich-feature and single-kernel based methods. In addition, McDepCNN achieves
24.4% relative improvement in F1-score over the state-of-the-art methods on
cross-corpus evaluation and 12% improvement in F1-score over kernel-based
methods on "difficult" instances. These results suggest that McDepCNN
generalizes more easily over different corpora, and is capable of capturing
long-distance features in the sentences.
Comment: Accepted for publication in Proceedings of the 2017 Workshop on
Biomedical Natural Language Processing, 10 pages, 2 figures, 6 tables
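The two-channel input described in the abstract above can be illustrated with a minimal sketch. It only shows how a word channel and a head-word channel might be assembled for one sentence; the convolution and classification layers are omitted. The function name `build_channels`, the toy embeddings, the `-1` root marker, and the zero vector used for the root's head are all assumptions made for illustration and are not taken from the paper.

```python
def build_channels(tokens, heads, embeddings):
    """Build two parallel input channels for one sentence.

    tokens     : list of word strings
    heads      : heads[i] is the index of the dependency head of tokens[i],
                 or -1 if tokens[i] is the root (assumed convention)
    embeddings : dict mapping word -> embedding vector (list of floats)
    """
    dim = len(next(iter(embeddings.values())))
    zero = [0.0] * dim  # fallback for unknown words and for the root's head

    # Channel 1: the embedding of each word itself.
    word_channel = [embeddings.get(tok, zero) for tok in tokens]

    # Channel 2: the embedding of each word's syntactic head.
    head_channel = [
        embeddings.get(tokens[h], zero) if h >= 0 else zero
        for h in heads
    ]
    return word_channel, head_channel


# Toy sentence "PROT1 binds PROT2", where "binds" is the root.
tokens = ["PROT1", "binds", "PROT2"]
heads = [1, -1, 1]
embeddings = {
    "PROT1": [0.1, 0.2],
    "binds": [0.9, 0.4],
    "PROT2": [0.3, 0.1],
}

word_channel, head_channel = build_channels(tokens, heads, embeddings)
```

In this toy sentence both protein mentions receive the embedding of "binds" in the head channel, so the verb that links them is visible at each protein position regardless of how far apart the mentions are in the sentence.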
Using Neural Networks for Relation Extraction from Biomedical Literature
Using different sources of information to support automated extraction of
relations between biomedical concepts contributes to the development of our
understanding of biological systems. The primary comprehensive source of these
relations is biomedical literature. Several relation extraction approaches have
been proposed to identify relations between concepts in biomedical literature,
notably those using neural network algorithms. The use of multichannel architectures
composed of multiple data representations, as in deep neural networks, is
leading to state-of-the-art results. The right combination of data
representations can eventually lead us to even higher evaluation scores in
relation extraction tasks. Thus, biomedical ontologies play a fundamental role
by providing semantic and ancestry information about an entity. The
incorporation of biomedical ontologies has already been shown to improve on
previous state-of-the-art results.
Comment: Artificial Neural Networks book (Springer) - Chapter 1
Semi-supervised prediction of protein interaction sentences exploiting semantically encoded metrics
Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification is one of the key goals of text mining (TM). However, the labelled PPI corpora required to train classifiers are generally small. To overcome this sparsity in the training data, we propose a novel method of integrating corpora that do not contain relevance judgements. Our approach uses a semantic language model to gather word similarity from a large unlabelled corpus. This additional information is integrated into the sentence classification process using kernel transformations and has a re-weighting effect on the training features that leads to an 8% improvement in F-score over the baseline results. Furthermore, we discover that some words which are generally considered indicative of interactions are actually neutralised by this process.
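One way to picture how word similarity can re-weight features through a kernel transformation is the following minimal sketch. The abstract does not specify the transformation, so the form k(x, y) = x^T S y, the vocabulary, and the similarity values in `S` are all assumptions chosen for illustration; in practice `S` would be estimated from the unlabelled corpus and should be positive semidefinite for the result to be a valid kernel.

```python
def bow(tokens, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    return [tokens.count(word) for word in vocab]


def semantic_kernel(x, y, S):
    """Compute x^T S y, where S[i][j] is the similarity of words i and j."""
    return sum(
        x[i] * S[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )


vocab = ["binds", "interacts", "expressed"]

# Identity matrix: reduces the kernel to a plain linear (dot-product) kernel.
I = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

# Hypothetical similarity matrix: "binds" and "interacts" are treated as close.
S = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

a = bow(["PROT1", "binds", "PROT2"], vocab)
b = bow(["PROT1", "interacts", "PROT2"], vocab)

linear_value = semantic_kernel(a, b, I)    # no shared vocabulary words
semantic_value = semantic_kernel(a, b, S)  # similarity links the two verbs
```

Under the plain linear kernel the two sentences share no vocabulary words and look unrelated, while the similarity matrix gives them a non-zero kernel value. A classifier trained on one sentence can therefore carry some evidence over to the other, which is the kind of re-weighting effect the abstract refers to.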