Biomedical Event Trigger Identification Using Bidirectional Recurrent Neural Network Based Models
Biomedical events describe complex interactions between various biomedical
entities. An event trigger is a word or phrase that typically signifies the
occurrence of an event. Event trigger identification is an important first step
in all event extraction methods. However, many current approaches either
rely on complex hand-crafted features or consider features only within a
window. In this paper, we propose a method that takes advantage of a recurrent
neural network (RNN) to extract higher-level features present across the
sentence. Using the hidden state representation of the RNN, together with word and
entity type embeddings, as features avoids relying on complex hand-crafted features
generated by various NLP toolkits. In our experiments, the method achieves a
state-of-the-art F1-score on the Multi Level Event Extraction (MLEE) corpus. We
also perform a category-wise analysis of the results and discuss the
importance of various features in the trigger identification task.
Comment: The work has been accepted in BioNLP at ACL-201
Extraction of Transcript Diversity from Scientific Literature
Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term “alternative splicing” to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/