Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text
We advance the state of the art in biomolecular interaction extraction with
three contributions: (i) We show that deep, Abstract Meaning Representations
(AMR) significantly improve the accuracy of a biomolecular interaction
extraction system when compared to a baseline that relies solely on surface-
and syntax-based features; (ii) In contrast with previous approaches that infer
relations on a sentence-by-sentence basis, we expand our framework to enable
consistent predictions over sets of sentences (documents); (iii) We further
modify and expand a graph kernel learning framework to enable concurrent
exploitation of automatically induced AMR (semantic) and dependency structure
(syntactic) representations. Our experiments show that our approach yields
interaction extraction systems that are more robust in environments where there
is a significant mismatch between training and test conditions. Comment:
Appearing in Proceedings of the Thirtieth AAAI Conference on Artificial
Intelligence (AAAI-16)
Finding the Core-Genes of Chloroplasts
Due to the recent evolution of sequencing techniques, the number of available
genomes is rising steadily, making large-scale genomic comparisons between
sets of closely related species possible. An interesting question is which
functional genes are common to a collection of species or, conversely, what is
specific to a given species when compared to others belonging to the same
genus, family, etc. Investigating this problem means finding both the core and
pan genomes of a collection of species, i.e., the genes common to all the
species vs. the set of all genes in all species under consideration. However,
obtaining trustworthy core and pan genomes is not an easy task: it involves a
large amount of computation and requires a rigorous methodology. Surprisingly,
as far as we know, the methodology for finding core and pan genomes has not
been deeply investigated. This research work tries to fill this gap by
focusing only on chloroplastic genomes, whose reasonable sizes allow a deep
study. To achieve this goal, a collection of 99 chloroplasts is considered in
this article. Two methodologies have been investigated, based respectively on
sequence similarities and on gene names taken from annotation tools. The
results obtained will finally be evaluated in terms of biological relevance.
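The core/pan distinction in the abstract above reduces to a set intersection versus a set union. The following is a minimal sketch of that definition only, assuming each genome is already given as a set of gene names; the function name `core_and_pan`, the species keys, and the toy gene lists are illustrative assumptions, not the authors' pipeline or data.

```python
def core_and_pan(genomes):
    """Return (core, pan) for a dict mapping species name -> iterable of gene names.

    core: genes present in every species (intersection).
    pan:  genes present in at least one species (union).
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical toy input: gene names per species, as an annotation tool might report them.
genomes = {
    "species_a": {"rbcL", "psbA", "atpB", "matK"},
    "species_b": {"rbcL", "psbA", "atpB"},
    "species_c": {"rbcL", "psbA", "ndhF"},
}

core, pan = core_and_pan(genomes)
print(sorted(core))
print(sorted(pan))
```

Matching genes by name, as here, corresponds only to the annotation-based approach mentioned in the abstract; a sequence-similarity approach would first need to cluster genes before the same set operations apply.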
Bayesian sequence learning for predicting protein cleavage points
A challenging problem in data mining is the application of efficient techniques to automatically annotate the vast databases of biological sequence data. This paper describes one such application: predicting the position of signal peptide cleavage points along protein sequences. It is shown that the method, based on Bayesian statistics, is comparable in accuracy to the existing state-of-the-art neural network techniques while providing explanatory information for its predictions.
A Labeled Graph Kernel for Relationship Extraction
In this paper, we propose an approach for Relationship Extraction (RE) based
on labeled graph kernels. The kernel we propose is a particularization of a
random walk kernel that exploits two properties previously studied in the RE
literature: (i) the words between the candidate entities or connecting them in
a syntactic representation are particularly likely to carry information
regarding the relationship; and (ii) combining information from distinct
sources in a kernel may help the RE system make better decisions. We performed
experiments on a dataset of protein-protein interactions and the results show
that our approach obtains effectiveness values that are comparable with the
state-of-the-art kernel methods. Moreover, our approach is able to outperform
the state-of-the-art kernels when combined with other kernel methods.
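For readers unfamiliar with random walk kernels on labeled graphs, the general idea can be sketched as counting walks that two graphs share, where a walk is shared only if the node labels match at every step. The code below is a generic, truncated random walk kernel under stated assumptions, not the specific kernel proposed in the paper: the graph encoding (a label dict plus an adjacency dict), the function name `walk_kernel`, and the parameters `max_len` and `decay` are all hypothetical choices made for illustration.

```python
def walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Truncated random walk kernel between two node-labeled directed graphs.

    Each graph is (labels, succ): labels maps node -> label,
    succ maps node -> list of successor nodes.
    A common walk of length k contributes decay**k to the kernel value.
    """
    labels1, succ1 = g1
    labels2, succ2 = g2
    # Nodes of the direct product graph: node pairs whose labels agree.
    pairs = {(u, v) for u in labels1 for v in labels2 if labels1[u] == labels2[v]}
    # counts[p] = weighted number of common walks of the current length ending at p.
    counts = {p: 1.0 for p in pairs}
    total = sum(counts.values())
    for _ in range(max_len):
        nxt = {p: 0.0 for p in pairs}
        for (u, v), c in counts.items():
            if c == 0.0:
                continue
            for a in succ1.get(u, []):
                for b in succ2.get(v, []):
                    if (a, b) in pairs:  # extend the walk only through matching labels
                        nxt[(a, b)] += decay * c
        counts = nxt
        total += sum(counts.values())
    return total


# Hypothetical toy graphs: entity -> verb -> entity paths from two sentences.
g_binds_1 = ({0: "PROT", 1: "binds", 2: "PROT"}, {0: [1], 1: [2]})
g_binds_2 = ({0: "PROT", 1: "binds", 2: "PROT"}, {0: [1], 1: [2]})
g_inhibits = ({0: "PROT", 1: "inhibits", 2: "PROT"}, {0: [1], 1: [2]})

k_same = walk_kernel(g_binds_1, g_binds_2, max_len=2)
k_diff = walk_kernel(g_binds_1, g_inhibits, max_len=2)
print(k_same, k_diff)
```

In this toy setting the two "binds" graphs score higher than the "binds"/"inhibits" pair because the differing connecting word blocks every shared walk between the entities, which mirrors property (i) in the abstract: the words connecting the candidate entities carry much of the relationship information.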