
    Extracting Biomolecular Interactions Using Semantic Parsing of Biomedical Text

    We advance the state of the art in biomolecular interaction extraction with three contributions: (i) We show that deep, Abstract Meaning Representations (AMR) significantly improve the accuracy of a biomolecular interaction extraction system when compared to a baseline that relies solely on surface- and syntax-based features; (ii) In contrast with previous approaches that infer relations on a sentence-by-sentence basis, we expand our framework to enable consistent predictions over sets of sentences (documents); (iii) We further modify and expand a graph kernel learning framework to enable concurrent exploitation of automatically induced AMR (semantic) and dependency structure (syntactic) representations. Our experiments show that our approach yields interaction extraction systems that are more robust in environments where there is a significant mismatch between training and test conditions. Comment: Appearing in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16)

    Using Neural Networks for Relation Extraction from Biomedical Literature

    Using different sources of information to support the automated extraction of relations between biomedical concepts contributes to the development of our understanding of biological systems. The primary comprehensive source of these relations is biomedical literature. Several relation extraction approaches have been proposed to identify relations between concepts in biomedical literature, notably those using neural network algorithms. The use of multichannel architectures composed of multiple data representations, as in deep neural networks, is leading to state-of-the-art results. The right combination of data representations can eventually lead us to even higher evaluation scores in relation extraction tasks. Thus, biomedical ontologies play a fundamental role by providing semantic and ancestry information about an entity. The incorporation of biomedical ontologies has already been shown to enhance previous state-of-the-art results. Comment: Artificial Neural Networks book (Springer) - Chapter 1

    Classification of protein interaction sentences via gaussian processes

    The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier that has rarely been applied to text: Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework, makes GPs an appealing alternative worthy of further adoption.

    Event based text mining for integrated network construction

    The scientific literature is a rich and challenging data source for research in systems biology, providing numerous interactions between biological entities. Text mining techniques have been increasingly useful for extracting such information from the literature automatically, but up to now the main focus of text mining in the systems biology field has been restricted mostly to the discovery of protein-protein interactions. Here, we take this approach one step further, and use machine learning techniques combined with text mining to extract a much wider variety of interactions between biological entities. Each particular interaction type gives rise to a separate network, represented as a graph, all of which can be subsequently combined to yield a so-called integrated network representation. This provides a much broader view of the biological system as a whole, which can then be used in further investigations to analyse specific properties of the network.

    Semantic models as metrics for kernel-based interaction identification

    Automatic detection of protein-protein interactions (PPIs) in biomedical publications is vital for efficient biological research. It also presents a host of new challenges for pattern recognition methodologies, some of which will be addressed by the research in this thesis. Proteins are the principal method of communication within a cell; hence, this area of research is strongly motivated by the needs of biologists investigating sub-cellular functions of organisms, diseases, and treatments. These researchers rely on the collaborative efforts of the entire field and communicate through experimental results published in reviewed biomedical journals. The substantial number of interactions detected by automated large-scale PPI experiments, combined with the ease of access to the digitised publications, has increased the number of results made available each day. The ultimate aim of this research is to provide tools and mechanisms to aid biologists and database curators in locating relevant information. As part of this objective, this thesis proposes, studies, and develops new methodologies that go some way to meeting this grand challenge. Pattern recognition methodologies are one approach that can be used to locate PPI sentences; however, the most accurate pattern recognition methods require a set of labelled examples to train on. For this particular task, the collection and labelling of training data is highly expensive. On the other hand, the digital publications provide a plentiful source of unlabelled data. The unlabelled data is used, along with word co-occurrence models, to improve classification using Gaussian processes, a probabilistic alternative to the state-of-the-art support vector machines. This thesis presents and systematically assesses the novel methods of using the knowledge implicitly encoded in biomedical texts and shows an improvement on the current approaches to PPI sentence detection.