6 research outputs found

    Study of Probabilistic Parsing in Syntactic Analysis

    Get PDF
    Statistical parser, like statistical tagging requires a corpus of hand –parsed text. There are such corpora available, the most notably being the Penn-tree bank. The Penn-tree bank is large corpus of articles from the Wall Street Journal that have been tagged with Penn tree-Bank tags and then parsed accordingly to a simple set of phrase structure rules conforming to Chomsky Government and binding syntax

    Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information

    Full text link
    Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. For these tasks, understanding logical and semantic relationship between two sentences is required but it is yet challenging. Although attention mechanism is useful to capture the semantic relationship and to properly align the elements of two sentences, previous methods of attention mechanism simply use a summation operation which does not retain original features enough. Inspired by DenseNet, a densely connected convolutional network, we propose a densely-connected co-attentive recurrent neural network, each layer of which uses concatenated information of attentive features as well as hidden features of all the preceding recurrent layers. It enables preserving the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. To alleviate the problem of an ever-increasing size of feature vectors due to dense concatenation operations, we also propose to use an autoencoder after dense concatenation. We evaluate our proposed architecture on highly competitive benchmark datasets related to sentence matching. Experimental results show that our architecture, which retains recurrent and attentive features, achieves state-of-the-art performances for most of the tasks.Comment: Accepted at AAAI 201

    Evaluation of linguistic features useful in extraction of interactions from PubMed; Application to annotating known, high-throughput and predicted interactions in I2D

    Get PDF
    Motivation: Identification and characterization of protein–protein interactions (PPIs) is one of the key aims in biological research. While previous research in text mining has made substantial progress in automatic PPI detection from literature, the need to improve the precision and recall of the process remains. More accurate PPI detection will also improve the ability to extract experimental data related to PPIs and provide multiple evidence for each interaction

    Investigating a Generic Paraphrase-based Approach for Relation Extraction

    Get PDF
    Unsupervised paraphrase acquisition has been an active research field in recent years, but its effective coverage and performance have rarely been evaluated. We propose a generic paraphrase-based approach for Relation Extraction (RE), aiming at a dual goal: obtaining an applicative evaluation scheme for paraphrase acquisition and obtaining a generic and largely unsupervised configuration for RE. We analyze the potential of our approach and evaluate an implemented prototype of it using an RE dataset. Our findings reveal a high potential for unsupervised paraphrase acquisition. We also identify the need for novel robust models for matching paraphrases in texts, which should address syntactic complexity and variability
    corecore