20,446 research outputs found

    Contextualizing Citations for Scientific Summarization using Word Embeddings and Domain Knowledge

    Full text link
    Citation texts are sometimes not very informative or in some cases inaccurate by themselves; they need the appropriate context from the referenced paper to reflect its exact contributions. To address this problem, we propose an unsupervised model that uses distributed representation of words as well as domain knowledge to extract the appropriate context from the reference paper. Evaluation results show the effectiveness of our model by significantly outperforming the state-of-the-art. We furthermore demonstrate how an effective contextualization method results in improving citation-based summarization of the scientific articles.Comment: SIGIR 201

    Improving approximation of domain-focused, corpus-based, lexical semantic relatedness

    Get PDF
    Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts and concepts represented by these texts is an important part of many text and knowledge processing tasks of crucial importance in many domain-specific scenarios. The problem of most state-of-the-art methods for calculating domain-specific semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable for many usage scenarios. On the other hand, the domain knowledge in the fields such as Life Sciences has become more and more accessible, but mostly in its unstructured form - as texts in large document collections, which makes its use more challenging for automated processing. In this dissertation, three new corpus-based methods for approximating domain-specific textual semantic relatedness are presented and evaluated with a set of standard benchmarks focused on the field of biomedicine. Nonetheless, the proposed measures are general enough to be adapted to other domain-focused scenarios. The evaluation involves comparisons with other relevant state-of-the-art measures for calculating semantic relatedness and the results suggest that the methods presented here perform comparably or better than other approaches. Additionally, the dissertation also presents an experiment, in which one of the proposed methods is applied within an ontology matching system, DisMatch. The performance of the system was evaluated externally on a biomedically themed ‘Phenotype’ track of the Ontology Alignment Evaluation Initiative 2016 campaign. The results of the track indicate, that the use distributional semantic relatedness for ontology matching is promising, as the system presented in this thesis did stand out in detecting correct mappings that were not detected by any other systems participating in the track. The work presented in the dissertation indicates an improvement achieved w.r.t. the stat-of-the-art through the domain adapted use of the distributional principle (i.e. the presented methods are corpus-based and do not require additional resources). The ontology matching experiment showcases practical implications of the presented theoretical body of work

    Using ontology in query answering systems: Scenarios, requirements and challenges

    Get PDF
    Equipped with the ultimate query answering system, computers would finally be in a position to address all our information needs in a natural way. In this paper, we describe how Language and Computing nv (L&C), a developer of ontology-based natural language understanding systems for the healthcare domain, is working towards the ultimate Question Answering (QA) System for healthcare workers. L&C’s company strategy in this area is to design in a step-by-step fashion the essential components of such a system, each component being designed to solve some one part of the total problem and at the same time reflect well-defined needs on the prat of our customers. We compare our strategy with the research roadmap proposed by the Question Answering Committee of the National Institute of Standards and Technology (NIST), paying special attention to the role of ontology
    corecore