31 research outputs found

    Computing Semantic Similarity of Concepts in Knowledge Graphs

    This paper presents a method for measuring the semantic similarity between concepts in Knowledge Graphs (KGs) such as WordNet and DBpedia. Previous work on semantic similarity methods has focused either on the structure of the semantic network between concepts (e.g., path length and depth) or only on the Information Content (IC) of concepts. We propose a semantic similarity method, namely wpath, that combines these two approaches, using IC to weight the shortest path length between concepts. Conventional corpus-based IC is computed from the distributions of concepts over a textual corpus, which requires preparing a domain corpus containing annotated concepts and has a high computational cost. As instances are already extracted from textual corpora and annotated with concepts in KGs, graph-based IC is proposed to compute IC based on the distributions of concepts over instances. Through experiments performed on well-known word similarity datasets, we show that the wpath semantic similarity method produces a statistically significant improvement over other semantic similarity methods. Moreover, in a real category classification evaluation, the wpath method shows the best performance in terms of accuracy and F-score.
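    The core of the method can be sketched in a few lines. The snippet below is a hedged illustration (not the authors' reference implementation) using NLTK's WordNet interface: the shortest path length between two synsets is weighted by the IC of their least common subsumer (LCS), wpath(c1, c2) = 1 / (1 + len(c1, c2) * k^IC(LCS(c1, c2))) with 0 < k <= 1. Corpus-based IC from the Brown corpus stands in for the graph-based IC proposed in the paper, and k = 0.8 is an illustrative choice.

```python
# Hedged sketch of the wpath idea with NLTK (requires the 'wordnet' and
# 'wordnet_ic' data packages, e.g. nltk.download('wordnet_ic')).
from nltk.corpus import wordnet as wn, wordnet_ic
from nltk.corpus.reader.wordnet import information_content

brown_ic = wordnet_ic.ic('ic-brown.dat')  # corpus-based IC; the paper also proposes graph-based IC

def wpath_similarity(s1, s2, k=0.8):
    """wpath(c1, c2) = 1 / (1 + len(c1, c2) * k**IC(LCS(c1, c2)))."""
    path_len = s1.shortest_path_distance(s2)
    if path_len is None:                       # no path between the synsets
        return 0.0
    lcs = s1.lowest_common_hypernyms(s2)
    lcs_ic = information_content(lcs[0], brown_ic) if lcs else 0.0
    return 1.0 / (1.0 + path_len * k ** lcs_ic)

print(wpath_similarity(wn.synset('car.n.01'), wn.synset('bicycle.n.01')))
```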

    A Review on Computing Semantic Similarity of Concepts in Knowledge Graphs

    Semantic similarity is a metric defined over a set of documents or terms, where the notion of distance between them is based on the likeness of their meaning or semantic content, as opposed to similarity estimated from their syntactic representation (e.g., their string format). One drawback of conventional knowledge-based approaches (e.g., path or lch) to this task is that the semantic similarity of any two concepts with the same path length is the same (the uniform distance problem). This work proposes a weighted path length (wpath) method that combines both path length and IC in measuring the semantic similarity between concepts. The IC of two concepts' LCS is used to weight their shortest path length, so that concept pairs with the same path length can have different semantic similarity scores if they have different LCSs.
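    To make the uniform distance problem concrete, the short sketch below (synset pairs are illustrative) prints the shortest path length and NLTK's purely path-based similarity, which equals 1 / (1 + path length): any two pairs with the same path length receive identical scores, whereas wpath, as sketched in the previous snippet, can separate them through the IC of their LCS.

```python
# Hedged illustration of the uniform distance problem: path-based similarity
# depends only on the shortest path length, so equal lengths imply equal scores.
from nltk.corpus import wordnet as wn

pairs = [(wn.synset('dog.n.01'), wn.synset('cat.n.01')),
         (wn.synset('car.n.01'), wn.synset('bicycle.n.01'))]

for s1, s2 in pairs:
    print(s1.name(), s2.name(),
          'path_len =', s1.shortest_path_distance(s2),
          'path_sim =', round(s1.path_similarity(s2), 3))   # = 1 / (1 + path_len)
```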

    Multi-sense Embeddings Using Synonym Sets and Hypernym Information from Wordnet

    Word embedding approaches have increased the efficiency of natural language processing (NLP) tasks. Traditional word embeddings, though robust for many NLP tasks, do not handle polysemy of words. Tasks involving semantic similarity between concepts need to understand relations like hypernymy and synonym sets to produce efficient word embeddings. The outcomes of any expert system are affected by the text representation. Systems that understand senses, context, and definitions of concepts while deriving vector representations overcome the drawbacks of single-vector representations. This paper presents a novel idea for handling polysemy by generating multi-sense embeddings using synonym sets and hypernym information of words. It derives embeddings of a word by understanding its information at different levels, from sense to context and definitions. The proposed sense embeddings obtained prominent results when tested on word similarity tasks. The approach was evaluated on nine benchmark datasets, where it outperformed several state-of-the-art systems.
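    A minimal sketch of the general idea follows (an illustration, not the authors' exact model): each WordNet synset receives a vector by averaging pre-trained word vectors over its own lemmas, the lemmas of its hypernyms, and its gloss, so that different senses of the same word get different embeddings. The GloVe vectors loaded via gensim are an assumed stand-in for whatever embeddings the paper uses.

```python
# Hedged sketch: build per-sense embeddings from WordNet synonym sets,
# hypernym lemmas, and glosses by averaging pre-trained word vectors.
import numpy as np
from nltk.corpus import wordnet as wn
import gensim.downloader as api

word_vectors = api.load('glove-wiki-gigaword-50')   # assumed pre-trained vectors

def sense_embedding(synset):
    """Average word vectors over synonyms, hypernym lemmas, and gloss words."""
    tokens = [l.name().lower() for l in synset.lemmas()]
    for hyper in synset.hypernyms():
        tokens += [l.name().lower() for l in hyper.lemmas()]
    tokens += synset.definition().lower().split()
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(word_vectors.vector_size)

bank_river = sense_embedding(wn.synset('bank.n.01'))   # sloping land by a river
bank_money = sense_embedding(wn.synset('bank.n.02'))   # financial institution
```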

    Constructing Cooking Ontology for Live Streams

    We build cooking domain knowledge using an ontology schema that reflects natural language processing, and enhance ontology instances with semantic queries. Our research helps audiences better understand live streaming, especially when they have just switched to a show. The practical contribution of our research is to use a cooking ontology to map clips of cooking live-stream video to instructions in recipes. The architecture of our study comprises three sections: ontology construction, ontology enhancement, and mapping cooking video to the cooking ontology. Our preliminary evaluations use three hierarchies (nodes, ordered pairs, and 3-tuples) to assess (1) ontology enhancement performance in our first experiment and (2) the accuracy ratio of the mapping between video clips and the cooking ontology in our second experiment. Our results indicate that ontology enhancement is effective and improves accuracy ratios when matching video clips with the cooking ontology.
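    As a rough illustration of the three sections of the architecture, the sketch below builds a tiny cooking ontology with rdflib (all class and property names are hypothetical, not the authors' schema): a schema is constructed, instances are added to enhance it, and a live-stream video clip is mapped to the recipe step it depicts.

```python
# Hedged sketch of a minimal cooking ontology and a clip-to-step mapping.
from rdflib import Graph, Namespace, RDF, RDFS

COOK = Namespace('http://example.org/cooking#')   # hypothetical namespace
g = Graph()
g.bind('cook', COOK)

# Schema (ontology construction)
for cls in ('Recipe', 'Step', 'Ingredient', 'VideoClip'):
    g.add((COOK[cls], RDF.type, RDFS.Class))
for prop in ('hasStep', 'usesIngredient', 'depicts'):
    g.add((COOK[prop], RDF.type, RDF.Property))

# Instances (ontology enhancement) and mapping a clip to a step
g.add((COOK.Omelette, RDF.type, COOK.Recipe))
g.add((COOK.BeatEggs, RDF.type, COOK.Step))
g.add((COOK.Omelette, COOK.hasStep, COOK.BeatEggs))
g.add((COOK.BeatEggs, COOK.usesIngredient, COOK.Egg))
g.add((COOK.Clip42, RDF.type, COOK.VideoClip))
g.add((COOK.Clip42, COOK.depicts, COOK.BeatEggs))   # video clip -> recipe step

print(g.serialize(format='turtle'))
```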

    Similarity Computing on Electronic Health Records

    Similarity computing on real-world applications like Electronic Health Records (EHRs) can reveal a wealth of interesting knowledge. Similarity measures the closeness between comparable things, such as patients. For example, similarity computing amongst Intensive Care Unit (ICU) patients can yield various benefits, such as case-based patient retrieval and the discovery of similar patient groups. However, many classical methods, such as Euclidean distance and cosine similarity, are not directly applicable, because similarity computing on EHRs is subjective and in many cases conditional. In addition, many intrinsic relationships in the data are lost due to poor data representation and conversion. To address these challenges, we first propose a structural network representation for EHRs that preserves inherent relationships. To make records more comparable, we perform data enrichment, e.g., adding abstracted information. We then extract different similarity feature sets to generate different similarity metrics and retrieve the top similar patients. Finally, our experiments show promising results compared with classical methods.
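    A hedged sketch of the overall idea follows (not the authors' exact pipeline): each patient record becomes a small graph whose nodes are clinical events enriched with abstracted codes, one similarity feature (Jaccard overlap of the enriched node sets) is computed, and the most similar patients are retrieved. The diagnosis codes and the abstraction map are toy assumptions.

```python
# Hedged sketch: network representation of EHRs, data enrichment with
# abstracted codes, one similarity feature, and top-similar-patient retrieval.
import networkx as nx

def patient_graph(events, abstraction):
    """Build a star-shaped record graph, adding abstract parent codes."""
    g = nx.Graph()
    g.add_node('patient')
    for code in events:
        g.add_edge('patient', code)
        if code in abstraction:                  # data enrichment step
            g.add_edge(code, abstraction[code])
    return g

def node_jaccard(g1, g2):
    """One similarity feature: overlap of the enriched node sets."""
    n1, n2 = set(g1.nodes) - {'patient'}, set(g2.nodes) - {'patient'}
    return len(n1 & n2) / len(n1 | n2) if n1 | n2 else 0.0

abstraction = {'I21.0': 'MI', 'I21.4': 'MI', 'E11.9': 'Diabetes'}   # toy hierarchy
records = {'p1': ['I21.0', 'E11.9'], 'p2': ['I21.4'], 'p3': ['E11.9']}
graphs = {pid: patient_graph(ev, abstraction) for pid, ev in records.items()}

query = 'p1'
ranked = sorted((pid for pid in graphs if pid != query),
                key=lambda pid: node_jaccard(graphs[query], graphs[pid]),
                reverse=True)
print(ranked)   # patients most similar to p1 first
```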

    CoCo: A tool for automatically assessing conceptual complexity of texts

    Traditional text complexity assessment usually takes into account only syntactic and lexical complexity. The task of automatically assessing conceptual text complexity, important for maintaining readers' interest and for adapting texts for struggling readers, has only been proposed recently. In this paper, we present CoCo, a tool for automatic assessment of conceptual text complexity based on the current state-of-the-art unsupervised approach. We make the code and API freely available for research purposes, and describe the code and the possibilities for its personalization and adaptation in detail. We compare the current implementation with the state of the art, discussing the influence of the choice of entity linker on the performance of the tool. Finally, we present results obtained on two widely used text simplification corpora, discussing the full potential of the tool.
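    The sketch below illustrates the kind of features such a tool can compute over linked entities (feature names and the toy input are assumptions, not CoCo's actual API): concept density per sentence and the rate at which previously unseen concepts are introduced, both of which depend on the chosen entity linker.

```python
# Hedged sketch of simple conceptual-complexity features over linked entities.
def conceptual_complexity(sentences_with_entities):
    """sentences_with_entities: list of lists of linked entity IDs per sentence."""
    seen, new_counts, densities = set(), [], []
    for entities in sentences_with_entities:
        densities.append(len(entities))                  # concepts per sentence
        new_counts.append(len(set(entities) - seen))     # newly introduced concepts
        seen.update(entities)
    n = max(len(sentences_with_entities), 1)
    return {
        'avg_concept_density': sum(densities) / n,
        'avg_new_concepts': sum(new_counts) / n,
        'distinct_concepts': len(seen),
    }

doc = [['dbpedia:Photosynthesis', 'dbpedia:Chlorophyll'],
       ['dbpedia:Photosynthesis', 'dbpedia:Glucose']]
print(conceptual_complexity(doc))
```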