95 research outputs found

    Evaluating semantic relations by exploring ontologies on the Semantic Web

    We investigate the problem of evaluating the correctness of a semantic relation and propose two methods which explore the increasing number of online ontologies as a source of evidence for predicting correctness. We obtain encouraging results, with some of our measures reaching average precision values of 75%

    Retrieval, alignment, and clustering of computational models based on semantic annotations

    As the number of computational systems biology models increases, new methods are needed to explore their content and build connections with experimental data. In this Perspective article, the authors propose a flexible semantic framework that can help achieve these aims

    Disambiguation of biomedical text using diverse sources of information

    Background: Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of various sources of information including linguistic features of the context in which the ambiguous term is used and domain-specific resources, such as UMLS. Materials and methods: We compare various sources of information including ones which have been previously used and a novel one: MeSH terms. Evaluation is carried out using a standard test set (the NLM-WSD corpus). Results: The best performance is obtained using a combination of linguistic features and MeSH terms. Performance of our system exceeds previously published results for systems evaluated using the same data set. Conclusion: Disambiguation of biomedical terms benefits from the use of information from a variety of sources. In particular, MeSH terms have proved to be useful and should be used if available

    An evaluative baseline for geo-semantic relatedness and similarity

    In geographic information science and semantics, the computation of semantic similarity is widely recognised as key to supporting a vast number of tasks in information integration and retrieval. By contrast, the role of geo-semantic relatedness has been largely ignored. In natural language processing, semantic relatedness is often confused with the more specific semantic similarity. In this article, we discuss a notion of geo-semantic relatedness based on Lehrer’s semantic fields, and we compare it with geo-semantic similarity. We then describe and validate the Geo Relatedness and Similarity Dataset (GeReSiD), a new open dataset designed to evaluate computational measures of geo-semantic relatedness and similarity. This dataset is larger than existing datasets of this kind, and includes 97 geographic terms combined into 50 term pairs rated by 203 human subjects. GeReSiD is available online and can be used as an evaluation baseline to determine empirically to what degree a given computational model approximates geo-semantic relatedness and similarity
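    One way to use a dataset of human-rated term pairs as an evaluation baseline is to rank-correlate a model's scores with the human ratings. The sketch below assumes Spearman's rank correlation as the agreement statistic; the abstract does not name a statistic, and the function names and sample numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(model_scores, human_ratings):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(model_scores), average_ranks(human_ratings))


# Hypothetical scores for four term pairs (not taken from GeReSiD).
model = [0.9, 0.1, 0.5, 0.3]
human = [4.6, 1.2, 3.0, 2.9]
rho = spearman(model, human)
```

    A value of rho close to 1 would indicate that the model orders the term pairs much as the human subjects did.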

    Geotag Propagation with User Trust Modeling

    The amount of information that people share on social networks is constantly increasing. People also comment, annotate, and tag their own content (videos, photos, notes, etc.), as well as the content of others. In many cases, the content is tagged manually. One way to make this time-consuming manual tagging process more efficient is to propagate tags automatically from a small set of tagged images to the larger set of untagged images. In such a scenario, however, a wrong or spam tag can damage the integrity and reliability of the automated propagation system. Users may make mistakes in tagging, or irrelevant tags and content may be added maliciously for advertisement or self-promotion. Therefore, a mechanism ensuring the trustworthiness of users or published content is needed. In this chapter, we discuss several image retrieval methods based on tags, various approaches to trust modeling and spam protection in social networks, and trust modeling in geotagging systems. We then consider a specific example of an automated geotag propagation system that adopts a user trust model. The tag propagation in images relies on the similarity between image content (famous landmarks) and its context (associated geotags). For each tagged image, similar untagged images are found by robust graph-based object duplicate detection, and the known tags are propagated accordingly. The user trust value is estimated from social feedback given by the users of the photo-sharing system, and only tags from trusted users are propagated. This approach demonstrates that a practical tagging system benefits significantly from the intelligent combination of an efficient propagation algorithm and a user-centered trust model
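    The trust-gated propagation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's algorithm: trust is taken to be the fraction of positive feedback, the 0.5 threshold is arbitrary, and the similar-image lookup is a precomputed dictionary standing in for the object duplicate detection step. All names and sample data are hypothetical.

```python
def trust_scores(feedback):
    """Map each user to the share of their tags that others judged correct.

    feedback: user -> list of booleans (True = tag confirmed by other users).
    Assumption: a user with no feedback gets a trust value of 0.0.
    """
    return {
        user: (sum(votes) / len(votes) if votes else 0.0)
        for user, votes in feedback.items()
    }


def propagate_geotags(tagged, similar, trust, threshold=0.5):
    """Copy geotags to similar untagged images, but only from trusted users.

    tagged:  image id -> (user, geotag)
    similar: image id -> list of untagged image ids found to match it
    trust:   user -> trust value in [0, 1]
    """
    propagated = {}
    for image, (user, geotag) in tagged.items():
        if trust.get(user, 0.0) < threshold:
            continue  # skip tags from users below the trust threshold
        for target in similar.get(image, []):
            # Keep the first tag that reaches a target image.
            propagated.setdefault(target, geotag)
    return propagated


# Hypothetical example data.
feedback = {"alice": [True, True, False], "bob": [False, False]}
tagged = {"img1": ("alice", "Eiffel Tower"), "img2": ("bob", "Big Ben")}
similar = {"img1": ["img3"], "img2": ["img4"]}

trust = trust_scores(feedback)
result = propagate_geotags(tagged, similar, trust)
```

    Here only the tag from "alice" is propagated, because "bob" falls below the assumed threshold.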

    A transversal approach to predict gene product networks from ontology-based similarity

    Background: Interpretation of transcriptomic data is usually made through a "standard" approach which consists of clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to highlight functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and expression data. Results: The transversal approach presented in this paper consists of computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks that proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion: Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity with expression data
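    The vector-space comparison of annotation sets can be sketched as below. The abstract does not specify the weighting scheme, so an IDF-style weight (rarer terms count more) is assumed here purely for illustration, with cosine similarity as the vector comparison. Gene and term identifiers are hypothetical placeholders.

```python
import math
from collections import Counter


def idf_weights(annotations):
    """Weight each term by how rare it is across gene products (assumed scheme)."""
    n = len(annotations)
    doc_freq = Counter(t for terms in annotations.values() for t in set(terms))
    return {term: math.log(n / count) + 1.0 for term, count in doc_freq.items()}


def to_vector(terms, weights):
    """Represent one gene product as a sparse term -> weight vector."""
    return {term: weights[term] for term in set(terms)}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(annotations):
    """Return pairwise similarities keyed by (gene_a, gene_b)."""
    weights = idf_weights(annotations)
    vectors = {g: to_vector(terms, weights) for g, terms in annotations.items()}
    genes = sorted(vectors)
    return {(a, b): cosine(vectors[a], vectors[b]) for a in genes for b in genes}


# Hypothetical annotations: gene product -> annotating terms.
annotations = {
    "gene1": ["term_a", "term_b"],
    "gene2": ["term_a", "term_b"],
    "gene3": ["term_c"],
}
matrix = similarity_matrix(annotations)
```

    Gene products with identical annotation sets score 1, and those sharing no terms score 0; thresholding such a matrix is one possible way to obtain network edges.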

    ReaderBench Learns Dutch: Building a Comprehensive Automated Essay Scoring System for Dutch Language

    Automated Essay Scoring has gained wider applicability and usage with the integration of advanced Natural Language Processing techniques, which enable in-depth analyses of discourse in order to capture the specificities of written texts. In this paper, we introduce a novel Automated Essay Scoring method for the Dutch language, built within the ReaderBench framework, which encompasses a wide range of textual complexity indices, as well as an automated segmentation approach. Our method was evaluated on a corpus of 173 technical reports automatically split into sections and subsections, thus forming a hierarchical structure on which textual complexity indices were subsequently applied. The stepwise regression model explained 30.5% of the variance in students’ scores, while a Discriminant Function Analysis predicted with substantial accuracy (75.1%) whether they were high- or low-performance students. This study is part of the RAGE project. The RAGE project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 644187. This publication reflects only the author's view. The European Commission is not responsible for any use that may be made of the information it contains

    Omiotis: A Thesaurus-Based Measure of Text Relatedness

    In this paper we present a new approach for measuring the relatedness between text segments, based on implicit semantic links between their words, as offered by a word thesaurus, namely WordNet. The approach does not require any type of training, since it exploits only WordNet to devise the implicit semantic links between text words. The paper presents a prototype online demo of the measure that can provide word-to-word relatedness values, even for words of different parts of speech. In addition, the demo allows for the computation of relatedness between text segments

    Building Semantic Hierarchies Faithful to Image Semantics

    This paper proposes a new image-semantic measure, named "Semantico-Visual Relatedness of Concepts" (SVRC), to estimate the semantic similarity between concepts. The proposed measure incorporates visual, conceptual and contextual information to provide a measure which is more meaningful and more representative of image semantics. We also propose a new methodology to automatically build a semantic hierarchy suitable for the purpose of image annotation and/or classification. The construction is based on the proposed SVRC measure and on a new heuristic, named TRUST-ME, which connects concepts with higher relatedness until the final hierarchy is built. The resulting hierarchy explicitly encodes a general-to-specific relationship between concepts and therefore provides a semantic structure which facilitates the semantic interpretation of images. Our experiments showed that using the constructed semantic hierarchies as a hierarchical classification framework provides better image annotation