8,949 research outputs found

    The semantics of similarity in geographic information retrieval

    Get PDF
    Similarity measures have a long tradition in fields such as information retrieval artificial intelligence and cognitive science. Within the last years these measures have been extended and reused to measure semantic similarity; i.e. for comparing meanings rather than syntactic differences. Various measures for spatial applications have been developed but a solid foundation for answering what they measure; how they are best applied in information retrieval; which role contextual information plays; and how similarity values or rankings should be interpreted is still missing. It is therefore difficult to decide which measure should be used for a particular application or to compare results from different similarity theories. Based on a review of existing similarity measures we introduce a framework to specify the semantics of similarity. We discuss similarity-based information retrieval paradigms as well as their implementation in web-based user interfaces for geographic information retrieval to demonstrate the applicability of the framework. Finally we formulate open challenges for similarity research

    An evaluative baseline for geo-semantic relatedness and similarity

    Get PDF
    In geographic information science and semantics, the computation of semantic similarity is widely recognised as key to supporting a vast number of tasks in information integration and retrieval. By contrast, the role of geo-semantic relatedness has been largely ignored. In natural language processing, semantic relatedness is often confused with the more specific semantic similarity. In this article, we discuss a notion of geo-semantic relatedness based on Lehrer’s semantic fields, and we compare it with geo-semantic similarity. We then describe and validate the Geo Relatedness and Similarity Dataset (GeReSiD), a new open dataset designed to evaluate computational measures of geo-semantic relatedness and similarity. This dataset is larger than existing datasets of this kind, and includes 97 geographic terms combined into 50 term pairs rated by 203 human subjects. GeReSiD is available online and can be used as an evaluation baseline to determine empirically to what degree a given computational model approximates geo-semantic relatedness and similarity

    A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web

    Full text link
    Over the past decade, rapid advances in web technologies, coupled with innovative models of spatial data collection and consumption, have generated a robust growth in geo-referenced information, resulting in spatial information overload. Increasing 'geographic intelligence' in traditional text-based information retrieval has become a prominent approach to respond to this issue and to fulfill users' spatial information needs. Numerous efforts in the Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the Linking Open Data initiative have converged in a constellation of open knowledge bases, freely available online. In this article, we survey these open knowledge bases, focusing on their geospatial dimension. Particular attention is devoted to the crucial issue of the quality of geo-knowledge bases, as well as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic Network, is outlined as our contribution to this area. Research directions in information integration and Geographic Information Retrieval (GIR) are then reviewed, with a critical discussion of their current limitations and future prospects

    Semantics-driven event clustering in Twitter feeds

    Get PDF
    Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms which all use different information sources - either textual, temporal, geographic or community features - have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic information can also be used to drive the actual event detection, which is less covered by academic research. We therefore supplemented an existing baseline event clustering algorithm with semantic information about the tweets in order to improve its performance. This paper lays out the details of the semantics-driven event clustering algorithms developed, discusses a novel method to aid in the creation of a ground truth for event detection purposes, and analyses how well the algorithms improve over baseline. We find that assigning semantic information to every individual tweet results in just a worse performance in F1 measure compared to baseline. If however semantics are assigned on a coarser, hashtag level the improvement over baseline is substantial and significant in both precision and recall
    • …
    corecore