3,223 research outputs found

    Spatio-textual indexing for geographical search on the web

    Many web documents refer to specific geographic localities, and many people include geographic context in queries to web search engines. Standard web search engines treat geographical terms in the same way as other terms. This can result in failure to find relevant documents that refer to the place of interest using alternative related names, such as those of included or nearby places. This can be overcome by associating text indexing with spatial indexing methods that exploit geo-tagging procedures to categorise documents with respect to geographic space. We describe three methods for spatio-textual indexing, based on (i) multiple spatially indexed text indexes, (ii) attaching spatial indexes to the document occurrences of a text index, and (iii) merging text index access results with the results of access to a spatial index of documents. These schemes are compared experimentally with a conventional text index search engine, using a collection of geo-tagged web documents, and are shown to be competitive with pure text indexing in speed and storage performance.
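
    The third scheme in the abstract above, merging text index results with spatial index results, can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `SpatioTextualIndex`, the uniform grid used as the spatial index, the point footprints, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


class SpatioTextualIndex:
    """Toy index: an inverted text index plus a separate uniform-grid spatial index."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size
        self.postings = defaultdict(set)  # term -> set of document ids
        self.cells = defaultdict(set)     # (cell_x, cell_y) -> set of document ids
        self.footprints = {}              # document id -> (lon, lat) assigned by geo-tagging

    def _cell(self, lon, lat):
        # Floor division keeps negative coordinates in the correct cell.
        return (int(lon // self.cell_size), int(lat // self.cell_size))

    def add(self, doc_id, text, lon, lat):
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        self.cells[self._cell(lon, lat)].add(doc_id)
        self.footprints[doc_id] = (lon, lat)

    def search(self, terms, bbox):
        """Return ids of documents containing every term and located inside bbox."""
        min_lon, min_lat, max_lon, max_lat = bbox
        # Step 1: conventional text index access (AND over the query terms).
        term_sets = [self.postings.get(t.lower(), set()) for t in terms]
        text_hits = set.intersection(*term_sets) if term_sets else set()
        # Step 2: spatial index access over the grid cells that overlap bbox.
        cx0, cy0 = self._cell(min_lon, min_lat)
        cx1, cy1 = self._cell(max_lon, max_lat)
        spatial_hits = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for doc_id in self.cells.get((cx, cy), ()):
                    lon, lat = self.footprints[doc_id]
                    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                        spatial_hits.add(doc_id)
        # Step 3: merge the two result sets.
        return sorted(text_hits & spatial_hits)


index = SpatioTextualIndex(cell_size=1.0)
index.add(1, "castle in Cardiff", -3.18, 51.48)
index.add(2, "castle in Edinburgh", -3.19, 55.95)
index.add(3, "castle ruins near Penarth", -3.17, 51.43)
index.add(4, "Cardiff Bay hotels", -3.16, 51.46)

# Hypothetical bounding box roughly covering Wales.
wales = (-5.5, 51.3, -2.6, 53.5)
print(index.search(["castle"], wales))
```

    In this sketch, document 3 is returned for the regional query even though its text names only a nearby place, which is the kind of match a purely textual search for the region name would miss.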

    TALP-UPC at MediaEval 2014 Placing Task: Combining geographical knowledge bases and language models for large-scale textual georeferencing

    This paper describes our georeferencing approaches, experiments, and results at the MediaEval 2014 Placing Task evaluation. The task consists of predicting the most probable geographical coordinates of Flickr images and videos using their visual, audio, and associated metadata features. Our approaches used only the textual metadata annotations and tagsets provided by Flickr users. We used four approaches for this task: 1) an approach based on Geographical Knowledge Bases (GeoKB), 2) the Hiemstra Language Model (HLM) approach with Re-Ranking, 3) a combination of the GeoKB and the HLM (GeoFusion), and 4) a combination of GeoFusion with an HLM model derived from the georeferenced pages of the English Wikipedia. The HLM approach with Re-Ranking showed the best performance within distances of 10m to 1km. The GeoFusion approaches achieved the best results within error margins of 10km to 5000km. This work has been supported by the Spanish Research Department (SKATER Project: TIN2012-38584-C06-01). TALP Research Center is recognized as a Quality Research Group (2014 SGR 1338) by AGAUR, the Research Department of the Catalan Government.

    Automatic tagging and geotagging in video collections and communities

    Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, we present a reference algorithm that was used within MediaEval 2010 and comment on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that users assign to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.

    A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters

    Keyword-based web queries with local intent retrieve web content that is relevant to supplied keywords and that represents points of interest near the query location. Two broad categories of such queries exist. The first encompasses queries that retrieve single spatial web objects that each satisfy the query arguments. Most proposals belong to this category. The second category, to which this paper's proposal belongs, encompasses queries that support exploratory user behavior and retrieve sets of objects that represent regions of space that may be of interest to the user. Specifically, the paper proposes a new type of query, namely the top-k spatial textual clusters (k-STC) query, which returns the top-k clusters that (i) are located the closest to a given query location, (ii) contain the most relevant objects with regard to given query keywords, and (iii) have an object density that exceeds a given threshold. To compute this query, we propose a basic algorithm that relies on on-line density-based clustering and exploits an early stop condition. To improve the response time, we design an advanced approach that includes three techniques: (i) an object skipping rule, (ii) spatially gridded posting lists, and (iii) a fast range query algorithm. An empirical study on real data demonstrates that the paper's proposals scale well and perform efficiently.
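
    The shape of a k-STC query can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it uses a plain DBSCAN-style clustering without the early stop condition, object skipping rule, or gridded posting lists, and the scoring formula, the weight `alpha`, and all function names are assumptions.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns clusters as lists of point indices, noise excluded."""
    labels = [None] * len(points)  # None = unvisited, -1 = noise
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            expanded = region_query(points, j, eps)
            if len(expanded) >= min_pts:  # j is a core point, keep growing
                queue.extend(expanded)
        cluster += 1
    clusters = {}
    for idx, label in enumerate(labels):
        if label is not None and label >= 0:
            clusters.setdefault(label, []).append(idx)
    return list(clusters.values())


def top_k_clusters(objects, query_loc, keywords, k, eps, min_pts, alpha=0.5):
    """objects: list of (x, y, text). Returns up to k clusters of object indices."""
    wanted = {w.lower() for w in keywords}

    def relevance(text):
        # Fraction of query keywords present in the object's text.
        return len(wanted & set(text.lower().split())) / len(wanted)

    # Keep only objects that match at least one keyword.
    relevant = [i for i, (_, _, text) in enumerate(objects) if relevance(text) > 0]
    points = [(objects[i][0], objects[i][1]) for i in relevant]

    scored = []
    for members in dbscan(points, eps, min_pts):
        ids = [relevant[m] for m in members]
        distance = min(math.dist(query_loc, objects[i][:2]) for i in ids)
        mean_rel = sum(relevance(objects[i][2]) for i in ids) / len(ids)
        # Hypothetical score: lower is better (closer and more relevant).
        scored.append((alpha * distance + (1 - alpha) * (1 - mean_rel), ids))
    scored.sort(key=lambda pair: pair[0])
    return [ids for _, ids in scored[:k]]


objects = [
    (0.0, 0.0, "coffee shop"), (0.1, 0.0, "coffee roaster"), (0.0, 0.1, "coffee bar"),
    (0.05, 0.05, "pizza place"),
    (5.0, 5.0, "coffee house"), (5.1, 5.0, "coffee cart"), (5.0, 5.1, "coffee stand"),
    (20.0, 20.0, "coffee kiosk"),
]
result = top_k_clusters(objects, (0.0, 0.0), ["coffee"], k=2, eps=0.5, min_pts=3)
print(result)
```

    Here `eps` and `min_pts` play the role of the density threshold in condition (iii), while the isolated kiosk and the non-matching pizza place are excluded from every returned cluster.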

    Sidra5: a search system with geographic signatures

    Master's thesis in Informatics Engineering, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2007. This dissertation presents the development of a geographic information search system that implements geographic signatures, a novel approach to modeling the geographic information present in documents. The goal of the project was to determine whether the geographic semantics present in documents, captured as geographic signatures, contribute to improved results for geographic queries. Several strategies for computing the similarity between the geographic signatures of queries and documents are proposed and evaluated. The results show that, in some circumstances, geographic signatures can indeed improve the search quality of geographic queries.

    A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web

    Over the past decade, rapid advances in web technologies, coupled with innovative models of spatial data collection and consumption, have generated robust growth in geo-referenced information, resulting in spatial information overload. Increasing 'geographic intelligence' in traditional text-based information retrieval has become a prominent approach to respond to this issue and to fulfill users' spatial information needs. Numerous efforts in the Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the Linking Open Data initiative have converged in a constellation of open knowledge bases, freely available online. In this article, we survey these open knowledge bases, focusing on their geospatial dimension. Particular attention is devoted to the crucial issue of the quality of geo-knowledge bases, as well as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic Network, is outlined as our contribution to this area. Research directions in information integration and Geographic Information Retrieval (GIR) are then reviewed, with a critical discussion of their current limitations and future prospects.