Spatio-textual indexing for geographical search on the web
Many web documents refer to specific geographic localities and many
people include geographic context in queries to web search engines. Standard
web search engines treat the geographical terms in the same way as other terms.
This can result in failure to find relevant documents that refer to the place of
interest using alternative related names, such as those of included or nearby
places. This can be overcome by associating text indexing with spatial indexing
methods that exploit geo-tagging procedures to categorise documents with
respect to geographic space. We describe three methods for spatio-textual
indexing: building multiple spatially indexed text indexes, attaching spatial
indexes to the document occurrences of a text index, and merging the results
of a text index lookup with those of a spatial index of documents. These
schemes are compared experimentally with a conventional text index search
engine, using a collection of geo-tagged web documents, and are shown to be
able to compete in speed and storage performance with pure text indexing
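
The third scheme named in the abstract above (merging text index results with spatial index results) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name SpatioTextualIndex, the uniform grid, the cell_size parameter, and the use of a single point per document are all assumptions made for illustration.

```python
from collections import defaultdict


class SpatioTextualIndex:
    """Toy index (hypothetical): an inverted text index plus a uniform grid."""

    def __init__(self, cell_size=1.0):
        self.cell_size = cell_size          # grid cell width in degrees (assumed)
        self.postings = defaultdict(set)    # term -> set of document ids
        self.grid = defaultdict(set)        # (col, row) -> set of document ids
        self.locations = {}                 # document id -> (lon, lat)

    def _cell(self, lon, lat):
        # Floor division keeps negative coordinates in the correct cell.
        return (int(lon // self.cell_size), int(lat // self.cell_size))

    def add(self, doc_id, text, lon, lat):
        """Index a geo-tagged document by its terms and by its location."""
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        self.grid[self._cell(lon, lat)].add(doc_id)
        self.locations[doc_id] = (lon, lat)

    def text_search(self, terms):
        """Documents containing every query term (boolean AND)."""
        sets = [self.postings.get(t.lower(), set()) for t in terms]
        return set.intersection(*sets) if sets else set()

    def spatial_search(self, min_lon, min_lat, max_lon, max_lat):
        """Documents located inside the bounding box."""
        c0, r0 = self._cell(min_lon, min_lat)
        c1, r1 = self._cell(max_lon, max_lat)
        hits = set()
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                for doc_id in self.grid.get((c, r), ()):
                    lon, lat = self.locations[doc_id]
                    if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                        hits.add(doc_id)
        return hits

    def search(self, terms, bbox):
        """Merge step: intersect the text results with the spatial results."""
        return self.text_search(terms) & self.spatial_search(*bbox)


idx = SpatioTextualIndex()
idx.add("a", "castle hotel Cardiff", -3.18, 51.48)
idx.add("b", "castle hotel Edinburgh", -3.19, 55.95)
idx.add("c", "museum Penarth", -3.17, 51.44)
cardiff_area = (-3.5, 51.3, -3.0, 51.7)  # illustrative bounding box
print(idx.search(["castle", "hotel"], cardiff_area))
print(idx.search(["museum"], cardiff_area))
```

In this sketch the place name is replaced by a bounding box, so the document that mentions only a nearby place ("Penarth") is still retrieved for a query about the Cardiff area, which is the failure case for pure text matching that the abstract describes.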
TALP-UPC at MediaEval 2014 Placing Task: Combining geographical knowledge bases and language models for large-scale textual georeferencing
This paper describes our georeferencing approaches, experiments, and results at the MediaEval 2014 Placing Task evaluation. The task consists of predicting the most probable geographical coordinates of Flickr images and videos using their visual, audio, and associated metadata features. Our approaches used only Flickr users' textual metadata annotations and tagsets. We used four approaches for this task: 1) an approach based on Geographical Knowledge Bases (GeoKB), 2) the Hiemstra Language Model (HLM) approach with Re-Ranking, 3) a combination of the GeoKB and the HLM (GeoFusion), and 4) a combination of the GeoFusion with an HLM model derived from the English Wikipedia georeferenced pages. The HLM approach with Re-Ranking showed the best performance at distances from 10 m to 1 km. The GeoFusion approaches achieved the best results for error margins from 10 km to 5000 km. This work has been supported by the Spanish Research Department (SKATER Project: TIN2012-38584-C06-01). TALP Research Center is recognized as a Quality Research Group (2014 SGR 1338) by AGAUR, the Research Department of the Catalan Government.
Automatic tagging and geotagging in video collections and communities
Automatically generated tags and geotags hold great promise
to improve access to video collections and online communities.
We overview three tasks offered in the MediaEval 2010
benchmarking initiative, describing for each its use scenario, definition, and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, and comments are included on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features
A Density-Based Approach to the Retrieval of Top-K Spatial Textual Clusters
Keyword-based web queries with local intent retrieve web content that is
relevant to supplied keywords and that represents points of interest that are
near the query location. Two broad categories of such queries exist. The first
encompasses queries that retrieve single spatial web objects that each satisfy
the query arguments. Most proposals belong to this category. The second
category, to which this paper's proposal belongs, encompasses queries that
support exploratory user behavior and retrieve sets of objects that represent
regions of space that may be of interest to the user. Specifically, the paper
proposes a new type of query, namely the top-k spatial textual clusters (k-STC)
query that returns the top-k clusters that (i) are located the closest to a
given query location, (ii) contain the most relevant objects with regard to
given query keywords, and (iii) have an object density that exceeds a given
threshold. To compute this query, we propose a basic algorithm that relies on
on-line density-based clustering and exploits an early stop condition. To
improve the response time, we design an advanced approach that includes three
techniques: (i) an object skipping rule, (ii) spatially gridded posting lists,
and (iii) a fast range query algorithm. An empirical study on real data
demonstrates that the paper's proposals offer scalability and are capable of
excellent performance
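
As a rough illustration of the k-STC idea described above, the sketch below filters objects by keyword, groups them with a plain DBSCAN, and ranks the clusters by a cost that mixes distance and text relevance. It is a simplified sketch, not the paper's algorithm: the abstract does not give the scoring function, so the alpha weighting and the relevance measure are assumptions, and none of the paper's optimisations (early stop, object skipping, gridded posting lists, fast range queries) are implemented. All function and parameter names are hypothetical.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (linear scan)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns clusters as lists of point indices."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    clusters = []
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:       # density below threshold
            labels[i] = NOISE
            continue
        cid = len(clusters)
        clusters.append([i])
        labels[i] = cid
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:          # border point of this cluster
                labels[j] = cid
                clusters[cid].append(j)
                continue
            if labels[j] is not UNVISITED:  # already assigned to a cluster
                continue
            labels[j] = cid
            clusters[cid].append(j)
            nbrs = region_query(points, j, eps)
            if len(nbrs) >= min_pts:        # j is a core point: keep expanding
                queue.extend(nbrs)
    return clusters


def top_k_clusters(objects, query_loc, keywords, k, eps, min_pts, alpha=0.5):
    """Return up to k clusters, lowest cost first.

    objects: iterable of (x, y, text). The cost below is an assumed
    stand-in for the paper's ranking function.
    """
    qx, qy = query_loc
    kw = {w.lower() for w in keywords}
    relevant = []                           # ((x, y), relevance in (0, 1])
    for x, y, text in objects:
        overlap = len(kw & set(text.lower().split()))
        if overlap:
            relevant.append(((x, y), overlap / len(kw)))
    points = [p for p, _ in relevant]
    scored = []
    for cluster in dbscan(points, eps, min_pts):
        dist = min(math.hypot(points[i][0] - qx, points[i][1] - qy)
                   for i in cluster)
        rel = sum(relevant[i][1] for i in cluster) / len(cluster)
        cost = alpha * dist + (1 - alpha) * (1 - rel)
        scored.append((cost, sorted(points[i] for i in cluster)))
    scored.sort(key=lambda pair: pair[0])
    return [members for _, members in scored[:k]]
```

Because the clustering runs over every matching object before ranking, this version scales poorly; the early stop condition and indexing techniques mentioned in the abstract exist precisely to avoid that full pass.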
Sidra5: a search system with geographic signatures
Master's thesis in Informatics Engineering, presented to the University of Lisbon through the Faculty of Sciences, 2007. This dissertation presents the development of a geographic information search system that implements geographic signatures, a novel approach for modeling the geographic information present in documents. The goal of the project was to determine whether the geographic semantics present in documents, captured as geographic signatures, contribute to improved results for geographic searches. Several strategies for computing the similarity between the geographic signatures of queries and documents are proposed and evaluated. The results show that, in some circumstances, geographic signatures can indeed improve the search quality of geographic queries
A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web
Over the past decade, rapid advances in web technologies, coupled with
innovative models of spatial data collection and consumption, have generated a
robust growth in geo-referenced information, resulting in spatial information
overload. Increasing 'geographic intelligence' in traditional text-based
information retrieval has become a prominent approach to respond to this issue
and to fulfill users' spatial information needs. Numerous efforts in the
Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the
Linking Open Data initiative have converged in a constellation of open
knowledge bases, freely available online. In this article, we survey these open
knowledge bases, focusing on their geospatial dimension. Particular attention
is devoted to the crucial issue of the quality of geo-knowledge bases, as well
as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic
Network, is outlined as our contribution to this area. Research directions in
information integration and Geographic Information Retrieval (GIR) are then
reviewed, with a critical discussion of their current limitations and future
prospects