28 research outputs found

    GIR Experimentation

    The Geographic Information Retrieval (GIR) community has generally accepted the thesis that both the thematic and geographic aspects of documents are useful for GIR. This paper describes a preliminary experiment exploring this thesis by separately indexing and searching geographically relevant terms (place names, geo-spatial relations, geographic concepts and geographic adjectives) extracted from the reference document collection. Two indexes were created: one for the extracted geographically relevant terms (i.e. the document footprint) and one for the reference document collection. The Geo-Score and Thematic-Score, computed against the document collection footprint and the reference document collection respectively, were combined through linear interpolation to obtain the final score for document relevance ranking. We used several freely available geographic resources: Wikipedia, World-Gazetteer, GEOnet Names Server (GNS), and WordNet. Apache Lucene was used as the indexing and search platform, while Alias-I LingPipe was used to detect geographic named entities (GNEs) and other geo-relevant concepts and terms in documents. We submitted runs for the monolingual English task, and our system achieved a mean average precision (MAP) of 0.1690 to 0.2194. No significant improvement was observed through geographic query expansion.
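
The score combination described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives neither the interpolation weight nor any score normalisation, so the weight `lam`, the function names, and the assumption that both scores are already on a comparable scale are all hypothetical.

```python
def combine_scores(thematic_score, geo_score, lam=0.5):
    """Linear interpolation of two scores.

    lam is a hypothetical weight in [0, 1]; the abstract does not state
    the value that was actually used.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * thematic_score + (1.0 - lam) * geo_score


def rank_documents(thematic_scores, geo_scores, lam=0.5):
    """Rank documents by the interpolated score, best first.

    Both arguments map a document id to a score. A document missing from
    one index is given a score of 0.0 for that index (an assumption).
    """
    doc_ids = set(thematic_scores) | set(geo_scores)
    combined = {
        doc_id: combine_scores(
            thematic_scores.get(doc_id, 0.0),
            geo_scores.get(doc_id, 0.0),
            lam,
        )
        for doc_id in doc_ids
    }
    # Sort by descending score, breaking ties by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    thematic = {"d1": 0.8, "d2": 0.4}
    geo = {"d2": 1.0, "d3": 0.6}
    for doc_id, score in rank_documents(thematic, geo, lam=0.5):
        print(doc_id, round(score, 3))
```

Setting `lam=1.0` reduces the ranking to the thematic index alone, and `lam=0.0` to the footprint index alone, which makes the contribution of each index easy to compare.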

    Geographically constrained information retrieval

    Eighteen percent of information seekers demand geographically intelligent information retrieval systems (Sanderson and Kohler, 2004). State-of-the-art information retrieval (IR) systems lack the geographical intelligence needed to effectively answer geography-dependent questions. Two specific research objectives are addressed in this thesis: (1) how to mine and analyze the geographical information (GI) implicit in texts, and (2) how to use the geographical knowledge obtained in this way to build models for answering geography-dependent questions. We assume that every document and every search query has a geographical scope (i.e., where the events described are situated). In order to exploit the notion of geographical scope, we first developed techniques to detect the geographical scope of documents and to resolve the scopes in cases where the indications are complex or inconsistent. The thesis then turns to problems whose solution may be improved by incorporating the notion of geographical scope, namely (i) toponym resolution, i.e. determining which place is referred to when ambiguous place names (toponyms) are used, (ii) query expansion, the enrichment of queries often used in IR, and (iii) relevance ranking strategies. The toponym resolution strategy prefers candidate places in top-ranked scopes, and the query expansion strategy prefers place names in commonly shared scopes. The relevance ranking strategy incorporates scope information in the score calculation. New evaluation metrics that measure small discrepancies among toponym and scope resolution systems are also proposed. The scope and toponym resolution strategies achieved scores of 70% to 90% against human annotators. The query expansion and relevance ranking strategies outperformed state-of-the-art IR systems by 9%.
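
The idea that toponym resolution "prefers candidate places in top-ranked scopes" can be illustrated with a small sketch. The abstract does not describe the actual algorithm, so everything here is an assumption made for illustration: the function name, the representation of a candidate as a `(place, scope)` pair, and the fallback to the first candidate when no candidate lies in a ranked scope.

```python
def resolve_toponym(candidates, ranked_scopes):
    """Pick the candidate place whose scope is ranked highest.

    candidates: list of (place, scope) pairs for one ambiguous toponym
        (hypothetical representation).
    ranked_scopes: scope names for the document, best first.
    Returns the chosen (place, scope) pair, or None if there are no
    candidates.
    """
    if not candidates:
        return None
    # Map each scope to its rank position; lower means more preferred.
    rank = {scope: position for position, scope in enumerate(ranked_scopes)}
    worst = len(rank)  # rank assigned to candidates outside every ranked scope
    # min() returns the first minimal element, so ties and the
    # "no scope matches" case fall back to candidate order.
    return min(candidates, key=lambda candidate: rank.get(candidate[1], worst))


if __name__ == "__main__":
    paris_candidates = [
        ("Paris, Texas", "United States"),
        ("Paris, France", "France"),
    ]
    print(resolve_toponym(paris_candidates, ["France", "Europe"]))
```

In this sketch a document whose detected scopes rank "France" first resolves "Paris" to the French city, while a document with no matching scope keeps the first listed candidate.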

    GIR experimentation

    The Geographic Information Retrieval (GIR) community has generally accepted the thesis that both thematic and geographic aspects of documents are useful for GIR. This paper describes an experiment exploring this thesis by separately indexing and searching geographically relevant terms (place names, geo-spatial relations, geographic concepts and geographic adjectives). Two indexes were created: one for the extracted geographically relevant terms (the footprint document) and one for the reference document collection. Footprint and reference document scores are combined by linear interpolation to obtain an overall score for document relevance ranking. Experimentation with geographic query expansion provided no significant improvement, though it stabilized the query results.