Eighteen percent of information seekers demand geographically intelligent information retrieval systems (Sanderson and Kohler, 2004). State-of-the-art information retrieval (IR) systems lack the geographical intelligence needed to effectively answer geography-dependent questions. Two specific research objectives are addressed in this thesis: (1) how to mine and analyze the geographical information (GI) implicit in texts, and (2) how to use the geographical knowledge obtained in this way to build models for answering geography-dependent questions. 

We assume that every document and search query has a geographical scope (i.e., where the events described are situated). To exploit the notion of geographical scope, we first developed techniques to detect the geographical scope of documents and to resolve the scopes when the indications are complex or inconsistent.
 
The thesis then turns to problems whose solution may be improved by incorporating the notion of geographical scope, namely (i) toponym resolution, i.e. determining which place is referred to when ambiguous place names (toponyms) are used, (ii) query expansion, the enrichment of queries often used in IR, and (iii) relevance ranking. The toponym resolution strategy prefers candidate places in top-ranked scopes, and the query expansion strategy prefers place names in commonly shared scopes. The relevance ranking strategy incorporates scope information in score calculation. New evaluation metrics that measure small discrepancies among toponym and scope resolution systems are also proposed. The scope and toponym resolution strategies achieved scores of 70% to 90% against human annotators. The query expansion and relevance ranking strategies outperformed state-of-the-art IR systems by 9%.
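
To make the scope-driven strategies more concrete, the following is a minimal sketch and not the implementation described in the thesis. It assumes that a document's scopes have already been ranked, and it picks the toponym candidate that lies in the best-ranked scope. It also shows one plausible way to fold a scope score into a relevance score by linear interpolation. The names `Candidate`, `resolve_toponym`, `interpolated_score`, the gazetteer identifiers, and the weight `lam` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """One possible referent of an ambiguous place name (hypothetical structure)."""
    name: str                 # surface form found in the text
    place_id: str             # illustrative gazetteer identifier, not a real ID
    scopes: Tuple[str, ...]   # scopes that contain this place


def resolve_toponym(candidates: Sequence[Candidate],
                    ranked_scopes: Sequence[str]) -> Optional[Candidate]:
    """Prefer the candidate whose containing scope is ranked highest.

    ranked_scopes is assumed to be ordered from most to least likely scope
    of the document. Ties keep the first candidate in the input order.
    """
    rank = {scope: position for position, scope in enumerate(ranked_scopes)}
    unranked = len(ranked_scopes)  # scopes absent from the ranking sort last

    def best_rank(candidate: Candidate) -> int:
        return min((rank.get(s, unranked) for s in candidate.scopes),
                   default=unranked)

    return min(candidates, key=best_rank, default=None)


def interpolated_score(text_score: float, scope_score: float,
                       lam: float = 0.5) -> float:
    """Linear interpolation of a non-geographic score and a scope score.

    This is an assumed combination for illustration; lam weights the
    non-geographic part and (1 - lam) weights the scope part.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * text_score + (1.0 - lam) * scope_score


if __name__ == "__main__":
    groningen_nl = Candidate("Groningen", "nl-1", ("Netherlands", "Europe"))
    groningen_sr = Candidate("Groningen", "sr-1", ("Suriname", "South America"))
    scopes = ["Netherlands", "Europe", "Suriname"]
    print(resolve_toponym([groningen_sr, groningen_nl], scopes))
    print(interpolated_score(0.8, 0.4, lam=0.25))
```

In this sketch a document whose top-ranked scope is the Netherlands resolves "Groningen" to the Dutch candidate even when the Surinamese one is listed first. The interpolation weight would need tuning on evaluation data.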

Andogah, Geoffrey

University of Groningen Research Database

  
 University of Groningen
Geographically constrained information retrieval
Andogah, Geoffrey
Document version: Publisher's PDF, also known as Version of record
Publication date: 2010
Citation for published version (APA):
Andogah, G. (2010). Geographically constrained information retrieval. Groningen: s.n.
Geographically Constrained Information Retrieval
Geoffrey Andogah
CLCG
Center for Language and Cognition Groningen
The work in this thesis has been carried out under the auspices of the Center
for Language and Cognition Groningen (CLCG) with funding from Nuffic
Project NPT/UGA/132 on “Building Sustainable ICT Training Capacity in
the Public Universities in Uganda”.
Groningen Dissertations in Linguistics 79
ISSN 0928-0030
©2010 Geoffrey Andogah
ISBN 978-90-367-4309-9
Cover design by: Peter Kleiweg
Printed by Wöhrmann Print Service, Zutphen, Netherlands.
Document prepared with LaTeX 2ε using Stasinos Konstantopoulos’ RuGthesis.cls.
Rijksuniversiteit Groningen
Geographically Constrained Information Retrieval
Doctoral thesis
to obtain the doctorate in
Mathematics and Natural Sciences
at the Rijksuniversiteit Groningen
on the authority of the
Rector Magnificus, dr. F. Zwarts,
to be defended in public on
Friday 21 May 2010
at 14:45
by
Geoffrey Andogah
born on 7 December 1968
in Kampala, Uganda
Supervisor: Prof. dr. ir. J. Nerbonne
Co-supervisor: Dr. G. Bouma
Assessment committee: Prof. dr. V. Baryamureeba
Prof. dr. F. de Jong
Prof. dr. H. Sol
To God the Father, God the Son and God the Holy Spirit.
Preface
I am honoured to thank all the people who directly or indirectly contributed
to the realization of this thesis. Indeed, I will not remember every one of you
who helped me in one way or another to achieve what I have reported in
this thesis. The one thing that I can promise you is that God Almighty, the
Father of our Lord Jesus Christ knows you, and remembers your contribution
toward this thesis, and will indeed reward you abundantly.
This thesis would not have been possible without the encouragement,
leadership and support of my promoter Prof. dr. ir. John Nerbonne, and
co-promoter and supervisor Dr. Gosse Bouma. Thank you Prof. John and
Dr. Gosse for your patience in ensuring that this thesis was completed. John, I
vividly remember the day you introduced me to Dutch soup; I will continue
to enjoy it.
I am indebted to fellow PhD students at CLCG for the support and
encouragement – Lonneke van der Plas, Jori Mur, Ismael Fahmi and Jacky
Benavides. I am also grateful to Wyke van der Meer for taking care of all the
practical stuff at CLCG. I want to say thank you to every member of staff
of CLCG for your cooperation – Dr. Gertjan van Noord, Dr. Elwin Koster,
Dr. Leonie Bosveld-de Smet, Drs. Peter Kleiweg.
It is a pleasure to thank the people at the International Bureau for taking
care of all the practical issues during my stay at Groningen – Erik Haarbrink,
Gonny Lakerveld and Marieke Farchi.
I am deeply grateful to the members of the reading committee for their
critical and positive comments – Prof. dr. Venansius Baryamureeba, Prof.
dr. Franciska de Jong and Prof. dr. Henk Sol. I am solely responsible for
any error still lingering in this dissertation.
I thank my paranymphs Jelena Prokic and Peter Nabende for accepting
to stand by me. I am especially indebted to Jelena Prokic for making sure
that all the paperwork was completed for my defence.
I am indebted to Prof. dr. J.H. Nyeko Pen-Mogi, the Vice Chancellor
of Gulu University for swaying me to pursue PhD studies way back in 2003.
Thank you for sowing the seed of determination and courage in me to venture
into unknown territories.
I am grateful to Prof. dr. Venansius Baryamureeba for making available
space to conduct my research at the Faculty of Computing and Informatics
(FCI), Makerere University. I indeed enjoyed the company of FCI staff, not
forgetting the wonderful meals at FCI.
I enjoyed the company of fellow PhD/Masters students at Groningen
sponsored under NPT Project NPT/UGA/132 on ‘Building a Sustainable
ICT Training Capacity in four (4) Public Universities in Uganda’ – Florence
Tushabe, Julianne Sansa Otim and Proscovia Olango.
I also acknowledge the contribution of CLEF to this work by providing
the dataset and a forum to evaluate our GIR system.
My sincere gratitude goes to the pastoral team and members of the RCCG
– Embassy of God Groningen. Thank you for the spiritual guidance and comfort
you accorded me at church.
My deepest thanks and love go to my dear wife Annastazia Nicodemus
Chengulla for being supportive over the years. Your prayers, kindness and
love kept me going in times when things were at a standstill. I am also indebted
to my children Glory Rong Babua, Samuel Godwill Babua and Victoria Peace
Babua for those questions I could not respond to, knowing that I would be away
for a while: "Dad, are you coming back?"
My gratitude also goes to my parents Stephen Babua and Margaret
Nguju; uncle Michael E. Nguma; brothers Edemah Fredrick and Alile Roland;
and sisters Adiru Lydia and Robinah Angucia.
Contents
1 The thesis
1.1 Geographical information retrieval
1.2 Research
1.2.1 Problem statement
1.2.2 Research objective
1.2.3 Research justification
1.2.4 Contribution
1.3 Overview
2 State-of-the-art in GIR
2.1 Toponym resolution
2.1.1 Default sense heuristics
2.1.2 Pattern matching and hierarchy overlap
2.1.3 One referent per discourse
2.1.4 Summary
2.2 Scope resolution
2.2.1 Country scope
2.2.2 A graph ranking algorithm
2.2.3 Taxonomy hierarchy
2.2.4 Geographical distribution
2.2.5 Summary
2.3 Query Expansion
2.3.1 Knowledge-based expansion
2.3.2 Relevance feedback expansion
2.3.3 Document geographic term expansion
2.3.4 Summary
2.4 Relevance ranking
2.4.1 Euclidean distance
2.4.2 Extent of overlap
2.4.3 Containment relations
2.4.4 Query footprint as filter
2.4.5 Geographic scope indexing
2.4.6 Other criteria
2.4.7 Summary
2.5 Conclusion
3 Data and tools
3.1 Geographical database
3.1.1 Geonames.org database
3.1.2 Other GeoDBs
3.2 Corpus
3.2.1 TR-CLEF
3.2.2 TR-RNW
3.2.3 Other collections
3.3 Tools
3.3.1 Alias-i Lingpipe
3.3.2 OpenNLP tools
3.3.3 WordFreak
3.3.4 GATE ANNIE
3.3.5 Apache UIMA
3.4 Conclusion
4 Scope resolution
4.1 Resolution with place names
4.1.1 Approach
4.1.2 Reference scope
4.1.3 Implementation
4.2 Resolution with person names
4.2.1 Person name ambiguity resolution
4.2.2 Approach
4.2.3 Implementation
4.3 Evaluation metric
4.4 Evaluation
4.4.1 Evaluating place name based strategy
4.4.2 Evaluating person name based strategy
4.5 Conclusion
5 Toponym resolution
5.1 Toponym resolution procedure
5.2 Evaluation metric
5.3 Evaluation
5.4 The big picture
5.5 Conclusion
6 Query expansion
6.1 Document processing
6.2 Query expansion
6.2.1 Top term-based expansion
6.2.2 Scope constrained expansion
6.2.3 Query expansion evaluation
6.3 Conclusion
7 Relevance ranking
7.1 Non-geographic relevance measure
7.2 Scope-based relevance measure
7.3 Type-based relevance measure
7.4 Relevance measure unification
7.4.1 Linear interpolated combination
7.4.2 Weighted harmonic mean combination
7.4.3 Extended harmonic mean combination
7.5 Evaluation
7.5.1 Harmonic mean vs. linear interpolated combination
7.5.2 Extended harmonic mean combination
7.6 Conclusion
8 Conclusion
8.1 Achievements
8.1.1 Scope resolution
8.1.2 Toponym resolution
8.1.3 Query expansion
8.1.4 Relevance ranking
8.1.5 Evaluation data
8.1.6 Reflection
8.2 Future work
8.2.1 Scope resolution
8.2.2 Toponym resolution
8.2.3 Query expansion
8.2.4 Relevance ranking
8.2.5 Evaluation data
8.3 Final remark
A Sample stories
A.1 News story about Mexico
A.2 News story about Thailand
A.3 News story about Lake Victoria
B GeoCLEF topics
B.1 GeoCLEF 2005 English topic titles
B.2 GeoCLEF 2006 English topic titles
B.3 GeoCLEF 2007 English topic titles
Bibliography
Summary
Samenvatting (Dutch summary)
List of Figures
1.1 Common GIR processing procedure.
1.2 Schematic of the research objectives.
2.1 Example taxonomy of places.
3.1 Schematic of the Geonames.org feature code hierarchy.
3.2 Potential ambiguity in GeoCLEF 2006 relevant documents.
3.3 Map showing the result of toponym resolution.
3.4 Sample of a GeoCLEF document.
3.5 Sample of RNW story.
3.6 Example output of Alias-i’s LingPipe NLP tool.
4.1 Geographic terms in articles appendix A.1, A.2 and A.3.
4.2 Geographical scopes of article A.1 (top) and A.2 (bottom).
4.3 Scope of plane crash in Lake Victoria in March 2009.
4.4 Data model for geographical scope modeling.
4.5 Sample data model for scope of the Netherlands.
4.6 Directional sub-division of the Netherlands.
4.7 Example U.S.A & Canada GeoVIP grouping.
4.8 News story featuring Hillary Clinton and Stockwell Day.
4.9 Place name based scope resolution performance.
4.10 Sample pseudo documents.
4.11 VIP based resolution performance on pseudo document.
5.1 Toponym resolution schematic.
5.2 Toponym resolution algorithm.
5.3 Nearness to correct location.
5.4 Example hierarchy structure.
5.5 Sample Geonames.org feature code hierarchy.
5.6 Mahali system architecture.
6.1 Radio Netherlands Worldwide (RNW) summary.
6.2 Schematic of scope constrained relevance feedback procedure.
6.3 Sample scope hierarchy for Groningen and Rotterdam.
6.4 GeoCLEF 2007 topic and pre-processed format.
6.5 Interpolated recall vs precision average.
6.6 Performance of term-based query expansion procedure.
6.7 Performance of scope-based query expansion procedure.
6.8 Topic performance on residual collection.
7.1 Sample Geonames.org feature code hierarchy.
7.2 Sample documents with geographic feature classes and types.
7.3 Variation of MAP as a factor of NIF λT.
7.4 Variation of MAP as a factor of RIF β.
7.5 Variation of MAP as a function of RIF β.
7.6 GeoCLEF 2007 per topic performance.
List of Tables
1.1 Sample keywords with geographical interest of Groningen.
2.1 Toponym resolution techniques in literature.
2.2 Scope resolution techniques in literature.
3.1 Geonames.org feature classification.
3.2 Geonames.org feature statistics per classification.
3.3 Example Geonames.org feature code.
3.4 Example Geonames.org database structure (as used here).
3.5 English monolingual GeoCLEF relevant document counts.
3.6 GeoCLEF 2006 topic.
3.7 GeoCLEF 2006 relevant document characteristic.
3.8 Fifteen most frequent and least frequent toponyms.
3.9 Inter-annotator agreement.
3.10 Inter-annotator agreement per feature types.
3.11 TR-RNW corpus characteristic.
3.12 TR-RNW fifteen most frequent and least frequent toponyms.
3.13 TR-CoNLL & TR-MUC4 corpus characteristic.
3.14 Named entity types defined in Chinchor (1997).
4.1 Zone index for the sample Netherlands scope data in Fig. 4.5.
4.2 Statistics of standard geographic scope.
4.3 Example Lucene index for scope Europe.
4.4 Example Lucene index for scope the Netherlands.
4.5 Reference scope data layout in Lucene index.
4.6 Place type and population weights.
4.7 Place and people adjective occurrence on the Internet.
4.8 Example query formulation for per field querying.
4.9 Example administrative division vs. position.
4.10 Top ten surnames and first names in Wikipedia.
4.11 Example GeoVIP data for U.S.A.
4.12 Example GeoVIP sample data for Canada.
4.13 Person name weight computation for VIPs in Fig. 4.8.
4.14 Example scope resolution using VIP names in Fig. 4.8.
4.15 Example country scope Lucene index data source.
4.16 Example district scope Lucene index data source.
4.17 Fictitious sample data to illustrate metric performance.
4.18 CoNLL 2003 place name statistics.
4.19 Scope resolution result using Eq. 4.12 and Eq. 4.13.
4.20 Results of PageRank and HITS on Reuters-21578.
4.21 Performance of VIP based approach on news articles.
5.1 Characteristics of TR-CoNLL, TR-CLEF & TR-RNW.
5.2 Toponym resolution results on TR-CoNLL.
5.3 Toponym resolution results on TR-RNW and TR-CLEF.
6.1 Query place name expansion illustration with Eq. 6.1.
6.2 Summary of term based relevance feedback evaluation.
6.3 Result of term-based approach using relevant documents.
6.4 Summary of scope-based relevance feedback evaluation.
6.5 Summary of feedback evaluation on residual collection.
7.1 Example topic grouping and query formulation.
7.2 Comparison to GeoCLEF 2007 participants.



Bibliography

Harith Alani, Christopher B. Jones, and Douglas Tudhope. Voronoi-based region approximation for geographical information retrieval with gazetteers. International Journal of Geographical Information Science, 15(4):287–306, 2001.

Einat Amitay, Nadav Har’El, Ron Sivan, and Aya Soffer. Web-a-Where: Geotagging Web Content. In Proceedings of SIGIR-04, the 27th Conference on Research and Development in Information Retrieval, Sheffield, South Yorkshire, UK, 2004.

Geoffrey Andogah. GIR Experimentation. In Evaluation of Multilingual and Multi-modal Information Retrieval, volume 4730/2009 of Lecture Notes in Computer Science LNCS, pages 881–888. Springer-Verlag Berlin/Heidelberg, 2007.

Geoffrey Andogah and Gosse Bouma. Relevance measures using geographic scopes and types. In Advances in Multilingual and Multimodal Information Retrieval, volume 5152/2008 of Lecture Notes in Computer Science LNCS, pages 794–801. Springer-Verlag Berlin/Heidelberg, 2008.

Geoffrey Andogah and Gosse Bouma. University of Groningen at GeoCLEF 2007. In Working Notes for CLEF 2007, Budapest, Hungary, 2007.

Geoffrey Andogah, Gosse Bouma, John Nerbonne, and Elwin Koster. Geographical scope resolution. In Proceedings of the LREC 2008 Workshop on Methodologies and Resources for Processing Spatial Language, Marrakech, Morocco, 2008.

Leonardo Andrade and Mário J. Silva. Relevance Ranking for Geographic IR. In Proceedings of the Workshop on Geographical Information Retrieval, SIGIR’06, Seattle, USA, August 2006.

Kate Beard and Vyjayanti Sharma. Multidimensional ranking for data in digital spatial libraries. International Journal on Digital Libraries, 1:153–160, 1997.

Davide Buscaldi and Paolo Rosso. The UPV at GeoCLEF 2007. In Working Notes for CLEF 2007, 2007.

Davide Buscaldi and Paolo Rosso. On the Relative Importance of Toponyms in GeoCLEF. In Advances in Multilingual and Multimodal Information Retrieval, volume 5152/2008 of Lecture Notes in Computer Science LNCS, pages 815–822. Springer-Verlag Berlin/Heidelberg, 2008.

Davide Buscaldi, Paolo Rosso, and Emilio Sanchis Arnal. Using the WordNet Ontology in the GeoCLEF Geographical Information Retrieval Task. In Accessing Multilingual Information Repositories, volume 4022/2006, pages 939–946. Springer (Lecture Notes in Computer Science LNCS), 2006.

Cláudio Elízio Calazans Campelo and Cláudio de Souza Baptista. Geographic scope modeling for web documents. In Proceedings of Workshop on Geographic Information Retrieval (GIR’08), Napa Valley, California, USA, 2008.

Nuno Cardoso, David Cruz, Marcirio Chaves, and Mário J. Silva. The University of Lisbon at GeoCLEF 2007. In Working Notes for CLEF 2007, Budapest, Hungary, 2007.

Jean Carletta. Assessing agreement on classification tasks: The kappa statistic. In Computational Linguistics, volume 22(2), pages 249–254, Cambridge, MA, USA, June 1996. MIT Press.

Xavier Carreras, Lluís Màrquez, and Lluís Padró. A Simple Named Entity Extractor using AdaBoost. In Proceedings of CoNLL-2003, Edmonton, Canada, 2003.

Hai Leong Chieu and Hwee Tou Ng. Named Entity Recognition with a Maximum Entropy Approach. In Proceedings of CoNLL-2003, Edmonton, Canada, 2003.

Nancy Chinchor. MUC-7 Named Entity Task Definition, 1997. 17 October 2007: http://www.itl.nist.gov/iad/894.02/related_projects/muc/.

Paul Clough. Extracting Metadata for Spatially-Aware Information Retrieval on the Internet. In Proceedings of Workshop on Geographic Information Retrieval (GIR’05), CIKM2005, Bremen, Germany, 2005.

James R. Curran and Stephen Clark. Language Independent NER using a Maximum Entropy Tagger. In Proceedings of CoNLL-2003, Edmonton, Canada, 2003.

Bruno Emanuel da Graca Martins. Geographically Aware Web Text Mining. PhD thesis, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 2008.

Tiago M. Delboni, Karla A. V. Borges, Alberto H. F. Laender, and Clodoveu A. Davis Jr. Semantic Expansion of Geographic Web Queries Based on Natural Language Positioning Expressions. In Transactions in GIS, pages 377–397, 2007.

Junyan Ding, Luis Gravano, and Narayanan Shivakumar. Computing Geographical Scopes of Web Resources. In Proceedings of the 26th Very Large Data Bases (VLDB) Conference, pages 545–556. Morgan Kaufmann Publishers Inc., 2000.

David Dubin. The most influential paper Gerard Salton never wrote. Technical report, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, 2004.

Michael Ben Fleischman and Eduard Hovy. Multi-document person name resolution. In Annual Meeting of the Association for Computational Linguistics (ACL), Reference Resolution Workshop, pages 66–82, 2004.

Radu Florian, Abe Ittycheriah, Hongyan Jing, and Tong Zhang. Named Entity Recognition through Classifier Combination. In Proceedings of CoNLL-2003, Edmonton, Canada, 2003.

Gaihua Fu, Christopher B. Jones, and Alia I. Abdelmoty. Ontology-based Spatial Query Expansion in Information Retrieval. In On the Move to Meaningful Internet Systems 2005: CoopIS, DOA, and ODBASE, volume 3761/2005, pages 1466–1482. Springer (Lecture Notes in Computer Science LNCS), 2005.

William A. Gale, Kenneth W. Church, and David Yarowsky. One Sense Per Discourse. In Proceedings of the Fourth DARPA Speech and Natural Language Workshop, pages 233–237, 1992.

Fredric Gey, Ray Larson, Mark Sanderson, Hideo Joho, Paul Clough, and Vivien Petras. GeoCLEF: the CLEF 2005 Cross-Language Geographic Information Retrieval Track Overview. In Cross-Language Evaluation Forum: CLEF 2005. Springer (Lecture Notes in Computer Science LNCS 4022), 2006.

Fredric Gey, Ray Larson, Mark Sanderson, Kerstin Bischoff, Thomas Mandl, Christa Womser-Hacker, Diana Santos, Paulo Rocha, Giorgio M. Di Nunzio, and Nicola Ferro. GeoCLEF 2006: The CLEF 2006 Cross-Language Geographic Information Retrieval Track Overview. In Evaluation of Multilingual and Multi-modal Information Retrieval, volume 4730 of Lecture Notes in Computer Science, pages 852–876. Springer Berlin / Heidelberg, 2007.

Otis Gospodnetic and Erik Hatcher. Lucene in Action. Manning Publications Co., 206 Bruce Park Avenue, Greenwich, CT 06830, 2005.

Linda L. Hill, James Frew, and Qi Zheng. Geographic Names: The Implementation of a Gazetteer in a Georeferenced Digital Library. In D-Lib Magazine, volume 5(1), January 1999.

Thorsten Joachims, Laura Granka, and Bing Pan. Accurately Interpreting Clickthrough Data as Implicit Feedback. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, Salvador, Brazil, 2005.

Christopher B. Jones and Ross Purves. GIR’05 2005 ACM workshop on geographical information retrieval. ACM SIGIR Forum, 40(1):34–37, 2006.

Christopher B. Jones, Harith Alani, and Douglas Tudhope. Geographic Information Retrieval with Ontologies of Place. In Proceedings of the International Conference on Spatial Information Theory: Foundations of Geographic Information Science, volume 2205/2001, pages 322–335. Springer (Lecture Notes in Computer Science LNCS), 2001.

Diane Kelly and Nicholas J. Belkin. Reading Time, Scrolling and Interaction: Exploring Implicit Sources of User Preferences for Relevance Feedback During Interactive Information Retrieval. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, 2001.

Diane Kelly and Jaime Teevan. Implicit Feedback for Inferring User Preference: A Bibliography. ACM SIGIR Forum, 37(2):18–28, 2003.

Dan Klein, Joseph Smarr, Huy Nguyen, and Christopher D. Manning. Named Entity Recognition with Character-Level Models. In Proceedings of CoNLL-2003, Edmonton, Canada, 2003.

Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46:604–632, 1999.

Ray R. Larson and Patricia Frontiera. Spatial Ranking Methods for Geographic Information Retrieval (GIR) in Digital Libraries. In Research and Advanced Technology for Digital Libraries, volume 3232 of Lecture Notes in Computer Science, pages 45–56. Springer Berlin / Heidelberg, 2004.

Ray R. Larson, Fredric C. Gey, and Vivien Petras. Berkeley at GeoCLEF: Logistic Regression and Fusion for Geographic Information Retrieval. In Accessing Multilingual Information Repositories, volume 4022/2006, pages 963–976. Springer (Lecture Notes in Computer Science LNCS), 2006.

Jochen L. Leidner. Experiments with Geo-Filtering Predicates for Geographic IR. In Accessing Multilingual Information Repositories, volume 4022/2006, pages 987–996. Springer (Lecture Notes in Computer Science LNCS), 2006.

Jochen L. Leidner, Gail Sinclair, and Bonnie Webber. Grounding spatial named entities for information extraction and question answering. In Kornai, A. and Sundheim, B. (eds) Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References, pages 31–38, Alberta, Canada, 2003.

Jochen Lothar Leidner. Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding of Place Names. PhD thesis, Institute for Communicating and Collaborative Systems, School of Informatics, University of Edinburgh, 2007.

Huifeng Li, Rohini K. Srihari, Cheng Niu, and Wei Li. InfoXtract location normalization: a hybrid approach to geographic references in information extraction. In Kornai, A. and Sundheim, B. (eds) Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References, pages 39–44, Alberta, Canada, 2003.

Eugene E. Loos, Susan Anderson, Dwight H. Day, Jr., Paul C. Jordan, and J. Douglas Wingate. Glossary of linguistic terms, 2004. 5 October 2007: http://www.sil.org/linguistics/.

Thomas Mandl, Fredric Gey, Giorgio Di Nunzio, Nicola Ferro, Ray Larson, Mark Sanderson, Diana Santos, Christa Womser-Hacker, and Xing Xie. GeoCLEF 2007: the CLEF 2007 Cross-Language Geographic Information Retrieval Track Overview. In Working Notes for CLEF 2007, 2007.

Thomas Mandl, Fredric Gey, Giorgio Di Nunzio, Nicola Ferro, Ray Larson, Mark Sanderson, Diana Santos, Christa Womser-Hacker, and Xing Xie. GeoCLEF 2007: the CLEF 2007 Cross-Language Geographic Information Retrieval Track Overview. In Advances in Multilingual and Multimodal Information Retrieval, volume 5152/2008 of Lecture Notes in Computer Science LNCS, pages 674–686. Springer-Verlag Berlin/Heidelberg, 2008.

Thomas Mandl, Paula Carvalho, Giorgio Maria Di Nunzio, Fredric Gey, Ray R. Larson, Diana Santos, and Christa Womser-Hacker. GeoCLEF 2008: The CLEF 2008 Cross-Language Geographic Information Retrieval Track Overview. In Evaluating Systems for Multilingual and Multimodal Information Access, volume 5706/2009 of Lecture Notes in Computer Science LNCS, pages 808–821. Springer-Verlag Berlin/Heidelberg, 2009.

Gideon S. Mann and David Yarowsky. Unsupervised personal name disambiguation. In Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003, pages 33–40. Association for Computational Linguistics, 2003.

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. An Introduction to Information Retrieval. Cambridge University Press, Cambridge, England, November 2007. Draft.

Alexander Markowetz, Yen-Yu Chen, Torsten Suel, Xiaohui Long, and Bernhard Seeger. Design and Implementation of a Geographic Search Engine. In Eighth International Workshop on the Web and Databases (WebDB 2005), 2005.

Bruno Martins and Mário J. Silva. A Graph-Ranking Algorithm for Geo-Referencing Documents. In Proceedings of ICDM-05, the 5th IEEE International Conference on Data Mining, Texas, USA, November 2005.

Bruno Martins, Mário J. Silva, and Leonardo Andrade. Indexing and Ranking in Geo-IR Systems. In Proceedings of the ACM Workshop on Geographic Information Retrieval, pages 31–34. ACM New York, NY, USA, 2005.

Bruno Martins, Mário J. Silva, Sérgio Freitas, and Ana Paula Afonso. Handling Locations in Search Engine Queries. In Proceedings of the 3rd Workshop on Geographic Information Retrieval held at The 29th Annual International ACM SIGIR Conference, Seattle, WA, USA, 2006.

Bruno Martins, Nuno Cardoso, Marcirio Chaves, Leonardo Andrade, and Mário J. Silva. The University of Lisbon at GeoCLEF 2006. In Evaluation of Multilingual and Multi-modal Information Retrieval, 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006, Alicante, Spain, September 20-22, 2006, Revised Selected Papers, volume 4730/2007, pages 986–994. Springer (Lecture Notes in Computer Science LNCS), 2007.

Simon E Overell. Geographic Information Retrieval: Classification, Disambiguation and Modelling. PhD thesis, Department of Computing, Imperial College London, 2009.

Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The PageRank citation ranking: Bringing order to the Web, 1999. Technical Report SIDL-WP-1999-0120, Stanford Digital Library.

Bruno Pouliquen, Ralf Steinberger, Camelia Ignat, and Tom De Groeve. Geographical Information Recognition and Visualisation in Texts Written in Various Languages. In Proceedings of ACM (SAC2004), Nicosia, Cyprus, 2004.

Bruno Pouliquen, Marco Kimler, Ralf Steinberger, Camelia Ignat, Tamara Oellinger, Ken Blackler, Flavio Fuart, Wajdi Zaghouani, Anna Widiger, Ann-Charlotte Forslund, and Clive Best. Geocoding Multilingual Texts: Recognition, Disambiguation and Visualisation. In Proceedings of The Fifth International Conference on Language Resources and Evaluation (LREC), pages 53–58, 2006.

Erik Rauch, Michael Bukatin, and Kenneth Baker. A Confidence-Based Framework for Disambiguating Geographic Terms. In Kornai, A. and Sundheim, B. (eds) Proceedings of the HLT-NAACL 2003 Workshop on Analysis of Geographic References, pages 50–54, Alberta, Canada, 2003.

Tony Rose, Mark Stevenson, and Miles Whitehead. Reuters Corpus Volume 1 - from yesterday's news to tomorrow's language resources. In Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC-2002), volume 3, pages 827–833, 2002.

Gerard Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Addison-Wesley Series in Computer Science, 1989.

Gerard Salton and Chris Buckley. Improving retrieval performance by relevance feedback. In Journal of the American Society for Information Science, volume 41(4), pages 288–297. John Wiley & Sons, Inc., 1990.

Mark Sanderson and Janet Kohler. Analyzing geographic queries.
In Pro-ceedings of the Workshop on Geographic Information Retrieval held at The27th Annual International ACM SIGIR Conference, Sheffield, Engand,UK, 2004.Christoph Schlieder, Thomas Vogele, and Ubbo Visser. Qualitative SpatialRepresentation for Information Retrieval by Gazetteers. In Spatial Infor-mation Theory. Foundations of Geographic Information Science : Interna-tional Conference, COSIT 2001 Morro Bay, CA, USA, September 19-23,2001. Proceedings, volume 2205/2001, pages 336–351. Springer (LectureNotes in Computer Science LNCS), 2001.David A. Smith and Gideon S. Mann. Bootstrapping Toponym Classifiers. InKornai, A. and Sundheim, B. (eds) Proceedings of the HTL-NAACL 2003Workshop on Analysis of Geographic References, pages 45–49, Alberta,Canada, 2003.Amanda Spink, Dietmar Wolfram, Major B. J. Jansen, and Tefko Saracevic.Searching the web: The public and their queries. Journal of the AmericanSociety for Information Science and Technology, 52(2):226–234, 2001.Beth M. Sundheim. Overview of the Fourth Message Understanding Evalua-tion and Conference. In Proceedings of the Fourth Message UnderstandingConference (MUC-4), pages 3–21. Morgan Kaufmann Publishers, 1992.Erik F. Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 Shared Task: Language independent named entity recognition. InWalter Daelemans and Miles Osborne, Editors, Proceedings of CoNLL-2003, pages 142–147, 2003.Subodh Vaid, Christopher B. Jones, Hideo Joho, and Mark Sanderson.Spatio-textual Indexing for Geographical Search on the Web. In Advancesin Spatial and Temporal Databases, volume 3633 of Lecture Notes in Com-puter Science, pages 218–235. Springer Berlin / Heidelberg, 2005.175Marc van Kreveld and Iris Reinbacher. Good NEWS: partitioning a simplepolygon by compass directions. In Proceedings of the Nineteenth AnnualSymposium on Computational geometry, pages 78–87. ACM New York,NY, USA, 2003.Cornelis Joost van Rijsbergen. Information Retrieval. Butterworths, 2ndedition, 1979. 
7:112-140.Raphael Volz, Joachim Kleb, and Wolfgang Mueller. Towards ontology-baseddisambiguation of geographical identifiers. In Proceedings of WWW2007,May 2007.David R. F. Walker, Ian A. Newman, David J. Medyckyj-Scott, and CliveL. N. Ruggles. A system for identifying datasets for GIS users. Interna-tional Journal of Geographical Information Systems, 6(6):511–527, 1992.Jinxi Xu and W. Bruce Croft. Query expansion using local and global docu-ment analysis. In ACM SIGIR International Conference on Research andDevelopment in Information Retrieval, pages 4–11, New York, NY, USA,1996. ACM.Tong Zhang and David Johnson. A Robust Risk Minimization based NamedEntity Recognition System. In Proceedings of CoNLL-2003, Edmonton,Canada, 2003.Wenbo Zong, DanWu, Aixin Sun, Ee-Peng Lim, and Dion Hoe-Lian Goh. OnAssigning Place Names to Geography Related Web Pages. In Proceedingsof the Fifth ACM/IEEE-CS Joint Conference on Digital Libraries, pages354–362, 2005.176 Bibliography

