A conceptual framework and a risk management approach for interoperability between geospatial datacubes
Nowadays, we observe growing interest in multidimensional geospatial databases. These databases are developed to support organizations' strategic decision-making, particularly when data from different epochs and at different levels of granularity are involved. However, users may need to use several multidimensional geospatial databases. These databases may be semantically heterogeneous and characterized by different degrees of relevance to the context of use. Resolving the semantic problems related to heterogeneity and to differences in relevance, in a manner transparent to users, has been the main objective of interoperability over the last fifteen years. In this context, various solutions have been proposed to address interoperability. However, these solutions have adopted a non-systematic approach. Moreover, no solution has been found for the specific semantic problems related to interoperability between multidimensional geospatial databases. In this thesis, we hypothesize that it is possible to define an approach that addresses these semantic problems to ensure interoperability between multidimensional geospatial databases. We therefore first define interoperability between these databases. We then define and classify the semantic heterogeneity problems that may arise during such interoperability between different multidimensional geospatial databases. To resolve these semantic heterogeneity problems, we propose a conceptual framework based on human communication. In this framework, communication is established between two system agents representing the multidimensional geospatial databases involved in an interoperability process. This communication aims to exchange information about the content of these databases. Then, to help the agents make appropriate decisions during the interoperability process, we evaluate a set of indicators of the external quality (fitness-for-use) of the schemas and of the production context (e.g., metadata). Finally, we implement the approach to show its feasibility.
Today, we observe wide use of geospatial databases that are implemented in many forms (e.g., transactional centralized systems, distributed databases, multidimensional datacubes). Among those possibilities, the multidimensional datacube is more appropriate to support interactive analysis and to guide the organization's strategic decisions, especially when different epochs and levels of information granularity are involved. However, one may need to use several geospatial multidimensional datacubes, which may be semantically heterogeneous and have different degrees of appropriateness to the context of use. Overcoming the semantic problems related to semantic heterogeneity and to differences in appropriateness to the context of use, in a manner that is transparent to users, has been the principal aim of interoperability for the last fifteen years. However, in spite of successful initiatives, today's solutions have evolved in a non-systematic way. Moreover, no solution has been found to address specific semantic problems related to interoperability between geospatial datacubes. In this thesis, we suppose that it is possible to define an approach that addresses these semantic problems to support interoperability between geospatial datacubes. To that end, we first describe interoperability between geospatial datacubes.
Then, we define and categorize the semantic heterogeneity problems that may occur during the interoperability process of different geospatial datacubes. In order to resolve semantic heterogeneity between geospatial datacubes, we propose a conceptual framework that is essentially based on human communication. In this framework, software agents representing the geospatial datacubes involved in the interoperability process communicate with one another. Such communication aims at exchanging information about the content of geospatial datacubes. Then, in order to help agents make appropriate decisions during the interoperability process, we evaluate a set of indicators of the external quality (fitness-for-use) of geospatial datacube schemas and of the production context (e.g., metadata). Finally, we implement the proposed approach to show its feasibility.
Improving the geospatial consistency of digital libraries metadata
Consistency is an essential aspect of the quality of metadata. Inconsistent metadata records are harmful: given a themed query, the set of retrieved metadata records may contain descriptions of unrelated or irrelevant resources, and may even omit resources considered obvious. This is even worse when the description of the location is inconsistent. Inconsistent spatial descriptions may yield invisible or hidden geographical resources that cannot be retrieved by means of spatially themed queries. Therefore, ensuring spatial consistency should be a primary goal when reusing, sharing and developing georeferenced digital collections. We present a methodology able to detect geospatial inconsistencies in metadata collections based on the combination of spatial ranking, reverse geocoding, geographic knowledge organization systems and information-retrieval techniques. This methodology has been applied to a collection of metadata records describing maps and atlases belonging to the Library of Congress. The proposed approach was able to automatically identify inconsistent metadata records (870 out of 10,575) and propose fixes to most of them (91.5%). These results support the ability of the proposed methodology to assess the impact of spatial inconsistency on the retrievability and visibility of metadata records and to improve their spatial consistency.
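The kind of check this abstract describes can be illustrated with a minimal sketch. It is not the authors' methodology: the tiny `GAZETTEER`, the record layout, and the bounding-box intersection test are all assumptions made for illustration, standing in for the geographic knowledge organization systems and reverse geocoding the abstract mentions. Boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes intersect unless one lies entirely to one side of the other.
        return not (self.east < other.west or other.east < self.west
                    or self.north < other.south or other.north < self.south)


# Hypothetical mini-gazetteer: place name -> rough bounding box.
GAZETTEER = {
    "Spain": BBox(-9.4, 35.9, 3.4, 43.8),
    "Japan": BBox(122.9, 24.0, 146.0, 45.6),
}


def is_spatially_consistent(place_name, record_bbox):
    """Return True/False, or None when the place name is unknown."""
    known = GAZETTEER.get(place_name)
    if known is None:
        return None
    return known.intersects(record_bbox)


def suggest_places(record_bbox):
    """Crude stand-in for reverse geocoding: places overlapping the box."""
    return [name for name, box in GAZETTEER.items() if box.intersects(record_bbox)]


# Hypothetical metadata records: a textual place plus a coordinate extent.
records = [
    {"id": "r1", "place": "Spain", "bbox": BBox(-4.0, 40.0, -3.0, 41.0)},
    {"id": "r2", "place": "Spain", "bbox": BBox(139.0, 35.0, 140.0, 36.0)},
    {"id": "r3", "place": "Atlantis", "bbox": BBox(0.0, 0.0, 1.0, 1.0)},
]

# Flag only records whose named place and extent provably disagree.
flagged = [r["id"] for r in records
           if is_spatially_consistent(r["place"], r["bbox"]) is False]
suggestions = {r["id"]: suggest_places(r["bbox"])
               for r in records if r["id"] in flagged}
```

Here the second record names Spain but carries an extent over Japan, so it is flagged and a candidate fix is proposed; the third record is left alone because the place name cannot be resolved.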
Design for geospatially enabled climate modeling and alert system (CLIMSYS): A position paper
The paper focuses on a multidisciplinary approach to presenting climate analysis studies, drawing on interdisciplinary fields to structure the information. The CLIMSYS system provides the crucial element of spatially enabling climate data processing. Although climate change is a matter of great scientific relevance and broad general interest, there are some problems related to its communication. Finding practical, workable and cost-efficient solutions to the problems posed by climate change is now a world priority, and one which links government and non-government organizations in a way not seen before. One suitable approach is to create an accessible intelligent system that houses prior knowledge and curates the incoming data to deliver meaningful results. The objective of the proposed research is to develop a generalized system for climate data analysis that facilitates open sharing, central implementation, integrated components, knowledge creation, data format understanding, inferencing and, ultimately, optimal solution delivery by way of geospatial enablement.
Geospatial information infrastructures
Manual of Digital Earth / Editors: Huadong Guo, Michael F. Goodchild, Alessandro Annoni. Springer, 2020. ISBN: 978-981-32-9915-3
Geospatial information infrastructures (GIIs) provide the technological, semantic, organizational and legal structure that allow for the discovery, sharing, and use of geospatial information (GI). In this chapter, we introduce the overall concept and surrounding notions such as geographic information systems (GIS) and spatial data infrastructures (SDI). We outline the history of GIIs in terms of the organizational and technological developments as well as the current state-of-art, and reflect on some of the central challenges and possible future trajectories. We focus on the tension between increased needs for standardization and the ever-accelerating technological changes. We conclude that GIIs evolved as a strong underpinning contribution to implementation of the Digital Earth vision. In the future, these infrastructures are challenged to become flexible and robust enough to absorb and embrace technological transformations and the accompanying societal and organizational implications. With this contribution, we present the reader a comprehensive overview of the field and a solid basis for reflections about future developments.
Integration of temporal and semantic components into the Geographic Information. Part II: Methodology
The overall objective of this research project is to enrich geographic data with temporal and semantic components in order to significantly improve spatio-temporal analysis of geographic phenomena. To achieve this goal, we intend to establish and incorporate three new layers (structures) into the core of the Geographic Information (GI) by using mark-up languages, and to define a set of methods and tools that enable the system to retrieve and exploit these layers (semantic-temporal, geosemantic, and incremental spatio-temporal). Besides these layers, we also propose a set of models (temporal and spatial) and two semantic engines that make the most of the enriched geographic data. The roots of the project and its definition have been previously presented in Siabato & Manso-Callejo 2011. In this new position paper, we extend that work by clearly delineating the methodology and the foundations on which we will define the main components of this research: the spatial model, the temporal model, the semantic layers, and the semantic engines. By putting together the former paper and this new work, we try to present a comprehensive description of the whole process, from pinpointing the basic problem to describing and assessing the solution. In this new article we only outline the methods and the background, to describe how we intend to define the components and integrate them into the GI.
Consortial Geospatial Data Collection: Toward Standards and Processes for Shared GeoBlacklight Metadata
Consortial geospatial data communities, such as the OpenGeoPortal federation and the GeoBlacklight initiative, facilitate contextualized discovery and promote metadata sharing to disperse hosting and preservation responsibilities across institutions. However, the challenges of communal metadata are manifold; they include proliferating standards, varying levels of completeness, mutable technology infrastructures, and uneven availability of human labor. Drawing from literature on metadata quality control, we outline a procedure for "scoring" GeoBlacklight records to establish a Domain Specific Language for metadata best practices. We propose strategies for authorship and management conducive to functionally interoperable geospatial metadata that is versioned and enhanceable by the collective.
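As a rough illustration of what "scoring" a record could look like, the sketch below computes a weighted completeness score. The scoring rule, the weights, and the split into `REQUIRED` and `RECOMMENDED` fields are assumptions made here, not the procedure from the paper; the field names are assumed to follow the GeoBlacklight 1.0 metadata schema and should be checked against the schema version in use.

```python
# Field names assumed from the GeoBlacklight 1.0 schema; verify before use.
REQUIRED = [
    "dc_identifier_s", "dc_rights_s", "dc_title_s",
    "dct_provenance_s", "layer_slug_s", "solr_geom",
]
RECOMMENDED = [
    "dc_description_s", "dc_subject_sm", "dct_spatial_sm", "dct_temporal_sm",
]


def score_record(record, required_weight=2.0, recommended_weight=1.0):
    """Return (score in [0, 1], list of missing required fields).

    The weights are hypothetical: a required field counts twice as much
    as a recommended one.
    """
    def present(name):
        # Treat absent keys, empty strings and empty lists as missing.
        return record.get(name) not in (None, "", [])

    earned = (required_weight * sum(present(f) for f in REQUIRED)
              + recommended_weight * sum(present(f) for f in RECOMMENDED))
    possible = (required_weight * len(REQUIRED)
                + recommended_weight * len(RECOMMENDED))
    missing = [f for f in REQUIRED if not present(f)]
    return earned / possible, missing


# Hypothetical record: all required fields, one recommended field.
sample = {
    "dc_identifier_s": "example-001",
    "dc_rights_s": "Public",
    "dc_title_s": "Example roads layer",
    "dct_provenance_s": "Example University",
    "layer_slug_s": "example-roads",
    "solr_geom": "ENVELOPE(-74.3, -73.7, 40.9, 40.5)",
    "dc_description_s": "Illustrative record only.",
}
score, missing = score_record(sample)
```

A score like this only measures completeness; the quality dimensions discussed in the metadata literature (accuracy, consistency, provenance) would need separate checks.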
Land Parcel Identification System conceptual model: development of a conceptual model for a geoinformation interest community
The electronic version of the dissertation does not contain the publications.
This doctoral thesis addresses the creation of the Land Parcel Identification System (LPIS) Conceptual Model (LCM) and its use in standardizing spatial data, assessing their quality, and making them interoperable with spatial data from other domains. The spatial data covered by the model are used to administer and control agricultural subsidies under the EU Common Agricultural Policy (CAP).
To administer the subsidies paid under the CAP, every EU member state has established an Integrated Administration and Control System (in Estonia, the Agricultural Registers and Information Board, PRIA), whose spatial data component is the land parcel register, or identification system. The requirement to map and register eligible land has led to a situation in which the agricultural sector has accumulated a large amount of spatial data. Over the last decade, the CAP-related geoinformatics sector in Europe has grown. The CAP-related Spatial Data Interest Community includes data producers, custodians and users, as well as developers of IT applications and suppliers of remote sensing data. The need to assess the quality of the registers and their compliance with EU regulations, and to ensure interoperability with spatial data and systems that support environmental requirements, prompted the creation of the LCM. The aim of the work was to advance conceptual modelling for assessing the quality of spatial data in parcel registers and for developing interoperability with other geoinformation domains (above all environmental protection).
The methodology for developing the LCM was based on that of the ISO 19100 series of international standards, which is also applied and extended by the principles of the INSPIRE directive and which is the focus of the theoretical part of the research. The main inputs to the model were a thorough treatment of the concepts laid down in the regulations governing the CAP and an analysis of existing operational systems, which is based on the results of LPIS surveys (Milenov and Kay, 2006; Zieliński and Sagris, 2008 and 2009) and covers the parcel registers of different member states.
The dissertation focuses on an analysis of the business model of CAP direct payments, that is, the basic concepts of the CAP subsidy system, and presents summaries and conclusions from the 2006 and 2008 questionnaires. The information obtained from the questionnaires was extended within the EU programme for assessing the quality of parcel registers.
At the centre of the first version of the LCM are two classes: ReferenceParcel, the reference parcel, and AgriculturalParcel, the field declared in the aid application. The task of the ReferenceParcel class is to identify, locate and determine the area of eligible agricultural land. ReferenceParcel acts as a "container" for the declared parcels. The subtypes of the reference parcel class are also covered, and the different approaches to classifying and mapping agricultural land cover in EU member states are analysed. The second stage of the work sought possibilities for integrating two models, the LCM and the Land Administration Domain Model (LADM, ISO 19152). The two models are integrated by means of a new spatial class, SubCadParsel, which represents land-cover-type subunits distinguished within a cadastral unit. The semantically similar administrative classes of both models are also covered, and new associations between the classes of the two models are identified. A thorough analysis is given of the real-life conditions under which integration of the two models could work.
The final version of the LCM focuses on two aspects: (i) modelling the classes that support compliance with the statutory management requirements for environment, health and animal welfare and that support control of the good agricultural and environmental condition of land; (ii) using the model to test the logical consistency of parcel registers (that is, their conformance with the requirements of EU regulations). For this purpose, an Abstract Test Suite (ATS) based on the ISO 19105 standard was developed, which makes it possible to map existing LPIS registers to the LCM schema. The ATS was developed and tested in cooperation with several EU member states, and its methodology has been part of the LPIS quality assurance framework established by the European Commission since 2010.
The LCM was also used to build a prototype of the LPIS testing portal, which brought together OGC-compliant web services. Their purpose is to enable data exchange between national parcel registers and auditors from the European Commission. The thematic and positional accuracy of pre-selected geographic layers of reference parcels was checked by member state experts against high-resolution remote sensing data. To give auditors access to the results of the quality control, three prototype web services were created, in which the LCM was used to transform the original data.
Further research concentrates on the representation of different European farming systems in parcel register data and on the possibilities of using these data to assess the environmental impact of agricultural policy. The basic concepts of LPIS/IACS are revisited, this time in the context of impact assessment and indicator development. The theoretical discussion is illustrated by an example of developing High Nature Value (HNV) farmland indicators in Jõgeva County: the detailed data obtained from the parcel register make it possible to calculate both landscape metrics and agricultural intensity indicators, thereby specifying different aspects of farming systems.
Thus, the LCM supports the harmonisation and interoperability of geographic data in several ways: (i) by offering a commonly understood technical reading of data within the domain, both through the model conformance test (ATS) and through transformation via web services; (ii) by making it possible to find semantic correspondences and to integrate data and systems across different geoinformation domains. Created and developed initially with the needs of the European Commission's LPIS quality system in mind, the LCM also enables a commonly understood reading of the agricultural register data of different member states in other domains. The LCM has been added as a use case to Annex H of the international standard ISO 19152 "Land Administration Domain Model" and to Annex B2 of the INSPIRE DS2.8 Land Cover specification.
This dissertation presents the development of the Land Parcel Identification System (LPIS) Conceptual Model (LCM) for the administration and control of agricultural subsidies of the European Common Agricultural Policy (CAP).
The subsidies which European farmers receive in the frame of the CAP are administered through the Integrated Administration and Control System (IACS), which is established and run by each EU member state. The IACS includes a Land Parcel Identification System (LPIS) as its spatial component. The requirement to map and record land eligible for payments has led to a situation where the agricultural sector has acquired a large amount of geographic data; the geospatial community of data producers, custodians and users has grown during the last decades. The need to assess the quality and consistency of the LPIS for the EU regulators, as well as to ensure the systems' interoperability as required for compliance with environmental legislation, calls for harmonisation efforts. In view of this, an LPIS Conceptual Model (LCM) was developed. The objective of the study was to introduce the modelling framework of the ISO 19100 series in order to advance the quality of geospatial data in the LPIS domain and interoperability with other geospatial domains.
The LCM was generated by means of both (i) the methodological approaches of the International Standards of the ISO 19100 series, further extended by the INSPIRE principles, and (ii) reverse engineering of existing operational LPIS systems. The latter is based on the results of two LPIS surveys covering different national implementations. Business analysis of the relevant EU regulations and the LPIS surveys led to the first-cut LCM. The model's core classes, reference and agricultural parcels, cover the process of land registration for the administration of agricultural subsidies, agri-environmental measures of rural development and environmental restrictions. The agricultural and reference parcels of the model build the framework for recording land cover and land use. Further model refinement addressed the quality aspects of the geographical databases: the LCM naturally became a part of the LPIS Quality Assurance programme between the European Commission and EU countries. The LCM was used (i) for conformance assessment of national systems and (ii) for implementation of the LPIS Test Bed portal: a set of OGC-compliant Web services allowing for agricultural data transformation from national data schemas to the common model, as well as for transferring, checking and storing spatial and non-spatial observations from the quality inspection.
The case study on interoperability with the cadastral domain looked for possibilities of collaboration between two models, the LCM and the Land Administration Domain Model (which became ISO 19152 LADM). The owner's rights, restrictions and responsibilities arising from land ownership in the cadastral domain have many similarities with, but also differences from, agricultural practice. The collaboration model, established via a newly introduced spatial class, and the semantic similarity of the administrative classes of both models were analysed in detail. Further studies include a representation of different European agricultural systems in LPIS and the potential of using LPIS data in the environmental impact assessment of the agricultural policy. The different types of land parcel proposed by the thesis, and ways of integrating them with data from the environmental domain, are viewed in the context of the development of agri-environmental indicators.
Developed first for the needs of the LPIS Quality Assurance Framework of the European Commission, the LCM also became a part of the International Standard ISO 19152, Land Administration Domain Model (Annex H: use case in agriculture) and the INSPIRE DS2.8 Land Cover specification (Annex B2: use case in agriculture).
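The relationship between the two core classes named in this abstract, reference parcels and agricultural parcels, can be sketched as a small data structure. The attribute names (`eligible_area_ha`, `declared_area_ha`) and the over-declaration check are assumptions introduced for illustration; they are not taken from the LCM schema or its Abstract Test Suite.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgriculturalParcel:
    """A field declared in an aid application (attributes are hypothetical)."""
    parcel_id: str
    declared_area_ha: float


@dataclass
class ReferenceParcel:
    """Unit of eligible land that contains the declared agricultural parcels."""
    parcel_id: str
    eligible_area_ha: float
    declared: List[AgriculturalParcel] = field(default_factory=list)

    def declared_total_ha(self) -> float:
        return sum(p.declared_area_ha for p in self.declared)

    def is_over_declared(self, tolerance_ha: float = 0.0) -> bool:
        # Assumed rule: declared area should not exceed the eligible area.
        return self.declared_total_ha() > self.eligible_area_ha + tolerance_ha


ref = ReferenceParcel("RP-1", eligible_area_ha=10.0)
ref.declared.append(AgriculturalParcel("AP-1", 4.0))
ref.declared.append(AgriculturalParcel("AP-2", 5.5))
within_limit = not ref.is_over_declared()

ref.declared.append(AgriculturalParcel("AP-3", 1.0))
exceeded = ref.is_over_declared()
```

A conformance test in the spirit of the abstract would run checks of this kind against national register data after mapping it to the common model.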
Towards intelligent transport systems: geospatial ontological framework and agent simulation
In an Intelligent Transport System (ITS) environment, the communication component is of high
significance as it supports interactions between vehicles and the roadside infrastructure.
Existing studies focus on the physical capability and capacity of the communication
technologies, but the equally important development of suitable and efficient semantic content
for transmission has received notably less attention. Using an ontology is one promising
approach for context modelling in ubiquitous computing environments. In the transport domain,
an ontology can be used both for context modelling and semantic contents for vehicular
communications. This research explores the development of an ontological framework
implementing a geosemantic messaging model to support vehicle-to-vehicle communications.
To develop an ontology model, two scenarios (an ambulance situation and a breakdown on the
motorway) are constructed to describe specific situations using short-range communication in
an ITS environment. In the scenarios, spatiotemporal relations and semantic relations among
vehicles and road facilities are extracted and defined as classes, objects, and properties/relations
in the ontology model. For the ontology model, some functions and query templates are also
developed to update vehicles' movements and to provide some logical procedures that vehicles
need to follow in emergency situations. To measure the effects of the vehicular communication
based on the ontology model, an agent-based approach is adopted to dynamically simulate the
moving vehicles and their communications following the scenarios.
The simulation results demonstrate that the ontology model can support vehicular
communications to update each vehicle's context model and assist its decision-making process
to resolve the emergency situations. The results also show the effect of vehicular
communications on the efficiency trends of traffic in emergency situations, where some vehicles
have a communication device, and others do not. The efficiency trends, based on the percentage
of vehicles having a communication device, can be useful to set a transition period plan for
installing communication devices in vehicles and the infrastructure.
The geospatial ontological framework and agent simulation may help to increase the
intelligence of ITS by supporting data-level and application-level implementation of
autonomous vehicle agents to share knowledge in local contexts. This work can be easily
extended to support more complex interactions amongst vehicles and the infrastructure.
Geographic information metadata – an outlook from the international standardization perspective
Geographic information metadata provides a detailed description of geographic information
resources. Well before digital data emerged, metadata were shown in the margins of paper maps
to inform the reader of the name of the map, the scale, the orientation of the magnetic North, the
projection used, the coordinate systems, the legend, and so on. Metadata were used to communicate
practical information for the proper use of maps. When geographic information entered the digital
era with geographic information systems, metadata was also collected digitally to describe datasets
and the dataset collections for various purposes. Initially, metadata were collected and saved in
digital files by data producers for their own specific needs. The sharing of geographic datasets then
required producers to provide metadata with the dataset (map scale, data sources, extent, datum,
coordinate reference system, etc.) to guide its proper use. Because of issues with sharing
and no common understanding of metadata requirements, the need for metadata standardization was
recognized by the geographic information community worldwide. The ISO technical committee 211
was created in 1994 with the scope of standardization in the field of digital geographic information
to support interoperability. In the early years of the committee, standardization of metadata was
initiated for different purposes, which culminated in the ISO 19115:2003 standard. Now, there are
many ISO geographic information standards that cover the various aspects of geographic information
metadata. This paper traces the development and evolution of the requirements
and international standardization activities of geographic information metadata standards, profiles
and resources, and how these attest to facilitating the discovery, evaluation, and appropriate use of
geographic information in various contexts.
http://www.mdpi.com/journal/ijgi
am2020
Geography, Geoinformatics and Meteorolog
The role of geographic knowledge in sub-city level geolocation algorithms
Geolocation of microblog messages has been extensively investigated in the
literature. Many solutions have been proposed that achieve good results at the
city-level. Existing approaches are mainly data-driven (i.e., they rely on a
training phase). However, the development of algorithms for geolocation at
sub-city level is still an open problem also due to the absence of good training
datasets. In this thesis, we investigate the role that external geographic
knowledge can play in geolocation approaches. We show how different geographical
data sources can be combined with a semantic layer to achieve reasonably
accurate sub-city level geolocation. Moreover, we propose a knowledge-based
method, called Sherloc, to accurately geolocate messages at sub-city level, by
exploiting the presence in the message of toponyms possibly referring to the
specific places in the target geographical area. Sherloc exploits the semantics
associated with toponyms contained in gazetteers and embeds them into a
metric space that captures the semantic distance among them. This allows
toponyms to be represented as points and indexed by a spatial access method,
allowing us to identify the semantically closest terms to a microblog message,
that also form a cluster with respect to their spatial locations. In contrast to
state-of-the-art methods, Sherloc requires no prior training, it is not limited
to geolocating on a fixed spatial grid, and it has experimentally demonstrated its
ability to infer the location at sub-city level with higher accuracy.
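The general idea of training-free, gazetteer-driven geolocation can be shown with a much simplified sketch. This is not Sherloc: the semantic embedding and spatial index described in the abstract are replaced by plain substring matching and a brute-force search for the largest group of mentioned toponyms that lie close together. The gazetteer entries, their coordinates, and the `radius_km` threshold are hypothetical.

```python
import math

# Hypothetical gazetteer: lower-cased toponym -> (latitude, longitude).
GAZETTEER = {
    "central station": (45.4850, 9.2040),
    "duomo": (45.4642, 9.1900),
    "galleria": (45.4659, 9.1899),
    "san siro": (45.4781, 9.1240),
}


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geolocate(message, radius_km=1.0):
    """Return ((lat, lon), toponyms) for the densest toponym group, or None."""
    text = message.lower()
    hits = [(name, pt) for name, pt in GAZETTEER.items() if name in text]
    if not hits:
        return None
    # For each mentioned toponym, collect the mentions within radius_km of it,
    # then keep the largest such group (ties resolved by gazetteer order).
    groups = [[other for other in hits
               if haversine_km(seed[1], other[1]) <= radius_km]
              for seed in hits]
    best = max(groups, key=len)
    lat = sum(pt[0] for _, pt in best) / len(best)
    lon = sum(pt[1] for _, pt in best) / len(best)
    return (lat, lon), sorted(name for name, _ in best)


result = geolocate(
    "Coffee near the Duomo, then shopping in the Galleria before San Siro")
```

In this example the two toponyms that are a few hundred metres apart outvote the isolated one, which mirrors the abstract's point that the selected terms should also form a cluster with respect to their spatial locations.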