20 research outputs found

    A conceptual framework and a risk management approach for interoperability between geospatial datacubes

    Today, we observe wide use of geospatial databases that are implemented in many forms (e.g., transactional centralized systems, distributed databases, multidimensional datacubes). Among those possibilities, the multidimensional datacube is more appropriate for supporting interactive analysis and guiding an organization's strategic decisions, especially when different epochs and levels of information granularity are involved. However, users may need several geospatial multidimensional datacubes, which may be semantically heterogeneous and have different degrees of appropriateness to the context of use. Overcoming the semantic problems related to this heterogeneity and to the difference in appropriateness to the context of use, in a manner that is transparent to users, has been the principal aim of interoperability for the last fifteen years. However, in spite of successful initiatives, today's solutions have evolved in a non-systematic way. Moreover, no solution has been found to address the specific semantic problems related to interoperability between geospatial datacubes. In this thesis, we suppose that it is possible to define an approach that addresses these semantic problems to support interoperability between geospatial datacubes. To that end, we first describe interoperability between geospatial datacubes. Then, we define and categorize the semantic heterogeneity problems that may occur during the interoperability process of different geospatial datacubes. To resolve semantic heterogeneity between geospatial datacubes, we propose a conceptual framework that is essentially based on human communication. In this framework, software agents representing the geospatial datacubes involved in the interoperability process communicate with each other. This communication aims at exchanging information about the content of the geospatial datacubes. Then, to help agents make appropriate decisions during the interoperability process, we evaluate a set of indicators of the external quality (fitness-for-use) of geospatial datacube schemas and of the production context (e.g., metadata). Finally, we implement the proposed approach to show its feasibility.

    Improving the geospatial consistency of digital libraries metadata

    Consistency is an essential aspect of the quality of metadata. Inconsistent metadata records are harmful: given a themed query, the set of retrieved metadata records would contain descriptions of unrelated or irrelevant resources, and may even omit some resources considered obvious. This is even worse when the description of the location is inconsistent. Inconsistent spatial descriptions may yield invisible or hidden geographical resources that cannot be retrieved by means of spatially themed queries. Therefore, ensuring spatial consistency should be a primary goal when reusing, sharing and developing georeferenced digital collections. We present a methodology able to detect geospatial inconsistencies in metadata collections based on the combination of spatial ranking, reverse geocoding, geographic knowledge organization systems and information-retrieval techniques. This methodology has been applied to a collection of metadata records describing maps and atlases belonging to the Library of Congress. The proposed approach was able to automatically identify inconsistent metadata records (870 out of 10,575) and propose fixes for most of them (91.5%). These results support the ability of the proposed methodology to assess the impact of spatial inconsistency on the retrievability and visibility of metadata records and to improve their spatial consistency.
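To make the notion of a spatially inconsistent record concrete, the following is a minimal sketch, not the authors' methodology (which combines spatial ranking, reverse geocoding, knowledge organization systems and information retrieval). It only assumes that a record carries a place name and a declared bounding box, and flags the record when the two do not overlap. The `BBox` class, the `GAZETTEER` table with its approximate extents, and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees (no antimeridian handling)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes are disjoint if one lies entirely to one side of the other.
        return not (
            self.east < other.west
            or other.east < self.west
            or self.north < other.south
            or other.north < self.south
        )


# Hypothetical toy gazetteer with rough, illustrative extents.
GAZETTEER = {
    "Spain": BBox(-9.4, 35.9, 3.4, 43.8),
    "Iceland": BBox(-24.6, 63.3, -13.5, 66.6),
}


def is_spatially_consistent(place_name: str, declared: BBox) -> Optional[bool]:
    """Return True/False if the place is known, None if it cannot be checked."""
    expected = GAZETTEER.get(place_name)
    if expected is None:
        return None
    return declared.intersects(expected)


madrid_area = BBox(-3.9, 40.3, -3.5, 40.6)
reykjavik_area = BBox(-22.0, 64.0, -21.0, 64.5)
print(is_spatially_consistent("Spain", madrid_area))     # consistent record
print(is_spatially_consistent("Spain", reykjavik_area))  # inconsistent record
```

A record labelled "Spain" whose coordinates fall near Reykjavik would be flagged here; a real system would also need to rank candidate places and propose a corrected extent.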

    Design for geospatially enabled climate modeling and alert system (CLIMSYS): A position paper

    This paper focuses on a multidisciplinary approach to presenting climate analysis studies, drawing on interdisciplinary fields to structure the information. The CLIMSYS system provides the crucial element of spatially enabling climate data processing. Even though climate change is a matter of great scientific relevance and of broad general interest, there are some problems related to its communication. Finding practical, workable and cost-efficient solutions to the problems posed by climate change is now a world priority, and one that links government and non-government organizations in a way not seen before. One approach that should suffice is to create an accessible intelligent system that houses prior knowledge and curates the incoming data to deliver meaningful results. The objective of the proposed research is to develop a generalized system for climate data analysis that facilitates open sharing, central implementation, integrated components, knowledge creation, data format understanding, inferencing and, ultimately, optimal solution delivery by way of geospatial enablement.

    Geospatial information infrastructures

    Manual of Digital Earth / Editors: Huadong Guo, Michael F. Goodchild, Alessandro Annoni. Springer, 2020. ISBN: 978-981-32-9915-3. Geospatial information infrastructures (GIIs) provide the technological, semantic, organizational and legal structure that allows for the discovery, sharing, and use of geospatial information (GI). In this chapter, we introduce the overall concept and surrounding notions such as geographic information systems (GIS) and spatial data infrastructures (SDI). We outline the history of GIIs in terms of the organizational and technological developments as well as the current state of the art, and reflect on some of the central challenges and possible future trajectories. We focus on the tension between increased needs for standardization and the ever-accelerating technological changes. We conclude that GIIs evolved as a strong underpinning contribution to implementation of the Digital Earth vision. In the future, these infrastructures are challenged to become flexible and robust enough to absorb and embrace technological transformations and the accompanying societal and organizational implications. With this contribution, we present the reader a comprehensive overview of the field and a solid basis for reflections about future developments.

    Integration of temporal and semantic components into the Geographic Information. Part II: Methodology

    The overall objective of this research project is to enrich geographic data with temporal and semantic components in order to significantly improve spatio-temporal analysis of geographic phenomena. To achieve this goal, we intend to establish three new layers (structures) and incorporate them into the core of the Geographic Information (GI) by using mark-up languages, and to define a set of methods and tools that enable the system to retrieve and exploit these layers (semantic-temporal, geosemantic, and incremental spatio-temporal). Besides these layers, we also propose a set of models (temporal and spatial) and two semantic engines that make the most of the enriched geographic data. The roots of the project and its definition were previously presented in Siabato & Manso-Callejo 2011. In this new position paper, we extend that work by clearly delineating the methodology and the foundations on which we will define the main components of this research: the spatial model, the temporal model, the semantic layers, and the semantic engines. Taken together, the former paper and this new work aim to give a comprehensive description of the whole process, from pinpointing the basic problem to describing and assessing the solution. In this article we only outline the methods and the background needed to describe how we intend to define the components and integrate them into the GI.

    Consortial Geospatial Data Collection: Toward Standards and Processes for Shared GeoBlacklight Metadata

    Consortial geospatial data communities, such as the OpenGeoPortal federation and the GeoBlacklight initiative, facilitate contextualized discovery and promote metadata sharing to disperse hosting and preservation responsibilities across institutions. However, the challenges of communal metadata are manifold; they include proliferating standards, varying levels of completeness, mutable technology infrastructures, and uneven availability of human labor. Drawing from literature on metadata quality control, we outline a procedure for "scoring" GeoBlacklight records to establish a Domain Specific Language for metadata best practices. We propose strategies for authorship and management conducive to functionally interoperable geospatial metadata that is versioned and enhanceable by the collective.

    Conceptual model of the Land Parcel Identification System: creating a conceptual model for a geoinformation interest community

    The electronic version of the dissertation does not contain the publications. This doctoral thesis addresses the creation of the Land Parcel Identification System (LPIS) Conceptual Model (LCM) and its use in standardizing spatial data, assessing their quality, and interoperating with spatial data from other domains. The spatial data covered by the model are used for the administration and control of agricultural subsidies under the EU Common Agricultural Policy (CAP). To administer the subsidies paid under the CAP, every EU member state has established an Integrated Administration and Control System (in Estonia, the Agricultural Registers and Information Board, PRIA), whose spatial data component is the land parcel register, or identification system. The requirement to map and register land eligible for support has led to a situation in which the agricultural sector has accumulated a large amount of spatial data. Over the last decade, the CAP-related geoinformatics sector has grown in Europe. The CAP-related geoinformation interest group (Spatial Data Interest Community) includes data producers, custodians and users, as well as developers of IT applications and suppliers of remote sensing data. The need to assess the quality of the registers and their compliance with EU regulations, and to ensure interoperability with spatial data and systems that support environmental requirements, prompted the creation of the LCM. The aim of the work was to advance conceptual modelling in assessing the quality of spatial data in parcel registers and in developing interoperability with other geoinformation domains, above all environmental protection. The methodology for developing the LCM was based on that of the ISO 19100 series of international standards, which is also applied and extended by the principles of the INSPIRE directive and which is the focus of the theoretical part of the research. The main inputs to the model were a thorough treatment of the concepts laid down in the regulations governing the CAP and an analysis of existing operational systems, based on the results of LPIS surveys (Milenov and Kay, 2006; Zieliński and Sagris, 2008 and 2009) and covering the parcel registers of different member states. The dissertation focuses on an analysis of the business model of CAP direct payments, that is, the core concepts of the CAP support system, and presents summaries and conclusions from the 2006 and 2008 questionnaires. The information obtained from the questionnaires was extended within the EU programme for assessing the quality of parcel registers. At the centre of the first version of the LCM are two classes: ReferenceParcel, the reference parcel, and AgriculturalParcel, the field declared in an aid application. The task of the ReferenceParcel class is to identify and locate eligible agricultural land and to determine its area. ReferenceParcel acts as a "container" for the declared plots. The subtypes of the reference parcel class are also discussed, and the different approaches to classifying and mapping agricultural land cover in EU member states are analysed. The second stage of the work sought ways to integrate two models: the LCM and the Land Administration Domain Model (LADM, ISO 19152). The two models are integrated by means of a new spatial class, SubCadParsel, which represents land cover subunits distinguished within a cadastral unit. The semantically similar administrative classes of both models are also discussed, and new relationships between the classes of the two models are identified. A thorough analysis is given of the real-life conditions under which integration of the two models could work. The latest version of the LCM focuses on two aspects: (i) modelling the classes that support compliance with the management requirements for the environment, health and animal welfare and that support the control of good agricultural and environmental conditions of land; (ii) using the model to test the logical consistency of parcel registers, that is, their conformance with the requirements of EU regulations. For this purpose, a set of tests based on the ISO 19105 standard (Abstract Test Suite, ATS) was developed, which makes it possible to map existing LPIS registers to the LCM schema. The ATS was developed and tested in cooperation with several EU member states, and its methodology has been part of the LPIS quality assurance framework established by the European Commission since 2010. The LCM was also used to create a prototype of the LPIS testing portal, which brought together OGC-compliant web services. Their purpose is to enable data exchange between national parcel registers and auditors from the European Commission. The thematic and positional accuracy of the geographic layers of preselected reference parcels was checked by member state experts against high-resolution remote sensing data. To give auditors access to the quality control results, three prototype web services were created in which the LCM was used to transform the original data. Further research concentrates on how different European farming systems are reflected in parcel register data and on the possibilities of using these data to assess the environmental impact of agricultural policy. The core concepts of LPIS/IACS are revisited, now in the context of impact assessment and indicator development. The theoretical discussion is illustrated by an example of developing High Nature Value (HNV) farmland indicators in Jõgeva County: the detailed data obtained from the parcel register make it possible to compute both landscape metrics and indicators of agricultural intensity, while specifying different aspects of farming systems. The LCM thus supports the harmonization and interoperability of geographic data in several ways: (i) by offering a commonly understood technical reading of the data within the domain, both through the model conformance test (ATS) and through transformation via web services; (ii) by making it possible to find semantic correspondences and to integrate data and systems across different geoinformation domains. Originally created and developed with the needs of the European Commission's LPIS quality system in mind, the LCM also allows the agricultural register data of different member states to be read in a commonly understood way in other domains. The LCM has been included as a use case in Annex H of the international standard ISO 19152 'Land Administration Domain Model' and in Annex B2 of the INSPIRE DS2.8 Land Cover specification.
    This dissertation presents the development of the Land Parcel Identification System (LPIS) Conceptual Model (LCM) for the administration and control of agricultural subsidies of the European Common Agricultural Policy (CAP). The subsidies that European farmers receive in the frame of the CAP are administered through the Integrated Administration and Control System (IACS), which is established and run by the EU member states. IACS includes a Land Parcel Identification System (LPIS) as its spatial component. The requirement to map and record land eligible for payments has led to a situation in which the agricultural sector has acquired a large amount of geographic data; the geospatial community of data producers, custodians and users has grown during the last decades. The need to assess the quality and consistency of the LPIS with respect to EU regulations, as well as to ensure the interoperability of systems required for compliance with environmental legislation, calls for harmonisation efforts. In view of this, an LPIS Conceptual Model (LCM) was developed. The objective of the study was to introduce the modelling framework of the ISO 19100 series to advance the quality of geospatial data in the LPIS domain and interoperability with other geospatial domains. The LCM was generated by means of both (i) the methodological approaches of the International Standards of the ISO 19100 series, further extended by the INSPIRE principles, and (ii) reverse engineering of existing operational LPIS systems. The latter is based on the results of two LPIS surveys covering different national implementations. Business analysis of the relevant EU regulations and the LPIS surveys led to the first-cut LCM. The model's core classes, reference and agricultural parcels, cover the process of land registration for the administration of agricultural subsidies, agri-environmental measures of rural development and environmental restrictions. The agricultural and reference parcels of the model build the framework for recording land cover and land use. Further model refinement addressed the quality aspects of the geographical databases: the LCM naturally became a part of the LPIS Quality Assurance programme between the European Commission and EU countries. The LCM was used (i) for conformance assessment of national systems and (ii) for implementation of the LPIS Test Bed portal: a set of OGC-compliant Web services allowing agricultural data to be transformed from national data schemas to the common model, as well as transferring, checking and storing spatial and non-spatial observations from the quality inspection. The case study on interoperability with the cadastral domain looked for possibilities of collaboration between two models: the LCM and the Land Administration Domain Model (which became ISO 19152 LADM). Owners' rights, restrictions and responsibilities arising from land ownership in the cadastral domain have many similarities with agricultural practice, but also differences. A collaboration model was established via a newly introduced spatial class, and the semantic similarity of the administrative classes of both models was analysed in detail. Further studies include a representation of different European agricultural systems in LPIS and the potential of using LPIS data in the environmental impact assessment of the agricultural policy. The different types of land parcel proposed by the thesis and the ways of integrating them with data from the environmental domain are viewed in the context of the development of agri-environmental indicators. Developed first for the needs of the LPIS Quality Assurance Framework of the European Commission, the LCM also became a part of the International Standard ISO 19152, Land Administration Domain Model (Annex H: use case in agriculture), and of the INSPIRE DS2.8 Land Cover specification (Annex B2: use case in agriculture).

    Towards intelligent transport systems: geospatial ontological framework and agent simulation

    In an Intelligent Transport System (ITS) environment, the communication component is of high significance as it supports interactions between vehicles and the roadside infrastructure. Existing studies focus on the physical capability and capacity of the communication technologies, but the equally important development of suitable and efficient semantic content for transmission has received notably less attention. Using an ontology is one promising approach for context modelling in ubiquitous computing environments. In the transport domain, an ontology can be used both for context modelling and semantic contents for vehicular communications. This research explores the development of an ontological framework implementing a geosemantic messaging model to support vehicle-to-vehicle communications. To develop an ontology model, two scenarios (an ambulance situation and a breakdown on the motorway) are constructed to describe specific situations using short-range communication in an ITS environment. In the scenarios, spatiotemporal relations and semantic relations among vehicles and road facilities are extracted and defined as classes, objects, and properties/relations in the ontology model. For the ontology model, some functions and query templates are also developed to update vehicles’ movements and to provide some logical procedures that vehicles need to follow in emergency situations. To measure the effects of the vehicular communication based on the ontology model, an agent-based approach is adopted to dynamically simulate the moving vehicles and their communications following the scenarios. The simulation results demonstrate that the ontology model can support vehicular communications to update each vehicle’s context model and assist its decision-making process to resolve the emergency situations. 
The results also show the effect of vehicular communications on traffic efficiency trends in emergency situations where some vehicles have a communication device and others do not. The efficiency trends, based on the percentage of vehicles having a communication device, can be useful for setting a transition-period plan for installing communication devices in vehicles and the infrastructure. The geospatial ontological framework and agent simulation may help increase the intelligence of ITS by supporting data-level and application-level implementation of autonomous vehicle agents that share knowledge in local contexts. This work can be extended to support more complex interactions amongst vehicles and the infrastructure.

    Geographic information metadata — an outlook from the international standardization perspective

    Geographic information metadata provides a detailed description of geographic information resources. Well before digital data emerged, metadata were shown in the margins of paper maps to inform the reader of the name of the map, the scale, the orientation of the magnetic North, the projection used, the coordinate systems, the legend, and so on. Metadata were used to communicate practical information for the proper use of maps. When geographic information entered the digital era with geographic information systems, metadata were also collected digitally to describe datasets and dataset collections for various purposes. Initially, metadata were collected and saved in digital files by data producers for their own specific needs. Sharing geographic datasets then required producers to provide metadata with each dataset to guide its proper use (map scale, data sources, extent, datum, coordinate reference system, etc.). Because of issues with sharing and the lack of a common understanding of metadata requirements, the need for metadata standardization was recognized by the geographic information community worldwide. The ISO technical committee 211 was created in 1994 with the scope of standardization in the field of digital geographic information to support interoperability. In the early years of the committee, standardization of metadata was initiated for different purposes, which culminated in the ISO 19115:2003 standard. Now, there are many ISO geographic information standards that cover the various aspects of geographic information metadata. This paper traces the development and evolution of the requirements and international standardization activities of geographic information metadata standards, profiles and resources, and how these attest to facilitating the discovery, evaluation, and appropriate use of geographic information in various contexts. http://www.mdpi.com/journal/ijgi am2020 Geography, Geoinformatics and Meteorolog

    The role of geographic knowledge in sub-city level geolocation algorithms

    Geolocation of microblog messages has been largely investigated in the literature. Many solutions have been proposed that achieve good results at the city-level. Existing approaches are mainly data-driven (i.e., they rely on a training phase). However, the development of algorithms for geolocation at sub-city level is still an open problem also due to the absence of good training datasets. In this thesis, we investigate the role that external geographic knowledge can play in geolocation approaches. We show how different geographical data sources can be combined with a semantic layer to achieve reasonably accurate sub-city level geolocation. Moreover, we propose a knowledge-based method, called Sherloc, to accurately geolocate messages at sub-city level, by exploiting the presence in the message of toponyms possibly referring to the specific places in the target geographical area. Sherloc exploits the semantics associated with toponyms contained in gazetteers and embeds them into a metric space that captures the semantic distance among them. This allows toponyms to be represented as points and indexed by a spatial access method, allowing us to identify the semantically closest terms to a microblog message, that also form a cluster with respect to their spatial locations. In contrast to state-of-the-art methods, Sherloc requires no prior training, it is not limited to geolocating on a fixed spatial grid and it experimentally demonstrated its ability to infer the location at sub-city level with higher accuracy.