5 research outputs found

    Aggregation-based information retrieval system for geospatial data catalogs

    Geospatial data catalogs enable users to discover and access geographical information. Prevailing solutions are document oriented and fragment the spatial continuum of the geospatial data into independent and disconnected resources described through metadata. As a result, the complete answer to a query may be scattered across multiple resources, making its discovery and access more difficult. This paper proposes an improved information retrieval process for geospatial data catalogs that aggregates the search results by identifying the implicit spatial/thematic relations between the metadata records of the resources. These aggregations are constructed so that they match the user query better than each resource does individually.

    Approaches for the clustering of geographic metadata and the automatic detection of quasi-spatial dataset series

    The discrete representation of resources in geospatial catalogues affects their information retrieval performance. The performance could be improved by using automatically generated clusters of related resources, which we name quasi-spatial dataset series. This work evaluates whether a clustering process can create quasi-spatial dataset series using only textual information from metadata elements. We assess the combination of different text cleaning approaches, word- and sentence-embedding representations (Word2Vec, GloVe, FastText, ELMo, Sentence BERT, and Universal Sentence Encoder), and clustering techniques (K-Means, DBSCAN, OPTICS, and agglomerative clustering) for the task. The results demonstrate that combining word-embedding representations with agglomerative clustering creates better quasi-spatial dataset series than the other approaches. In addition, we have found that the ELMo representation with agglomerative clustering produces good results without any preprocessing step for text cleaning.
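As a rough illustration of the clustering idea in this abstract, the following sketch groups metadata titles with average-linkage agglomerative clustering over cosine distances. It is not the authors' pipeline: a bag-of-words count vector stands in for the word embeddings the abstract lists, and the function names, the sample titles, and the 0.5 distance threshold are all assumptions made for this example.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy stand-in for an embedding (assumption): lowercase token counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative_clusters(texts, threshold):
    """Average-linkage agglomerative clustering.

    Starts with one cluster per text and repeatedly merges the two closest
    clusters until the closest pair is farther apart than `threshold`.
    Returns clusters as sorted lists of indices into `texts`.
    """
    vectors = [bag_of_words(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[i], vectors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical metadata titles, invented for this example.
titles = [
    "Land cover map sheet 12",
    "Land cover map sheet 13",
    "Land cover map sheet 14",
    "Hydrology network rivers 2010",
    "Hydrology network rivers 2015",
]
series = agglomerative_clusters(titles, threshold=0.5)
print(series)
```

With these sample titles, the three map sheets end up in one group and the two hydrology records in another, which is the kind of grouping the abstract calls a quasi-spatial dataset series. A distance threshold is used instead of a fixed number of clusters because the number of series in a catalogue is not known in advance; that choice is also an assumption of this sketch.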

    Análise de tipos de ontologias nas áreas de ciência da informação e ciência da computação (Analysis of ontology types in the fields of information science and computer science)

    Dissertation (master's) - Universidade Federal de Santa Catarina, Centro de Ciências da Educação, Programa de Pós-Graduação em Ciência da Informação, Florianópolis, 2014. The emergence of technologies that aim to complement the web, together with the problems that arise in the search for new and more efficient models of information retrieval, has made room for studies that draw on the benefits of the semantic organization of information and knowledge. Knowledge Organization Systems (KOS) allow a domain to be represented through the systematization of concepts and the semantic relations established between them. Among these conceptual systems are ontologies, which are used to represent the knowledge of a given domain. The goal of this research is to identify, through documentary research, the main characteristics of the types of ontologies. To that end, the methodological procedures employed Laurence Bardin's Content Analysis method. The corpus for analysis was built from the Library and Information Science Abstracts (LISA) and Computer and Information Systems Abstracts databases. The analysis of the results identified a significant predominance of research on domain ontologies, which are used as tools for representing the concepts and relations that belong to the desired world view. In contrast, top-level ontologies define the most basic concepts, which can be extended to other actions and domains associated with their area of approach. The application and task types allow a more specific level of representation, aligned with the modeling of particular environments.

    Web-based discovery and dissemination of multidimensional geographic information

    A spatial data clearinghouse is an electronic facility for searching, viewing, transferring, ordering, advertising, and disseminating spatial data from numerous sources via the Internet. Governments and other institutions have been implementing spatial data clearinghouses to minimise data duplication and thus reduce the cost of spatial data acquisition. Underlying these clearinghouses are geoportals and databases of geospatial metadata. A geoportal is an access point of a spatial data clearinghouse, and metadata is data that describes data. The success of a clearinghouse's spatial data discovery system depends on its ability to communicate the contents of geospatial metadata by providing both visual and analytical assistance to a user. The model currently adopted by the geographic information community was inherited from generic information systems and thus, to an extent, ignores the spatial characteristics of geographic data. Consequently, research in Geographic Information Retrieval (GIR) has focussed on spatial aspects of web-based data discovery and acquisition. This thesis considers how the process of GIR from geoportals can be enhanced through multidimensional visualisation served by web-based geographic data sources. An approach is proposed for the presentation of search results in ontology-assisted GIR. Also proposed is an approach for the visualisation of multidimensional geographic data from web-based data sources. These approaches are implemented in two prototypes, the Geospatial Database Online Visualisation Environment (GeoDOVE) and the Spatio-Temporal Ontological Relevance Model (STORM). A discussion of their design, implementation and evaluation is presented. The results suggest that ontology-assisted visualisation can improve a user's ability to identify the most relevant multidimensional geographic datasets from a set of search results. Additional results suggest that it is possible to offer the proposed visualisation approaches on existing geoportal frameworks. The implication of the results is that multidimensional visualisation should be considered by the wider geographic information community as an alternative to historic approaches for presenting search results on geoportals, such as the textual ranked list and two-dimensional maps.
    EThOS - Electronic Theses Online Service. University of Newcastle upon Tyne, GB, United Kingdom