    The impact of spatial data redundancy on SOLAP query performance

    Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been devoted to studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. First, we compare redundant and non-redundant GDW schemas and conclude that redundancy is associated with high performance losses. We also analyze indexing as a means of improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index improves the elapsed query processing time by 25% up to 99% for SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of increasing data volume on performance. The increase did not impair the performance of the SB-index, which still greatly improved the elapsed query processing time. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of the volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance by 80% to 91% for redundant GDW schemas. Funding: FAPESP, CNPq, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), INEP, FINE

    COVID-19 Data Warehouse: A Systematic Literature Review

    The coronavirus disease (COVID-19) has affected the whole world and has led clinicians to use the available knowledge to diagnose or predict the infection. The Data Warehouse (DW) is one of the most crucial tools that may enhance decision-making. In this paper, three main questions are investigated regarding the use of DW in the COVID-19 pandemic. The effect of using DW for diagnosis and prediction is investigated, and the most used DW architecture is explored. The sectors that have attracted the most attention from researchers, such as diagnosis, prediction, and finding correlations among features, are also examined. The selected studies are papers published between 2019 and 2022 in digital libraries (ACM, IEEE, Springer, Science Direct, and Elsevier) in the field of DW that deal with COVID-19. During the research, many limitations were detected, and some directions for future work are presented. Enterprise DW is the most used architecture for COVID-19 DW, while finding correlations among features and prediction are the sectors that have attracted the researchers' attention.

    Large spatial datasets: Present Challenges, future opportunities

    A key advantage of a well-designed multidimensional database is its ability to let as many users as possible across an organisation access and view the same data simultaneously. Large spatial datasets arise from recent scientific activities, which tend to generate large databases that approach terabytes in size and are in most cases multidimensional. In this paper, we look at the issues pertaining to large spatial datasets: their features (for example, views), architecture, access methods and, most importantly, design technologies. We also look at some ways of possibly improving the performance of some of the existing algorithms for managing large spatial datasets. The study reveals that the major challenges to effective management of large spatial datasets are storage utilization and computational complexity, both of which are driven by the size of spatial big data, which now tends to exceed the capacity of commonly used spatial computing systems owing to its volume, variety and velocity. Fortunately, these problems can be addressed by employing functional programming methods or parallelization techniques.

    Spatial Data Warehouse Modelling

    is concerned with multidimensional data models for spatial data warehouses. It first draws a picture of the research area, and then introduces a novel spatial multidimensional data model for spatial objects with geometry: the Multigranular Spatial Data warehouse (MuSD). The main novelty of the model is the representation of spatial measures at multiple levels of geometric granularity.

    An OLAP-GIS System for Numerical-Spatial Problem Solving in Community Health Assessment Analysis

    Community health assessment (CHA) professionals who use information technology need a complete system that is capable of supporting numerical-spatial problem solving. On-Line Analytical Processing (OLAP) is a multidimensional data warehouse technique that is commonly used as a decision support system in industry. Coupling OLAP with a Geospatial Information System (GIS) offers the potential for a very powerful system. For this work, OLAP and GIS were combined to develop the Spatial OLAP Visualization and Analysis Tool (SOVAT) for numerical-spatial problem solving. In addition to the development of this system, this dissertation describes three related studies: a usability study, a CHA survey, and a summative evaluation. The purpose of the usability study was to identify human-computer interaction issues. Fifteen participants took part in the study. Three participants per round used the system to complete typical numerical-spatial tasks. Objective and subjective results were analyzed after each round and system modifications were implemented. The result of this study was a novel OLAP-GIS system streamlined for numerical-spatial problem solving. The online CHA survey aimed to identify the information technology currently used for numerical-spatial problem solving. The survey was sent to CHA professionals and allowed them to record the individual technologies they used during specific steps of a numerical-spatial routine. In total, 27 participants completed the survey. Results favored SPSS for numerical steps and GIS for spatial steps. Next, a summative within-subjects crossover design compared SOVAT to the combined use of SPSS and GIS (termed SPSS-GIS) for numerical-spatial problem solving. Twelve individuals from the health sciences at the University of Pittsburgh participated. Half were randomly selected to use SOVAT first, while the other half used SPSS-GIS first. In the second session, they used the alternate application. Objective and subjective results favored SOVAT over SPSS-GIS. Inferential statistics were analyzed using linear mixed model analysis. At the .01 level, SOVAT was significantly different from SPSS-GIS for satisfaction and time (p < .002). The results demonstrate the potential for OLAP-GIS in CHA analysis. Future work will explore the impact of an OLAP-GIS system in other areas of public health.

    Current state of data warehouse and OLAP technologies applied to spatial databases (Estado actual de las tecnologías de bodega de datos y OLAP aplicadas a bases de datos espaciales)

    Organisations require timely, dynamic, user-friendly, centralised and easily accessible information in order to analyse it and take correct decisions at the right time. Centralisation can be achieved with data warehouse technology. On-line analytical processing (OLAP) provides the analysis. For data presentation, technologies that use graphics and maps can be exploited to give an overall view of a company and so support better decisions. Geographic information systems (GIS), which are designed to locate information spatially and represent it using maps, are useful here. Data warehouses are generally implemented with a multidimensional data model to make OLAP analysis easier. A fundamental point in this model is the definition of measures and dimensions, one of which is geography. Many researchers have concluded that in current analysis systems the geographic dimension is just another attribute describing the data, without any in-depth treatment of its spatial nature and without locating the data on a map, as GIS does. Seen this way, interoperability between GIS and OLAP (called spatial OLAP or SOLAP) is necessary, and several organisations are currently researching how to achieve it. This document summarises the current status of such research.

    A knowledge-based interface for interacting with spatial data warehouses (Uma interface baseada em conhecimento para interação com data warehouses espaciais)

    Dissertation (master's), Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-graduação em Ciência da Computação, Florianópolis, 2010. Information analysis in a spatial data warehouse (SDW) can involve handling large volumes of spatial data. Users from specific application domains who have only basic computing skills are generally unable, or find it very difficult, to meet their information analysis needs by interacting directly with SDWs, although some are able to interact with conventional data warehouses (DW) through a graphical user interface (GUI). The difficulties are greater in an SDW than in a conventional DW because of, among other reasons, the variety and complexity of the spatial data, spatial operators and spatial aggregation functions used to specify SOLAP queries. This work proposes a knowledge-based system, called S2DW (Semantic and Spatial Data Warehouses), to help these domain-specific users perform information analysis on SDWs by accessing semantic descriptions of the spatial data marts through a knowledge-based GUI. This work describes the general architecture of S2DW and focuses on its GUI. The knowledge-based GUI of S2DW lets the user search for data marts related to a given subject, either by specifying keywords or by browsing a view of a domain ontology. Each data mart related to the searched subject is presented to the user as a graph representing the dimensional structure of the information cube. This graph is semantically enriched with descriptions of the data content and of the data processing resources of the spatial data mart. Spatial OLAP queries can be specified by interacting with the knowledge-based GUI, which guides the user in properly composing operators and functions to handle the different types of data available in the data mart, in order to meet different analysis needs. The tables, charts and maps returned in response to SOLAP queries also allow user interaction to gradually refine the information analysis. The main contributions of this work are the initial proposal of the knowledge-based GUI of S2DW and the usability test of this GUI in a case study with real users from the agricultural domain.

    Infrastructure for traffic and driver behavior analysis (Infraestrutura para análise de tráfego e comportamento de condutores)

    Master's in Computer and Telematics Engineering (Mestrado em Engenharia de Computadores e Telemática). The work in this dissertation can be seen as a decision support system for traffic. It was motivated by smart cities projects, in which transportation is a major area. As in-vehicle technology evolves, it is possible to gather ever more information about vehicles in a real environment, which allows a more detailed analysis of traffic and drivers' behavior. A survey of related work in this area showed that many of the analyses performed did not take context into consideration, and some studies proposed integrating factors that influence driving as future work. In this dissertation the concepts of the related work are integrated, as are heterogeneous data sources with context information. A study of different database paradigms was also carried out, covering the most relevant NoSQL paradigms, their use cases and their main implementations. This dissertation proposes the design and implementation of a framework for analysing traffic and drivers' behavior based on trajectory data gathered from moving vehicles. For the proof of concept, two case studies were carried out with data extracted from two distinct sources of vehicle trajectories. A set of tools was developed to extract, transform and load data into the data marts developed. Visualization tools were used to perform a visual analysis, through charts for the aggregate measures and GIS software for the geospatial details. The framework was designed to be adaptable to different application scenarios involving moving vehicles, from public transportation management to behavior-based insurance. The results obtained allow the study of traffic and drivers' behavior in order to obtain knowledge in this area and possibly improve traffic management or the driving experience.
