23 research outputs found

    A benchmark for performance analysis of geographic information systems (Um benchmark voltado a analise de desempenho de sistemas de informações geograficas)

    Advisor: Geovane Cayres Magalhães. Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Ciência da Computação.
    Resumo (translated from Portuguese): The enormous volume and the nature of the data stored by applications that use geographic information systems (GIS) require changes or extensions to the access methods, query optimizers, and query languages established for conventional database management systems (DBMS). As a result, different solutions have been proposed, making it essential to create some mechanism that can measure the efficiency of these solutions and help guide future research. For this purpose, this dissertation uses the experimental technique of benchmarking. It proposes the workload and characterizes the data of a benchmark aimed at analyzing GIS performance. The benchmark workload consists of a set of primitive transactions, specified at a high level, that can be used to build more complex transactions. These primitive transactions are predominantly oriented toward spatial data and are, a priori, independent of the data format used (raster or vector). The benchmark data were characterized in terms of the data types needed to represent georeferenced applications, together with procedures for generating synthetic data. Finally, a target application using synthetic data was defined in order to validate the proposed benchmark.
    Abstract: Geographical Information Systems (GIS) deal with data that are special in nature and size. Thus, the technologies developed for conventional database systems, such as access methods, query optimizers and languages, have to be modified in order to satisfy the needs of a GIS. These modifications, embedded in several GIS or being proposed by research projects, need to be evaluated. This thesis proposes mechanisms for evaluating GIS based on benchmarks. The benchmark is composed of a workload to be submitted to the GIS being analysed and data characterizing the information. The workload is made of a set of primitive transactions that can be combined in order to derive transactions of any degree of complexity. These primitive transactions are oriented to spatial data but do not depend on the way the data are represented (vector or raster). The benchmark database characterization was defined in terms of the types of data required by applications that use georeferencing, and by the need to generate complex and controlled artificial data. The proposed technique and methods were used to show how to create the transactions and the data for a given application.
    Degree: Master's (Master in Computer Science).

    An abstract data type to handle vague spatial objects based on the fuzzy model

    Crisp spatial data are geometric features with exact location on the extent and well-known boundaries. On the other hand, vague spatial data are characterized by inaccurate locations or uncertain boundaries. Despite the importance of vague spatial data in spatial applications, few related works actually implement vague spatial data, and they do not define abstract data types that enable database management systems to manage such data. We therefore propose the abstract data type FuzzyGeometry to handle vague spatial data based on the fuzzy model. FuzzyGeometry was developed as a PostgreSQL extension and its implementation is open source. It offers management for fuzzy points and fuzzy lines. As a result, spatial applications are able to access PostgreSQL to handle vague spatial objects. Funding: FAPESP, CAPES, CNP

    The impact of spatial data redundancy on SOLAP query performance

    Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been focused on studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. Firstly, we compare redundant and non-redundant GDW schemas and conclude that redundancy is related to high performance losses. We also analyze the issue of indexing, aiming at improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index reduces the elapsed time in query processing by 25% to 99% for SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of the increase in data volume on the performance. The increase did not impair the performance of the SB-index, which still greatly reduced the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of its volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance by 80% to 91% for redundant GDW schemas. Funding: FAPESP, CNPq, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), INEP, FINE

    Embedding vague spatial objects into spatial databases using the VagueGeometry abstract data type

    Spatial vagueness has been required by geoscientists to handle vague spatial objects, i.e., spatial objects that do not have exact locations, strict boundaries, or sharp interiors. However, there is a gap in the literature on how to handle these objects in spatial database management systems, since these systems mainly support crisp spatial objects, i.e., objects that have well-defined locations, boundaries, and interiors. In this paper, we fill this gap by proposing VagueGeometry, a novel abstract data type that handles vague spatial objects, includes an expressive set of vague spatial operations, and has an open-source implementation. Experimental results show that VagueGeometry improved the performance of spatial queries with vague topological predicates by 23% to 84% compared with functionalities available in current spatial databases. Funding: FAPESP, CAPES, CNP

    Modeling fuzzy topological predicates for fuzzy regions

    Spatial database systems and Geographical Information Systems (GIS) are currently only able to handle crisp spatial objects, i.e., objects whose extent, shape, and boundary are precisely determined. However, GIS applications are also interested in managing vague or fuzzy spatial objects. Spatial fuzziness captures the inherent property of many spatial objects in reality that do not have sharp boundaries and interiors or whose boundaries and interiors cannot be precisely determined. While topological relationships have been broadly explored for crisp spatial objects, this is not the case for fuzzy spatial objects. In this paper, we propose a novel model to formally define fuzzy topological predicates for simple and complex fuzzy regions. The model encompasses six fuzzy predicates (overlap, disjoint, inside, contains, equal and meet), of which we focus here on the fuzzy overlap and the fuzzy disjoint predicates only. For their computation we consider two low-level measures, the degree of membership and the degree of coverage, and map them to high-level fuzzy modifiers and linguistic values, respectively, that are deployed in spatial queries by end-users. Funding: FAPESP (grant numbers 2012/12299-8 and 2013/19633-3), CAPES, CNPq, National Science Foundation (grant number NSF-IIS-0915914)

    Analytical Processing Over XML and XLink

    Current commercial and academic OLAP tools do not process XML data that contains XLink. To overcome this issue, this paper proposes an analytical system composed of LMDQL, an analytical query language. Also, the XLDM metamodel is given to model cubes of XML documents with XLink and to deal with syntactic, semantic and structural heterogeneities commonly found in XML documents. As current W3C query languages for navigating in XML documents do not support XLink, XLPath is discussed in this article to provide features for the LMDQL query processing. A prototype system enabling the analytical processing of XML documents that use XLink is also detailed. This prototype includes a driver, named sql2xquery, which performs the mapping of SQL queries into XQuery. To validate the proposed system, a case study and its performance evaluation are presented to analyze the impact of analytical processing over XML/XLink documents. Funding: FAPESP, FACEPE, CAPES, CNPq, INEP, FINE

    Ontology-Driven IoT System for Monitoring Hypertension

    Hypertension is a noncommunicable disease (NCD) that causes global concern, high costs and a high number of deaths. Internet of Things, Ubiquitous Computing, and Cloud Computing enable the development of systems for remote and real-time monitoring of patients affected with NCDs like hypertension. This paper reports on a system for monitoring hypertension patients that was built by employing these techniques. This system allows the vital signs of a patient (blood pressure, heart rate, body temperature) to be captured via sensors built in a wearable device similar to a wristwatch. These signals are transmitted to the patient's mobile device for processing, and the generated clinical data are sent to the cloud to be properly presented and analysed by the health professionals responsible for the patient. To deal with semantic interoperability issues that arise when multiple different devices and system components must interoperate, a semantic model was conceived for this system in terms of ontologies for diseases and devices. This paper also presents the semantic module that we developed and implemented in the cloud to perform reasoning based on this model, demonstrating the potential benefits of incorporating semantic technologies in our system.

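    Analysis of the Influence of the Spatial Data Distribution Factor on the Performance of Multidimensional Access Methods (Análise da Influência do Fator Distribuição Espacial dos Dados no Desempenho de Métodos de Acesso Multidimensionais)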

    A multidimensional access method (MAM) is an indexing structure designed to support spatial objects, especially rectangles. The main goal of a MAM is to provide fast retrieval of the spatial objects that satisfy a given topological, metric, or directional relationship. To this end, the indexed space is organized so that, for example, retrieving the data rectangles contained in a particular area requires accessing only the rectangles near that area, as opposed to examining the complete set of rectangles stored in secondary memory. A MAM is therefore designed as an optimized path to spatial data, and its use significantly improves the query processing performance of spatial database management systems. In this thesis, we investigate the performance of a set of MAMs, most of which have been identified in the literature as very efficient in supporting spatial selection queries. This group consists of the following access methods: R-tree, R-tree Greene, R+-tree, Hilbert R-tree, SR-tree, and three variants of the R*-tree called R*-tree CR (i.e., close reinsert), R*-tree FR (i.e., far reinsert), and R*-tree WR (i.e., without reinsertion). The performance comparison of these MAMs was carried out primarily to analyze the influence of the spatial data distribution factor. To this end, we proposed a performance evaluation methodology that allows the generation of a set of spatial distribution types with different characteristics, which make it possible to analyze the influence of the spatial data distribution factor from different perspectives, ranging from a weak to a strong influence. Through several performance tests, we observed how the spatial data distribution affected the costs of insertion and of storage of new entries in the spatial index, as well as the cost of point queries, intersection range queries, enclosure range queries, and containment range queries. With respect to these spatial selection queries, the performance results showed that the R+-tree was the best spatial indexing structure for point queries and enclosure range queries, whereas the R*-tree variants produced the best performance results for intersection and containment range queries. On the other hand, the Hilbert R-tree and SR-tree methods performed poorly for the four spatial queries investigated. However, in additional performance tests, which modified both the size and the shape of the data rectangles, the Hilbert R-tree and SR-tree access methods produced competitive results, particularly for intersection and containment range queries.
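    The four spatial selection queries named in this abstract can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in the spatial indexing literature (a point query returns rectangles containing a point; an intersection range query returns rectangles that intersect the query window; an enclosure range query returns rectangles that enclose the window; a containment range query returns rectangles lying inside the window). The `Rect` type and the function names are hypothetical, and the linear scan only stands in for the index traversal that a MAM would perform; it is not the thesis's implementation.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical type, for illustration only)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def contains_point(r: Rect, x: float, y: float) -> bool:
    # True if the point lies inside r or on its border.
    return r.xmin <= x <= r.xmax and r.ymin <= y <= r.ymax


def intersects(a: Rect, b: Rect) -> bool:
    # True if a and b share at least one point.
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def contains(outer: Rect, inner: Rect) -> bool:
    # True if inner lies entirely within outer (borders may touch).
    return (outer.xmin <= inner.xmin and outer.ymin <= inner.ymin
            and inner.xmax <= outer.xmax and inner.ymax <= outer.ymax)


# Each query below scans every rectangle; a MAM would prune this search.
def point_query(data: List[Rect], x: float, y: float) -> List[Rect]:
    return [r for r in data if contains_point(r, x, y)]


def intersection_range_query(data: List[Rect], window: Rect) -> List[Rect]:
    return [r for r in data if intersects(r, window)]


def enclosure_range_query(data: List[Rect], window: Rect) -> List[Rect]:
    # Data rectangles that enclose the query window.
    return [r for r in data if contains(r, window)]


def containment_range_query(data: List[Rect], window: Rect) -> List[Rect]:
    # Data rectangles contained in the query window.
    return [r for r in data if contains(window, r)]
```

    Enclosure and containment differ only in which rectangle plays the outer role, which is one reason the same index can behave differently across the two query types.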

    Physical Data Warehouse Design on NoSQL Databases - OLAP Query Processing over HBase

    Nowadays, data warehousing and online analytical processing (OLAP) are core technologies in business intelligence and have therefore drawn much interest from researchers in the last decade. However, these technologies have been mainly developed for relational database systems in centralized environments. In other words, they have not been designed to be applied in scalable systems such as NoSQL databases. Adapting a data warehousing environment to NoSQL databases introduces several advantages, such as scalability and flexibility. This paper investigates three physical data warehouse designs to adapt the Star Schema Benchmark for use in NoSQL databases. In particular, our main investigation refers to OLAP query processing over column-oriented databases using the MapReduce framework. We analyze the impact of distributing attributes among column-families in HBase on OLAP query performance. Our experiments showed how the processing time of OLAP queries was impacted by a physical data warehouse design with regard to the number of dimensions accessed and the data volume. We conclude that using distinct distributions of attributes among column-families can improve OLAP query performance in HBase and consequently make the benchmark more suitable for OLAP over NoSQL databases. Funding: FAPESP (Grant: 2014/12233-2), FINEP, CAPES, CNP