10 research outputs found

    Bulk-load Operations for Multidimensional Data Structures

    This bachelor thesis deals with bulk loading of multidimensional data into tree data structures, namely the B-tree and the R-tree. The goal of the thesis is to design and implement an algorithm for bulk loading data into these structures and to compare the speed of bulk loading with one-by-one insertion. (Imported 03/11/2016; 460 - Department of Computer Science; grade: very good.)

    An Efficient Algorithm for Bulk-Loading xBR+-trees

    A major part of the interface to a database is made up of the queries that can be addressed to this database and answered (processed) efficiently, contributing to the quality of the developed software. Efficiently processed spatial queries constitute a fundamental part of the interface to spatial databases because of the wide range of applications that may issue such queries, such as geographical information systems (GIS), location-based services, computer visualization, automated mapping, and facilities management. Another important capability of the interface to a spatial database is the creation of efficient index structures to speed up spatial query processing. The xBR+-tree is a balanced, disk-resident, quadtree-based index structure for point data, which is very efficient for processing such queries. Bulk-loading refers to the process of creating an index from scratch when the dataset to be indexed is available beforehand, instead of creating the index gradually (and more slowly) by inserting the dataset elements one by one. In this paper, we present an algorithm for bulk-loading xBR+-trees for big datasets residing on disk, using a limited amount of main memory. The resulting tree is not only built fast, but also exhibits high performance in processing a broad range of spatial queries involving one or two datasets. To justify these characteristics, using real and artificial datasets of various cardinalities, we first present an experimental comparison of this algorithm with a previous version of the same algorithm and with STR, a popular algorithm for bulk-loading R-trees, regarding tree creation time and the characteristics of the trees created. Second, we experimentally compare the query efficiency of bulk-loaded xBR+-trees with that of bulk-loaded R-trees, regarding I/O and execution time. Thus, this paper contributes to the implementation of spatial database interfaces and to efficient storage organization for big spatial data management.
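
    The abstract above names STR (Sort-Tile-Recursive) as the R-tree bulk-loading baseline. As a rough illustration of that baseline only (not of the xBR+-tree algorithm, whose details the abstract does not give), the leaf-packing step of STR for 2-D points might be sketched as follows. The function names `str_pack_leaves` and `mbr` and the `capacity` parameter are assumptions made for this example.

```python
import math

def str_pack_leaves(points, capacity):
    """Group 2-D points into leaves of at most `capacity` points (STR idea).

    Hypothetical helper for illustration; only the leaf level is built.
    """
    n = len(points)
    if n == 0:
        return []
    num_leaves = math.ceil(n / capacity)
    # Use about sqrt(num_leaves) vertical slices so tiles are roughly square.
    num_slices = math.ceil(math.sqrt(num_leaves))
    slice_size = num_slices * capacity  # points per vertical slice
    by_x = sorted(points, key=lambda p: p[0])
    leaves = []
    for i in range(0, n, slice_size):
        # Within one vertical slice, order by y and cut into leaves.
        vertical_slice = sorted(by_x[i:i + slice_size], key=lambda p: p[1])
        for j in range(0, len(vertical_slice), capacity):
            leaves.append(vertical_slice[j:j + capacity])
    return leaves

def mbr(leaf):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a leaf."""
    xs = [p[0] for p in leaf]
    ys = [p[1] for p in leaf]
    return (min(xs), min(ys), max(xs), max(ys))

grid = [(x, y) for x in range(4) for y in range(4)]
leaves = str_pack_leaves(grid, capacity=4)
boxes = [mbr(leaf) for leaf in leaves]
```

    In a full STR build the same sort-and-tile step would be applied again to the bounding rectangles of the leaves to form the upper levels; that part is omitted here.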

    A user configurable implementation of B-trees

    The use of B-trees for achieving good performance for updates and retrievals in databases is well known, and many excellent implementations of B-trees are available. However, it is difficult to find B-trees that are easily configured and deployed into experimental systems. We undertake an implementation of B-trees from scratch that specifically addresses configurability and deployability. An XML file is used to store as well as document information such as the page formats of the B-tree nodes and details about the nature of records and keys. The behavior of the tree is encapsulated by commands for creating B-trees, inserting records into the tree, and retrieving records via the tree. The XML-based configuration together with the commands makes the deployment and functionality of the tree completely clear and straightforward.

    Advanced Map Matching Technologies and Techniques for Pedestrian/Wheelchair Navigation

    Due to the constantly increasing technical capabilities of mobile devices (such as smartphones), pedestrian/wheelchair navigation has recently attracted a high level of interest as a potential smartphone application. While vehicle navigation systems have already reached a certain level of maturity, pedestrian/wheelchair navigation services are still in their infancy. By comparison with vehicle navigation systems, a set of map matching requirements and challenges unique to pedestrian/wheelchair navigation is identified. To provide navigation assistance to pedestrians and wheelchair users, new map matching techniques need to be designed and developed. The main goal of this research is to investigate and develop advanced map matching technologies and techniques specific to pedestrian/wheelchair navigation services. As the first step in map matching, an adaptive candidate segment selection algorithm is developed to find candidate segments efficiently. Furthermore, to narrow down the search for the correct segment, advanced mathematical models are applied. GPS-based chain-code map matching, Hidden Markov Model (HMM) map matching, and fuzzy-logic map matching algorithms are developed to estimate the real-time location of users in pedestrian/wheelchair navigation systems/services. Nevertheless, a GPS signal is not always available in areas with high-rise buildings, and even when there is a signal, the accuracy may not be high enough to localize pedestrians and wheelchair users on sidewalks. To overcome these shortcomings of GPS, multi-sensor integrated map matching algorithms are investigated and developed in this research. These algorithms include a movement pattern recognition algorithm, using accelerometer and compass data, and a vision-based positioning algorithm to fill in signal gaps in GPS positioning. Experiments are conducted to evaluate the developed algorithms using real field test data (GPS coordinates and other sensor data). The experimental results show that the developed algorithms and the integrated sensors, i.e., monocular visual odometry, GPS, an accelerometer, and a compass, can provide high-quality and uninterrupted localization services in pedestrian/wheelchair navigation systems/services. The map matching techniques developed in this work can be applied to various pedestrian/wheelchair navigation applications, such as tracking senior citizens and children, or tourist service systems, and can be further utilized in building walking robots and automatic wheelchair navigation systems.

    On construction, performance, and diversification for structured queries on the semantic desktop

    [no abstract]

    O processo de refrescamento nos sistemas de data warehouse: guião de modelação conceptual da tarefa de extracção de dados

    Data Warehouse Systems (DWS) have become very popular in recent years for decision making, integrating data from internal and external sources into data warehouse stores. As time advances and the sources from which warehouse data is integrated change, the data warehouse contents must be regularly refreshed so that the warehouse data reflects the state of the underlying data sources. This dissertation proposes an approach whose main goals are to make explicit and document the data warehouse refreshment problem and to present a guideline for the conceptual modelling of data extraction, in order to enrich the subsequent design steps for the formal specification of the refreshment process. The contributions of our approach are twofold. First, it provides a detailed outline of the data warehouse refreshment problem, including the main concepts and issues that characterise the general domain of DWS (decision-making functionalities, data source integration approaches, and architecture components), the tasks and constraints of the refreshment process, and the main approaches available in the literature. Second, it proposes a guideline for UML-based conceptual modelling of data extraction, giving the sequence of steps for a designer to follow and the modelling constructs for representing the data extracted from the sources, according to the rules for isolating and extracting the data relevant to decision making.

    Differential buffer for a relational column store in-memory database

    At present, financial and analytical reporting has gained importance over operational reporting. The main difference is that operational reporting focuses on day-to-day operations and requires transaction-level detail, while financial and analytical reporting focuses on long-term operations and draws on multiple transactions. This situation, together with the evolution of hardware, has shaped relational databases over time. One of the resulting approaches is the current SAP HANA database. This database focuses on financial and analytical reporting without the use of data warehouses, but it also provides good capabilities for operational reporting. This was achieved through the use of a column store structure in main memory. However, preparing the data in the database holds up insertion performance. This document studies the possibility of using a buffer in a prototype based on the SAP HANA database architecture, with the goal of improving that performance. To assess the impact on the system of adding a buffer, multiple approaches have been implemented, tested, and carefully compared with each other and with the original prototype. (Degree in Computer Engineering.)
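
    The general idea of putting a write buffer in front of a read-optimized column might be sketched as below. This is a toy illustration under stated assumptions, not the prototype from the thesis and not SAP HANA's implementation: the class `BufferedColumn`, its dictionary-encoded main store, and the explicit `merge` step are all hypothetical choices made for the example.

```python
from bisect import bisect_left

class BufferedColumn:
    """Toy column: dictionary-encoded main store plus an append-only buffer."""

    def __init__(self):
        self.dictionary = []  # sorted distinct values of the main store
        self.codes = []       # one dictionary index per row of the main store
        self.buffer = []      # recent inserts, unsorted and uncompressed

    def insert(self, value):
        # Cheap append: the sorted dictionary is not touched on insert.
        self.buffer.append(value)

    def count(self, value):
        # Reads must consult both the buffer and the main store.
        hits = self.buffer.count(value)
        i = bisect_left(self.dictionary, value)
        if i < len(self.dictionary) and self.dictionary[i] == value:
            hits += self.codes.count(i)
        return hits

    def merge(self):
        # Fold the buffer into the main store and rebuild the encoding.
        rows = [self.dictionary[c] for c in self.codes] + self.buffer
        self.dictionary = sorted(set(rows))
        index = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [index[v] for v in rows]
        self.buffer = []

col = BufferedColumn()
for v in ["b", "a", "b"]:
    col.insert(v)
col.merge()
col.insert("c")
```

    The trade-off shown is that inserts stay cheap because they skip dictionary maintenance, while reads pay for checking two structures until the next merge.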

    An evaluation of generic bulk loading techniques

    Bulk loading refers to the process of creating an index from scratch for a given data set. This problem is well understood for B-trees, but so far non-traditional index structures have received only modest attention. We are particularly interested in fast generic bulk loading techniques whose implementations employ only a small interface that is satisfied by a broad class of index structures. Generic techniques are very attractive to extensible database systems, since different user-implemented index structures implementing that small interface can be bulk-loaded without any modification of the generic code. The main contribution of the paper is the proposal of two new generic and conceptually simple bulk loading algorithms. These algorithms recursively partition the input by using a main-memory index of the same type as the target index to be built. In contrast to previous generic bulk loading algorithms, our new algorithms turn out to be much easier to implement. Another advantage is that our new algorithms have fewer parameters whose settings have to be taken into consideration. An experimental performance comparison is presented in which different bulk loading algorithms are investigated in a system-like scenario. Our experiments are unique in the sense that we examine the same code for different index structures (R-tree and Slim-tree). The results consistently indicate that our new algorithms outperform asymptotically worst-case optimal competitors. Moreover, the search quality of the target index is better when our new bulk loading algorithms are used. *This work has been supported by grant no. SE 553/2-1 from DFG.
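
    For the well-understood B-tree case mentioned above, bulk loading from already sorted keys can be sketched as a bottom-up build. This is a minimal illustration of the textbook approach, not one of the generic algorithms proposed in the paper; the names `Node`, `bulk_load`, `contains`, and `height` and the `fanout` parameter are assumptions made for the example.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys          # leaf: stored keys; inner: smallest key of each child
        self.children = children  # None for leaves

def bulk_load(sorted_keys, fanout=4):
    """Build a B+-tree-like index bottom-up from sorted, distinct keys.

    Simplification: the last node of a level may be underfull, which a
    real B-tree would rebalance.
    """
    if not sorted_keys:
        return Node([])
    # Pack the leaf level by cutting the sorted run into fixed-size chunks.
    level = [Node(sorted_keys[i:i + fanout])
             for i in range(0, len(sorted_keys), fanout)]
    # Build each parent level from the level below until one root remains.
    while len(level) > 1:
        groups = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        level = [Node([child.keys[0] for child in group], group)
                 for group in groups]
    return level[0]

def contains(root, key):
    node = root
    while node.children is not None:
        # Descend into the last child whose smallest key is <= key.
        i = bisect_right(node.keys, key) - 1
        if i < 0:
            return False
        node = node.children[i]
    return key in node.keys

def height(root):
    levels = 1
    while root.children is not None:
        root = root.children[0]
        levels += 1
    return levels

root = bulk_load(list(range(0, 100, 2)), fanout=4)
```

    Each key is written once and no node is ever split, which is the usual reason this style of build is faster than inserting the same keys one by one.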