
    A link-based storage scheme for efficient aggregate query processing on clustered road networks

    Cataloged from PDF version of article. The need for efficient storage schemes for spatial networks is apparent when the volume of query processing in some road networks (e.g., navigation systems) is considered. Specifically, under the assumption that the road network is stored in a central server, the adjacent data elements in the network must be clustered on the disk in such a way that the number of disk page accesses is kept minimal during the processing of network queries. In this work, we introduce the link-based storage scheme for clustered road networks and compare it with the previously proposed junction-based storage scheme. In order to investigate the performance of aggregate network queries in clustered road networks, we extend our recently proposed clustering hypergraph model from junction-based storage to link-based storage. We propose techniques for additional storage savings in bidirectional networks that make the link-based storage scheme even more preferable in terms of storage efficiency. We evaluate the performance of our link-based storage scheme against the junction-based storage scheme both theoretically and empirically. The results of the experiments conducted on a wide range of road network datasets show that the link-based storage scheme is preferable in terms of both storage and query processing efficiency. (C) 2009 Elsevier B.V. All rights reserved.

    Reproducible geoscientific modelling with hypergraphs

    Reproducing the construction of a geoscientific model is a hard task. It requires the availability of all required data and an exact description of how the construction was performed. In practice, data availability and the exactness of the description are often lacking. As part of this thesis, I introduce a conceptual framework describing how geoscientific model constructions can be represented as directed acyclic hypergraphs, how such recorded construction graphs can be used to reconstruct the model, and how repeated constructions can be used to verify the reproducibility of a geoscientific model construction process. In addition, I present a software prototype implementing these concepts. The prototype is tested with three different case studies, including a geophysical measurement analysis, a subsurface model construction, and the calculation of a hydrological balance model.
    Contents:
    1. Introduction: 1.1. Survey on Reproducibility and Automation for Geoscientific Model Construction; 1.2. Motivating Example; 1.3. Previous Work; 1.4. Problem Description; 1.5. Structure of this Thesis; 1.6. Results Accomplished by this Thesis
    2. Terms, Definitions and Requirements: 2.1. Terms and Definitions (2.1.1. Geoscientific model; 2.1.2. Reproducibility; 2.1.3. Realisation); 2.2. Requirements
    3. Related Work: 3.1. Overview; 3.2. Geoscientific Data Storage Systems (3.2.1. PostGIS and Similar Systems; 3.2.2. Geoscience in Space and Time (GST)); 3.3. Geoscientific Modelling Software (3.3.1. gOcad; 3.3.2. GemPy); 3.4. Experimentation Management Software (3.4.1. DataLad; 3.4.2. Data Version Control (DVC)); 3.5. Reproducible Software Builds; 3.6. Summarised Related Work
    4. Concept: 4.1. Construction Hypergraphs (4.1.1. Reproducibility Based on Construction Hypergraphs; 4.1.2. Equality definitions; 4.1.3. Design Constraints); 4.2. Data Handling
    5. Design: 5.1. Application Structure (5.1.1. Choice of Application Architecture for GeoHub); 5.2. Extension Mechanisms (5.2.1. Overview; 5.2.2. A Shared Library Based Extension System; 5.2.3. Inter-Process Communication Based Extension System; 5.2.4. An Extension System Based on a Scripting Language; 5.2.5. An Extension System Based on a WebAssembly Interface; 5.2.6. Comparison); 5.3. Data Storage (5.3.1. Overview; 5.3.2. Stored Data; 5.3.3. Potential Solutions; 5.3.4. Model Versioning; 5.3.5. Transactional security)
    6. Implementation: 6.1. General Application Structure; 6.2. Data Storage (6.2.1. Database; 6.2.2. User-provided Data-processing Extensions); 6.3. Operation Executor (6.3.1. Construction Step Descriptions; 6.3.2. Construction Step Scheduling; 6.3.3. Construction Step Execution)
    7. Case Studies: 7.1. Overview; 7.2. Geophysical Model of the BHMZ block (7.2.1. Provided Data and Initial Situation; 7.2.2. Construction Process Description; 7.2.3. Reproducibility; 7.2.4. Identified Problems and Construction Process Improvements; 7.2.5. Recommendations); 7.3. Three-Dimensional Subsurface Model of the Kolhberg Region (7.3.1. Provided Data and Initial Situation; 7.3.2. Construction Process Description; 7.3.3. Reproducibility; 7.3.4. Identified Problems and Construction Process Improvements; 7.3.5. Recommendations); 7.4. Hydrologic Balance Model of a Saxonian Stream (7.4.1. Provided Data and Initial Situation; 7.4.2. Construction Process Description; 7.4.3. Reproducibility; 7.4.4. Identified Problems and Construction Process Improvements; 7.4.5. Recommendations); 7.5. Lessons Learned
    8. Conclusions: 8.1. Summary; 8.2. Outlook (8.2.1. Parametric Model Construction Process; 8.2.2. Pull and Push Nodes; 8.2.3. Parallelize Single Construction Steps; 8.2.4. Provable Model Construction Process Attestation)
    References Appendi
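
    The idea of recording a construction as a directed acyclic hypergraph and replaying it to check reproducibility can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the step table `STEPS`, the helpers `replay` and `fingerprint`, and the use of a SHA-256 hash of `repr` as the equality test are hypothetical and are not taken from the thesis or its prototype. Each step acts as a hyperedge that consumes several named artifacts and produces several others.

```python
import hashlib
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical construction steps. Each entry is one hyperedge:
# (input artifact names, output artifact names, function returning a tuple of outputs).
STEPS = {
    "clean": (("raw",), ("cleaned",), lambda raw: (sorted(set(raw)),)),
    "stats": (("cleaned",), ("lo", "hi"), lambda c: (min(c), max(c))),
    "model": (
        ("cleaned", "lo", "hi"),
        ("model",),
        lambda c, lo, hi: ([(x - lo) / (hi - lo) for x in c],),
    ),
}


def replay(steps, sources):
    """Re-run every step in dependency order and return all artifacts."""
    artifacts = dict(sources)
    # Map each produced artifact to the step that produces it.
    producer = {out: name for name, (_, outs, _) in steps.items() for out in outs}
    # A step depends on the steps that produce its inputs (source data has no producer).
    deps = {
        name: {producer[i] for i in ins if i in producer}
        for name, (ins, _, _) in steps.items()
    }
    # static_order() raises CycleError if the recorded graph is not acyclic.
    for name in TopologicalSorter(deps).static_order():
        ins, outs, fn = steps[name]
        results = fn(*(artifacts[i] for i in ins))
        artifacts.update(zip(outs, results))
    return artifacts


def fingerprint(artifacts):
    """Hash each artifact so two runs can be compared (assumed equality test)."""
    return {
        key: hashlib.sha256(repr(value).encode()).hexdigest()
        for key, value in artifacts.items()
    }


first = replay(STEPS, {"raw": [3, 1, 2, 3]})
second = replay(STEPS, {"raw": [3, 1, 2, 3]})
reproducible = fingerprint(first) == fingerprint(second)
print(first["model"], reproducible)
```

    In this toy setting, a construction counts as reproducible when two replays from the same source data yield identical fingerprints for every artifact. A real system would need a domain-appropriate notion of equality, for example one that tolerates floating-point noise.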

    Efficient successor retrieval operations for aggregate query processing on clustered road networks

    Cataloged from PDF version of article. Get-Successors (GS), which retrieves all successors of a junction, is a kernel operation used to facilitate aggregate computations in road network queries. Efficient implementation of the GS operation is crucial since the disk access cost of this operation constitutes a considerable portion of the total query processing cost. First, we propose a new successor retrieval operation, Get-Unevaluated-Successors (GUS), which retrieves only the unevaluated successors of a given junction. The GUS operation is an efficient implementation of the GS operation, where the candidate successors to be retrieved are pruned according to the properties and state of the algorithm. Second, we propose a hypergraph-based model for clustering junctions that are successively retrieved by GUS operations onto the same pages. The proposed model utilizes query logs to correctly capture the disk access cost of GUS operations. The proposed GUS operation and associated clustering model are evaluated for two different instances of GUS operations, which typically arise in Dijkstra's single-source shortest-path algorithm and in the incremental network expansion framework. Our simulation results show that the proposed successor retrieval operation together with the proposed clustering hypergraph model is quite effective in reducing the number of disk accesses in query processing. (C) 2010 Published by Elsevier Inc.
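
    The difference between GS and GUS inside Dijkstra's algorithm can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: "unevaluated" is taken to mean "not yet settled", the function name `dijkstra` and the flag `prune_settled` are hypothetical, and the number of successor records touched stands in for disk accesses.

```python
import heapq


def dijkstra(graph, source, prune_settled):
    """Return (shortest distances, number of successor records retrieved).

    prune_settled=False mimics GS: every successor of a junction is retrieved.
    prune_settled=True mimics GUS (as assumed here): settled successors are skipped.
    """
    dist = {source: 0}
    settled = set()
    retrieved = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale heap entry
        settled.add(u)
        for v, w in graph[u]:
            if prune_settled and v in settled:
                continue  # a settled junction cannot be improved, so do not fetch it
            retrieved += 1
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist, retrieved


# Toy bidirectional road network: junction -> [(successor, link cost), ...]
graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("a", 4), ("b", 2), ("d", 1)],
    "d": [("b", 5), ("c", 1)],
}

gs_dist, gs_count = dijkstra(graph, "a", prune_settled=False)
gus_dist, gus_count = dijkstra(graph, "a", prune_settled=True)
print(gs_dist == gus_dist, gs_count, gus_count)
```

    Both variants compute the same distances; the pruned variant simply touches fewer successor records, which is the quantity a page-clustering model would then try to localize on disk.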

    Replicated partitioning for undirected hypergraphs

    Cataloged from PDF version of article. Hypergraph partitioning (HP) and replication are diverse but powerful tools that are traditionally applied separately to minimize the costs of parallel and sequential systems that access related data or process related tasks. When combined, these two techniques have the potential to achieve significant performance improvements in many applications. In this study, we provide an approach involving a tool that simultaneously performs replication and partitioning of the vertices of an undirected hypergraph whose vertices represent data and whose nets represent task dependencies among these data. In this approach, we propose an iterative-improvement-based replicated bipartitioning heuristic, which is capable of moving, replicating, and unreplicating vertices. In order to utilize our replicated bipartitioning heuristic in a recursive bipartitioning framework, we also propose appropriate cut-net removal, cut-net splitting, and pin selection algorithms to correctly encapsulate the two most commonly used cutsize metrics. We embed our replicated bipartitioning scheme into the state-of-the-art multilevel HP tool PaToH to provide an effective and efficient replicated HP tool, rpPaToH. The performance of the proposed techniques and the developed tools is tested on undirected hypergraphs that model the communication costs of parallel query processing in information retrieval systems. Our experimental analysis indicates that the proposed technique provides significant improvements in the quality of the partitions, especially under low replication ratios. (C) 2012 Elsevier Inc. All rights reserved.
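
    How replication can reduce the cutsize of a bipartition can be shown with a small sketch. The rule used here is an assumption for illustration rather than the paper's algorithm: a replicated vertex is available in both parts, so a net counts as cut only if its non-replicated pins lie in both parts. The function name `cut_nets` and the example data are hypothetical. With unit net costs and two parts, the commonly used cut-net and connectivity-1 metrics both reduce to the number of cut nets.

```python
def cut_nets(nets, part, replicated=frozenset()):
    """Return the names of the nets cut by a bipartition.

    nets:       dict mapping net name -> list of pins (vertices)
    part:       dict mapping vertex -> 0 or 1 (home part of each vertex)
    replicated: vertices assumed to have a copy in both parts
    """
    cut = []
    for name, pins in nets.items():
        # Only pins without a replica constrain which part must serve the net.
        sides = {part[v] for v in pins if v not in replicated}
        if len(sides) == 2:
            cut.append(name)
    return cut


# Toy hypergraph: three nets (tasks) over four vertices (data items).
nets = {"n1": ["a", "b"], "n2": ["b", "c"], "n3": ["c", "d"]}
part = {"a": 0, "b": 0, "c": 1, "d": 1}

without_replication = cut_nets(nets, part)
with_replication = cut_nets(nets, part, replicated={"b"})
print(without_replication, with_replication)
```

    Replicating the single vertex `b` removes the only cut net in this example, which loosely mirrors the abstract's observation that low replication ratios can already improve partition quality.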