616 research outputs found

    A storage and access architecture for efficient query processing in spatial database systems

    Get PDF
    Due to the high complexity of objects and queries, and also due to extremely large data volumes, geographic database systems impose stringent requirements on their storage and access architecture with respect to efficient query processing. Performance-improving concepts such as spatial storage and access structures, approximations, object decompositions and multi-phase query processing have been suggested and analyzed as single building blocks. In this paper, we describe a storage and access architecture which is composed of the above building blocks in a modular fashion. Additionally, we incorporate into our architecture a new ingredient, the scene organization, for efficiently supporting set-oriented access of large-area region queries. An experimental performance comparison demonstrates that the concept of scene organization leads to considerable performance improvements for large-area region queries, by a factor of up to 150.
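    The multi-phase query processing mentioned above is commonly realized as a filter-and-refine pipeline: a cheap test on approximations produces candidates, and an exact geometry test refines them. The following Python sketch illustrates that general pattern only; the helper names and object layout are assumptions, not the architecture from the paper.

        # Filter-and-refine sketch of multi-phase query processing (illustrative).
        def mbr_intersects(a, b):
            # Axis-aligned rectangles as (xmin, ymin, xmax, ymax) tuples.
            return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

        def region_query(objects, query_rect, exact_test):
            # Phase 1: cheap filter on minimum bounding rectangles -> candidates.
            candidates = [o for o in objects if mbr_intersects(o.mbr, query_rect)]
            # Phase 2: exact geometry test on the few remaining candidates.
            return [o for o in candidates if exact_test(o.geometry, query_rect)]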

    The performance of object decomposition techniques for spatial query processing

    Get PDF

    Possibilities and Limits in Visualizing Large Amounts of Multidimensional Data

    Get PDF

    Multi-Step Processing of Spatial Joins

    Get PDF
    Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects, returning a set of candidates. Various approaches for accelerating this step of join processing were examined at last year's conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by the following two steps. In the second step, sophisticated approximations are used to identify answers as well as to filter out false hits from the set of candidates. For this purpose, we investigate various types of conservative and progressive approximations. In the last step, the exact geometry of the remaining candidates has to be tested against the join predicate. The time required for computing spatial join predicates can be substantially reduced when objects are adequately organized in main memory. In our approach, objects are first decomposed into simple components which are exclusively organized by a main-memory-resident spatial data structure. Overall, we present a complete approach to spatial join processing on complex spatial objects. The performance of the individual steps of our approach is evaluated with data sets from real cartographic applications. The results show that our approach reduces the total execution time of the spatial join by significant factors.
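    As a rough illustration of the three steps (MBR join, approximation-based filtering, exact geometry test), consider this Python sketch. The predicate functions are placeholders, not the paper's implementation; in particular, the main-memory organization of decomposed components is omitted.

        # Three-step spatial join sketch (illustrative only).
        def mbr_intersects(a, b):
            return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

        def spatial_join(R, S, progressive_test, conservative_test, exact_test):
            answers = []
            # Step 1: join on minimum bounding rectangles yields candidate pairs.
            candidates = [(r, s) for r in R for s in S
                          if mbr_intersects(r.mbr, s.mbr)]
            for r, s in candidates:
                # Step 2: progressive approximations identify definite answers,
                # conservative approximations filter out definite false hits.
                if progressive_test(r, s):
                    answers.append((r, s))
                elif conservative_test(r, s) and exact_test(r.geometry, s.geometry):
                    # Step 3: exact geometry test on the remaining candidates.
                    answers.append((r, s))
            return answers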

    Efficient Processing of Spatial Joins Using R-Trees

    Get PDF
    In this paper, we show that spatial joins are well suited to processing on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU and I/O cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems.
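    For orientation, here is a minimal Python sketch of the task-based pattern described above (task creation, assignment, parallel execution). A process pool's shared work queue stands in for dynamic load balancing; everything else (shared virtual memory, spatial task creation, task splitting) is omitted, and all names are assumptions.

        # Task-parallel join skeleton (illustrative; not the paper's algorithm).
        from concurrent.futures import ProcessPoolExecutor

        def parallel_join(tasks, join_task, workers=8):
            # tasks: spatially clustered work units; join_task(t) joins one unit.
            # With chunksize=1, idle workers pull the next pending task,
            # a simple form of dynamic load balancing.
            results = []
            with ProcessPoolExecutor(max_workers=workers) as pool:
                for partial in pool.map(join_task, tasks, chunksize=1):
                    results.extend(partial)
            return results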

    Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently

    Full text link
    In this paper we define the notion of a probabilistic neighborhood in spatial data: Let a set P of n points in ℝ^d, a query point q ∈ ℝ^d, a distance metric dist, and a monotonically decreasing function f : ℝ⁺ → [0, 1] be given. Then a point p ∈ P belongs to the probabilistic neighborhood N(q, f) of q with respect to f with probability f(dist(p, q)). We envision applications in facility location, sensor networks, and other scenarios where a connection between two entities becomes less likely with increasing distance. A straightforward query algorithm would determine a probabilistic neighborhood in Θ(n·d) time by probing each point in P. To answer the query in sublinear time for the planar case, we augment a quadtree suitably and design a corresponding query algorithm. Our theoretical analysis shows that, for certain distributions of planar P, our algorithm answers a query in O((|N(q, f)| + √n) log n) time with high probability (whp). This matches, up to a logarithmic factor, the cost induced by quadtree-based algorithms for deterministic queries and is asymptotically faster than the straightforward approach whenever |N(q, f)| ∈ o(n / log n). As practical proofs of concept we use two applications, one in the Euclidean and one in the hyperbolic plane. In particular, our results yield the first generator for random hyperbolic graphs with arbitrary temperatures in subquadratic time. Moreover, our experimental data show the usefulness of our algorithm even if the point distribution is unknown or not uniform: the running time savings over the pairwise probing approach constitute at least one order of magnitude already for a modest number of points and queries.
    Comment: The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-44543-4_3
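    The straightforward Θ(n·d) baseline mentioned in the abstract is easy to state; here is a Python sketch of it (the quadtree-based sublinear algorithm is the paper's actual contribution and is not reproduced here).

        # Baseline probabilistic-neighborhood query by probing every point.
        import math, random

        def prob_neighborhood(P, q, f):
            # P: list of d-dimensional points; q: query point;
            # f: monotonically decreasing map from distance to [0, 1].
            def dist(p, q):
                # Euclidean metric; the definition permits any metric.
                return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
            # Each point is included independently with probability f(dist(p, q)).
            return [p for p in P if random.random() < f(dist(p, q))]

        # Example: inclusion probability decays exponentially with distance.
        # nbrs = prob_neighborhood([(0.1, 0.2), (3.0, 4.0)], (0.0, 0.0),
        #                          lambda r: math.exp(-r))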

    Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms

    Full text link
    Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is the popular algorithm Partitioning Around Medoids (PAM), also simply referred to as k-medoids. In Euclidean geometry the mean, as used in k-means, is a good estimator for the cluster center, but this does not hold for arbitrary dissimilarities. PAM uses the medoid instead, the object with the smallest dissimilarity to all others in the cluster. This notion of centrality can be used with any (dis-)similarity, and thus is of high relevance to many domains, such as biology, that require the use of Jaccard, Gower, or more complex distances. A key issue with PAM is its high run-time cost. We propose modifications to the PAM algorithm that achieve an O(k)-fold speedup in the second SWAP phase of the algorithm while still finding the same results as the original PAM algorithm. If we slightly relax the choice of swaps performed (at comparable quality), we can further accelerate the algorithm by performing up to k swaps in each iteration. With the substantially faster SWAP, we can now also explore alternative strategies for choosing the initial medoids. We also show how the CLARA and CLARANS algorithms benefit from these modifications. They can easily be combined with earlier approaches to use PAM and CLARA on big data (some of which use PAM as a subroutine, and hence immediately benefit from these improvements), where the performance with high k becomes increasingly important. In experiments on real data with k=100, we observed a 200-fold speedup compared to the original PAM SWAP algorithm, making PAM applicable to larger data sets as long as we can afford to compute a distance matrix, and in particular to higher k (at k=2, the new SWAP was only 1.5 times faster, as the speedup is expected to increase with k).
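    For contrast with the accelerated SWAP, the classic PAM SWAP phase evaluates every (medoid, non-medoid) exchange and applies the best one until no exchange improves the clustering. A naive Python sketch of that baseline follows; the paper's O(k)-fold speedup comes from much more careful incremental updates, which are not shown.

        # Naive PAM SWAP baseline (deliberately simple; costly for large n, k).
        def total_deviation(D, medoids):
            # D: precomputed dissimilarity matrix; medoids: list of indices.
            return sum(min(D[i][m] for m in medoids) for i in range(len(D)))

        def pam_swap(D, medoids):
            # Repeatedly apply the single best improving exchange, if any.
            while True:
                best_td, best = total_deviation(D, medoids), None
                for mi in range(len(medoids)):
                    for x in range(len(D)):
                        if x in medoids:
                            continue
                        cand = medoids[:mi] + [x] + medoids[mi + 1:]
                        td = total_deviation(D, cand)
                        if td < best_td:
                            best_td, best = td, cand
                if best is None:
                    return medoids
                medoids = best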

    How Do You Like Me in This: User Embodiment Preferences for Companion Agents

    Get PDF
    We investigate the relationship between the embodiment of an artificial companion and users' perception of and interaction with it. In a Wizard of Oz study, 42 users interacted with one of two embodiments, a physical robot or a virtual agent on a screen, through a role-play of secretarial tasks in an office, with the companion providing essential assistance. Findings showed that participants in both condition groups, when given the choice, would prefer to interact with the robot companion, mainly for its greater physical or social presence. Subjects also found the robot less annoying and talked to it more naturally. However, this preference for the robotic embodiment is not reflected in the users' actual rating of the companion or their interaction with it. We reflect on this contradiction and conclude that in a task-based context a user focuses much more on a companion's behaviour than on its embodiment. This underlines the feasibility of our efforts in creating companions that migrate between embodiments while maintaining a consistent identity from the user's point of view.

    Query processing of spatial objects: Complexity versus Redundancy

    Get PDF
    The management of complex spatial objects in applications such as geography and cartography imposes stringent new requirements on spatial database systems, in particular on efficient query processing. As shown before, the performance of spatial query processing can be improved by decomposing complex spatial objects into simple components. Up to now, only decomposition techniques generating a linear number of very simple components, e.g. triangles or trapezoids, have been considered. In this paper, we investigate the natural trade-off between the complexity of the components and the redundancy, i.e. the number of components, with respect to its effect on efficient query processing. In particular, we present two new decomposition methods that generate a better balance between the complexity and the number of components than previously known techniques. We compare these new decomposition methods to the traditional undecomposed representation as well as to the well-known decomposition into convex polygons with respect to their performance in spatial query processing. This comparison shows that, for a wide range of query selectivities, the new decomposition techniques clearly outperform both the undecomposed representation and the convex decomposition method. More important than the absolute performance gain of up to an order of magnitude is the robust performance of our new decomposition techniques over the whole range of query selectivities.
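    To make the redundancy trade-off concrete: after decomposition, each object is stored as several simpler components, each indexed separately, and an object qualifies as soon as any of its components passes the query test. A hedged Python sketch follows, with the decomposition itself left abstract; all names are illustrative.

        # Querying decomposed objects (illustrative). Finer decompositions give
        # cheaper exact tests per component but more components (redundancy).
        def mbr_intersects(a, b):
            return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

        def query_decomposed(components, query_rect, exact_test):
            # components: (object_id, mbr, simple_geometry) triples produced by
            # some decomposition, e.g. into trapezoids or convex polygons.
            hits = set()
            for oid, mbr, geom in components:
                if oid in hits:
                    continue  # object already reported; skip its other components
                if mbr_intersects(mbr, query_rect) and exact_test(geom, query_rect):
                    hits.add(oid)
            return hits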