417 research outputs found

    Performance comparison of point and spatial access methods

    In the past few years a large number of multidimensional point access methods, also called multiattribute index structures, have been suggested, all of them claiming good performance. Since no performance comparison of these structures under arbitrary (strongly correlated, nonuniform; in short, "ugly") data distributions and under various types of queries has been performed, database researchers and designers were hesitant to use any of these new point access methods. As shown in a recent paper, such point access methods are not only important in traditional database applications. In new applications such as CAD/CIM and geographic or environmental information systems, access methods for spatial objects are needed. As recently shown, such access methods are based on point access methods in terms of functionality and performance. Our performance comparison naturally consists of two parts. In part I we compare multidimensional point access methods, whereas in part II spatial access methods for rectangles are compared. In part I we present a survey and classification of existing point access methods. Then we carefully select the following four methods for implementation and performance comparison under seven different data files (distributions) and various types of queries: the 2-level grid file, the BANG file, the hB-tree and a new scheme, called the BUDDY hash tree. We were surprised to see one method emerge as the clear winner: the BUDDY hash tree. It exhibits an average performance at least 20% better than that of its competitors and is robust under ugly data and queries. In part II we compare spatial access methods for rectangles. After presenting a survey and classification of existing spatial access methods, we carefully selected the following four methods for implementation and performance comparison under six different data files (distributions) and various types of queries: the R-tree, the BANG file, PLOP hashing and the BUDDY hash tree. The result presented two winners: the BANG file and the BUDDY hash tree. This comparison is a first step towards a standardized testbed or benchmark. We offer our data and query files to each designer of a new point or spatial access method so that they can run their implementation in our testbed.

    SVS-JOIN : efficient spatial visual similarity join for geo-multimedia

    In the big data era, massive amounts of multimedia data with geo-tags have been generated and collected by smart devices equipped with mobile communication and position sensor modules. This trend places higher demands on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial databases. Previous works focused on the spatial textual document search problem rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search for similar geo-image pairs in terms of both geo-location and visual content. First, we propose the definition of SVS-JOIN and then present the geographical similarity and visual similarity measurements. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm and visual similarity. In addition, an extension named SVS-JOIN G is developed, which utilizes a spatial grid strategy to improve the search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently.
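The grid idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the SVS-JOIN G algorithm from the paper: the function names, the tuple layout `(id, x, y, visual_words)`, the use of Euclidean distance, and the use of Jaccard similarity over sets of visual words are all assumptions made for illustration. The sketch only shows how bucketing points into grid cells limits candidate pairs to neighbouring cells.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def grid_similarity_join(items, max_dist, min_visual_sim):
    """Return the set of id pairs (i, j), i < j, that are close in space
    and similar in visual content.

    items: iterable of (id, x, y, visual_words) tuples (hypothetical layout).
    max_dist: positive distance threshold, also used as the grid cell size.
    """
    cell = max_dist
    grid = defaultdict(list)
    for item in items:
        _, x, y, _ = item
        grid[(floor(x / cell), floor(y / cell))].append(item)

    results = set()
    for (cx, cy), bucket in grid.items():
        # With cell size == max_dist, any match lies in the same
        # cell or in one of the 8 neighbouring cells.
        for dx, dy in product((-1, 0, 1), repeat=2):
            neighbours = grid.get((cx + dx, cy + dy), ())
            for a in bucket:
                for b in neighbours:
                    if (a[0] < b[0]
                            and hypot(a[1] - b[1], a[2] - b[2]) <= max_dist
                            and jaccard(a[3], b[3]) >= min_visual_sim):
                        results.add((a[0], b[0]))
    return results
```

Calling `grid_similarity_join(items, max_dist=1.0, min_visual_sim=0.5)` on a small list of such tuples returns only the pairs that pass both thresholds, without comparing points in distant cells.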

    THREE TEMPORAL PERSPECTIVES ON DECENTRALIZED LOCATION-AWARE COMPUTING: PAST, PRESENT, FUTURE

    During the past four decades, miniaturization has made computing devices ubiquitous and pervasive. Today, the number of objects connected to the Internet is increasing at a rapid pace and this trend does not seem to be slowing down. These objects, which can be smartphones, vehicles, or any kind of sensor, generate large amounts of data that are almost always associated with a spatio-temporal context. The amount of data is often so large that processing it requires a distributed system, which involves the cooperation of several computers. The ability to process these data is important for society. For example, the data collected during car journeys already make it possible to avoid traffic jams or to share a vehicle. As another example, in the near future, maintenance interventions on the road network will be planned with data collected using gyroscopes that detect potholes. The application domains are therefore numerous, as are the problems associated with them. The articles that make up this thesis deal with systems that share two key characteristics: a spatio-temporal context and a decentralized architecture. In addition, the systems described in these articles revolve around three temporal perspectives: the present, the past, and the future. Systems associated with the present perspective enable a very large number of connected objects to communicate in near real-time, according to a spatial context. Our contributions in this area enable this type of decentralized system to be scaled out on commodity hardware, i.e., to adapt as the volume of data arriving in the system increases. Systems associated with the past perspective, often referred to as trajectory indexes, are intended to provide access to the large volume of spatio-temporal data collected by connected objects. Our contributions in this area make it possible to handle particularly dense trajectory datasets, a problem that has not been addressed previously. Finally, systems associated with the future perspective rely on past trajectories to predict the trajectories that the connected objects will follow. Our contributions predict the trajectories followed by connected objects with a previously unmet granularity. Although involving different domains, these contributions are structured around the common denominators of the underlying systems, which opens the possibility of dealing with these problems more generically in the near future.

    Label Space Partition Selection for Multi-Object Tracking Using Two-Layer Partitioning

    Estimating the trajectories of multiple objects poses a significant challenge due to data association ambiguity, which leads to a substantial increase in computational requirements. To address such problems, a divide-and-conquer approach has been employed with parallel computation. In this strategy, distinguished objects that have unique labels are grouped based on their statistical dependencies, that is, the intersection of their predicted measurements. Several geometric approaches have been used for label grouping, since finding all intersecting label pairs is clearly infeasible for large-scale tracking problems. This paper proposes an efficient implementation of label grouping for the label-partitioned generalized labeled multi-Bernoulli filter framework using a secondary partitioning technique. This allows for parallel computation in the label graph indexing step and avoids generating and eliminating duplicate comparisons. Additionally, we compare the performance of the proposed technique with several efficient spatial searching algorithms. The results demonstrate the superior performance of the proposed approach on large-scale data sets, enabling scalable trajectory estimation.
    Comment: 6 pages, 4 figures
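Label grouping by intersecting predicted measurements, as described in the abstract above, can be pictured with a small Python sketch. It is not the two-layer partitioning technique of the paper: the name `group_labels`, the representation of each label's predicted measurement region as an axis-aligned box `(xmin, ymin, xmax, ymax)`, and the sort-and-sweep pruning are assumptions chosen for illustration. Labels whose boxes overlap, directly or through a chain of overlaps, end up in the same group via a union-find structure.

```python
def group_labels(boxes):
    """Group labels whose boxes intersect, directly or transitively.

    boxes: dict mapping label -> (xmin, ymin, xmax, ymax) (hypothetical layout).
    Returns a sorted list of sorted label groups.
    """
    parent = {label: label for label in boxes}

    def find(x):
        # Find the group representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Sweep over boxes sorted by xmin so each pair is examined at most once
    # and the inner loop stops as soon as boxes no longer overlap in x.
    order = sorted(boxes, key=lambda label: boxes[label][0])
    for i, a in enumerate(order):
        ax0, ay0, ax1, ay1 = boxes[a]
        for b in order[i + 1:]:
            bx0, by0, bx1, by1 = boxes[b]
            if bx0 > ax1:
                break  # every later box starts even further to the right
            if by0 <= ay1 and ay0 <= by1:
                union(a, b)

    groups = {}
    for label in boxes:
        groups.setdefault(find(label), []).append(label)
    return sorted(sorted(group) for group in groups.values())
```

Each resulting group could then be processed independently, which is the property the divide-and-conquer strategy relies on.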

    Top-K Queries Over Digital Traces

    Recent advances in social and mobile technology have enabled an abundance of digital traces (in the form of mobile check-ins, WiFi hotspot handshakes, etc.) revealing the physical presence history of diverse sets of entities. One challenging, yet important, task is to identify the k entities that are most closely associated with a given query entity based on their digital traces. We propose a suite of hierarchical indexing techniques and algorithms to enable fast query processing for this problem at scale. We theoretically analyze the pruning effectiveness of the proposed methods based on a human mobility model, which we propose and validate in real-life situations. Finally, we conduct extensive experiments on both synthetic and real datasets at scale, evaluating the performance of our techniques and confirming the effectiveness and superiority of our approach over other applicable approaches across a variety of parameter settings and datasets.