417 research outputs found
Performance comparison of point and spatial access methods
In the past few years a large number of multidimensional point access methods, also called
multiattribute index structures, has been suggested, all of them claiming good performance. Since no
performance comparison of these structures under arbitrary (strongly correlated nonuniform, short
"ugly") data distributions and under various types of queries has been performed, database
researchers and designers were hesitant to use any of these new point access methods. As shown in
a recent paper, such point access methods are not only important in traditional database applications.
In new applications such as CAD/CIM and geographic or environmental information systems, access
methods for spatial objects are needed. As recently shown such access methods are based on point
access methods in terms of functionality and performance. Our performance comparison naturally
consists of two parts. In part I we w i l l compare multidimensional point access methods, whereas in
part I I spatial access methods for rectangles will be compared. In part I we present a survey and
classification of existing point access methods. Then we carefully select the following four methods
for implementation and performance comparison under seven different data files (distributions) and
various types of queries: the 2-level grid file, the BANG file, the hB-tree and a new scheme, called
the BUDDY hash tree. We were surprised to see one method to be the clear winner which was the
BUDDY hash tree. It exhibits an at least 20 % better average performance than its competitors and is
robust under ugly data and queries. In part I I we compare spatial access methods for rectangles.
After presenting a survey and classification of existing spatial access methods we carefully selected
the following four methods for implementation and performance comparison under six different data
files (distributions) and various types of queries: the R-tree, the BANG file, PLOP hashing and the
BUDDY hash tree. The result presented two winners: the BANG file and the BUDDY hash tree.
This comparison is a first step towards a standardized testbed or benchmark. We offer our data and
query files to each designer of a new point or spatial access method such that he can run his
implementation in our testbed
SVS-JOIN : efficient spatial visual similarity join for geo-multimedia
In the big data era, massive amount of multimedia data with geo-tags has been generated and collected by smart devices equipped with mobile communications module and position sensor module. This trend has put forward higher request on large-scale geo-multimedia retrieval. Spatial similarity join is one of the significant problems in the area of spatial database. Previous works focused on spatial textual document search problem, rather than geo-multimedia retrieval. In this paper, we investigate a novel geo-multimedia retrieval paradigm named spatial visual similarity join (SVS-JOIN for short), which aims to search similar geo-image pairs in both aspects of geo-location and visual content. Firstly, the definition of SVS-JOIN is proposed and then we present the geographical similarity and visual similarity measurement. Inspired by the approach for textual similarity join, we develop an algorithm named SVS-JOIN B by combining the PPJOIN algorithm and visual similarity. Besides, an extension of it named SVS-JOIN G is developed, which utilizes spatial grid strategy to improve the search efficiency. To further speed up the search, a novel approach called SVS-JOIN Q is carefully designed, in which a quadtree and a global inverted index are employed. Comprehensive experiments are conducted on two geo-image datasets and the results demonstrate that our solution can address the SVS-JOIN problem effectively and efficiently
THREE TEMPORAL PERSPECTIVES ON DECENTRALIZED LOCATION-AWARE COMPUTING: PAST, PRESENT, FUTURE
Durant les quatre derniĂšres dĂ©cennies, la miniaturisation a permis la diffusion Ă large Ă©chelle des ordinateurs, les rendant omniprĂ©sents. Aujourdâhui, le nombre dâobjets connectĂ©s Ă Internet ne cesse de croitre et cette tendance nâa pas lâair de ralentir. Ces objets, qui peuvent ĂȘtre des tĂ©lĂ©phones mobiles, des vĂ©hicules ou des senseurs, gĂ©nĂšrent de trĂšs grands volumes de donnĂ©es qui sont presque toujours associĂ©s Ă un contexte spatiotemporel. Le volume de ces donnĂ©es est souvent si grand que leur traitement requiert la crĂ©ation de systĂšme distribuĂ©s qui impliquent la coopĂ©ration de plusieurs ordinateurs. La capacitĂ© de traiter ces donnĂ©es revĂȘt une importance sociĂ©tale. Par exemple: les donnĂ©es collectĂ©es lors de trajets en voiture permettent aujourdâhui dâĂ©viter les em-bouteillages ou de partager son vĂ©hicule. Un autre exemple: dans un avenir proche, les donnĂ©es collectĂ©es Ă lâaide de gyroscopes capables de dĂ©tecter les trous dans la chaussĂ©e permettront de mieux planifier les interventions de maintenance Ă effectuer sur le rĂ©seau routier. Les domaines dâapplications sont par consĂ©quent nombreux, de mĂȘme que les problĂšmes qui y sont associĂ©s. Les articles qui composent cette thĂšse traitent de systĂšmes qui partagent deux caractĂ©ristiques clĂ©s: un contexte spatiotemporel et une architecture dĂ©centralisĂ©e. De plus, les systĂšmes dĂ©crits dans ces articles sâarticulent autours de trois axes temporels: le prĂ©sent, le passĂ©, et le futur. Les systĂšmes axĂ©s sur le prĂ©sent permettent Ă un trĂšs grand nombre dâobjets connectĂ©s de communiquer en fonction dâun contexte spatial avec des temps de rĂ©ponses proche du temps rĂ©el. Nos contributions dans ce domaine permettent Ă ce type de systĂšme dĂ©centralisĂ© de sâadapter au volume de donnĂ©e Ă traiter en sâĂ©tendant sur du matĂ©riel bon marchĂ©. Les systĂšmes axĂ©s sur le passĂ© ont pour but de faciliter lâaccĂšs a de trĂšs grands volumes donnĂ©es spatiotemporelles collectĂ©es par des objets connectĂ©s. En dâautres termes, il sâagit dâindexer des trajectoires et dâexploiter ces indexes. Nos contributions dans ce domaine permettent de traiter des jeux de trajectoires particuliĂšrement denses, ce qui nâavait pas Ă©tĂ© fait auparavant. Enfin, les systĂšmes axĂ©s sur le futur utilisent les trajectoires passĂ©es pour prĂ©dire les trajectoires que des objets connectĂ©s suivront dans lâavenir. Nos contributions permettent de prĂ©dire les trajectoires suivies par des objets connectĂ©s avec une granularitĂ© jusque lĂ inĂ©galĂ©e. Bien quâimpliquant des domaines diffĂ©rents, ces contributions sâarticulent autour de dĂ©nominateurs communs des systĂšmes sous-jacents, ouvrant la possibilitĂ© de pouvoir traiter ces problĂšmes avec plus de gĂ©nĂ©ricitĂ© dans un avenir proche.
--
During the past four decades, due to miniaturization computing devices have become ubiquitous and pervasive. Today, the number of objects connected to the Internet is in- creasing at a rapid pace and this trend does not seem to be slowing down. These objects, which can be smartphones, vehicles, or any kind of sensors, generate large amounts of data that are almost always associated with a spatio-temporal context. The amount of this data is often so large that their processing requires the creation of a distributed system, which involves the cooperation of several computers. The ability to process these data is important for society. For example: the data collected during car journeys already makes it possible to avoid traffic jams or to know about the need to organize a carpool. Another example: in the near future, the maintenance interventions to be carried out on the road network will be planned with data collected using gyroscopes that detect potholes. The application domains are therefore numerous, as are the prob- lems associated with them. The articles that make up this thesis deal with systems that share two key characteristics: a spatio-temporal context and a decentralized architec- ture. In addition, the systems described in these articles revolve around three temporal perspectives: the present, the past, and the future. Systems associated with the present perspective enable a very large number of connected objects to communicate in near real-time, according to a spatial context. Our contributions in this area enable this type of decentralized system to be scaled-out on commodity hardware, i.e., to adapt as the volume of data that arrives in the system increases. Systems associated with the past perspective, often referred to as trajectory indexes, are intended for the access to the large volume of spatio-temporal data collected by connected objects. Our contributions in this area makes it possible to handle particularly dense trajectory datasets, a problem that has not been addressed previously. Finally, systems associated with the future per- spective rely on past trajectories to predict the trajectories that the connected objects will follow. Our contributions predict the trajectories followed by connected objects with a previously unmet granularity. Although involving different domains, these con- tributions are structured around the common denominators of the underlying systems, which opens the possibility of being able to deal with these problems more generically in the near future
Label Space Partition Selection for Multi-Object Tracking Using Two-Layer Partitioning
Estimating the trajectories of multi-objects poses a significant challenge
due to data association ambiguity, which leads to a substantial increase in
computational requirements. To address such problems, a divide-and-conquer
manner has been employed with parallel computation. In this strategy,
distinguished objects that have unique labels are grouped based on their
statistical dependencies, the intersection of predicted measurements. Several
geometry approaches have been used for label grouping since finding all
intersected label pairs is clearly infeasible for large-scale tracking
problems. This paper proposes an efficient implementation of label grouping for
label-partitioned generalized labeled multi-Bernoulli filter framework using a
secondary partitioning technique. This allows for parallel computation in the
label graph indexing step, avoiding generating and eliminating duplicate
comparisons. Additionally, we compare the performance of the proposed technique
with several efficient spatial searching algorithms. The results demonstrate
the superior performance of the proposed approach on large-scale data sets,
enabling scalable trajectory estimation.Comment: 6 pages, 4 figure
Top-K Queries Over Digital Traces
Recent advances in social and mobile technology have enabled an abundance of digital traces (in the form of mobile check-ins, WiFi hotspots handshaking, etc.) revealing the physical presence history of diverse sets of entities. One challenging, yet important, task is to identify k entities that are most closely associated with a given query entity based on their digital traces. We propose a suite of hierarchical indexing techniques and algorithms to enable fast query processing for this problem at scale. We theoretically analyze the pruning effectiveness of the proposed methods based on a human mobility model which we propose and validate in real life situations. Finally, we conduct extensive experiments on both synthetic and real datasets at scale, evaluating the performance of our techniques, confirming the effectiveness and superiority of our approach over other applicable approaches across a variety of parameter settings and datasets
- âŠ