10 research outputs found

    PPQ-Trajectory : spatio-temporal quantization for querying in large trajectory repositories

    We present PPQ-trajectory, a spatio-temporal quantization-based solution for querying large dynamic trajectory data. PPQ-trajectory includes a partition-wise predictive quantizer (PPQ) that generates an error-bounded codebook with autocorrelation and spatial proximity-based partitions. The codebook is indexed to run approximate and exact spatio-temporal queries over compressed trajectories. PPQ-trajectory includes a coordinate quadtree coding for the codebook with support for exact queries. An incremental temporal partition-based index is utilised to avoid full reconstruction of trajectories during queries. An extensive set of experimental results for spatio-temporal queries on real trajectory datasets is presented. PPQ-trajectory shows significant improvements over the alternatives with respect to several performance measures, including the accuracy of results when the summary is used directly to provide approximate query results, the spatial deviation with which spatio-temporal path queries can be answered when the summary is used as an index, and the time taken to construct the summary. Superior results on the quality of the summary and the compression ratio are also demonstrated.
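The idea of an error-bounded predictive quantizer can be illustrated with a minimal sketch. This is not the PPQ algorithm from the paper: it assumes a trivial "previous reconstructed point" predictor and a uniform grid over residuals, and the names `quantize_trajectory` and `eps` are hypothetical.

```python
import math

def quantize_trajectory(points, eps):
    """Predictively quantize 2-D points so that every reconstructed point
    lies within Euclidean distance eps of the original (illustrative only)."""
    # A square grid cell with side eps*sqrt(2) has a half-diagonal of eps,
    # so snapping a residual to the nearest cell centre errs by at most eps.
    step = eps * math.sqrt(2)
    codebook = {}          # residual grid cell -> integer code
    codes = []             # one code per input point
    reconstructed = []
    prev = (0.0, 0.0)      # assumed initial prediction
    for x, y in points:
        # Residual against the prediction (the last *reconstructed* point),
        # so quantization errors do not accumulate along the trajectory.
        rx, ry = x - prev[0], y - prev[1]
        cell = (round(rx / step), round(ry / step))
        codes.append(codebook.setdefault(cell, len(codebook)))
        prev = (prev[0] + cell[0] * step, prev[1] + cell[1] * step)
        reconstructed.append(prev)
    return codebook, codes, reconstructed
```

Because each residual is taken against the reconstructed point rather than the original one, the error bound holds per point regardless of trajectory length.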

    Big data-driven prediction of airspace congestion

    Air Navigation Service Providers (ANSPs) worldwide have been making a considerable effort to develop a better method to measure and predict aircraft counts within a particular airspace, also referred to as airspace density. Accurate measurement and prediction of airspace density is crucial for a better managed airspace, both strategically and tactically, yielding a higher level of automation and thereby reducing the air traffic controller's workload. Although prior approaches have been able to address the problem to some extent, data management and query processing of the ever-increasing volume of air traffic data arriving at high rates, for various analytics purposes such as predicting aircraft counts, remains a challenge, especially when only linear prediction models are used. In this paper, we present a novel data management and prediction system that accurately predicts aircraft counts for a particular airspace sector within the National Airspace System (NAS). The incoming Traffic Flow Management (TFM) data is streaming, big, uncorrelated and noisy. In the preprocessing step, the system continuously processes the incoming raw data, reduces it to a compact size, and stores it in a NoSQL database, where it makes the data available for efficient query processing. In the prediction step, the system learns from historical trajectories and uses their segments to collect key features such as sector boundary crossings, weather parameters, and other air traffic data. The features are fed into various regression models, including linear, non-linear and ensemble models, and the best performing model is used for prediction. Evaluation on an extensive set of real track, weather, and air traffic data, including boundary crossings in the U.S., verifies that our system efficiently and accurately predicts aircraft counts in each airspace sector. Comment: Submitted to the 2023 IEEE/AIAA Digital Aviation Systems Conference (DASC

    Review and classification of trajectory summarisation algorithms: From compression to segmentation

    With the continuous development and cost reduction of positioning and tracking technologies, a large number of trajectories are being exploited in multiple domains for knowledge extraction. A trajectory is formed by a large number of measurements, many of which are unnecessary to describe the actual trajectory of the vehicle, or even harmful due to sensor noise. This not only consumes large amounts of memory, but also makes the knowledge extraction process more difficult. Trajectory summarisation techniques can solve this problem, generating a smaller and more manageable representation and even semantic segments. In this comprehensive review, we explain and classify techniques for the summarisation of trajectories according to their search strategy and point evaluation criteria, describing connections with the line simplification problem. We also explain several special concepts in the trajectory summarisation problem. Finally, we outline recent trends and best practices for continuing research into future summarisation algorithms. The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was funded by public research projects of Spanish Ministry of Economy and Competitivity (MINECO), reference TEC2017-88048-C2-2-
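To make the connection with line simplification concrete, here is a minimal sketch of one classic line simplification algorithm, Douglas-Peucker, which combines a top-down search strategy with a point-to-segment distance criterion. The abstract does not name this algorithm; it is chosen here only as a well-known example, and the function names are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:                      # a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Keep the endpoints, then recursively keep the farthest point
    whenever it deviates from the current segment by more than tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, split = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, split = d, i
    if max_dist <= tolerance:
        return [first, last]
    left = douglas_peucker(points[:split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right                   # drop the duplicated split point

track = [(0, 0), (1, 0.01), (2, 0), (3, 2), (4, 0)]
summary = douglas_peucker(track, tolerance=0.1)
```

The tolerance plays the role of the point evaluation criterion: points closer than the tolerance to the simplified line are treated as redundant measurements.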

    ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data Management

    University of Minnesota Ph.D. dissertation. May 2019. Major: Computer Science. Advisor: Mohamed Mokbel. 1 computer file (PDF); x, 123 pages. Apache Hadoop, employing the MapReduce programming paradigm, has been widely accepted as the standard framework for analyzing big data in distributed environments. Unfortunately, this rich framework has not been genuinely exploited for processing large-scale spatio-temporal data, especially given the emergence and popularity of applications that create such data at large scale. The huge volumes of spatio-temporal data come from applications such as taxi fleets in urban computing, asteroids in astronomy research, animal movements in habitat studies, neuron analysis in neuroscience research, and the contents of social networks (e.g., Twitter or Facebook). Space and time are two fundamental characteristics whose management has raised the demand for processing the spatio-temporal data created by these applications. Besides the massive size of the data, the complexity of the shapes and formats associated with these data raises many challenges in managing spatio-temporal data. The goal of the dissertation is to establish a full-fledged big spatio-temporal data management system that serves the needs of a wide range of spatio-temporal applications. This involves indexing, querying, and analyzing spatio-temporal data. We propose ST-Hadoop, the first full-fledged open-source system with native support for big spatio-temporal data, available for download at http://st-hadoop.cs.umn.edu/. ST-Hadoop injects spatio-temporal data awareness inside the highly popular Hadoop system, which is considered state-of-the-art for off-line analysis of big data. 
Considering a distributed environment, we focus on the following: (1) indexing spatio-temporal data; (2) supporting various fundamental spatio-temporal operations, such as range, kNN, and join; and (3) supporting indexing and querying of trajectories, which are considered a special class of spatio-temporal data that requires special handling. Throughout this dissertation, we cover the background and related work, motivate the proposed system, and highlight our contributions.
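As a rough single-machine illustration of what a spatio-temporal index and a range operation involve, the sketch below buckets records by time slice and grid cell and then filters candidates exactly. It is not ST-Hadoop's actual design or API; the class `GridTimeIndex` and its parameters `cell_size` and `slice_length` are hypothetical.

```python
from collections import defaultdict

class GridTimeIndex:
    """Toy spatio-temporal index: records are bucketed by
    (time slice, grid column, grid row)."""

    def __init__(self, cell_size, slice_length):
        self.cell_size = cell_size
        self.slice_length = slice_length
        self.buckets = defaultdict(list)

    def _key(self, x, y, t):
        # Floor division maps a coordinate to its slice or cell number.
        return (int(t // self.slice_length),
                int(x // self.cell_size),
                int(y // self.cell_size))

    def insert(self, obj_id, x, y, t):
        self.buckets[self._key(x, y, t)].append((obj_id, x, y, t))

    def range_query(self, x_min, x_max, y_min, y_max, t_min, t_max):
        """Return records inside the closed space-time box."""
        s0, cx0, cy0 = self._key(x_min, y_min, t_min)
        s1, cx1, cy1 = self._key(x_max, y_max, t_max)
        hits = []
        # Visit only buckets overlapping the box, then filter exactly.
        for s in range(s0, s1 + 1):
            for cx in range(cx0, cx1 + 1):
                for cy in range(cy0, cy1 + 1):
                    for rec in self.buckets.get((s, cx, cy), ()):
                        _, x, y, t = rec
                        if (x_min <= x <= x_max and y_min <= y <= y_max
                                and t_min <= t <= t_max):
                            hits.append(rec)
        return hits

index = GridTimeIndex(cell_size=10.0, slice_length=60.0)
index.insert("a", 5.0, 5.0, 30.0)
index.insert("b", 25.0, 5.0, 30.0)
index.insert("a", 12.0, 8.0, 90.0)
hits = index.range_query(0, 15, 0, 10, 0, 120)
```

A distributed system would additionally decide how such buckets are assigned to partitions across machines, which this sketch leaves out.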

    Gestion efficace et partage sécurisé des traces de mobilité

    Nowadays, advances in the development of mobile devices and embedded sensors have enabled an unprecedented number of services for the user. At the same time, most mobile devices continuously generate, store and communicate a large amount of personal information. While managing personal information on mobile devices is still a big challenge, sharing and accessing this information in a safe and secure way remains an open and hot topic. Personal mobile devices may have various form factors such as mobile phones, smart devices, stick computers, or secure tokens. They can be used to record, sense, and store data about the user's context or surrounding environment. The most common contextual information is the user's location. Personal data generated and stored on these devices is valuable for many applications and services, but it is sensitive and needs to be protected in order to ensure individual privacy. In particular, most mobile applications have access to accurate and real-time location information, raising serious privacy concerns for their users. In this dissertation, the first part addresses the management of location traces, i.e. the spatio-temporal data on mobile devices. In particular, we offer an extension of spatio-temporal data types and operators for embedded environments. These data types reconcile the features of spatio-temporal data with the embedded requirements by offering an optimal data representation called Spatio-temporal object (STOB) dedicated to embedded devices. More importantly, in order to optimize query processing, we also propose an efficient indexing technique for spatio-temporal data called TRIFL, designed for flash storage. TRIFL stands for TRajectory Index for Flash memory. It exploits unique properties of trajectory insertion, and optimizes the data structure for the behavior of flash and the buffer cache. 
These ideas allow TRIFL to achieve much better performance on both flash and magnetic storage compared to its competitors. Additionally, the remaining part of this thesis investigates the protection of users' sensitive information by offering a privacy-aware protocol for participatory sensing applications called PAMPAS. PAMPAS relies on secure hardware solutions and proposes a user-centric privacy-aware protocol that fully protects personal data while taking advantage of distributed computing. To this end, we also propose a partitioning algorithm and an aggregation algorithm in PAMPAS. This combination drastically reduces the overall costs, making it possible to run the protocol in near real-time with a large number of participants, without any personal information leakage. Today, advances in the development of mobile devices and embedded sensors have enabled an unprecedented growth in services for the user. At the same time, most mobile devices continuously generate, record and communicate a large amount of personal data. Secure management of personal data on mobile devices remains a challenge today, both with respect to the inherent constraints of these devices and with respect to safe and secure access to and sharing of this information. This thesis addresses these challenges and focuses on location traces. In particular, building on a relational data server embedded in secure mobile devices, this thesis extends that server to the management of spatio-temporal data (types and operators). Above all, it proposes an efficient spatio-temporal indexing method (TRIFL) adapted to the flash memory storage model. 
Furthermore, in order to protect the user's personal location traces, a distributed architecture and a participatory collection protocol that preserves location data are proposed in PAMPAS. This architecture relies on highly secure devices for the distributed computation of spatio-temporal aggregates over the collected private data.

    SEGMENTATION TECHNIQUES BASED ON CLUSTERING FOR THE ANALYSIS OF MOBILITY DATA

    The Thesis concerns the analysis and application of segmentation methods for partitioning spatial trajectories into semantically meaningful sub-trajectories, and their use for analyzing the behavior of moving objects. Spatial trajectories are complex structured data consisting of ordered sequences of spatio-temporal points that sample the continuous movement of an object in a reference space. Segmentation techniques are essential for the analysis of spatial trajectories. In general, the segmentation task divides a sequence of data points into a series of disjoint sub-sequences based on homogeneity criteria. The Thesis focuses, in particular, on segmentation techniques based on density-based clustering. Unlike traditional clustering processes, which are applied to sets of points, clustering-based segmentation techniques partition sequences into a series of temporally separated clusters. Possible applications include the analysis of the movement of individuals in urban settings and the study of animal behavior. Some cluster-based segmentation techniques are described in the literature; however, none of these solutions handles unstructured points (noise) effectively. Moreover, the methodologies adopted to validate these techniques suffer from serious limitations: for example, the experimental evaluations use very simple data that do not reflect the complexity of real movement and do not allow a comparison with ground truth. This Thesis focuses on a recent technique for cluster-based segmentation with noise, called SeqScan, proposed in previous work. 
In particular, the research addressed the following problems: i) definition of a rigorous framework for analyzing the properties of the segmentation model; ii) validation of the method through extensive experimentation that includes comparison with ground truth; iii) extension of the approach to enable the detection of gatherings, where a gathering is a group of moving objects that share the same area for a certain period of time, with the possibility of occasional absences; iv) development of a software platform that integrates the various algorithms and further tools supporting the analysis of mobility data. The Thesis focuses on segmentation methods for the partitioning of spatial trajectories into semantically meaningful sub-trajectories and their application to the analysis of mobility behavior. Spatial trajectories are complex structured data consisting of sequences of temporally ordered spatio-temporal points sampling the continuous movement of an object in a reference space. Spatial trajectories can reveal behavioral information about individuals and groups of individuals, which motivates the interest in data analysis techniques. Segmentation techniques are key for the analysis of spatial trajectories. In general, the segmentation task partitions a sequence of data points into a series of disjoint sub-sequences based on some homogeneity criteria. The Thesis focuses, in particular, on the use of clustering methods for the segmentation of spatial trajectories. Unlike the traditional clustering task, which is applied to sets of data points, the goal of this class of techniques is to partition sequential data into temporally separated clusters. Such techniques can be utilized, for example, to detect the sequences of places or regions visited by moving objects. 
While a number of techniques for cluster-based segmentation have been proposed in the literature, none of them is really robust against noise, and the methodologies put in place to validate those techniques suffer from severe limitations, e.g., simple datasets and no comparison with ground truth. This Thesis focuses on a recent cluster-based segmentation method, called SeqScan, proposed in previous work. This technique promises to be robust against noise; nonetheless, the approach is empirical and lacks a formal and theoretical framework. The contribution of this research is twofold. First, it provides analytical support to SeqScan, defining a rigorous framework for the analysis of the properties of the model. The method is validated through extensive experimentation conducted in an interdisciplinary setting and contrasting the segmentation with ground truth. The second contribution is the proposal of a technique for the discovery of a collective pattern, called gathering. The gathering pattern describes a situation in which a significant number of moving objects share the same region for a sufficient period of time, with the possibility of occasional absences, e.g. a concert or an exhibition. The technique is built on SeqScan. Finally, a platform called MigrO has been developed, including not only the algorithms but also a variety of tools facilitating data analysis.

    Fast trajectory search for real-world applications

    With the popularity of smartphones equipped with GPS, a vast amount of trajectory data is being produced by location-based services, such as Uber, Google Maps, and Foursquare. We broadly divide trajectory data into three types: 1) commuter trajectories from taxicabs and ride-sharing apps; 2) vehicle trajectories from GPS navigation apps; 3) activity trajectories from social network check-ins and travel blogs. We investigate efficient and effective search on each of the three types of trajectory data, each of which has a real-world application. In particular: 1) commuter trajectory search can serve transport capacity estimation and route planning; 2) vehicle trajectory search can help real-time traffic monitoring and trend analysis; 3) activity trajectory search can be used in interactive and personalized trip planning. As the most straightforward trajectory data, a commuter trajectory contains only two points, an origin and a destination indicating a passenger’s movement, which is valuable for transportation decision making. In this thesis, we propose a novel query, RkNNT, to estimate the capacity of a bus route in the transport network. Answering RkNNT is challenging due to the large amount of data from commuters. We propose efficient solutions to prune most trajectories that cannot choose a query route as their nearest one. Further, we apply RkNNT to the optimal route planning problem, MaxRkNNT. A vehicle trajectory has more points than a commuter trajectory, as it tracks the whole trace of a vehicle, and can further support traffic monitoring applications. We summarize the common queries over trajectory data for monitoring purposes and propose a search engine, Torch, to manage and search trajectories with map matching over a road network, instead of storing raw data sampled from GPS at a high cost. 
Besides improving the efficiency of search, Torch also supports compression, effectiveness evaluation of various existing similarity measures, and large-scale k-paths clustering with a novel similarity measure, LORS. Exploring activity trajectory data, which contains textual information, can help plan personalized trips for tourists. Based on the spatial indexes that we propose for commuter and vehicle trajectory data, we further develop a unified search paradigm to process various top-k queries over activity trajectory and POI data (hotels, restaurants, attractions, etc.) at the same time. In particular, a new point-wise similarity measure, PATS, and an indexing framework with a unified search paradigm are proposed.

    Demonstration of the TrajStore system
