
    Efficient point-based trajectory search

    LNCS v. 9239 entitled: Advances in Spatial and Temporal Databases: 14th International Symposium, SSTD 2015, Hong Kong, China, August 26-28, 2015. Proceedings. Trajectory data capture the traveling history of moving objects such as people or vehicles. With the proliferation of GPS and tracking technology, huge volumes of trajectories are rapidly generated and collected. As a result, applications such as route recommendation and traveling behavior mining call for efficient trajectory retrieval. In this paper, we first focus on distance-based trajectory search: given a collection of trajectories and a set of query points, the goal is to retrieve the top-k trajectories that pass as close as possible to all query points. We advance the state of the art by combining existing approaches into a hybrid method and by proposing an alternative, more efficient range-based approach. Second, we propose and study the practical variant of bounded distance-based search, which takes into account the temporal characteristics of the searched trajectories. Through an extensive experimental analysis with real trajectory data, we show that our range-based approach outperforms previous methods by at least one order of magnitude. © Springer International Publishing Switzerland 2015.

    Retrieving k-nearest neighboring trajectories by a set of point locations

    Abstract. Advances in object tracking technologies have led to huge volumes of spatio-temporal data accumulated in the form of location trajectories. Such data bring new opportunities and challenges in efficient trajectory retrieval. In this paper, we study a new type of query that finds the k Nearest Neighboring Trajectories (k-NNT) with the minimum aggregated distance to a set of query points. Although such queries have a broad range of applications, such as trip planning and moving object study, they cannot be handled by traditional k-NN query processing techniques, which only find the neighboring points of an object. To facilitate scalable, flexible and effective query execution, we propose a k-NN trajectory retrieval algorithm using a candidate-generation-and-verification strategy. The algorithm uses a data structure called the global heap to retrieve candidate trajectories near each individual query point. Then, at the verification step, it refines these trajectory candidates using a lower bound computed from the global heap. The global heap guarantees the completeness of the candidates (i.e., all the k-NNTs are included) and reduces the computational overhead of candidate verification. In addition, we propose a qualifier expectation measure that ranks partial-matching candidate trajectories to accelerate query processing in the cases of non-uniform trajectory distributions or outlier query locations. Extensive experiments on both real and synthetic trajectory datasets demonstrate the feasibility and effectiveness of the proposed methods.
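
    The k-NNT query described in this abstract can be illustrated with a brute-force baseline. The sketch below is not the paper's global-heap algorithm: it scans every trajectory, and it assumes that the aggregated distance is the sum, over query points, of the Euclidean distance to the closest sample point of the trajectory. The function names and the toy data are hypothetical.

```python
import heapq
import math


def point_to_trajectory_distance(q, trajectory):
    # Distance from a query point to its closest sample point on the trajectory.
    return min(math.dist(q, p) for p in trajectory)


def aggregated_distance(query_points, trajectory):
    # Assumed aggregate: sum of per-query-point distances (an illustration only).
    return sum(point_to_trajectory_distance(q, trajectory) for q in query_points)


def knn_trajectories(trajectories, query_points, k):
    # Brute-force scan: score every trajectory, keep the k smallest scores.
    scored = (
        (aggregated_distance(query_points, points), tid)
        for tid, points in trajectories.items()
    )
    return heapq.nsmallest(k, scored)


# Hypothetical toy data: three trajectories as lists of (x, y) sample points.
trajectories = {
    "t1": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "t2": [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0)],
    "t3": [(0.0, 1.0), (2.0, 1.0)],
}
query = [(0.0, 0.0), (2.0, 0.0)]
result = knn_trajectories(trajectories, query, k=2)
print(result)
```

    A candidate-generation-and-verification method avoids this full scan by only scoring trajectories that appear near at least one query point; the baseline is mainly useful for checking the output of such a method on small inputs.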

    Parallel trajectory similarity joins in spatial networks

    © 2018 Springer-Verlag GmbH Germany, part of Springer Nature. The matching of similar pairs of objects, called similarity join, is a fundamental functionality in data management. We consider two cases of trajectory similarity joins (TS-Joins), a threshold-based join (Tb-TS-Join) and a top-k TS-Join (k-TS-Join), where the objects are trajectories of vehicles moving in road networks. Given two sets of trajectories and a threshold, the Tb-TS-Join returns all pairs of trajectories from the two sets with similarity above the threshold. In contrast, the k-TS-Join does not take a threshold as a parameter, and it returns the top-k most similar trajectory pairs from the two sets. The TS-Joins target diverse applications such as trajectory near-duplicate detection, data cleaning, ridesharing recommendation, and traffic congestion prediction. With these applications in mind, we provide purposeful definitions of similarity. To enable efficient processing of the TS-Joins on large sets of trajectories, we develop search space pruning techniques and enable use of the parallel processing capabilities of modern processors. Specifically, we present a two-phase divide-and-conquer search framework that lays the foundation for the algorithms for the Tb-TS-Join and the k-TS-Join, which rely on different pruning techniques to achieve efficiency. For each trajectory, the algorithms first find similar trajectories. Then they merge the results to obtain the final result. The algorithms for the two joins exploit different upper and lower bounds on the spatiotemporal trajectory similarity and different heuristic scheduling strategies for search space pruning. Their per-trajectory searches are independent of each other and can be performed in parallel, and the merges have constant cost. An empirical study with real data offers insight into the performance of the algorithms and demonstrates that they are capable of outperforming well-designed baseline algorithms by an order of magnitude.
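
    The difference between the two joins can be shown with a small brute-force sketch. It is only an illustration of the query semantics: there is no pruning or parallelism, and the similarity function is an assumption (Jaccard overlap of road-segment IDs) standing in for the spatiotemporal similarity defined in the paper. All names and data are hypothetical.

```python
import heapq
from itertools import product


def jaccard(a, b):
    # Stand-in similarity: overlap of the road-segment ID sets of two trajectories.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def threshold_join(left, right, theta):
    # Tb-TS-Join semantics: every cross pair whose similarity exceeds theta.
    return [
        (p, q)
        for (p, a), (q, b) in product(left.items(), right.items())
        if jaccard(a, b) > theta
    ]


def top_k_join(left, right, k):
    # k-TS-Join semantics: the k most similar cross pairs, no threshold needed.
    scored = (
        (jaccard(a, b), p, q)
        for (p, a), (q, b) in product(left.items(), right.items())
    )
    return heapq.nlargest(k, scored)


# Hypothetical toy data: trajectories as sets of road-segment IDs.
left = {"p1": {1, 2, 3, 4}, "p2": {7, 8}}
right = {"q1": {1, 2, 3}, "q2": {4, 5}, "q3": {8, 9}}

pairs = threshold_join(left, right, theta=0.3)
best = top_k_join(left, right, k=2)
print(pairs)
print(best)
```

    Because each left-hand trajectory is scored against the right-hand set independently, the outer loop is the natural unit to distribute across workers, with a final merge of the per-trajectory results.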

    Top-k queries over digital traces

    Recent advances in social and mobile technology have enabled an abundance of digital traces (in the form of mobile check-ins, association of mobile devices to specific WiFi hotspots, etc.) revealing the physical presence history of diverse sets of entities (e.g., humans, devices, and vehicles). One challenging yet important task is to identify k entities that are most closely associated with a given query entity based on their digital traces. We propose a suite of indexing techniques and algorithms to enable fast query processing for this problem at scale. We first define a generic family of functions measuring the association between entities, and then propose algorithms to transform digital traces into a lower-dimensional space for more efficient computation. We subsequently design a hierarchical indexing structure to organize entities in a way that closely associated entities tend to appear together. We then develop algorithms to process top-k queries utilizing the index. We theoretically analyze the pruning effectiveness of the proposed methods based on a mobility model which we propose and validate in real-life situations. Finally, we conduct extensive experiments on both synthetic and real datasets at scale, evaluating the performance of our techniques both analytically and experimentally, confirming the effectiveness and superiority of our approach over other applicable approaches across a variety of parameter settings and datasets. Comment: Accepted by SIGMOD 2019. Proceedings of the 2019 International Conference on Management of Data.

    Streaming Data Algorithm Design for Big Trajectory Data Analysis

    Trajectory streams consist of large volumes of time-stamped spatial data that are constantly generated from diverse and geographically distributed sources. Discovery of traveling patterns on trajectory streams, such as gatherings and companies, requires processing each record when it arrives and correlating across multiple records in near real time. Thus, techniques for handling high-speed trajectory streams should scale on distributed cluster computing. The main issues encapsulate three aspects, namely a data model to represent the continuous trajectory data, the parallelism of a discovery algorithm, and end-to-end performance improvement. In this thesis, I propose two parallel discovery methods, namely the snapshot model and the slot model, each of which consists of 1) a model for partitioning trajectories sampled at different time intervals; 2) a definition of distance measurements of trajectories; and 3) a parallel discovery algorithm. I develop these methods in a stream processing workflow. I evaluate the solution with a public dataset on an Amazon Web Services (AWS) cloud cluster. From a parallelization point of view, I investigate system performance, scalability, and stability, and pinpoint the principal operations that contribute most to the run-time cost of computation and data shuffling. I improve data locality with fine-tuned data partitioning and data aggregation techniques. I observe that both models can scale on a cluster of nodes as the intensity of trajectory data streams grows. Generally, the snapshot model has higher throughput and thus lower latency, while the slot model produces more accurate trajectory discovery.

    Fast trajectory search for real-world applications

    With the popularity of smartphones equipped with GPS, a vast amount of trajectory data is being produced by location-based services such as Uber, Google Maps, and Foursquare. We broadly divide trajectory data into three types: 1) commuter trajectories from taxicabs and ride-sharing apps; 2) vehicle trajectories from GPS navigation apps; 3) activity trajectories from social network check-ins and travel blogs. We investigate efficient and effective search on each of the three types of trajectory data, each of which has a real-world application. In particular: 1) commuter trajectory search can serve transport capacity estimation and route planning; 2) vehicle trajectory search can help real-time traffic monitoring and trend analysis; 3) activity trajectory search can be used in interactive and personalized trip planning. As the most straightforward trajectory data, a commuter trajectory contains only two points, an origin and a destination indicating a passenger's movement, which is valuable for transportation decision making. In this thesis, we propose a novel query, RkNNT, to estimate the capacity of a bus route in the transport network. Answering RkNNT is challenging due to the large amount of data from commuters. We propose efficient solutions to prune most trajectories which cannot choose a query route as their nearest one. Further, we apply RkNNT to the optimal route planning problem, MaxRkNNT. A vehicle trajectory has more points than a commuter trajectory, as it tracks the whole trace of a vehicle, and can further support the application of traffic monitoring. We summarize the common queries over trajectory data for monitoring purposes and propose a search engine, Torch, to manage and search trajectories with map matching over a road network, instead of storing raw data sampled from GPS at a high cost. Besides improving the efficiency of search, Torch also supports compression, effectiveness evaluation of various existing similarity measures, and large-scale clustering of k-paths with a novel similarity measure, LORS. Exploring activity trajectory data, which contains textual information, can help plan personalized trips for tourists. Based on the spatial indexes which we propose for commuter and vehicle trajectory data, we further develop a unified search paradigm to process various top-k queries over activity trajectory and POI data (hotels, restaurants, attractions, etc.) at the same time. In particular, a new point-wise similarity measure, PATS, and an indexing framework with a unified search paradigm are proposed.