68 research outputs found

    Big Data Management and Analytics for Mobility Forecasting in datAcron

    The exploitation of heterogeneous data sources offering very large historical and streaming data is important for increasing the accuracy of operations when analysing and predicting future states of moving entities (planes, vessels, etc.). This article presents the overall goals and big data challenges addressed by datAcron on big data analytics for time-critical mobility forecasting.

    Hot Spot Analysis over Big Trajectory Data

    Hot spot analysis is the problem of identifying statistically significant spatial clusters from an underlying data set. In this paper, we study the problem of hot spot analysis for massive trajectory data of moving objects, which has many real-life applications in different domains, especially in the analysis of vast repositories of historical traces of spatio-temporal data (cars, vessels, aircraft). In order to identify hot spots, we propose an approach that relies on the Getis-Ord statistic, which has been used successfully in the past for point data. Since trajectory data is more than just a collection of individual points, we formulate the problem of trajectory hot spot analysis using the Getis-Ord statistic. We propose a parallel and scalable algorithm for this problem, called THS, which provides an exact solution and can operate on very large data sets. Moreover, we introduce an approximate algorithm (aTHS) that avoids exhaustive computation and trades off accuracy for efficiency in a controlled manner. In essence, we provide a method that quantifies the maximum induced error in the approximation, in relation to the achieved computational savings. We develop our algorithms in Apache Spark and demonstrate the scalability and efficiency of our approach using a large, historical, real-life trajectory data set of vessels sailing in the Eastern Mediterranean for a period of three years. Document type: Conference objec
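For readers unfamiliar with the Getis-Ord statistic mentioned in this abstract, the following is a minimal sketch of the classic point/cell version (Gi*) on a small 2D grid. It is not the THS or aTHS algorithm from the paper and does not handle trajectories. The function name `getis_ord_gi_star`, the binary 3x3 neighbourhood weights, and the grid representation are all assumptions made for illustration.

```python
import math


def getis_ord_gi_star(grid):
    """Return Gi* z-scores for every cell of a 2D grid of numeric values.

    Assumption: binary weights (1 for the cell itself and its up to 8
    adjacent cells, 0 elsewhere). Other weighting schemes are possible.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean * mean
    if variance <= 0:
        raise ValueError("Gi* is undefined when all cells hold the same value")
    s = math.sqrt(variance)

    scores = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weighted_sum = 0.0  # sum of w_ij * x_j over the neighbourhood
            w_sum = 0           # sum of w_ij (equals sum of w_ij^2 for binary weights)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        weighted_sum += grid[rr][cc]
                        w_sum += 1
            denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
            # If the neighbourhood covers the whole grid the statistic
            # degenerates; report 0.0 in that case (an illustrative choice).
            scores[r][c] = (weighted_sum - mean * w_sum) / denom if denom else 0.0
    return scores
```

A cell whose score is large and positive (for example above 1.96, the usual two-sided 5% threshold for a z-score) would be read as a candidate hot spot; large negative scores indicate cold spots.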

    Abstract

    The advent and popularity of the World Wide Web (WWW) has enabled access to a variety of semi-structured data and, when available, this data follows some common XML schema. On the other hand, the distribution of content has made centralized solutions inappropriate, ushering in the era of peer-to-peer (P2P) computing, where content is stored in XML databases residing on peers. In this paper, we propose XML schema caching as a summary indexing technique for searching in P2P networks. We study XML query routing in unstructured P2P networks, comparing different search strategies and showing the advantages of our approach in terms of completeness of the search.

    Querying and Updating a Context-Aware Service Directory in Mobile Environments

    Several interesting research directions materialize through the convergence of mobile computing and service-oriented computing. As mobile devices keep getting smaller, cheaper and more sophisticated, their use is becoming commonplace. We envision future scenarios that involve mobile devices acting not only as requestors, but as providers of data as well. In order to hide the heterogeneous nature of web data, service-oriented architectures are adopted. Nevertheless, existing service discovery mechanisms usually focus on exact or semantic matching of static attributes, thus ignoring contextual parameters. We argue that context for mobile web services plays an important role in service discovery by increasing the precision and effectiveness of the search. We explain our notion of context regarding web services, describe query evaluation and updates in a context-aware service directory, and show how the management of device profiles is exploited. Our prototype is fully implemented as part of the MobiShare architecture, thus proving the feasibility of our approach as well as demonstrating its advantages.

    Monochromatic and Bichromatic Reverse Top-K Queries

    Nowadays, most applications return to the user a limited set of ranked results based on the individual user's preferences, which are commonly expressed through top-k queries. From the perspective of a manufacturer, it is imperative that her products appear in the highest ranked positions for many different user preferences, otherwise the product is not visible to potential customers. In this paper, we define a novel query type, namely the reverse top-k query, that covers this requirement: "Given a potential product, which are the user preferences that make this product belong to the top-k query result set?". Reverse top-k queries are essential for manufacturers to assess the impact of their products in the market based on the competition. We formally define reverse top-k queries and introduce two versions of the query, monochromatic and bichromatic. First, we provide a geometric interpretation of the monochromatic reverse top-k query to acquire an intuition of the solution space. Then, we study in detail the case of the bichromatic reverse top-k query, and we propose two techniques for query processing, namely an efficient threshold-based algorithm and an algorithm based on materialized reverse top-k views. Our experimental evaluation demonstrates the efficiency of our techniques. Index Terms: reverse top-k query, top-k query, user preferences
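The bichromatic reverse top-k query described in this abstract can be illustrated with a brute-force sketch. This is not the threshold-based algorithm or the materialized-view technique the paper proposes; it simply evaluates a top-k test for every preference vector. The names `score` and `reverse_top_k`, the linear scoring function, and the convention that lower scores rank higher are assumptions made for this example.

```python
def score(weights, point):
    """Linear scoring function: weighted sum of the point's attributes."""
    return sum(w * x for w, x in zip(weights, point))


def reverse_top_k(products, preferences, query, k):
    """Return the preference vectors for which `query` is in the top-k.

    Assumptions: lower scores are preferred, and `query` belongs to the
    top-k of a preference vector when fewer than k existing products
    score strictly better than it under that vector.
    """
    result = []
    for weights in preferences:
        query_score = score(weights, query)
        better = sum(1 for p in products if score(weights, p) < query_score)
        if better < k:
            result.append(weights)
    return result
```

With products `[(1, 9), (5, 5), (9, 1)]`, preferences `[(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]` and query product `(4, 4)`, only the balanced preference `(0.5, 0.5)` places the query product first, while all three preferences place it in their top 2. The cost is one full scan of the products per preference vector, which is the exhaustive work that more efficient query processing techniques aim to avoid.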