35 research outputs found

    Distributed Indexing Schemes for k-Dominant Skyline Analytics on Uncertain Edge-IoT Data

    Skyline queries typically search a Pareto-optimal set from a given data set to solve the corresponding multiobjective optimization problem. As the number of criteria increases, the skyline comes to include an excessive number of data items, which yields a meaningless result. To address this curse of dimensionality, we proposed a k-dominant skyline in which the number of skyline members was reduced by relaxing the restriction on the number of dimensions, considering the uncertainty of data. Specifically, each data item was associated with a probability of appearance, which represented the probability of becoming a member of the k-dominant skyline. As data items appear continuously in data streams, the corresponding k-dominant skyline may vary with time. Therefore, an effective and rapid mechanism of updating the k-dominant skyline becomes crucial. Herein, we proposed two time-efficient schemes, Middle Indexing (MI) and All Indexing (AI), for k-dominant skyline in distributed edge-computing environments, where irrelevant data items can be effectively excluded from the computation to reduce the processing duration. Furthermore, the proposed schemes were validated with extensive experimental simulations. The experimental results demonstrated that the proposed MI and AI schemes reduced the computation time by approximately 13% and 56%, respectively, compared with the existing method.
    Comment: 13 pages, 8 figures, 12 tables, to appear in IEEE Transactions on Emerging Topics in Computing
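For readers unfamiliar with the term, the k-dominance relation mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's MI or AI scheme: it ignores the appearance probabilities and the distributed indexing entirely, assumes smaller values are better in every dimension, and uses the hypothetical names `k_dominates` and `k_dominant_skyline`. With k equal to the number of dimensions it reduces to the ordinary skyline.

```python
from typing import List, Sequence

Point = Sequence[float]


def k_dominates(p: Point, q: Point, k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better).

    p k-dominates q when p is no worse than q in at least k dimensions
    and strictly better than q in at least one dimension.
    """
    no_worse = sum(1 for a, b in zip(p, q) if a <= b)
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Brute-force O(n^2) scan: keep points that no other point k-dominates."""
    # A point never k-dominates itself (no strictly better dimension),
    # so there is no need to skip the p == q comparison explicitly.
    return [q for q in points if not any(k_dominates(p, q, k) for p in points)]
```

Lowering k below the number of dimensions makes dominance easier to satisfy, which is why the result set shrinks as the abstract describes.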

    A systematic literature review of skyline query processing over data stream

    Recently, skyline query processing over data stream has gained a lot of attention, especially from the database community, owing to its unique challenges. Skyline queries aim at pruning the search space of a potentially large multi-dimensional set of objects by keeping only those objects that are not worse than any other. Although an abundance of skyline query processing techniques have been proposed, there is a lack of a Systematic Literature Review (SLR) on current research works pertinent to skyline query processing over data stream. In this regard, this paper provides a comparative study of the state-of-the-art approaches over the period between 2000 and 2022, with the main aim of helping readers understand the key issues that are essential to consider when processing skyline queries over streaming data. Seven digital databases were reviewed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) procedures. After applying both the inclusion and exclusion criteria, 23 primary papers were further examined. The results show that the identified skyline approaches are driven by the need to expedite skyline query processing, mainly because data streams are time varying (time sensitive), continuous, real time, volatile, and unrepeatable. Although these skyline approaches are tailor-made for data stream with a common aim, their solutions vary to suit the various aspects being considered, which include the type of skyline query, type of streaming data, type of sliding window, query processing technique, indexing technique, and the data stream environment employed. In this paper, a comprehensive taxonomy is developed along with the key aspects of each reported approach, while several open issues and challenges related to the topic being reviewed are highlighted as recommendations for future research directions.

    SkyFlow: heterogeneous streaming for skyline computation using FlowGraph and SYCL

    The skyline is an optimization operator widely used for multi-criteria decision making. It allows minimizing an n-dimensional dataset into its smallest subset. In this work we present SkyFlow, the first heterogeneous CPU+GPU graph-based engine for skyline computation on a stream of data queries. Two data flow approaches, Coarse-grained and Fine-grained, have been proposed for different streaming scenarios. Coarse-grained aims to keep in parallel the computation of two queries using a hybrid solution with two state-of-the-art skyline algorithms: one optimized for CPU and another for GPU. We also propose a model to estimate at runtime the computation time of any arriving data query. This estimation is used by a heuristic to schedule the data query on the device queue in which it will finish earlier. On the other hand, Fine-grained splits one query computation between CPU and GPU. An experimental evaluation using as target architecture a heterogeneous system comprised of a multicore CPU and an integrated GPU for different streaming scenarios and datasets reveals that our heterogeneous CPU+GPU approaches always outperform previous only-CPU and only-GPU state-of-the-art implementations by up to 6.86× and 5.19×, respectively, and they fall below 6% of ideal peak performance at most. We also evaluate Coarse-grained vs Fine-grained, finding that each approach is better suited to different streaming scenarios.
    This work was partially supported by the Spanish projects PID2019-105396RB-I00, UMA18-FEDERJA-108 and P20-00395-R. // Funding for open access charge: Universidad de Málaga / CBUA
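The scheduling heuristic described in the abstract above (send each arriving query to the device queue where it is estimated to finish earliest) can be illustrated with a short sketch. Everything here is an assumption for illustration: the `DeviceQueue` class, the `schedule` function, and the per-device time estimates passed in by the caller are hypothetical and are not SkyFlow's actual model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DeviceQueue:
    """Hypothetical device queue tracking its estimated pending work."""
    name: str
    backlog: float = 0.0  # estimated seconds of work already queued
    queries: List[str] = field(default_factory=list)


def schedule(query_id: str,
             estimates: Dict[str, float],
             queues: Dict[str, DeviceQueue]) -> str:
    """Assign the query to the queue with the earliest estimated finish time.

    estimates maps a device name to the estimated run time of this query
    on that device (assumed to come from some runtime cost model).
    """
    # Estimated finish time = work already queued + this query's run time.
    best = min(queues.values(), key=lambda q: q.backlog + estimates[q.name])
    best.backlog += estimates[best.name]
    best.queries.append(query_id)
    return best.name
```

Note that a device that is faster for a given query can still lose the assignment when its queue is already long, which is the point of comparing finish times rather than raw run times.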

    Multi-modal Spatial Crowdsourcing for Enriching Spatial Datasets

    IDEAS-1997-2021-Final-Programs

    This document records the final program for each of the 26 meetings of the International Database and Engineering Application Symposium from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).