445 research outputs found

    A model for computing skyline data items in cloud incomplete databases

    Skyline queries retrieve the data items in a database that best fit the user’s given preferences. However, processing skyline queries is expensive and difficult on large distributed databases such as cloud databases. Processing becomes more complicated still when these distributed databases have missing values in certain dimensions. Data incompleteness severely affects the skyline process because missing values break the transitivity property of the skyline technique and lead to the problem of cyclic dominance. This paper proposes an efficient model for computing skyline data items in cloud incomplete databases. The model aims to reduce the number of domination tests between data items, the processing time, and the amount of data transferred among the involved datacenters. Several sets of experiments are conducted over two different types of datasets, and the results demonstrate that the proposed solution outperforms the previous approaches in terms of domination tests, processing time, and amount of data transferred.
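
The loss of transitivity and the cyclic dominance mentioned in this abstract can be illustrated with a minimal sketch. It assumes a commonly used dominance definition for incomplete data (compare only the dimensions where both items have values, with smaller values preferred); the function names and the three sample items are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence

Item = Sequence[Optional[float]]  # None marks a missing value


def dominates(p: Item, q: Item) -> bool:
    """Assumed definition: p dominates q if, on the dimensions known in both,
    p is no worse everywhere and strictly better somewhere (smaller is better)."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False  # no shared dimension, so the items are incomparable
    return all(a <= b for a, b in common) and any(a < b for a, b in common)


def naive_skyline(items: Sequence[Item]) -> list:
    """Keep every item that no other item dominates."""
    return [p for p in items if not any(dominates(q, p) for q in items)]


# Hypothetical items: each pair shares exactly one known dimension.
a = (2, None, 3)
b = (None, 2, 4)
c = (1, 3, None)

print(dominates(a, b), dominates(b, c), dominates(a, c))  # transitivity fails
print(dominates(c, a))                                    # the cycle closes
print(naive_skyline([a, b, c]))                           # every item is dominated
```

Here `a` dominates `b` and `b` dominates `c`, yet `a` does not dominate `c`; instead `c` dominates `a`, so the naive filter returns an empty skyline.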

    Skyline queries computation on crowdsourced-enabled incomplete database

    Data incompleteness has become a frequent phenomenon in a large number of contemporary database applications such as web autonomous databases, big data, and crowd-sourced databases. Processing skyline queries over incomplete databases poses a number of challenges. Most importantly, the skylines derived from incomplete databases are themselves incomplete, with some values missing. Retrieving skylines with missing values is undesirable, particularly for recommendation and decision-making systems. Furthermore, running skyline queries on a database with incomplete data raises further issues, such as losing the transitivity property of the skyline technique and cyclic dominance between the tuples. The issue of estimating the missing values of skylines has been discussed and examined in the database literature. Most recently, several studies have suggested exploiting crowd-sourced databases to estimate the missing values by generating plausible values using the crowd. Crowd-sourced databases have proved to be a powerful solution for performing user-given tasks by integrating human intelligence and experience into task processing. However, task processing using the crowd incurs additional monetary cost and increases latency. Also, it is not always possible to produce a satisfactory result that meets the user's preferences. This paper proposes an approach for estimating the missing values of the skylines that first exploits the available data and uses the implicit relationships between the attributes to impute the missing values. This process aims to reduce the number of values that must be estimated using the crowd, which is needed only when local estimation is inappropriate. Intensive experiments on both synthetic and real datasets have been carried out. The experimental results show that the proposed approach for estimating the missing values of the skylines over crowd-sourced enabled incomplete databases is scalable and outperforms the other existing approaches.

    Optimizing skyline query processing in incomplete data

    Given their significance, skyline queries are incorporated in various modern applications, including personalized recommendation systems as well as decision-making and decision-support systems. Skyline queries are used to identify superior data items in the database. Most of the previously proposed skyline algorithms work on a complete database where the data are always present (non-missing). However, in many contemporary real-world databases, particularly those with large cardinality and high dimensionality, this assumption is not necessarily valid. Missing data therefore pose new challenges, because skyline query processing cannot easily apply methods that are designed for complete data. This is because incomplete data cause the loss of the transitivity property of the skyline method and give rise to cyclic dominance. This paper presents a framework called Optimized Incomplete Skyline (OIS), which uses a technique that simplifies the skyline process on a database with missing data and helps prune data items before the skyline process is performed. The proposed strategy ensures that the number of domination tests is significantly reduced. A set of experiments has been carried out using both real and synthetic datasets to validate the performance of the framework. The experimental results confirm that the OIS framework consistently outperforms the current approaches in terms of the number of domination tests required to retrieve the skylines.

    Identifying skylines in cloud databases with incomplete data

    Skyline queries are a rich area of research in the database community. Because of their benefits, they have been integrated into many database applications, including but not limited to personalized recommendation, multi-objective, decision-support, and decision-making systems. Many variations of the skyline technique have been proposed in the literature to handle skyline queries in incomplete databases. Nevertheless, these solutions are designed for a centralized incomplete database (single access). In many real-world database systems this might not be the case, particularly for a database with a large amount of incomplete data distributed over various remote locations, such as cloud databases. Directly applying skyline solutions designed for a centralized incomplete database to the cloud is inadequate because of the prohibitive cost. Thus, this paper introduces a new approach called Incomplete-data Cloud Skylines (ICS) for processing skyline queries in cloud databases with incomplete data. This approach emphasizes reducing the amount of data transfer and the number of domination tests during the skyline process. It incorporates a sorting technique that arranges the data items so that dominating data items are placed at the top of the list, which helps eliminate dominated data items. In addition, ICS employs a filtering technique to prune the dominated data items before applying the skyline technique. It also includes a technique named local skyline joiner that helps reduce the amount of data transferred between datacenters when deriving the final skylines. It limits the data items to be transferred to the local skylines of each relation. Comprehensive experiments have been performed on both synthetic and real-life datasets, and they demonstrate the effectiveness and versatility of our approach in comparison to the existing approaches. We argue that our approach is practical and can be adopted in many contemporary cloud database systems with incomplete data to process skyline queries.
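
The idea of transferring only local skylines between datacenters can be sketched as follows. This is an illustration under stated assumptions, not the ICS algorithm: it uses complete data (where dominance is transitive, so merging local skylines is known to be safe), treats smaller values as better, and uses hypothetical names and sample partitions. How ICS handles missing values, sorting, and filtering is not reproduced here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def merge_local_skylines(partitions: Sequence[Sequence[Point]]) -> List[Point]:
    """Each (hypothetical) datacenter sends only its local skyline;
    the coordinator computes the final skyline over those candidates."""
    candidates = [p for part in partitions for p in skyline(part)]
    return skyline(candidates)


# Hypothetical data held by two datacenters.
partitions = [
    [(1, 5), (2, 6), (4, 4)],
    [(3, 1), (5, 5), (0, 9)],
]
transferred = sum(len(skyline(part)) for part in partitions)
print(transferred)                       # candidates sent, out of 6 stored points
print(merge_local_skylines(partitions))
```

In this sketch only four of the six stored points leave their datacenter, and the merged result matches the skyline computed over all points at once.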

    A systematic literature review of skyline query processing over data stream

    Recently, skyline query processing over data streams has gained a lot of attention, especially from the database community, owing to its unique challenges. Skyline queries aim to prune the search space of a potentially large multi-dimensional set of objects by keeping only those objects that are not worse than any other. Although an abundance of skyline query processing techniques has been proposed, there is a lack of a Systematic Literature Review (SLR) of current research on skyline query processing over data streams. This paper therefore provides a comparative study of the state-of-the-art approaches published between 2000 and 2022, with the main aim of helping readers understand the key issues to consider when processing skyline queries over streaming data. Seven digital databases were reviewed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) procedures. After applying both the inclusion and exclusion criteria, 23 primary papers were further examined. The results show that the identified skyline approaches are driven by the need to expedite skyline query processing, mainly because data streams are time varying (time sensitive), continuous, real time, volatile, and unrepeatable. Although these skyline approaches are tailor-made for data streams with a common aim, their solutions vary according to the aspects being considered, which include the type of skyline query, type of streaming data, type of sliding window, query processing technique, indexing technique, and the data stream environment employed. In this paper, a comprehensive taxonomy is developed along with the key aspects of each reported approach, and several open issues and challenges related to the topic are highlighted as recommendations for future research directions.
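
As a rough illustration of skyline processing over a stream with a sliding window, the sketch below recomputes the skyline of the most recent items after each arrival. It assumes a count-based window, complete data, and smaller-is-better preferences; the names and the sample stream are hypothetical, and the surveyed approaches use incremental and indexed techniques rather than this naive recomputation.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def sliding_skylines(stream: Iterable[Point], window_size: int) -> Iterator[List[Point]]:
    """Yield the skyline of the last `window_size` points after each arrival."""
    window: deque = deque(maxlen=window_size)  # oldest point expires automatically
    for point in stream:
        window.append(point)
        yield skyline(list(window))


# Hypothetical stream of two-dimensional points.
stream = [(5, 5), (3, 6), (1, 1), (4, 2), (6, 0)]
results = list(sliding_skylines(stream, window_size=3))
for step, sky in enumerate(results, start=1):
    print(step, sky)
```

The skyline changes both when a new point arrives and when an old point expires from the window, which is why stream settings favor incremental maintenance over recomputation.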