8 research outputs found

    Skyline computation of uncertain database: A survey

    Get PDF
    Conducting advance skyline analysis over certain and uncertain databases is still an evolving research area in the field of database, despite several research works that have been conducted in this area. This paper conducts a survey on research issues on computing skyline for uncertain databases, with the view of providing interested researchers with an overview of the most recent research directions in this area.It further suggests possible research direction on skyline processing for uncertain databases.Taxonomy of the existing approaches is also presented

    Computing All Restricted Skyline Probabilities on Uncertain Datasets

    Full text link
    Restricted skyline (rskyline) query is widely used in multi-criteria decision making. It generalizes the skyline query by additionally considering a set of personalized scoring functions F. Since uncertainty is inherent in datasets for multi-criteria decision making, we study rskyline queries on uncertain datasets from both complexity and algorithm perspective. We formalize the problem of computing rskyline probabilities of all data items and show that no algorithm can solve this problem in truly subquadratic-time, unless the orthogonal vectors conjecture fails. Considering that linear scoring functions are widely used in practical applications, we propose two efficient algorithms for the case where \calF is a set of linear scoring functions whose weights are described by linear constraints, one with near-optimal time complexity and the other with better expected time complexity. For special linear constraints involving a series of weight ratios, we further devise an algorithm with sublinear query time and polynomial preprocessing time. Extensive experiments demonstrate the effectiveness, efficiency, scalability, and usefulness of our proposed algorithms.Comment: Full version, a shorter version to appear in ICDE 202

    Finding Probabilistic k-Skyline Sets on Uncertain Data

    Get PDF
    ABSTRACT Skyline is a set of points that are not dominated by any other point. Given uncertain objects, probabilistic skyline has been studied which computes objects with high probability of being skyline. While useful for selecting individual objects, it is not sufficient for scenarios where we wish to compute a subset of skyline objects, i.e., a skyline set. In this paper, we generalize the notion of probabilistic skyline to probabilistic k-skyline sets (Pk-SkylineSets) which computes k-object sets with high probability of being skyline set. We present an efficient algorithm for computing probabilistic k-skyline sets. It uses two heuristic pruning strategies and a novel data structure based on the classic layered range tree to compute the skyline set probability for each instance set with a worst-case time bound. The experimental results on the real NBA dataset and the synthetic datasets show that Pk-SkylineSets is interesting and useful, and our algorithms are efficient and scalable

    Algorithms for Imprecise Trajectories

    Get PDF

    Algorithms and hardness results for geometric problems on stochastic datasets

    Get PDF
    University of Minnesota Ph.D. dissertation.July 2019. Major: Computer Science. Advisor: Ravi Janardan. 1 computer file (PDF); viii, 121 pages.Traditionally, geometric problems are studied on datasets in which each data object exists with probability 1 at its location in the underlying space. However, in many scenarios, there may be some uncertainty associated with the existence or the locations of the data points. Such uncertain datasets, called \textit{stochastic datasets}, are often more realistic, as they are more expressive and can model the real data more precisely. For this reason, geometric problems on stochastic datasets have received significant attention in recent years. This thesis studies three sets of geometric problems on stochastic datasets equipped with existential uncertainty. The first set of problems addresses the linear separability of a bichromatic stochastic dataset. Specifically, these problems are concerned with how to compute the probability that a realization of a bichromatic stochastic dataset is linearly separable as well as how to compute the expected separation-margin of such a realization. The second set of problems deals with the stochastic convex hull, i.e., the convex hull of a stochastic dataset. This includes computing the expected measures of a stochastic convex hull, such as the expected diameter, width, and combinatorial complexity. The third set of problems considers the dominance relation in a colored stochastic dataset. These problems involve computing the probability that a realization of a colored stochastic dataset does not contain any dominance pair consisting of two different-colored points. New algorithmic and hardness results are provided for the three sets of problems

    (Approximate) uncertain skylines

    No full text
    Given a set of points with uncertain locations, we consider the problem of computing the probability of each point lying on the skyline, that is, the probability that it is not dominated by any other input point. If each point’s uncertainty is described as a probability distribution over a discrete set of locations, we improve the best known exact solution. We also suggest why we believe our solution might be optimal. Next, we describe simple, near-linear time approximation algorithms for computing the probability of each point lying on the skyline. In addition, some of our methods can be adapted to construct data structures that can efficiently determine the probability of a query point lying on the skyline
    corecore