4 research outputs found

    Eclipse: Practicability Beyond kNN and Skyline

    The $k$ nearest neighbor ($k$NN) query is a fundamental problem in databases. Given a set of multidimensional data points and a query point, $k$NN returns the $k$ nearest neighbors based on a scoring function such as weighted sum given an attribute weight vector. However, the attribute weight vector can be difficult to specify in practice. Skyline returns the points including all possible nearest neighbors without requiring the exact attribute weight vector or a scoring function, but the number of returned points can be prohibitively large for practical use. In this paper, we propose a novel \emph{eclipse} definition which provides a more flexible and customizable definition than the classic 1NN and skyline. In eclipse, users can specify a range of attribute weights and control the number of returned points. We show that both 1NN and skyline are instantiations of eclipse. To compute eclipse points, we propose a baseline algorithm with time complexity of $O(n^2 2^{d-1})$, and an improved $O(n\log^{d-1} n)$ time transformation-based algorithm by transforming the eclipse problem to the skyline problem, where $n$ is the number of points and $d$ is the number of dimensions. Furthermore, we propose a novel index-based algorithm utilizing duality transform with much better efficiency. The experimental results on the real NBA dataset and the synthetic datasets demonstrate the effectiveness and efficiency of our eclipse algorithms.
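To make the idea of "a range of attribute weights" concrete, here is a minimal brute-force sketch. It is not the authors' algorithm. It assumes that smaller attribute values are better, that the query point is the origin, that the last weight is fixed to 1, and that the user gives a `(low, high)` range for each of the other `d-1` weights. Under those assumptions a point is dropped when some other point scores no worse under every weight vector in the range. Because the weighted sum is linear in the weights, checking the $2^{d-1}$ corners of the range is enough, which is consistent with the $O(n^2 2^{d-1})$ baseline cost quoted above. The names `eclipse` and `ratio_ranges` are hypothetical.

```python
from itertools import product


def weighted_sum(point, weights):
    """Score of a point under a weight vector (smaller is better)."""
    return sum(w * x for w, x in zip(weights, point))


def eclipse(points, ratio_ranges):
    """Brute-force sketch of an eclipse-style query.

    points: list of d-dimensional tuples.
    ratio_ranges: list of d-1 (low, high) pairs for the first d-1 weights;
    the last weight is assumed to be fixed to 1.
    """
    # Corner weight vectors of the user-specified range (2^(d-1) of them).
    corners = [tuple(c) + (1.0,) for c in product(*ratio_ranges)]

    def dominates(p, q):
        # p beats q if it is no worse at every corner and better at one.
        pairs = [(weighted_sum(p, w), weighted_sum(q, w)) for w in corners]
        return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)

    return [q for q in points if not any(dominates(p, q) for p in points)]


points = [(1, 4), (2, 2), (4, 1), (3, 3)]
print(eclipse(points, [(1.0, 1.0)]))   # exact weights: behaves like 1NN
print(eclipse(points, [(0.25, 4.0)]))  # wider range: more points survive
```

Narrowing the range to a single value reduces the result to the best point under that one weight vector, and widening it lets more skyline points through, which is the customizable behavior the abstract describes.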

    Skyline Diagram: Efficient Space Partitioning for Skyline Queries

    Skyline queries are important in many application domains. In this paper, we propose a novel structure, the Skyline Diagram, which, given a set of points, partitions the plane into a set of regions, referred to as skyline polyominos. All query points in the same skyline polyomino have the same skyline query results. Similar to the $k^{th}$-order Voronoi diagram commonly used to facilitate $k$ nearest neighbor ($k$NN) queries, the skyline diagram can be used to facilitate skyline queries and many other applications. However, it may be computationally expensive to build the skyline diagram. By exploiting some interesting properties of skyline, we present several efficient algorithms for building the diagram with respect to three kinds of skyline queries: quadrant, global, and dynamic skylines. In addition, we propose an approximate skyline diagram which can significantly reduce the space cost. Experimental results on both real and synthetic datasets show that our algorithms are efficient and scalable.
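The partitioning idea can be illustrated with a deliberately naive sketch, which is not one of the efficient algorithms the abstract refers to. It assumes the quadrant skyline of a query point `q` is the skyline (smaller is better) of the data points lying in the upper-right quadrant of `q`. Drawing a horizontal and a vertical line through every data point yields a grid, and the sketch evaluates one representative query per grid cell. Cells with identical results could then be merged into larger regions. All function names are hypothetical.

```python
def quadrant_skyline(points, q):
    """Skyline of the points in the upper-right quadrant of q (2D)."""
    candidates = [p for p in points if p[0] >= q[0] and p[1] >= q[1]]

    def dominates(a, b):
        # a is no worse than b in both coordinates and is a different point.
        return a[0] <= b[0] and a[1] <= b[1] and a != b

    return sorted(p for p in candidates
                  if not any(dominates(o, p) for o in candidates))


def representatives(values):
    """One coordinate strictly inside each interval between sorted values."""
    vs = sorted(set(values))
    mids = [(a + b) / 2 for a, b in zip(vs, vs[1:])]
    return [vs[0] - 1] + mids + [vs[-1] + 1]


def naive_skyline_diagram(points):
    """Map each grid cell (i, j) to the quadrant skyline of its queries."""
    xs = representatives(p[0] for p in points)
    ys = representatives(p[1] for p in points)
    return {(i, j): quadrant_skyline(points, (x, y))
            for i, x in enumerate(xs) for j, y in enumerate(ys)}


points = [(1, 3), (2, 1), (3, 2)]
diagram = naive_skyline_diagram(points)
print(diagram[(0, 0)])  # result shared by every query with x < 1 and y < 1
```

This costs one full skyline computation per cell, which hints at why the abstract calls diagram construction potentially expensive.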

    Eclipse: Generalizing kNN and Skyline

    $k$ nearest neighbor ($k$NN) queries and skyline queries are important operators on multi-dimensional data points. Given a query point, $k$NN query returns the $k$ nearest neighbors based on a scoring function such as a weighted sum of the attributes, which requires predefined attribute weights (or preferences). Skyline query returns all possible nearest neighbors for any monotonic scoring functions without requiring attribute weights, but the number of returned points can be prohibitively large. We observe that both $k$NN and skyline are inflexible and cannot be easily customized. In this paper, we propose a novel \emph{eclipse} operator that generalizes the classic 1NN and skyline queries and provides a more flexible and customizable query solution for users. In eclipse, users can specify rough and customizable attribute preferences and control the number of returned points. We show that both 1NN and skyline are instantiations of eclipse. To process eclipse queries, we propose a baseline algorithm with time complexity $O(n^2 2^{d-1})$, and an improved $O(n\log^{d-1} n)$ time transformation-based algorithm, where $n$ is the number of points and $d$ is the number of dimensions. Furthermore, we propose a novel index-based algorithm utilizing duality transform with much better efficiency. The experimental results on the real NBA dataset and the synthetic datasets demonstrate the effectiveness of the eclipse operator and the efficiency of our eclipse algorithms.
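The two classic operators being generalized can be sketched side by side. The code below assumes smaller attribute values are better and the query point is the origin. The helper names `knn` and `skyline` are illustrative only. It also checks the relationship stated in the abstract: the nearest neighbor under any positive weight vector is always a skyline point.

```python
def knn(points, weights, k):
    """k best points under a weighted-sum score (smaller is better)."""
    def score(p):
        return sum(w * x for w, x in zip(weights, p))
    return sorted(points, key=score)[:k]


def skyline(points):
    """Points not dominated by any other point."""
    def dominates(a, b):
        # a is no worse everywhere and strictly better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [p for p in points if not any(dominates(o, p) for o in points)]


points = [(1, 4), (2, 2), (4, 1), (3, 3)]
sky = skyline(points)
for weights in [(1, 1), (5, 1), (1, 5)]:
    nearest = knn(points, weights, 1)[0]
    # Different weights pick different winners, all inside the skyline.
    print(weights, nearest, nearest in sky)
```

The sketch shows both limitations the abstract mentions: `knn` needs an exact weight vector, while `skyline` needs none but returns every candidate winner.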

    Secure and Efficient Skyline Queries on Encrypted Data

    Outsourcing data and computation to a cloud server provides a cost-effective way to support large-scale data storage and query processing. However, due to security and privacy concerns, sensitive data (e.g., medical records) need to be protected from the cloud server and other unauthorized users. One approach is to outsource encrypted data to the cloud server and have the cloud server perform query processing on the encrypted data only. It remains a challenging task to support various queries over encrypted data in a secure and efficient way such that the cloud server does not gain any knowledge about the data, query, and query result. In this paper, we study the problem of secure skyline queries over encrypted data. The skyline query is particularly important for multi-criteria decision making but also presents significant challenges due to its complex computations. We propose a fully secure skyline query protocol on data encrypted using semantically-secure encryption. As a key subroutine, we present a new secure dominance protocol, which can also be used as a building block for other queries. Furthermore, we demonstrate two optimizations, data partitioning and lazy merging, to further reduce the computation load. Finally, we provide both serial and parallelized implementations and empirically study the protocols in terms of efficiency and scalability under different parameter settings, verifying the feasibility of our proposed solutions. Comment: 16 pages.
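The sketch below illustrates, on plaintext only, the general partition-then-merge pattern that a data partitioning optimization can build on. It involves no encryption and provides no security, and it is not the protocol from the paper. It relies on a standard property: the skyline of a dataset equals the skyline of the union of the skylines of its partitions. The `dominates` function is the plaintext analogue of the functionality a secure dominance protocol would compute. All names and the round-robin split are assumptions made for illustration.

```python
def dominates(a, b):
    """Plaintext dominance test: a is no worse everywhere, better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Quadratic-time skyline using pairwise dominance tests."""
    return [p for p in points if not any(dominates(o, p) for o in points)]


def partitioned_skyline(points, num_partitions):
    """Compute local skylines per partition, then merge them."""
    # Round-robin split; each partition could be processed in parallel.
    parts = [points[i::num_partitions] for i in range(num_partitions)]
    local = [p for part in parts for p in skyline(part)]
    # Only local winners take part in the final round of dominance tests.
    return skyline(local)


points = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5), (2, 5)]
print(sorted(partitioned_skyline(points, 2)) == sorted(skyline(points)))
```

Since each dominance test would be costly on encrypted data, pruning points inside small partitions before the final merge reduces the number of tests, which is the kind of saving the abstract attributes to its optimizations.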