4 research outputs found
Eclipse: Practicability Beyond kNN and Skyline
The $k$ nearest neighbor ($k$NN) query is a fundamental problem in databases.
Given a set of multidimensional data points and a query point, $k$NN returns
the $k$ nearest neighbors based on a scoring function such as a weighted sum
given an attribute weight vector. However, the attribute weight vector can be
difficult to specify in practice. Skyline returns a set of points that includes
all possible nearest neighbors without requiring the exact attribute weight
vector or a scoring function, but the number of returned points can be
prohibitively large for practical use.
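The contrast between the two operators can be illustrated with a minimal sketch. It assumes that smaller attribute values are better, that each point already holds its per-attribute distance to the query point, and that the scoring function is a plain weighted sum; the function names and the sample data are hypothetical and do not come from the paper.

```python
def knn_weighted_sum(points, weights, k):
    """Return the k points with the smallest weighted-sum score."""
    def score(p):
        return sum(w * x for w, x in zip(weights, p))
    return sorted(points, key=score)[:k]


def dominates(p, q):
    """p dominates q: no worse in every attribute, strictly better in one."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points):
    """Return every point that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical 2-D data set (assumption: smaller is better in both attributes).
points = [(1, 9), (2, 6), (4, 4), (7, 2), (9, 1), (5, 7), (8, 8)]

first_choice = knn_weighted_sum(points, (0.7, 0.3), k=1)
second_choice = knn_weighted_sum(points, (0.2, 0.8), k=1)
all_candidates = skyline(points)
```

Changing the weight vector changes the single nearest neighbor, while the skyline contains the winner for every such weight vector at the cost of returning five of the seven points.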
In this paper, we propose a novel \emph{eclipse} definition which provides a
more flexible and customizable definition than the classic $k$NN and skyline.
In eclipse, users can specify a range of attribute weights and control the
number of returned points. We show that both $k$NN and skyline are
instantiations of eclipse. To compute eclipse points, we propose a baseline
algorithm and an improved transformation-based algorithm that transforms the
eclipse problem into the skyline problem; the time complexity of each is
expressed in terms of the number of points and the number of dimensions.
Furthermore, we propose a novel index-based algorithm
utilizing duality transform with much better efficiency. The experimental
results on the real NBA dataset and the synthetic datasets demonstrate the
effectiveness and efficiency of our eclipse algorithms.
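A minimal sketch of how a range of attribute weights could drive the result size follows. It is an illustration under stated assumptions, not the paper's algorithm: the score is assumed to be a weighted sum with the last attribute's weight normalised to 1, the user supplies a (low, high) range for each remaining weight ratio, and one point is taken to dominate another when its score is no larger for every ratio vector in the ranges. Because the score difference is linear in the ratios, checking the corners of the ratio box is sufficient. All names and data are hypothetical.

```python
from itertools import product


def score(p, ratios):
    """Weighted sum with the last attribute's weight fixed to 1 (assumption)."""
    return sum(r * x for r, x in zip(ratios, p[:-1])) + p[-1]


def range_dominates(p, q, ratio_ranges):
    """p is no worse than q at every corner of the ratio box, better at one."""
    diffs = [score(p, c) - score(q, c) for c in product(*ratio_ranges)]
    return all(d <= 0 for d in diffs) and any(d < 0 for d in diffs)


def eclipse_sketch(points, ratio_ranges):
    """Points not dominated under any ratio vector inside the given ranges."""
    return [p for p in points
            if not any(range_dominates(q, p, ratio_ranges) for q in points)]


# Hypothetical 2-D data set (assumption: smaller is better in both attributes).
points = [(1, 9), (2, 6), (4, 4), (7, 2), (9, 1), (5, 7), (8, 8)]

exact = eclipse_sketch(points, [(2.0, 2.0)])    # single ratio: behaves like 1NN
rough = eclipse_sketch(points, [(0.5, 2.0)])    # rough preference
wide = eclipse_sketch(points, [(0.0, 1e9)])     # very wide range: skyline-like
```

On this data the exact ratio returns one point, the rough range returns three, and the very wide range returns the same five points as the classic skyline, which is the sense in which the range controls the number of returned points.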
Skyline Diagram: Efficient Space Partitioning for Skyline Queries
Skyline queries are important in many application domains. In this paper, we
propose a novel structure, the Skyline Diagram, which, given a set of points,
partitions the plane into a set of regions, referred to as skyline polyominos.
All query points in the same skyline polyomino have the same skyline query
results. Similar to the $k$th-order Voronoi diagram commonly used to facilitate
$k$ nearest neighbor ($k$NN) queries, the skyline diagram can be used to facilitate
skyline queries and many other applications. However, it may be computationally
expensive to build the skyline diagram. By exploiting some interesting
properties of skyline, we present several efficient algorithms for building the
diagram with respect to three kinds of skyline queries: quadrant, global, and
dynamic skylines. In addition, we propose an approximate skyline diagram which
can significantly reduce the space cost. Experimental results on both real and
synthetic datasets show that our algorithms are efficient and scalable.
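The idea that nearby query points share a skyline result can be sketched for the quadrant skyline. The sketch assumes that the quadrant skyline of a query considers only points no smaller than the query in every coordinate, and it uses the grid induced by the points' own coordinates as a naive partition; the paper's polyominos and construction algorithms are not reproduced here. Function names and data are hypothetical.

```python
from bisect import bisect_left


def dominates(p, q):
    """p dominates q: no worse in every attribute, strictly better in one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def quadrant_skyline(points, query):
    """Skyline of the points lying in the first quadrant of the query."""
    cands = [p for p in points if all(a >= b for a, b in zip(p, query))]
    return [p for p in cands if not any(dominates(o, p) for o in cands)]


def grid_cell(points, query):
    """Cell of the query in the grid induced by the points' coordinates."""
    cell = []
    for i, value in enumerate(query):
        axis = sorted({p[i] for p in points})
        cell.append(bisect_left(axis, value))  # coordinates strictly below
    return tuple(cell)


# Hypothetical 2-D data set.
points = [(1, 9), (2, 6), (4, 4), (7, 2), (9, 1), (5, 7), (8, 8)]

q1, q2, q3 = (3, 3), (3.5, 2.5), (4.5, 5)
same_cell = grid_cell(points, q1) == grid_cell(points, q2)
r1, r2, r3 = (quadrant_skyline(points, q) for q in (q1, q2, q3))
```

Two queries in the same grid cell see the same candidate set and therefore the same quadrant skyline, so a precomputed answer per cell can replace the skyline computation at query time; merging adjacent cells with identical answers would give larger regions.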
Eclipse: Generalizing kNN and Skyline
$k$ nearest neighbor ($k$NN) queries and skyline queries are important
operators on multi-dimensional data points. Given a query point, a $k$NN query
returns the $k$ nearest neighbors based on a scoring function such as a
weighted sum of the attributes, which requires predefined attribute weights (or
preferences). A skyline query returns all possible nearest neighbors for any
monotonic scoring function without requiring attribute weights, but the number
of returned points can be prohibitively large. We observe that both $k$NN and
skyline are inflexible and cannot be easily customized.
In this paper, we propose a novel \emph{eclipse} operator that generalizes
the classic $k$NN and skyline queries and provides a more flexible and
customizable query solution for users. In eclipse, users can specify rough and
customizable attribute preferences and control the number of returned points.
We show that both $k$NN and skyline are instantiations of eclipse. To process
eclipse queries, we propose a baseline algorithm and an improved
transformation-based algorithm, with time complexity expressed in terms of the
number of points and the number of dimensions. Furthermore, we propose a novel
index-based algorithm utilizing
duality transform with much better efficiency. The experimental results on the
real NBA dataset and the synthetic datasets demonstrate the effectiveness of
the eclipse operator and the efficiency of our eclipse algorithms.
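One way a transformation to the skyline problem could look is sketched below. This is an assumption made for illustration and is not claimed to be the paper's transformation: each point is mapped to the vector of its scores at the corners of the user's weight-ratio box, and an ordinary skyline is then computed over the mapped vectors. The weighted-sum score with the last weight normalised to 1, the function names, and the data are all hypothetical.

```python
from itertools import product


def corner_scores(p, ratio_ranges):
    """Map a point to its scores at every corner of the ratio box."""
    return tuple(sum(r * x for r, x in zip(c, p[:-1])) + p[-1]
                 for c in product(*ratio_ranges))


def dominates(u, v):
    """u dominates v: no worse in every coordinate, strictly better in one."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def skyline_indices(vectors):
    """Indices of the vectors that no other vector dominates."""
    return [i for i, v in enumerate(vectors)
            if not any(dominates(u, v) for u in vectors)]


def eclipse_via_skyline(points, ratio_ranges):
    """Transform the points, then run a classic skyline on the result."""
    mapped = [corner_scores(p, ratio_ranges) for p in points]
    return [points[i] for i in skyline_indices(mapped)]


# Hypothetical 3-D data set with two weight ratios, each in [0.5, 2.0].
points = [(1, 5, 9), (2, 2, 6), (4, 1, 4), (6, 6, 1), (3, 3, 7)]
result = eclipse_via_skyline(points, [(0.5, 2.0), (0.5, 2.0)])
plain_skyline = [points[i] for i in skyline_indices(points)]
```

The appeal of this design is that any existing skyline algorithm can be reused unchanged on the mapped vectors; the naive quadratic scan here only keeps the sketch short.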
Secure and Efficient Skyline Queries on Encrypted Data
Outsourcing data and computation to a cloud server provides a cost-effective
way to support large-scale data storage and query processing. However, due to
security and privacy concerns, sensitive data (e.g., medical records) need to
be protected from the cloud server and other unauthorized users. One approach
is to outsource encrypted data to the cloud server and have the cloud server
perform query processing on the encrypted data only. It remains a challenging
task to support various queries over encrypted data in a secure and efficient
way such that the cloud server does not gain any knowledge about the data,
query, and query result. In this paper, we study the problem of secure skyline
queries over encrypted data. The skyline query is particularly important for
multi-criteria decision making but also presents significant challenges due to
its complex computations. We propose a fully secure skyline query protocol on
data encrypted using semantically-secure encryption. As a key subroutine, we
present a new secure dominance protocol, which can also be used as a building
block for other queries. Furthermore, we demonstrate two optimizations, data
partitioning and lazy merging, to further reduce the computation load. Finally,
we provide both serial and parallelized implementations and empirically study
the protocols in terms of efficiency and scalability under different parameter
settings, verifying the feasibility of our proposed solutions.
Comment: 16 page
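The data partitioning idea can be illustrated on plaintext data. The sketch below involves no encryption and is not the paper's protocol: it only shows the underlying fact that the skyline of a data set equals the skyline of the union of its partitions' local skylines, so each partition can be processed separately before a final merge. The `dominates` function stands in for the comparison a secure dominance protocol would perform; all names and data are hypothetical.

```python
def dominates(p, q):
    """Plaintext stand-in for a dominance test between two records."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Naive skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def partitioned_skyline(points, num_parts):
    """Compute local skylines per partition, then merge them."""
    parts = [points[i::num_parts] for i in range(num_parts)]
    local = [skyline(part) for part in parts]   # independent, parallelizable
    merged = [p for part in local for p in part]
    return skyline(merged)                      # final pass over survivors


# Hypothetical records (assumption: smaller is better in both attributes).
points = [(1, 9), (2, 6), (4, 4), (7, 2), (9, 1), (5, 7), (8, 8), (3, 8)]
direct = skyline(points)
split = partitioned_skyline(points, num_parts=3)
```

Since dominated points are discarded inside each partition, the final pass performs fewer dominance tests than a single scan over all records, which matters most when every test is an expensive secure computation.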