Selectivity estimation aims to estimate the number of database objects that
satisfy a selection criterion. Solving this problem accurately and
efficiently is essential to many applications, such as density estimation,
outlier detection, query optimization, and data integration. The estimation
problem is especially challenging for large-scale, high-dimensional data due to
the curse of dimensionality, the large variance of selectivity across different
queries, and the need to make the estimator consistent (i.e., the selectivity
is non-decreasing in the threshold).
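To fix notation (a minimal formalization we assume here; the abstract itself does not state it): given a dataset $D$, a distance function $\mathrm{dist}$, a query object $q$, and a threshold $t$, the selectivity is

\[
\mathrm{sel}(q, t) = \big|\{\, x \in D : \mathrm{dist}(q, x) \le t \,\}\big|,
\]

and consistency requires $\mathrm{sel}(q, t_1) \le \mathrm{sel}(q, t_2)$ whenever $t_1 \le t_2$, i.e., $\mathrm{sel}$ is non-decreasing in $t$ for every fixed $q$.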
We propose a new deep learning-based model
that learns a query-dependent piecewise linear function as the selectivity
estimator. This function is flexible enough to fit the selectivity curve of any
query object and threshold, while guaranteeing that the output is
non-decreasing in the threshold.
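To illustrate how such a guarantee can be enforced by construction, the following PyTorch sketch conditions the parameters of a piecewise linear function on the query and keeps all slopes non-negative; the architecture and all names (`MonotonicPiecewiseEstimator`, `num_pieces`) are our assumptions for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MonotonicPiecewiseEstimator(nn.Module):
    """Sketch: a query-conditioned piecewise linear function of the threshold t
    that is non-decreasing in t by construction (hypothetical architecture)."""

    def __init__(self, query_dim: int, hidden: int = 64, num_pieces: int = 8):
        super().__init__()
        # Map the query object to per-piece parameters (breakpoints, slopes).
        self.backbone = nn.Sequential(
            nn.Linear(query_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * num_pieces),
        )

    def forward(self, query: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # query: (batch, query_dim); t: (batch,) thresholds.
        raw_knots, raw_slopes = self.backbone(query).chunk(2, dim=-1)
        # Cumulative sums of non-negative gaps yield sorted breakpoints.
        knots = torch.cumsum(F.softplus(raw_knots), dim=-1)
        # Non-negative slopes make each hinge term non-decreasing in t.
        slopes = F.softplus(raw_slopes)
        hinges = torch.relu(t.unsqueeze(-1) - knots)   # (batch, num_pieces)
        return (slopes * hinges).sum(dim=-1)           # non-decreasing in t
```

Because the output is a sum of hinge terms with non-negative slopes, monotonicity in the threshold holds for every query, while the learned breakpoints let the curve bend to fit different query objects.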
To improve accuracy on large datasets, we further propose to partition
the dataset into multiple disjoint subsets and build a local model on each of
them.
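A minimal sketch of the partition-and-combine idea, assuming k-means partitioning and a user-supplied model constructor (both our choices; the abstract does not specify the partitioning method): since the subsets are disjoint, per-partition count estimates can simply be summed.

```python
from sklearn.cluster import KMeans

def build_local_models(data, num_partitions, make_model):
    """Split the dataset into disjoint subsets (k-means is an assumed choice)
    and train one local estimator per subset."""
    labels = KMeans(n_clusters=num_partitions, n_init=10).fit_predict(data)
    subsets = [data[labels == k] for k in range(num_partitions)]
    return [make_model(subset) for subset in subsets]

def estimate_selectivity(query, t, local_models):
    """Each local model estimates the number of matches in its own partition;
    disjointness lets us sum the per-partition counts."""
    return sum(model(query, t) for model in local_models)
```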
We perform experiments on real datasets and show that the proposed model
significantly outperforms state-of-the-art models in accuracy while remaining
competitive in efficiency.