Duet: efficient and scalable hybriD neUral rElation undersTanding

Li, Ziqi; Lu, Yabin; Shu, Chang; Wang, Hongzhi; Yan, Yu; Yang, Donghua; Zhang, Kaixin

Publication date: 28 July 2023

Abstract

Learned cardinality estimation methods achieve higher precision than traditional methods. Among learned methods, query-driven approaches have long suffered from data and workload drift. Although both data-driven and hybrid methods have been proposed to avoid this problem, even the state-of-the-art ones suffer from high training and estimation costs, limited scalability, instability, and a long-tailed distribution problem on high-cardinality and high-dimensional tables, which seriously hinders the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicate information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method that estimates cardinality directly, without sampling or any non-differentiable process. Compared to Naru and UAE, Duet not only reduces the inference complexity from O(n) to O(1) but also achieves higher accuracy on high-cardinality and high-dimensional tables. Experimental results show that Duet achieves all of the design goals above, is much more practical, and even has a lower inference cost on CPU than most learned methods have on GPU.
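As a rough illustration of the factorization that autoregressive estimators build on, the sketch below computes a two-column cardinality as N * P(A in R_A) * P(B in R_B | A in R_A), conditioning directly on the predicate rather than on sampled values. It is not the authors' implementation: the toy table `ROWS`, the column meanings, and both function names are hypothetical, and exact empirical frequencies stand in for a learned model.

```python
# Hypothetical toy table with two columns: (age_bucket, city_id).
ROWS = [
    (1, 0), (1, 1), (2, 0), (2, 1),
    (2, 1), (3, 0), (3, 2), (3, 2),
]


def exact_cardinality(rows, pred_a, pred_b):
    """Ground truth: count the rows that satisfy both predicates."""
    return sum(1 for a, b in rows if pred_a(a) and pred_b(b))


def chain_rule_cardinality(rows, pred_a, pred_b):
    """Cardinality as N * P(A in R_A) * P(B in R_B | A in R_A).

    Assumption: empirical frequencies replace the learned conditional
    distributions that a real autoregressive estimator would output.
    """
    n = len(rows)
    matching_a = [(a, b) for a, b in rows if pred_a(a)]
    if not matching_a:
        return 0.0
    p_a = len(matching_a) / n
    p_b_given_a = sum(1 for _, b in matching_a if pred_b(b)) / len(matching_a)
    return n * p_a * p_b_given_a


if __name__ == "__main__":
    pred_a = lambda a: a >= 2   # range predicate on the first column
    pred_b = lambda b: b <= 1   # range predicate on the second column
    print(exact_cardinality(ROWS, pred_a, pred_b))
    print(chain_rule_cardinality(ROWS, pred_a, pred_b))
```

With exact conditionals the chain-rule product matches the true count. A learned estimator approximates each conditional factor, so the way those factors are evaluated largely determines its inference cost.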

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2307.13494

Last time updated on 28/07/2023