Robust Query Optimization Methods With Respect to Estimation Errors: A Survey

Author: Hameurlain Abdelkader
Morvan Franck
Yin Shaoyi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

The quality of a query execution plan chosen by a Cost-Based Optimizer (CBO) depends greatly on how accurately its input parameter values are estimated. Much research has gone into improving estimation accuracy, but the resulting techniques do not work in every situation. "Robust query optimization" was therefore introduced to minimize the risk of sub-optimal plans while accepting that estimates may be inaccurate. This survey gives an overview of robust query optimization methods: it classifies them into categories, explains their essential ideas, lists their advantages and limitations, and compares them against multiple criteria.
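
To make the sub-optimality risk concrete, here is a minimal sketch of one robust strategy (pick the plan with the lowest worst-case cost over an uncertainty interval) next to the classic choice based on a single point estimate. The cost formulas, plan names, and row counts are hypothetical and chosen only for illustration; they are not taken from the survey.

```python
import math

# Hypothetical cost models (illustrative assumptions, not from the survey).
def nested_loop_cost(outer_rows, inner_rows):
    # Index nested-loop join: one index probe per outer row.
    return outer_rows * math.log2(max(inner_rows, 2))

def hash_join_cost(outer_rows, inner_rows):
    # Hash join: scan both inputs, plus a fixed setup cost.
    return 3 * (outer_rows + inner_rows) + 1000

PLANS = {"nested_loop": nested_loop_cost, "hash_join": hash_join_cost}

def classic_choice(est_outer, inner_rows):
    """Pick the cheapest plan for a single cardinality estimate."""
    return min(PLANS, key=lambda name: PLANS[name](est_outer, inner_rows))

def robust_choice(low, high, inner_rows):
    """Pick the plan whose worst-case cost over [low, high] is smallest.

    Both cost models grow monotonically with outer_rows, so checking
    the two interval endpoints is enough to find the worst case.
    """
    def worst_case(name):
        cost = PLANS[name]
        return max(cost(low, inner_rows), cost(high, inner_rows))
    return min(PLANS, key=worst_case)

if __name__ == "__main__":
    # The point estimate says 100 outer rows; the true value may be up to 100,000.
    print(classic_choice(100, 10_000))
    print(robust_choice(100, 100_000, 10_000))
```

With these made-up numbers, the point estimate favors the nested-loop plan, while the interval-based choice prefers the hash join because its cost degrades far more gently if the estimate turns out to be too low.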

Prochlo: Strong Privacy for Analytics in the Crowd

Author: Abadi M.
Avent B.
Bellare M.
Bulck J. V.
Buse R. P. L.
Chen R.
Corrigan-Gibbs H.
Dang H.
Denning D. E. R.
Dinh T. T. A.
Dwork
Lee S.
Maniatis P.
Ohrimenko O.
Ravindranath L.
Roy I.
Saltzer J. H.
Viega J.
Wang T.
Warner
Zheng W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/10/2017

The large-scale monitoring of computer users' software activities has become commonplace, e.g., for application telemetry, error reporting, or demographic profiling. This paper describes a principled systems architecture, Encode, Shuffle, Analyze (ESA), for performing such monitoring with high utility while also protecting user privacy. The ESA design, and its Prochlo implementation, are informed by our practical experiences with an existing, large deployment of privacy-preserving software monitoring. (cont.; see the paper)
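
The three ESA stages can be pictured with a toy pipeline. This is only a sketch under stated assumptions: the function names, the report format, and the use of a minimum-count threshold in the shuffle stage are illustrative choices, not the Prochlo implementation, and the sketch omits the cryptographic protections a real deployment would need.

```python
import random
from collections import Counter

def encode(user_id, value):
    # Encode (client side): drop the user identifier, keep only the value.
    # Assumption: a real encoder would also encrypt and possibly fragment the data.
    return {"value": value}

def shuffle(reports, threshold, rng):
    # Shuffle (intermediary): suppress values reported by fewer than
    # `threshold` users, then randomize the order to break linkability.
    counts = Counter(r["value"] for r in reports)
    kept = [r for r in reports if counts[r["value"]] >= threshold]
    rng.shuffle(kept)
    return kept

def analyze(reports):
    # Analyze (server side): compute aggregate statistics only.
    return Counter(r["value"] for r in reports)

if __name__ == "__main__":
    raw = [("u1", "crash_A"), ("u2", "crash_A"), ("u3", "crash_A"), ("u4", "crash_B")]
    encoded = [encode(user, value) for user, value in raw]
    shuffled = shuffle(encoded, threshold=2, rng=random.Random(0))
    print(analyze(shuffled))
```

In this toy run the value reported by a single user never reaches the analysis stage, which is the intuition behind hiding individuals "in the crowd".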

arXiv.org e-Print Archive

University of Toronto Research Repository

Duet: efficient and scalable hybriD neUral rElation undersTanding

Author: Li Ziqi
Lu Yabin
Shu Chang
Wang Hongzhi
Yan Yu
Yang Donghua
Zhang Kaixin
Publication date: 28/07/2023

Learned cardinality estimation methods have achieved high precision compared to traditional methods. Among learned methods, query-driven approaches have long faced the data and workload drift problem. Although both query-driven and hybrid methods have been proposed to avoid this problem, even the state-of-the-art among them suffer from high training and estimation costs, limited scalability, instability, and a long-tail distribution problem on high-cardinality and high-dimensional tables, all of which seriously hinder the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicate information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method that estimates cardinality directly, without sampling or any non-differentiable process. Compared to Naru and UAE, Duet not only reduces the inference complexity from O(n) to O(1) but also achieves higher accuracy on high-cardinality and high-dimensional tables. Experimental results show that Duet achieves all of the design goals above, is much more practical, and even has a lower inference cost on CPU than most learned methods have on GPU.
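
As background for the abstract above, the general autoregressive formulation of cardinality estimation can be written as follows. This is a generic sketch, not Duet's own derivation; the symbols N (row count), A_1, ..., A_n (columns), and R (the region selected by the query predicates) are notation assumed here.

```latex
% Autoregressive factorization of the joint distribution over n columns
P(a_1, \dots, a_n) = \prod_{i=1}^{n} P(a_i \mid a_1, \dots, a_{i-1})

% Estimated cardinality of a query selecting the region R of a table with N rows
\widehat{\mathrm{card}}(R) = N \cdot \sum_{(a_1, \dots, a_n) \in R} P(a_1, \dots, a_n)
```

Because the sum over R is usually too large to enumerate for range predicates, sampling-based estimators approximate it with Monte Carlo draws, which is the step the abstract says Duet avoids.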

arXiv.org e-Print Archive

Weiterentwicklung analytischer Datenbanksysteme

Author: Kipf Andreas Michael
Publication venue: Technische Universität München

This thesis contributes to the state of the art in analytical database systems. First, we identify and explore extensions to better support analytics on event streams. Second, we propose a novel polygon index to enable efficient geospatial data processing in main memory. Third, we contribute a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization.