Search CORE

1,588 research outputs found

Partial-sum queries in OLAP data cubes using covering codes

Author: Ching-tien Ho
Jehoshua Bruck
Rakesh Agrawal
Senior Member
Senior Member
Publication venue
Publication date: 01/01/1997
Field of study

A partial-sum query obtains the summation over a set of specified cells of a data cube. We establish a connection between the covering problem in the theory of error-correcting codes and the partial-sum problem and use this connection to devise algorithms for the partial-sum problem with efficient space-time trade-offs. For example, using our algorithms, with 44 percent additional storage, the query response time can be improved by about 12 percent; by roughly doubling the storage requirement, the query response time can be improved by about 34 percent

CiteSeerX

Caltech Authors

Collaborative OLAP with Tag Clouds: Web 2.0 OLAP Formalism and Experimental Evaluation

Author: Aouiche Kamel
Godin Robert
Lemire Daniel
Publication venue
Publication date: 15/03/2016
Field of study

Increasingly, business projects are ephemeral. New Business Intelligence tools must support ad-lib data sources and quick perusal. Meanwhile, tag clouds are a popular community-driven visualization technique. Hence, we investigate tag-cloud views with support for OLAP operations such as roll-ups, slices, dices, clustering, and drill-downs. As a case study, we implemented an application where users can upload data and immediately navigate through its ad hoc dimensions. To support social networking, views can be easily shared and embedded in other Web sites. Algorithmically, our tag-cloud views are approximate range top-k queries over spontaneous data cubes. We present experimental evidence that iceberg cuboids provide adequate online approximations. We benchmark several browser-oblivious tag-cloud layout optimizations.Comment: Software at https://github.com/lemire/OLAPTagClou

arXiv.org e-Print Archive

CiteSeerX

Clustering-Based Materialized View Selection in Data Warehouses

Author: A. Shukla
A.F. Cardenas
H. Gupta
H. Gupta
J.R. Smith
Jonathan Goldstein
S. Rizzi
S.B. Yao
X. Baril
Publication venue
Publication date: 01/01/2006
Field of study

Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited

arXiv.org e-Print Archive

CiteSeerX

Crossref

XWeB: the XML Warehouse Benchmark

Author: A. Schmidt
A. Simitsis
C. Kit
J. Darmont
J. Gray
K. Runapongsa
L. Afanasiev
L. Wyatt
P. O’Neil
R. Kimball
R. Torlone
S. Bressan
S. Rizzi
T. Böhme
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/09/2010
Field of study

With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems

arXiv.org e-Print Archive

Crossref

HAL

Diamond Dicing

Author: Antony
Bouman
Börzsönyi
Cerf
Daniel Lemire
Donjerkovic
Engene
Fang
Frank
Godin
Hahn
Hazel Webb
Kaser
Knorr
Kondo
Korn
Kumar
Lemire
Ley
Mazón
MonetDB BV
Netflix Inc.
Ng
O'Neil
Owen Kaser
Porter
Rizzi
Sarawagi
Tang
Transaction Processing Performance Council
Turney
Webb
Webb
Wille
Ślezak
Publication venue: 'Elsevier BV'
Publication date: 01/09/2013
Field of study

In OLAP, analysts often select an interesting sample of the data. For example, an analyst might focus on products bringing revenues of at least 100 000 dollars, or on shops having sales greater than 400 000 dollars. However, current systems do not allow the application of both of these thresholds simultaneously, selecting products and shops satisfying both thresholds. For such purposes, we introduce the diamond cube operator, filling a gap among existing data warehouse operations. Because of the interaction between dimensions the computation of diamond cubes is challenging. We compare and test various algorithms on large data sets of more than 100 million facts. We find that while it is possible to implement diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a hundred times faster than popular database engines (including a row-store and a column-store).Comment: 29 page

arXiv.org e-Print Archive

R-libre

Crossref