Attribute Value Reordering For Efficient Hybrid OLAP

Author: Kaser Owen
Lemire Daniel
Publication venue: 'Elsevier BV'
Publication date: 01/01/2005

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1x3 chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(d n log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19%-30% more efficient than ROLAP, but normalization can improve it further by 9%-13% for a total gain of 29%-44% over ROLAP.
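The dimension-wise attribute frequency sorting mentioned in this abstract can be illustrated with a minimal sketch. It assumes facts are tuples of attribute values (one per dimension) and that values are ranked from most to least frequent; the function names, the descending order, and the tie-breaking rule are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def frequency_sort_normalization(facts, num_dims):
    """Return, for each dimension, a map from attribute value to its new index.

    Assumption: values are ordered by descending frequency within each dimension.
    """
    orderings = []
    for d in range(num_dims):
        # Count how many facts use each attribute value in dimension d.
        counts = Counter(fact[d] for fact in facts)
        # Most frequent first; ties broken by repr() so the result is deterministic.
        ranked = sorted(counts, key=lambda v: (-counts[v], repr(v)))
        orderings.append({value: rank for rank, value in enumerate(ranked)})
    return orderings

def normalize(facts, orderings):
    """Rewrite each fact as a tuple of integer coordinates in the reordered cube."""
    return [tuple(orderings[d][v] for d, v in enumerate(fact)) for fact in facts]

# Hypothetical toy data: (product, city) pairs.
facts = [("tea", "Paris"), ("tea", "Oslo"), ("milk", "Paris"), ("tea", "Paris")]
orderings = frequency_sort_normalization(facts, 2)
cells = normalize(facts, orderings)
```

Sorting each dimension independently costs one sort per dimension, which is consistent with the O(d n log(n)) bound quoted in the abstract. Frequent values end up with low indices, so allocated cells tend to cluster in one corner of the array.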

arXiv.org e-Print Archive

CiteSeerX

R-libre

Archipel - Université du Québec à Montréal

Diamond Dicing

Author: Hazel Webb
Owen Kaser
Daniel Lemire
Publication venue: 'Elsevier BV'
Publication date: 01/09/2013

In OLAP, analysts often select an interesting sample of the data. For example, an analyst might focus on products bringing revenues of at least 100 000 dollars, or on shops having sales greater than 400 000 dollars. However, current systems do not allow the application of both of these thresholds simultaneously, selecting products and shops satisfying both thresholds. For such purposes, we introduce the diamond cube operator, filling a gap among existing data warehouse operations. Because of the interaction between dimensions, the computation of diamond cubes is challenging. We compare and test various algorithms on large data sets of more than 100 million facts. We find that while it is possible to implement diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a hundred times faster than popular database engines (including a row-store and a column-store). Comment: 29 pages
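The interaction between dimensions described in this abstract can be seen in a small sketch: removing a product lowers the totals of the shops that sold it, which may push those shops below their own threshold, and so on. The code below is a naive fixed-point loop, assuming two dimensions, a sum aggregate, and non-negative revenues; the function name and the data are hypothetical, and this is not the authors' optimized algorithm.

```python
from collections import defaultdict

def diamond(facts, thresholds):
    """Repeatedly drop facts until every product and shop meets its threshold.

    facts: list of (product, shop, revenue) tuples.
    thresholds: (minimum total per product, minimum total per shop).
    Assumption: revenues are non-negative, so pruning only lowers totals.
    """
    current = list(facts)
    while True:
        # Recompute the per-product and per-shop totals on the surviving facts.
        totals = [defaultdict(float), defaultdict(float)]
        for fact in current:
            for d in range(2):
                totals[d][fact[d]] += fact[2]
        kept = [
            fact for fact in current
            if all(totals[d][fact[d]] >= thresholds[d] for d in range(2))
        ]
        if len(kept) == len(current):
            return kept  # Fixed point: nothing more to prune.
        current = kept

# Hypothetical toy data.
facts = [
    ("A", "s1", 100), ("A", "s2", 50),
    ("B", "s1", 30), ("B", "s3", 10),
    ("C", "s2", 60),
]
result = diamond(facts, thresholds=(100, 100))
```

In this toy run, shop s2 passes the first round with a total of 110, but once product C is pruned its total falls to 50 and it is removed in the second round. A single filtering pass would therefore give the wrong answer.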

arXiv.org e-Print Archive

R-libre

Crossref

Mapping the state of financial stability

Author: Peltonen Tuomas A.
Sarlin Peter

The paper uses the Self-Organizing Map for mapping the state of financial stability and visualizing the sources of systemic risks as well as for predicting systemic financial crises. The Self-Organizing Financial Stability Map (SOFSM) enables a two-dimensional representation of a multidimensional financial stability space that allows disentangling the individual sources impacting on systemic risks. The SOFSM can be used to monitor macro-financial vulnerabilities by locating a country in the financial stability cycle: the pre-crisis, crisis, post-crisis or tranquil state. In addition, the SOFSM performs better than or equally well as a logit model in classifying in-sample data and predicting out-of-sample the global financial crisis that started in 2007. Model robustness is tested by varying the thresholds of the models, the policymaker’s preferences, and the forecasting horizons. JEL Classification: E44, E58, F01, F37, G01. Keywords: macroprudential supervision, prediction, Self-Organizing Map (SOM), systemic financial crisis, systemic risk, visualization

Research Papers in Economics

Mapping the State of Financial Stability

Author: Peltonen Tuomas A.
Sarlin Peter

The paper uses the Self-Organizing Map for mapping the state of financial stability and visualizing the sources of systemic risks on a two-dimensional plane as well as for predicting systemic financial crises. The Self-Organizing Financial Stability Map (SOFSM) enables a two-dimensional representation of a multidimensional financial stability space and thus allows disentangling the individual sources impacting on systemic risks. The SOFSM can be used to monitor macro-financial vulnerabilities by locating a country in the financial stability cycle: the pre-crisis, crisis, post-crisis or tranquil state. In addition, the SOFSM performs better than or equally well as a logit model in classifying in-sample data and predicting out-of-sample the global financial crisis that started in 2007. Model robustness is tested by varying the thresholds of the models, the policymaker’s preferences, and the forecasting horizon. Keywords: systemic financial crisis; systemic risk; self-organizing maps; visualisation; prediction; macroprudential supervision

Research Papers in Economics

CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees

Author: Fu Lixin
Publication date: 01/01/2003

We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We focus on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called a statistics tree (ST) during a one-time scan of the detailed data. To optimize queries involving constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family that can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems.
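One way to picture how a tree can hold every aggregate view in its leaves is the simplified sketch below. It assumes one tree level per dimension, a special "*" branch meaning "all values", and leaves that store a (count, sum) pair. The class name, the dictionary layout, and the restriction to point-or-all queries are illustrative assumptions; the sketch does not reproduce the CubiST++ data structure or its families of trees.

```python
STAR = "*"  # Assumed marker for "all values"; real attribute values must differ from it.

class StatisticsTree:
    """Toy statistics tree: nested dicts, one level per dimension."""

    def __init__(self, num_dims):
        self.num_dims = num_dims
        self.root = {}

    def insert(self, key, measure):
        """Fold one detail record into every aggregate it contributes to."""
        self._insert(self.root, key, 0, measure)

    def _insert(self, node, key, depth, measure):
        # Follow both the concrete value and the star branch at every level,
        # so a record with d dimensions updates 2**d leaves.
        for branch in (key[depth], STAR):
            if depth == self.num_dims - 1:
                count, total = node.get(branch, (0, 0.0))
                node[branch] = (count + 1, total + measure)
            else:
                self._insert(node.setdefault(branch, {}), key, depth + 1, measure)

    def query(self, key):
        """Return (count, sum) for a key whose entries are values or STAR."""
        node = self.root
        for branch in key[:-1]:
            node = node.get(branch)
            if node is None:
                return (0, 0.0)
        return node.get(key[-1], (0, 0.0))

# Hypothetical toy data: (product, city) keys with a sales measure.
tree = StatisticsTree(2)
tree.insert(("tea", "Paris"), 5.0)
tree.insert(("tea", "Oslo"), 3.0)
tree.insert(("milk", "Paris"), 2.0)
tea_everywhere = tree.query(("tea", STAR))
```

After the single pass over the records, a cube query is answered by one root-to-leaf walk with no further aggregation. The cost is that each insert touches a number of leaves exponential in the number of dimensions.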

The University of North Carolina at Greensboro

RIOTs in Germany - Constructing an interregional input-output table for Germany

Author: Krebs Oliver
Publication venue: Universität Tübingen
Publication date: 01/01/2020

This paper shows how to adapt recent methodological advances to derive a shipment-based interregional input-output table for 402 German counties and 26 foreign partners for 17 sectors that is, for national aggregates, cell-by-cell compatible with the WIOD tables. It far outperforms the standard approach of applying unit values to interregional shipments in replicating observed regional statistics and can be used for improved impact analysis and CGE model calibration. It thereby mitigates the surprising but problematic lack of regional German trade data in the analysis of both the regional effects of aggregate shocks, such as trade agreements, and the network effects of regional policies. Moreover, the paper takes an in-depth look at the derived German production structure and trade network at the county level, finding a surprisingly vast heterogeneity with respect to specialization, agglomeration and trade partners.

Publikationsserver der Universität Tübingen