Attribute Value Reordering For Efficient Hybrid OLAP
The normalization of a data cube is the ordering of the attribute values. For
large multidimensional arrays where dense and sparse chunks are stored
differently, proper normalization can lead to improved storage efficiency. We
show that it is NP-hard to compute an optimal normalization even for 1x3
chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are
nearly statistically independent, we show that dimension-wise attribute
frequency sorting is an optimal normalization and takes time O(d n log(n)) for
data cubes of size n^d. When dimensions are not independent, we propose and
evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is
already 19%-30% more efficient than ROLAP, but normalization can improve it
further by 9%-13% for a total gain of 29%-44% over ROLAP.
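The dimension-wise frequency sorting mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of the cube as a set of coordinate tuples for its non-empty cells, the descending sort order, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter

def frequency_sort_normalization(cells, d):
    """Return one {old_value: new_index} mapping per dimension.

    cells: set of d-tuples, one per non-empty cell (assumed representation).
    """
    mappings = []
    for dim in range(d):
        # Count how many non-empty cells use each attribute value.
        counts = Counter(cell[dim] for cell in cells)
        # Assumption: most frequent value first; ties broken by value.
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        mappings.append({v: i for i, v in enumerate(ordered)})
    return mappings

def apply_normalization(cells, mappings):
    """Relabel every cell coordinate with its new position."""
    return {tuple(m[v] for m, v in zip(mappings, cell)) for cell in cells}

cells = {(0, 2), (3, 2), (3, 0), (3, 1), (1, 2)}
mappings = frequency_sort_normalization(cells, d=2)
normalized = apply_normalization(cells, mappings)
```

Each dimension is counted and sorted independently, which is where the per-dimension sorting cost in the stated O(d n log(n)) bound comes from. Attribute values that appear in no non-empty cell receive no mapping in this sketch.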
Diamond Dicing
In OLAP, analysts often select an interesting sample of the data. For
example, an analyst might focus on products bringing revenues of at least 100
000 dollars, or on shops having sales greater than 400 000 dollars. However,
current systems do not allow the application of both of these thresholds
simultaneously, selecting products and shops satisfying both thresholds. For
such purposes, we introduce the diamond cube operator, filling a gap among
existing data warehouse operations.
Because of the interaction between dimensions, the computation of diamond
cubes is challenging. We compare and test various algorithms on large data sets
of more than 100 million facts. We find that while it is possible to implement
diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a
hundred times faster than popular database engines (including a row-store and a
column-store). Comment: 29 pages
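The interaction between dimensions described in this abstract can be seen in a small sketch that applies both thresholds repeatedly until nothing more is removed. It is an illustration only, not the paper's algorithm: the function name `diamond`, the `(product, shop, amount)` fact layout, the use of "at least" for both thresholds, and the sample data are assumptions.

```python
from collections import defaultdict

def diamond(facts, min_product_total, min_shop_total):
    """Keep only facts whose product AND shop both meet their thresholds,
    where totals are computed over the facts that are themselves kept."""
    kept = list(facts)
    while True:
        product_total = defaultdict(int)
        shop_total = defaultdict(int)
        for product, shop, amount in kept:
            product_total[product] += amount
            shop_total[shop] += amount
        pruned = [
            (p, s, a) for p, s, a in kept
            if product_total[p] >= min_product_total
            and shop_total[s] >= min_shop_total
        ]
        # Removing a fact can push another product or shop below its
        # threshold, so repeat until a pass removes nothing.
        if len(pruned) == len(kept):
            return pruned
        kept = pruned

facts = [
    ("tv", "north", 100),
    ("radio", "north", 100),
    ("phone", "south", 95),
    ("tv", "south", 60),
    ("phone", "east", 10),
]
result = diamond(facts, min_product_total=100, min_shop_total=150)
```

In this made-up data, "phone" (total 105) and "south" (total 155) each pass their own threshold at first, yet neither survives: dropping the "east" shop lowers the phone total to 95, and dropping phone then lowers the south total to 60. Only the two "north" facts remain.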
Mapping the state of financial stability
The paper uses the Self-Organizing Map for mapping the state of financial stability and visualizing the sources of systemic risks as well as for predicting systemic financial crises. The Self-Organizing Financial Stability Map (SOFSM) enables a two-dimensional representation of a multidimensional financial stability space that allows disentangling the individual sources impacting on systemic risks. The SOFSM can be used to monitor macro-financial vulnerabilities by locating a country in the financial stability cycle, whether in the pre-crisis, crisis, post-crisis or tranquil state. In addition, the SOFSM performs as well as or better than a logit model in classifying in-sample data and in predicting, out of sample, the global financial crisis that started in 2007. Model robustness is tested by varying the thresholds of the models, the policymaker’s preferences, and the forecasting horizons. JEL Classification: E44, E58, F01, F37, G01. Keywords: macroprudential supervision, prediction, Self-Organizing Map (SOM), systemic financial crisis, systemic risk, visualization
Mapping the State of Financial Stability
The paper uses the Self-Organizing Map for mapping the state of financial stability and visualizing the sources of systemic risks on a two-dimensional plane as well as for predicting systemic financial crises. The Self-Organizing Financial Stability Map (SOFSM) enables a two-dimensional representation of a multidimensional financial stability space and thus allows disentangling the individual sources impacting on systemic risks. The SOFSM can be used to monitor macro-financial vulnerabilities by locating a country in the financial stability cycle, whether in the pre-crisis, crisis, post-crisis or tranquil state. In addition, the SOFSM performs as well as or better than a logit model in classifying in-sample data and in predicting, out of sample, the global financial crisis that started in 2007. Model robustness is tested by varying the thresholds of the models, the policymaker’s preferences, and the forecasting horizon. Keywords: systemic financial crisis; systemic risk; self-organizing maps; visualisation; prediction; macroprudential supervision
CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees
We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We focus on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multi-dimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called a statistics tree (ST) during a one-time scan of the detailed data. To optimize queries involving constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family that can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability when compared with existing MOLAP and ROLAP systems.
RIOTs in Germany - Constructing an interregional input-output table for Germany
This paper shows how to adapt recent methodological advances to derive a
shipment-based interregional input-output table for 402 German counties, 26
foreign partners and 17 sectors that is, for national aggregates, cell-by-cell
compatible with the WIOD tables. It far outperforms the standard approach of
applying unit values to interregional shipments in replicating observed regional
statistics and can be used for improved impact analysis and CGE model
calibration. It thereby mitigates the surprising but problematic lack of
regional German trade data in the analysis of both the regional effects of
aggregate shocks, such as trade agreements, and the network effects of regional
policies. Moreover, the paper takes an in-depth look at the derived German
production structure and trade network at the county level, finding a
surprisingly vast heterogeneity with respect to specialization, agglomeration
and trade partners.