Data Cube Approximation and Mining using Probabilistic Modeling
On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow the exploration of data cubes according to different analysis axes (dimensions) and under different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data.
Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate multidimensional values. With the first technique, we compute the set of components that best fits the initial data set and whose superposition coincides with the original data; with the second, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions, and discover possible outliers in data cells. A real-life example is
used to (i) discuss the potential benefits of the modeling output for cube exploration and mining, (ii) show how OLAP queries can be answered in an approximate way, and (iii) illustrate the strengths and limitations of these modeling approaches
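The outlier-detection idea in this abstract can be illustrated with the simplest log-linear model, the independence (main-effects-only) model, whose fitted counts are products of the marginals. A minimal sketch in pure Python, on a made-up 2-way table (the data and the residual threshold of 2 are assumptions, not the paper's):

```python
import math

# Toy 2-way "data cube": counts by (region, product) -- hypothetical data.
table = [
    [30, 20, 10],
    [20, 15,  5],
    [10,  5, 40],   # the (2, 2) cell is deliberately inflated
]

n_rows, n_cols = len(table), len(table[0])
total = sum(sum(row) for row in table)
row_marg = [sum(row) for row in table]
col_marg = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]

# Independence log-linear model: log m_ij = u + u_row(i) + u_col(j),
# whose fitted counts are m_ij = row_i * col_j / total.
expected = [[row_marg[i] * col_marg[j] / total for j in range(n_cols)]
            for i in range(n_rows)]

# Pearson residuals flag cells the parsimonious model cannot explain;
# the inflated cell also distorts the marginals, so its row and column
# neighbours may be flagged along with it.
outliers = [(i, j)
            for i in range(n_rows) for j in range(n_cols)
            if abs(table[i][j] - expected[i][j]) / math.sqrt(expected[i][j]) > 2]
print(outliers)  # (2, 2) is among the flagged cells
```

Answering an OLAP query from the fitted counts `expected` instead of `table` is exactly the approximate query answering the abstract mentions, with the residuals quantifying the error.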
Hierarchical Structure of Magnetohydrodynamic Turbulence In Position-Position-Velocity Space
Magnetohydrodynamic turbulence is able to create hierarchical structures in
the interstellar medium that are correlated on a wide range of scales via the
energy cascade. We use hierarchical tree diagrams known as dendrograms to
characterize structures in synthetic Position-Position-Velocity (PPV) emission
cubes of optically thin isothermal magnetohydrodynamic turbulence. We show that
the structures and degree of hierarchy observed in PPV space are related to the
physics of the gas, i.e. self-gravity and the global sonic and Alfvenic Mach
number. Simulations with higher Alfvenic Mach number, self-gravity and
supersonic flows display enhanced hierarchical structure. We observe a strong
sonic and Alfvenic dependence when we apply the statistical moments (i.e.
mean, variance, skewness, kurtosis) to the dendrogram distribution. Larger
magnetic field and sonic Mach number correspond to larger values of the
moments. Application of the dendrogram to 3D density cubes, also known as
Position-Position-Position cubes (PPP), reveals that the dominant emission
contours in PPP and PPV are related for supersonic gas but not for subsonic. We
also explore the effects of smoothing, thermal broadening and velocity
resolution on the dendrograms in order to make our study more applicable to
observational data. These results all point to hierarchical tree diagrams as a
promising additional tool for studying ISM turbulence and star-forming regions,
providing information on the degree of self-gravity, the Mach numbers, and the
complicated relationship between PPV and PPP.
Comment: submitted to Ap
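The moment statistics applied to the dendrogram distribution are standard sample moments. A minimal pure-Python sketch, assuming the dendrogram has already been reduced to a list of structure intensities (the values below are made up):

```python
import math

def moments(xs):
    """Mean, variance, skewness, and excess kurtosis of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = math.sqrt(var)
    skew = sum(((x - mean) / std) ** 3 for x in xs) / n
    kurt = sum(((x - mean) / std) ** 4 for x in xs) / n - 3.0
    return mean, var, skew, kurt

# Hypothetical peak intensities of dendrogram structures from a PPV cube;
# the long tail (7.5) mimics the enhanced hierarchy of supersonic runs.
intensities = [1.2, 1.5, 1.7, 2.0, 2.3, 2.9, 4.1, 7.5]
mean, var, skew, kurt = moments(intensities)
print(mean, var, skew, kurt)
```

Larger skewness and kurtosis of this distribution are what the abstract associates with larger sonic Mach number and magnetic field.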
JP3D compression of solar data-cubes: photospheric imaging and spectropolarimetry
Hyperspectral imaging is a ubiquitous technique in solar physics observations,
and recent advances in solar instrumentation have enabled us to acquire and
record data at an unprecedented rate. The huge amount of data that will be
archived by the upcoming solar observatories presses us to compress the data in
order to reduce storage space and transfer times. The correlation present over
all dimensions of solar data-sets, spatial, temporal and spectral, suggests the
use of a 3D wavelet decomposition to achieve higher compression rates. In this
work, we evaluate the performance of the recent JPEG2000 Part 10 standard,
known as JP3D, for the lossless compression of several types of solar
data-cubes. We explore the differences in: a) the compressibility of broad-band
or narrow-band time-sequences, and of Stokes I or V profiles in
spectropolarimetric data-sets; b) compressing data in [x,y,λ] packages at
different times or data in [x,y,t] packages at different wavelengths; c)
compressing a single large data-cube or several smaller data-cubes; d)
compressing data which is under-sampled or super-sampled with respect to the
diffraction cut-off
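The premise that correlation drives compressibility is easy to demonstrate. The sketch below is not JP3D: it uses stdlib zlib as a stand-in coder on two synthetic byte "cubes", one smooth along every axis (as solar data is) and one pure noise:

```python
import os
import zlib

# Two synthetic "data-cubes" serialized to bytes: one smoothly varying
# (highly correlated along every axis) and one of random noise.
# zlib stands in here for a real 3D wavelet coder such as JP3D.
n = 16
smooth = bytes((x + y + z) % 256
               for x in range(n) for y in range(n) for z in range(n))
noise = os.urandom(n * n * n)

# Compressed size divided by original size: lower is better.
ratio_smooth = len(zlib.compress(smooth, 9)) / len(smooth)
ratio_noise = len(zlib.compress(noise, 9)) / len(noise)
print(ratio_smooth, ratio_noise)  # the correlated cube compresses far better
```

The same effect is why exploiting all three dimensions jointly, as JP3D does, beats slice-by-slice 2D coding on correlated data.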
A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing
The ever-increasing amount of stored data has spurred researchers to seek
methods to exploit it optimally, most of which face a response-time problem
caused by the enormous size of the data. Most solutions suggest materialization
as the favoured remedy; however, such a solution cannot attain Real-Time
answers. In this paper we propose a framework illustrating the barriers, and
suggested solutions, on the way to achieving Real-Time OLAP answers, which are
widely used in decision support systems and data warehouses
Attribute Value Reordering For Efficient Hybrid OLAP
The normalization of a data cube is the ordering of the attribute values. For
large multidimensional arrays where dense and sparse chunks are stored
differently, proper normalization can lead to improved storage efficiency. We
show that it is NP-hard to compute an optimal normalization even for 1x3
chunks, although we find an exact algorithm for 1x2 chunks. When dimensions are
nearly statistically independent, we show that dimension-wise attribute
frequency sorting is an optimal normalization and takes time O(d n log(n)) for
data cubes of size n^d. When dimensions are not independent, we propose and
evaluate several heuristics. The hybrid OLAP (HOLAP) storage mechanism is
already 19%-30% more efficient than ROLAP, but normalization can improve it
further by 9%-13% for a total gain of 29%-44% over ROLAP
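The dimension-wise frequency sorting described above is simple to sketch. A minimal pure-Python illustration (the sparse cube below is made up), relabeling each dimension's attribute values by decreasing marginal frequency so that non-empty cells cluster toward one corner of the array:

```python
from collections import Counter

# Non-empty cells of a tiny 2-dimensional sparse data cube, given as
# (row_value, col_value) pairs -- hypothetical data.
cells = [(0, 2), (0, 4), (2, 2), (2, 4), (2, 1), (4, 2), (4, 4), (3, 0)]

def frequency_sort_normalization(cells, dims=2):
    """Relabel each dimension's attribute values by decreasing frequency."""
    remaps = []
    for d in range(dims):
        counts = Counter(cell[d] for cell in cells)
        order = sorted(counts, key=lambda v: -counts[v])
        remaps.append({v: rank for rank, v in enumerate(order)})
    return [tuple(remaps[d][cell[d]] for d in range(dims)) for cell in cells]

normalized = frequency_sort_normalization(cells)
# The most frequent attribute value in each dimension maps to index 0,
# packing the non-empty cells toward the origin and leaving fewer
# half-full chunks to store in the dense representation.
print(sorted(normalized))
```

This packing is what improves chunk density: cells that share frequent attribute values end up adjacent, so more 1x2 (or larger) chunks are either completely full or completely empty.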