
    Holistic Cube Analysis: A Query Framework for Data Insights

    We present Holistic Cube Analysis (HoCA), a framework that augments the capabilities of relational queries for data insights. We first define AbstractCube, a data type defined as a function from a RegionFeatures space to relational tables. AbstractCube provides a logical form of data on which HoCA operators and their compositions operate to analyze the data. This function-as-data modeling allows us to simultaneously capture a space of non-uniform tables on the co-domain of the function and region-space structure on the domain of the function. We describe two HoCA operators, cube crawling and cube join, which are cube-to-cube transformations (i.e., higher-order functions). Cube crawling explores a region subspace and outputs a cube mapping regions to signal vectors. Cube join, in turn, allows users to meld information in different cubes, which is critical for composition. The cube crawling interface introduces two novel features: (1) Region Analysis Models (RAMs), which allow one to program and organize analysis on a set of data features into a module, and (2) Multi-Model Crawling, which allows one to apply multiple models, potentially on different feature sets, during crawling. These two features, together with cube join and a rich RAM library, allow us to construct succinct HoCA programs that capture a wide variety of data-insight problems in system monitoring, experimentation analysis, and business intelligence. HoCA poses a rich algorithmic design space, such as optimizing crawling performance by leveraging region-space structure, optimizing cube join performance, and physical designs of cubes. We describe several cube crawling implementations leveraging different foundations (an in-house relational query engine, and Apache Beam), and evaluate their performance characteristics. Finally, we discuss avenues for extending the framework, such as devising more useful HoCA operators.
    Comment: Establishing core concepts of HoCA
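    The following is a minimal sketch of the function-as-data idea described in this abstract, using a toy in-memory representation. The names (RegionFeatures, AbstractCube, crawl, cube_join) follow the paper's terminology, but the signatures and data layout are illustrative assumptions, not the paper's actual API.

```python
# Illustrative sketch of HoCA concepts: AbstractCube as a function from
# regions to relational tables, cube crawling as a higher-order function
# applying a Region Analysis Model per region, and cube join melding cubes.
from typing import Callable, Dict, List, Tuple

RegionFeatures = Tuple[str, ...]                    # e.g. ("country=US", "device=mobile")
Table = List[Dict[str, float]]                      # a relational table as a list of rows
AbstractCube = Callable[[RegionFeatures], Table]    # function-as-data: region -> table

def crawl(cube: AbstractCube,
          regions: List[RegionFeatures],
          model: Callable[[Table], Dict[str, float]]) -> Dict[RegionFeatures, Dict[str, float]]:
    """Cube crawling: apply a Region Analysis Model to each region's table,
    producing a cube that maps regions to signal vectors."""
    return {r: model(cube(r)) for r in regions}

def cube_join(left: Dict[RegionFeatures, Dict[str, float]],
              right: Dict[RegionFeatures, Dict[str, float]]) -> Dict[RegionFeatures, Dict[str, float]]:
    """Cube join: meld the signal vectors of two crawled cubes on shared regions."""
    return {r: {**left[r], **right[r]} for r in left.keys() & right.keys()}

# Example Region Analysis Model: summarise a metric column into a signal.
def mean_latency_ram(table: Table) -> Dict[str, float]:
    values = [row["latency_ms"] for row in table]
    return {"mean_latency_ms": sum(values) / len(values) if values else 0.0}
```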

    A probabilistic multidimensional data model and its applications in business management

    This dissertation develops a conceptual data model that can efficiently handle huge volumes of data that contain uncertainty and are subject to frequent changes. The model can be used to build Decision Support Systems that improve the decision-making process. Business intelligence and decision-making in today's business world require extensive use of huge volumes of data. Real-world data contain uncertainty and change over time. Business leaders should have access to Decision Support Systems that can efficiently handle voluminous data, uncertainty, and modifications to uncertain data. Database product vendors provide several extensions and features to support these requirements; however, these extensions lack support for standard conceptual models. Standardization generally creates more competition and leads to lower prices and improved standards of living. Results from this study could become a data model standard in the area of applied decision sciences. The conceptual data model developed in this dissertation uses mathematical concepts based on set theory, probability axioms, and the Bayesian framework. The conceptual data model, an algebra to manipulate the data, and a framework and algorithm to modify the data are presented. The data modification algorithm is analyzed for time and space efficiency. Formal mathematical proofs are provided to support identified properties of the model, the algebra, and the modification framework. The decision-making ability of this model was investigated using sample data. Advantages of this model and the improvements in inventory management that follow from its application are described. A comparison and contrast between this model and Bayesian belief networks are presented. Finally, the scope of and topics for further research are described.
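    The sketch below illustrates, in generic terms, the kind of probabilistic multidimensional cell such a model implies: a cell holds a probability distribution over candidate measure values rather than a single value, and new evidence revises the distribution with Bayes' rule. The class and method names are hypothetical and are not taken from the dissertation.

```python
# Hypothetical sketch of a probabilistic cube cell with Bayesian revision;
# the data layout is an assumption for illustration only.
from typing import Dict, Tuple

class ProbabilisticCell:
    def __init__(self, prior: Dict[float, float]):
        # prior: candidate measure value -> probability (assumed to sum to 1)
        self.dist = dict(prior)

    def expected_value(self) -> float:
        return sum(v * p for v, p in self.dist.items())

    def update(self, likelihood: Dict[float, float]) -> None:
        """Bayesian revision: posterior is proportional to prior times likelihood."""
        unnorm = {v: p * likelihood.get(v, 0.0) for v, p in self.dist.items()}
        z = sum(unnorm.values())
        if z > 0:
            self.dist = {v: p / z for v, p in unnorm.items()}

# A toy cube keyed by (product, month) dimensions, with an uncertain demand measure.
cube: Dict[Tuple[str, str], ProbabilisticCell] = {
    ("widget", "2024-01"): ProbabilisticCell({100.0: 0.5, 120.0: 0.3, 150.0: 0.2}),
}
cell = cube[("widget", "2024-01")]
cell.update({100.0: 0.2, 120.0: 0.6, 150.0: 0.2})   # new evidence favouring 120 units
print(round(cell.expected_value(), 1))               # revised expected demand
```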

    Using Ontologies for the Design of Data Warehouses

    Obtaining an implementation of a data warehouse is a complex task that forces designers to acquire wide knowledge of the domain, thus requiring a high level of expertise and making it a failure-prone task. Based on our experience, we have identified a set of situations, faced in real-world projects, in which we believe that the use of ontologies would improve several aspects of the design of data warehouses. The aim of this article is to describe several shortcomings of current data warehouse design approaches and to discuss the benefits of using ontologies to overcome them. This work is a starting point for discussing the convenience of using ontologies in data warehouse design.
    Comment: 15 pages, 2 figures

    Propagators and Violation Functions for Geometric and Workload Constraints Arising in Airspace Sectorisation

    Airspace sectorisation provides a partition of a given airspace into sectors, subject to geometric constraints and workload constraints, so that some cost metric is minimised. We study the constraints that arise in airspace sectorisation. For each constraint, we analyse which algorithms and properties are required under systematic search and stochastic local search.
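    As a rough illustration of the violation functions mentioned above (in the local-search sense: zero when the constraint holds, a positive degree of violation otherwise), the sketch below measures how far a sector assignment exceeds a workload capacity. The capacity constraint and the data layout are assumptions for illustration, not the paper's definitions.

```python
# Minimal sketch of a violation function for a workload-capacity constraint.
from typing import Dict, List

def workload_violation(assignment: Dict[str, int],
                       cell_workload: Dict[str, float],
                       capacity: float,
                       n_sectors: int) -> float:
    """Sum, over sectors, of the workload exceeding the capacity threshold."""
    sector_load: List[float] = [0.0] * n_sectors
    for cell, sector in assignment.items():
        sector_load[sector] += cell_workload[cell]
    return sum(max(0.0, load - capacity) for load in sector_load)

# Toy example: three airspace cells assigned to two sectors.
assignment = {"c1": 0, "c2": 0, "c3": 1}
workload = {"c1": 4.0, "c2": 3.0, "c3": 5.0}
print(workload_violation(assignment, workload, capacity=6.0, n_sectors=2))  # 1.0
```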