Search CORE

4,858 research outputs found

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Quality measures for ETL processes: from goals to implementation

Author: Akkaoui
Batini
Bellatreche
Brereton
Bresciani
Chung
Dustdar
Frakes
Gill
Giorgini
Horkoff
Horkoff
Horkoff
Jarke
Jarke
Jogalekar
Kitchenham
Lamsweerde
Leite
Naumann
Romero
Simitsis
Sánchez-González
Thiele
Yu
Publication venue: 'Wiley'
Publication date: 01/01/2016
Field of study

Extraction transformation loading (ETL) processes play an increasingly important role for the support of modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key business entities. The apparent complexity of these activities has been examined through the prism of business process management, mainly focusing on functional requirements and performance optimization. However, the quality dimension has not yet been thoroughly investigated, and there is a need for a more human-centric approach to bring them closer to business-users requirements. In this paper, we take a first step towards this direction by defining a sound model for ETL process quality characteristics and quantitative measures for each characteristic, based on existing literature. Our model shows dependencies among quality characteristics and can provide the basis for subsequent analysis using goal modeling techniques. We showcase the use of goal modeling for ETL process design through a use case, where we employ the use of a goal model that includes quantitative components (i.e., indicators) for evaluation and analysis of alternative design decisions.Peer ReviewedPostprint (author's final draft

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

An MDA approach for developing Secure OLAP applications: metamodels and transformations

Author: Anton Gabriel
Bréchot Christian
Clément Bruno
Dagher Georges
Hofman Paul
Kozera Lukasz
Parodi Barbara
Valence-Bertel Florence
Wichmann H. -Erich
Yuille Martin
Zaltoukal Kurt
Publication venue: ComSIS Consortium
Publication date: 01/01/2014
Field of study

Decision makers query enterprise information stored in Data Warehouses (DW) by using tools (such as On-Line Analytical Processing (OLAP) tools) which employ specific views or cubes from the corporate DW or Data Marts, based on multidimensional modelling. Since the information managed is critical, security constraints have to be correctly established in order to avoid unauthorized access. In previous work we defined a Model-Driven based approach for developing a secure DW repository by following a relational approach. Nevertheless, it is also important to define security constraints in the metadata layer that connects the DW repository with the OLAP tools; that is, over the same multidimensional structures that end users manage. This paper incorporates a proposal for developing secure OLAP applications within our previous approach: it improves a UML profile for conceptual modelling; it defines a logical metamodel for OLAP applications; and it defines and implements transformations from conceptual to logical models, as well as from logical models to secure implementation in a specific OLAP tool (SQL Server Analysis Services).This research is part of the following projects: SIGMA-CC (TIN2012-36904), GEODAS-BC (TIN2012-37493-C01) and GEODAS-BI (TIN2012-37493-C03) funded by the Ministerio de Economía y Competitividad and Fondo Europeo de Desarrollo Regional FEDER. SERENIDAD (PEII11-037-7035) and MOTERO (PEII11- 0399-9449) funded by the Consejería de Educación, Ciencia y Cultura de la Junta de Comunidades de Castilla La Mancha, and Fondo Europeo de Desarrollo Regional FEDER

Repositorio Institucional de la Universidad de Alicante

Crossref

The University of Manchester - Institutional Repository

PuSH

HAL-Pasteur

HAL-Rennes 1

Integration of Data Mining and Data Warehousing: a practical methodology

Author: Pears R
Usman M
Publication venue: Advanced Institute of Convergence IT (AICIT)
Publication date: 12/08/2011
Field of study

The ever growing repository of data in all fields poses new challenges to the modern analytical systems. Real-world datasets, with mixed numeric and nominal variables, are difficult to analyze and require effective visual exploration that conveys semantic relationships of data. Traditional data mining techniques such as clustering clusters only the numeric data. Little research has been carried out in tackling the problem of clustering high cardinality nominal variables to get better insight of underlying dataset. Several works in the literature proved the likelihood of integrating data mining with warehousing to discover knowledge from data. For the seamless integration, the mined data has to be modeled in form of a data warehouse schema. Schema generation process is complex manual task and requires domain and warehousing familiarity. Automated techniques are required to generate warehouse schema to overcome the existing dependencies. To fulfill the growing analytical needs and to overcome the existing limitations, we propose a novel methodology in this paper that permits efficient analysis of mixed numeric and nominal data, effective visual data exploration, automatic warehouse schema generation and integration of data mining and warehousing. The proposed methodology is evaluated by performing case study on real-world data set. Results show that multidimensional analysis can be performed in an easier and flexible way to discover meaningful knowledge from large datasets

AUT Scholarly Commons

Framework for Interoperable and Distributed Extraction-Transformation-Loading (ETL) Based on Service Oriented Architecture

Author: Awad Mohammed M.I.
Publication venue
Publication date: 01/01/2012
Field of study

Extraction. Transformation and Loading (ETL) are the major functionalities in data warehouse (DW) solutions. Lack of component distribution and interoperability is a gap that leads to many problems in the ETL domain, which is due to tightly-coupled components in the current ETL framework. This research discusses how to distribute the Extraction, Transformation and Loading components so as to achieve distribution and interoperability of these ETL components. In addition, it shows how the ETL framework can be extended. To achieve that, Service Oriented Architecture (SOA) is adopted to address the mentioned missing features of distribution and interoperability by restructuring the current ETL framework. This research contributes towards the field of ETL by adding the distribution and inter- operability concepts to the ETL framework. This Ieads to contributions towards the area of data warehousing and business intelligence, because ETL is a core concept in this area. The Design Science Approach (DSA) and Scrum methodologies were adopted for achieving the research goals. The integration of DSA and Scrum provides the suitable methods for achieving the research objectives. The new ETL framework is realized by developing and testing a prototype that is based on the new ETL framework. This prototype is successfully evaluated using three case studies that are conducted using the data and tools of three different organizations. These organizations use data warehouse solutions for the purpose of generating statistical reports that help their top management to take decisions. Results of the case studies show that distribution and interoperability can be achieved by using the new ETL framework

Universiti Utara Malaysia: UUM eTheses

Designing secure data warehouses by using MDA and QVT

Author: Blanco Bueno Carlos
Fernández-Medina Patón Eduardo
Soler Emilio
Trujillo Juan
Publication venue: J.UCS Consortium
Publication date: 01/01/2009
Field of study

The Data Warehouse (DW) design is based on multidimensional (MD) modeling which structures information into facts and dimensions. Due to the confidentiality of the data that it stores, it is crucial to specify security and audit measures from the early stages of design and to enforce them throughout the lifecycle. Moreover, the standard framework for software development, Model Driven Architecture (MDA), allows us to define transformations between models by proposing Query/View/Transformations (QVT). This proposal permits the definition of formal, elegant and unequivocal transformations between Platform Independent Models (PIM) and Platform Specific Models (PSM). This paper introduces a new framework for the design of secure DWs based on MDA and QVT, which covers all the design phases (conceptual, logical and physical) and specifies security measures in all of them. We first define two metamodels with which to represent security and audit measures at the conceptual and logical levels. We then go on to define a transformation between these models through which to obtain the traceability of the security rules from the early stages of development to the final implementation. Finally, in order to show the benefits of our proposal, it is applied to a case study.This work has been partially supported by the METASIGN project (TIN2004-00779) from the Spanish Ministry of Education and Science, of the Regional Government of Valencia, and by the QUASIMODO and MISTICO projects of the Regional Science and Technology Ministry of Castilla-La Mancha (Spain)

Repositorio Institucional de la Universidad de Alicante

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

ARPHA Preprints

Graph BI & analytics: current state and future challenges

Author: Ghrab Amine
Jouili Salim
Romero Moral Óscar
Skhiri Sabri
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018
Field of study

In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures that yield a great expressive power. They are used for modeling highly complex and interconnected domains, and efficiently solving emerging big data application. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing and analysis in graph warehouses. Then we conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

An MDA approach for developing secure OLAP applications: Metamodels and transformations

Author: Blanco Bueno Carlos
Fernández-Medina Eduardo
García-Rodríguez de Guzmán Ignacio
Trujillo Juan
Publication venue: 'National Library of Serbia'
Publication date: 01/01/2015
Field of study

Decision makers query enterprise information stored in DataWarehouses (DW) by using tools (such as On-Line Analytical Processing (OLAP) tools) which employ specific views or cubes from the corporate DW or Data Marts, based on multidimensional modelling. Since the information managed is critical, security constraints have to be correctly established in order to avoid unauthorized access. In previous work we defined a Model-Driven based approach for developing a secure DW repository by following a relational approach. Nevertheless, it is also important to define security constraints in the metadata layer that connects the DW repository with the OLAP tools; that is, over the same multidimensional structures that end users manage. This paper incorporates a proposal for developing secure OLAP applications within our previous approach: it improves a UML profile for conceptual modelling; it defines a logical metamodel for OLAP applications; and it defines and implements transformations from conceptual to logical models, as well as from logical models to secure implementation in a specific OLAP tool (SQL Server Analysis Services). © 2015 ComSIS Consortium. All rights reserved.This research is part of the following projects: SIGMA-CC (TIN2012-36904), GEODAS-BC (TIN2012-37493-C01) and GEODAS-BI (TIN2012-37493-C03) funded by the Ministerio de Economía y Competitividad and Fondo Europeo de Desarrollo Regional FEDER

Repositorio Institucional de la Universidad de Alicante

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UCrea

Towards Conceptual Multidimensional Design in Decision Support Systems

Author: Teste Olivier
Publication venue: HAL CCSD
Publication date: 25/09/2001
Field of study

International audienceMultidimensional databases support efficiently on-line analytical processing (OLAP). In this paper, we depict a model dedicated to multidimensional databases. The approach we present designs decisional information through a constellation of facts and dimensions. Each dimension is possibly shared between several facts and it is organised according to multiple hierarchies. In addition, we define a comprehensive query algebra regrouping the more popular multidimensional operations in current commercial systems and research approaches. We introduce new operators dedicated to a constellation. Finally, we describe a prototype that allows managers to query constellations of facts, dimensions and multiple hierarchies

Scientific Publications of the University of Toulouse II Le Mirail

HAL Descartes