Search CORE

1,803 research outputs found

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Repetitive querying of large random heterogeneous datasets in RDBMS using materialized views

Author: Albalkhi Ammar
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2012
Field of study

A methodology has been developed to increase time efficiency of querying large heterogeneous datasets repetitively by applying materialized views on repetitive complex queries. Additionally, a simple user interface is provided to demonstrate the utility of this research methodology. The programs demonstrate sufficiently that the core design can be used to deploy a complete system which could be used in different domains. The methodology as developed in this research is presented as an experimental proof-of-concept prototype based on an abstract design

Scholarship at UWindsor

Recommended from our members

The Application of Object-Oriented Views to an Engineering Environment.

Author: Shao Zhuang
Publication venue
Publication date: 16/11/1999
Field of study

With the increasing popularity of object-oriented technology, object-oriented database systems are being used in design environments as central repositories. In this thesis, we investigate the role of versioning and the characteristics of design databases in design environments. In an effort to improve the configuration management scheme in a design environment, we also investigate the use of database views as a possible configuration tool. We propose a unified version management scheme that facilitates cooperative team work and show that the use of database views provides a powerful configuration management scheme for a design environment

Open Research Online (The Open University)

A role-based access control schema for materialized views

Author: Yousafi Hassaan
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2013
Field of study

This thesis research presents a framework that enhances security at the level of materialized views. Materialized views can be used for performance reasons in very large systems such as data warehouses or distributed systems, or for providing a filtered selection of data from a more general database. Existing proposed techniques provide rule-based access control for materialized views, however, the administration of such systems is time consuming and cumbersome in a large environment. This thesis presents a role-based access control schema for materialized views in which data authorization rules are associated with roles and defined in Datalog syntax in plain text files, a column level restriction is imposed on a materialized view based on a user assigned role, and a role conflict strategy is defined in which priority is given to each conflicting role in order to resolve role conflicts if a user is gaining authorization for permissions associated with conflicting roles at the same time. KEYWORDS Materialized Views, Authorization Views, Session Roles, Role Conflict

Scholarship at UWindsor

Data modeling with NoSQL : how, when and why

Author: Silva Carlos André Reis Fernandes Oliveira da
Publication venue
Publication date: 01/01/2010
Field of study

Tese de mestrado integrado. Engenharia Informática e Computação. Faculdade de Engenharia. Universidade do Porto. 201

Repositório Aberto da Universidade do Porto

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu
Ayman G. Fayoumi
King Abdulaziz
See Profile
Sherif Sakr
Sherif Sakr
South Wales
South Wales
Publication venue
Publication date: 12/02/2013
Field of study

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program such as issues on data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several followup works after its introduction. This article provides a comprehensive survey for a family of approaches and mechanisms of large scale data processing mechanisms that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of introduced systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author

arXiv.org e-Print Archive

CiteSeerX