1,985 research outputs found

Automatic physical database design : recommending materialized views

Author: Xu Wugang
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2007

This work discusses physical database design, focusing on the problem of selecting materialized views to improve the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study this same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, the views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that, in most cases, our approaches perform better than previous ones in terms of effectiveness and efficiency.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Selection strategy of materialized views in data warehouse

Author: 林小静
薛永生
Publication date: 08/07/2007

To improve the response time of decision-support and OLAP queries, data warehouses commonly store a set of materialized views. The selection of materialized views is therefore one of the most important issues in data warehouse research. The goal is to select a set of views that minimizes the total cost of storage, maintenance, and query processing. A new algorithm named VSMF (views selection base on multi-factor) is proposed, which uses the MVPP (multi-view processing plan) structure as the search space for view selection. Under a storage space constraint, the algorithm addresses both multi-query optimization and view maintenance optimization. Funding: Natural Science Foundation of Fujian Province project (A0310008); Fujian Provincial Key Science and Technology Fund project (2003H043

Xiamen University Institutional Repository

Clustering-Based Materialized View Selection in Data Warehouses

Author: A. Shukla
A.F. Cardenas
H. Gupta
J.R. Smith
Jonathan Goldstein
S. Rizzi
S.B. Yao
X. Baril
Publication date: 01/01/2006

Materialized view selection is a non-trivial task, so its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering) to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited.
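The greedy, cost-driven selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `CandidateView` class, the `greedy_select` function, the per-query saving figures, and the "saving per unit of storage" ranking rule are all hypothetical choices made for illustration, assuming each candidate view has a positive storage size and a known cost saving for each query it can answer.

```python
from dataclasses import dataclass


@dataclass
class CandidateView:
    """Hypothetical candidate view; all numbers are illustrative cost units."""
    name: str
    size: float    # storage cost of materializing the view (assumed > 0)
    saving: dict   # query name -> access cost saved if this view answers it


def greedy_select(candidates, budget):
    """Repeatedly pick the view with the best marginal saving per unit of
    storage that still fits in the remaining storage budget."""
    selected = []
    best_saving = {}      # query name -> best saving obtained so far
    remaining = budget
    pool = list(candidates)

    def marginal(view):
        # Only count the saving beyond what already-selected views provide.
        return sum(max(0.0, s - best_saving.get(q, 0.0))
                   for q, s in view.saving.items())

    while pool:
        feasible = [v for v in pool if v.size <= remaining]
        if not feasible:
            break
        best = max(feasible, key=lambda v: marginal(v) / v.size)
        if marginal(best) <= 0:
            break  # no remaining view improves any query
        selected.append(best)
        remaining -= best.size
        for q, s in best.saving.items():
            best_saving[q] = max(best_saving.get(q, 0.0), s)
        pool.remove(best)
    return selected


if __name__ == "__main__":
    views = [
        CandidateView("v1", 4, {"q1": 10, "q2": 6}),
        CandidateView("v2", 2, {"q1": 9}),
        CandidateView("v3", 3, {"q3": 3}),
    ]
    print([v.name for v in greedy_select(views, budget=5)])
```

Ranking by saving per unit of storage is one common heuristic for budgeted selection; it gives no optimality guarantee, and a real cost model would also account for view maintenance.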

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL Descartes

View Selection in Semantic Web Databases

Author: François Goasdoué
Ioana Manolescu
Julien Leblay
Konstantinos Karanasos
Équipes-projets Leo
Publication date: 01/01/2011

We consider the setting of a Semantic Web database, containing both explicit data encoded in RDF triples, and implicit data, implied by the RDF semantics. Based on a query workload, we address the problem of selecting a set of views to be materialized in the database, minimizing a combination of query processing, view storage, and view maintenance costs. Starting from an existing relational view selection method, we devise new algorithms for recommending view sets, and show that they scale significantly beyond the existing relational ones when adapted to the RDF context. To account for implicit triples in query answers, we propose a novel RDF query reformulation algorithm and an innovative way of incorporating it into view selection in order to avoid a combinatorial explosion in the complexity of the selection process. The interest of our techniques is demonstrated through a set of experiments. Comment: VLDB201

arXiv.org e-Print Archive

HAL-CentraleSupelec

CiteSeerX

INRIA a CCSD electronic archive server

Oxford University Research Archive

HAL-Rennes 1

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges. Peer Reviewed. Postprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC