EFFICIENT APPROACH FOR VIEW SELECTION FOR DATA WAREHOUSE USING TREE MINING AND EVOLUTIONARY COMPUTATION

Author: Deshpande Parag
Thakare Atul
Publication venue: 'AGHU University of Science and Technology Press'
Publication date: 25/11/2018

Selecting a proper set of views to materialize plays an important role in database performance. Many view selection methods exist, using different techniques and frameworks to select an efficient set of views for materialization. In this paper, we present a new efficient, scalable method for view selection under given storage constraints using a tree mining approach and evolutionary optimization. The tree mining algorithm is designed to determine the exact frequency of (sub)queries in the historical SQL dataset. The query cost model aims to maximize the performance benefit of the final view set, which is derived from the frequent view set produced by the tree mining algorithm. The performance benefit of a query is defined as a function of query frequency, query creation cost, and query maintenance cost. The experimental results show that the proposed method succeeds in recommending a solution that is fairly close to the optimal solution.
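
As a rough illustration of the ideas in this abstract, the sketch below counts how often each (sub)query occurs in a workload and scores it with a benefit function. The abstract only states that benefit depends on query frequency, creation cost, and maintenance cost; the linear form used here, the flat list representation of the workload (the paper mines query trees), and all names and numbers are assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def count_subqueries(workload):
    """Count every occurrence of each (sub)query identifier in the workload.

    `workload` is a list of queries, each given as a list of subquery
    identifiers (a hypothetical, simplified stand-in for SQL query trees).
    """
    counts = Counter()
    for subqueries in workload:
        counts.update(subqueries)
    return counts


def benefit(frequency, creation_cost, maintenance_cost):
    """Assumed benefit: recomputation cost saved minus upkeep cost."""
    return frequency * creation_cost - maintenance_cost


# Hypothetical workload of three historical queries.
workload = [
    ["sales_by_region", "sales_by_month"],
    ["sales_by_region"],
    ["sales_by_region", "top_customers"],
]
freq = count_subqueries(workload)
region_benefit = benefit(freq["sales_by_region"], 10.0, 5.0)
```

A view whose benefit is not positive under this assumed model would not be worth materializing; the paper itself selects the final set with evolutionary optimization, which this sketch does not attempt.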

Clustering-Based Materialized View Selection in Data Warehouses

Author: A. Shukla
A.F. Cardenas
H. Gupta
J.R. Smith
Jonathan Goldstein
S. Rizzi
S.B. Yao
X. Baril
Publication date: 01/01/2006

Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited
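
The greedy selection step mentioned in this abstract could look roughly like the sketch below, which picks candidate views by benefit per unit of storage until a storage budget is exhausted. The ranking criterion, the `(name, benefit, size)` tuple layout, and the example values are assumptions; the abstract does not specify the cost models or the exact greedy rule.

```python
def greedy_select(candidates, storage_budget):
    """Pick views greedily by benefit per unit of storage.

    `candidates` is a list of (name, benefit, size) tuples with size > 0.
    Returns the names of the selected views in selection order.
    """
    # Rank by benefit density, highest first (assumed criterion).
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, used = [], 0
    for name, gain, size in ranked:
        # Skip views that bring no benefit or no longer fit the budget.
        if gain > 0 and used + size <= storage_budget:
            chosen.append(name)
            used += size
    return chosen


# Hypothetical candidate views: (name, estimated benefit, storage size).
candidates = [
    ("v_region", 90.0, 30),
    ("v_month", 40.0, 10),
    ("v_customer", 50.0, 50),
]
selected = greedy_select(candidates, storage_budget=45)
```

Ranking by benefit density is a common heuristic for budget-constrained selection; it is fast but does not guarantee an optimal set.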

arXiv.org e-Print Archive

CiteSeerX

Parameterized XPath Views

Author: Böhme Timo
Rahm Erhard
Publication date: 12/11/2018

We present a new approach for accelerating the execution of XPath expressions using parameterized materialized XPath views (PXV). While the approach is generic, we show how it can be utilized in an XML extension for relational database systems. Furthermore, we discuss an algorithm for automatically determining the best PXV candidates to materialize based on a given workload. We evaluate our approach and show the superiority of our cost-based algorithm for determining PXV candidates over frequent-pattern-based algorithms.

Contributions à l’Optimisation de Requêtes Multidimensionnelles

Author: Maabout Sofian
Publication venue: HAL CCSD
Publication date: 12/12/2014

Analyzing data consists of choosing a subset of the dimensions that describe it in order to extract useful information. However, the "interesting" dimensions are rarely known a priori. Analysis then becomes an exploratory activity in which each pass translates into a query. It therefore becomes essential to propose query optimization solutions that take a global view of the process rather than trying to optimize each query independently of the others. We present our contributions within this exploratory approach, focusing on three types of queries: (i) the computation of borders, (ii) so-called OLAP (On Line Analytical Processing) queries in data cubes, and (iii) skyline-type preference queries.

Hybrid Classification of OLAP Queries in Cloud Computing Environment

Author: Ettaoufik Abdelaziz
Maizate Abderrahim
OUZZIF Mohammed
Publication venue: Revue Méditerranéenne des Télécommunications
Publication date: 05/02/2018

Generally, the execution time of decision queries on large tables is very high, which degrades the performance of data warehouses (DW). In addition, heavy traffic can affect query response time. Cloud Computing (CC) offers a solution to this kind of problem by providing a flexible environment in which data is highly available, since it is stored and replicated across different nodes. Optimizing the performance of a DW deployed on CC is an indispensable task that aims to make cloud services meet customer expectations by increasing performance at minimum cost. This optimization relies on improving various factors such as response time to client queries, availability, scalability, etc. However, a voluminous and dynamic query load can make optimization difficult. For this purpose, we propose in this paper a hybrid query classification technique to reduce the number of queries and the total cost of hosting the DW on the CC.