Clustering-Based Materialized View Selection in Data Warehouses
Materialized view selection is a non-trivial task. Hence, its complexity must
be reduced. A judicious choice of views must be cost-driven and influenced by
the workload experienced by the system. In this paper, we propose a framework
for materialized view selection that exploits a data mining technique
(clustering), in order to determine clusters of similar queries. We also
propose a view merging algorithm that builds a set of candidate views, as well
as a greedy process for selecting a set of views to materialize. This selection
is based on cost models that evaluate the cost of accessing data using views
and the cost of storing these views. To validate our strategy, we executed a
workload of decision-support queries on a test data warehouse, with and without
using our strategy. Our experimental results demonstrate its efficiency, even
when storage space is limited.
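The greedy selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the `CandidateView` fields, the savings-per-storage ranking, and the assumption that each view's benefit is independent of the others are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateView:
    name: str
    storage_cost: float   # hypothetical: space needed to store the view
    query_savings: float  # hypothetical: workload access cost saved by the view


def greedy_select(candidates, storage_budget):
    """Pick views by savings per unit of storage until the budget runs out.

    Assumes view benefits are independent, which real cost models do not.
    """
    # Ignore views with non-positive storage cost or no benefit.
    usable = [v for v in candidates if v.storage_cost > 0 and v.query_savings > 0]
    ranked = sorted(usable, key=lambda v: v.query_savings / v.storage_cost, reverse=True)

    remaining = storage_budget
    selected = []
    for view in ranked:
        if view.storage_cost <= remaining:
            selected.append(view)
            remaining -= view.storage_cost
    return selected


candidates = [
    CandidateView("v_sales_by_month", storage_cost=10, query_savings=100),
    CandidateView("v_sales_by_region", storage_cost=5, query_savings=80),
    CandidateView("v_sales_by_product", storage_cost=8, query_savings=20),
]
chosen = greedy_select(candidates, storage_budget=15)
```

With a budget of 15 units, the sketch picks the two views with the best savings-to-storage ratio and skips the third because it no longer fits.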
Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation
Materialized views are important for optimizing Business Intelligence (BI) systems that are designed without data cubes. Selecting candidate queries for materialized views from a large number of queries is a challenging task. Most past work finds frequent queries in the past workload and creates materialized views from them, either by analyzing the workload manually or by applying approximate string matching algorithms to the query text. Most existing methods suggest complete queries but ignore query components, such as sub-queries, when creating materialized views. This paper presents a novel method for determining which queries and query components materialized views can be created on in order to optimize aggregate and join queries. It does so by mining a database of query execution plans, which take the form of binary trees. The proposed algorithm optimized a significantly larger number of queries because it selects queries for optimization based on their execution plan trees rather than on the query text used by traditional methods. To select a correct set of queries to optimize with materialized views, the paper proposes an efficient, specialized frequent tree component mining algorithm with novel heuristics to prune the search space. These frequent components are used to determine the possible set of candidate queries for materialized views. Experiments on standard, real, and synthetic data sets, together with the theoretical basis, showed that the proposed method can optimize a large number of queries with fewer materialized views and that it significantly improves performance compared to traditional methods.
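The idea of mining frequent components from binary execution plan trees can be sketched as follows. This is a brute-force illustration only, not the paper's algorithm: it enumerates every subtree with no pruning heuristics, and the tuple encoding `(operator, left, right)` and the operator names are assumptions made for the example.

```python
from collections import Counter


def subtrees(plan):
    """Yield every subtree of a plan encoded as (operator, left, right) or None."""
    if plan is None:
        return
    yield plan
    _, left, right = plan
    yield from subtrees(left)
    yield from subtrees(right)


def frequent_subplans(plans, min_support):
    """Return subtrees that occur in at least min_support distinct plans."""
    support = Counter()
    for plan in plans:
        # A set counts each subtree once per query, however often it repeats.
        support.update(set(subtrees(plan)))
    return {sub: n for sub, n in support.items() if n >= min_support}


# Hypothetical plans sharing a join between two scans.
scan_sales = ("scan sales", None, None)
scan_dates = ("scan dates", None, None)
join = ("join", scan_sales, scan_dates)
q1 = ("aggregate", join, None)
q2 = ("sort", join, None)
q3 = ("filter", scan_sales, None)

frequent = frequent_subplans([q1, q2, q3], min_support=2)
```

Here the shared join appears in two of the three plans, so it surfaces as a candidate sub-query for a materialized view even though no complete query repeats.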
A novel algorithm with IM-LSI index for incremental maintenance of materialized view
The ability to provide decision makers with accurate and timely consolidated information, as well as rapid query response times, is the fundamental requirement for the success of a data warehouse. To provide fast access, a data warehouse stores materialized views of the sources of its data. As a result, a data warehouse needs to be maintained to keep its contents consistent with the contents of its data sources. Incremental maintenance is generally regarded as a more efficient way to maintain materialized views in a data warehouse. The view has to be maintained to reflect the updates made to the base relations stored at the various distributed data sources. The proposed approach contains two modules, namely materialized view selection (MVS) and maintenance of materialized views (MMV). In recent times, several algorithms have been proposed for keeping views up to date in response to changes in the source data. We present an improved algorithm for MVS and MMV using the IM-LSI (Itemset Mining using Latent Semantic Index) algorithm. Views to materialize are selected using the IM (Itemset Mining) algorithm, to overcome the problems of conventional view selection algorithms, and the materialized views are then maintained using LSI. To justify the proposed algorithm, we present experimental results in which both time and space costs are better than those of conventional algorithms.
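Incremental maintenance in general, as opposed to recomputing a view from scratch, can be illustrated with a minimal sketch. It does not implement IM-LSI. It only shows the generic idea of applying inserted and deleted base rows as deltas to a grouped SUM view, and the class name, the method names, and the `(key, amount)` row shape are hypothetical.

```python
from collections import defaultdict


class SumByKeyView:
    """Materialized view of: SELECT key, SUM(amount) ... GROUP BY key."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)  # row count per group, needed for deletes

    def apply_delta(self, inserted=(), deleted=()):
        """Update the view from base-table changes without a full recompute.

        Assumes every deleted row was previously inserted.
        """
        for key, amount in inserted:
            self.totals[key] += amount
            self.counts[key] += 1
        for key, amount in deleted:
            self.totals[key] -= amount
            self.counts[key] -= 1
            if self.counts[key] == 0:
                # The group has no rows left, so it leaves the view.
                del self.totals[key]
                del self.counts[key]

    def rows(self):
        return {key: self.totals[key] for key in self.counts}


view = SumByKeyView()
view.apply_delta(inserted=[("east", 10), ("east", 5), ("west", 7)])
after_inserts = view.rows()
view.apply_delta(deleted=[("west", 7)])
after_delete = view.rows()
```

Keeping a per-group row count alongside the sum is what lets the view drop a group once its last base row is deleted.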
A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing
The rapidly growing amount of stored data has spurred researchers to seek
different methods to take optimal advantage of it. Most of these methods face
a response-time problem because of the sheer size of the data. Most solutions
favour materialization. However, such a solution cannot attain Real-Time
answers. In this paper, we propose a framework illustrating the barriers to,
and suggested solutions for, achieving Real-Time OLAP answers, which are
widely used in decision support systems and data warehouses.