118 research outputs found

    Path Queries on Compressed XML

    Get PDF
    Central to any XML query language is a path language such as XPath which operates on the tree structure of the XML document. We demonstrate in this paper that the tree structure can be e#ectively compressed and manipulated using techniques derived from symbolic model checking . Specifically, we show first that succinct representations of document tree structures based on sharing subtrees are highly e#ective. Second, we show that compressed structures can be queried directly and e#ciently through a process of manipulating selections of nodes and partial decompression

    Tractable Optimization Problems through Hypergraph-Based Structural Restrictions

    Full text link
    Several variants of the Constraint Satisfaction Problem have been proposed and investigated in the literature for modelling those scenarios where solutions are associated with some given costs. Within these frameworks computing an optimal solution is an NP-hard problem in general; yet, when restricted over classes of instances whose constraint interactions can be modelled via (nearly-)acyclic graphs, this problem is known to be solvable in polynomial time. In this paper, larger classes of tractable instances are singled out, by discussing solution approaches based on exploiting hypergraph acyclicity and, more generally, structural decomposition methods, such as (hyper)tree decompositions

    Clustering-Based Materialized View Selection in Data Warehouses

    Full text link
    Materialized view selection is a non-trivial task. Hence, its complexity must be reduced. A judicious choice of views must be cost-driven and influenced by the workload experienced by the system. In this paper, we propose a framework for materialized view selection that exploits a data mining technique (clustering), in order to determine clusters of similar queries. We also propose a view merging algorithm that builds a set of candidate views, as well as a greedy process for selecting a set of views to materialize. This selection is based on cost models that evaluate the cost of accessing data using views and the cost of storing these views. To validate our strategy, we executed a workload of decision-support queries on a test data warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when storage space is limited

    Cut and Paste

    Get PDF
    AbstractThe paper develops Editor, a language for manipulating semistructured documents, such as those typically available on the Web. Editor programs are based on two simple ideas, taken from text editors: “search” instructions are used to select regions of interest in a document, and “cut & paste” instructions to restructure them. We study the expressive power and the complexity of these programs. We show that they are computationally complete, in the sense that any computable document restructuring can be expressed in Editor. We also study the complexity of a safe subclass of programs, showing that it captures exactly the class of polynomial-time restructurings. The language has been implemented in Java and is currently used in the Araneus project as a basis for a wrapper-generation toolkit

    On-line analytical processing in distributed data warehouses

    Get PDF
    The concepts of 'data warehousing' and 'on-line analytical processing' have seen a growing interest in the research and commercial product community. Today, the trend moves away from complex centralized data warehouses to distributed data marts integrated in a common conceptual schema. However, as the first part of this paper demonstrates, there are many problems and little solutions for large distributed decision support systems in worldwide operating corporations. After showing the benefits and problems of the distributed approach, this paper outlines possibilities for achieving performance in distributed online analytical processing. Finally, the architectural framework of the prototypical distributed OLAP system CUBESTAR is outlined

    Greedy Selection of Materialized Views

    Get PDF
    Greedy based approach for view selection at each step selects a beneficial view that fits within the space available for view materialization. Most of these approaches are focused around the HRU algorithm, which uses a multidimensional lattice framework to determine a good set of views to materialize. The HRU algorithm exhibits high run time complexity as the number of possible views is exponential with respect to the number of dimensions. The PGA algorithm provides a scalable solution to this problem by selecting views for materialization in polynomial time relative to the number of dimensions. This paper compares the HRU and the PGA algorithm. It was experimentally deduced that the PGA algorithm, in comparison with the HRU algorithm, achieves an improved execution time with lowered memory and CPU usages. The HRU algorithm has an edge over the PGA algorithm on the quality of the views selected for materialization
    corecore