61 research outputs found

    Expanding Pre-Suit Discovery Production and Preservation Orders

    Article published in the Michigan State Law Review

    Detecting Redundancy in Data Warehouse Evolution


    Efficient keyword search on large tree structured datasets


    Heuristic optimization of OLAP queries in multidimensionally hierarchically clustered databases

    On-Line Analytical Processing (OLAP) is a technology that encompasses applications requiring a multidimensional and hierarchical view of data. OLAP applications often require fast response time to complex grouping/aggregation queries on enormous quantities of data. Commercial relational database management systems use mainly multiple one-dimensional indexes to process OLAP queries that restrict multiple dimensions. However, in many cases, multidimensional access methods outperform one-dimensional indexing methods. We present an architecture for multidimensional databases that are clustered with respect to multiple hierarchical dimensions. It is based on the star schema and is called CSB star. Then, we focus on heuristically optimizing OLAP queries over this schema using multidimensional access methods. Users can still formulate their queries over a traditional star schema, which are then rewritten by the query processor over the CSB star. We exploit the different clustering features of the CSB star to efficiently process a class of typical OLAP queries. We detect special cases where the construction of an evaluation plan can be simplified and we discuss improvements of our technique.

    Processing OLAP queries in hierarchically clustered databases


    Top-k-size keyword search on tree structured data

    Keyword search is the most popular technique for querying large tree-structured datasets, often of unknown structure, on the web. Recent keyword search approaches return lowest common ancestors (LCAs) of the keyword matches ranked with respect to their relevance to the keyword query. A major challenge of a ranking approach is the efficiency of its algorithms as the number of keywords and the size and complexity of the data increase. To face this challenge, most of the known approaches restrict their ranking to a subset of the LCAs (e.g., SLCAs, ELCAs), missing relevant results. In this work, we design novel top-k-size stack-based algorithms on tree-structured data. Our algorithms implement ranking semantics for keyword queries that are based on the concept of LCA size. Similar to metric selection in information retrieval, LCA size reflects the proximity of keyword matches in the data tree. These semantics do not rank a predefined subset of LCAs and, through a layered presentation of results, demonstrate improved effectiveness compared to previous relevant approaches. To address performance challenges, our algorithms exploit a lattice of the partitions of the keyword set, which enables linear-time performance. This result is obtained without the support of auxiliary precomputed data structures. An extensive experimental study on various large datasets confirms the theoretical analysis. The results show that, in contrast to other approaches, our algorithms scale smoothly when the size of the dataset and the number of keywords increase.

    A randomized approach for the incremental design of an evolving data warehouse

    A Data Warehouse (DW) can be used to integrate data from multiple distributed data sources. A DW can be seen as a set of materialized views that determine its schema and its content in terms of the schema and the content of the data sources. DW applications require high query performance. For this reason, the design of a typical DW consists of selecting views to materialize that are able to answer a set of input user queries. However, the cost of answering the queries has to be balanced against the cost of maintaining the materialized views. In an evolving DW application, new queries need to be answered by the DW. An incremental selection of materialized views uses the materialized views already in the DW to answer parts of the new queries, and avoids the re-implementation of the DW from scratch. This incremental design is complex and an exhaustive approach is not feasible. We have developed a randomized approach for incrementally selecting a set of views that are able to answer a set of input user queries locally while minimizing a combination of the query evaluation and view maintenance cost. In this process we exploit “common sub-expressions” among new queries and between new queries and old views. Our approach is implemented and we report on its experimental evaluation.