14 research outputs found

    XWeB: the XML Warehouse Benchmark

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure the feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

    XML to Annotations Mapping Patterns

    Configuration languages based on XML and source code annotations are very popular in the industry. In some situations it is desirable to move a configuration language from one format to the other, or to support multiple configuration languages. In such cases, mappings between languages based on these formats have to be defined. Such a mapping can be used to support multiple configuration languages or to seamlessly move configurations from annotations to XML or vice versa. In this paper, we present XML to annotations mapping patterns that can be used to map languages from one format to the other.

    Nested Queries and Quantifiers in an Ordered Context

    We present algebraic equivalences that allow nested algebraic expressions to be unnested for order-preserving algebraic operators. We illustrate how these equivalences can be applied successfully to unnest nested queries given in the XQuery language. Measurements illustrate the performance gains possible with our approach.

    Research and Innovative Design of Search Engine for Banking Industry Decision-makers

    General Search Engines (GSEs) cover a wide range of industries, so the amount of information they return is large, the information is poorly ordered, queries are inaccurate, and the results are not sufficiently professional. To solve this problem, and based on the actual needs of the banking industry, this paper designs and develops an innovative Search Engine for Banking Industry Decision-makers (SEfBIDm). This paper presents the needs analysis, overall functional design, overall framework and workflow design of SEfBIDm. SEfBIDm can provide many functions such as a banking knowledge database, image search and analysis reports. This article gives the implementation method and workflow only for the typical web search functions. SEfBIDm was deployed, tested, and operated by 69 branch decision-making agencies at the world's 10th largest bank. Decision-makers from these bodies believe that SEfBIDm is rooted in the banking industry and that it is supported by banking industry knowledge and experts, which are not available in GSEs. The information obtained from SEfBIDm has three distinct characteristics: the support of comprehensive information before decision making; the timeliness of feedback information during the execution of decisions; and the very accurate evaluation information obtained when the decision is executed.

    Using semantics in XML query processing

    Ph.D. (Doctor of Philosophy)

    Center for Computational Sciences Research Report, Fiscal Year Heisei 20 (2008)

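    1-1 Particle Physics; 1-2 Astrophysics; 2-1 Computational Condensed Matter Physics; 2-2 Theoretical Nuclear Physics; 2-3 Computational Life Science; 3-1 Global Environment; 3-2 Biology; 4-1 Computer Architecture Research; 5-1 Computational Intelligence; 5-2 Computational Media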

    TIMBER: A native XML database

    This paper describes the overall design and architecture of the Timber XML database system currently being implemented at the University of Michigan. The system is based upon a bulk algebra for manipulating trees, and natively stores XML. New access methods have been developed to evaluate queries in the XML context, and new cost estimation and query optimization techniques have also been developed. We present performance numbers to support some of our design decisions. We believe that the key intellectual contribution of this system is a comprehensive set-at-a-time query processing ability in a native XML store, with all the standard components of relational query processing, including algebraic rewriting and a cost-based optimizer. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/42328/1/20110274.pd

    Managing Uncertainty and Ontologies in Databases

    Nowadays a vast amount of data is generated in Extensible Markup Language (XML). However, applications in some domains need to store and manipulate uncertain information, e.g. when sensor inputs are noisy. Another big change we can see in applications and web data is the increasing use of ontologies to describe the semantics of data, i.e., the semantic relationships between the terms in the databases. As such information is usually absent from traditional databases, there is tremendous opportunity to ask new kinds of queries that could not be handled in the past. This poses new challenges on how to manipulate and maintain such new kinds of database systems. In this dissertation, we will see how we can (i) incorporate and manipulate uncertainty in databases, and (ii) efficiently compute aggregates and maintain views on ontology databases. First, I explain applications that require manipulating uncertain information in XML databases and maintaining web ontology databases written in Resource Description Framework (RDF). I then introduce the probabilistic semistructured PXML data model with two formal semantics. I describe a set of algebraic operations and their efficient implementation. Aggregations of PXML instances are studied under two proposed semantics: possible-worlds semantics and expectation semantics. Efficient algorithms with pruning are given and evaluated to show their feasibility. I introduce PIXML, an interval probability version of PXML, and develop a formal semantics for it. A query language and its operational semantics are given and proved to be sound and complete. Based on XML, RDF is a language used to describe web ontologies. RDQL, an RDF query language, is extended to support view definition and aggregations. Two sets of algorithms are given to maintain non-aggregate and aggregate views. Experimental results show that they are efficient compared with standard relational view maintenance algorithms.

    Rewriting Declarative Query Languages

    Queries against databases are formulated in declarative languages. Examples are the relational query language SQL and XPath or XQuery for querying data stored in XML. Using a declarative query language, the querist does not need to know or decide anything about the actual strategy a system uses to answer the query. Instead, the system can freely choose among the algorithms it employs to answer a query. Predominantly, query processing in the relational context is accomplished using a relational algebra. To this end, the query is translated into a logical algebra. The algebra consists of logical operators which facilitate the application of various optimization techniques. For example, logical algebra expressions can be rewritten in order to yield more efficient expressions. In order to query XML data, XPath and XQuery have been developed. Both are declarative query languages and, hence, can benefit from powerful optimizations. For instance, they could be evaluated using an algebraic framework. However, in general, the existing approaches are not directly usable for XML query processing. This thesis has two goals. The first goal is to overcome the above-mentioned mismatches in XML query processing, making it ready for industrial-strength settings. Specifically, we develop an algebraic framework that is designed for the efficient evaluation of XPath and XQuery. To this end, we define an order-aware logical algebra and a translation of XPath into this algebra. Furthermore, based on the resulting algebraic expressions, we present rewrites in order to speed up the execution of such queries. The second goal is to investigate rewriting techniques in the relational context. To this end, we present rewrites based on algebraic equivalences that unnest nested SQL queries with disjunctions. Specifically, we present equivalences for unnesting algebraic expressions with bypass operators to handle disjunctive linking and correlation. Our approach can be applied to quantified table subqueries as well as scalar subqueries. For all our results, we present experiments that demonstrate the effectiveness of the developed approaches.