274,345 research outputs found

    Formal Properties of XML Grammars and Languages

    Full text link
    XML documents are described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that every XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars, one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages. We also characterize those XML-grammars that generate regular XML-languages.Comment: 24 page

    Functional dependencies for XML : axiomatisation and normal form in the presence of frequencies and identifiers : a thesis presented in partial fulfilment of the requirements for the degree of Master of Sciences in Information Sciences at Massey University, Palmerston North, New Zealand

    Get PDF
    XML has gained popularity as a markup language for publishing and exchanging data on the web. Nowadays, there are also ongoing interests in using XML for representing and actually storing data. In particular, much effort has been directed towards turning XML into a real data model by improving the semantics that can be expressed about XML documents. Various works have addressed how to define different classes of integrity constraints and the development of a normalisation theory for XML. One area which received little to no attention from the research community up to five years ago is the study of functional dependencies in the context of XML [37]. Since then, there has been increasingly more research investigating functional dependencies in XML. Nevertheless, a comprehensive dependency theory and normalisation theory for XML have yet to emerge. Functional dependencies are an integral part of database theory in the relational data model (RDM). In particular, functional dependencies have been vital in the investigation of how to design "good" relational database schemas which avoid or minimise problems relating to data redundancy and data inconsistency. Since the same problems can be shown to exist in poorly designed XML schemas 1 , there is a need to investigate how these problems can be eliminated in the context of XML. We believe that the study of an analogy to relational functional dependencies in the context of XML is equally significant towards designing "good" XML schemas. [FROM INTRODUCTION

    XML Matchers: approaches and challenges

    Full text link
    Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

    XML data integrity based on concatenated hash function

    Get PDF
    Data integrity is the fundamental for data authentication. A major problem for XML data authentication is that signed XML data can be copied to another document but still keep signature valid. This is caused by XML data integrity protecting. Through investigation, the paper discovered that besides data content integrity, XML data integrity should also protect element location information, and context referential integrity under fine-grained security situation. The aim of this paper is to propose a model for XML data integrity considering XML data features. The paper presents an XML data integrity model named as CSR (content integrity, structure integrity, context referential integrity) based on a concatenated hash function. XML data content integrity is ensured using an iterative hash process, structure integrity is protected by hashing an absolute path string from root node, and context referential integrity is ensured by protecting context-related elements. Presented XML data integrity model can satisfy integrity requirements under situation of fine-grained security, and compatible with XML signature. Through evaluation, the integrity model presented has a higher efficiency on digest value-generation than the Merkle hash tree-based integrity model for XML data

    Building XML data warehouse based on frequent patterns in user queries

    Get PDF
    [Abstract]: With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Due to the extremely large amount of XML data available on web, unguided warehousing of XML data turns out to be highly costly and usually cannot well accommodate the users’ needs in XML data acquirement. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of integrated XML documents in the warehouse are built using these frequent query patterns represented as Frequent Query Pattern Trees (FreqQPTs). Using hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that the overall processing of the same queries issued against the global schema become much efficient by using the XML data warehouse built than by directly searching the multiple data sources

    XWeB: the XML Warehouse Benchmark

    Full text link
    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse that is based on a unified reference model for XML warehouses and that features XML-specific structures, and its associate XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems
    • …
    corecore