29 research outputs found

    A fractional number based labeling scheme for dynamic XML updating

    Recently, XML query processing based on labeling schemes has been proposed. Based on labeling schemes, the structural relationship between XML nodes can be determined quickly without accessing the XML document. However, labeling schemes have to relabel the pre-existing nodes or recalculate the label values when a new node is inserted into the XML document during the update process. In this paper, we propose a novel labeling scheme based on fractional numbers. The key feature of fractional numbers is that an infinite number of fractional numbers can be inserted between any two unequal fractional numbers. Therefore, the problem of relabeling the pre-existing nodes during XML updating can be solved if the XML nodes are labeled by fractional numbers.
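
The density property this abstract relies on can be illustrated with a minimal Python sketch. It assumes labels are exact rationals (`fractions.Fraction`) and uses the midpoint as the new label; `label_between` is a hypothetical helper, and the midpoint rule is an assumption rather than the rule the paper necessarily uses.

```python
from fractions import Fraction


def label_between(left: Fraction, right: Fraction) -> Fraction:
    """Return a label strictly between two unequal labels.

    The midpoint rule is an assumption made for illustration only.
    """
    if left >= right:
        raise ValueError("left label must be smaller than right label")
    return (left + right) / 2


# Three sibling nodes labeled in document order.
labels = [Fraction(1), Fraction(2), Fraction(3)]

# Insert a new sibling between the first two; existing labels stay untouched.
new_label = label_between(labels[0], labels[1])
labels.insert(1, new_label)

# Repeated insertion at the same position always succeeds, although the
# numerator and denominator of the new labels keep growing.
for _ in range(5):
    labels.insert(1, label_between(labels[0], labels[1]))
```

No existing label is rewritten at any point; the cost shows up instead as label size, since each insertion at the same position makes the new fraction's denominator larger.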

    Investigation into Indexing XML Data Techniques

    The rapid development of XML technology improves the WWW, since XML data has many advantages and has become a common technology for transferring data across the internet. Therefore, the objective of this research is to investigate and study XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues. Furthermore, this research considers the most common XML indexing techniques and performs a comparison between them. Subsequently, this work makes an argument to identify these limitations. To conclude, the main problem of all the XML indexing techniques is the trade-off between the size and the efficiency of the indexes. All the indexes become large in order to perform well, and none of them is suitable for all users' requirements. However, each of these techniques has some advantages.

    Strategies and Approaches for Generating Identical Extensive XML Tree Instances

    In recent years, XML has become the de facto internet wire language. Data may be organized and given context with the use of XML. A well-organized document facilitates the transformation of raw data into actionable intelligence. In B2B applications, XML data is created and sent. This implies the need for fast query processing on XML data. The processing of XML tree sample queries (XTPQ) that provide an efficient response (also known as sample matching) is a topic of active study in the XML database field. A DOM parser may be used to transform an XML document into a tree representation. XML query languages like XPath and XQuery use tree samples (twigs) to express query results. XML query processing focuses mostly on effectively locating all instances of twig samples inside an XML database. Numerous techniques for matching such tree samples have been presented in recent years. In this study, we survey recent developments in XTPQ processing. This summary will begin by introducing several algorithms for twig sample matching and then go on to provide some background on holistic techniques to process XTPQ.
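
To make the idea of locating twig instances concrete, here is a naive Python sketch using the standard `xml.etree.ElementTree` parser. It is not one of the holistic algorithms the survey covers. The twig encoding `(tag, [sub-twigs])`, the helper names `matches` and `find_twig_roots`, and the sample document are all assumptions for illustration, and every twig edge is treated as an ancestor-descendant edge.

```python
import xml.etree.ElementTree as ET


def matches(elem, twig):
    """Return True if `elem` can serve as the root of a match for `twig`.

    A twig is (tag, [sub-twigs]); every edge is ancestor-descendant.
    """
    tag, branches = twig
    if elem.tag != tag:
        return False
    # Each branch must be matched by some proper descendant of `elem`.
    return all(
        any(matches(desc, branch) for desc in elem.iter() if desc is not elem)
        for branch in branches
    )


def find_twig_roots(root, twig):
    """Return every element in the document that roots a match of the twig."""
    return [elem for elem in root.iter() if matches(elem, twig)]


doc = ET.fromstring(
    "<lib>"
    "<book><title/><info><author/></info></book>"
    "<book><title/></book>"
    "</lib>"
)

# Roughly the XPath pattern //book[.//title][.//author]
twig = ("book", [("title", []), ("author", [])])
hits = find_twig_roots(doc, twig)
```

This brute-force version rescans subtrees repeatedly, which is exactly the cost that label-based and holistic matching techniques aim to avoid.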

    SCOOTER: A compact and scalable dynamic labeling scheme for XML updates

    Although dynamic labeling schemes for XML have been the focus of recent research activity, there are significant challenges still to be overcome. In particular, though there are labeling schemes that ensure a compact label representation when creating an XML document, when the document is subject to repeated and arbitrary deletions and insertions, the labels grow rapidly and consequently have a significant impact on query and update performance. We review the outstanding issues to date and in this paper we propose SCOOTER - a new dynamic labeling scheme for XML. The new labeling scheme can completely avoid relabeling existing labels. In particular, SCOOTER can handle frequently skewed insertions gracefully. Theoretical analysis and experimental results confirm the scalability, compact representation, efficient growth rate and performance of SCOOTER in comparison to existing dynamic labeling schemes.

    Dynamic and Multi-functional Labeling Schemes

    We investigate labeling schemes supporting adjacency, ancestry, sibling, and connectivity queries in forests. In the course of more than 20 years, the existence of $\log n + O(\log \log n)$ labeling schemes supporting each of these functions was proven, with the most recent being ancestry [Fraigniaud and Korman, STOC '10]. Several multi-functional labeling schemes also enjoy lower or upper bounds of $\log n + \Omega(\log \log n)$ or $\log n + O(\log \log n)$ respectively. Notably an upper bound of $\log n + 5\log \log n$ for adjacency+siblings and a lower bound of $\log n + \log \log n$ for each of the functions siblings, ancestry, and connectivity [Alstrup et al., SODA '03]. We improve the constants hidden in the $O$-notation. In particular we show a $\log n + 2\log \log n$ lower bound for connectivity+ancestry and connectivity+siblings, as well as an upper bound of $\log n + 3\log \log n + O(\log \log \log n)$ for connectivity+adjacency+siblings by altering existing methods. In the context of dynamic labeling schemes it is known that ancestry requires $\Omega(n)$ bits [Cohen, et al. PODS '02]. In contrast, we show upper and lower bounds on the label size for adjacency, siblings, and connectivity of $2\log n$ bits, and $3\log n$ to support all three functions. There exist efficient adjacency labeling schemes for planar, bounded treewidth, bounded arboricity and interval graphs. In a dynamic setting, we show a lower bound of $\Omega(n)$ for each of those families. Comment: 17 pages, 5 figures

    Desirable properties for XML update mechanisms

    The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages have resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents has more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and present our findings through an evaluation matrix for most of the existing schemes that provide update functionality.

    A Clustering-based Scheme for Labeling XML Trees

    Summary: Tree labeling plays a key role in XML query processing. In this paper, we propose a new labeling scheme, called Clustering-based Labeling. Unlike all previous labeling methods, in this labeling scheme elements are separated into various groups, and a label is assigned to a group of elements instead of one element. Based on Clustering-based Labeling we design a new relational schema, similar to the OrdPath scheme, for storing XML documents in a relational database. Grouping sibling nodes into one record reduces the number of relational records needed for XML document storage. Our experimental results show that our storing scheme is significantly better than three well-known relational XML storing methods in terms of number of stored records, document reconstruction time and query processing performance.
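
The record-count argument in this abstract can be sketched in Python. The sketch only compares one record per element against one record per sibling group. It does not reproduce the paper's labels or relational schema, and the helper names and sample document are assumptions.

```python
import xml.etree.ElementTree as ET


def per_node_records(root):
    """One record per element (the conventional layout)."""
    return [(elem.tag,) for elem in root.iter()]


def per_group_records(root):
    """One record per group of siblings, plus one record for the root."""
    records = [(root.tag,)]
    for parent in root.iter():
        siblings = tuple(child.tag for child in parent)
        if siblings:  # leaf elements contribute no group of their own
            records.append(siblings)
    return records


doc = ET.fromstring("<a><b/><c/><d><e/><f/></d></a>")
node_records = per_node_records(doc)    # six records, one per element
group_records = per_group_records(doc)  # three records, one per sibling group
```

The saving grows with fan-out: a parent with many children still costs a single record in the grouped layout.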

    The Emergence Computation of Overflow in Dynamic XML Tree Based on Prefix and Interval Labelling Schemes

    Despite the fact that dynamic XML labelling schemes have been investigated widely, some challenges still need to be tackled. Dynamic XML documents are subject to change. An efficient dynamic labelling scheme is able to maintain the node relationships throughout continuous changes to the XML tree structure. Such a scheme generates labels for new nodes to avoid the need to relabel the whole tree. The main problem for dynamic XML is overflow, which occurs when the label length of the new node exceeds the reserved space limit. There has not been sufficient analysis to determine the class of labelling scheme which faces this problem in the early stages of update. To this end a series of experiments were performed when updating the NASA XML database, which contains real data. Five sets of new nodes (50, 100, 400, 800, 1200) were inserted into this dataset using two versions of an XML node indexing system: a Prefix and an Interval labelling scheme. It was found that Interval falls victim to the problem of overflow after the insertion of only 100 nodes whereas Prefix has no problem even when adding 1200 nodes.
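
A toy Python sketch can show why an interval scheme with a fixed integer gap may overflow sooner than a prefix scheme. The label formats, the `allocate_interval` helper, and the bounds `(0, 100)` are assumptions for illustration; they are not the indexing system or the reserved-space limits used in the experiments.

```python
def is_ancestor_interval(anc, desc):
    """Interval labels are (start, end); an ancestor's interval encloses the descendant's."""
    return anc[0] < desc[0] and desc[1] < anc[1]


def is_ancestor_prefix(anc, desc):
    """Prefix labels are tuples; an ancestor's label is a proper prefix of the descendant's."""
    return len(anc) < len(desc) and desc[: len(anc)] == anc


def allocate_interval(left, right):
    """Pick (start, end) strictly between two integer bounds, or None on overflow."""
    gap = right - left
    if gap < 3:  # fewer than two free integers remain
        return None
    return (left + gap // 3, left + 2 * gap // 3)


# Interval scheme: keep appending a new last child inside the parent interval.
parent = (0, 100)
left, interval_inserts, slots = parent[0], 0, []
while (slot := allocate_interval(left, parent[1])) is not None:
    slots.append(slot)
    left, interval_inserts = slot[1], interval_inserts + 1

# Prefix scheme: the same workload only appends a larger last component.
parent_prefix = (1,)
children = [parent_prefix + (i,) for i in range(1, 1201)]
```

In this toy setting the interval gap is exhausted after a few skewed insertions, while the prefix labels keep extending. Real prefix schemes with fixed-width components can still overflow, so the sketch illustrates the mechanism and not the measured results.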

    IMAX: incremental maintenance of schema-based XML statistics

    Journal Article. Current approaches for estimating the cardinality of XML queries are applicable to a static scenario wherein the underlying XML data does not change subsequent to the collection of statistics on the repository. However, in practice, many XML-based applications are dynamic and involve frequent updates to the data. In this paper, we investigate efficient strategies for incrementally maintaining statistical summaries as and when updates are applied to the data. Specifically, we propose algorithms that handle both the addition of new documents as well as random insertions in the existing document trees. We also show, through a detailed performance evaluation, that our incremental techniques are significantly faster than the naive recomputation approach; and that estimation accuracy can be maintained even with a fixed memory budget.
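
The contrast between incremental maintenance and naive recomputation can be sketched in Python. The summary used here, a count of elements per root-to-node tag path, is an assumed stand-in for the schema-based statistics in the paper, and `path_counts` is a hypothetical helper.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def path_counts(elem, prefix=""):
    """Count elements per root-to-node tag path in the subtree rooted at `elem`."""
    counts = Counter()
    path = f"{prefix}/{elem.tag}"
    counts[path] += 1
    for child in elem:
        counts.update(path_counts(child, path))
    return counts


doc = ET.fromstring("<lib><book><title/></book></lib>")
stats = path_counts(doc)  # full scan, done once

# Update: insert a new subtree under the root.
new_book = ET.fromstring("<book><title/><author/></book>")
doc.append(new_book)

# Incremental maintenance: visit only the inserted subtree.
stats.update(path_counts(new_book, prefix="/lib"))

# Naive alternative: rescan the whole document after every update.
recomputed = path_counts(doc)
```

Both routes yield the same summary, but the incremental one does work proportional to the inserted subtree and not to the whole repository.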