8 research outputs found

    TypEx : a type based approach to XML stream querying

    We consider the topic of query evaluation over semistructured information streams, and XML data streams in particular. Streaming evaluation methods are necessarily event-driven, which is in tension with high-level query models; in general, the more expressive the query language, the harder it is to translate queries into an event-based implementation with finite resource bounds.
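To make the event-driven constraint concrete, here is a minimal sketch of evaluating one fixed root-to-node path over an XML stream. It is not the TypEx method; the function name `stream_path_matches`, the sample document, and the restriction to a plain child-axis path are all assumptions made for illustration. The only query state kept is the stack of currently open tags.

```python
import io
import xml.etree.ElementTree as ET


def stream_path_matches(source, path):
    """Yield the text of each element whose root-to-node tag path equals `path`.

    `path` is a list of tag names, e.g. ["lib", "book", "title"] (hypothetical).
    """
    stack = []  # tags of the elements that are currently open
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            # At an "end" event the element's text is fully available.
            if stack == path:
                yield elem.text
            stack.pop()
            # Drop the element's text and children once it has been handled.
            elem.clear()


# Hypothetical sample stream.
SAMPLE = (
    b"<lib><book><title>A</title></book>"
    b"<article><title>B</title></article>"
    b"<book><title>C</title></book></lib>"
)
matches = list(stream_path_matches(io.BytesIO(SAMPLE), ["lib", "book", "title"]))
```

A fixed path like this needs state proportional only to document depth. Predicates that look ahead or across branches need buffering, which is where the tension with expressive query languages described above arises.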

    Extracting partition statistics from semistructured data

    The effective grouping, or partitioning, of semistructured data is of fundamental importance when providing support for queries. Partitions allow items within the data set that share common structural properties to be identified efficiently. This allows queries that make use of these properties, such as branching path expressions, to be accelerated. Here, we evaluate the effectiveness of several partitioning techniques by establishing the number of partitions that each scheme can identify over a given data set. In particular, we explore the use of parameterised indexes, based upon the notion of forward and backward bisimilarity, as a means of partitioning semistructured data, demonstrating that even restricted instances of such indexes can be used to identify the majority of relevant partitions in the data.
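As a rough illustration of how a parameterised index partitions data, the sketch below groups the nodes of a small tree by backward k-bisimilarity, where the parameter `k` limits how many ancestor levels are inspected. It is a sketch under stated assumptions, not the paper's implementation: the function name, the tree encoding (`labels` and `parent` dictionaries), and the sample data are hypothetical, and only tree-shaped data with a single parent per node is handled.

```python
from collections import defaultdict


def k_bisim_partition(labels, parent, k):
    """Partition tree nodes by backward k-bisimilarity.

    labels: dict mapping node id -> label
    parent: dict mapping node id -> parent id, or None for the root
    k:      number of refinement rounds (ancestor levels inspected)
    """
    # Round 0: nodes are grouped by label alone.
    block = {n: labels[n] for n in labels}
    for _ in range(k):
        # Refinement: pair each node's label with its parent's previous block.
        block = {
            n: (labels[n], block[parent[n]] if parent[n] is not None else None)
            for n in labels
        }
    groups = defaultdict(list)
    for node, signature in block.items():
        groups[signature].append(node)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: db(0) -> book(1), article(2); book -> title(3), author(5);
# article -> title(4).
LABELS = {0: "db", 1: "book", 2: "article", 3: "title", 4: "title", 5: "author"}
PARENT = {0: None, 1: 0, 2: 0, 3: 1, 4: 2, 5: 1}

by_label = k_bisim_partition(LABELS, PARENT, 0)
by_parent = k_bisim_partition(LABELS, PARENT, 1)
```

With `k = 0` the two `title` nodes share a partition; with `k = 1` they are separated because their parents carry different labels. Counting the partitions produced at each `k` is one simple way to compare index granularities on a data set.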

    EMT and stemness: flexible processes tuned by alternative splicing in development and cancer progression

    Compact Data Structures for Querying XML

    XML is of growing importance in a range of computer applications. In addition to being a document exchange format, it is now commonly used for data storage and retrieval as well. While XML offers great potential to unite data exchange and storage, it is expensive to process data stored in textual format. The document object model (DOM) defines convenient means to access XML data, but many implementations struggle with performance limitations. The data model implied by the standard needs new data structures and adapted query algorithms to enable native XML databases to perform acceptably. This introduction describes a preliminary data structure developed with these objectives in mind. It also gives an overview of our future research directions and anticipated problems.

    A resource efficient hybrid data structure for twig queries

    Designing data structures for use in mobile devices requires attention to optimising data volumes, with associated benefits for data transmission, storage space and battery use. For semistructured data, tree summarisation techniques can be used to reduce the volume of structured elements, while dictionary compression can efficiently deal with value-based predicates. This paper introduces an integration of the two approaches, using numbering schemes to connect the separate elements. The key strength of this hybrid technique is that both structural and value predicates can be resolved in one graph, while further allowing for compression of the resulting data structure. Performance measures that show advantages of using this hybrid structure are presented, together with an analysis of query resolution using a number of different index granularities. As the current trend is towards working with larger semistructured data sets, this work allows these data sets to be used whilst reducing both the bandwidth and storage space necessary.

    A Model For Querying Semistructured Data Through The Exploitation

    Introduction and Model. Much research has been undertaken in order to speed up the processing of semistructured data in general and XML in particular. Many approaches for storage, compression, indexing and querying exist, e.g. [1, 2]. We do not present yet another such algorithm but a unifying model in which these algorithms can be understood. The key idea behind this research is the assumption that most practical queries are based on a particular pattern of data that can be deduced from the query and which can then be captured using a regular structure amenable to efficient processing techniques. To aid understanding, we divide the problem of query optimisation into five distinct steps, which are presented in Figure 1 and described below: 1. Design of an index for a particular query class: Given a particular query pattern or class, indices can be devised that support easy evaluation. If the index is covering, we can even dispose of the original data and resolve the query entirel