60 research outputs found

    TOQL: Temporal Ontology Querying Language

    Full text link

    Indexing XML Documents Using Tree Paths Automaton

    Get PDF
    An XML document can be viewed as a tree in a natural way. Processing tree data structures usually requires a pushdown automaton as a model of computation. Therefore, it is interesting that a finite automaton can be used to solve the XML index problem. In this paper, we attempt to support a significant fragment of XPath queries which may use any combination of child (i.e., /) and descendant-or-self (i.e., //) axis. A systematic approach to the construction of such XML index, which is a finite automaton called Tree Paths Automaton, is presented. Given an XML tree model T, the tree is first of all preprocessed by means of its linear fragments called string paths. Since only path queries are considered, the branching structure of the XML tree model can be omitted. For individual string paths, smaller Tree Paths Automata are built, and they are afterwards combined to form the index. The searching phase uses the index, reads an input query Q of size m, and computes the list of positions of all occurrences of Q in the tree T. The searching is performed in time O(m) and does not depend on the size of the XML document. Although the number of queries is clearly exponential in the number of nodes of the XML tree model, the size of the index seems to be, according to our experimental results, usually only about 2.5 times larger than the size of the original document

    Indexing collections of XML documents with arbitrary links

    Get PDF
    In recent years, the popularity of XML has increased significantly. XML is the extensible markup language of the World Wide Web Consortium (W3C). XML is used to represent data in many areas, such as traditional database management systems, e-business environments, and the World Wide Web. XML data, unlike relational and object-oriented data, has no fixed schema known in advance and is stored separately from the data. XML data is self-describing and can model heterogeneity more naturally than relational or object-oriented data models. Moreover, XML data usually has XLinks or XPointers to data in other documents (e.g., global-links). In addition to XLink or XPointer links, the XML standard allows to add internal-links between different elements in the same XML document using the ID/IDREF attributes. The rise in popularity of XML has generated much interest in query processing over graph-structured data. In order to facilitate efficient evaluation of path expressions, structured indexes have been proposed. However, most variants of structured indexes ignore global- or interior-document references. They assume a tree-like structure of XML-documents, which do not contain such global-and internal-links. Extending these indexes to work with large XML graphs considering of global- or internal-document links, firstly requires a lot of computing power for the creation process. Secondly, this would also require a great deal of space in which to store the indexes. As a latter demonstrates, the efficient evaluation of ancestors-descendants queries over arbitrary graphs with long paths is indeed a complex issue. This thesis proposes the HID index (2-Hop cover path Index based on DAG) is based on the concept of a two-hop cover for a directed graph. The algorithms proposed for the HID index creation, in effect, scales down the original graph size substantially. As a result, a directed acyclic graph (DAG) with a smaller number of nodes and edges will emerge. This reduces the number of computing steps required for building the index. In addition to this, computing time and space will be reduced as well. The index also permits to efficiently evaluate ancestors-descendants relationships. Moreover, the proposed index has an advantage over other comparable indexes: it is optimized for descendants- or-self queries on arbitrary graphs with link relationship, a task that would stress any index structures. Our experiments with real life XML data show that, the HID index provides better performance than other indexes

    CBSE: an implementation case study

    Get PDF
    Over the last couple of years, the shift towards component based software engineering (CBSE) methods has become a cost effective way to get an application to implementation stage much earKer. Adoption of Component Based Development methods acknowledges the use of third party components wherever possible to reduce the cost of software development, shorten the development phase and provide a richer set of processing options for the end user. The use of these tools is particularly relevant in Web based applications, where commercial off the shelf (COTS) products are so prevalent. However, there are a number of risks associated with the use of component based development methods. This thesis investigates these risks within the context of a software engineering project and attempts to provide a means to minimise and or at least manage the risk potential when using component based development method

    Mu-Calculus Based Resolution of XPath Decision Problems

    Get PDF
    XPath is the standard declarative notation for navigating XML data and returning a set of matching nodes. In the context of XSLT/XQuery analysis, query optimization, and XML type checking, XPath decision problems arise naturally. They notably include XPath containment (whether or not for any tree the result of a particular query is included in the result of a second one), and XPath satisfiability (whether or not an expression yields a non-empty result), in the presence (or the absence) of XML DTDs. In this paper, we propose a unifying logic for XML, namely the alternation-free modal mu-calculus with converse. We show how to translate major XML concepts such as XPath and DTDs into this logic. Based on these embeddings, we show how XPath decision problems can be solved using a state-of-the-art EXPTIME decision procedure for mu-calculus satisfiability. We provide preliminary experiments which shed light, for the first time, on the cost of solving XPath decision problems in practice

    The many faces of publish/subscribe

    Get PDF
    Well adapted to the loosely coupled nature of distributed interaction in large-scale applications, the publish/subscribe communication paradigm has recently received increasing attention. With systems based on the publish/subscribe interaction scheme, subscribers register their interest in an event, or a pattern of events, and are subsequently asynchronously notified of events generated by publishers. Many variants of the paradigm have recently been proposed, each variant being specifically adapted to some given application or network model. This paper factors out the common denominator underlying these variants: full decoupling of the communicating entities in time, space, and synchronization. We use these three decoupling dimensions to better identify commonalities and divergences with traditional interaction paradigms. The many variations on the theme of publish/subscribe are classified and synthesized. In particular, their respective benefits and shortcomings are discussed both in terms of interfaces and implementations

    HIGH PERFORMANCE XPATH EVALUATION IN XML STREAMS

    Get PDF
    This thesis presents methods for efficiently evaluating structural queries over tree-structured data streams. A data stream usually consists of a sequence of items that arrive in an order determined by the source. An application that uses such data cannot revisit an earlier item in the stream unless it buffers the item itself. Naive buffering methods are not practical due to the high throughput and indefinite length of data streams. Compared with the flat, relational-like data model for data streams that has received recent attention, processing a tree-structured XML data stream poses additional challenges, since a data item cannot, in general, be interpreted without taking structural information into account. In this thesis, we focus on the evaluation of XPath queries on streaming XML. As a W3C standard, XPath has become a core XML technology not only as a standalone query language but also as the foundation of XQuery and XSLT. Features such as subqueries and reverse axes make XPath a powerful query language but they also complicate XPath query processing. We present our work on XSQ, a streaming XPath query engine. Our methods are based on a novel segment-based evaluation scheme. XSQ uses very little memory and is able to process unbounded and unsegmented streaming data because it does not build a DOM tree in memory. It also provides high throughput by only processing the relevant portions of the data and low response time by returning results as early as possible. XSQ is the first streaming system to support complex XPath features such as multiple predicates, closure axes, aggregations, reverse axes, and subqueries. We also describe our work on XPaSS, an XPath-based publish-subscribe system that simultaneously evaluates a large number of XPath queries over XML streams. Unlike other similar systems that filter pre-segmented documents as results, XPaSS returns only the precisely delineated data specified by a user query. It uses a segment-sharing scheme instead of prefix- and suffix-sharing that are commonly used. In our experiments, XPaSS supports up to one million XPath subscriptions using a modest PC-class server, with a throughput comparable to that of the simpler filtering systems

    XKeyMatch : um algoritmo semântico para detecção de diferenças entre documentos XML /

    Get PDF
    Orientadora: Carmem S.HaraInclui apêndiceDissertação (mestrado) - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defesa: Curitiba, 2006Inclui bibliografi
    • …
    corecore