6,266 research outputs found
ApproXFILTER - an approximative XML filter
Publish/subscribe systems filter published documents and inform their subscribers about documents matching their interests. Recent systems have focussed on documents or messages sent in XML format. Subscribers have to be familiar with the underlying XML format to create meaningful subscriptions. A service might support several providers with slightly differing formats, e.g., several publishers of books. This makes the definition of a successful subscription almost impossible. We propose the use of an approximative language for subscriptions. We introduce the design of our ApproXFILTER algorithm for approximative filtering
in a pub/sub system. We present the results of our analysis of a prototypical implementation
Investigation into Indexing XML Data Techniques
The rapid development of XML technology has improved the WWW, since XML data has many advantages and has become a common technology for transferring data across the internet. The objective of this research is therefore to investigate and study XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues.
Furthermore, this research considers the most common XML indexing techniques and compares them. Subsequently, this work presents an argument to identify these limitations. To conclude, the main problem of all the XML indexing techniques is the trade-off between the
size and the efficiency of the indexes. All the indexes become large in order to perform well, and none of them is suitable for all users’ requirements. However, each of these techniques has some advantages in some respects
Validation of streaming XML documents with abstract state machines
The exact validation of streaming XML documents can be realised by using visibly push-down automata (VPA) that are defined by Extended Document Type Definitions (EDTD). It is straightforward to represent such an automaton as an Abstract State Machine (AS
Use-cases on evolution
This report presents a set of use cases for evolution and reactivity for data in the Web and
Semantic Web. This set is organized around three different case study scenarios, each of them
is related to one of the three different areas of application within Rewerse. Namely, the scenarios
are: “The Rewerse Information System and Portal”, closely related to the work of A3
– Personalised Information Systems; “Organizing Travels”, that may be related to the work
of A1 – Events, Time, and Locations; “Updates and evolution in bioinformatics data sources”
related to the work of A2 – Towards a Bioinformatics Web
MonetDB/XQuery: a fast XQuery processor powered by a relational engine
Relational XQuery systems try to re-use mature relational data management infrastructures to create fast and scalable XML database technology. This paper describes the main features, key contributions, and lessons learned while implementing such a system. Its architecture consists of (i) a range-based encoding of XML documents into relational tables, (ii) a compilation technique that translates XQuery into a basic relational algebra, (iii) a restricted (order) property-aware peephole relational query optimization strategy, and (iv) a mapping from XML update statements into relational updates. Thus, this system implements all essential XML database functionalities (rather than a single feature) such that we can learn from the full consequences of our architectural decisions. While implementing this system, we had to extend the state-of-the-art with a number of new technical contributions, such as loop-lifted staircase join and efficient relational query evaluation strategies for XQuery theta-joins with existential semantics. These contributions as well as the architectural lessons learned are also deemed valuable for other relational back-end engines. The performance and scalability of the resulting system is evaluated on the XMark benchmark up to data sizes of 11GB. The performance section also provides an extensive benchmark comparison of all major XMark results published previously, which confirm that the goal of purely relational XQuery processing, namely speed and scalability, was met
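As a rough illustration of what a range-based encoding of XML into a relational table can look like, the sketch below assigns each element a preorder rank, a subtree size and a depth. The column names (`pre`, `size`, `level`) and the helper functions are assumptions made for this example; the abstract does not specify the system's actual schema. With such a table, the descendants of a node are exactly the rows whose `pre` value falls in the range `(pre, pre + size]`, so an axis step becomes a range selection.

```python
import xml.etree.ElementTree as ET


def encode(xml_text):
    """Return rows (pre, size, level, tag) in document order (hypothetical schema)."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(node, level):
        pre = len(rows)           # preorder rank = position in document order
        rows.append(None)         # reserve the slot until the subtree size is known
        for child in node:
            visit(child, level + 1)
        size = len(rows) - pre - 1  # number of descendants
        rows[pre] = (pre, size, level, node.tag)

    visit(root, 0)
    return rows


def descendants(rows, pre):
    """Select the descendants of the node with rank `pre` via a range predicate."""
    size = rows[pre][1]
    return [row for row in rows if pre < row[0] <= pre + size]


if __name__ == "__main__":
    table = encode("<a><b><c/><d/></b><e/></a>")
    for row in table:
        print(row)
    print([row[3] for row in descendants(table, 1)])
```

In a relational engine the same predicate would be expressed as a join or selection on the `pre` column, which is the kind of operation that specialised operators such as the staircase join mentioned in the abstract are designed to speed up.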
Handling of current time in native XML databases
The introduction of Native XML databases opens many research questions related to the data models used to represent and manipulate data, including temporal data in XML. Increasing use of XML for Valid Web pages warrants an adequate treatment of now in Native XML databases. In this study, we examined how to represent and manipulate now-relative temporal data. We identify different approaches being used to represent current time in XML temporal databases, and introduce the notion of storing variables such as 'now' or 'UC' as strings in XML native databases. All approaches are empirically evaluated on a query that time-slices the timeline at the current time. The experimental results indicate that the proposed extension offers several advantages over other approaches: better semantics, less storage space and better response time
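The idea of storing a variable such as 'now' as a string and resolving it only at query time can be sketched as follows. The document layout, the attribute names (`start`, `end`) and the functions `resolve` and `timeslice` are hypothetical, chosen for illustration only; this is not the evaluation setup of the study.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical temporal document: Bob's period is still open, so its end is 'now'.
DOC = """<staff>
  <emp name="Ann" start="2001-01-01" end="2003-06-30"/>
  <emp name="Bob" start="2002-05-01" end="now"/>
</staff>"""


def resolve(value, today):
    """Bind the stored string 'now' to the current date; parse anything else."""
    return today if value == "now" else date.fromisoformat(value)


def timeslice(xml_text, today):
    """Return the names of elements whose validity period contains `today`."""
    root = ET.fromstring(xml_text)
    valid = []
    for emp in root.iter("emp"):
        start = resolve(emp.get("start"), today)
        end = resolve(emp.get("end"), today)
        if start <= today <= end:
            valid.append(emp.get("name"))
    return valid


if __name__ == "__main__":
    print(timeslice(DOC, date.today()))
```

Because 'now' is never materialised as a concrete timestamp in the stored document, no update is needed as time advances; the cost is moved to the comparison at query time.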
Pentagonal scheme for dynamic XML prefix labelling
In XML databases, the indexing process is based on a labelling or
numbering scheme and generally used to label an XML document to
perform an XML query using the path node information. Moreover, a
labelling scheme helps to capture the structural relationships during the
processing of queries without the need to access the physical document.
Two of the main problems for labelling XML schemes are duplicated
labels and the cost efficiency of labelling time and size. This research
presents a novel dynamic XML labelling scheme, called the Pentagonal
labelling scheme, in which data are represented as ordered XML nodes
with relationships between them. The update of these nodes from large scale XML documents has been widely investigated and represents a
challenging research problem as it means relabelling a whole tree. Our
algorithms provide an efficient dynamic XML labelling scheme that
supports data updates without duplicating labels or relabelling old nodes.
Our work evaluates the labelling process in terms of size and time, and
evaluates the labelling scheme’s ability to handle several insertions in
XML documents. The findings indicate that the Pentagonal scheme
shows a better initial labelling time performance than the compared
schemes, particularly when using large XML datasets. Moreover, it
efficiently supports random skewed updates, and its fast calculations and
uncomplicated implementation allow it to handle updates efficiently. It
also proved its capability in terms of query performance and in determining
the relationships.
Libyan governmen
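To make the notion of a prefix labelling scheme concrete, the sketch below uses a plain Dewey-style labelling, in which each node's label is its parent's label extended by the node's position among its siblings. This is a generic textbook scheme used here only as an assumption for illustration; it is not the Pentagonal scheme, whose label construction is not described in the abstract. It shows how structural relationships can be decided from labels alone, without accessing the document.

```python
import xml.etree.ElementTree as ET


def label(xml_text):
    """Map Dewey-style prefix labels (tuples of sibling positions) to tag names."""
    root = ET.fromstring(xml_text)
    labels = {}

    def visit(node, prefix):
        labels[prefix] = node.tag
        for position, child in enumerate(node, start=1):
            visit(child, prefix + (position,))

    visit(root, (1,))
    return labels


def is_ancestor(a, d):
    """a is an ancestor of d iff a's label is a proper prefix of d's label."""
    return len(a) < len(d) and d[:len(a)] == a


def is_parent(a, d):
    """a is the parent of d iff d's label is a's label plus one component."""
    return len(d) == len(a) + 1 and d[:len(a)] == a


if __name__ == "__main__":
    for lab, tag in label("<a><b><c/></b><d/></a>").items():
        print(".".join(map(str, lab)), tag)
```

A static scheme like this one must relabel following siblings when a node is inserted between two existing nodes, which is the relabelling cost that dynamic schemes aim to avoid.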
XML content warehousing: Improving sociological studies of mailing lists and web data
In this paper, we present the guidelines for an XML-based approach for the
sociological study of Web data such as the analysis of mailing lists or
databases available online. The use of an XML warehouse is a flexible solution
for storing and processing this kind of data. We propose an implemented
solution and show possible applications with our case study of profiles of
experts involved in W3C standard-setting activity. We illustrate the
sociological use of semi-structured databases by presenting our XML Schema for
mailing-list warehousing. An XML Schema allows many adjunctions or crossings of
data sources, without modifying existing data sets, while allowing possible
structural evolution. We also show that the existence of hidden data implies
increased complexity for traditional SQL users. XML content warehousing allows
altogether exhaustive warehousing and recursive queries through contents, with
far less dependence on the initial storage. We finally present the possibility
of exporting the data stored in the warehouse to commonly-used advanced
software devoted to sociological analysis
- …