
    Contorsion: A Semantic XPath Processor

    This work describes the architecture of Contorsion, a semantic XPath processor that operates over an RDF mapping of XML. It contributes to a recent research trend that defines an XML-to-RDF mapping allowing XML documents to interoperate at the semantic level. We use a model-mapping approach to represent instances of XML and XML Schema in RDF. This representation retains the node order, in contrast with the usual structure-mapping approach. The processor can be fed with an unlimited set of XML schemas and/or RDFS/OWL ontologies. Queries are resolved taking into consideration the structural and semantic connections described in the schemas and ontologies. Such behaviour, schema-awareness and semantic integration, can be useful for exploiting schema and ontology hierarchies in XPath queries.
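
    The idea of a model mapping that keeps node order can be illustrated with a small Python sketch. It is not Contorsion's actual vocabulary or implementation: the function `xml_to_triples` and the predicates `name`, `childOf`, `position` and `text` are hypothetical names chosen for illustration, and plain tuples stand in for RDF triples.

```python
import xml.etree.ElementTree as ET


def xml_to_triples(xml_text):
    """Flatten an XML document into (subject, predicate, object) tuples.

    Every element becomes a node with a generated id; the 'position'
    predicate (hypothetical) records its 1-based order among siblings,
    so document order survives the mapping.
    """
    root = ET.fromstring(xml_text)
    triples = []
    counter = 0

    def visit(elem, parent_id, position):
        nonlocal counter
        node_id = f"n{counter}"
        counter += 1
        triples.append((node_id, "name", elem.tag))
        if parent_id is not None:
            triples.append((node_id, "childOf", parent_id))
            triples.append((node_id, "position", position))
        text = (elem.text or "").strip()
        if text:
            triples.append((node_id, "text", text))
        for i, child in enumerate(elem, start=1):
            visit(child, node_id, i)

    visit(root, None, 0)
    return triples


triples = xml_to_triples("<a><b>x</b><c/></a>")
```

    Because sibling order is stored explicitly, an order-sensitive XPath step such as `/a/*[2]` could in principle be answered from the triples alone, which a mapping that discards order cannot do.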

    From XML schema to relations: a cost-based approach to XML storage

    As Web applications manipulate an increasing amount of XML, there is a growing interest in storing XML data in relational databases. Due to the mismatch between the complexity of XML's tree structure and the simplicity of flat relational tables, there are many ways to store the same document in an RDBMS, and a number of heuristic techniques have been proposed. These techniques typically define fixed mappings and do not take application characteristics into account. However, a fixed mapping is unlikely to work well for all possible applications. In contrast, LegoDB is a cost-based XML storage mapping engine that explores a space of possible XML-to-relational mappings and selects the best mapping for a given application. LegoDB leverages current XML and relational technologies: 1) it models the target application with an XML Schema, XML data statistics, and an XQuery workload; 2) the space of configurations is generated through XML-Schema rewritings; and 3) the best among the derived configurations is selected using cost estimates obtained through a standard relational optimizer. In this paper, we describe the LegoDB storage engine and provide experimental results that demonstrate the effectiveness of this approach.
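
    The selection step can be pictured with a toy Python sketch. It is not LegoDB's algorithm: the cost formula below is an invented stand-in for the relational optimizer's estimates, and the names `workload_cost`, `pick_mapping`, `JOIN_PENALTY` and the example schema are all hypothetical.

```python
JOIN_PENALTY = 2.0  # assumed extra cost per additional table a query must join


def workload_cost(mapping, workload):
    """Toy cost of a workload under one element-to-table mapping.

    mapping:  dict element name -> table name
    workload: list of (weight, [element names the query reads])
    A query pays the width (number of stored elements) of every table
    it touches, plus JOIN_PENALTY for each join between those tables.
    """
    width = {}
    for table in mapping.values():
        width[table] = width.get(table, 0) + 1
    total = 0.0
    for weight, elements in workload:
        tables = {mapping[e] for e in elements}
        scan = sum(width[t] for t in tables)
        total += weight * (scan + JOIN_PENALTY * (len(tables) - 1))
    return total


def pick_mapping(candidates, workload):
    """Return the name of the candidate mapping with the lowest toy cost."""
    return min(candidates, key=lambda name: workload_cost(candidates[name], workload))


candidates = {
    "inlined": {"show": "Show", "title": "Show", "review": "Show"},
    "split": {"show": "Show", "title": "Show", "review": "Review"},
}
title_heavy = [(0.9, ["show", "title"]), (0.1, ["show", "review"])]
review_heavy = [(0.1, ["show", "title"]), (0.9, ["show", "review"])]

best_for_titles = pick_mapping(candidates, title_heavy)
best_for_reviews = pick_mapping(candidates, review_heavy)
```

    Under these made-up numbers the two workloads favour different mappings, which is the point the abstract makes about fixed mappings.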

    Inferring functional dependencies for XML storage

    XML's hierarchical structure, in which elements may be nested and repeated, allows data redundancy. The same information can therefore appear in more than one place; in fact, the same elements may appear in different sub-trees. This capability makes XML easier to understand and to parse, and recovering the information requires fewer joins. This is in contrast to relational data, for which normalization theory has been developed to eliminate data redundancy. Detecting redundancy in XML data is therefore important before a mapping can be done. In this paper, we use functional dependencies to detect data redundancies in XML documents. Based on inferring other functional dependencies from the given ones, we propose an algorithm for mapping XML DTDs to relational schemas. The result is a “good relational schema” in terms of reducing data redundancy and preserving the semantic constraints.
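
    Inferring dependencies from given ones can be sketched in Python with the textbook attribute-closure computation from relational theory. This is an assumption-laden stand-in, not the paper's algorithm: functional dependencies over XML paths need further inference rules that are not modelled here, and the path names and the helpers `closure` and `implies` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute (here: XML path) determined by `attrs`.

    fds is a list of (lhs, rhs) pairs of sets, read as lhs -> rhs.
    Dependencies are applied repeatedly until nothing new is added.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from the given dependencies."""
    return set(rhs) <= closure(lhs, fds)


# Hypothetical dependencies over paths of a course catalogue document.
fds = [
    ({"course/@cno"}, {"course/title"}),
    ({"course/title"}, {"course/dept"}),
]
derived = implies(fds, {"course/@cno"}, {"course/dept"})
not_derived = implies(fds, {"course/dept"}, {"course/@cno"})
```

    A derived dependency such as `course/@cno -> course/dept` signals that the department is stored redundantly wherever the course number repeats, which is the kind of redundancy a mapping would aim to factor out.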

    A method for mapping XML-based specifications between development methodologies

    The Unified Modeling Language (UML) is widely used by software engineers as the basis of analysis and design in software development. However, UML ignores human factors in the course of software development because of its strong emphasis on the internal structure and functionality of the application. This thesis presents a method of mapping human-computer interaction (HCI) requirement specifications generated by usability engineering (UE) methodologies (e.g. Putting Usability First (PUF)) into UML specifications. Both sets of requirement specifications are expressed in Extensible Markup Language (XML) so that HCI requirement specifications can be integrated into UML ones. A Mapping Tool was developed to facilitate the creation of mappings between PUF XML tags and XMI tags. The Mapping Tool was used to create mappings between PUF and UML requirement specifications. This mapping process and its outputs were evaluated to demonstrate that the tool worked. The results of the evaluation show that the HCI requirement specifications represented by the PUF XML tags can improve the UML specification when they are added to the XMI tags.

    Toward a Generic Mapping Language for Transformations between RDF and Data Interchange Formats

    While there exist approaches to integrate heterogeneous data using semantic models, such semantic models typically cannot be used by existing software tools. Many software tools, especially in engineering, only have options to import and export data in more established data interchange formats such as XML or JSON. Thus, if information included in a semantic model needs to be used in such a software tool, automatic approaches for mapping semantic information into an interchange format are needed. We aim to develop a generic mapping approach that allows users to create transformations of semantic information into a data interchange format with an arbitrary, user-defined structure. This mapping approach is currently being elaborated. In this contribution, we report our initial steps targeted at transformations from RDF into XML. First, a mapping language is introduced which allows automated mappings from ontologies to XML to be defined. Furthermore, a mapping algorithm capable of executing mappings defined in this language is presented. An evaluation is done with a use case in which engineering information needs to be used in a 3D modeling tool.

    VAMANA : A High Performance, Scalable and Cost Driven XPath Engine

    Many applications are migrating to, or beginning to make use of, native XML data. We anticipate that queries will emerge that emphasize the structural semantics of XML query languages like XPath and XQuery. This brings a need for an efficient query engine and database management system tailored for XML data, similar to traditional relational engines. Mapping large XML documents into relational database systems, while possible, poses difficulty in mapping XML queries to the less powerful relational query language SQL and creates a data model mismatch between relational tables and semi-structured XML data. Hence native solutions to efficiently store and query XML data have recently been developed. However, most of these systems thus far fail to demonstrate scalability with large document sizes, to provide robust support for the XPath query language, or to adequately address costing with respect to query optimization. In this thesis, we propose VAMANA, a novel cost-driven XPath engine that supports the scalable evaluation of ad-hoc XPath expressions. VAMANA makes use of an efficient XML repository for storing and indexing large XML documents called the Multi-Axis Storage Structure (MASS), developed at WPI. VAMANA extensively uses indexes for query evaluation by considering index-only plans. To the best of our knowledge, it is the only XML query engine that supports an index plan approach for large XML documents. Our index-oriented query plans allow queries to be evaluated while reading only a fraction of the data, as all tuples for a particular context node are clustered together. The pipelined query framework minimizes the cost of handling intermediate data during query processing. Unlike other native solutions, VAMANA provides support for all 13 XPath axes. Our schema-independent cost model provides dynamically calculated statistics that are then used for intelligent cost-based transformations, further improving performance.
    Our optimization strategy for improving execution time is affirmed through our experimental studies on XMark benchmark data. VAMANA query execution is significantly faster than leading available XML query engines.

    Updating semi-structured data

    The Web has had tremendous success with its support for the rapid and inexpensive exchange of information. A considerable body of data exchange is in the form of semi-structured data such as the eXtensible Markup Language (XML). XML, an effective standard to represent and exchange semi-structured data on the Web, is used ubiquitously in almost all areas of information technology. Most researchers in the XML area have concentrated on storing, querying and publishing XML, while few have paid attention to updating XML; thus the XML update area is not fully developed. We propose a solution for updating XML as a representation of semi-structured data. XML is updated through an object-relational database (ORDB) to exploit the maturity of the relational engine and the newer object features of the OR technology. The engine is used to enforce constraints during the updating of the XML, whereas the object features are used to handle the XML hierarchical structure. Updating XML via an ORDB makes it easier to join XML documents in an update, and in turn joins of XML documents make it possible to keep non-redundant data in multiple XML documents. This thesis contributes a solution for the update of XML documents via an ORDB to advance our understanding of the XML update area. Rules for mapping XML structure and constraints to an ORDB schema are presented, and a mechanism to handle XML cardinality constraints is provided. An XML update language, an extension to XQuery, has been designed, and this language is translated into standard SQL executed on an ORDB. To handle the recursive nature of XML, a recursive function updating XML data is translated into SQL commands equipped with a programming capability. A method is developed to reflect the changes from the ORDB to XML documents. A prototype of the solution has been implemented to help validate our approach. An experimental study evaluating the performance of XML update processing based on the prototype has been conducted.
    The experimental results show that updating multiple XML documents storing non-redundant data yields better performance than updating a single XML document storing redundant data; an ORDB can take advantage of this by caching data to a greater extent than a native XML database. The solution of updating XML documents via an ORDB can solve some problems in existing update methods as follows. Firstly, the preservation of XML constraints is handled by the ORDB engine. Secondly, non-redundant data is stored in linked XML documents; thus the problems of data inconsistency and low performance caused by data redundancy are solved. Thirdly, joins of XML documents are converted to joins of tables in SQL. Fourthly, fields or tables involved in regular path expressions can be tackled in a short time by using mapping data. Finally, a recursive function is translated into SQL commands equipped with a programming capability.

    Storing Linked XML documents in Object-Relational DBMS

    Several researchers have proposed mapping both the structure and the constraints of XML documents to an object-relational database (ORDB). However, these approaches cannot be implemented because of the limited range of constraints in available object-relational DBMSs. We therefore propose mapping rules that are practicable with available technologies. Normally, an XML document is treated as a database, so much data redundancy occurs. To solve this problem, we keep non-redundant data in several separate XML documents, link the data dispersed in these documents together by a mechanism called ‘rlink’, and then map this mechanism to an ORDB. Finally, we perform a case study in Oracle9i to illustrate the mapping of XML to ORDB according to our rules. Our contribution is the finding that mapping linked XML documents to traditional databases such as (O)RDB makes it easier to join several documents and to update several documents in one update command.