The proliferation of XML as a standard for data representation and exchange in diverse, next-generation Web applications has created a pressing need for effective XML data-integration tools. In many real-life scenarios, such XML data integration needs to be DTD-directed; in other words, the target, integrated XML database must conform to a prespecified, user- or application-defined DTD. In this paper, we propose a novel formalism, XML Integration Grammars (XIGs), for specifying DTD-directed integration of XML data. Abstractly, an XIG maps data from multiple XML sources to a target XML document that conforms to a predefined DTD. An XIG extracts source XML data via queries expressed in a fragment of XQuery, and controls target document generation with tree-valued attributes and the target DTD. The novelty of XIGs lies not only in their automatic support for DTD-conformance but also in their composability: an XIG may embed local and remote XIGs in its definition, and invoke these XIGs during its evaluation. This yields an important modularity property for XIGs that allows one to divide a complex integration task into manageable sub-tasks and conquer each of them separately. To evaluate XIGs efficiently, we provide algorithms for merging XML queries in an XIG and for scheduling queries and embedded XIGs. These lead to an effective framework, as well as a design tool for XQuery, for specifying and computing complex, DTD-directed XML integration.