Search CORE

2,222 research outputs found

Document Type De�nition (DTD) Metrics

Author: Basci D.
Misra Sanjay
Publication venue
Publication date: 01/01/2011
Field of study

In this paper, we present two complexity metrics for the assessment of schema quality written in Document Type De�finition (DTD) language. Both "Entropy (E) metric: E(DTD)" and "Distinct Structured Element Repetition Scale (DSERS) metric: DSERS(DTD)" are intended to measure the structural complexity of schemas in DTD language. These metrics exploit a directed graph representation of schema document and consider the complexity of schema due to its similar structured elements and the occurrences of these elements. The empirical and theoretical validations of these metrics prove the robustness of the metrics

Covenant University Repository

XML document design via GN-DTD

Author: Wang Bing
Zainol Zurinahni
Publication venue: EuroJournals
Publication date: 31/12/2010
Field of study

Designing a well-structured XML document is important for the sake of readability and maintainability. More importantly, this will avoid data redundancies and update anomalies when maintaining a large quantity of XML based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which is used to describe the structure of an XML document at the schema level. Multiples levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and theories of normalization. The normalization rules are applied to transform a poorly designed XML document into a well-designed based on normalized GN-DTD, which is illustrated through examples

Repository@Hull - Worktribe

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014
Field of study

Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

arXiv.org e-Print Archive

IRIS UniversitÃ Politecnica delle Marche

XML Schema Clustering with Semantic and Hierarchical Similarity Measures

Author: Iryadi Wina
Nayak Richi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

With the growing popularity of XML as the data representation language, collections of the XML data are exploded in numbers. The methods are required to manage and discover the useful information from them for the improved document handling. We present a schema clustering process by organising the heterogeneous XML schemas into various groups. The methodology considers not only the linguistic and the context of the elements but also the hierarchical structural similarity. We support our findings with experiments and analysis

Crossref

Queensland University of Technology ePrints Archive

A General Approach for Securely Querying and Updating XML Data

Author: Imine Abdessamad
Mahfoud Houari
Publication venue
Publication date: 01/01/2012
Field of study

Over the past years several works have proposed access control models for XML data where only read-access rights over non-recursive DTDs are considered. A few amount of works have studied the access rights for updates. In this paper, we present a general model for specifying access control on XML data in the presence of update operations of W3C XQuery Update Facility. Our approach for enforcing such updates specifications is based on the notion of query rewriting where each update operation defined over arbitrary DTD (recursive or not) is rewritten to a safe one in order to be evaluated only over XML data which can be updated by the user. We investigate in the second part of this report the secure of XML updating in the presence of read-access rights specified by a security views. For an XML document, a security view represents for each class of users all and only the parts of the document these users are able to see. We show that an update operation defined over a security view can cause disclosure of sensitive data hidden by this view if it is not thoroughly rewritten with respect to both read and update access rights. Finally, we propose a security view based approach for securely updating XML in order to preserve the confidentiality and integrity of XML data.Comment: No. RR-7870 (2012

arXiv.org e-Print Archive

CiteSeerX

HAL - Université de Franche-Comté

INRIA a CCSD electronic archive server

Repairing Inconsistent XML Write-Access Control Policies

Author: Bravo Loreto
Cheney James
Fundulaki Irini
Publication venue
Publication date: 01/01/2007
Field of study

XML access control policies involving updates may contain security flaws, here called inconsistencies, in which a forbidden operation may be simulated by performing a sequence of allowed operations. This paper investigates the problem of deciding whether a policy is consistent, and if not, how its inconsistencies can be repaired. We consider policies expressed in terms of annotated DTDs defining which operations are allowed or denied for the XML trees that are instances of the DTD. We show that consistency is decidable in PTIME for such policies and that consistent partial policies can be extended to unique "least-privilege" consistent total policies. We also consider repair problems based on deleting privileges to restore consistency, show that finding minimal repairs is NP-complete, and give heuristics for finding repairs.Comment: 25 pages. To appear in Proceedings of DBPL 200

arXiv.org e-Print Archive

CiteSeerX

Recommended from our members

An effective data placement strategy for XML documents

Author: Lü K
Zhu Y
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2001
Field of study

As XML is increasingly being used in Web applications, new technologies need to be investigated for processing XML documents with high performance. Parallelism is a promising solution for structured document processing and data placement is a major factor for system performance improvement in parallel processing. This paper describes an effective XML document data placement strategy. The new strategy is based on a multilevel graph partitioning algorithm with the consideration of the unique features of XML documents and query distributions. A new algorithm, which is based on XML query schemas to derive the weighted graph from the labelled directed graph presentation of XML documents, is also proposed. Performance analysis on the algorithm presented in the paper shows that the new data placement strategy exhibits low workload skew and a high degree of parallelism

Brunel University Research Archive