Document Type Definition (DTD) Metrics
In this paper, we present two complexity metrics for assessing the quality of schemas written in the Document Type Definition (DTD) language. Both the "Entropy (E) metric: E(DTD)" and the "Distinct Structured Element Repetition Scale (DSERS) metric: DSERS(DTD)" are intended to measure the structural complexity of schemas in the DTD language. These metrics exploit a directed graph representation of the schema document and consider the complexity that arises from similarly structured elements and the occurrences of these elements. The empirical and theoretical validations of these metrics support their robustness.
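As a rough illustration of the idea behind an entropy-style metric, the sketch below groups elements that share the same child structure and computes the Shannon entropy of the resulting class sizes. This is an assumption made for illustration only, not the paper's definition of E(DTD); the function name `structure_entropy`, the dictionary encoding of the schema, and the toy schema are all hypothetical.

```python
import math
from collections import Counter

def structure_entropy(schema):
    """Shannon entropy (in bits) of elements distributed over structure classes.

    `schema` maps each element name to a tuple of its child element names.
    Elements with an identical child tuple fall into the same class.
    (Illustrative assumption; not the paper's exact E(DTD) formula.)
    """
    classes = Counter(tuple(children) for children in schema.values())
    total = sum(classes.values())
    return -sum((n / total) * math.log2(n / total) for n in classes.values())

# Hypothetical toy schema: 'billing' and 'shipping' share one structure,
# and the leaf elements 'street' and 'city' share another.
toy = {
    "order": ("billing", "shipping"),
    "billing": ("street", "city"),
    "shipping": ("street", "city"),
    "street": (),
    "city": (),
}
print(structure_entropy(toy))
```

Under this simplification, a schema whose elements all share one structure scores zero, and the score grows as elements spread over more distinct structures.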
XML document design via GN-DTD
Designing a well-structured XML document is important for readability and maintainability. More importantly, it avoids data redundancies and update anomalies when maintaining a large quantity of XML-based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which are used to describe the structure of an XML document at the schema level. Multiple levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and theories of normalization. The normalization rules are applied to transform a poorly designed XML document into a well-designed one based on normalized GN-DTD, which is illustrated through examples.
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas, relational databases, etc.). However, in recent years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them to
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs impact on the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers. Comment: 34 pages, 8 tables, 7 figures
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic meaning and the context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
A General Approach for Securely Querying and Updating XML Data
Over the past years, several works have proposed access control models for XML
data in which only read-access rights over non-recursive DTDs are considered.
Only a few works have studied access rights for updates. In this paper,
we present a general model for specifying access control on XML data in the
presence of the update operations of the W3C XQuery Update Facility. Our approach
for enforcing such update specifications is based on the notion of query rewriting,
where each update operation defined over an arbitrary DTD (recursive or not) is
rewritten into a safe one so that it is evaluated only over XML data that can
be updated by the user. In the second part of this report, we investigate the
security of XML updating in the presence of read-access rights specified by
security views. For an XML document, a security view represents, for each class
of users, all and only the parts of the document these users are able to see. We
show that an update operation defined over a security view can cause disclosure
of sensitive data hidden by this view if it is not thoroughly rewritten with
respect to both read and update access rights. Finally, we propose a security
view based approach for securely updating XML in order to preserve the
confidentiality and integrity of XML data. Comment: No. RR-7870 (2012)
Repairing Inconsistent XML Write-Access Control Policies
XML access control policies involving updates may contain security flaws,
here called inconsistencies, in which a forbidden operation may be simulated by
performing a sequence of allowed operations. This paper investigates the
problem of deciding whether a policy is consistent, and if not, how its
inconsistencies can be repaired. We consider policies expressed in terms of
annotated DTDs defining which operations are allowed or denied for the XML
trees that are instances of the DTD. We show that consistency is decidable in
PTIME for such policies and that consistent partial policies can be extended to
unique "least-privilege" consistent total policies. We also consider repair
problems based on deleting privileges to restore consistency, show that finding
minimal repairs is NP-complete, and give heuristics for finding repairs. Comment: 25 pages. To appear in Proceedings of DBPL 200
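The notion of inconsistency described in this abstract, a forbidden operation simulated by a sequence of allowed ones, can be pictured with a toy reachability check. The sketch below is a simplified assumption and not the paper's annotated-DTD formalism: each operation is modelled as a hypothetical `(before, after)` pair of abstract states, and the function `simulable` is an illustrative name.

```python
from collections import deque

def simulable(forbidden, allowed):
    """Return True if the forbidden (src, dst) change can be reached
    by chaining allowed (before, after) steps (breadth-first search).

    Toy model only: states are opaque labels, not XML trees.
    """
    src, dst = forbidden
    seen, queue = {src}, deque([src])
    while queue:
        state = queue.popleft()
        if state == dst:
            return True
        for before, after in allowed:
            if before == state and after not in seen:
                seen.add(after)
                queue.append(after)
    return False

# Hypothetical policy: replacing 'a' by 'b' is forbidden, but deleting 'a'
# and then inserting 'b' are both allowed, so the policy is inconsistent.
allowed_ops = {("a", "empty"), ("empty", "b")}
print(simulable(("a", "b"), allowed_ops))
```

In this toy setting, a repair in the sense of deleting privileges would correspond to removing allowed pairs until the forbidden change is no longer reachable.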
An effective data placement strategy for XML documents
As XML is increasingly being used in Web applications, new
technologies need to be investigated for processing XML documents with high
performance. Parallelism is a promising solution for structured document
processing and data placement is a major factor for system performance
improvement in parallel processing. This paper describes an effective XML
document data placement strategy. The new strategy is based on a multilevel
graph partitioning algorithm with the consideration of the unique features of
XML documents and query distributions. A new algorithm, which is based on
XML query schemas to derive the weighted graph from the labelled directed
graph presentation of XML documents, is also proposed. Performance analysis
on the algorithm presented in the paper shows that the new data placement
strategy exhibits low workload skew and a high degree of parallelism