XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas and relational databases). However, in recent years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them to
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs affect the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers.
Comment: 34 pages, 8 tables, 7 figures
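As a rough illustration of how a matcher might exploit hierarchical structure, the sketch below scores two elements by combining the similarity of their names with the similarity of their ancestor paths. It is a minimal sketch under stated assumptions, not a method taken from the paper: it reads sample XML instances rather than DTDs/XSDs, uses plain string similarity in place of semantic knowledge, and the names `element_paths`, `name_sim`, `path_sim`, and the weight `alpha` are all hypothetical.

```python
from difflib import SequenceMatcher
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the root-to-element tag path of every element, in document order."""
    paths = []

    def walk(node, prefix):
        path = prefix + [node.tag]
        paths.append(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), [])
    return paths


def name_sim(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in for a linguistic matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def path_sim(p, q, alpha=0.5):
    """Blend leaf-name similarity with ancestor-path similarity.

    alpha is an assumed weight: 1.0 ignores structure, 0.0 ignores the leaf name.
    """
    leaf = name_sim(p[-1], q[-1])
    context = name_sim("/".join(p[:-1]), "/".join(q[:-1]))
    return alpha * leaf + (1 - alpha) * context
```

For example, two elements both named `name` receive a lower score when one sits under `book/author` and the other under `company`, which is the kind of disambiguation a purely name-based matcher cannot provide.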
Algorithms for generation of path-methods in object-oriented databases
A path-method is a mechanism in object-oriented databases (OODBs) to retrieve or to update information relevant to one class that is not stored with that class but with some other class. A path-method is a method which traverses from one class through a chain of connections between classes to access information at another class. However, it is a difficult task for a user to write path-methods, because it might require comprehensive knowledge of many classes of the conceptual schema, while a typical user often has incomplete or even inconsistent knowledge of the schema.
This dissertation proposes an approach to the generation of path-methods in an OODB to solve this problem. We have developed the Path-Method Generator (PMG) system, which generates path-methods according to a naive user's requests. PMG is based on access weights, which reflect the relative frequency of the connections, and on precomputed access relevance between every pair of classes of the OODB, which is computed from the access weights of the connections. We present specific rules for access weight assignment, efficient algorithms to compute access relevance in a single OODB, and a variety of traversal algorithms based on access weights and precomputed access relevance. Experiments with a university environment OODB and a sample of path-methods identify some of these algorithms as very successful in generating most of the desired path-methods. Thus, the PMG system is an efficient tool for aiding the user with the difficult task of querying and updating a large OODB.
Path-method generation in an interoperable multi object-oriented database (IM-OODB) is even more difficult than for a single OODB, since a user has to be familiar with several OODBs. We use a hierarchical approach for deriving efficient online algorithms for the computation of access relevance in an IM-OODB, based on precomputed access relevance for each autonomous OODB. In an IM-OODB the access relevance is used as a guide in generating path-methods between the classes of different OODBs.
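The abstract does not say how access relevance is derived from access weights, so the following is only a minimal sketch under an assumed definition: each connection carries a weight in [0, 1], the relevance of a path is the product of its weights, and the access relevance of a class pair is the maximum over all paths between them. Under that assumption, all pairs can be precomputed with a Floyd-Warshall-style relaxation. The function name `access_relevance` and the example schema are hypothetical.

```python
def access_relevance(classes, weights):
    """All-pairs access relevance under the assumed max-product definition.

    classes: iterable of class names.
    weights: dict mapping (source, target) to an access weight in [0, 1].
    Returns a dict mapping (source, target) to the best path product.
    """
    classes = list(classes)
    rel = {(a, b): 0.0 for a in classes for b in classes}
    for c in classes:
        rel[(c, c)] = 1.0  # a class is fully relevant to itself
    for (a, b), w in weights.items():
        rel[(a, b)] = max(rel[(a, b)], w)

    # Relax every pair through each intermediate class k. Because weights
    # never exceed 1, cycles cannot improve a product, so this converges.
    for k in classes:
        for i in classes:
            for j in classes:
                via_k = rel[(i, k)] * rel[(k, j)]
                if via_k > rel[(i, j)]:
                    rel[(i, j)] = via_k
    return rel


# Hypothetical university schema fragment.
schema_weights = {
    ("student", "section"): 0.5,
    ("section", "course"): 0.8,
    ("student", "course"): 0.3,
    ("section", "instructor"): 0.5,
}
rel = access_relevance(["student", "section", "course", "instructor"], schema_weights)
```

With these assumed weights, the indirect path from `student` through `section` to `course` (0.5 × 0.8) outranks the direct connection (0.3), so a traversal guided by the precomputed table would prefer the two-step path.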
Semantic vs. Structural Resemblance of Classes
We present an approach to determine the similarity of classes which utilizes fuzzy and incomplete terminological knowledge together with schema knowledge. We clearly distinguish between semantic similarity determining the degree of resemblance according to real world semantics, and structural correspondence explaining how classes can actually be interrelated. To compute the semantic similarity we introduce the notion of semantic relevance and apply fuzzy set theory to reason about both terminological knowledge and schema knowledge.
1 Introduction
The identification of similar or corresponding concepts forms one of the main steps when investigating different world models and relating them to each other. Apart from its long tradition in document retrieval, this issue has also been investigated in more structured frameworks such as schema independent query formulation, e.g., [Mot90], or database integration, where for a survey you may look at [SL90]. As argued in [GPN91], there should b..