4,953 research outputs found
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was investigated mainly for classical database models
(e.g., E/R schemas, relational databases, etc.). However, in recent years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them to
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs affect the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers. Comment: 34 pages, 8 tables, 7 figures
Information Integration - the process of integration, evolution and versioning
At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist users in this task. Integrating the information sources provides a single global source that contains all the information needed. All of these information sources also change over time. With each change to an information source, its schema can change as well. The data contained in the source, however, cannot be converted every time, because of the huge amount of data that would have to be converted to conform to the most recent schema.
In this report we describe current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.
Automated schema matching techniques: an exploratory study
Manual schema matching is a problem for many database applications that use multiple data sources, including data warehousing and e-commerce applications. Current research attempts to address this problem by developing algorithms to automate aspects of the schema-matching task. In this paper, an approach using an external dictionary facilitates automated discovery of the semantic meaning of database schema terms. An experimental study was conducted to evaluate the performance and accuracy of five schema-matching techniques with the proposed approach, called SemMA. The proposed approach and its results are compared with two existing semi-automated schema-matching approaches, and suggestions for future research are made.
Data integration through service-based mediation for web-enabled information systems
The Web and its underlying platform technologies have often been used to integrate existing software and information systems. Traditional techniques for data representation and transformations between documents are not sufficient to support a flexible and maintainable data integration solution that meets the requirements of modern complex Web-enabled software and information systems. The difficulty arises from the high degree of complexity of data structures, for example in business and technology applications, and from the constant change of data and its representation. In the Web context, where the Web platform is used to integrate different organisations or software systems, the problem of heterogeneity arises as well. We introduce a specific data integration solution for Web applications such as Web-enabled information systems. Our contribution is an integration technology framework for Web-enabled information systems comprising, firstly, a data integration technique based on the declarative specification of transformation rules and the construction of connectors that handle the integration and, secondly, a mediator architecture based on information services and the constructed connectors to handle the integration process.
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual features of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
Design of the shared Environmental Information System (SEIS) and development of a web-based GIS interface
Chapter 5. The Shared Environmental Information System (SEIS) is a collaborative initiative of
the European Commission (EC) and the European Environment Agency (EEA) that aims to
establish an integrated and shared EU-wide environmental information system together
with the Member States.
SEIS presents the European vision on environmental information interoperability. It is
a set of high-level principles and workflow processes that organise the collection, exchange,
and use of environmental data and information, and is intended to:
• Modernise the way in which information required by environmental legislation is
made available to member states or EC instruments;
• Streamline reporting processes and repeal overlaps or obsolete reporting obligations;
• Stimulate similar developments at international conventions;
• Standardise according to INSPIRE when possible; and
• Introduce the SDI (spatial database infrastructure) principle EU-wide.
SEIS is a system and workflow of operations that offers technical capabilities geared to
meet concept expectations. In that respect, SEIS shows the way and sets up the workflow
in a standardised way (e.g., INSPIRE) to:
• Collect data from spatial databases, in situ sensors, statistical databases, earth
observation readings (e.g., EOS, GMES), and marine observation using standard data
transfer protocols (ODBC, SOS, FTP, etc.).
• Harmonise collected data (including data check/data integrity) according to best
practices proven to perform well, according to the INSPIRE Directive 2007/2/EC
Annexes I, II and III, plus INSPIRE Implementation Rules for data not specified in
the above-mentioned Annexes.
• Harmonise collected data according to WISE (Water Information System for
Europe) or Ozone-web.
• Process, aggregate and harmonise data so as to extract information in a format understandable
by wider audiences (e.g., Eurostat, enviro-indicators).
• Document information to fulfil national reporting obligations towards EU bodies
(e.g., the JRC, EEA, DGENV, Eurostat).
• Store and publish information for authorised end-users (e.g., citizens, institutions).
This paper presents the development and integration of the SEIS-Malta Geoportal.
The first section outlines EU Regulations on INSPIRE and Aarhus Directives. The second
covers the architecture and the implementation of SEIS-Malta Geoportal. The third
discusses the results and successful implementation of the Geoportal. Peer-reviewed