XML for Domain Viewpoints
Within research institutions like CERN (European Organization for Nuclear
Research) there are often disparate databases (different in format, type and
structure) that users need to access in a domain-specific manner. Users may
want to access a simple unit of information without having to understand detail
of the underlying schema or they may want to access the same information from
several different sources. It is neither desirable nor feasible to require
users to have knowledge of these schemas. Instead it would be advantageous if a
user could query these sources using his or her own domain models and
abstractions of the data. This paper describes the basis of an XML (eXtensible
Markup Language) framework that provides this functionality and is currently
being developed at CERN. The goal of the first prototype was to explore the
possibilities of XML for data integration and model management. It shows how
XML can be used to integrate data sources. The framework is applicable not only
to CERN data sources but to other environments too.
Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference
on Systemics & Informatics, Florid
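As a rough illustration of the idea of querying differently structured sources through one domain model, the following minimal sketch maps two domain fields onto two XML sources. It is not the CERN framework: the sample documents, the `VIEWPOINT` mapping, and the helpers `read_field` and `query` are all hypothetical names introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sources that describe the same part with different schemas.
SOURCE_A = "<parts><part id='P1'><name>Crystal</name><mass>1.5</mass></part></parts>"
SOURCE_B = "<inventory><item code='P1' label='Crystal' weightKg='1.5'/></inventory>"

# Hypothetical viewpoint: each domain field is mapped to a location per source.
# A leading '@' means "attribute of the record element", otherwise a child element.
VIEWPOINT = {
    "source_a": {"record": "./part", "name": "name", "mass_kg": "mass"},
    "source_b": {"record": "./item", "name": "@label", "mass_kg": "@weightKg"},
}


def read_field(elem, path):
    """Read an attribute ('@x') or the text of a child element ('x')."""
    if path.startswith("@"):
        return elem.get(path[1:])
    child = elem.find(path)
    return None if child is None else child.text


def query(sources):
    """Return records in domain terms, hiding each source's own schema."""
    results = []
    for source_name, xml_text in sources.items():
        mapping = VIEWPOINT[source_name]
        root = ET.fromstring(xml_text)
        for rec in root.findall(mapping["record"]):
            results.append({
                "source": source_name,
                "name": read_field(rec, mapping["name"]),
                "mass_kg": float(read_field(rec, mapping["mass_kg"])),
            })
    return results


records = query({"source_a": SOURCE_A, "source_b": SOURCE_B})
```

The caller only ever sees `name` and `mass_kg`; adding a third source would mean adding one more entry to the mapping rather than changing the query.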
XML Schema Clustering with Semantic and Hierarchical Similarity Measures
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic and contextual properties of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis.
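The combination of a linguistic measure with a hierarchical one can be sketched as follows. This is a simplified stand-in, not the measures used in the paper: element names are compared with `difflib.SequenceMatcher`, structure is compared as the Jaccard overlap of parent/child edges, and grouping is a greedy threshold pass. The schemas, the weight `w`, and the threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical schemas, each given as {element_name: parent_name or None}.
SCHEMAS = {
    "book_a": {"book": None, "title": "book", "author": "book", "price": "book"},
    "book_b": {"book": None, "title": "book", "author": "book", "isbn": "book"},
    "order": {"order": None, "customer": "order", "item": "order", "quantity": "item"},
}


def name_similarity(a, b):
    """String similarity of two element names, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def linguistic_similarity(s1, s2):
    """Average best name match, taken in both directions."""
    def one_way(x, y):
        return sum(max(name_similarity(a, b) for b in y) for a in x) / len(x)
    return (one_way(s1, s2) + one_way(s2, s1)) / 2


def structural_similarity(s1, s2):
    """Jaccard overlap of (parent, child) edges."""
    e1 = {(p, c) for c, p in s1.items() if p is not None}
    e2 = {(p, c) for c, p in s2.items() if p is not None}
    union = e1 | e2
    return len(e1 & e2) / len(union) if union else 1.0


def similarity(s1, s2, w=0.5):
    """Weighted mix of the linguistic and structural scores."""
    return w * linguistic_similarity(s1, s2) + (1 - w) * structural_similarity(s1, s2)


def cluster(schemas, threshold=0.5):
    """Greedy grouping: join the first group holding a similar enough member."""
    groups = []
    for name, schema in schemas.items():
        for group in groups:
            if any(similarity(schema, schemas[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


groups = cluster(SCHEMAS)
```

With these inputs the two book schemas share two of four distinct edges and most element names, so they fall into one group, while the order schema shares no edges with either and stays on its own.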
A Framework for XML-based Integration of Data, Visualization and Analysis in a Biomedical Domain
Biomedical data are becoming increasingly complex and heterogeneous in nature. The data are stored in distributed information systems, using a variety of data models, and are processed by increasingly complex tools that analyze and visualize them. We present in this paper our framework for integrating biomedical research data and tools into a single Web front end. Our framework is applied to the University of Washington’s Human Brain Project. Specifically, we present solutions to four integration tasks: definition of complex mappings from relational sources to XML, distributed XQuery processing, generation of heterogeneous output formats, and the integration of heterogeneous data visualization and analysis tools.
MOMIS: Exploiting agents to support information integration
The information overload caused by the large amount of data spread over the Internet must be addressed appropriately. The dynamism and uncertainty of the Internet, along with the heterogeneity of information sources, are the two main challenges for today's information management technologies. In the area of information integration, this paper proposes an approach based on mobile software agents integrated in the MOMIS (Mediator envirOnment for Multiple Information Sources) infrastructure, which enables semi-automatic integration and querying of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The exploitation of mobile agents in MOMIS can significantly increase the flexibility of the system, since their autonomy and adaptability suit distributed and open environments such as the Internet. The aim of this paper is to show the advantages of introducing intelligent and mobile software agents into the MOMIS infrastructure for the autonomous management and coordination of integration and query processing over heterogeneous data sources.
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas, relational databases, etc.). However, in recent years,
the widespread adoption of XML in the most disparate application fields has pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them on
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs impact on the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers.
Comment: 34 pages, 8 tables, 7 figure
A Framework to Enable the Semantic Inferencing and Querying of Multimedia Content
Cultural institutions, broadcasting companies, and academic, scientific and defence organisations are producing vast quantities of digital multimedia content. With this growth in audiovisual material comes the need for standardised representations encapsulating the rich semantic meaning required to enable the automatic filtering, machine processing, interpretation and assimilation of multimedia resources. Additionally, generating high-level descriptions is difficult and manual creation is expensive, although significant progress has been made in recent years on automatic segmentation and low-level feature recognition for multimedia. In this paper we describe the application of semantic web technologies to enable the generation of high-level, domain-specific, semantic descriptions of multimedia content from low-level, automatically extracted features. By applying the knowledge reasoning capabilities provided by ontologies and inferencing rules to large multimedia data sets generated by scientific research communities, we hope to expedite solutions to the complex scientific problems they face.
A messaging system to handle semantic dissonance
Enterprises have been compelled to share their data internally and externally, but creating a consistent view of enterprise data has been challenging. Within a typical enterprise, each division uses its own domain-specific data model and schema, and different enterprises use their own data models and schemas. Integrating these diverse data models and schemas, which have both syntactic and semantic differences, tends to be complex, slow, and inaccurate. Syntactic differences, i.e., differences in names or layout, have received substantial attention in research. Semantic dissonance, in which the structure may be similar (or even the same) but the meaning associated with the attributes that define each structure differs, has received less attention in the world of practical software development. A practical messaging system for handling semantic dissonance has been developed. The system utilizes the Resource Description Framework (RDF) and the SOAP XML Messaging Specification. It is implemented using Jena, a Java API for RDF, and Apache SOAP, an open-source SOAP server and client. This report describes the messaging system, its implementation, and its strengths and limitations in handling semantic dissonance.
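To make semantic dissonance concrete, the sketch below shows two divisions that both use a field called `weight` but mean different units by it. It does not use RDF, Jena, or SOAP; plain dictionaries stand in for the semantic annotations, and the concept identifier `ex:shippingWeight`, the division names, and the `translate` helper are hypothetical.

```python
# Hypothetical annotations: field name -> (shared concept, unit).
# Both divisions call the field "weight", but they mean different units.
SALES = {"weight": ("ex:shippingWeight", "lb")}
WAREHOUSE = {"weight": ("ex:shippingWeight", "kg")}

# Conversion factors to kilograms (1 lb is defined as 0.45359237 kg).
TO_KG = {"kg": 1.0, "lb": 0.45359237}


def translate(message, sender, receiver):
    """Rewrite a message so each value carries the receiver's meaning."""
    out = {}
    for field, value in message.items():
        concept, unit = sender[field]
        # Find the receiver field annotated with the same shared concept.
        for r_field, (r_concept, r_unit) in receiver.items():
            if r_concept == concept:
                out[r_field] = value * TO_KG[unit] / TO_KG[r_unit]
                break
        else:
            raise KeyError(f"receiver has no field for concept {concept}")
    return out


converted = translate({"weight": 10.0}, SALES, WAREHOUSE)
```

A purely syntactic mapping would copy `weight` across unchanged because the names and types agree; attaching the meaning to each field is what allows the value to be converted.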