14,377 research outputs found
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas, relational databases, etc.). However, in the latest years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them on
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs impact on the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure
Using XML and XSLT for flexible elicitation of mental-health risk knowledge
Current tools for assessing risks associated with mental-health problems require assessors to make high-level judgements based on clinical experience. This paper describes how new technologies can enhance qualitative research methods to identify lower-level cues underlying these judgements, which can be collected by people without a specialist mental-health background.
Methods and evolving results: Content analysis of interviews with 46 multidisciplinary mental-health experts exposed the cues and their interrelationships, which were represented by a mind map using software that stores maps as XML. All 46 mind maps were integrated into a single XML knowledge structure and analysed by a Lisp program to generate quantitative information about the numbers of experts associated with each part of it. The knowledge was refined by the experts, using software developed in Flash to record their collective views within the XML itself. These views specified how the XML should be transformed by XSLT, a technology for rendering XML, which resulted in a validated hierarchical knowledge structure associating patient cues with risks.
Conclusions: Changing knowledge elicitation requirements were accommodated by flexible transformations of XML data using XSLT, which also facilitated generation of multiple data-gathering tools suiting different assessment circumstances and levels of mental-health knowledge
A Monitoring System for the BaBar INFN Computing Cluster
Monitoring large clusters is a challenging problem. It is necessary to
observe a large quantity of devices with a reasonably short delay between
consecutive observations. The set of monitored devices may include PCs, network
switches, tape libraries and other equipments. The monitoring activity should
not impact the performances of the system. In this paper we present PerfMC, a
monitoring system for large clusters. PerfMC is driven by an XML configuration
file, and uses the Simple Network Management Protocol (SNMP) for data
collection. SNMP is a standard protocol implemented by many networked
equipments, so the tool can be used to monitor a wide range of devices. System
administrators can display informations on the status of each device by
connecting to a WEB server embedded in PerfMC. The WEB server can produce
graphs showing the value of different monitored quantities as a function of
time; it can also produce arbitrary XML pages by applying XSL Transformations
to an internal XML representation of the cluster's status. XSL Transformations
may be used to produce HTML pages which can be displayed by ordinary WEB
browsers. PerfMC aims at being relatively easy to configure and operate, and
highly efficient. It is currently being used to monitor the Italian
Reprocessing farm for the BaBar experiment, which is made of about 200 dual-CPU
Linux machines.Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, Ca, USA, March 2003, 10 pages, LaTeX, 4 eps figures. PSN
MOET00
Recommended from our members
A practical mandatory access control model for XML databases
A practical mandatory access control (MAC) model for XML databases is presented in this paper. The
label type and label access policy can be defined according to the requirements of different applications. In order to
preserve the integrity of data in XML databases, a constraint between a read-access rule and a write-access rule in
label access policy is introduced. Rules for label assignment and propagation are presented to alleviate the workload
of label assignments. Furthermore, a solution for resolving conflicts in label assignments is proposed. Rules for
update-related operations, rules for exceptional privileges of ordinary users and the administrator are also proposed
to preserve the security of operations in XML databases. The MAC model, we proposed in this study, has been
implemented in an XML database. Test results demonstrated that our approach provides rational and scalable
performance
Text Mining Infrastructure in R
During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels.
- âŠ