14,529 research outputs found
XML Matchers: approaches and challenges
Schema Matching, i.e. the process of discovering semantic correspondences
between concepts adopted in different data source schemas, has been a key topic
in Database and Artificial Intelligence research areas for many years. In the
past, it was largely investigated especially for classical database models
(e.g., E/R schemas and relational databases). However, in recent years,
the widespread adoption of XML in the most disparate application fields pushed
a growing number of researchers to design XML-specific Schema Matching
approaches, called XML Matchers, aiming at finding semantic matchings between
concepts defined in DTDs and XSDs. XML Matchers do not just take well-known
techniques originally designed for other data models and apply them to
DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical
structure of a DTD/XSD) to improve the performance of the Schema Matching
process. The design of XML Matchers is currently a well-established research
area. The main goal of this paper is to provide a detailed description and
classification of XML Matchers. We first describe to what extent the
specificities of DTDs/XSDs affect the Schema Matching task. Then we
introduce a template, called XML Matcher Template, that describes the main
components of an XML Matcher, their role and behavior. We illustrate how each
of these components has been implemented in some popular XML Matchers. We
consider our XML Matcher Template as the baseline for objectively comparing
approaches that, at first glance, might appear as unrelated. The introduction
of this template can be useful in the design of future XML Matchers. Finally,
we analyze commercial tools implementing XML Matchers and introduce two
challenging issues strictly related to this topic, namely XML source clustering
and uncertainty management in XML Matchers.
Comment: 34 pages, 8 tables, 7 figures
Flow-based reputation with uncertainty: Evidence-Based Subjective Logic
The concept of reputation is widely used as a measure of trustworthiness
based on ratings from members in a community. The adoption of reputation
systems, however, relies on their ability to capture the actual trustworthiness
of a target. Several reputation models for aggregating trust information have
been proposed in the literature. The choice of model has an impact on the
reliability of the aggregated trust information as well as on the procedure
used to compute reputations. Two prominent models are flow-based reputation
(e.g., EigenTrust, PageRank) and Subjective Logic based reputation. Flow-based
models provide an automated method to aggregate trust information, but they are
not able to express the level of uncertainty in the information. In contrast,
Subjective Logic extends probabilistic models with an explicit notion of
uncertainty, but the calculation of reputation depends on the structure of the
trust network and often requires information to be discarded. These are severe
drawbacks.
In this work, we observe that the 'opinion discounting' operation in
Subjective Logic has a number of basic problems. We resolve these problems by
providing a new discounting operator that describes the flow of evidence from
one party to another. The adoption of our discounting rule results in a
consistent Subjective Logic algebra that is entirely based on the handling of
evidence. We show that the new algebra enables the construction of an automated
reputation assessment procedure for arbitrary trust networks, where the
calculation no longer depends on the structure of the network, and does not
need to throw away any information. Thus, we obtain the best of both worlds:
flow-based reputation and consistent handling of uncertainties.
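As a rough illustration of what "handling of evidence" can mean in Subjective Logic, the sketch below maps counts of positive and negative evidence to a (belief, disbelief, uncertainty) opinion using the conventional mapping with a non-informative prior weight of 2. The names `Opinion`, `opinion_from_evidence`, and `pool_evidence` are hypothetical, and the new discounting operator proposed in the abstract is not reproduced here; pooling by adding counts assumes the pieces of evidence are independent.

```python
from dataclasses import dataclass

# Non-informative prior weight W; 2 is the conventional choice for binary
# outcomes. Treat it as an assumption of this sketch.
PRIOR_WEIGHT = 2.0


@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float


def opinion_from_evidence(positive: float, negative: float,
                          prior_weight: float = PRIOR_WEIGHT) -> Opinion:
    """Map evidence counts to an opinion whose components sum to 1."""
    if positive < 0 or negative < 0:
        raise ValueError("evidence counts must be non-negative")
    total = positive + negative + prior_weight
    return Opinion(positive / total, negative / total, prior_weight / total)


def pool_evidence(first: tuple, second: tuple) -> tuple:
    """Combine two independent (positive, negative) evidence pairs by addition."""
    return (first[0] + second[0], first[1] + second[1])


if __name__ == "__main__":
    # No evidence at all yields complete uncertainty.
    print(opinion_from_evidence(0, 0))
    # More evidence shrinks the uncertainty component.
    print(opinion_from_evidence(*pool_evidence((6, 1), (2, 1))))
```

With no evidence the opinion is fully uncertain, and uncertainty falls as evidence accumulates, which is the property that flow-based scores alone cannot express.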
Adaptation of WASH Services Delivery to Climate Change and Other Sources of Risk and Uncertainty
This report urges WASH sector practitioners to take more seriously the threat of climate change and the consequences it could have on their work. By considering climate change within a risk and uncertainty framework, the field can use the multitude of approaches laid out here to adequately protect itself against a range of direct and indirect impacts. Eleven methods and tools for this specific type of risk management are described, including practical advice on how to implement them successfully.
The Role of GIS to Enable Public-Sector Decision Making Under Conditions of Uncertainty
Uncertainty is inherent in environmental planning and decision making. For example, water managers in arid regions are attuned to the uncertainty of water supply due to prolonged periods of drought. To contend with multiple sources and forms of uncertainty, resource managers implement strategies and tools to aid in the exploration and interpretation of data and scenarios. Various GIS capabilities, such as statistical analysis, modeling, and visualization, are available to decision makers who face the challenge of making decisions under conditions of deep uncertainty. While significant research has led to the inclusion and representation of uncertainty in GIS, existing GIS literature does not address how decision makers implement and utilize GIS as an assistive technology to contend with deep uncertainty. We address this gap through a case study of water managers in the Phoenix Metropolitan Area, examining how they engage with GIS in making decisions and coping with uncertainty. Findings of a qualitative analysis of water managers reveal the need to distinguish between implicit and explicit uncertainty. Implicit uncertainty is linked to the decision-making process, and while understood, it is not displayed or revealed separately from the data. In contrast, explicit uncertainty is conceived as separate from the process and is something that can be described or displayed. Developed from twelve interviews with Phoenix-area water managers in 2005, these distinctions of uncertainty clarify the use of GIS in decision making. Findings show that managers use the products of GIS for exploring uncertainty (e.g., cartographic products). Uncertainty visualization emerged as a current practice, but definitions of what constitutes such visualizations were not consistent across decision makers. Additionally, uncertainty was a common and even sometimes helpful element of decision making; rather than being a hindrance, it was seen as an essential component of the process.
These findings contradict prior research relating to uncertainty visualization where decision makers often express discomfort with the presence of uncertainty.
Crowdsourcing for Top-K Query Processing over Uncertain Data
Querying uncertain data has become a prominent application due to the proliferation of user-generated content from social media and of data streams from sensors. When data ambiguity cannot be reduced algorithmically, crowdsourcing proves a viable approach, which consists of posting tasks to humans and harnessing their judgment for improving the confidence about data values or relationships. This paper tackles the problem of processing top-K queries over uncertain data with the help of crowdsourcing for quickly converging to the real ordering of relevant results. Several offline and online approaches for addressing questions to a crowd are defined and contrasted on both synthetic and real data sets, with the aim of minimizing the crowd interactions necessary to find the real ordering of the result set.
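One way to picture where crowd questions might be needed is sketched below. It is not the paper's method: it assumes, purely for illustration, that each item's uncertain score is given as a closed interval, and it lists the pairs whose intervals overlap, since only those pairs have an undetermined relative order. The function name `ambiguous_pairs` and the interval representation are hypothetical.

```python
from itertools import combinations


def ambiguous_pairs(intervals: dict) -> list:
    """Return pairs of item names whose (low, high) score intervals overlap.

    Overlapping intervals mean the relative order of the two items cannot be
    decided from the data alone, so the pair is a candidate crowd question.
    """
    pairs = []
    for (name_a, (low_a, high_a)), (name_b, (low_b, high_b)) in combinations(
            sorted(intervals.items()), 2):
        if low_a <= high_b and low_b <= high_a:
            pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    scores = {"a": (0.8, 0.9), "b": (0.5, 0.85), "c": (0.1, 0.3)}
    # Only "a" and "b" overlap; "c" is certainly ranked last.
    print(ambiguous_pairs(scores))
```

Asking fewer questions then amounts to choosing which ambiguous pairs to resolve first, which is the kind of trade-off the abstract describes.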
Flow-based reputation: more than just ranking
Recent years have seen a growing interest in collaborative systems, such as
electronic marketplaces and P2P file sharing systems, in which people
interact with one another. Those systems, however, are subject to security
and operational risks because of their open and distributed nature. Reputation
systems provide a mechanism to reduce such risks by building trust
relationships among entities and identifying malicious entities. A popular
reputation model is the so-called flow-based model. Most existing reputation
systems based on such a model provide only a ranking, without absolute
reputation values; this makes it difficult to determine whether entities are
actually trustworthy or untrustworthy. In addition, those systems ignore a
significant part of the available information; as a consequence, reputation
values may not be accurate. In this paper, we present a flow-based reputation
metric that gives absolute values instead of merely a ranking. Our metric makes
use of all the available information. We study, both analytically and
numerically, the properties of the proposed metric and the effect of attacks on
reputation values.
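For readers unfamiliar with flow-based models, the following minimal sketch shows a generic PageRank-style power iteration over a local trust matrix. It is not the metric proposed in the abstract; the function name `flow_reputation` and the damping value of 0.85 are assumptions of this sketch. Because the scores are normalised to sum to 1, they are meaningful only relative to one another, which illustrates why such models yield a ranking rather than absolute values.

```python
def flow_reputation(trust, damping=0.85, iterations=200, tolerance=1e-12):
    """Generic flow-based scores via power iteration.

    trust[i][j] is the non-negative local trust that node i places in node j.
    """
    n = len(trust)
    # Row-normalise local trust; a node that trusts nobody spreads uniformly.
    normalised = []
    for row in trust:
        row_sum = sum(row)
        if row_sum > 0:
            normalised.append([value / row_sum for value in row])
        else:
            normalised.append([1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * normalised[i][j] for i in range(n))
            for j in range(n)
        ]
        change = sum(abs(new - old) for new, old in zip(updated, scores))
        scores = updated
        if change < tolerance:
            break
    return scores


if __name__ == "__main__":
    # Nodes 0 and 1 trust only node 2; node 2 trusts both of them equally.
    local_trust = [
        [0, 0, 1],
        [0, 0, 1],
        [1, 1, 0],
    ]
    print(flow_reputation(local_trust))
```

In this example node 2 receives the highest score, but the number itself says nothing about whether node 2 is trustworthy in absolute terms.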
Emerging trust implications of data-rich systems
Pervasive technologies are enabling an increasingly data-rich world that is mediated through a broad spectrum of often highly interdependent systems. The data science surrounding these systems is rapidly transforming nearly every aspect of our lives. But how trustworthy are the systems and data upon which we have come to rely? This article explores the complex collaborations and interdependencies that mediate trust-formation and examines six challenges in generating and sustaining trust in the context of data-rich systems.