14,529 research outputs found

    XML Matchers: approaches and challenges

    Schema Matching, i.e., the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was investigated mainly for classical database models (e.g., E/R schemas, relational databases, etc.). However, in recent years, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs affect the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers. Comment: 34 pages, 8 tables, 7 figures

    Flow-based reputation with uncertainty: Evidence-Based Subjective Logic

    The concept of reputation is widely used as a measure of trustworthiness based on ratings from members in a community. The adoption of reputation systems, however, relies on their ability to capture the actual trustworthiness of a target. Several reputation models for aggregating trust information have been proposed in the literature. The choice of model has an impact on the reliability of the aggregated trust information as well as on the procedure used to compute reputations. Two prominent models are flow-based reputation (e.g., EigenTrust, PageRank) and Subjective Logic based reputation. Flow-based models provide an automated method to aggregate trust information, but they are not able to express the level of uncertainty in the information. In contrast, Subjective Logic extends probabilistic models with an explicit notion of uncertainty, but the calculation of reputation depends on the structure of the trust network and often requires information to be discarded. These are severe drawbacks. In this work, we observe that the 'opinion discounting' operation in Subjective Logic has a number of basic problems. We resolve these problems by providing a new discounting operator that describes the flow of evidence from one party to another. The adoption of our discounting rule results in a consistent Subjective Logic algebra that is entirely based on the handling of evidence. We show that the new algebra enables the construction of an automated reputation assessment procedure for arbitrary trust networks, where the calculation no longer depends on the structure of the network, and does not need to throw away any information. Thus, we obtain the best of both worlds: flow-based reputation and consistent handling of uncertainties
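
To make the "explicit notion of uncertainty" concrete, here is a minimal Python sketch of the standard Subjective Logic mapping from evidence counts to an opinion (belief, disbelief, uncertainty). It does not implement the new discounting operator this abstract proposes. The names `Opinion`, `opinion_from_evidence`, `fuse_evidence` and the prior weight of 2 are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Assumption: the commonly used non-informative prior weight in Subjective Logic.
PRIOR_WEIGHT = 2.0


@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float


def opinion_from_evidence(positive: float, negative: float,
                          prior_weight: float = PRIOR_WEIGHT) -> Opinion:
    """Map counts of positive/negative evidence to (b, d, u) with b + d + u = 1."""
    if positive < 0 or negative < 0:
        raise ValueError("evidence counts must be non-negative")
    total = positive + negative + prior_weight
    return Opinion(positive / total, negative / total, prior_weight / total)


def fuse_evidence(first: tuple, second: tuple) -> tuple:
    """Combine two independent (positive, negative) evidence pairs by addition."""
    return (first[0] + second[0], first[1] + second[1])


if __name__ == "__main__":
    # More evidence shrinks the uncertainty component.
    print(opinion_from_evidence(1, 0))
    print(opinion_from_evidence(*fuse_evidence((1, 0), (7, 0))))
```

With no evidence at all the opinion is fully uncertain (u = 1), and uncertainty decreases as evidence accumulates, which is the property that flow-based models cannot express.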

    Adaptation of WASH Services Delivery to Climate Change and Other Sources of Risk and Uncertainty

    This report urges WASH sector practitioners to take more seriously the threat of climate change and the consequences it could have on their work. By considering climate change within a risk and uncertainty framework, the field can use the multitude of approaches laid out here to adequately protect itself against a range of direct and indirect impacts. Eleven methods and tools for this specific type of risk management are described, including practical advice on how to implement them successfully

    The Role of GIS to Enable Public-Sector Decision Making Under Conditions of Uncertainty

    Uncertainty is inherent in environmental planning and decision making. For example, water managers in arid regions are attuned to the uncertainty of water supply due to prolonged periods of drought. To contend with multiple sources and forms of uncertainty, resource managers implement strategies and tools to aid in the exploration and interpretation of data and scenarios. Various GIS capabilities, such as statistical analysis, modeling and visualization, are available to decision makers who face the challenge of making decisions under conditions of deep uncertainty. While significant research has led to the inclusion and representation of uncertainty in GIS, existing GIS literature does not address how decision makers implement and utilize GIS as an assistive technology to contend with deep uncertainty. We address this gap through a case study of water managers in the Phoenix Metropolitan Area, examining how they engage with GIS in making decisions and coping with uncertainty. Findings of a qualitative analysis of water managers reveal the need to distinguish between implicit and explicit uncertainty. Implicit uncertainty is linked to the decision-making process, and while understood, it is not displayed or revealed separately from the data. In contrast, explicit uncertainty is conceived as separate from the process and is something that can be described or displayed. Developed from twelve interviews with Phoenix-area water managers in 2005, these distinctions of uncertainty clarify the use of GIS in decision making. Findings show that managers use the products of GIS for exploring uncertainty (e.g., cartographic products). Uncertainty visualization emerged as a current practice, but definitions of what constitutes such visualizations were not consistent across decision makers. Additionally, uncertainty was a common and even sometimes helpful element of decision making; rather than being a hindrance, it is seen as an essential component of the process. These findings contradict prior research relating to uncertainty visualization where decision makers often express discomfort with the presence of uncertainty.

    Crowdsourcing for Top-K Query Processing over Uncertain Data

    Querying uncertain data has become a prominent application due to the proliferation of user-generated content from social media and of data streams from sensors. When data ambiguity cannot be reduced algorithmically, crowdsourcing proves a viable approach, which consists of posting tasks to humans and harnessing their judgment for improving the confidence about data values or relationships. This paper tackles the problem of processing top-K queries over uncertain data with the help of crowdsourcing for quickly converging to the real ordering of relevant results. Several offline and online approaches for addressing questions to a crowd are defined and contrasted on both synthetic and real data sets, with the aim of minimizing the crowd interactions necessary to find the real ordering of the result set

    Flow-based reputation: more than just ranking

    Recent years have seen a growing interest in collaborative systems like electronic marketplaces and P2P file sharing systems, where people are intended to interact with other people. Those systems, however, are subject to security and operational risks because of their open and distributed nature. Reputation systems provide a mechanism to reduce such risks by building trust relationships among entities and identifying malicious entities. A popular reputation model is the so-called flow-based model. Most existing reputation systems based on such a model provide only a ranking, without absolute reputation values; this makes it difficult to determine whether entities are actually trustworthy or untrustworthy. In addition, those systems ignore a significant part of the available information; as a consequence, reputation values may not be accurate. In this paper, we present a flow-based reputation metric that gives absolute values instead of merely a ranking. Our metric makes use of all the available information. We study, both analytically and numerically, the properties of the proposed metric and the effect of attacks on reputation values
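
As an illustration of why a typical flow-based model yields only a ranking, here is a minimal Python sketch of an EigenTrust/PageRank-style power iteration. It is not the metric proposed in this abstract. The function name `flow_reputation`, the damping value `alpha`, and the uniform pre-trust vector are assumptions chosen for illustration.

```python
def flow_reputation(local_trust, pretrust, alpha=0.15, tol=1e-10, max_iter=1000):
    """Power iteration over a row-normalized trust matrix.

    local_trust[i][j] is how much node i trusts node j (non-negative).
    pretrust is a probability vector used for damping and for nodes
    that trust nobody. Returns a vector that sums to 1.
    """
    n = len(local_trust)
    rows = []
    for row in local_trust:
        total = sum(row)
        # Nodes without outgoing trust fall back to the pre-trust distribution.
        rows.append([v / total for v in row] if total > 0 else list(pretrust))

    rep = list(pretrust)
    for _ in range(max_iter):
        nxt = [
            (1 - alpha) * sum(rows[i][j] * rep[i] for i in range(n))
            + alpha * pretrust[j]
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, rep)) < tol:
            return nxt
        rep = nxt
    return rep


if __name__ == "__main__":
    # Node 1 is trusted by nodes 0 and 2; node 0 is trusted by nobody.
    trust = [[0, 1, 0], [0, 0, 1], [0, 1, 0]]
    print(flow_reputation(trust, [1 / 3] * 3))
```

Because the scores are normalized to sum to 1, each value is meaningful only relative to the others: adding more nodes lowers every score without any node becoming less trustworthy.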

    Emerging trust implications of data-rich systems

    Pervasive technologies are enabling an increasingly data-rich world that is mediated through a broad spectrum of often highly interdependent systems. The data science surrounding these systems is rapidly transforming nearly every aspect of our lives. But how trustworthy are the systems and data upon which we have come to rely? This article explores the complex collaborations and interdependencies that mediate trust-formation and examines six challenges in generating and sustaining trust in the context of data-rich systems