    Qualitative Effects of Knowledge Rules in Probabilistic Data Integration

    One of the problems in data integration is data overlap: different data sources hold data on the same real-world entities. Much development time in data integration projects is devoted to entity resolution. Advanced similarity measurement techniques are often used to remove semantic duplicates from the integration result or to resolve other semantic conflicts, but it proves impossible to get rid of all semantic problems in data integration. An often-used rule of thumb states that about 90% of the development effort is devoted to solving the remaining 10% of hard cases. In an attempt to significantly decrease human effort at data integration time, we have proposed an approach that stores any remaining semantic uncertainty and conflicts in a probabilistic database, so that the integration result can already be meaningfully used. The main development effort in our approach goes into defining and tuning knowledge rules and thresholds, which directly affect the size and quality of the integration result. We measure integration quality indirectly, in an information-retrieval-like way, via the quality of answers to queries on the integrated data set. The main contribution of this report is an experimental investigation of the effects and sensitivity of rule definition and threshold tuning on integration quality. The experiments show that our approach indeed reduces development effort, rather than merely shifting it to rule definition and threshold tuning: setting rough, safe thresholds and defining only a few rules suffices to produce a 'good enough' integration that can be meaningfully used.
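
    As a rough illustration of the approach, the sketch below keeps confident matches and non-matches deterministic and stores the in-between cases as weighted alternatives, the way a probabilistic database would. The thresholds, the stand-in string similarity, and all names are hypothetical, not the report's actual measures or rules:

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: pairs scoring above T_HIGH are merged outright,
# pairs below T_LOW are kept as distinct entities, and everything in between
# is stored as an uncertain match with a probability attached.
T_HIGH, T_LOW = 0.9, 0.5

def similarity(a: str, b: str) -> float:
    """A simple string similarity, standing in for more advanced measures."""
    return SequenceMatcher(None, a, b).ratio()

def integrate(rec_a: str, rec_b: str) -> list[tuple[str, float]]:
    """Return the possible worlds (alternatives with probabilities) for one pair."""
    s = similarity(rec_a, rec_b)
    if s >= T_HIGH:                       # confident duplicate: merge
        return [("same-entity", 1.0)]
    if s <= T_LOW:                        # confident non-duplicate: keep both
        return [("distinct-entities", 1.0)]
    # Uncertain case: store both alternatives, weighted by the score,
    # instead of spending development effort resolving it by hand.
    return [("same-entity", s), ("distinct-entities", 1.0 - s)]

# "J. Smith" vs "John Smith" falls between the thresholds, so both
# alternatives are kept: [('same-entity', ~0.78), ('distinct-entities', ~0.22)]
print(integrate("J. Smith", "John Smith"))
```

    Queries can then be answered over these stored possible worlds, which is what lets rough, safe threshold settings still yield a usable integration.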

    Detecting stochastic dominance for poset-valued random variables as an example of linear programming on closure systems

    In this paper we develop a linear programming method for detecting stochastic dominance between random variables with values in a partially ordered set (poset), based on the upset characterization of stochastic dominance. The proposed detection procedure rests on a descriptively interpretable statistic, namely the maximal probability difference over the upsets of the poset. We show how our method relates to the general task of maximizing a linear function over a closure system. Since closure systems can be described via their valid formal implications, we can use ingredients of formal concept analysis here. We also address inference, both via resampling and via conservative bounds obtained from Vapnik-Chervonenkis theory; the latter also allows an adequate pruning of the envisaged closure system, which regularizes the test statistic (at the price of less conceptual rigor). We illustrate the developed methods on a variety of data examples: multivariate inequality analysis, item impact and differential item functioning in item response theory, and the analysis of distributional differences in spatial statistics. The power of regularization is illustrated with a data example in the context of cognitive diagnosis models.
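
    The paper formulates the search for this statistic as a linear program over a closure system; for a poset small enough to enumerate its upsets, the statistic can also be computed by brute force, as in the minimal sketch below (the poset, the two probability mass functions, and all names are hypothetical illustration data, not the paper's examples):

```python
from itertools import chain, combinations

# A tiny example poset on {a, b, c, d}; `leq` lists the x <= y pairs
# (reflexive pairs omitted).
elements = ["a", "b", "c", "d"]
leq = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")}

def is_upset(s: frozenset) -> bool:
    """An upset contains, with each x, every y satisfying x <= y."""
    return all(v in s for x in s for (u, v) in leq if u == x)

def all_upsets() -> list[frozenset]:
    subsets = chain.from_iterable(
        combinations(elements, r) for r in range(len(elements) + 1))
    return [frozenset(s) for s in subsets if is_upset(frozenset(s))]

# Two probability mass functions on the poset.
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
q = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}

def prob(mass: dict, s: frozenset) -> float:
    return sum(mass[x] for x in s)

# The statistic: maximal probability difference over all upsets.
# q stochastically dominates p iff prob(q, U) >= prob(p, U) for every upset U.
d_stat = max(prob(q, U) - prob(p, U) for U in all_upsets())
print(f"maximal upset probability difference: {d_stat:.2f}")  # 0.40 here
```

    Enumeration scales exponentially in the poset size, which is exactly why the paper works with a linear program and with VC-based pruning of the closure system instead.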

    Some contributions to decision making in complex information settings with imprecise probabilities and incomplete preferences

    Conformal anomaly and the vector coupling in dense matter

    We construct an effective chiral Lagrangian for hadrons implemented with conformal invariance and discuss the properties of nuclear matter at high density. The model is formulated with two alternative assignments of chirality to the nucleons, "naive" and mirror. It is shown that in the dilaton limit, in which Weinberg's mended symmetry is manifest, the vector-meson Yukawa coupling becomes suppressed and the symmetry energy becomes softer as one approaches the chiral phase transition. This leads to softer equations of state (EoS) and could accommodate an EoS, without any exotica, consistent with the recent measurement of a 1.97 ± 0.04 M☉ neutron star.
    Comment: v2: 10 pages, 2 figures, typos corrected, a rough estimate of m0 added
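
    For orientation, a scale-invariant chiral Lagrangian with a dilaton field typically takes a form like the following, where the logarithmic potential encodes the conformal (trace) anomaly; this is a generic textbook-style sketch, not necessarily the model constructed in the paper:

```latex
% Generic sketch: chiral field U, dilaton \chi with decay constant f_\chi.
\mathcal{L} =
  \frac{f_\pi^{2}}{4}\left(\frac{\chi}{f_\chi}\right)^{2}
  \operatorname{Tr}\!\left(\partial_\mu U \,\partial^\mu U^\dagger\right)
  + \frac{1}{2}\,\partial_\mu \chi \,\partial^\mu \chi - V(\chi),
\qquad
V(\chi) = \frac{m_\chi^{2} f_\chi^{2}}{4}
  \left(\frac{\chi}{f_\chi}\right)^{4}
  \left(\ln\frac{\chi}{f_\chi} - \frac{1}{4}\right).
```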

    Framing the Facebook Oversight Board: Rough Justice in the Wild Web?

    The article analyzes the structure of the Facebook Oversight Board and the problems connected with establishing private jurisdictions internal to ISPs, also in light of the EU's Digital Services Act (DSA).