    Integrating and Ranking Uncertain Scientific Data

    Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application is protein function prediction. In this paper we show that: (i) explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown functions (though it does not improve prediction of well-known functions). This suggests that probabilistic uncertainty models offer utility for scientific knowledge discovery; (ii) small perturbations in the input probabilities tend to produce only minor changes in the quality of our result rankings. This suggests that our methods are robust against slight variations in the way uncertainties are transformed into probabilities; and (iii) several techniques allow us to evaluate our probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.

    When, Where and How to Perform Efficiency Estimation

    In this paper we compare two flexible estimators of technical efficiency in a cross-sectional setting: the nonparametric kernel SFA estimator of Fan, Li and Weersink (1996) and the nonparametric bias-corrected DEA estimator of Kneip, Simar and Wilson (2008). We assess the finite sample performance of each estimator via Monte Carlo simulations and empirical examples. We find that the reliability of efficiency scores critically hinges upon the ratio of the variation in efficiency to the variation in noise. These results should be a valuable resource to both academic researchers and practitioners. Keywords: nonparametric kernel, technical efficiency, bootstrap
