
    Uncertainty Quantification Using Neural Networks for Molecular Property Prediction

    Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these models. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five benchmark datasets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple datasets. While we believe these results show that existing UQ methods are not sufficient for all common use-cases and demonstrate the benefits of further research, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.

    Circulating antigen tests and urine reagent strips for diagnosis of active schistosomiasis in endemic areas

    Background: Point-of-care (POC) tests for diagnosing schistosomiasis include tests based on circulating antigen detection and urine reagent strip tests. If they had sufficient diagnostic accuracy they could replace conventional microscopy as they provide a quicker answer and are easier to use. Objectives: To summarise the diagnostic accuracy of: a) urine reagent strip tests in detecting active Schistosoma haematobium infection, with microscopy as the reference standard; and b) circulating antigen tests for detecting active Schistosoma infection in geographical regions endemic for Schistosoma mansoni or S. haematobium or both, with microscopy as the reference standard. Search methods: We searched the electronic databases MEDLINE, EMBASE, BIOSIS, MEDION, and Health Technology Assessment (HTA) without language restriction up to 30 June 2014. Selection criteria: We included studies that used microscopy as the reference standard: for S. haematobium, microscopy of urine prepared by filtration, centrifugation, or sedimentation methods; and for S. mansoni, microscopy of stool by Kato-Katz thick smear. We included studies on participants residing in endemic areas only. Data collection and analysis: Two review authors independently extracted data, assessed quality of the data using QUADAS-2, and performed meta-analysis where appropriate. Using the variability of test thresholds, we used the hierarchical summary receiver operating characteristic (HSROC) model for all eligible tests (except the circulating cathodic antigen (CCA) POC for S. mansoni, where the bivariate random-effects model was more appropriate). We investigated heterogeneity, and carried out indirect comparisons where data were sufficient. Results for sensitivity and specificity are presented as percentages with 95% confidence intervals (CI). Main results: We included 90 studies; 88 from field settings in Africa. The median S. haematobium infection prevalence was 41% (range 1% to 89%) and 36% for S. mansoni (range 8% to 95%). Study design and conduct were poorly reported against current standards. Tests for S. haematobium: Urine reagent test strips versus microscopy: Compared to microscopy, the detection of microhaematuria on test strips had the highest sensitivity and specificity (sensitivity 75%, 95% CI 71% to 79%; specificity 87%, 95% CI 84% to 90%; 74 studies, 102,447 participants). For proteinuria, sensitivity was 61% and specificity was 82% (82,113 participants); and for leukocyturia, sensitivity was 58% and specificity 61% (1532 participants). However, overall test accuracy did not differ significantly between the urine reagent strips for microhaematuria and proteinuria, either when we compared separate populations (P = 0.25) or when direct comparisons within the same individuals were performed (paired studies; P = 0.21). When tests were evaluated against the higher quality reference standard (when multiple samples were analysed), sensitivity was marginally lower for microhaematuria (71% vs 75%) and for proteinuria (49% vs 61%). The specificity of these tests was comparable. Antigen assay: Compared to microscopy, the CCA test showed considerable heterogeneity; meta-analytic sensitivity estimate was 39%, 95% CI 6% to 73%; specificity 78%, 95% CI 55% to 100% (four studies, 901 participants). Tests for S. mansoni: Compared to microscopy, the CCA test meta-analytic estimates for detecting S. mansoni at a single threshold of trace positive were: sensitivity 89% (95% CI 86% to 92%); and specificity 55% (95% CI 46% to 65%; 15 studies, 6091 participants). Against a higher quality reference standard, the sensitivity results were comparable (89% vs 88%) but specificity was higher (66% vs 55%). For the CAA test, sensitivity ranged from 47% to 94%, and specificity from 8% to 100% (four studies, 1583 participants). Authors' conclusions: Among the evaluated tests for S. haematobium infection, microhaematuria correctly detected the largest proportions of infections and non-infections identified by microscopy. The CCA POC test for S. mansoni detects a very large proportion of infections identified by microscopy, but it misclassifies a large proportion of microscopy negatives as positives in endemic areas with a moderate to high prevalence of infection, possibly because the test is potentially more sensitive than microscopy.

    Reconstructing dynamical networks via feature ranking

    Empirical data on real complex systems are becoming increasingly available. Parallel to this is the need for new methods of reconstructing (inferring) the topology of networks from time-resolved observations of their node dynamics. Methods based on physical insights often rely on strong assumptions about the properties and dynamics of the scrutinized network. Here, we use insights from machine learning to design a new method of network reconstruction that makes essentially no such assumptions. Specifically, we interpret the available trajectories (data) as features, and use two independent feature ranking approaches -- Random forest and RReliefF -- to rank the importance of each node for predicting the value of each other node, which yields the reconstructed adjacency matrix. We show that our method is fairly robust to coupling strength, system size, trajectory length and noise. We also find that the reconstruction quality strongly depends on the dynamical regime.
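The pipeline described in this abstract (trajectories as features, one importance score per ordered pair of nodes, scores collected into a matrix) can be illustrated with a minimal sketch. It is not the authors' implementation. To stay dependency-free it swaps in absolute Pearson correlation as a stand-in importance score in place of Random forest or RReliefF, and it scores each feature on its own, which a forest would not do. The one-step time lag, the function names `abs_pearson` and `reconstruct`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation: a simple stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def reconstruct(trajectories, score=abs_pearson):
    """Return W where W[i][j] scores node i as a predictor of node j.

    trajectories[i] is the time series of node i. Assumption: the value of
    node i at time t is the feature, the value of node j at t + 1 the target.
    """
    n = len(trajectories)
    W = [[0.0] * n for _ in range(n)]
    for j in range(n):
        target = trajectories[j][1:]
        for i in range(n):
            if i != j:  # self-links are left at zero
                W[i][j] = score(trajectories[i][:-1], target)
    return W


# Hypothetical toy system: node 0 drives node 1; node 2 is independent noise.
rng = random.Random(0)
x0 = [rng.uniform(-1, 1) for _ in range(200)]
x1 = [0.0] + [0.9 * v for v in x0[:-1]]
x2 = [rng.uniform(-1, 1) for _ in range(200)]
W = reconstruct([x0, x1, x2])
```

Thresholding `W`, or keeping the top-ranked entries, would give a binary adjacency estimate. Any scorer with the same two-argument signature can be passed as `score`.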

    Can Who-Edits-What Predict Edit Survival?

    As the number of contributors to online peer-production systems grows, it becomes increasingly important to predict whether the edits that users make will eventually be beneficial to the project. Existing solutions either rely on a user reputation system or consist of a highly specialized predictor that is tailored to a specific peer-production system. In this work, we explore a different point in the solution space that goes beyond user reputation but does not involve any content-based feature of the edits. We view each edit as a game between the editor and the component of the project. We posit that the probability that an edit is accepted is a function of the editor's skill, of the difficulty of editing the component and of a user-component interaction term. Our model is broadly applicable, as it only requires observing data about who makes an edit, what the edit affects and whether the edit survives or not. We apply our model to Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, and we seek to understand whether it can effectively predict edit survival: in both cases, we provide a positive answer. Our approach significantly outperforms those based solely on user reputation and bridges the gap with specialized predictors that use content-based features. It is simple to implement, computationally inexpensive, and in addition it enables us to discover interesting structure in the data. Comment: Accepted at KDD 201
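One way to make the "skill, difficulty, interaction" description concrete is the sketch below. The abstract only says the acceptance probability is a function of these three terms. The logistic link, the dot-product form of the interaction, and the name `survival_probability` are assumptions made for illustration, not a statement of the authors' exact model.

```python
import math


def survival_probability(skill, difficulty, editor_vec, component_vec):
    """Assumed form: sigmoid(skill - difficulty + <editor_vec, component_vec>).

    skill and difficulty are scalars; editor_vec and component_vec are
    hypothetical latent vectors whose dot product is the interaction term.
    """
    interaction = sum(a * b for a, b in zip(editor_vec, component_vec))
    logit = skill - difficulty + interaction
    return 1.0 / (1.0 + math.exp(-logit))


# With all terms at zero the model is indifferent between outcomes.
p_neutral = survival_probability(0.0, 0.0, [], [])
# A more skilled editor on the same component gets a higher probability.
p_skilled = survival_probability(1.5, 0.5, [0.2, -0.1], [0.4, 0.3])
p_novice = survival_probability(-0.5, 0.5, [0.2, -0.1], [0.4, 0.3])
```

Under this form the inputs are only the editor's identity, the component's identity, and the observed outcome, which matches the abstract's claim that no content-based features are needed.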