    An Epistemic Approach to the Formal Specification of Statistical Machine Learning

    We propose an epistemic approach to formalizing statistical properties of machine learning. Specifically, we introduce a formal model for supervised learning based on a Kripke model, in which each possible world corresponds to a possible dataset and modal operators are interpreted as transformation and testing operations on datasets. We then formalize various notions of the classification performance, robustness, and fairness of statistical classifiers using our extension of statistical epistemic logic (StatEL). Within this formalization, we show relationships among properties of classifiers, and the relevance of classification performance to robustness. To the best of our knowledge, this is the first work that uses epistemic models and logical formulas to express statistical properties of machine learning, and it could serve as a starting point for developing theories of the formal specification of machine learning.

    Comment: Accepted in Software and Systems Modeling (https://rdcu.be/b7ssR). This paper is the journal version of the SEFM'19 conference paper arXiv:1907.1032.
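    As a rough illustration of the idea, the Python sketch below models possible worlds as datasets and evaluates a modal "box" operator over an accessibility relation. The world representation, the trivial accessibility relation, and the accuracy predicate are hypothetical simplifications for this sketch, not the paper's StatEL semantics.

    # Toy sketch of a Kripke-style model for statistical properties of a
    # classifier. Illustrative only; this simplifies the StatEL semantics
    # described in the abstract.
    from dataclasses import dataclass
    from typing import Callable, List
    import random

    Dataset = List[tuple]  # list of (features, label) pairs

    @dataclass
    class KripkeModel:
        worlds: List[Dataset]                        # each world is a dataset
        access: Callable[[Dataset, Dataset], bool]   # accessibility relation

    def box(model: KripkeModel, world: Dataset,
            phi: Callable[[Dataset], bool]) -> bool:
        """Modal 'box': phi holds in every world accessible from `world`."""
        return all(phi(w) for w in model.worlds if model.access(world, w))

    # Atomic statistical predicate: classifier accuracy on a dataset >= theta.
    def accurate(classifier, theta: float) -> Callable[[Dataset], bool]:
        def phi(ds: Dataset) -> bool:
            correct = sum(classifier(x) == y for x, y in ds)
            return correct / len(ds) >= theta
        return phi

    # Worlds are resampled variants of one dataset; here every world is
    # trivially accessible from every other, a stand-in for "drawn from
    # the same underlying distribution".
    base = [((i % 7,), i % 2) for i in range(100)]
    worlds = [random.sample(base, 80) for _ in range(10)]
    model = KripkeModel(worlds=worlds, access=lambda u, v: True)
    clf = lambda x: x[0] % 2          # a stand-in classifier
    print(box(model, worlds[0], accurate(clf, 0.4)))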

    Robust detection and attribution of climate change under interventions

    Fingerprints are key tools in climate change detection and attribution (D&A) that are used to determine whether changes in observations differ from internal climate variability (detection), and whether observed changes can be assigned to specific external drivers (attribution). We propose a direct D&A approach based on supervised learning to extract fingerprints that lead to robust predictions under relevant interventions on exogenous variables, i.e., climate drivers other than the target. We employ anchor regression, a distributionally robust statistical learning method inspired by causal inference that extrapolates well to data perturbed by the interventions considered. The residuals from the prediction achieve either uncorrelatedness or mean independence with respect to the exogenous variables, thus guaranteeing robustness. We define D&A as a unified hypothesis-testing framework that relies on the same statistical model but uses different targets and test statistics. In the experiments, we first show that the CO2 forcing can be robustly predicted from spatial patterns of temperature under strong interventions on the solar forcing. Second, we illustrate attribution to greenhouse gases and aerosols while protecting against interventions on the aerosol and CO2 forcings, respectively. Our study shows that incorporating robustness constraints against relevant interventions can significantly benefit the detection and attribution of climate change.
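    For readers unfamiliar with anchor regression, the Python sketch below shows its standard transformation-based estimator (solve ordinary least squares on anchor-transformed data). The toy data, variable names, and the choice of gamma are assumptions for illustration, not the study's actual climate setup.

    # Minimal anchor regression sketch. gamma = 1 recovers OLS; large gamma
    # enforces (approximate) uncorrelatedness of the residuals with the
    # anchors A, which is the robustness property used in the abstract.
    import numpy as np

    def anchor_regression(X, y, A, gamma):
        """OLS on data transformed by W = I - (1 - sqrt(gamma)) * P_A,
        where P_A projects onto the column space of the anchors A."""
        P = A @ np.linalg.pinv(A.T @ A) @ A.T        # projection onto anchors
        W = np.eye(len(y)) - (1.0 - np.sqrt(gamma)) * P
        beta, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)
        return beta

    # Toy usage: the target y and predictor X are both driven by an
    # exogenous anchor variable A (a stand-in for an external forcing).
    rng = np.random.default_rng(0)
    n = 500
    A = rng.normal(size=(n, 1))                      # exogenous driver
    X = 2.0 * A + rng.normal(size=(n, 1))
    y = X @ np.array([1.5]) + 3.0 * A[:, 0] + rng.normal(size=n)
    # Large gamma yields a coefficient whose residuals are nearly
    # uncorrelated with A, protecting predictions under shifts in A.
    print(anchor_regression(X, y, A, gamma=100.0))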

    Model checking the evolution of gene regulatory networks

    The behaviour of gene regulatory networks (GRNs) is typically analysed with simulation-based statistical testing methods. In this paper, we demonstrate that this approach can be replaced by a formal-verification method that offers higher assurance and scalability. We focus on Wagner's weighted GRN model with varying weights, which is used in evolutionary biology. In this model, weight parameters represent gene interaction strengths that may change due to genetic mutations. For a property of interest, specified in linear temporal logic, we synthesise the constraints over the parameter space that represent the set of GRNs satisfying the property. We employ symbolic bounded model checking and SMT solving to compute the space of GRNs that satisfy the property, which amounts to synthesising a set of linear constraints on the weights. We show experimentally that our parameter synthesis procedure computes the mutational robustness of GRNs, an important problem in evolutionary biology, more efficiently than the classical simulation method.
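    To make the encoding concrete, here is a hedged Python/Z3 sketch of bounded model checking for a toy two-gene threshold network. The network, its dynamics, and the bounded property are illustrative assumptions rather than Wagner's full model, and a single existential query stands in for full synthesis of the satisfying weight region. Requires the z3-solver package.

    from z3 import Real, Bool, Solver, And, If, sat

    K = 4                                 # unrolling depth (bounded MC)
    w01, w10 = Real('w01'), Real('w10')   # interaction weights (parameters)

    # Boolean gene states over K+1 time steps.
    g = [[Bool(f'g{i}_{t}') for i in range(2)] for t in range(K + 1)]

    s = Solver()
    s.add(g[0][0], g[0][1])               # initial state: both genes on

    # Threshold dynamics: gene i is on at t+1 iff its weighted input > 0.
    for t in range(K):
        in0 = If(g[t][1], w10, -w10)      # input to gene 0 from gene 1
        in1 = If(g[t][0], w01, -w01)      # input to gene 1 from gene 0
        s.add(g[t + 1][0] == (in0 > 0), g[t + 1][1] == (in1 > 0))

    # Bounded LTL 'globally' up to depth K: gene 0 stays expressed.
    s.add(And(*[g[t][0] for t in range(K + 1)]))

    # Existential query: find one weight assignment under which the
    # property holds; full synthesis characterises the whole region.
    if s.check() == sat:
        m = s.model()
        print('witness weights:', m[w01], m[w10])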

    Sickle cell disease classification using deep learning

    This paper presents a transfer- and deep-learning-based approach to the classification of Sickle Cell Disease (SCD). Five transfer learning models, ResNet-50, AlexNet, MobileNet, VGG-16 and VGG-19, together with a sequential convolutional neural network (CNN), have been implemented for SCD classification. The erythrocytesIDB dataset has been used for training and testing the models. To compensate for the limited size of the erythrocytesIDB dataset, advanced image augmentation techniques are employed to ensure the robustness of the dataset, enhance its diversity, and improve the accuracy of the models. An ablation experiment using Random Forest and Support Vector Machine (SVM) classifiers, along with extensive hyperparameter tuning, was carried out to determine the contribution of different model elements to predictive accuracy. A rigorous statistical analysis was carried out for evaluation, and an adversarial attack test was conducted to further assess model robustness. The experimental results demonstrate compelling performance across all models. The statistical tests showed that MobileNet achieved a significant improvement (p = 0.0229), while the other models (ResNet-50, AlexNet, VGG-16, VGG-19) did not (p > 0.05). Notably, the ResNet-50 model achieves precision, recall, and F1-score values of 100% for circular, elongated, and other cell shapes when trained on a smaller dataset. The AlexNet model achieves balanced precision (98%) and recall (99%) for circular and elongated shapes, while the remaining models show competitive performance.
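    As a hedged sketch of this kind of pipeline, the Python code below fine-tunes a pretrained ResNet-50 for three cell-shape classes with basic image augmentation. The dataset path, directory layout, and hyperparameters are assumptions for illustration, not the paper's exact configuration. Requires torch and torchvision.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    # Augmentation to enlarge an otherwise small dataset.
    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(),
        transforms.RandomRotation(20),
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])

    # Hypothetical layout: erythrocytesIDB/train/{circular,elongated,other}
    train_ds = datasets.ImageFolder('erythrocytesIDB/train', transform=augment)
    loader = DataLoader(train_ds, batch_size=32, shuffle=True)

    # Transfer learning: freeze the ImageNet backbone, train a new head.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    for p in model.parameters():
        p.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, 3)   # 3 cell-shape classes

    opt = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for epoch in range(5):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()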