
    Truly Unordered Probabilistic Rule Sets for Multi-class Classification

    Rule set learning has long been studied and has recently been frequently revisited due to the need for interpretable models. Still, existing methods have several shortcomings: 1) most recent methods require a binary feature matrix as input, while learning rules directly from numeric variables is understudied; 2) existing methods impose orders among rules, either explicitly or implicitly, which harms interpretability; and 3) currently no method exists for learning probabilistic rule sets for multi-class target variables (there is only one for probabilistic rule lists). We propose TURS, for Truly Unordered Rule Sets, which addresses these shortcomings. We first formalize the problem of learning truly unordered rule sets. To resolve conflicts caused by overlapping rules, i.e., instances covered by multiple rules, we propose a novel approach that exploits the probabilistic properties of our rule sets. We next develop a two-phase heuristic algorithm that learns rule sets by carefully growing rules. An important innovation is that we use a surrogate score to take the global potential of the rule set into account when learning a local rule. Finally, we empirically demonstrate that, compared to non-probabilistic and (explicitly or implicitly) ordered state-of-the-art methods, our method learns rule sets that not only have better interpretability but also better predictive performance.
    Comment: Camera-ready version for ECML PKDD 2022, with supplementary material.
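
    To make the idea of resolving rule overlaps probabilistically concrete, the sketch below shows a minimal rule-set predictor in Python. It is an illustration, not the authors' TURS implementation: the Rule class, the interval-based conditions, and the simple averaging of class distributions over covering rules are all assumptions made here for brevity; TURS itself selects rules and resolves conflicts with an MDL-based criterion.

        # Minimal sketch: probabilistic rule set prediction with overlapping rules.
        # Conflicts between covering rules are resolved through their class
        # distributions instead of an explicit or implicit rule order.
        import numpy as np

        class Rule:
            def __init__(self, conditions, class_probs):
                # conditions: list of (feature_index, low, high) numeric intervals
                # class_probs: class distribution of the instances the rule covers
                self.conditions = conditions
                self.class_probs = np.asarray(class_probs, dtype=float)

            def covers(self, x):
                return all(low <= x[i] <= high for i, low, high in self.conditions)

        def predict_proba(rules, default_probs, x):
            """Blend the class distributions of all covering rules (here by a
            simple average); fall back to the dataset marginal otherwise."""
            covering = [r.class_probs for r in rules if r.covers(x)]
            if not covering:
                return np.asarray(default_probs, dtype=float)
            return np.mean(covering, axis=0)

        # Toy usage: two overlapping rules on a single numeric feature.
        rules = [Rule([(0, 0.0, 5.0)], [0.8, 0.1, 0.1]),
                 Rule([(0, 3.0, 9.0)], [0.2, 0.7, 0.1])]
        x = np.array([4.0])                               # covered by both rules
        print(predict_proba(rules, [1/3, 1/3, 1/3], x))   # blended distribution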

    Robust subgroup discovery

    We introduce the problem of robust subgroup discovery, i.e., finding a set of interpretable descriptions of subsets that 1) stand out with respect to one or more target attributes, 2) are statistically robust, and 3) are non-redundant. Many attempts have been made to mine either locally robust subgroups or to tackle the pattern explosion, but we are the first to address both challenges at the same time from a global modelling perspective. First, we formulate the broad model class of subgroup lists, i.e., ordered sets of subgroups, for univariate and multivariate targets that can consist of nominal or numeric variables, and that includes traditional top-1 subgroup discovery in its definition. This novel model class allows us to formalise the problem of optimal robust subgroup discovery using the Minimum Description Length (MDL) principle, where we resort to optimal Normalised Maximum Likelihood and Bayesian encodings for nominal and numeric targets, respectively. Second, as finding optimal subgroup lists is NP-hard, we propose SSD++, a greedy heuristic that finds good subgroup lists and guarantees that the most significant subgroup according to the MDL criterion is added in each iteration; this is shown to be equivalent to a Bayesian one-sample proportions, multinomial, or t-test between the subgroup and dataset marginal target distributions, plus a multiple hypothesis testing penalty. We empirically show on 54 datasets that SSD++ outperforms previous subgroup set discovery methods in terms of quality and subgroup list size.
    Comment: For associated code, see https://github.com/HMProenca/RuleList ; submitted to the Data Mining and Knowledge Discovery journal.
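
    The greedy construction of a subgroup list that the abstract describes can be sketched as follows. This is a simplified illustration, not SSD++ itself: the gain function below is a plain size-weighted KL divergence between the subgroup's class distribution and the dataset marginal, standing in for the MDL-based compression gain, and candidate generation (beam search over descriptions) is assumed to be given.

        # Simplified greedy subgroup-list loop in the spirit of SSD++.
        # 'candidates' is a list of (description, boolean cover mask) pairs.
        import numpy as np

        def gain(target, cover_mask):
            """Toy quality score: subgroup size times the KL divergence between
            the subgroup's class distribution and the overall marginal."""
            classes = np.unique(target)
            p_all = np.array([(target == c).mean() for c in classes])
            sub = target[cover_mask]
            if len(sub) == 0:
                return -np.inf
            p_sub = np.array([(sub == c).mean() for c in classes])
            eps = 1e-12
            return len(sub) * np.sum(p_sub * np.log((p_sub + eps) / (p_all + eps)))

        def greedy_subgroup_list(candidates, target, max_rules=5, min_gain=0.0):
            """Iteratively add the highest-gain subgroup; instances it covers are
            removed before the next iteration, yielding an ordered subgroup list."""
            remaining = np.ones(len(target), dtype=bool)
            subgroup_list = []
            for _ in range(max_rules):
                scored = [(gain(target[remaining], mask[remaining]), desc, mask)
                          for desc, mask in candidates]
                best_gain, best_desc, best_mask = max(scored, key=lambda s: s[0])
                if best_gain <= min_gain:
                    break
                subgroup_list.append((best_desc, best_gain))
                remaining &= ~best_mask
            return subgroup_list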

    CHIRPS: Explaining random forest classification

    Modern machine learning methods typically produce “black box” models that are opaque to interpretation. Yet, demand for them has been increasing in human-in-the-loop processes, that is, processes that require a human agent to verify, approve, or reason about the automated decisions before they can be applied. To facilitate this interpretation, we propose Collection of High Importance Random Path Snippets (CHIRPS), a novel algorithm for explaining random forest classification per data instance. CHIRPS extracts a decision path from each tree in the forest that contributes to the majority classification, and then uses frequent pattern mining to identify the most commonly occurring split conditions. A simple, conjunctive-form rule is then constructed whose antecedent terms are derived from the attributes that had the most influence on the classification. This rule is returned alongside estimates of the rule's precision and coverage on the training data, along with counter-factual details. An experimental study involving nine data sets shows that classification rules returned by CHIRPS have a precision at least as high as the state of the art when evaluated on unseen data (0.91–0.99) and offer a much greater coverage (0.04–0.54). Furthermore, CHIRPS uniquely controls against under- and over-fitting solutions by maximising novel objective functions that are better suited to the local (per instance) explanation setting.
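
    The path-extraction step that CHIRPS builds on can be sketched with scikit-learn as below. This is only an illustration of the idea: simple frequency counting over (feature, direction) pairs stands in for the frequent pattern mining, rule construction, and objective maximisation of the actual algorithm, and the iris dataset is just a placeholder.

        # Sketch: collect split conditions from the trees that back the forest's
        # majority vote for one instance, then count the most frequent ones.
        from collections import Counter
        from sklearn.datasets import load_iris
        from sklearn.ensemble import RandomForestClassifier

        X, y = load_iris(return_X_y=True)
        rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

        x = X[[0]]                                   # the instance to explain
        majority = rf.predict(x)[0]                  # forest's majority-vote class

        conditions = Counter()
        for tree in rf.estimators_:
            pred_idx = int(tree.predict(x)[0])       # trees predict class indices
            if rf.classes_[pred_idx] != majority:    # keep only agreeing trees
                continue
            t = tree.tree_
            for node in tree.decision_path(x).indices:   # nodes on x's path
                if t.children_left[node] == t.children_right[node]:
                    continue                         # skip the leaf node
                feat, thr = t.feature[node], t.threshold[node]
                direction = "<=" if x[0, feat] <= thr else ">"
                conditions[(feat, direction)] += 1   # thresholds vary per tree

        # The most frequently recurring conditions would form the rule antecedent.
        print(conditions.most_common(5))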

    Vouw: geometric pattern mining using the MDL principle

    Algorithms and the Foundations of Software technology

    Predicting Treatment Outcome Using Interpretable Models for Patients with Head and Neck Cancer

    Head and neck cancer accounts for around 3% of cancers worldwide, resulting in many deaths each year. The increasing number of patients receiving a cancer diagnosis increases the demand for accurate diagnosis and effective treatment. Intra-tumor heterogeneity is considered one of the key challenges in cancer therapy that still needs to be addressed. Radiomics paves the way for extracting features based on the shape, size, and texture of the entire tumor: it derives features from the gray-level distribution of a medical image and is intended to capture texture and heterogeneity that would be impossible to deduce from a simple tumor biopsy. Feature extraction by radiomics has been shown to enrich clinical datasets with valuable features that positively impact the performance of predictive models. This thesis investigates the use of clinical and radiomics features for predicting treatment outcomes of head and neck cancer patients using interpretable models. The radiomics algorithm extracts first-order statistical, shape, and texture features from PET and CT images of each patient. The 139 patients in the training dataset were from Oslo University Hospital (OUS), whereas the 99 patients in the test set were from the MAASTRO clinic in the Netherlands. The clinical and radiomics features together amounted to 388 features. Feature selection through the repeated elastic net technique (RENT) was performed to exclude irrelevant features from the dataset. Seven different tree-based machine learning algorithms were fitted to the data, and performance was validated by accuracy, ROC AUC, Matthews correlation coefficient (MCC), F1 score for class 1, and F1 score for class 0. The models were tested on the external MAASTRO dataset, and the overall best-performing models were interpreted. On the external dataset, the highest-performing models obtained an MCC of 0.37 for overall survival (OS) prediction and 0.44 for disease-free survival (DFS) prediction. For both OS and DFS, the best predictions were obtained using only the clinical data. Transparency in machine learning models greatly benefits decision-makers in clinical settings, as every prediction can be reasoned about. Predicting treatment outcomes for head and neck cancer patients with interpretable models is thus feasible, but to determine whether the methods used in this thesis generalise, they need to be tested on more datasets.
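
    A rough sketch of the kind of pipeline described above is given below, using scikit-learn. It is illustrative only: the data are random placeholders with the same shapes as the OUS and MAASTRO cohorts, the selection function is a simplified stand-in for the RENT package, and a single random forest stands in for the seven tree-based models compared in the thesis.

        # Illustrative pipeline: repeated elastic-net feature selection followed
        # by a tree-based classifier, evaluated with MCC on an external test set.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import matthews_corrcoef
        from sklearn.model_selection import StratifiedShuffleSplit
        from sklearn.preprocessing import StandardScaler

        def repeated_elastic_net_selection(X, y, n_repeats=20, keep_frac=0.5):
            """Fit elastic-net logistic regression on repeated subsamples and keep
            features with non-zero coefficients in at least keep_frac of the runs
            (a simplified version of the RENT idea)."""
            counts = np.zeros(X.shape[1])
            splitter = StratifiedShuffleSplit(n_splits=n_repeats, train_size=0.9,
                                              random_state=0)
            for idx, _ in splitter.split(X, y):
                Xs = StandardScaler().fit_transform(X[idx])
                model = LogisticRegression(penalty="elasticnet", solver="saga",
                                           l1_ratio=0.5, C=0.5, max_iter=5000)
                model.fit(Xs, y[idx])
                counts += (np.abs(model.coef_[0]) > 1e-8)
            selected = counts / n_repeats >= keep_frac
            if not selected.any():           # fall back to all features if none pass
                selected[:] = True
            return selected

        # Placeholder data with the cohort sizes mentioned above (139 train, 99 test).
        rng = np.random.default_rng(0)
        X_train, y_train = rng.normal(size=(139, 388)), rng.integers(0, 2, 139)
        X_test, y_test = rng.normal(size=(99, 388)), rng.integers(0, 2, 99)

        selected = repeated_elastic_net_selection(X_train, y_train)
        clf = RandomForestClassifier(n_estimators=300, random_state=0)
        clf.fit(X_train[:, selected], y_train)
        print("external-set MCC:",
              matthews_corrcoef(y_test, clf.predict(X_test[:, selected])))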