1,917 research outputs found

    Robust Decision Trees Against Adversarial Examples

    Full text link
    Although adversarial examples and model robustness have been extensively studied in the context of linear models and neural networks, research on this issue in tree-based models and how to make tree-based models robust against adversarial examples is still limited. In this paper, we show that tree based models are also vulnerable to adversarial examples and develop a novel algorithm to learn robust trees. At its core, our method aims to optimize the performance under the worst-case perturbation of input features, which leads to a max-min saddle point problem. Incorporating this saddle point objective into the decision tree building procedure is non-trivial due to the discrete nature of trees --- a naive approach to finding the best split according to this saddle point objective will take exponential time. To make our approach practical and scalable, we propose efficient tree building algorithms by approximating the inner minimizer in this saddle point problem, and present efficient implementations for classical information gain based trees as well as state-of-the-art tree boosting models such as XGBoost. Experimental results on real world datasets demonstrate that the proposed algorithms can substantially improve the robustness of tree-based models against adversarial examples

    Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers

    Full text link
    In this paper, we present a black-box attack against API call based machine learning malware classifiers, focusing on generating adversarial sequences combining API calls and static features (e.g., printable strings) that will be misclassified by the classifier without affecting the malware functionality. We show that this attack is effective against many classifiers due to the transferability principle between RNN variants, feed forward DNNs, and traditional machine learning classifiers such as SVM. We also implement GADGET, a software framework to convert any malware binary to a binary undetected by malware classifiers, using the proposed attack, without access to the malware source code.Comment: Accepted as a conference paper at RAID 201

    Efficient Monte Carlo Integration Using Boosted Decision Trees and Generative Deep Neural Networks

    Get PDF
    New machine learning based algorithms have been developed and tested for Monte Carlo integration based on generative Boosted Decision Trees and Deep Neural Networks. Both of these algorithms exhibit substantial improvements compared to existing algorithms for non-factorizable integrands in terms of the achievable integration precision for a given number of target function evaluations. Large scale Monte Carlo generation of complex collider physics processes with improved efficiency can be achieved by implementing these algorithms into commonly used matrix element Monte Carlo generators once their robustness is demonstrated and performance validated for the relevant classes of matrix elements

    Verifying Robustness of Gradient Boosted Models

    Full text link
    Gradient boosted models are a fundamental machine learning technique. Robustness to small perturbations of the input is an important quality measure for machine learning models, but the literature lacks a method to prove the robustness of gradient boosted models. This work introduces VeriGB, a tool for quantifying the robustness of gradient boosted models. VeriGB encodes the model and the robustness property as an SMT formula, which enables state of the art verification tools to prove the model's robustness. We extensively evaluate VeriGB on publicly available datasets and demonstrate a capability for verifying large models. Finally, we show that some model configurations tend to be inherently more robust than others

    Pulling Out All the Tops with Computer Vision and Deep Learning

    Full text link
    We apply computer vision with deep learning -- in the form of a convolutional neural network (CNN) -- to build a highly effective boosted top tagger. Previous work (the "DeepTop" tagger of Kasieczka et al) has shown that a CNN-based top tagger can achieve comparable performance to state-of-the-art conventional top taggers based on high-level inputs. Here, we introduce a number of improvements to the DeepTop tagger, including architecture, training, image preprocessing, sample size and color pixels. Our final CNN top tagger outperforms BDTs based on high-level inputs by a factor of ∼2\sim 2--3 or more in background rejection, over a wide range of tagging efficiencies and fiducial jet selections. As reference points, we achieve a QCD background rejection factor of 500 (60) at 50\% top tagging efficiency for fully-merged (non-merged) top jets with pTp_T in the 800--900 GeV (350--450 GeV) range. Our CNN can also be straightforwardly extended to the classification of other types of jets, and the lessons learned here may be useful to others designing their own deep NNs for LHC applications.Comment: 33 pages, 11 figure
    • …
    corecore