230 research outputs found

    Theoretical Foundations of Adversarially Robust Learning

    Despite extraordinary progress, current machine learning systems have been shown to be brittle against adversarial examples: seemingly innocuous but carefully crafted perturbations of test examples that cause machine learning predictors to misclassify. Can we learn predictors robust to adversarial examples, and how? There has been much empirical interest in this contemporary challenge in machine learning, and in this thesis we address it from a theoretical perspective. We explore what robustness properties we can hope to guarantee against adversarial examples and develop an understanding of how to algorithmically guarantee them. We illustrate the need to go beyond traditional approaches and principles such as empirical risk minimization and uniform convergence, and make contributions that can be categorized as follows: (1) introducing problem formulations capturing aspects of emerging practical challenges in robust learning, (2) designing new learning algorithms with provable robustness guarantees, and (3) characterizing the complexity of robust learning and fundamental limitations on the performance of any algorithm. Comment: PhD Thesis
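
    As background for these formulations, robustness to adversarial examples is commonly formalized through the adversarially robust risk, in which each test input may be perturbed within an epsilon-ball before prediction. A standard statement of this objective is sketched below in LaTeX; the notation is assumed here, and the specific perturbation sets and losses studied in the thesis may differ.

        \mathrm{R}_{\mathrm{rob}}(f) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{\|\delta\|\le\epsilon}\ \ell\big(f(x+\delta),\,y\big)\Big],
        \qquad \hat{f}\in\arg\min_{f\in\mathcal{F}}\ \mathrm{R}_{\mathrm{rob}}(f)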

    Information Bottleneck

    The celebrated information bottleneck (IB) principle of Tishby et al. has recently enjoyed renewed attention due to its application in the area of deep learning. This collection investigates the IB principle in this new context. The individual chapters in this collection: • provide novel insights into the functional properties of the IB; • discuss the IB principle (and its derivatives) as an objective for training multi-layer machine learning structures such as neural networks and decision trees; and • offer a new perspective on neural network learning via the lens of the IB framework. Our collection thus contributes to a better understanding of the IB principle specifically for deep learning and, more generally, of information-theoretic cost functions in machine learning. This paves the way toward explainable artificial intelligence.
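
    For reference, the IB principle seeks a representation T of an input X that is maximally compressed while remaining predictive of a target Y. A minimal sketch of the standard IB Lagrangian, with notation assumed here rather than taken from the collection, is:

        \min_{p(t\mid x)}\ I(X;T)\;-\;\beta\, I(T;Y)

    where larger values of the trade-off parameter \beta favor preserving information about Y over compressing X.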

    Neural Image Compression: Generalization, Robustness, and Spectral Biases

    Recent advances in neural image compression (NIC) have produced models that are starting to outperform classic codecs. While this has led to growing excitement about using NIC in real-world applications, the successful adoption of any machine learning system in the wild requires it to generalize (and be robust) to unseen distribution shifts at deployment. Unfortunately, current research lacks comprehensive datasets and informative tools to evaluate and understand NIC performance in real-world settings. To bridge this crucial gap, first, this paper presents a comprehensive benchmark suite to evaluate the out-of-distribution (OOD) performance of image compression methods. Specifically, we provide CLIC-C and Kodak-C by introducing 15 corruptions to the popular CLIC and Kodak benchmarks. Next, we propose spectrally-inspired inspection tools to gain deeper insight into errors introduced by image compression methods as well as their OOD performance. We then carry out a detailed performance comparison of several classic codecs and NIC variants, revealing intriguing findings that challenge our current understanding of the strengths and limitations of NIC. Finally, we corroborate our empirical findings with theoretical analysis, providing an in-depth view of the OOD performance of NIC and its dependence on the spectral properties of the data. Our benchmarks, spectral inspection tools, and findings provide a crucial bridge to the real-world adoption of NIC. We hope that our work will propel future efforts in designing robust and generalizable NIC methods. Code and data will be made available at https://github.com/klieberman/ood_nic. Comment: NeurIPS 202
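
    The spectral inspection tools are not specified in this abstract; as a rough illustration of the general idea, the hypothetical Python sketch below compares the radially averaged power spectrum of the residual between an original and a compressed image, one simple way to see which frequency bands a codec distorts most. Function and variable names are illustrative and are not the paper's API.

        import numpy as np

        def radial_power_spectrum(img: np.ndarray, n_bins: int = 64) -> np.ndarray:
            """Radially averaged power spectrum of a 2-D grayscale image."""
            f = np.fft.fftshift(np.fft.fft2(img))
            power = np.abs(f) ** 2
            h, w = img.shape
            yy, xx = np.indices((h, w))
            r = np.hypot(yy - h / 2, xx - w / 2)
            bins = np.linspace(0, r.max(), n_bins + 1)
            idx = np.digitize(r.ravel(), bins) - 1
            sums = np.bincount(idx, weights=power.ravel(), minlength=n_bins)
            counts = np.bincount(idx, minlength=n_bins)
            return sums[:n_bins] / np.maximum(counts[:n_bins], 1)

        def spectral_error_profile(original: np.ndarray, compressed: np.ndarray) -> np.ndarray:
            """Per-frequency-band energy of the compression residual."""
            residual = original.astype(np.float64) - compressed.astype(np.float64)
            return radial_power_spectrum(residual)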

    Recall Distortion in Neural Network Pruning and the Undecayed Pruning Algorithm

    Pruning techniques have been successfully used in neural networks to trade accuracy for sparsity. However, the impact of network pruning is not uniform: prior work has shown that the recall for underrepresented classes in a dataset may be more negatively affected. In this work, we study such relative distortions in recall by hypothesizing an intensification effect that is inherent to the model: pruning makes recall relatively worse for a class with recall below accuracy and, conversely, relatively better for a class with recall above accuracy. In addition, we propose a new pruning algorithm aimed at attenuating this effect. Through statistical analysis, we observe that intensification is less severe with our algorithm but nevertheless more pronounced with relatively more difficult tasks, less complex models, and higher pruning ratios. More surprisingly, with lower pruning ratios we observe a de-intensification effect, which indicates that moderate pruning may have a corrective effect on such distortions.
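
    The abstract defines intensification in terms of per-class recall relative to overall accuracy; the Python sketch below shows one illustrative way to quantify that effect from the predictions of the dense and pruned models. The metric here is a plain reading of the abstract, not the paper's exact statistic.

        import numpy as np

        def per_class_recall(y_true: np.ndarray, y_pred: np.ndarray, n_classes: int) -> np.ndarray:
            """Recall of each class: fraction of that class's examples predicted correctly."""
            recall = np.zeros(n_classes)
            for c in range(n_classes):
                mask = y_true == c
                recall[c] = (y_pred[mask] == c).mean() if mask.any() else np.nan
            return recall

        def intensification_signal(y_true, pred_dense, pred_pruned, n_classes):
            """Positive entries indicate intensification: classes with recall below the dense
            model's accuracy lose recall after pruning, classes above it gain recall."""
            acc_dense = (pred_dense == y_true).mean()
            r_dense = per_class_recall(y_true, pred_dense, n_classes)
            r_pruned = per_class_recall(y_true, pred_pruned, n_classes)
            return np.sign(r_dense - acc_dense) * (r_pruned - r_dense)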

    Novel Concepts and Designs for Adversarial Attacks and Defenses

    Albeit displaying remarkable performance across a range of tasks, Deep Neural Networks (DNNs) are highly vulnerable to adversarial examples, which are carefully crafted to deceive these networks. This thesis first demonstrates that DNNs are vulnerable to adversarial attacks even when the attacker is unaware of the model architecture or the training data used to train the model, and then proposes a number of novel approaches to improve the robustness of DNNs against challenging adversarial perturbations. Specifically, for adversarial attacks, our work highlights how targeted and untargeted adversarial functions can be learned without access to the original data distribution, training mechanism, or label space of an attacked computer vision system. We demonstrate state-of-the-art cross-domain transferability of adversarial perturbations learned from paintings, cartoons, and medical scans to models trained on natural image datasets (such as ImageNet). In this manner, our work highlights an important vulnerability of deep neural networks which makes their deployment challenging in real-world scenarios.
    Against the threat of these adversarial attacks, we develop novel defense mechanisms that can be deployed with or without retraining the deep neural networks. To this end, we design two plug-and-play defense methods that can protect off-the-shelf pre-trained models without retraining. Specifically, we propose the Neural Representation Purifier (NRP) and Local Gradient Smoothing (LGS) to defend against constrained and unconstrained attacks, respectively. NRP learns to purify adversarial noise spread across the entire input image; however, it still struggles against unconstrained attacks, where an attacker hides an adversarial sticker, preferably in the background, without disturbing the original salient image information. We therefore develop a mechanism to smooth local gradients in an input image to stabilize the abnormal adversarial patterns introduced by unconstrained attacks such as an adversarial patch.
    Robustifying the model's parameter space, that is, retraining the model on adversarial examples, is of equal importance. However, current adversarial training methods not only lose performance on clean image samples (images without adversarial noise) but also generalize poorly to natural image corruptions. We propose a style-based adversarial training that enhances model robustness to adversarial attacks as well as natural corruptions. A model trained with our proposed stylized adversarial training shows better generalization to shifts in the data distribution, including natural image corruptions such as fog, rain, and contrast. One drawback of adversarial training is the loss of accuracy on clean image samples, especially when the model size is small. To address this limitation, we design a meta-learning-based approach that takes advantage of universal (instance-agnostic) as well as local (instance-specific) perturbations to train small neural networks with feature regularization, leading to better robustness with a minimal drop in performance on clean image samples. Adversarial training is useful only if it holds up against unseen adversarial attacks. However, evaluating a given adversarial training mechanism remains challenging due to gradient masking, a phenomenon in which adversarial robustness appears high because the attack optimization fails. Finally, we develop a generic attack algorithm based on a novel guidance mechanism to expose any elusive robustness due to gradient masking.
    In short, this thesis outlines new methods to expose the vulnerability of DNNs to adversarial perturbations and then proposes novel defense techniques with particular advantages over state-of-the-art methods, e.g., task-agnostic behavior, good performance against natural perturbations, and less impact on model accuracy on clean image samples.
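
    Several of the defenses above build on adversarial training. As context rather than a reproduction of the thesis's stylized or meta-learned variants, the Python sketch below shows a generic PGD-based adversarial training step in PyTorch; the perturbation budget, step size, and iteration count are illustrative defaults.

        import torch
        import torch.nn.functional as F

        def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
            """Craft L-infinity bounded adversarial examples with projected gradient descent."""
            x_adv = x + torch.empty_like(x).uniform_(-eps, eps)   # random start inside the eps-ball
            for _ in range(steps):
                x_adv = x_adv.detach().requires_grad_(True)
                loss = F.cross_entropy(model(x_adv), y)
                grad = torch.autograd.grad(loss, x_adv)[0]
                x_adv = x_adv + alpha * grad.sign()
                x_adv = x + (x_adv - x).clamp(-eps, eps)          # project back into the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)                     # keep a valid image
            return x_adv.detach()

        def adversarial_training_step(model, optimizer, x, y):
            """One minibatch update on adversarial examples (standard adversarial training)."""
            x_adv = pgd_attack(model, x, y)
            optimizer.zero_grad()
            loss = F.cross_entropy(model(x_adv), y)
            loss.backward()
            optimizer.step()
            return loss.item()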

    Meta-Learning in Neural Networks: A Survey

    The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research.
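
    As a concrete reference point for the optimization-based branch of such a taxonomy, the Python sketch below shows the inner/outer loop of a MAML-style meta-learner using PyTorch 2.x's torch.func.functional_call; it is a generic illustration, not a method proposed in the survey, and the task batch format is assumed.

        import torch
        import torch.nn.functional as F
        from torch.func import functional_call   # requires PyTorch 2.x

        def maml_outer_step(model, meta_optimizer, tasks, inner_lr=0.01):
            """One meta-update: adapt to each task's support set, score on its query set."""
            meta_optimizer.zero_grad()
            meta_loss = 0.0
            for support_x, support_y, query_x, query_y in tasks:
                params = dict(model.named_parameters())
                # Inner loop: a single gradient step on the task's support set.
                support_loss = F.cross_entropy(model(support_x), support_y)
                grads = torch.autograd.grad(support_loss, tuple(params.values()), create_graph=True)
                adapted = {name: p - inner_lr * g
                           for (name, p), g in zip(params.items(), grads)}
                # Outer loop: evaluate the adapted parameters on the task's query set.
                query_logits = functional_call(model, adapted, (query_x,))
                meta_loss = meta_loss + F.cross_entropy(query_logits, query_y)
            meta_loss.backward()   # second-order gradients flow through the inner update
            meta_optimizer.step()
            return float(meta_loss)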

    Coresets for Wasserstein Distributionally Robust Optimization Problems

    Wasserstein distributionally robust optimization (WDRO) is a popular model to enhance the robustness of machine learning with ambiguous data. However, the complexity of WDRO can be prohibitive in practice, since solving its "minimax" formulation requires a great amount of computation. Recently, several fast WDRO training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, to the best of our knowledge, research on designing efficient algorithms for general large-scale WDRO problems is still quite limited. The coreset is an important tool for compressing large datasets, and thus it has been widely applied to reduce the computational complexity of many optimization problems. In this paper, we introduce a unified framework to construct the ε-coreset for general WDRO problems. Though it is challenging to obtain a conventional coreset for WDRO due to the uncertainty issue of ambiguous data, we show that we can compute a "dual coreset" by using the strong duality property of WDRO. Also, the error introduced by the dual coreset can be theoretically guaranteed with respect to the original WDRO objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of WDRO. Finally, we implement our coreset approach and illustrate its effectiveness for several WDRO problems in the experiments.
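
    For context, the "minimax" formulation referred to above is commonly written as below, where the inner supremum ranges over all distributions within Wasserstein distance epsilon of the empirical distribution; this is a standard statement of WDRO, with notation assumed rather than taken from the paper.

        \min_{\theta}\ \sup_{Q\,:\,W(Q,\hat{P}_n)\le\epsilon}\ \mathbb{E}_{(x,y)\sim Q}\big[\ell(\theta;x,y)\big]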

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated the instant distribution and sharing of digital images on social platforms, resulting in a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts on source identification and manipulation detection. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks), forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book collects a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges and to help ensure media authenticity.