11 research outputs found

    Alleviating Adversarial Attacks on Variational Autoencoders with MCMC

    Variational autoencoders (VAEs) are latent variable models that can generate complex objects and provide meaningful latent representations. Moreover, they can be further used in downstream tasks such as classification. As previous work has shown, one can easily fool VAEs to produce unexpected latent representations and reconstructions for a visually slightly modified input. Here, we examine several objective functions for adversarial attack construction proposed previously and present a solution to alleviate the effect of these attacks. Our method utilizes the Markov Chain Monte Carlo (MCMC) technique in the inference step that we motivate with a theoretical analysis. Thus, we do not incorporate any extra costs during training, and the performance on non-attacked inputs is not decreased. We validate our approach on a variety of datasets (MNIST, Fashion MNIST, Color MNIST, CelebA) and VAE configurations (β-VAE, NVAE, β-TCVAE), and show that our approach consistently improves the model robustness to adversarial attacks.
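    The defense described above keeps the trained VAE fixed and only refines the encoder's latent code at inference time with MCMC steps aimed at the model's posterior. Below is a minimal sketch of that idea, assuming an existing VAE that exposes log p(x|z) + log p(z) as `log_joint` and an encoder sample `z0`, and using a plain random-walk Metropolis-Hastings chain with illustrative step size and chain length (not necessarily the sampler or settings used in the paper):

```python
# Hypothetical sketch: MCMC refinement of a VAE latent code at inference time.
# Assumes an existing VAE providing log_joint(z) = log p(x|z) + log p(z) for a
# fixed input x, and an encoder sample z0 to start the chain from.
import torch

def mcmc_refine_latent(log_joint, z0, n_steps=50, step_size=0.05):
    """Random-walk Metropolis-Hastings over the latent space.

    Starting from the (possibly attacked) encoder output z0, the chain moves
    the latent toward regions of high density under the generative model,
    which is the intuition behind the MCMC defense.
    """
    z = z0.clone()
    logp = log_joint(z)
    for _ in range(n_steps):
        proposal = z + step_size * torch.randn_like(z)   # symmetric proposal
        logp_prop = log_joint(proposal)
        if torch.rand(()).log() < (logp_prop - logp):    # MH acceptance test
            z, logp = proposal, logp_prop
    return z
```

    At test time the refined latent would be decoded in place of the raw encoder output; for clean inputs the chain barely moves z0, which is consistent with the claim that performance on non-attacked inputs is not decreased.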

    Ataques adversários em autoencoders variacionais: um estudo qualitativo e quantitativo (Adversarial attacks on variational autoencoders: a qualitative and quantitative study)

    Advisor: Eduardo Alves do Valle Júnior. Master's dissertation (mestrado), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: From the times of ancient Egypt to today's digital society, humankind has been concerned with information security. This concern has been shifting to the field of Artificial Intelligence, since its applications are being used in activities whose failure can cause serious social harm. A laboratory specialized in cutting-edge security research demonstrated that it was possible to control an autonomous car remotely using adversarial attacks. In this project, we aim to explore those attacks on Variational Autoencoders. Today, adversarial attacks are a well-known danger to Deep Neural Networks: they are malicious inputs that derail machine-learning models. We propose a scheme to attack autoencoders, as well as a quantitative evaluation framework, a novel metric called the Area Under the Distortion-Distortion Curve, which correlates well with the qualitative assessment of the attacks and takes a small step forward in the literature on adversarial attacks against generative models. We assess, with statistically validated experiments, the resistance to attacks of three variational autoencoders (simple, convolutional, and DRAW) on three datasets (MNIST, SVHN, CelebA), showing that both DRAW's recurrence and its attention mechanism lead to better resistance. We share the complete code used to run our experiments in . As variational autoencoders are proposed for compressing data, a scenario in which their safety is paramount, we expect more attention will be given to adversarial attacks on them.
    Master's program: Engenharia de Computação; degree: Mestre em Engenharia Elétrica. Grant 2017/03706-2, FAPES.
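    The quantitative evaluation described above sweeps attack strength and summarizes, as the area under a curve, the trade-off between how much the input is distorted and how much the reconstruction is pushed toward the attacker's target. The sketch below illustrates that kind of computation under stated assumptions: `attack(x, x_target, eps)` is a hypothetical routine returning an adversarial input for budget `eps`, `reconstruct` is the autoencoder's encode-decode pass on NumPy arrays, and the plain trapezoidal integration is only one plausible way to obtain the area, not the thesis's exact protocol:

```python
# Hypothetical sketch of a distortion-distortion curve and its area.
# `attack`, `reconstruct`, `x`, and `x_target` are assumed to exist; the exact
# attack, distances, and normalization used in the thesis may differ.
import numpy as np

def distortion_distortion_area(x, x_target, attack, reconstruct, budgets):
    """Sweep attack budgets, record input vs. output distortion, integrate."""
    in_dist, out_dist = [], []
    target_rec = reconstruct(x_target)
    for eps in budgets:
        x_adv = attack(x, x_target, eps)                        # adversarial input
        in_dist.append(np.linalg.norm(x_adv - x))               # how far the input moved
        out_dist.append(np.linalg.norm(reconstruct(x_adv) - target_rec))
    # Trapezoidal rule over the (input distortion, output distortion) points;
    # the resulting area summarizes the whole attack trade-off curve.
    area = 0.0
    for i in range(1, len(budgets)):
        area += 0.5 * (out_dist[i] + out_dist[i - 1]) * (in_dist[i] - in_dist[i - 1])
    return area
```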

    Towards a Theoretical Understanding of the Robustness of Variational Autoencoders

    We make inroads into understanding the robustness of Variational Autoencoders (VAEs) to adversarial attacks and other input perturbations. While previous work has developed algorithmic approaches to attacking and defending VAEs, there remains a lack of formalization for what it means for a VAE to be robust. To address this, we develop a novel criterion for robustness in probabilistic models: r-robustness. We then use this to construct the first theoretical results for the robustness of VAEs, deriving margins in the input space for which we can provide guarantees about the resulting reconstruction. Informally, we are able to define a region within which any perturbation will produce a reconstruction that is similar to the original reconstruction. To support our analysis, we show that VAEs trained using disentangling methods not only score well under our robustness metrics, but that the reasons for this can be interpreted through our theoretical results.
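    The abstract's informal statement, that perturbations inside a margin keep reconstructions close to the original one, suggests a probabilistic criterion that can at least be checked empirically. The sketch below estimates, by Monte Carlo over the VAE's stochastic encode-decode pass (here a hypothetical `reconstruct` callable), how often the reconstruction of a perturbed input stays within radius r of a clean reconstruction; it is meant only to illustrate the flavour of r-robustness, not to reproduce the paper's formal definition:

```python
# Hypothetical Monte Carlo check inspired by a probabilistic robustness
# criterion: how often does the reconstruction of x + delta stay within
# distance r of the reconstruction of x? `reconstruct` is assumed to be a
# stochastic encode-decode pass (a fresh latent sample on each call).
import torch

def empirical_r_robustness(reconstruct, x, delta, r, n_samples=100):
    """Estimate P(||reconstruct(x + delta) - reconstruct(x)|| <= r)."""
    hits = 0
    for _ in range(n_samples):
        rec_clean = reconstruct(x)           # stochastic reconstruction of x
        rec_pert = reconstruct(x + delta)    # stochastic reconstruction of x + delta
        if torch.norm(rec_pert - rec_clean) <= r:
            hits += 1
    return hits / n_samples                  # a high estimate suggests robustness at radius r
```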

    Adversarial robustness of amortized Bayesian inference

    Bayesian inference usually requires running potentially costly inference procedures separately for every new observation. In contrast, the idea of amortized Bayesian inference is to initially invest computational cost in training an inference network on simulated data, which can subsequently be used to rapidly perform inference (i.e., to return estimates of posterior distributions) for new observations. This approach has been applied to many real-world models in the sciences and engineering, but it is unclear how robust the approach is to adversarial perturbations in the observed data. Here, we study the adversarial robustness of amortized Bayesian inference, focusing on simulation-based estimation of multi-dimensional posterior distributions. We show that almost unrecognizable, targeted perturbations of the observations can lead to drastic changes in the predicted posterior and highly unrealistic posterior predictive samples, across several benchmark tasks and a real-world example from neuroscience. We propose a computationally efficient regularization scheme based on penalizing the Fisher information of the conditional density estimator, and show how it improves the adversarial robustness of amortized Bayesian inference.
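    The regularization mentioned at the end penalizes the Fisher information of the conditional density estimator with respect to the observation, so that small input perturbations cannot move the predicted posterior much. A minimal sketch of one way such a penalty could be estimated is given below, assuming a hypothetical PyTorch module `q` with `sample(x)` and `log_prob(theta, x)` methods; the single-sample Monte Carlo trace estimate is an illustrative choice, not necessarily the paper's exact scheme:

```python
# Hypothetical sketch: Monte Carlo estimate of the trace of the Fisher
# information of q(theta | x) with respect to the observation x, used as a
# regularizer added to the usual amortized-inference training loss.
import torch

def fisher_trace_penalty(q, x, n_samples=1):
    """tr F(x) ≈ E_{theta ~ q(.|x)} ||d/dx log q(theta | x)||^2 (MC estimate)."""
    x = x.clone().requires_grad_(True)
    total = 0.0
    for _ in range(n_samples):
        theta = q.sample(x).detach()                 # draw from the amortized posterior
        logq = q.log_prob(theta, x)                  # log q(theta | x)
        (grad_x,) = torch.autograd.grad(logq.sum(), x, create_graph=True)
        total = total + grad_x.pow(2).sum()          # squared gradient norm w.r.t. x
    return total / n_samples

# Training sketch (assumed names): loss = nll + lam * fisher_trace_penalty(q, x_batch)
```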