The Devil's Advocate: Shattering the Illusion of Unexploitable Data using Diffusion Models
Protecting personal data against exploitation of machine learning models is
crucial. Recently, availability attacks have shown great promise to provide an
extra layer of protection against the unauthorized use of data to train neural
networks. These methods aim to add imperceptible noise to clean data so that
the neural networks cannot extract meaningful patterns from the protected data,
claiming that they can make personal data "unexploitable." This paper provides
a strong countermeasure against such approaches, showing that unexploitable
data might only be an illusion. In particular, we leverage the power of
diffusion models and show that a carefully designed denoising process can
counteract the effectiveness of the data-protecting perturbations. We
rigorously analyze our algorithm, and theoretically prove that the amount of
required denoising is directly related to the magnitude of the data-protecting
perturbations. Our approach, called AVATAR, delivers state-of-the-art
performance against a suite of recent availability attacks in various
scenarios, outperforming adversarial training even under distribution mismatch
between the diffusion model and the protected data. Our findings call for more
research into making personal data unexploitable, showing that this goal is far
from being achieved. Our implementation is available at this repository:
https://github.com/hmdolatabadi/AVATAR.

Comment: Accepted to the 2024 IEEE Conference on Secure and Trustworthy
Machine Learning (SaTML).
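The core idea of diffusion-based purification, diffusing the protected images forward and then estimating the clean signal back, can be sketched in a few lines. This is a toy illustration of the general diffuse-then-denoise recipe, not the AVATAR implementation: `toy` stand-ins replace a trained diffusion model's noise predictor, and the schedule value is illustrative.

```python
import numpy as np

def purify(x, eps_model, alpha_bar, rng):
    """One-shot diffusion purification: diffuse x to a noise level given by
    alpha_bar, then estimate the clean signal from the model's noise
    prediction eps_model(x_t)."""
    eps = rng.standard_normal(x.shape)
    # Forward diffusion: x_t = sqrt(abar) * x + sqrt(1 - abar) * eps
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    # Denoising estimate: x0_hat = (x_t - sqrt(1 - abar) * eps_hat) / sqrt(abar)
    eps_hat = eps_model(x_t)
    return (x_t - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)

rng = np.random.default_rng(0)
x_protected = rng.standard_normal((4, 8))       # stand-in for perturbed images
zero_predictor = lambda z: np.zeros_like(z)     # toy stand-in for a trained model
x_purified = purify(x_protected, zero_predictor, alpha_bar=0.5, rng=rng)
```

Intuitively, a larger data-protecting perturbation calls for a smaller `alpha_bar` (more diffusion) before the perturbation is washed out, mirroring the paper's claim that the required amount of denoising scales with the perturbation magnitude.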
Adversarial Coreset Selection for Efficient Robust Training
Neural networks are vulnerable to adversarial attacks: adding well-crafted,
imperceptible perturbations to their input can modify their output. Adversarial
training is one of the most effective approaches to training robust models
against such attacks. Unfortunately, this method is much slower than vanilla
training of neural networks since it needs to construct adversarial examples
for the entire training data at every iteration. By leveraging the theory of
coreset selection, we show how selecting a small subset of training data
provides a principled approach to reducing the time complexity of robust
training. To this end, we first provide convergence guarantees for adversarial
coreset selection. In particular, we show that the convergence bound is
directly related to how well our coresets can approximate the gradient computed
over the entire training data. Motivated by our theoretical analysis, we
propose using this gradient approximation error as our adversarial coreset
selection objective to reduce the training set size effectively. Once built, we
run adversarial training over this subset of the training data. Unlike existing
methods, our approach can be adapted to a wide variety of training objectives,
including TRADES, ℓp-PGD, and Perceptual Adversarial Training. We conduct
extensive experiments to demonstrate that our approach speeds up adversarial
training by 2-3 times while experiencing a slight degradation in the clean and
robust accuracy.

Comment: Accepted to the International Journal of Computer Vision (IJCV).
Extended version of the ECCV 2022 paper arXiv:2112.00378. arXiv admin note:
substantial text overlap with arXiv:2112.0037
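The selection objective described above, choosing a subset whose averaged gradient approximates the gradient over the full training data, can be illustrated with a toy greedy procedure on synthetic per-example gradients. This is a schematic sketch of the objective, not the paper's algorithm; the name `greedy_coreset` and the unweighted greedy rule are ours.

```python
import numpy as np

def greedy_coreset(grads, k):
    """Greedily pick k examples whose mean gradient best matches the
    full-data mean gradient (the gradient approximation error objective)."""
    target = grads.mean(axis=0)              # gradient over the full data
    chosen = []
    for _ in range(k):
        best, best_err = None, np.inf
        for i in range(len(grads)):
            if i in chosen:
                continue
            cand = grads[chosen + [i]].mean(axis=0)
            err = np.linalg.norm(target - cand)   # gradient approximation error
            if err < best_err:
                best, best_err = i, err
        chosen.append(best)
    return chosen, best_err

rng = np.random.default_rng(0)
grads = rng.standard_normal((50, 10))        # stand-in per-example gradients
subset, err = greedy_coreset(grads, k=10)
```

Adversarial training would then run only on the selected subset (refreshed periodically), which is where the reported 2-3x speedup comes from.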
Modeling heterogeneous variance–covariance components in two-level models
Applications of multilevel models to continuous outcomes nearly always assume constant residual variance and constant random effects variances and covariances. However, modeling heterogeneity of variance can prove a useful indicator of model misspecification, and in some educational and behavioral studies it may even be of direct substantive interest. The purpose of this article is to review, describe, and illustrate a set of recent extensions to two-level models that allow the residual and random effects variance–covariance components to be specified as functions of predictors. These predictors can then be entered with random coefficients to allow the Level-1 heteroscedastic relationships to vary across Level-2 units. We demonstrate by simulation that ignoring Level-2 variability in residual variances leads the Level-1 variance function regression coefficients to be estimated with spurious precision. We discuss software options for fitting these extensions, and we illustrate them by reanalyzing the classic High School and Beyond data and two-level school effects models presented by Raudenbush and Bryk.
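The idea of specifying the residual variance as a function of predictors can be sketched with a toy simulation: data are generated with a log-linear variance function, and the variance-function slope is recovered by regressing log squared residuals on the predictor. This is a minimal numpy sketch assuming a single predictor and a known (zero) mean structure; it is not the multilevel software the article discusses, and the coefficient names `g0`, `g1` are ours.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20000
x = rng.standard_normal(n)

# Heteroscedastic residuals with a log-linear variance function:
# log sigma^2 = g0 + g1 * x
g0, g1 = 0.5, 0.8
sigma = np.exp(0.5 * (g0 + g1 * x))
e = sigma * rng.standard_normal(n)

# Recover the variance-function slope by OLS of log(e^2) on x.
# (The intercept is biased downward by E[log chi^2_1] ~ -1.27,
#  but the slope g1 is estimated consistently.)
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, np.log(e**2), rcond=None)
slope = coef[1]
```

A full two-level treatment would additionally let `g0` and `g1` vary randomly across Level-2 units, which is exactly the extension whose omission the article shows produces spuriously precise Level-1 variance-function coefficients.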
Moving shape dynamics: A signal processing perspective
This paper provides a new perspective on human motion analysis, namely regarding human motions in video as general discrete time signals. While this seems an intuitive idea, research on human motion analysis has attracted little attention from the signal processing community. Sophisticated signal processing techniques create important opportunities for new solutions to the problem of human motion analysis. This paper investigates how the deformations of human silhouettes (or shapes) during articulated motion can be used as discriminating features to implicitly capture motion dynamics. In particular, we demonstrate the applicability of two widely used signal transform methods, namely the Discrete Fourier Transform (DFT) and Discrete Wavelet Transform (DWT), for characterization and recognition of human motion sequences. Experimental results show the effectiveness of the proposed method on two state-of-the-art data sets.
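The DFT-based characterization can be illustrated on a toy one-dimensional shape signal (say, silhouette width per frame): low-frequency magnitude coefficients give a compact descriptor that is invariant to circular time shifts, so sequences need not start at the same gait phase. This is a sketch of our own with numpy, not the paper's code; a DWT variant would follow the same pattern using a wavelet library such as PyWavelets.

```python
import numpy as np

def dft_features(signal, n_coeffs=8):
    """Low-frequency DFT magnitudes of a per-frame shape signal.
    Magnitudes discard phase, so the descriptor is invariant to
    circular shifts of the sequence (misaligned starting frames)."""
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum /= (np.linalg.norm(spectrum) + 1e-12)   # scale normalization
    return spectrum[:n_coeffs]

t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
walk = 1.0 + 0.5 * np.sin(2 * t)      # toy periodic silhouette-width signal
feat = dft_features(walk)
```

Recognition then reduces to comparing feature vectors, e.g. nearest-neighbor matching under Euclidean distance between descriptors of query and gallery sequences.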