111 research outputs found
Dataset Distillation with Convexified Implicit Gradients
We propose a new dataset distillation algorithm based on reparameterization and
convexification of implicit gradients (RCIG) that substantially improves the
state-of-the-art. To this end, we first formulate dataset distillation as a
bi-level optimization problem. Then, we show how implicit gradients can be
effectively used to compute meta-gradient updates. We further equip the
algorithm with a convexified approximation that corresponds to learning on top
of a frozen finite-width neural tangent kernel. Finally, we mitigate the bias
in the implicit gradients by parameterizing the neural network so that its
final-layer parameters can be computed analytically given the body parameters. RCIG
establishes the new state-of-the-art on a diverse series of dataset
distillation tasks. Notably, with one image per class, on resized ImageNet,
RCIG sees on average a 108% improvement over the previous state-of-the-art
distillation algorithm. Similarly, we observe a 66% gain over SOTA on
Tiny-ImageNet and 37% on CIFAR-100.
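The convexified inner problem described above can be illustrated with a minimal sketch. Assuming a linear model with a ridge-regularized (hence convex) inner objective, the inner training problem has a closed-form solution, which plays the role of analytically solving the final layer on top of frozen features; the meta-gradient with respect to the distilled inputs is approximated here by finite differences as a simple stand-in for implicit gradients. All names and constants below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Real task: linear regression data (a stand-in for the full learning problem).
X_real = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y_real = X_real @ w_true

lam = 0.1           # ridge regularizer: makes the inner problem convex
n_distill = 3       # the distilled set is far smaller than the real set

# Distilled inputs are the meta-parameters; labels are fixed for simplicity.
X_s = rng.normal(size=(n_distill, 5))
y_s = X_s @ w_true + rng.normal(scale=0.1, size=n_distill)

def inner_solution(X_s):
    # Closed-form ridge solution: the analogue of analytically computing
    # final-layer parameters given a frozen feature map.
    return np.linalg.solve(X_s.T @ X_s + lam * np.eye(5), X_s.T @ y_s)

def outer_loss(X_s):
    # Outer objective: performance of the inner solution on the real data.
    w = inner_solution(X_s)
    return np.mean((X_real @ w - y_real) ** 2)

def meta_grad(X_s, eps=1e-5):
    # Finite-difference meta-gradient (stand-in for implicit gradients).
    g = np.zeros_like(X_s)
    for i in range(X_s.shape[0]):
        for j in range(X_s.shape[1]):
            Xp = X_s.copy(); Xp[i, j] += eps
            Xm = X_s.copy(); Xm[i, j] -= eps
            g[i, j] = (outer_loss(Xp) - outer_loss(Xm)) / (2 * eps)
    return g

loss0 = outer_loss(X_s)
for _ in range(150):
    X_s -= 0.01 * meta_grad(X_s)   # meta-gradient descent on the distilled inputs
print(outer_loss(X_s) < loss0)     # the distilled set improves the outer loss
```

Because the inner problem is solved in closed form, no gradients need to be backpropagated through an unrolled training loop, which is the practical appeal of the convexified formulation.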
Kernel Graph Convolutional Neural Networks
Graph kernels have been successfully applied to many graph classification
problems. Typically, a kernel is first designed, and then an SVM classifier is
trained based on the features defined implicitly by this kernel. This two-stage
approach decouples data representation from learning, which is suboptimal. On
the other hand, Convolutional Neural Networks (CNNs) have the capability to
learn their own features directly from the raw data during training.
Unfortunately, they cannot handle irregular data such as graphs. We address
this challenge by using graph kernels to embed meaningful local neighborhoods
of the graphs in a continuous vector space. A set of filters is then convolved
with these patches, and the pooled output is passed to a feedforward
network. With limited parameter tuning, our approach outperforms strong
baselines on 7 out of 10 benchmark datasets. Comment: Accepted at ICANN '1
A Differentially Private Framework for Deep Learning with Convexified Loss Functions
Differential privacy (DP) has been applied in deep learning for preserving
privacy of the underlying training sets. Existing DP practice falls into three
categories: objective perturbation, gradient perturbation, and output
perturbation. They suffer from three main problems. First, conditions on
objective functions limit objective perturbation in general deep learning
tasks. Second, gradient perturbation does not achieve a satisfactory
privacy-utility trade-off due to over-injected noise in each epoch. Third, the
output perturbation method does not guarantee high utility because its noise
scale is set by a loose upper bound on the global sensitivity of the trained
model parameters. To address these problems, we analyse a tighter
upper bound on the global sensitivity of the model parameters. Based on this
bound, we propose a novel output perturbation framework for the black-box
setting that controls the overall noise injection by injecting DP noise into a
randomly sampled neuron (via the exponential mechanism) at the
output layer of a baseline non-private neural network trained with a
convexified loss function. We empirically compare the privacy-utility
trade-off, measured by the accuracy loss relative to baseline non-private
models and the privacy leakage under black-box membership inference (MI)
attacks, between
our framework and the open-source differentially private stochastic gradient
descent (DP-SGD) approaches on six commonly used real-world datasets. The
experimental evaluations show that, when the baseline models have observable
privacy leakage under MI attacks, our framework achieves a better
privacy-utility trade-off than existing DP-SGD implementations, given an
overall privacy budget for a large number of queries. Comment: This paper has
been accepted by IEEE Transactions on Information Forensics & Security. Early
access will be available soon on IEEE Xplore.