    Privacy-Preserving Image Sharing via Sparsifying Layers on Convolutional Groups

    We propose a practical framework for privacy-aware image sharing in large-scale setups. We argue that, while compactness is always desirable at scale, the need becomes more severe when the privacy-sensitive content must also be protected. We therefore encode images such that, on the one hand, the representations can be stored in the public domain without the heavy cost of privacy protection: they are ambiguated and leak no discernible content unless the attacker mounts a combinatorially expensive guessing attack. On the other hand, authorized users are provided with very compact keys that can easily be kept secure and used to disambiguate and faithfully reconstruct the corresponding access-granted images. We achieve this with a convolutional autoencoder of our design, in which feature maps are passed independently through sparsifying transformations, yielding multiple compact codes, each responsible for reconstructing different attributes of the image. The framework is tested on a large-scale image database, and a public implementation is available. Comment: Accepted as an oral presentation for ICASSP 202
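    A minimal sketch of the idea as described above, not the authors' code: each encoder feature map is sparsified independently with a top-k transform, so the retained indices and values of each map form one compact code. The layer sizes, the value of k, and the choice of top-k as the sparsifying transform are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    class SparsifyingEncoder(nn.Module):
        """Convolutional encoder with a per-feature-map sparsifying transform."""
        def __init__(self, num_maps=8, k=16):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, num_maps, kernel_size=4, stride=2, padding=1),
                nn.ReLU(),
            )
            self.k = k

        def forward(self, x):
            maps = self.conv(x)                    # (B, num_maps, H, W)
            flat = maps.flatten(2)                 # one vector per feature map
            # Keep only the k largest activations in each map; the retained
            # (index, value) pairs act as a compact code per map.
            vals, idx = flat.topk(self.k, dim=2)
            sparse = torch.zeros_like(flat).scatter(2, idx, vals)
            return sparse.view_as(maps), (idx, vals)

    enc = SparsifyingEncoder()
    ambiguated_maps, codes = enc(torch.randn(1, 3, 64, 64))
    ```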

    Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

    The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, and sometimes even better than, the original dense networks. Sparsity promises to reduce the memory footprint of regular networks so they fit on mobile devices, as well as to shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial on sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation and the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned-parameter efficiency that could serve as a baseline for comparing different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
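    A minimal sketch of one family of methods the survey covers, global magnitude pruning: remove the smallest-magnitude weights across the whole model and keep the resulting masks for fine-tuning. The 90% sparsity target and the layer shapes are illustrative assumptions, not taken from the paper.

    ```python
    import torch
    import torch.nn as nn

    def magnitude_prune(model: nn.Module, sparsity: float = 0.9):
        """Zero out the smallest-magnitude weights across all weight matrices."""
        weights = torch.cat([p.detach().abs().flatten()
                             for p in model.parameters() if p.dim() > 1])
        threshold = torch.quantile(weights, sparsity)
        masks = {}
        for name, p in model.named_parameters():
            if p.dim() > 1:                      # skip biases / norm parameters
                masks[name] = (p.detach().abs() > threshold).float()
                p.data.mul_(masks[name])         # apply the mask in place
        return masks                             # reuse the masks during fine-tuning

    model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
    masks = magnitude_prune(model, sparsity=0.9)
    ```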

    Hardware-Friendly Model Compression Techniques for Deep Learning Accelerators

    The objective of the proposed research is to make energy-efficient DNN accelerators deployable on edge devices by developing hardware-aware DNN compression methods. The rising popularity of intelligent mobile devices and the computational cost of deep-learning models call for efficient and accurate on-device inference schemes. In particular, we propose four compression techniques for energy- and memory-efficient DNN computing. In the first method, LGPS, we present a hardware-aware pruning method in which the locations of non-zero weights are derived in real time from a linear-feedback shift register (LFSR). Using the proposed method, we demonstrate total energy and area savings of up to 63.96% and 64.23% for the VGG-16 network on down-sampled ImageNet, for iso-compression-rate and iso-accuracy respectively. Secondly, we achieve ultra-low bit-precision deep learning models by developing a quantization scheme based on knowledge distillation and gradual quantization of the pruned network. Thirdly, we propose a novel model compression scheme that allows inference to be carried out using bit-level sparsity, which can be efficiently implemented using in-memory computing macros. We introduce a method called BitS-Net that leverages bit-sparsity (more zeros than ones in the binary representation of weight/activation values) in Compute-In-Memory (CIM) architectures with Resistive Random Access Memory (RRAM) to develop energy-efficient DNN accelerators operating in the inference mode. We demonstrate that BitS-Net improves energy efficiency by up to 5x for ResNet models on the ImageNet dataset. In the last part, to achieve highly energy-efficient DNNs, we introduce a novel twofold sparsity method (Twofold Sparsity, twofoldS-Net) that sparsifies DNN models at the bit and network levels simultaneously. We add two separate regularizations to the loss function in order to achieve bit- and network-level sparsity at the same time. We sparsify the model at the network level with a mask generated by an LFSR; for bit-level sparsity, we quantize the network to an 8-bit two's-complement representation. During inference we take advantage of the CIM architecture and LFSR indexing. We show that the proposed method sparsifies the network and yields a highly energy-efficient deep learning accelerator, eventually bringing AI into our daily lives. Ph.D.
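    A minimal sketch, not the thesis code, of the two ingredients described above: an LFSR that regenerates the locations of retained weights on the fly (so no index storage is needed), and a bit-sparsity measure for 8-bit two's-complement values. The 16-bit Fibonacci LFSR taps, the 50% keep ratio, and the array sizes are illustrative assumptions.

    ```python
    import numpy as np

    def lfsr_mask(num_weights, keep_ratio=0.5, seed=0xACE1):
        """16-bit Fibonacci LFSR (taps 16, 14, 13, 11) used as a cheap index generator."""
        state, mask, kept = seed, np.zeros(num_weights, dtype=bool), 0
        while kept < int(keep_ratio * num_weights):
            bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            idx = state % num_weights           # location of a retained weight
            if not mask[idx]:
                mask[idx] = True
                kept += 1
        return mask                             # hardware can regenerate this mask

    def bit_sparsity(weights_int8):
        """Fraction of zero bits in the 8-bit two's-complement representation."""
        bits = np.unpackbits(weights_int8.astype(np.int8).view(np.uint8))
        return 1.0 - bits.mean()

    w = np.random.randint(-128, 128, size=1024)
    w_pruned = w * lfsr_mask(w.size)            # network-level sparsity via LFSR
    print(bit_sparsity(w_pruned))               # bit-level sparsity of the result
    ```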

    Visual privacy attacks and defenses in deep learning: a survey
