126 research outputs found
Exploring the Optimal Learning Technique for IBM Neurosynaptic System to Overcome Quantization Loss
Inspired by the fact that the human brain is far more efficient than any of today's computers, neuromorphic computing aims to approach the brain's ability to process huge amounts of data in an extremely short time. On the hardware side, neuromorphic computing also extends to systems that facilitate the computation of neural network and machine learning algorithms. IBM's Neurosynaptic System is one of the best-known recent projects dedicated to energy-efficient neural network applications. However, one of the known issues in the TrueNorth design is the limited precision of synaptic weights, each of which can be selected from only four integers. To improve computation accuracy and reduce the incurred hardware cost, in this work we investigate seven different regularization functions in the cost function of the learning process on the TrueNorth platform. Our experimental results show that the proposed techniques considerably improve the computation accuracy of the TrueNorth platform and reduce the incurred hardware and performance overheads.
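As an illustration of the general idea, here is a minimal sketch of adding a quantization-aware regularization term to a training loss, assuming a PyTorch-style setup; the four-level weight alphabet, toy model, and regularization strength are assumptions for illustration and do not reproduce the paper's seven regularization functions.

```python
# Hypothetical sketch: penalize weights that stray from a small set of allowed
# synaptic levels (here {-2, -1, 0, 1}); not the paper's exact formulation.
import torch
import torch.nn as nn

ALLOWED = torch.tensor([-2.0, -1.0, 0.0, 1.0])  # assumed 4-level weight alphabet

def quantization_penalty(model: nn.Module) -> torch.Tensor:
    """Sum of squared distances from each weight to its nearest allowed level."""
    penalty = torch.zeros(())
    for p in model.parameters():
        if p.dim() < 2:  # skip biases
            continue
        # distance to the closest allowed level, per weight
        d = (p.unsqueeze(-1) - ALLOWED.to(p.device)).abs().min(dim=-1).values
        penalty = penalty + (d ** 2).sum()
    return penalty

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
lam = 1e-4  # regularization strength (assumed)

x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
loss = criterion(model(x), y) + lam * quantization_penalty(model)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```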
ZiCo-BC: A Bias Corrected Zero-Shot NAS for Vision Tasks
Zero-Shot Neural Architecture Search (NAS) approaches propose novel
training-free metrics called zero-shot proxies to substantially reduce the
search time compared to the traditional training-based NAS. Despite the success
on image classification, the effectiveness of zero-shot proxies is rarely
evaluated on complex vision tasks such as semantic segmentation and object
detection. Moreover, existing zero-shot proxies are shown to be biased towards
certain model characteristics, which restricts their broad applicability. In
this paper, we empirically study the bias of state-of-the-art (SOTA) zero-shot
proxy ZiCo across multiple vision tasks and observe that ZiCo is biased towards
thinner and deeper networks, leading to sub-optimal architectures. To solve the
problem, we propose a novel bias correction on ZiCo, called ZiCo-BC. Our
extensive experiments across various vision tasks (image classification, object
detection and semantic segmentation) show that our approach can successfully
search for architectures with higher accuracy and significantly lower latency
on Samsung Galaxy S10 devices. Comment: Accepted at ICCV-Workshop on Resource-Efficient Deep Learning, 202
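As a hedged illustration of how a bias correction on a zero-shot proxy might look, the sketch below rescores candidate architectures by discounting depth and rewarding width on top of a generic proxy value; the correction form, coefficients, and candidate statistics are assumptions, not the actual ZiCo-BC formula.

```python
# Hypothetical sketch: a generic training-free proxy score is adjusted by depth
# and width statistics so the search does not systematically favor thinner,
# deeper candidates. Illustrative only.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    proxy_score: float   # raw zero-shot proxy value (e.g., a ZiCo-like statistic)
    depth: int           # number of blocks/layers
    avg_width: float     # average channel count

def bias_corrected_score(c: Candidate, alpha: float = 0.05, beta: float = 0.05) -> float:
    # assumed correction: discount very deep networks, slightly reward wider ones
    return c.proxy_score - alpha * c.depth + beta * c.avg_width

candidates = [
    Candidate("deep_thin", proxy_score=10.2, depth=40, avg_width=32),
    Candidate("shallow_wide", proxy_score=9.8, depth=18, avg_width=96),
]
best = max(candidates, key=bias_corrected_score)
print(best.name)  # the corrected score can change the ranking relative to the raw proxy
```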
MAT: A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks
Some recent works revealed that deep neural networks (DNNs) are vulnerable to
so-called adversarial attacks where input examples are intentionally perturbed
to fool DNNs. In this work, we revisit the DNN training process that includes
adversarial examples into the training dataset so as to improve the DNNs'
resilience to adversarial attacks, a process known as adversarial training. Our
experiments show that different adversarial strengths, i.e., perturbation
levels of adversarial examples, have different working zones to resist the
attack. Based on the observation, we propose a multi-strength adversarial
training method (MAT) that combines the adversarial training examples with
different adversarial strengths to defend against adversarial attacks. Two training
structures - mixed MAT and parallel MAT - are developed to facilitate the
tradeoffs between training time and memory occupation. Our results show that
MAT can substantially reduce the accuracy degradation of deep learning
systems under adversarial attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN. Comment: 6 pages, 4 figures, 2 tables
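A minimal sketch of the "mixed" multi-strength idea, assuming FGSM as the attack and a toy classifier: each batch is augmented with adversarial examples crafted at several perturbation strengths, and the model trains on the union. The strengths, attack, and model below are illustrative assumptions.

```python
# Hypothetical sketch of mixed multi-strength adversarial training with FGSM.
import torch
import torch.nn as nn

def fgsm(model, x, y, eps, loss_fn):
    """One-step FGSM perturbation at strength eps."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.05)
strengths = [0.05, 0.1, 0.2]  # assumed set of adversarial strengths

x, y = torch.rand(64, 1, 28, 28), torch.randint(0, 10, (64,))
batches = [x] + [fgsm(model, x, y, eps, loss_fn) for eps in strengths]
x_mixed = torch.cat(batches)   # clean + multi-strength adversarial examples
y_mixed = y.repeat(len(batches))

opt.zero_grad()
loss_fn(model(x_mixed), y_mixed).backward()
opt.step()
```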
DONNAv2 -- Lightweight Neural Architecture Search for Vision tasks
With the growing demand for vision applications and deployment across edge
devices, the development of hardware-friendly architectures that maintain
performance during device deployment becomes crucial. Neural architecture
search (NAS) techniques explore various approaches to discover efficient
architectures for diverse learning tasks in a computationally efficient manner.
In this paper, we present the next-generation neural architecture design for
computationally efficient neural architecture distillation, DONNAv2.
Conventional NAS algorithms rely on a computationally expensive stage in which an
accuracy predictor is learned to estimate model performance within the search
space. This accuracy predictor lets them estimate the performance of models
that have not been fine-tuned. Here, we have developed an elegant
approach to eliminate building the accuracy predictor and extend DONNA to a
computationally efficient setting. The loss metric of individual blocks forming
the network serves as the surrogate performance measure for the sampled models
in the NAS search stage. To validate the performance of DONNAv2, we have
performed extensive experiments involving a range of diverse vision tasks
including classification, object detection, image denoising, super-resolution,
and panoptic perception network (YOLOP). The hardware-in-the-loop experiments
were carried out using the Samsung Galaxy S10 mobile platform. Notably, DONNAv2
reduces the computational cost of DONNA by 10x for the larger datasets.
Furthermore, to improve the quality of NAS search space, DONNAv2 leverages a
block knowledge distillation filter to remove blocks with high inference costs. Comment: Accepted at ICCV-Workshop on Resource-Efficient Deep Learning, 202
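A minimal sketch of using a per-block distillation loss as the surrogate performance measure and filtering out candidate blocks whose inference cost exceeds a budget; the blocks, latency numbers, and MSE-based loss are assumptions for illustration, not DONNAv2's actual blocks or measurements.

```python
# Hypothetical sketch: rank candidate blocks by how well they reproduce a
# reference block's features (surrogate loss), after dropping costly blocks.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 64, 32, 32)                      # assumed feature map entering the block

reference_block = nn.Conv2d(64, 64, 3, padding=1)   # stands in for the teacher block
candidates = {
    "conv3x3": nn.Conv2d(64, 64, 3, padding=1),
    "conv1x1": nn.Conv2d(64, 64, 1),
    "depthwise": nn.Conv2d(64, 64, 3, padding=1, groups=64),
}
latency_ms = {"conv3x3": 4.1, "conv1x1": 1.3, "depthwise": 0.9}  # assumed on-device timings

with torch.no_grad():
    target = reference_block(x)
    block_loss = {n: nn.functional.mse_loss(b(x), target).item()
                  for n, b in candidates.items()}

# keep only blocks under the latency budget, then rank by the surrogate (block loss)
budget = 2.0
kept = {n: l for n, l in block_loss.items() if latency_ms[n] <= budget}
ranking = sorted(kept, key=kept.get)
print(ranking)  # lower block loss = better surrogate score among affordable blocks
```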
AutoShrink: A Topology-aware NAS for Discovering Efficient Neural Architecture
Resources are an important constraint when deploying Deep Neural Networks
(DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based
search approach, which limits the flexibility of network patterns in learned
cell structures. Moreover, due to the topology-agnostic nature of existing
works, including both cell-based and node-based approaches, the search process
is time-consuming and the performance of the found architectures may be sub-optimal.
To address these problems, we propose AutoShrink, a topology-aware Neural
Architecture Search (NAS) for searching efficient building blocks of neural
architectures. Our method is node-based and thus can learn flexible network
patterns in cell structures within a topological search space. Directed Acyclic
Graphs (DAGs) are used to abstract DNN architectures and progressively optimize
the cell structure through edge shrinking. Because the search space
intrinsically shrinks as edges are progressively removed, AutoShrink explores a
more flexible search space in even less search time. We evaluate AutoShrink on
image classification and language tasks by crafting ShrinkCNN and ShrinkRNN
models. ShrinkCNN is able to achieve up to 48% parameter reduction and save 34%
Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to that of
state-of-the-art (SOTA) models. Specifically, both ShrinkCNN and ShrinkRNN are
crafted within 1.5 GPU hours, which is 7.2x and 6.7x faster than the crafting
time of SOTA CNN and RNN models, respectively.
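A minimal sketch of progressive edge shrinking on a DAG-encoded cell, assuming a placeholder scoring function: starting from a dense DAG, the edge whose removal hurts the score least is dropped repeatedly, so the remaining search space shrinks as the search proceeds. The path-count-based objective is only a stand-in for a real accuracy/cost evaluator.

```python
# Hypothetical sketch of edge shrinking on a dense DAG cell.
import itertools

nodes = list(range(5))  # node 0 = input, node 4 = output
edges = {(i, j) for i, j in itertools.combinations(nodes, 2)}  # dense DAG: i -> j for i < j

def num_paths(edge_set, src=0, dst=4):
    """Count distinct src->dst paths in the DAG defined by edge_set."""
    succ = {n: [j for (i, j) in edge_set if i == n] for n in nodes}
    def count(n):
        if n == dst:
            return 1
        return sum(count(m) for m in succ[n])
    return count(src)

def score(edge_set):
    # placeholder objective: keep the cell expressive (many paths) but small (few edges)
    return num_paths(edge_set) - 0.5 * len(edge_set)

target_edges = 6
while len(edges) > target_edges:
    # drop the edge whose removal degrades the score the least
    best_edge = max(edges, key=lambda e: score(edges - {e}))
    edges.remove(best_edge)

print(sorted(edges))
```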
LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning
Distributed learning systems have enabled training large-scale models over
large amounts of data in significantly shorter time. In this paper, we focus on
decentralized distributed deep learning systems and aim to achieve differential
privacy with good convergence rate and low communication cost. To achieve this
goal, we propose a new learning algorithm LEASGD (Leader-Follower Elastic
Averaging Stochastic Gradient Descent), which is driven by a novel
Leader-Follower topology and a differential privacy model. We provide a
theoretical analysis of the convergence rate and the trade-off between the
performance and privacy in the private setting. The experimental results show
that LEASGD outperforms the state-of-the-art decentralized learning algorithm
DPSGD by achieving steadily lower loss within the same number of iterations and
by reducing the communication cost by 30%. In addition, LEASGD spends less
differential privacy budget and achieves higher final accuracy than DPSGD in
the private setting.
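A minimal sketch of a leader-follower elastic-averaging update with Gaussian noise added for privacy, using a toy quadratic objective; the update rule, noise scale, and topology are assumptions for illustration and not the exact LEASGD algorithm or its privacy accounting.

```python
# Hypothetical sketch: followers are elastically pulled toward a leader while
# taking noisy local gradient steps; the leader drifts toward the follower mean.
import numpy as np

rng = np.random.default_rng(0)
dim = 10
leader = rng.normal(size=dim)                  # leader's parameters
followers = [rng.normal(size=dim) for _ in range(3)]

def local_gradient(w):
    """Stand-in local gradient (toy quadratic objective with optimum at 1)."""
    return w - 1.0

lr, rho, sigma = 0.1, 0.3, 0.01                # step size, elastic pull, noise scale (assumed)

for _ in range(100):
    for i, w in enumerate(followers):
        noise = rng.normal(scale=sigma, size=dim)   # Gaussian noise for privacy (assumed)
        elastic = rho * (leader - w)                # pull follower toward the leader
        followers[i] = w - lr * local_gradient(w) + lr * elastic + noise
    # leader takes its own step and moves toward the follower average
    leader = leader - lr * local_gradient(leader) + lr * rho * (np.mean(followers, axis=0) - leader)

print(np.round(leader, 3))
```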
- …