121 research outputs found

Exploring the Optimal Learning Technique for IBM Neurosynaptic System to Overcome Quantization Loss

Author: Cheng Hsin-Pai
Publication date: 13/06/2017

Inspired by the fact that the human brain is far more efficient than any present-day computer, neuromorphic computing aims to approach the brain's ability to process huge amounts of data in an extremely short time. On the hardware side, neuromorphic computing also extends to systems that facilitate the computation of neural networks and machine learning algorithms. The IBM Neurosynaptic System is one of the well-known projects dedicated to energy-efficient neural network applications. However, one known issue in the TrueNorth design is the limited precision of synaptic weights, each of which can be selected from only four integers. To improve computation accuracy and reduce the incurred hardware cost, in this work we investigate seven different regularization functions in the cost function of the learning process on the TrueNorth platform. Our experimental results show that the proposed techniques considerably improve the computation accuracy of the TrueNorth platform and reduce the incurred hardware and performance overheads.
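
The idea of adding a regularization term that favors weights close to a small set of integers can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not list the seven regularization functions or the four integer values, so the level set `LEVELS` and the squared-distance penalty below are hypothetical choices, not the author's method.

```python
# Hypothetical level set; the abstract only says "four integers".
LEVELS = (-2, -1, 1, 2)

def nearest_level(w, levels=LEVELS):
    """Return the allowed level closest to w."""
    return min(levels, key=lambda q: abs(w - q))

def quantization_penalty(weights, levels=LEVELS):
    """Sum of squared distances from each weight to its nearest level."""
    return sum((w - nearest_level(w, levels)) ** 2 for w in weights)

def penalty_gradient(weights, levels=LEVELS):
    """Gradient of the penalty (valid away from midpoints between levels)."""
    return [2.0 * (w - nearest_level(w, levels)) for w in weights]

def regularized_step(weights, task_grad, lr=0.1, lam=1.0, levels=LEVELS):
    """One gradient step on task_loss + lam * quantization_penalty."""
    reg_grad = penalty_gradient(weights, levels)
    return [w - lr * (g + lam * r)
            for w, g, r in zip(weights, task_grad, reg_grad)]

weights = [0.6, -1.4, 1.9]
before = quantization_penalty(weights)
# With a zero task gradient, the step is driven by the penalty alone.
stepped = regularized_step(weights, task_grad=[0.0, 0.0, 0.0])
after = quantization_penalty(stepped)
print(before, after)
```

With a zero task gradient the step moves each weight toward its nearest level, so rounding to the allowed integers after training would discard less information.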

D-Scholarship@Pitt

ZiCo-BC: A Bias Corrected Zero-Shot NAS for Vision Tasks

Author: Bhardwaj Kartikeya
Cheng Hsin-Pai
Li Zhuojin
Priyadarshi Sweta
Publication date: 26/09/2023

Zero-Shot Neural Architecture Search (NAS) approaches propose novel training-free metrics called zero-shot proxies to substantially reduce the search time compared to the traditional training-based NAS. Despite the success on image classification, the effectiveness of zero-shot proxies is rarely evaluated on complex vision tasks such as semantic segmentation and object detection. Moreover, existing zero-shot proxies are shown to be biased towards certain model characteristics, which restricts their broad applicability. In this paper, we empirically study the bias of the state-of-the-art (SOTA) zero-shot proxy ZiCo across multiple vision tasks and observe that ZiCo is biased towards thinner and deeper networks, leading to sub-optimal architectures. To solve the problem, we propose a novel bias correction on ZiCo, called ZiCo-BC. Our extensive experiments across various vision tasks (image classification, object detection and semantic segmentation) show that our approach can successfully search for architectures with higher accuracy and significantly lower latency on Samsung Galaxy S10 devices.
Comment: Accepted at ICCV-Workshop on Resource-Efficient Deep Learning, 202

arXiv.org e-Print Archive

MAT: A Multi-strength Adversarial Training Method to Mitigate Adversarial Attacks

Author: Chen Yiran
Cheng Hsin-Pai
Li Hai
Li Sicheng
Song Chang
Wu Chunpeng
Wu Qing
Yang Huanrui
Publication date: 11/05/2018

Recent works have revealed that deep neural networks (DNNs) are vulnerable to so-called adversarial attacks, where input examples are intentionally perturbed to fool DNNs. In this work, we revisit adversarial training, the DNN training process that adds adversarial examples to the training dataset to improve a DNN's resilience to adversarial attacks. Our experiments show that different adversarial strengths, i.e., perturbation levels of adversarial examples, have different working zones for resisting the attack. Based on this observation, we propose a multi-strength adversarial training method (MAT) that combines adversarial training examples of different adversarial strengths to defend against adversarial attacks. Two training structures, mixed MAT and parallel MAT, are developed to facilitate the tradeoffs between training time and memory occupation. Our results show that MAT can substantially reduce the accuracy degradation of deep learning systems under adversarial attacks on MNIST, CIFAR-10, CIFAR-100, and SVHN.
Comment: 6 pages, 4 figures, 2 table
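
Combining adversarial examples of several strengths into one training set can be sketched on a toy model. The sketch below makes several assumptions not found in the abstract: it uses the fast gradient sign method (FGSM) to generate perturbations, a logistic-regression classifier in place of a DNN, and hypothetical names and values (`mixed_batch`, `strengths`, the toy `data`). It is loosely modeled on the "mixed" structure and is not the authors' implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Perturb x by eps per coordinate in the loss-increasing direction."""
    err = predict(w, b, x) - y            # d(cross-entropy)/d(logit)
    grad_x = [err * wi for wi in w]       # gradient with respect to the input
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad_x)]

def mixed_batch(w, b, batch, strengths):
    """Clean examples plus one adversarial copy per strength."""
    out = list(batch)
    for eps in strengths:
        out.extend((fgsm(w, b, x, y, eps), y) for x, y in batch)
    return out

def sgd_epoch(w, b, batch, lr=0.5):
    """One pass of per-example SGD on the cross-entropy loss."""
    for x, y in batch:
        err = predict(w, b, x) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err
    return w, b

data = [([1.0, 2.0], 1), ([2.0, 1.5], 1), ([-1.0, -2.0], 0), ([-2.0, -1.0], 0)]
strengths = [0.1, 0.3, 0.5]  # hypothetical perturbation levels
w, b = [0.0, 0.0], 0.0
for _ in range(50):
    # Adversarial copies are regenerated against the current model each epoch.
    w, b = sgd_epoch(w, b, mixed_batch(w, b, data, strengths))
```

Each epoch trains on four times as many examples as the clean set, which illustrates the training-time cost that the abstract's two structures trade off against memory.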

arXiv.org e-Print Archive

Crossref

DONNAv2 -- Lightweight Neural Architecture Search for Vision tasks

Author: Cheng Hsin-Pai
Ganapathy Viswanath
Jiang Tianyu
Krishna Sendil
Patel Chirag
Priyadarshi Sweta
Publication date: 26/09/2023

With the growing demand for vision applications and deployment across edge devices, the development of hardware-friendly architectures that maintain performance during device deployment becomes crucial. Neural architecture search (NAS) techniques explore various approaches to discover efficient architectures for diverse learning tasks in a computationally efficient manner. In this paper, we present the next-generation neural architecture design for computationally efficient neural architecture distillation: DONNAv2. Conventional NAS algorithms rely on a computationally extensive stage where an accuracy predictor is learned to estimate model performance within the search space. Building these accuracy predictors lets them predict the performance of models that are not finetuned. Here, we have developed an elegant approach that eliminates building the accuracy predictor and extends DONNA to a computationally efficient setting. The loss metric of the individual blocks forming the network serves as the surrogate performance measure for the sampled models in the NAS search stage. To validate the performance of DONNAv2, we have performed extensive experiments involving a range of diverse vision tasks including classification, object detection, image denoising, super-resolution, and a panoptic perception network (YOLOP). The hardware-in-the-loop experiments were carried out using the Samsung Galaxy S10 mobile platform. Notably, DONNAv2 reduces the computational cost of DONNA by 10x for the larger datasets. Furthermore, to improve the quality of the NAS search space, DONNAv2 leverages a block knowledge distillation filter to remove blocks with high inference costs.
Comment: Accepted at ICCV-Workshop on Resource-Efficient Deep Learning, 202
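
Using per-block losses as a surrogate for model quality, instead of a learned accuracy predictor, can be illustrated with a small ranking example. Everything concrete here is an assumption: the abstract does not say how block losses are aggregated, so summation is a guess, and the table `BLOCK_LOSS`, the stage and block names, and the loss values are invented for illustration.

```python
# Hypothetical per-block losses, measured once per (stage, block choice).
BLOCK_LOSS = {
    ("stage1", "conv3x3"): 0.12,
    ("stage1", "conv5x5"): 0.20,
    ("stage2", "mbconv_e3"): 0.30,
    ("stage2", "mbconv_e6"): 0.15,
}

def surrogate_score(arch):
    """Aggregate block losses for one architecture (summation is assumed)."""
    return sum(BLOCK_LOSS[(stage, choice)] for stage, choice in arch.items())

def best_architecture(candidates):
    """Pick the candidate with the lowest surrogate score; no training needed."""
    return min(candidates, key=surrogate_score)

candidates = [
    {"stage1": "conv3x3", "stage2": "mbconv_e3"},
    {"stage1": "conv5x5", "stage2": "mbconv_e6"},
    {"stage1": "conv3x3", "stage2": "mbconv_e6"},
]
best = best_architecture(candidates)
print(best)
```

Because each block's loss is measured once and reused, scoring a sampled architecture reduces to table lookups, which is the kind of saving that removing the predictor-training stage would give.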

arXiv.org e-Print Archive

AutoShrink: A Topology-aware NAS for Discovering Efficient Neural Architecture

Author: Chen Yiran
Cheng Hsin-Pai
Huang Chengyu
Li Hai
Li Zhenwen
Yan Feng
Zhang Tunhou
Publication date: 20/11/2019

Resources are an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, due to the topology-agnostic nature of existing works, including both cell-based and node-based approaches, the search process is time-consuming and the performance of the found architecture may be sub-optimal. To address these problems, we propose AutoShrink, a topology-aware Neural Architecture Search (NAS) for searching efficient building blocks of neural architectures. Our method is node-based and thus can learn flexible network patterns in cell structures within a topological search space. Directed Acyclic Graphs (DAGs) are used to abstract DNN architectures and progressively optimize the cell structure through edge shrinking. As the search space intrinsically shrinks as the edges are progressively removed, AutoShrink explores a more flexible search space with even less search time. We evaluate AutoShrink on image classification and language tasks by crafting ShrinkCNN and ShrinkRNN models. ShrinkCNN is able to achieve up to 48% parameter reduction and save 34% Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to state-of-the-art (SOTA) models. Specifically, both ShrinkCNN and ShrinkRNN are crafted within 1.5 GPU hours, which is 7.2x and 6.7x faster than the crafting time of SOTA CNN and RNN models, respectively.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning

Author: Cheng Hsin-Pai
Yu Patrick
Hu Haojing
Yan Feng
Li Shiyu
Li Hai
Chen Yiran
Publication date: 27/11/2018

Distributed learning systems have enabled training large-scale models over large amounts of data in significantly shorter time. In this paper, we focus on decentralized distributed deep learning systems and aim to achieve differential privacy with a good convergence rate and low communication cost. To achieve this goal, we propose a new learning algorithm, LEASGD (Leader-Follower Elastic Averaging Stochastic Gradient Descent), which is driven by a novel Leader-Follower topology and a differential privacy model. We provide a theoretical analysis of the convergence rate and the trade-off between performance and privacy in the private setting. The experimental results show that LEASGD outperforms the state-of-the-art decentralized learning algorithm DPSGD by achieving steadily lower loss within the same number of iterations and by reducing the communication cost by 30%. In addition, LEASGD spends less differential privacy budget and achieves higher final accuracy than DPSGD in the private setting.
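
The combination of elastic averaging and noisy communication can be sketched on a one-parameter toy problem. The details are assumptions rather than the LEASGD algorithm itself: the abstract does not specify how leaders are chosen or how noise is calibrated, so the sketch picks the lowest-loss worker as leader and adds Gaussian noise with an arbitrary scale `sigma` that is not tied to any privacy budget. All function names are hypothetical.

```python
import random

def loss(w, target):
    """Toy quadratic loss shared by all workers."""
    return (w - target) ** 2

def local_sgd_step(w, target, lr):
    """One gradient step on the toy loss (w - target)^2."""
    return w - lr * 2.0 * (w - target)

def communicate(workers, target, rho, sigma, rng):
    """Pull every follower toward a noisy copy of the leader's parameter."""
    leader = min(workers, key=lambda w: loss(w, target))  # assumed leader rule
    shared = leader + rng.gauss(0.0, sigma)               # noise added before sharing
    # Elastic update: followers move a fraction rho toward the shared value.
    return [w if w == leader else w + rho * (shared - w) for w in workers]

rng = random.Random(0)
workers = [5.0, -3.0, 8.0, 0.5]
target = 1.0
for _ in range(20):
    workers = [local_sgd_step(w, target, lr=0.1) for w in workers]
    workers = communicate(workers, target, rho=0.3, sigma=0.01, rng=rng)
final_losses = [loss(w, target) for w in workers]
print(final_losses)
```

Only the leader's parameter is shared in each round, so the noise is applied to a single message per round in this toy setup.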

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications