Fast Neural Architecture Construction using EnvelopeNets
Fast Neural Architecture Construction (NAC) is a method to construct deep
network architectures by pruning and expansion of a base network. In recent
years, several automated search methods for neural network architectures have
been proposed using methods such as evolutionary algorithms and reinforcement
learning. These methods use a single scalar objective function (usually
accuracy) that is evaluated after a full training and evaluation cycle. In
contrast, NAC directly compares the utility of different filters using
statistics derived from filter featuremaps, which reach a state where the
utility of different filters within a network can be compared and hence can be
used to construct networks. The number of training epochs needed for filters to
reach this state is much smaller than the number of epochs needed for the
accuracy of a network to stabilize. NAC exploits this finding to construct
convolutional
neural nets (CNNs) with close to state-of-the-art accuracy in less than 1 GPU
day, faster than most current neural architecture search methods. The
constructed networks show close to state-of-the-art performance on the image
classification problem on well-known datasets (CIFAR-10, ImageNet) and
consistently outperform hand-constructed and randomly generated networks of the
same depth, operators, and approximately the same number of parameters.
Comment: A shorter version of this paper appeared in the Workshop on
MetaLearning 2018 (MetaLearning 2018 at NeurIPS 2018).
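As a rough illustration of the idea behind NAC (not the authors' implementation), the sketch below scores each filter of a convolutional layer by a featuremap statistic after only a short warm-up and flags the lowest-scoring filters as pruning candidates; the particular statistic (mean absolute activation) and the pruning fraction are assumptions made here for illustration.

```python
# Illustrative sketch only: rank filters in a conv layer by a featuremap
# statistic collected after a few warm-up epochs, as a proxy for the filter
# "utility" comparison that NAC/EnvelopeNets builds on.
import torch
import torch.nn as nn

def filter_utility(conv: nn.Conv2d, batch: torch.Tensor) -> torch.Tensor:
    """Return one scalar score per output filter of `conv` on `batch`."""
    with torch.no_grad():
        fmaps = conv(batch)                      # (N, C_out, H, W)
        # Assumed statistic: mean absolute activation per filter.
        return fmaps.abs().mean(dim=(0, 2, 3))   # (C_out,)

def low_utility_filters(conv: nn.Conv2d, batch: torch.Tensor, frac: float = 0.25):
    """Indices of the `frac` lowest-scoring filters (candidates for pruning)."""
    scores = filter_utility(conv, batch)
    k = max(1, int(frac * scores.numel()))
    return torch.topk(scores, k, largest=False).indices

# Usage: after a handful of warm-up epochs, prune the weakest filters and
# expand capacity elsewhere, instead of training every candidate network to
# convergence before comparing them.
conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)   # a small CIFAR-10-sized batch
print(low_utility_filters(conv, x))
```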
EENA: Efficient Evolution of Neural Architecture
Recent algorithms for automatic neural architecture search perform remarkably
well but are largely directionless in their exploration of the search space and
computationally expensive in training every intermediate architecture. In this
paper, we propose a method for efficient architecture search called EENA
(Efficient Evolution of Neural Architecture). Thanks to elaborately designed
mutation and crossover operations, the evolution process can be guided by the
information that has already been learned. Therefore, less computational effort
is required, and the search and training time can be reduced significantly. On
CIFAR-10
classification, EENA, using minimal computational resources (0.65 GPU-days),
can design a highly effective neural architecture that achieves 2.56% test
error with 8.47M parameters. Furthermore, the best architecture discovered is
also transferable to CIFAR-100.
Comment: Accepted by the ICCV 2019 Neural Architects Workshop (ICCVW).
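A minimal sketch of the kind of information reuse an evolutionary search like EENA relies on (not the EENA code itself): a "widen" mutation that produces an offspring layer inheriting the parent's learned weights rather than retraining from scratch. The duplication-based initialization of the new filters is an assumption for illustration.

```python
# Illustrative sketch only: a mutation that widens a conv layer while
# inheriting the parent's learned weights, so the offspring does not have to
# be trained from scratch.
import copy
import torch
import torch.nn as nn

def widen_conv(conv: nn.Conv2d, extra: int) -> nn.Conv2d:
    """Return a conv with `extra` more output filters, copying parent weights."""
    child = nn.Conv2d(conv.in_channels, conv.out_channels + extra,
                      conv.kernel_size, conv.stride, conv.padding,
                      bias=conv.bias is not None)
    with torch.no_grad():
        # Inherit all parent filters; duplicate a few of them to fill new slots.
        child.weight[:conv.out_channels] = conv.weight
        dup = torch.randint(0, conv.out_channels, (extra,))
        child.weight[conv.out_channels:] = conv.weight[dup]
        if conv.bias is not None:
            child.bias[:conv.out_channels] = conv.bias
            child.bias[conv.out_channels:] = conv.bias[dup]
    return child

parent = nn.Conv2d(3, 16, 3, padding=1)
child = widen_conv(copy.deepcopy(parent), extra=8)   # mutated offspring, 24 filters
```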
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Neural architecture search (NAS) has a great impact by automatically
designing effective neural network architectures. However, the prohibitive
computational demand of conventional NAS algorithms (e.g. $10^4$ GPU hours)
makes it difficult to \emph{directly} search the architectures on large-scale
tasks (e.g. ImageNet). Differentiable NAS can reduce the cost of GPU hours via
a continuous representation of network architecture but suffers from the high
GPU memory consumption issue (which grows linearly w.r.t. candidate set size).
As a result, such methods need to utilize \emph{proxy} tasks, such as training
on a smaller dataset, learning with only a few blocks, or training just for a
few epochs.
These architectures optimized on proxy tasks are not guaranteed to be optimal
on the target task. In this paper, we present \emph{ProxylessNAS} that can
\emph{directly} learn the architectures for large-scale target tasks and target
hardware platforms. We address the high memory consumption issue of
differentiable NAS and reduce the computational cost (GPU hours and GPU memory)
to the same level of regular training while still allowing a large candidate
set. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of
directness and specialization. On CIFAR-10, our model achieves 2.08\% test
error with only 5.7M parameters, better than the previous state-of-the-art
architecture AmoebaNet-B, while using 6$\times$ fewer parameters. On ImageNet,
our model achieves 3.1\% better top-1 accuracy than MobileNetV2, while being
1.2$\times$ faster with measured GPU latency. We also apply ProxylessNAS to
specialize neural architectures for hardware with direct hardware metrics (e.g.
latency) and provide insights for efficient CNN architecture design.
Comment: ICLR 2019.
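A minimal sketch of the path-sampling idea that keeps GPU memory at the level of a single network (not the official ProxylessNAS code): architecture parameters cover all candidate operations, but only one sampled path is executed per forward pass. The candidate set and the gradient-scaling trick used here are simplifying assumptions; the paper's exact update rule differs.

```python
# Illustrative sketch only: a mixed operation that samples a single candidate
# path per step, so activation memory does not grow with the candidate set.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),   # assumed candidate set
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        probs = F.softmax(self.alpha, dim=0)
        idx = torch.multinomial(probs, 1).item()           # pick one path only
        out = self.candidates[idx](x)
        # Scale by prob / prob.detach() (numerically 1) so gradients still
        # reach the architecture parameters alpha through the sampled path.
        return out * (probs[idx] / probs[idx].detach())

op = MixedOp(channels=16)
y = op(torch.randn(2, 16, 32, 32))
```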
NASIB: Neural Architecture Search withIn Budget
Neural Architecture Search (NAS) represents a class of methods that generate
the optimal neural network architecture, typically by iterating over candidate
architectures until convergence on some particular metric, such as validation
loss. These methods are constrained by the available computational resources,
especially in enterprise environments. In this paper, we propose a new approach
to NAS,
called NASIB, which adapts and attunes to the computation resources (budget)
available by varying the exploration vs. exploitation trade-off. We reduce the
expert bias by searching over an augmented search space induced by
Superkernels. The proposed method makes architecture search useful for
different computational budgets and for domains beyond the classification of
natural images, where bespoke architecture motifs and domain expertise are
lacking. We show, on CIFAR10, that it is possible to search over a space that
comprises 12x more candidate operations than the traditional prior art in just
1.5 GPU days, while reaching close to state-of-the-art accuracy. While our
method searches over an exponentially larger search space, it could lead to
novel architectures that require less domain expertise than the majority of
existing methods.
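A minimal sketch of a Superkernel-style operation (not the NASIB implementation), under the assumption that smaller candidate kernels are slices of one shared large kernel, so a much larger candidate set can be searched without a separate parameter set per operation; the kernel sizes and slicing scheme are illustrative assumptions.

```python
# Illustrative sketch only: several candidate kernel sizes share the
# parameters of one large "superkernel" by slicing its center.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SuperKernelConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, max_k: int = 7):
        super().__init__()
        self.max_k = max_k
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, max_k, max_k) * 0.01)

    def forward(self, x: torch.Tensor, k: int) -> torch.Tensor:
        """Apply the centered k x k slice of the shared max_k x max_k kernel."""
        assert k <= self.max_k and k % 2 == 1
        start = (self.max_k - k) // 2
        w = self.weight[:, :, start:start + k, start:start + k]
        return F.conv2d(x, w, padding=k // 2)

sk = SuperKernelConv(3, 16)
x = torch.randn(2, 3, 32, 32)
out3 = sk(x, k=3)   # candidate op: 3x3 conv, parameters shared with...
out7 = sk(x, k=7)   # ...the 7x7 candidate, both slices of the same superkernel
```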