Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
We propose to focus on the problem of discovering neural network
architectures efficient in terms of both prediction quality and cost. For
instance, our approach is able to solve the following tasks: learn a neural
network able to predict well in less than 100 milliseconds or learn an
efficient model that fits in a 50 Mb memory. Our contribution is a novel family
of models called Budgeted Super Networks (BSN). They are learned using gradient
descent techniques applied to a budgeted learning objective function that
integrates a maximum authorized cost, while making no assumption about the nature
of this cost. We present a set of experiments on computer vision problems and
analyze the ability of our technique to deal with three different costs: the
computation cost, the memory consumption cost and a distributed computation
cost. We particularly show that our model can discover neural network
architectures that have a better accuracy than the ResNet and Convolutional
Neural Fabrics architectures on CIFAR-10 and CIFAR-100, at a lower cost.
Comment: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018)
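A minimal sketch of the kind of budgeted objective the abstract describes: a task loss plus a penalty on any cost in excess of a maximum authorized budget. The cost value, budget, and penalty weight below are illustrative assumptions, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def budgeted_loss(logits, targets, expected_cost, budget, penalty_weight=1.0):
    # Standard classification loss for prediction quality.
    task_loss = F.cross_entropy(logits, targets)
    # Hinge penalty: only the part of the cost exceeding the authorized budget is penalized.
    over_budget = torch.clamp(expected_cost - budget, min=0.0)
    return task_loss + penalty_weight * over_budget

# Dummy batch with a hypothetical expected cost of 120 units against a 100-unit budget.
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = budgeted_loss(logits, targets, expected_cost=torch.tensor(120.0), budget=100.0)
```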
FLOPs as a Direct Optimization Objective for Learning Sparse Neural Networks
There exists a plethora of techniques for inducing structured sparsity in
parametric models during the optimization process, with the final goal of
resource-efficient inference. However, few methods target a specific number of
floating-point operations (FLOPs) as part of the optimization objective,
despite many reporting FLOPs as part of the results. Furthermore, a
one-size-fits-all approach ignores realistic system constraints, which differ
significantly between, say, a GPU and a mobile phone -- FLOPs on the former
incur less latency than on the latter; thus, it is important for practitioners
to be able to specify a target number of FLOPs during model compression. In
this work, we extend a state-of-the-art technique to directly incorporate FLOPs
as part of the optimization objective and show that, given a desired FLOPs
requirement, different neural networks can be successfully trained for image
classification.
Comment: 4 pages, accepted to the NIPS 2018 Workshop on Compact Deep Neural
Networks with Industrial Applications (CDNNRIA)
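An illustrative sketch (not the paper's exact method) of treating a FLOPs target as part of the objective: the expected FLOPs, computed from per-channel keep probabilities, is pushed toward a user-specified target. Gate parameterization and per-channel FLOP counts are hypothetical.

```python
import torch

def expected_flops(gate_logits, flops_per_channel):
    # Each channel is kept with probability sigmoid(gate_logit); the expectation is differentiable.
    keep_probs = torch.sigmoid(gate_logits)
    return (keep_probs * flops_per_channel).sum()

def flops_targeted_loss(task_loss, gate_logits, flops_per_channel, target_flops, weight=1e-9):
    # Penalize the absolute gap between expected FLOPs and the specified target.
    gap = torch.abs(expected_flops(gate_logits, flops_per_channel) - target_flops)
    return task_loss + weight * gap

gate_logits = torch.zeros(64, requires_grad=True)    # one gate per channel (hypothetical)
flops_per_channel = torch.full((64,), 1.5e6)         # FLOPs contributed by each channel (made up)
loss = flops_targeted_loss(torch.tensor(2.3), gate_logits, flops_per_channel, target_flops=3e7)
loss.backward()                                      # gradients flow into the gates
```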
Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search
Recent work in network quantization has substantially reduced the time and
space complexity of neural network inference, enabling their deployment on
embedded and mobile devices with limited computational and memory resources.
However, existing quantization methods often represent all weights and
activations with the same precision (bit-width). In this paper, we explore a
new dimension of the design space: quantizing different layers with different
bit-widths. We formulate this problem as a neural architecture search problem
and propose a novel differentiable neural architecture search (DNAS) framework
to efficiently explore its exponential search space with gradient-based
optimization. Experiments show we surpass the state-of-the-art compression of
ResNet on CIFAR-10 and ImageNet. Our quantized models, with 21.1x smaller model
size or 103.9x lower computational cost, can still outperform baseline quantized
or even full-precision models.
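A minimal sketch of the differentiable search idea applied to bit-widths: each layer's discrete precision choice is relaxed into a softmax over candidate bit-widths, making the expected model size differentiable in the architecture parameters. Bit-width candidates and parameter counts below are placeholder assumptions.

```python
import torch

bit_choices = torch.tensor([2.0, 4.0, 8.0, 32.0])     # candidate bit-widths per layer
params_per_layer = torch.tensor([1e5, 5e5, 2e6])      # hypothetical parameter counts for 3 layers
alpha = torch.zeros(3, 4, requires_grad=True)         # architecture parameters: layers x choices

def expected_model_size_bits(alpha):
    # Softmax relaxation of the discrete bit-width choice makes size differentiable in alpha.
    probs = torch.softmax(alpha, dim=-1)
    expected_bits = (probs * bit_choices).sum(dim=-1)  # expected precision of each layer
    return (expected_bits * params_per_layer).sum()    # expected total model size in bits

size = expected_model_size_bits(alpha)
size.backward()                                        # this term can be added to the task loss
```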
DARTS: Differentiable Architecture Search
This paper addresses the scalability challenge of architecture search by
formulating the task in a differentiable manner. Unlike conventional approaches
of applying evolution or reinforcement learning over a discrete and
non-differentiable search space, our method is based on the continuous
relaxation of the architecture representation, allowing efficient search of the
architecture using gradient descent. Extensive experiments on CIFAR-10,
ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in
discovering high-performance convolutional architectures for image
classification and recurrent architectures for language modeling, while being
orders of magnitude faster than state-of-the-art non-differentiable techniques.
Our implementation has been made publicly available to facilitate further
research on efficient architecture search algorithms.
Comment: Published at ICLR 2019; code and pretrained models available at
https://github.com/quark0/darts
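A minimal sketch of the continuous relaxation idea: each edge computes a softmax-weighted mixture of candidate operations, so architecture parameters receive gradients alongside the operation weights. The candidate set here is illustrative, not the full DARTS search space.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """One edge of a cell: a softmax-weighted mixture of candidate operations."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # continuous architecture parameters

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        # Every candidate op contributes, so gradients reach both the op weights and alpha.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

out = MixedOp(16)(torch.randn(1, 16, 8, 8))
```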
Modeling Neural Architecture Search Methods for Deep Networks
Many research works address the design of architectures for deep neural
networks (DNNs); these are known as neural architecture search (NAS) methods.
Although there are many automatic and manual techniques for NAS problems, there
is no unifying model in which these NAS methods can be explored and compared.
In this paper, we propose a general abstraction model for NAS methods. Using
the proposed framework, different design approaches can be compared, and
critical areas of interest in designing DNN architectures can be categorized
and identified. Under this framework, we also summarize different NAS methods,
giving a clearer view of their advantages and disadvantages.
Comment: 6 pages, 7 figures
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
Designing accurate and efficient ConvNets for mobile devices is challenging
because the design space is combinatorially large. Due to this, previous neural
architecture search (NAS) methods are computationally expensive. ConvNet
architecture optimality depends on factors such as input resolution and target
devices. However, existing approaches are too expensive for case-by-case
redesigns. Also, previous work focuses primarily on reducing FLOPs, but FLOP
count does not always reflect actual latency. To address these, we propose a
differentiable neural architecture search (DNAS) framework that uses
gradient-based methods to optimize ConvNet architectures, avoiding enumerating
and training individual architectures separately as in previous methods.
FBNets, a family of models discovered by DNAS, surpass state-of-the-art models
both designed manually and generated automatically. FBNet-B achieves 74.1%
top-1 accuracy on ImageNet with 295M FLOPs and 23.1 ms latency on a Samsung S8
phone, 2.4x smaller and 1.5x faster than MobileNetV2-1.3 with similar accuracy.
Despite higher accuracy and lower latency than MnasNet, we estimate FBNet-B's
search cost is 420x smaller than MnasNet's, at only 216 GPU-hours. Searched for
different resolutions and channel sizes, FBNets achieve 1.5% to 6.4% higher
accuracy than MobileNetV2. The smallest FBNet achieves 50.2% accuracy and 2.9
ms latency (345 frames per second) on a Samsung S8. Over a Samsung-optimized
FBNet, the iPhone-X-optimized model achieves a 1.4x speedup on an iPhone X.
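A sketch of the hardware-aware search signal under assumptions: expected latency is computed from a per-block latency lookup table (values below are made up) weighted by Gumbel-softmax sampling probabilities, then added to the task loss. This illustrates a differentiable latency term rather than FBNet's exact loss form.

```python
import torch
import torch.nn.functional as F

latency_table = torch.tensor([1.2, 0.4, 2.7])   # measured latency (ms) per candidate block; hypothetical
theta = torch.zeros(3, requires_grad=True)      # architecture parameters for one searchable layer

def latency_aware_loss(task_loss, theta, weight=0.1, tau=1.0):
    # Gumbel-softmax keeps the block choice (approximately) discrete yet differentiable.
    probs = F.gumbel_softmax(theta, tau=tau, hard=False)
    expected_latency = (probs * latency_table).sum()
    return task_loss + weight * expected_latency

loss = latency_aware_loss(torch.tensor(1.8), theta)
loss.backward()                                  # gradients reach theta through the latency term
```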
SparseMask: Differentiable Connectivity Learning for Dense Image Prediction
In this paper, we aim at automatically searching an efficient network
architecture for dense image prediction. Particularly, we follow the
encoder-decoder style and focus on designing a connectivity structure for the
decoder. To achieve that, we design a densely connected network with learnable
connections, named Fully Dense Network, which contains a large set of possible
final connectivity structures. We then employ gradient descent to search the
optimal connectivity from the dense connections. The search process is guided
by a novel loss function, which pushes the weight of each connection to be
binary and the connections to be sparse. The discovered connectivity achieves
competitive results on two segmentation datasets, while running more than three
times faster and requiring less than half the parameters of state-of-the-art
methods. Extensive experiments show that the discovered
connectivity is compatible with various backbones and generalizes well to other
dense image prediction tasks.
Comment: Accepted by ICCV 2019. Code is available at
https://github.com/wuhuikai/SparseMask
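An illustrative sketch (not the paper's exact loss) of the described regularizer: learnable connection gates in (0, 1) with one term pushing each gate toward 0 or 1 (binary) and another pushing the overall connectivity to be sparse. Gate count and weights are assumptions.

```python
import torch

gate_logits = torch.zeros(32, requires_grad=True)   # one learnable gate per candidate connection

def connectivity_regularizer(gate_logits, sparsity_weight=1.0, binary_weight=1.0):
    g = torch.sigmoid(gate_logits)                   # connection strengths in (0, 1)
    binary_term = (g * (1.0 - g)).mean()             # minimized when each gate is close to 0 or 1
    sparsity_term = g.mean()                         # minimized when few connections remain active
    return binary_weight * binary_term + sparsity_weight * sparsity_term

reg = connectivity_regularizer(gate_logits)          # added to the segmentation loss during search
reg.backward()
```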
Single Path One-Shot Neural Architecture Search with Uniform Sampling
We revisit the one-shot Neural Architecture Search (NAS) paradigm and analyze
its advantages over existing NAS approaches. Existing one-shot methods, however,
are hard to train and not yet effective on large-scale datasets like ImageNet.
This work proposes a Single Path One-Shot model to address the challenge in
training. Our central idea is to construct a simplified supernet in which all
architectures are single paths, so that the weight co-adaptation problem is
alleviated. Training is performed by uniform path sampling. All architectures
(and their weights) are trained fully and equally.
Comprehensive experiments verify that our approach is flexible and effective.
It is easy to train and fast to search. It effortlessly supports complex search
spaces (e.g., building blocks, channel, mixed-precision quantization) and
different search constraints (e.g., FLOPs, latency). It is thus convenient to
use for various needs. It achieves state-of-the-art performance on the
large-scale ImageNet dataset.
Comment: ECCV 2020
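A minimal sketch of uniform single-path sampling in a supernet: at each training step one candidate block is drawn uniformly per layer, and only the sampled path is executed and updated. Block choices and tensor shapes below are placeholders.

```python
import random
import torch
import torch.nn as nn

class ChoiceLayer(nn.Module):
    """One supernet layer holding several candidate blocks; only one runs per step."""
    def __init__(self, channels):
        super().__init__()
        self.choices = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])

    def forward(self, x, idx):
        return self.choices[idx](x)

supernet = nn.ModuleList([ChoiceLayer(16) for _ in range(4)])
x = torch.randn(2, 16, 8, 8)
path = [random.randrange(len(layer.choices)) for layer in supernet]  # uniform path sampling
for layer, idx in zip(supernet, path):
    x = layer(x, idx)        # only the sampled blocks are executed and updated this step
```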
Resolution Adaptive Networks for Efficient Inference
Adaptive inference is an effective mechanism to achieve a dynamic tradeoff
between accuracy and computational cost in deep networks. Existing works mainly
exploit architecture redundancy in network depth or width. In this paper, we
focus on spatial redundancy of input samples and propose a novel Resolution
Adaptive Network (RANet), which is inspired by the intuition that
low-resolution representations are sufficient for classifying "easy" inputs
containing large objects with prototypical features, while only some "hard"
samples need spatially detailed information. In RANet, the input images are
first routed to a lightweight sub-network that efficiently extracts
low-resolution representations, and those samples with high prediction
confidence will exit early from the network without being further processed.
Meanwhile, high-resolution paths in the network maintain the capability to
recognize the "hard" samples. Therefore, RANet can effectively reduce the
spatial redundancy involved in inferring high-resolution inputs. Empirically,
we demonstrate the effectiveness of the proposed RANet on the CIFAR-10,
CIFAR-100 and ImageNet datasets in both the anytime prediction setting and the
budgeted batch classification setting.
Comment: CVPR 2020
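A sketch of the confidence-based early-exit routing rule described above (the routing logic only, not RANet's architecture): an input first passes a cheap low-resolution classifier and exits if its softmax confidence clears a threshold; otherwise it continues to a heavier high-resolution path. Models and shapes are dummy assumptions.

```python
import torch
import torch.nn.functional as F

def adaptive_predict(x_lowres, x_highres, cheap_model, full_model, threshold=0.9):
    # First try the lightweight low-resolution sub-network.
    probs = F.softmax(cheap_model(x_lowres), dim=-1)
    confidence, prediction = probs.max(dim=-1)
    if confidence.item() >= threshold:               # "easy" input: exit early
        return prediction
    # Otherwise fall back to the high-resolution path for "hard" inputs.
    return full_model(x_highres).argmax(dim=-1)

cheap = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 16 * 16, 10))
full = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
pred = adaptive_predict(torch.randn(1, 3, 16, 16), torch.randn(1, 3, 32, 32), cheap, full)
```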
ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search
How to discover and evaluate the true strength of models quickly and
accurately is one of the key challenges in Neural Architecture Search (NAS). To
cope with this problem, we propose an Architecture-Driven Weight Prediction
(ADWP) approach for NAS. In our approach, we first
design an architecture-intensive search space and then train a HyperNetwork by
feeding it stochastically encoded architecture parameters. In the trained
HyperNetwork, weights of convolution kernels can be well predicted for neural
architectures in the search space. Consequently, the target architectures can
be evaluated efficiently without any finetuning, thus enabling us to search
for the optimal architecture in the space of general networks (macro-search).
Through real experiments, we evaluate the performance of the models discovered
by the proposed ADWPNAS, and the results show that one search procedure can be
completed in 4.0 GPU-hours on CIFAR-10. Moreover, the discovered model obtains
a test error of 2.41% with only 1.52M parameters, which is superior to the best
existing models.
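A minimal sketch of architecture-driven weight prediction as described above: a hypernetwork maps an architecture encoding to the weights of a convolution, so candidate architectures can be scored without per-architecture finetuning. The encoding size, layer shapes, and hypernetwork form are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvWeightHyperNet(nn.Module):
    """Maps an architecture encoding to the weights of one convolution."""
    def __init__(self, encoding_dim, out_ch, in_ch, k):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        self.fc = nn.Linear(encoding_dim, out_ch * in_ch * k * k)

    def forward(self, arch_encoding):
        return self.fc(arch_encoding).view(self.shape)   # predicted kernel weights

hypernet = ConvWeightHyperNet(encoding_dim=8, out_ch=16, in_ch=3, k=3)
arch_encoding = torch.rand(8)                            # stochastic architecture encoding
weights = hypernet(arch_encoding)                        # no per-architecture finetuning needed
feature = F.conv2d(torch.randn(1, 3, 32, 32), weights, padding=1)
```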