2,731 research outputs found
MobileNetV2: Inverted Residuals and Linear Bottlenecks
In this paper we describe a new mobile architecture, MobileNetV2, that
improves the state of the art performance of mobile models on multiple tasks
and benchmarks as well as across a spectrum of different model sizes. We also
describe efficient ways of applying these mobile models to object detection in
a novel framework we call SSDLite. Additionally, we demonstrate how to build
mobile semantic segmentation models through a reduced form of DeepLabv3 which
we call Mobile DeepLabv3.
The MobileNetV2 architecture is based on an inverted residual structure where
the input and output of the residual block are thin bottleneck layers opposite
to traditional residual models which use expanded representations in the input
an MobileNetV2 uses lightweight depthwise convolutions to filter features in
the intermediate expansion layer. Additionally, we find that it is important to
remove non-linearities in the narrow layers in order to maintain
representational power. We demonstrate that this improves performance and
provide an intuition that led to this design. Finally, our approach allows
decoupling of the input/output domains from the expressiveness of the
transformation, which provides a convenient framework for further analysis. We
measure our performance on Imagenet classification, COCO object detection, VOC
image segmentation. We evaluate the trade-offs between accuracy, and number of
operations measured by multiply-adds (MAdd), as well as the number of
parameter
ADaPTION: Toolbox and Benchmark for Training Convolutional Neural Networks with Reduced Numerical Precision Weights and Activation
Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are
useful for many practical tasks in machine learning. Synaptic weights, as well
as neuron activation functions within the deep network are typically stored
with high-precision formats, e.g. 32 bit floating point. However, since storage
capacity is limited and each memory access consumes power, both storage
capacity and memory access are two crucial factors in these networks. Here we
present a method and present the ADaPTION toolbox to extend the popular deep
learning library Caffe to support training of deep CNNs with reduced numerical
precision of weights and activations using fixed point notation. ADaPTION
includes tools to measure the dynamic range of weights and activations. Using
the ADaPTION tools, we quantized several CNNs including VGG16 down to 16-bit
weights and activations with only 0.8% drop in Top-1 accuracy. The
quantization, especially of the activations, leads to increase of up to 50% of
sparsity especially in early and intermediate layers, which we exploit to skip
multiplications with zero, thus performing faster and computationally cheaper
inference.Comment: 10 pages, 5 figure
- …