
    Frequency Dropout: Feature-Level Regularization via Randomized Filtering

    Deep convolutional neural networks have shown remarkable performance on various computer vision tasks, yet they are susceptible to picking up spurious correlations from the training signal. So-called 'shortcuts' can occur during learning, for example, when specific frequencies present in the image data correlate with the output predictions. Both high and low frequencies can be characteristic of the underlying noise distribution introduced by image acquisition rather than of the task-relevant image content. Models that learn features related to this characteristic noise will not generalize well to new data. In this work, we propose a simple yet effective training strategy, Frequency Dropout, to prevent convolutional neural networks from learning frequency-specific imaging features. We employ randomized filtering of feature maps during training, which acts as a feature-level regularization. In this study, we consider common image processing filters such as Gaussian smoothing, Laplacian of Gaussian, and Gabor filtering. Our training strategy is model-agnostic and can be used for any computer vision task. We demonstrate the effectiveness of Frequency Dropout on a range of popular architectures and multiple tasks including image classification, domain adaptation, and semantic segmentation, using both computer vision and medical imaging datasets. Our results suggest that the proposed approach not only improves predictive accuracy but also improves robustness against domain shift.
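    The randomized-filtering idea can be illustrated with a short sketch. The following is a minimal PyTorch example, not the authors' code: it applies a randomly parameterized Gaussian smoothing filter to feature maps during training only. The paper also considers Laplacian-of-Gaussian and Gabor filters, and the kernel size, sigma range, and application probability below are illustrative assumptions rather than the authors' settings.

```python
# Minimal sketch of feature-level randomized filtering in the spirit of
# Frequency Dropout (not the authors' code). Filter choice (Gaussian only),
# kernel size, sigma range, and probability `p` are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def gaussian_kernel(size: int, sigma: float) -> torch.Tensor:
    """Return a normalized 2-D Gaussian kernel of shape (size, size)."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return kernel / kernel.sum()


class FrequencyDropout(nn.Module):
    """Randomly smooths feature maps during training; identity at inference."""

    def __init__(self, p: float = 0.5, kernel_size: int = 5,
                 sigma_range: tuple = (0.5, 2.0)):
        super().__init__()
        self.p = p
        self.kernel_size = kernel_size
        self.sigma_range = sigma_range

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training or torch.rand(1).item() > self.p:
            return x  # skip filtering at inference or with probability 1 - p
        sigma = torch.empty(1).uniform_(*self.sigma_range).item()
        k = gaussian_kernel(self.kernel_size, sigma).to(x.device, x.dtype)
        c = x.shape[1]
        # Depthwise convolution: the same smoothing kernel for every channel.
        weight = k.repeat(c, 1, 1, 1)
        return F.conv2d(x, weight, padding=self.kernel_size // 2, groups=c)
```

    Such a module could be dropped after selected convolutional blocks of an existing architecture, which is consistent with the model-agnostic, feature-level regularization described in the abstract.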

    HCM: Hardware-Aware Complexity Metric for Neural Network Architectures

    Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that must be balanced to reach an efficient solution. Quantization techniques, when applied to the network parameters, reduce power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from a lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand, at early development stages, the impact of their high-level decisions (e.g., the architecture of the CNN and the number of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the designer of neural network architectures throughout the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and avoid design mistakes at early stages.
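    The abstract does not spell out the metric itself, so the snippet below is only a hypothetical illustration of the kind of early-stage estimate such a tool enables: combining per-layer compute (multiply-accumulates) with weight-memory footprint at a chosen bit-width to compare design points. The layer description, the cost weighting, and the example numbers are assumptions made for this sketch, not the paper's HCM definition.

```python
# Hypothetical early-stage cost estimate; NOT the HCM metric from the paper.
# Layer format and compute/memory weighting are assumptions for illustration.
from dataclasses import dataclass


@dataclass
class ConvLayer:
    in_ch: int
    out_ch: int
    kernel: int
    out_h: int
    out_w: int


def layer_macs(layer: ConvLayer) -> int:
    """Multiply-accumulate operations for a standard convolution layer."""
    return layer.in_ch * layer.out_ch * layer.kernel ** 2 * layer.out_h * layer.out_w


def layer_weight_bits(layer: ConvLayer, bits: int) -> int:
    """Bits needed to store the layer's weights at the given precision."""
    return layer.in_ch * layer.out_ch * layer.kernel ** 2 * bits


def hardware_aware_cost(layers, bits: int,
                        compute_weight: float = 1.0,
                        memory_weight: float = 1.0) -> float:
    """Toy scalar cost combining compute and weight-memory footprint."""
    macs = sum(layer_macs(l) for l in layers)
    mem_bits = sum(layer_weight_bits(l, bits) for l in layers)
    return compute_weight * macs + memory_weight * mem_bits


# Compare two quantization choices (8-bit vs. 4-bit weights) for a toy network.
net = [ConvLayer(3, 32, 3, 112, 112), ConvLayer(32, 64, 3, 56, 56)]
print(hardware_aware_cost(net, bits=8), hardware_aware_cost(net, bits=4))
```

    A designer could sweep architectures and bit-widths through such an estimate to rank candidates before committing to a hardware target, which is the kind of early-stage decision support the paper describes.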

    Evaluation of Parameter-Scaling for Efficient Deep Learning on Small Satellites

    Parameter-scaling techniques change the number of parameters in a machine-learning model in an effort to make the network more amenable to different device types or accuracy requirements. This research compares the performance of two such techniques. NeuralScale is a neural architecture search method that claims to generate deep neural networks for resource-constrained devices: it shrinks a network to a target number of parameters by adjusting the width of each layer independently, achieving higher accuracy than previous methods. The NeuralScale algorithm is compared to the baseline uniform scaling of MobileNet-style models, where the width of every layer is scaled uniformly across the network. Measurements of the latency and runtime memory required for inference were gathered on the NVIDIA Jetson TX2 and Jetson AGX Xavier embedded GPUs using NVIDIA TensorRT, and on the Raspberry Pi 4 embedded CPU (ARM Cortex-A72 cores) using ONNX Runtime. VGG-11, MobileNetV2, Pre-Activation ResNet-18, and ResNet-50 were all scaled to 0.25×, 0.50×, 0.75×, and 1.00× the original number of parameters. On embedded GPUs, this research finds that NeuralScale models do offer higher accuracy, but they run slower and consume much more runtime memory during inference than their uniformly scaled equivalents. On average, NeuralScale is 40% as efficient as uniform scaling in terms of accuracy per megabyte of runtime memory, and it uses 2.7× as much runtime memory per parameter. On the embedded CPU, NeuralScale is slightly more efficient than uniform scaling in terms of accuracy per megabyte of memory, using essentially the same amount of memory per parameter; however, inference latency increases by more than 2.5× on average. Importantly, matching parameter counts does not guarantee comparable runtime-memory usage between the scaling methods on embedded GPUs, while latency grows significantly on embedded CPUs.
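    For reference, the uniform-scaling baseline described above can be sketched as a MobileNet-style width multiplier applied to every layer. Note that convolution parameters scale roughly with the square of the width multiplier, so hitting a target parameter fraction such as 0.25× requires choosing the multiplier accordingly (around 0.5). The per-layer width search performed by NeuralScale is not reproduced here, and the base layer widths below are illustrative.

```python
# Minimal sketch of the uniform-scaling baseline: every layer's channel count
# is multiplied by the same factor `alpha` (MobileNet-style width multiplier).
# The base widths and architecture here are illustrative, not from the paper.
import torch.nn as nn


def make_uniform_scaled_cnn(base_widths=(32, 64, 128, 256), alpha=0.5,
                            num_classes=10) -> nn.Module:
    """Build a small CNN with every channel count scaled by `alpha`."""
    widths = [max(1, int(round(w * alpha))) for w in base_widths]
    layers, in_ch = [], 3
    for w in widths:
        layers += [nn.Conv2d(in_ch, w, 3, stride=2, padding=1),
                   nn.BatchNorm2d(w), nn.ReLU(inplace=True)]
        in_ch = w
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(in_ch, num_classes)]
    return nn.Sequential(*layers)


def count_params(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters())


# Parameter count grows roughly quadratically with alpha for conv layers,
# so alpha would be tuned to hit a target fraction of the original parameters.
for a in (0.25, 0.5, 0.75, 1.0):
    m = make_uniform_scaled_cnn(alpha=a)
    print(f"alpha={a:.2f}: {count_params(m):,} parameters")
```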