
    Neural Architecture Search: Insights from 1000 Papers

    In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and reinforcement learning. Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas. Neural architecture search (NAS), the process of automating the design of neural architectures for a given task, is an inevitable next step in automating machine learning and has already outpaced the best human-designed architectures on many tasks. In the past few years, research in NAS has been progressing rapidly, with over 1000 papers released since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized and comprehensive guide to neural architecture search. We give a taxonomy of search spaces, algorithms, and speedup techniques, and we discuss resources such as benchmarks, best practices, other surveys, and open-source libraries.
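    To make the idea concrete, here is a minimal sketch of the simplest NAS baseline such surveys cover, black-box random search: sample candidate architectures from a small search space, evaluate each with a short proxy training run, and keep the best. The search space, the synthetic regression task, and all names below are illustrative assumptions, not anything taken from the survey itself.

```python
# Minimal sketch of neural architecture search by random search (illustrative,
# not from the survey): a toy MLP search space and a synthetic regression task
# stand in for the real benchmarks.
import random
import torch
import torch.nn as nn

SEARCH_SPACE = {
    "depth": [1, 2, 3],
    "width": [16, 32, 64],
    "activation": [nn.ReLU, nn.Tanh],
}

def sample_architecture():
    # One point in the search space: a dict of architectural choices.
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def build_model(arch, in_dim=8, out_dim=1):
    layers, dim = [], in_dim
    for _ in range(arch["depth"]):
        layers += [nn.Linear(dim, arch["width"]), arch["activation"]()]
        dim = arch["width"]
    layers.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*layers)

def evaluate(arch, x, y, epochs=50):
    # Proxy evaluation: short training run, return final loss (lower is better).
    model = build_model(arch)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(256, 8)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)
    best = min((sample_architecture() for _ in range(10)),
               key=lambda a: evaluate(a, x, y))
    print("best architecture:", best)
```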

    Transfer Learning by Similarity Centred Architecture Evolution for Multiple Residential Load Forecasting

    The development from traditional low-voltage grids to smart systems has become extensive and is adopted worldwide. Expanding demand response programs to cover the residential sector raises a wide range of challenges. Short-term load forecasting for residential consumers in a neighbourhood could lead to a better understanding of low-voltage consumption behaviour. Nevertheless, users with similar characteristics can exhibit diverse consumption patterns. Consequently, transfer learning methods have become a useful tool for tackling differences among residential time series. This paper proposes a method combining evolutionary algorithms for neural architecture search with transfer learning to perform short-term load forecasting in a neighbourhood with multiple households' load consumption. The approach centres its efforts on neural architecture search using evolutionary algorithms. The architecture evolution process retains the patterns of the centre-most house, and the architecture weights are later adjusted for each house in a multi-house set from the neighbourhood. In addition, a sensitivity analysis was conducted to ensure model performance. Experimental results on a large dataset containing hourly load consumption for ten houses in London, Ontario, showed that the proposed approach outperforms the compared techniques. Moreover, the proposed method achieves an average accuracy 3.17 points higher than the state-of-the-art one-shot LSTM method.
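    A hedged sketch of the two-stage procedure described above, not the authors' implementation: a simple (1+1) evolutionary loop searches architectures on the centre-most house's load series, and the resulting model's weights are then fine-tuned separately for each remaining house. The synthetic hourly series, the MLP-on-lag-windows model family, and the mutation operator are placeholders chosen for brevity.

```python
# Illustrative two-stage sketch: evolve an architecture on the centre-most house,
# then transfer and fine-tune its weights per house. All data and operators here
# are stand-ins, not the paper's setup.
import copy
import random
import torch
import torch.nn as nn

def make_series(n=500):
    # Synthetic hourly-load stand-in: daily sinusoid plus noise.
    t = torch.arange(n, dtype=torch.float32)
    return torch.sin(2 * torch.pi * t / 24) + 0.1 * torch.randn(n)

def windows(series, lag=24):
    # Turn a series into (lag-window, next-value) training pairs.
    x = torch.stack([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:].unsqueeze(1)
    return x, y

def build(arch, lag=24):
    layers, dim = [], lag
    for w in arch["widths"]:
        layers += [nn.Linear(dim, w), nn.ReLU()]
        dim = w
    return nn.Sequential(*layers, nn.Linear(dim, 1))

def fit(model, x, y, epochs=100, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

def mutate(arch):
    # Minimal mutation: change one hidden-layer width.
    child = copy.deepcopy(arch)
    i = random.randrange(len(child["widths"]))
    child["widths"][i] = random.choice([8, 16, 32, 64])
    return child

if __name__ == "__main__":
    torch.manual_seed(0)
    centre_x, centre_y = windows(make_series())
    # Stage 1: tiny (1+1) evolution of the architecture on the centre-most house.
    best = {"widths": [16]}
    best_loss = fit(build(best), centre_x, centre_y)
    for _ in range(5):
        cand = mutate(best)
        loss = fit(build(cand), centre_x, centre_y)
        if loss < best_loss:
            best, best_loss = cand, loss
    # Stage 2: transfer, reusing the evolved architecture and fine-tuning per house.
    source = build(best)
    fit(source, centre_x, centre_y)
    for house in range(3):
        hx, hy = windows(make_series())
        model = copy.deepcopy(source)  # start from centre-house weights
        print(f"house {house} loss:", fit(model, hx, hy, epochs=30))
```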

    Towards Optimal Compression: Joint Pruning and Quantization

    Compression of deep neural networks has become a necessary stage for optimizing model inference on resource-constrained hardware. This paper presents FITCompress, a method that unifies layer-wise mixed-precision quantization and pruning under a single heuristic, as an alternative to neural architecture search and Bayesian-based techniques. FITCompress combines the Fisher Information Metric with path planning through the compression space to pick optimal configurations given size and operation constraints, using single-shot fine-tuning. Experiments on ImageNet validate the method and show that our approach yields a better trade-off between accuracy and efficiency than the baselines. Beyond computer vision benchmarks, we experiment with the BERT model on a language understanding task, paving the way towards its optimal compression.
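    The following sketch illustrates the general pattern of Fisher-guided compression under stated assumptions; it is not the FITCompress algorithm. A diagonal Fisher approximation (squared gradients of the loss) scores parameter sensitivity, the least sensitive weights are pruned, and less sensitive layers are assigned lower bit-widths. The toy model, the single-batch Fisher estimate, and the bit-assignment rule are all illustrative.

```python
# Hedged sketch of Fisher-guided pruning and mixed-precision assignment
# (illustrative, not FITCompress itself).
import torch
import torch.nn as nn

def diagonal_fisher(model, x, y, loss_fn):
    # E[(d loss / d theta)^2] per parameter, estimated from one batch.
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}

def prune_by_fisher(model, fisher, sparsity=0.5):
    # Zero out the fraction of weights with the lowest Fisher scores.
    for name, p in model.named_parameters():
        score = fisher[name].flatten()
        k = int(sparsity * score.numel())
        if k == 0:
            continue
        threshold = score.kthvalue(k).values
        p.data[fisher[name] <= threshold] = 0.0

def assign_bits(fisher, bit_choices=(2, 4, 8)):
    # Coarse mixed-precision rule: more sensitive layers (higher mean Fisher)
    # receive more bits.
    means = {n: f.mean().item() for n, f in fisher.items()}
    order = sorted(means, key=means.get)
    bits = {}
    for i, name in enumerate(order):
        bits[name] = bit_choices[min(i * len(bit_choices) // len(order),
                                     len(bit_choices) - 1)]
    return bits

if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
    x, y = torch.randn(64, 8), torch.randn(64, 1)
    fisher = diagonal_fisher(model, x, y, nn.MSELoss())
    prune_by_fisher(model, fisher, sparsity=0.5)
    print(assign_bits(fisher))
```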

    Assessing hyper parameter optimization and speedup for convolutional neural networks

    The increased processing power of graphics processing units (GPUs) and the availability of large image datasets have fostered a renewed interest in extracting semantic information from images. Promising results for complex image categorization problems have been achieved using deep learning, with neural networks composed of many layers. Convolutional neural networks (CNNs) are one such architecture, providing more opportunities for image classification. Advances in CNNs enable the development of training models using large labelled image datasets, but the hyperparameters need to be specified, which is challenging and complex due to their large number. A substantial amount of computational power and processing time is required to determine the optimal hyperparameters that define a model yielding good results. This article provides a survey of hyperparameter search and optimization methods for CNN architectures.
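    As a concrete example of the simplest method such surveys cover, the sketch below runs random search over a few CNN hyperparameters (learning rate, filter count, kernel size), scoring each configuration with a short proxy training run on synthetic data. The ranges, the toy task, and the proxy objective are assumptions made for illustration.

```python
# Minimal sketch of CNN hyperparameter random search on a synthetic task
# (illustrative only).
import random
import torch
import torch.nn as nn

def sample_config():
    return {
        "lr": 10 ** random.uniform(-4, -1),   # log-uniform learning rate
        "filters": random.choice([8, 16, 32]),
        "kernel": random.choice([3, 5]),
    }

def build_cnn(cfg, classes=4):
    return nn.Sequential(
        nn.Conv2d(1, cfg["filters"], cfg["kernel"], padding="same"),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(cfg["filters"], classes),
    )

def evaluate(cfg, x, y, epochs=20):
    # Proxy objective: training accuracy after a short run (higher is better).
    model = build_cnn(cfg)
    opt = torch.optim.Adam(model.parameters(), lr=cfg["lr"])
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    return (model(x).argmax(dim=1) == y).float().mean().item()

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(128, 1, 16, 16)
    y = torch.randint(0, 4, (128,))
    best = max((sample_config() for _ in range(8)),
               key=lambda c: evaluate(c, x, y))
    print("best config:", best)
```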