2 research outputs found

    One-shot neural architecture search via novelty driven sampling

    One-shot neural architecture search (NAS) has received wide attention due to its computational efficiency. Most state-of-the-art one-shot NAS methods use the validation accuracy of architectures that inherit weights from the supernet as a proxy for searching for the best-performing architecture, adopting a bilevel optimization pattern under the assumption that this validation accuracy approximates the test accuracy after retraining. However, recent works have found no positive correlation between this validation accuracy and the test accuracy for these one-shot NAS methods, and such reward-based sampling for supernet training also entails the rich-get-richer problem. To address this deceptive problem, this paper presents a new approach, Efficient Novelty-driven Neural Architecture Search, which samples the most novel architectures to train the supernet. Specifically, a single-path supernet is adopted, and only the weights of the single architecture sampled by the novelty search are optimized in each step, greatly reducing the memory demand. Experiments demonstrate the effectiveness and efficiency of this novelty-search-based architecture sampling method.
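    For concreteness, the plain Python/NumPy sketch below (indented) illustrates one plausible reading of such a sampling loop: score each candidate architecture by its distance to previously sampled architectures (a standard novelty-search criterion) and update only the sampled single path at each step. The toy encoding, the Hamming-distance novelty score, and the names (random_architecture, sample_by_novelty, train_single_path_step) are illustrative assumptions, not the paper's implementation.

        # Minimal sketch of novelty-driven sampling for single-path supernet
        # training; all details below are assumptions for illustration only.
        import numpy as np

        rng = np.random.default_rng(0)

        NUM_LAYERS = 8    # depth of the toy search space
        NUM_OPS = 5       # candidate operations per layer
        K_NEIGHBORS = 5   # neighbors used for the novelty estimate

        def random_architecture():
            # Encode an architecture as one operation index per layer.
            return rng.integers(0, NUM_OPS, size=NUM_LAYERS)

        def novelty_score(arch, archive, k=K_NEIGHBORS):
            # Average Hamming distance to the k nearest architectures seen so far.
            if not archive:
                return float("inf")
            dists = sorted(int(np.sum(arch != past)) for past in archive)
            return float(np.mean(dists[:k]))

        def sample_by_novelty(archive, num_candidates=32):
            # Draw random candidates and keep the one most unlike the archive.
            candidates = [random_architecture() for _ in range(num_candidates)]
            return max(candidates, key=lambda a: novelty_score(a, archive))

        def train_single_path_step(arch):
            # Placeholder: in practice, forward/backward through the sampled
            # sub-network and update only its weights in the supernet.
            pass

        archive = []
        for step in range(100):
            arch = sample_by_novelty(archive)   # novelty-driven sampling
            train_single_path_step(arch)        # single-path weight update
            archive.append(arch)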

    Automated Deep Learning: A Study on Neural Architecture Search

    University of Technology Sydney, Faculty of Engineering and Information Technology.
    Automated Deep Learning (AutoDL) aims to build better deep learning models in a data-driven and automated manner, so that most practitioners can build high-performance machine learning models while being relieved of the labor-intensive and time-consuming neural network design process. AutoDL can bring new research ideas to deep neural networks and lower the barrier to deep learning in various research areas through automated neural network design. This thesis focuses on two specific research problems in automated deep learning: one-shot neural architecture search (NAS) and differentiable NAS. For one-shot NAS, it proposes a novelty-driven sampling method and formulates supernet training as a constrained continual learning optimization problem, to address the "rich-get-richer" problem and the multi-model forgetting issue. For differentiable NAS, it leverages a variational graph autoencoder to relieve the non-negligible incongruence, formulates neural architecture search as a distribution learning problem to enhance exploration, and proposes differentiable architecture search with stochastic implicit gradients to enable multi-step inner optimization.
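    For context, differentiable NAS is usually cast as the bilevel problem below, where alpha denotes the architecture parameters and w the supernet weights; the second expression is the generic implicit-gradient form (via the implicit function theorem) that multi-step inner optimization approximates. This is the standard textbook formulation, not the thesis's exact derivation.

        \min_{\alpha} \; \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
        \quad \text{s.t.} \quad
        w^{*}(\alpha) = \arg\min_{w} \mathcal{L}_{\mathrm{train}}(w, \alpha)

        \nabla_{\alpha} \mathcal{L}_{\mathrm{val}}
        = \partial_{\alpha} \mathcal{L}_{\mathrm{val}}
        - \nabla^{2}_{\alpha, w} \mathcal{L}_{\mathrm{train}}
          \bigl(\nabla^{2}_{w, w} \mathcal{L}_{\mathrm{train}}\bigr)^{-1}
          \nabla_{w} \mathcal{L}_{\mathrm{val}}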