Neural Architecture Search (NAS) has emerged as an effective method for
automatically designing optimal neural network architectures. Although
neural architectures have achieved human-level performance on several tasks,
few of them are obtained by NAS methods. The main reason is the huge
search space of neural architectures, which makes NAS algorithms inefficient. This
work presents a novel architecture search algorithm, called GPT-NAS, that
optimizes neural architectures with a Generative Pre-Trained (GPT) model. In
GPT-NAS, we assume that a generative model pre-trained on a large-scale corpus
can learn the fundamental principles of building neural architectures. GPT-NAS
therefore leverages the GPT model to propose reasonable architecture components
given a basic architecture. Such an approach can greatly reduce the search space
by introducing prior knowledge into the search process.
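To make the idea concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): a pre-trained generative prior scores candidate components conditioned on a partial architecture, and the search samples only from the top-ranked proposals, so the effective search space shrinks. The `ArchitecturePrior` class, its toy score table, and the helper functions are placeholders standing in for a real GPT model and encoding scheme.

```python
import random
from typing import List, Tuple

# Hypothetical stand-in for a GPT model pre-trained on architecture encodings.
# In GPT-NAS the prior would come from a real generative pre-trained model;
# the fixed score table here only illustrates the interface.
class ArchitecturePrior:
    def score(self, prefix: Tuple[str, ...], candidate: str) -> float:
        """Return how plausible `candidate` is as the next component."""
        toy_scores = {"conv3x3": 0.5, "conv5x5": 0.2, "skip": 0.2, "maxpool": 0.1}
        return toy_scores.get(candidate, 0.0)

def propose_components(prior: ArchitecturePrior,
                       prefix: Tuple[str, ...],
                       candidates: List[str],
                       top_k: int = 2) -> List[str]:
    """Keep only the top-k components the prior considers reasonable.
    Pruning candidates this way is how prior knowledge shrinks the search space."""
    ranked = sorted(candidates, key=lambda c: prior.score(prefix, c), reverse=True)
    return ranked[:top_k]

def sample_architecture(prior: ArchitecturePrior, depth: int = 4) -> Tuple[str, ...]:
    """Grow an architecture layer by layer, sampling only prior-filtered proposals."""
    candidates = ["conv3x3", "conv5x5", "skip", "maxpool"]
    arch: Tuple[str, ...] = ()
    for _ in range(depth):
        allowed = propose_components(prior, arch, candidates)
        arch += (random.choice(allowed),)
    return arch

if __name__ == "__main__":
    prior = ArchitecturePrior()
    print(sample_architecture(prior))
```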
Extensive experimental results show that our GPT-NAS method significantly
outperforms seven manually designed neural architectures and thirteen
architectures provided by competing NAS methods. In addition, our ablation
study indicates that the proposed algorithm improves the performance of fine-tuned
neural architectures by up to about 12% compared to those obtained without GPT,
further demonstrating its effectiveness in searching neural architectures.