GP-NAS-ensemble: a model for NAS Performance Prediction
Estimating the performance of a given model architecture without training it
is of great significance in Neural Architecture Search (NAS), since fully
evaluating an architecture can take a long time. In this paper, a novel NAS
framework called GP-NAS-ensemble is proposed to predict the performance of a
neural network architecture from a small training dataset. We make several
improvements to the GP-NAS model so that it shares the advantages of ensemble
learning methods. Our method ranked second in the performance prediction track
of the CVPR 2022 second lightweight NAS challenge.
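The abstract does not give implementation details, but the core idea of
ensembling Gaussian-process performance predictors trained on a small dataset
can be sketched as follows. The feature encoding, kernel, and bootstrap scheme
here are illustrative assumptions, not the paper's actual design:

```python
# Minimal sketch: an ensemble of Gaussian-process regressors, each fitted on
# a bootstrap resample of a small (architecture-encoding, accuracy) dataset,
# whose predictions are averaged. All data here is synthetic.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Toy data: 50 architectures, each encoded as a 16-dim feature vector,
# with a scalar validation accuracy as the regression target.
X = rng.random((50, 16))
y = rng.random(50)

ensemble = []
for _ in range(8):  # 8 bootstrap members
    idx = rng.integers(0, len(X), len(X))  # resample with replacement
    gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel())
    gp.fit(X[idx], y[idx])
    ensemble.append(gp)

def predict(x):
    """Ensemble prediction: mean over the GP members."""
    return np.stack([gp.predict(x) for gp in ensemble]).mean(axis=0)

print(predict(rng.random((1, 16))))  # predicted accuracy for a new encoding
```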
NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning
As more deep learning models are being applied in real-world applications,
there is a growing need for modeling and learning the representations of neural
networks themselves. An efficient representation can be used to predict target
attributes of networks without the need for actual training and deployment
procedures, facilitating efficient network deployment and design. Recently,
inspired by the success of Transformer, some Transformer-based representation
learning frameworks have been proposed and achieved promising performance in
handling cell-structured models. However, graph neural network (GNN) based
approaches still dominate the field of learning representation for the entire
network. In this paper, we revisit Transformer and compare it with GNN to
analyse their different architectural characteristics. We then propose a
modified Transformer-based universal neural network representation learning
model, NAR-Former V2. It can learn efficient representations from both
cell-structured networks and entire networks. Specifically, we first take the
network as a graph and design a straightforward tokenizer to encode the network
into a sequence. Then, we incorporate the inductive representation learning
capability of GNNs into the Transformer, enabling it to generalize better when
encountering unseen architectures. Additionally, we introduce a series of
simple yet effective modifications to enhance the Transformer's ability to
learn representations from graph structures. Our proposed method surpasses
the GNN-based method NNLP by a significant margin in latency estimation on the
NNLQP dataset. Furthermore, regarding accuracy prediction on the NASBench101
and NASBench201 datasets, our method achieves highly comparable performance to
other state-of-the-art methods.
Comment: 9 pages, 2 figures, 6 tables. Code is available at https://github.com/yuny220/NAR-Former-V
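As a rough illustration of the "network as a token sequence" idea, the sketch
below encodes each node of a computation graph as an operator-type token and
feeds the sequence to a standard Transformer encoder that regresses a scalar
attribute such as latency or accuracy. The tokenizer, operator vocabulary, and
pooling are placeholders, and the paper's GNN-style inductive aggregation and
other modifications are omitted:

```python
# Illustrative sketch, not NAR-Former V2's actual tokenizer or architecture.
import torch
import torch.nn as nn

NUM_OP_TYPES = 10  # hypothetical operator vocabulary size
D_MODEL = 64

class ArchEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.op_embed = nn.Embedding(NUM_OP_TYPES, D_MODEL)
        self.pos_embed = nn.Embedding(128, D_MODEL)  # up to 128 nodes
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=3)
        self.head = nn.Linear(D_MODEL, 1)

    def forward(self, op_ids):
        # op_ids: (batch, num_nodes) operator ids in topological order
        pos = torch.arange(op_ids.size(1), device=op_ids.device)
        tokens = self.op_embed(op_ids) + self.pos_embed(pos)
        h = self.encoder(tokens)         # (batch, num_nodes, D_MODEL)
        return self.head(h.mean(dim=1))  # mean-pool -> scalar prediction

model = ArchEncoder()
fake_archs = torch.randint(0, NUM_OP_TYPES, (2, 12))  # 2 graphs, 12 nodes
print(model(fake_archs).shape)  # torch.Size([2, 1])
```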
Efficacy of Neural Prediction-Based NAS for Zero-Shot NAS Paradigm
In prediction-based Neural Architecture Search (NAS), performance indicators
derived from graph convolutional networks have shown significant success. These
indicators, achieved by representing feed-forward structures as component
graphs through one-hot encoding, face a limitation: they cannot evaluate
architecture performance across varying search spaces. In contrast,
handcrafted performance indicators (zero-shot NAS), which operate on the
architecture itself with randomly initialized weights, can generalize across
multiple search spaces. To address this limitation, we propose a novel
approach to zero-shot NAS using deep
learning. Our method employs Fourier sum of sines encoding for convolutional
kernels, enabling the construction of a computational feed-forward graph with a
structure similar to the architecture under evaluation. These encodings are
learnable and offer a comprehensive view of the architecture's topological
information. An accompanying multi-layer perceptron (MLP) then ranks these
architectures based on their encodings. Experimental results show that our
approach surpasses previous methods using graph convolutional networks in terms
of correlation on the NAS-Bench-201 dataset and exhibits a higher convergence
rate. Moreover, our extracted feature representation trained on each
NAS-Benchmark is transferable to other NAS-Benchmarks, showing promising
generalizability across multiple search spaces. The code is available at:
https://github.com/minh1409/DFT-NPZS-NAS
Comment: 12 pages, 6 figures
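A minimal sketch of a learnable sum-of-sines encoding, loosely following the
abstract's description, is given below. The exact parameterization, the
construction of the computational feed-forward graph, and the summary features
fed to the MLP ranker are assumptions made for illustration:

```python
# Sketch: encode scalar kernel weights with a learnable sum of sines, then
# score the architecture with a small MLP. Shapes and features are toy.
import torch
import torch.nn as nn

class SineEncoding(nn.Module):
    """Maps each scalar weight w to sum_k a_k * sin(f_k * w + p_k)."""
    def __init__(self, num_terms=8):
        super().__init__()
        self.amp = nn.Parameter(torch.randn(num_terms))
        self.freq = nn.Parameter(torch.randn(num_terms))
        self.phase = nn.Parameter(torch.randn(num_terms))

    def forward(self, w):
        # w: any shape of kernel weights; output matches the input shape
        s = torch.sin(w.unsqueeze(-1) * self.freq + self.phase)
        return (self.amp * s).sum(dim=-1)

encoder = SineEncoding()
ranker = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

kernel = torch.randn(64, 3, 3, 3)             # randomly initialized conv kernel
enc = encoder(kernel.flatten())
feats = torch.stack([enc.mean(), enc.std()])  # crude summary features
print(ranker(feats))                          # scalar ranking score
```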
GENNAPE: Towards Generalized Neural Architecture Performance Estimators
Predicting neural architecture performance is a challenging task and is
crucial to neural architecture design and search. Existing approaches either
rely on neural performance predictors, which are limited to modeling
architectures in a predefined design space with specific sets of operators and
connection rules and cannot generalize to unseen architectures, or resort to
zero-cost proxies, which are not always accurate. In this paper, we propose
GENNAPE, a Generalized Neural Architecture Performance Estimator, which is
pretrained on open neural architecture benchmarks, and aims to generalize to
completely unseen architectures through combined innovations in network
representation, contrastive pretraining, and fuzzy clustering-based predictor
ensemble. Specifically, GENNAPE represents a given neural network as a
Computation Graph (CG) of atomic operations which can model an arbitrary
architecture. It first learns a graph encoder via Contrastive Learning to
encourage network separation by topological features, and then trains multiple
predictor heads, which are soft-aggregated according to the fuzzy membership of
a neural network. Experiments show that GENNAPE pretrained on NAS-Bench-101 can
achieve superior transferability to 5 different public neural network
benchmarks, including NAS-Bench-201, NAS-Bench-301, and the MobileNet and
ResNet families, with no or minimal fine-tuning. We further introduce 3
challenging newly labelled neural network benchmarks: HiAML, Inception and
Two-Path, whose architectures concentrate in narrow accuracy ranges. Extensive
experiments show that
GENNAPE can correctly discern high-performance architectures in these families.
Finally, when paired with a search algorithm, GENNAPE can find architectures
that improve accuracy while reducing FLOPs on three families.
Comment: AAAI 2023 Oral Presentation; includes supplementary materials with more details on the introduced benchmarks; 14 pages, 6 figures, 10 tables
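The soft-aggregation step described in the abstract, weighting multiple
predictor heads by a network's fuzzy cluster membership, might look roughly
like the following. The membership computation (a softmax over distances to
placeholder centroids) and all dimensions are illustrative assumptions:

```python
# Sketch: combine predictor heads with weights from fuzzy cluster membership.
import torch
import torch.nn as nn

NUM_CLUSTERS = 4
EMBED_DIM = 32

heads = nn.ModuleList(
    nn.Sequential(nn.Linear(EMBED_DIM, 16), nn.ReLU(), nn.Linear(16, 1))
    for _ in range(NUM_CLUSTERS)
)
centroids = torch.randn(NUM_CLUSTERS, EMBED_DIM)  # placeholder cluster centers

def soft_aggregate(z):
    # z: (batch, EMBED_DIM) graph embeddings from a pretrained encoder
    dists = torch.cdist(z, centroids)           # (batch, NUM_CLUSTERS)
    membership = torch.softmax(-dists, dim=-1)  # closer cluster -> higher weight
    preds = torch.cat([h(z) for h in heads], dim=-1)
    return (membership * preds).sum(dim=-1, keepdim=True)

z = torch.randn(5, EMBED_DIM)
print(soft_aggregate(z).shape)  # torch.Size([5, 1])
```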
A Survey on Surrogate-assisted Efficient Neural Architecture Search
Neural architecture search (NAS) has become increasingly popular in the deep
learning community recently, mainly because it gives interested users without
rich expertise an opportunity to benefit from the success of deep neural
networks (DNNs). However, NAS is still laborious and time-consuming
because a large number of performance estimations are required during the
search process, and training DNNs is computationally intensive. To overcome
this major limitation, improving the efficiency of performance estimation is
essential in the design of NAS. This paper begins with a brief introduction to
the general framework of NAS. Then, methods for evaluating candidate networks
under proxy metrics are systematically discussed. This is followed by a
description
of surrogate-assisted NAS, which is divided into three different categories,
namely Bayesian optimization for NAS, surrogate-assisted evolutionary
algorithms for NAS, and multi-objective optimization (MOP) for NAS. Finally,
remaining challenges and open research questions are discussed, and promising
research topics are suggested in this emerging field.
Comment: 18 pages, 7 figures
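To make the surveyed idea concrete, here is a toy surrogate-assisted search
loop: a cheap surrogate is fitted on evaluated (architecture, accuracy) pairs
and used to rank candidates, so that expensive evaluations are spent only on
the most promising ones. The encoding, surrogate model, and evaluator are
stand-ins, not drawn from the survey:

```python
# Toy surrogate-assisted NAS loop with synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def true_eval(x):
    """Stand-in for expensive training + validation of an architecture."""
    return float(-np.sum((x - 0.5) ** 2))

# Warm start: evaluate a small random sample to fit the initial surrogate.
X = rng.random((20, 8))  # 8-dim architecture encodings
y = np.array([true_eval(x) for x in X])

surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
for _ in range(5):  # 5 rounds of surrogate-guided search
    surrogate.fit(X, y)
    pool = rng.random((500, 8))           # cheap-to-generate candidates
    scores = surrogate.predict(pool)
    best = pool[np.argsort(scores)[-3:]]  # top 3 by predicted score
    X = np.vstack([X, best])
    y = np.append(y, [true_eval(x) for x in best])  # expensive evals here only

print("best found:", y.max())
```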
Neural Architecture Search for Image Segmentation and Classification
Deep learning (DL) is a class of machine learning algorithms that relies on deep neural networks (DNNs) for computations. Unlike traditional machine learning algorithms, DL can learn from raw data directly and effectively. Hence, DL has been successfully applied to tackle many real-world problems. When applying DL to a given problem, the primary task is designing the optimum DNN. This task relies heavily on human expertise, is time-consuming, and requires many trial-and-error experiments.
This thesis aims to automate the laborious task of designing the optimum DNN by exploring the neural architecture search (NAS) approach. Here, we propose two new NAS algorithms for two real-world problems: pedestrian lane detection for assistive navigation and hyperspectral image segmentation for biosecurity scanning. Additionally, we introduce a new dataset-agnostic predictor of neural network performance, which can be used to speed up NAS algorithms that require the evaluation of candidate DNNs.