ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradients Accumulation
Single-path-based differentiable neural architecture search is appealing for its
low computational cost and memory-friendly nature. However, we surprisingly
discover that it suffers from severe search instability, which has so far been
largely overlooked and poses a potential weakness for wider application. In
this paper, we delve into this performance collapse issue and propose a new
algorithm called RObustifying Memory-Efficient NAS (ROME).
Specifically, 1) to keep the topology consistent between the search and
evaluation stages, we introduce separate parameters that disentangle the
topology from the operations of the architecture; in this way, connections and
operations can be sampled independently without interference; 2) to reduce
sampling unfairness and variance, we enforce fair sampling for the weight
updates and apply a gradient accumulation mechanism to the architecture
parameters. Extensive experiments demonstrate that our proposed method has
strong performance and robustness, achieving state-of-the-art results on most
of a large set of standard benchmarks.
Comment: Observes a new collapse in memory-efficient NAS and addresses it using ROME
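
Below is a minimal PyTorch sketch of the two ideas in this abstract, reconstructed from the abstract alone rather than from the authors' code: a separate topology parameter (beta) is disentangled from the operation parameter (alpha) so that connections and operations are sampled independently, and architecture gradients are accumulated over several single-path samples before each update. NUM_OPS, NUM_EDGES, ACCUM_STEPS, val_loss_fn, and fair_op_schedule are all illustrative assumptions.

```python
# Illustrative sketch (not the authors' implementation) of ROME's two ideas:
# disentangled topology/operation parameters and gradient accumulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_OPS = 5      # candidate operations per edge (assumed)
NUM_EDGES = 14   # edges in a DARTS-style cell (assumed)
ACCUM_STEPS = 4  # single-path samples accumulated per arch update (assumed)

# Disentangled architecture parameters: alpha scores the operation on each
# edge, beta scores whether the edge itself is connected (on/off logits).
alpha = nn.Parameter(1e-3 * torch.randn(NUM_EDGES, NUM_OPS))
beta = nn.Parameter(1e-3 * torch.randn(NUM_EDGES, 2))
arch_opt = torch.optim.Adam([alpha, beta], lr=3e-4)

def sample_single_path():
    """Sample one operation per edge and, independently, an edge mask.
    Hard Gumbel-softmax keeps memory low (a single path is active) while
    staying differentiable via the straight-through estimator."""
    op_onehot = F.gumbel_softmax(alpha, tau=1.0, hard=True)       # (E, O)
    edge_mask = F.gumbel_softmax(beta, tau=1.0, hard=True)[:, 0]  # (E,)
    return op_onehot, edge_mask

def fair_op_schedule(step):
    """For supernet *weight* updates, cycle through the candidates so each
    operation is trained equally often (fair sampling)."""
    ops = torch.full((NUM_EDGES,), step % NUM_OPS, dtype=torch.long)
    return F.one_hot(ops, NUM_OPS).float()

def arch_update(val_loss_fn):
    """Accumulate architecture gradients over ACCUM_STEPS sampled paths to
    reduce the variance of single-path estimates, then step once.
    val_loss_fn is an assumed callable mapping (op_onehot, edge_mask) to a
    validation loss computed through the supernet."""
    arch_opt.zero_grad()
    for _ in range(ACCUM_STEPS):
        op_onehot, edge_mask = sample_single_path()
        loss = val_loss_fn(op_onehot, edge_mask) / ACCUM_STEPS
        loss.backward()
    arch_opt.step()
```

Sampling beta and alpha separately is what keeps the searched topology consistent with the evaluated one; averaging the loss over ACCUM_STEPS samples before stepping is a standard way to damp the variance of single-path gradient estimates.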
IS-DARTS: Stabilizing DARTS through Precise Measurement on Candidate Importance
Among existing Neural Architecture Search methods, DARTS is known for its
efficiency and simplicity. It applies a continuous relaxation of the network
representation to construct a weight-sharing supernet, enabling the
identification of excellent subnets in just a few GPU-days. However,
performance collapse in DARTS yields deteriorated architectures filled with
parameter-free operations and remains a great challenge to its robustness.
To resolve this problem, we show through theoretical and experimental analysis
that the fundamental cause is biased estimation of candidate importance in the
search space, and we select operations more precisely via information-based
measurements. Furthermore, we demonstrate that excessive focus on the supernet
and inefficient use of data in bi-level optimization also account for
suboptimal results. We therefore adopt a more realistic objective focused on
the performance of subnets and simplify it with the help of the
information-based measurements. Finally, we explain theoretically why
progressively shrinking the width of the supernet is necessary, and we reduce
the approximation error of the optimal weights in DARTS. Our proposed method,
named IS-DARTS, comprehensively improves DARTS and resolves the aforementioned
problems. Extensive experiments on NAS-Bench-201 and the DARTS search space
demonstrate the effectiveness of IS-DARTS.
Comment: accepted by AAAI 2024, paper + supplementary, 11 pages
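
The abstract does not spell out the information-based measurement itself, so the sketch below is an assumption-laden stand-in rather than the paper's method: it scores each candidate operation by the variance of its output features (near-constant outputs, typical of the parameter-free ops that dominate collapsed cells, score low) and uses the ranking to progressively shrink the supernet width. candidate_ops and val_loader are assumed inputs.

```python
# Hedged stand-in for the IS-DARTS pipeline: rank candidate operations with an
# information-style score, then progressively prune the lowest-ranked ones.
import torch

@torch.no_grad()
def information_score(op, val_loader, device="cpu"):
    """Proxy importance score (an assumption, not the paper's measurement):
    mean per-feature variance of the operation's output on validation data.
    Parameter-free ops with near-constant outputs carry little information
    and therefore rank low."""
    batch_scores = []
    for x, _ in val_loader:
        out = op(x.to(device))
        batch_scores.append(out.float().var(dim=0).mean())
    return torch.stack(batch_scores).mean().item()

def shrink_supernet(candidate_ops, val_loader, keep_ratio=0.5):
    """One progressive-shrinking step: keep only the top-scoring fraction of
    candidates, narrowing the supernet and (per the abstract's argument)
    reducing the approximation error of the optimal weights."""
    ranked = sorted(candidate_ops,
                    key=lambda op: information_score(op, val_loader),
                    reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_ratio))]
```

Calling shrink_supernet repeatedly between training phases realizes the progressive width reduction the abstract argues is necessary.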