Deep convolutional neural network (CNN) training via iterative optimization
has been remarkably successful at finding performant parameters. However, modern CNN
architectures often contain millions of parameters. Thus, any given model for a
single architecture resides in a massive parameter space. Models with similar
loss could have drastically different characteristics such as adversarial
robustness, generalizability, and quantization robustness. For deep learning on
the edge, quantization robustness is often crucial. Finding a model that is
quantization-robust can require significant effort. Recent work
on Graph Hypernetworks (GHNs) has shown remarkable performance in predicting
high-performing parameters for varying CNN architectures. Inspired by these
successes, we ask whether the graph representations of GHN-2 can be leveraged to
predict quantization-robust parameters as well; we call this approach GHN-Q. We conduct
the first-ever study exploring the use of graph hypernetworks for predicting
parameters of unseen quantized CNN architectures. We focus on a reduced CNN
search space and find that GHN-Q can in fact predict quantization-robust
parameters for various 8-bit quantized CNNs. Decent quantized accuracies are
observed even with 4-bit quantization despite GHN-Q not being trained on it.
Quantized finetuning of GHN-Q at lower bitwidths may bring further improvements
and is currently being explored.

Comment: Updated Figure 1 and added additional results in Table 1. Initial
extended abstract version accepted at Edge Intelligence Workshop 2022 for
poster presentation.
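As a concrete illustration of the uniform quantization discussed above, the following is a minimal sketch of simulated quantize-dequantize of a weight vector in plain Python. The function name and the asymmetric (affine) scheme are our assumptions for illustration, not the paper's exact quantization setup:

```python
def quantize_dequantize(weights, num_bits=8):
    """Simulate uniform affine quantization of a list of float weights.

    Maps floats to integers in [0, 2^num_bits - 1] and back, so the
    returned values carry the rounding error a quantized model would see.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Scale maps the float range onto the integer grid; guard against
    # a degenerate all-equal weight vector (zero range).
    scale = (w_max - w_min) / (qmax - qmin) or 1.0
    zero_point = round(qmin - w_min / scale)
    # Quantize: scale, shift by zero point, round, clamp to the grid.
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    # Dequantize: map grid indices back to floats.
    return [(qi - zero_point) * scale for qi in q]
```

At 8 bits the round-trip error per weight is bounded by roughly one quantization step, while dropping to 4 bits (as in the abstract's harder setting) coarsens the grid 16-fold and correspondingly enlarges the error.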