Prior work on Private Inference (PI)--inference performed directly on
encrypted input--has focused on minimizing a network's ReLU count, on the
assumption that ReLUs, rather than FLOPs, dominate PI latency. Recent work has
shown that FLOPs can no longer be ignored in PI and incur high latency penalties. In this
paper, we develop DeepReShape, a network redesign technique that tailors
architectures to PI constraints, optimizing for both ReLUs and FLOPs for the
first time. The {\em key insight} is that strategically allocating channels
so that the network's ReLUs are aligned in their criticality order
simultaneously optimizes ReLU and FLOP efficiency. DeepReShape automates
network development with an efficient process, and we call the generated networks
HybReNets. We evaluate DeepReShape using standard PI benchmarks and demonstrate
a 2.1\% accuracy gain with a 5.2× runtime improvement at iso-ReLU on
CIFAR-100 and an 8.7× runtime improvement at iso-accuracy on
TinyImageNet. Furthermore, we demystify the input network selection in prior
ReLU optimizations and shed light on the key network attributes enabling PI
efficiency.

Comment: 37 pages, 23 figures, and 17 tables