As neural networks continue their reach into nearly every aspect of software
operations, the details of those networks become an increasingly sensitive
subject. Even those who deploy neural networks embedded in physical devices
may wish to keep the inner workings of their designs hidden, either to protect
their intellectual property or to guard against adversarial inputs.
The specific problem we address is how, through a deep system stack and given
only noisy and imperfect memory traces, one might reconstruct the neural
network architecture, including the set of layers employed, their
connectivity, and their respective dimensions. Observing that both intra-layer
architectural features and the inter-layer temporal associations introduced by
empirical DNN design practice carry predictive signal, we draw upon ideas from
speech recognition to solve this problem. We show that off-chip memory address
traces and PCIe events
provide ample information to reconstruct such neural network architectures
accurately. We are the first to propose such accurate model extraction
techniques and to demonstrate an end-to-end attack experimentally on an
off-the-shelf Nvidia GPU platform with a full system stack. Results show that
the proposed techniques achieve high reverse-engineering accuracy and improve
an adversary's ability to conduct targeted adversarial attacks, raising the
success rate from 14.6\%--25.5\% (without network architecture knowledge) to
75.9\% (with the extracted network architecture).
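To make the speech-recognition analogy concrete, below is a minimal,
hypothetical sketch (not the paper's actual model; the feature set, layer
vocabulary, and network sizes are all illustrative assumptions) of how
layer-sequence extraction can be cast as sequence labeling: an LSTM trained
with the CTC loss maps a trace of per-kernel features, such as latency and
read/write volume, to a sequence of layer types, much as acoustic frames are
mapped to phonemes.

\begin{verbatim}
# Hypothetical sketch: architecture extraction as CTC-style sequence
# labeling over a kernel-level execution trace. All names and sizes are
# illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn

LAYER_VOCAB = ["<blank>", "conv", "relu", "pool", "fc", "add"]

class LayerSeqPredictor(nn.Module):
    def __init__(self, n_features=4, hidden=64, n_classes=len(LAYER_VOCAB)):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True,
                           bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, trace_len, n_features)
        h, _ = self.rnn(x)
        return self.fc(h).log_softmax(-1)  # per-step log-probs over layer types

model = LayerSeqPredictor()
ctc = nn.CTCLoss(blank=0)                  # index 0 is the CTC blank

# One synthetic training step: a 10-kernel trace labeled with 5 layers.
trace = torch.randn(1, 10, 4)              # per-kernel feature vectors
target = torch.tensor([[1, 2, 3, 4, 1]])   # conv relu pool fc conv
log_probs = model(trace).transpose(0, 1)   # CTCLoss expects (T, batch, C)
loss = ctc(log_probs, target,
           input_lengths=torch.tensor([10]),
           target_lengths=torch.tensor([5]))
loss.backward()

# Greedy CTC decode at attack time: collapse repeats, then drop blanks.
best = log_probs.argmax(-1).squeeze(1).tolist()
layers = [LAYER_VOCAB[c] for i, c in enumerate(best)
          if c != 0 and (i == 0 or c != best[i - 1])]
print(layers)
\end{verbatim}

The design choice mirrors speech recognition directly: the number of observed
kernels in a trace need not match the number of architectural layers, and CTC
resolves that alignment without requiring per-kernel labels.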