PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment
Two-party computation (2PC) is a promising approach to privacy-preserving deep
learning (DL). However, 2PC-based privacy-preserving DL implementations incur
high comparison-protocol overhead from the non-linear operators.
This work presents PASNet, a novel systematic framework that enables low
latency, high energy efficiency & accuracy, and security-guaranteed 2PC-DL by
integrating the hardware latency of the cryptographic building block into the
neural architecture search loss function. We develop a cryptographic hardware
scheduler and the corresponding performance model for Field Programmable Gate
Arrays (FPGA) as a case study. The experimental results demonstrate that our
lightweight model PASNet-A and heavyweight model PASNet-B achieve 63 ms
and 228 ms latency for private inference on ImageNet, which is 147 and 40 times
faster than the SOTA CryptGPU system, while achieving 70.54% and 78.79% accuracy and
more than 1000 times higher energy efficiency.

Comment: DAC 2023 accepted publication; a short version was published at the AAAI
2023 workshop on DL-Hardware Co-Design for AI Acceleration as RRNet: Towards
ReLU-Reduced Neural Network for Two-party Computation Based Private Inference
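The core idea above, folding the hardware latency of each cryptographic building block into the architecture-search loss, can be sketched as follows. This is a minimal illustrative sketch: the operator names, latency values, and the weighting factor `lam` are assumptions for demonstration, not figures from the paper.

```python
# Hypothetical sketch of a latency-aware search loss in the spirit of PASNet.
# All latency numbers and operator names below are illustrative assumptions.

# Assumed per-operator 2PC latency model (milliseconds), as a hardware
# performance model might report for an FPGA scheduler.
LATENCY_MS = {
    "conv3x3": 1.2,
    "relu_2pc": 5.0,   # comparison-protocol heavy non-linear operator
    "poly_act": 0.4,   # polynomial replacement, no comparison protocol
}

def predicted_latency(arch):
    """Sum the modeled latency of every operator in a candidate architecture."""
    return sum(LATENCY_MS[op] for op in arch)

def search_loss(task_loss, arch, lam=0.01):
    """Task loss plus a latency penalty, steering search toward cheap operators."""
    return task_loss + lam * predicted_latency(arch)

relu_arch = ["conv3x3", "relu_2pc", "conv3x3", "relu_2pc"]
poly_arch = ["conv3x3", "poly_act", "conv3x3", "poly_act"]

# With equal task loss, the polynomial architecture scores lower (better).
print(search_loss(1.0, relu_arch) > search_loss(1.0, poly_arch))  # True
```

In a real search loop the latency term would be differentiable or estimated per sampled architecture; the point here is only that operator choice is penalized by its modeled 2PC cost.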
PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference
The rapid growth and deployment of deep learning (DL) has raised emerging
privacy and security concerns. To mitigate these issues, secure multi-party
computation (MPC) has been proposed to enable privacy-preserving DL
computation. In practice, MPC-based approaches often incur very high
computation and communication overhead, which can prohibit their adoption in
large-scale systems. Two orthogonal research trends have attracted enormous
interest in addressing energy efficiency in secure deep learning: overhead
reduction of the MPC comparison protocol, and hardware acceleration. However, they
either achieve a low reduction ratio and suffer from high latency due to
limited computation and communication savings, or are power-hungry, as existing
works mainly focus on general computing platforms such as CPUs and GPUs.
In this work, as the first attempt, we develop a systematic framework,
PolyMPCNet, of joint overhead reduction of MPC comparison protocol and hardware
acceleration, by integrating hardware latency of the cryptographic building
block into the DNN loss function to achieve high energy efficiency, accuracy,
and a security guarantee. Instead of heuristically checking model sensitivity
after a DNN is well trained (by deleting or dropping some non-polynomial
operators), our key design principle is to enforce exactly what is assumed
in the DNN design: training a DNN that is both hardware-efficient and secure,
while escaping the local minima and saddle points and maintaining high
accuracy. More specifically, we propose a straight-through polynomial
activation initialization method for a cryptographic-hardware-friendly
trainable polynomial activation function that replaces the expensive 2P-ReLU
operator. We develop a cryptographic hardware scheduler and the corresponding
performance model for the Field Programmable Gate Array (FPGA) platform.
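A trainable polynomial activation of the kind described above can be sketched as a small quadratic whose coefficients are learned by gradient descent. The quadratic form and the near-identity initialization below are illustrative assumptions; the paper's exact straight-through initialization scheme may differ.

```python
import numpy as np

# Hypothetical sketch of a trainable polynomial activation standing in for
# the comparison-heavy 2P-ReLU. Coefficient values are illustrative assumptions.

class PolyAct:
    """y = a*x^2 + b*x + c with trainable coefficients.

    Initialized close to the identity (a ~ 0, b = 1, c = 0) so that swapping
    it in for ReLU barely perturbs the network at step 0, after which gradient
    descent moves the coefficients away gradually.
    """
    def __init__(self, a=1e-3, b=1.0, c=0.0):
        self.a, self.b, self.c = a, b, c

    def forward(self, x):
        return self.a * x**2 + self.b * x + self.c

    def backward(self, x, grad_out):
        """Gradients w.r.t. the coefficients and the input, for SGD updates."""
        d_a = np.sum(grad_out * x**2)
        d_b = np.sum(grad_out * x)
        d_c = np.sum(grad_out)
        d_x = grad_out * (2 * self.a * x + self.b)
        return d_a, d_b, d_c, d_x

act = PolyAct()
x = np.array([-2.0, 0.0, 2.0])
print(act.forward(x))  # close to x itself at initialization
```

Because the polynomial needs only multiplications and additions, it maps onto cheap 2PC arithmetic, avoiding the comparison protocol entirely.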