The rapid expansion of generative artificial intelligence (AI) services has created computational demands that frequently necessitate cloud-based offloading, particularly for resource-constrained mobile devices. These services commonly employ prompts to steer the generative process, and both the prompts and the resultant content, such as text and images, may contain privacy-sensitive or confidential information, thereby elevating security and privacy risks. To mitigate these
concerns, we introduce Λ-Split, a split computing framework to
facilitate computational offloading while simultaneously fortifying data
privacy against risks such as eavesdropping and unauthorized access. In
Λ-Split, a generative model, usually a deep neural network (DNN), is
partitioned into three sub-models and distributed across the user's local
device and a cloud server: the input-side and output-side sub-models are
allocated to the local device, while the intermediate, computationally intensive
sub-model resides on the cloud server. This architecture ensures that only the
hidden layer outputs are transmitted, thereby preventing the external
transmission of privacy-sensitive raw input and output data. Given the
black-box nature of DNNs, estimating the original input or output from
intercepted hidden layer outputs poses a significant challenge for malicious
eavesdroppers. Moreover, Λ-Split is orthogonal to traditional
encryption-based security mechanisms, offering enhanced security when the two are deployed in conjunction. We empirically validate the efficacy of the Λ-Split
framework using Llama 2 and Stable Diffusion XL, representative large language
and diffusion models developed by Meta and Stability AI, respectively. Our
Λ-Split implementation is publicly accessible at
https://github.com/nishio-laboratory/lambda_split.
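
To make the three-way partitioning concrete, below is a minimal sketch in PyTorch, assuming the generative model can be expressed as a sequence of blocks. The helper `split_model` and the cut indices are illustrative assumptions for exposition, not the interface of the released lambda_split implementation.

```python
import torch
import torch.nn as nn

def split_model(blocks: nn.Sequential, first_cut: int, second_cut: int):
    """Partition a sequential model into head (local), body (cloud),
    and tail (local) sub-models."""
    head = blocks[:first_cut]             # input-side sub-model, kept on the user's device
    body = blocks[first_cut:second_cut]   # computationally intensive sub-model, offloaded to the cloud
    tail = blocks[second_cut:]            # output-side sub-model, kept on the user's device
    return head, body, tail

# Toy stack of layers standing in for a large generative DNN.
model = nn.Sequential(*[nn.Linear(64, 64) for _ in range(8)])
head, body, tail = split_model(model, first_cut=1, second_cut=7)

x = torch.randn(1, 64)   # privacy-sensitive raw input never leaves the device
h1 = head(x)             # only this hidden-layer output is uploaded to the server
h2 = body(h1)            # cloud executes the intermediate sub-model
y = tail(h2)             # raw output is reconstructed locally from h2
```

In a deployment, only the hidden representations `h1` and `h2` would cross the device-cloud boundary, so an eavesdropper never observes the raw prompt or the generated content.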