In this paper, we propose a privacy-preserving method with a secret key for
convolutional neural network (CNN)-based speech classification tasks. Recently,
many methods related to privacy preservation have been developed in image
classification research fields. In contrast, in speech classification, few
studies have addressed privacy risks. To promote research on
privacy preservation for speech classification, we provide an encryption method
with a secret key in CNN-based speech classification systems. The encryption
method is based on an invertible random matrix. The encrypted
speech data produced with the correct key can be processed by a model whose
encrypted kernel is generated using the inverse of the random matrix. Although
the encrypted speech data is strongly distorted, the classification tasks can
be performed correctly when the correct key is provided. Additionally, in this
paper, we evaluate the difficulty of reconstructing the original information
from the encrypted spectrograms and waveforms. In our experiments, the proposed
encryption methods are applied to automatic speech recognition~(ASR) and
automatic speaker verification~(ASV) tasks. The results show that, when the
correct secret key is provided, the encrypted data can be used in exactly the
same way as the original data in the transformer-based ASR and x-vector-based
ASV with self-supervised front-end systems. The robustness of the encrypted
data against
reconstruction attacks is also illustrated.

Comment: To appear in the 31st European Signal Processing Conference (EUSIPCO
2023)
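
The key-based scheme summarized above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: it assumes the first layer acts as a linear map on a flattened input patch, and all names and sizes (`patch_dim`, `out_channels`, `key`, `W`) are hypothetical. It only shows why multiplying the input by a random invertible matrix and the kernel by that matrix's inverse leaves the layer output unchanged, while a wrong key does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): a flattened input patch of 16
# values and a first layer with 4 output channels.
patch_dim, out_channels = 16, 4

# Secret key: a random square matrix. A Gaussian random matrix is
# invertible with probability 1.
key = rng.standard_normal((patch_dim, patch_dim))
key_inv = np.linalg.inv(key)

x = rng.standard_normal(patch_dim)                   # plain input patch
W = rng.standard_normal((out_channels, patch_dim))   # plain kernel as a matrix

x_enc = key @ x        # encrypted (strongly distorted) input
W_enc = W @ key_inv    # encrypted kernel built from the inverse matrix

y_plain = W @ x            # output of the plain model on plain data
y_enc = W_enc @ x_enc      # output of the encrypted model on encrypted data

# Input encrypted with a different (wrong) key, fed to the same model.
wrong_key = rng.standard_normal((patch_dim, patch_dim))
y_wrong = W_enc @ (wrong_key @ x)

matches = bool(np.allclose(y_plain, y_enc))
wrong_matches = bool(np.allclose(y_plain, y_wrong))
print("correct key reproduces output:", matches)
print("wrong key reproduces output:", wrong_matches)
```

In this sketch the equality follows from `W_enc @ x_enc = W @ key_inv @ key @ x = W @ x`, up to floating-point error.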