Enhancing steganography for hiding pixels inside audio signals

Abstract

Multimodal steganography consists of concealing a signal inside another one of a different medium, such that the latter is only very slightly distorted and the hidden information can later be recovered. A previous work employed deep learning techniques to this end by hiding an image inside an audio signal's spectrogram in such a way that the encoding of one is independent of the other. In this work we examine how images were previously encoded and present a collection of improvements that produce a significant increase in the quality of the system. These mainly consist of encoding the image in a smarter way, so that more information can be transmitted in a container of the same size. We also explore using the short-time Fourier transform phase as an alternative to the magnitude, and randomly permuting the signal to break the structure of the noise. Finally, we report results when using a larger container signal and outline possible directions for future work.
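
The random permutation idea can be illustrated with a minimal sketch. It is not the implementation used in this work: the abstract does not say which signal is permuted or how, so the code below simply assumes a 2D array (for example an image to be hidden) whose elements are shuffled with a seeded permutation and later restored. The function names `permute` and `unpermute` and the use of a shared integer seed are hypothetical choices made for illustration.

```python
import numpy as np


def permute(signal, seed):
    """Shuffle all elements of `signal` with a permutation derived from `seed`.

    Returns the shuffled array (same shape) and the permutation indices.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(signal.size)
    shuffled = signal.ravel()[perm].reshape(signal.shape)
    return shuffled, perm


def unpermute(shuffled, perm):
    """Invert `permute`: put every element back at its original position."""
    flat = np.empty(shuffled.size, dtype=shuffled.dtype)
    flat[perm] = shuffled.ravel()
    return flat.reshape(shuffled.shape)


# Toy "image": a 4x6 array of distinct values (assumed data, not from the paper).
image = np.arange(24, dtype=np.float64).reshape(4, 6)

shuffled, perm = permute(image, seed=0)

# Simulate structured distortion: corrupt one contiguous row of the shuffled array.
corrupted = shuffled.copy()
corrupted[0, :] += 1.0

restored_clean = unpermute(shuffled, perm)
restored_noisy = unpermute(corrupted, perm)

# Number of elements affected by the distortion after undoing the permutation.
num_corrupted = int(np.count_nonzero(restored_noisy != image))
```

Because the permutation is a bijection, undoing it recovers the array exactly when there is no distortion. A distortion that hits a contiguous region of the shuffled array still affects the same number of elements after `unpermute`, but they are generally scattered across the array rather than grouped together, which is the sense in which a permutation can break the structure of the noise.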
