Transformations for linguistic steganography
Linguistic steganography is a form of covert communication that uses natural language to conceal the existence of a hidden message. It is usually achieved by systematically modifying a cover text such that the manipulations, and hence the very act of communication, are undetectable to an outside observer (human or computer). In this thesis, we explore three linguistic transformations (lexical substitution, adjective deletion and word ordering) that can generate alternatives for a cover text. For each transformation, we propose transformation checkers that certify the naturalness of a modified sentence.
Our lexical substitution checkers are based on contextual n-gram counts, and on the α-skew divergence of those counts, derived from the Google n-gram corpus. For adjective deletion, we propose an n-gram count method similar to the substitution n-gram checker, and a support vector machine classifier that uses n-gram counts and other measures to classify adjectives in context as deletable or undeletable. For word ordering, we train a maximum entropy classifier on syntactic features to determine the naturalness of a sentence permutation.
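As an illustration of the divergence mentioned above, the following is a minimal sketch of the α-skew divergence in its standard form, s_α(q, r) = KL(r || αq + (1 − α)r), applied to two tables of n-gram counts. The function name, the default α = 0.99 and the example counts are assumptions made for illustration; the abstract does not say how the thesis normalises the counts or which α it uses.

```python
import math


def alpha_skew_divergence(q_counts, r_counts, alpha=0.99):
    """Return KL(r || alpha*q + (1-alpha)*r) for two count dictionaries.

    q_counts and r_counts map a context (e.g. an n-gram) to a raw count.
    Mixing a little of r into q keeps the divergence finite even when
    q has zero counts for contexts that r has seen.
    """
    q_total = sum(q_counts.values())
    r_total = sum(r_counts.values())
    divergence = 0.0
    for key in set(q_counts) | set(r_counts):
        r_p = r_counts.get(key, 0) / r_total
        if r_p == 0:
            continue  # terms with r(key) = 0 contribute nothing to the sum
        q_p = q_counts.get(key, 0) / q_total
        mixture = alpha * q_p + (1 - alpha) * r_p
        divergence += r_p * math.log(r_p / mixture)
    return divergence


# Hypothetical contextual n-gram counts for an original word and a candidate substitute.
original = {"a big house": 120, "big house on": 40, "the big house": 90}
substitute = {"a large house": 100, "large house on": 35, "the large house": 80}
# Compare on a shared context index (position in the sentence), an assumed alignment.
aligned_original = dict(enumerate(original.values()))
aligned_substitute = dict(enumerate(substitute.values()))
score = alpha_skew_divergence(aligned_substitute, aligned_original)
```

A smaller score would suggest that the substitute behaves like the original word across its contexts; where to set the acceptance threshold is a separate decision.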
The proposed transformation checkers were evaluated against human-judged data, and the results are presented as precision-recall curves. The precision and recall of a transformation checker can be interpreted as the security level and the embedding capacity of the stegosystem, respectively. The results show that the proposed checkers can provide a reliable security level and a reasonable embedding capacity for the steganography application.
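The reading of precision as security and recall as capacity can be made concrete with a small sketch. It assumes each candidate transformation has a checker score and a binary human judgement (1 = judged natural); the function name, the threshold rule and the data are hypothetical, not taken from the thesis.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall of a checker that accepts scores >= threshold.

    Precision: fraction of accepted transformations that humans judged natural
    (a proxy for security). Recall: fraction of natural transformations that
    the checker accepts (a proxy for embedding capacity).
    """
    accepted = [label for score, label in zip(scores, labels) if score >= threshold]
    true_positives = sum(accepted)
    total_positives = sum(labels)
    # Convention assumed here: an empty accepted set counts as precision 1.0.
    precision = true_positives / len(accepted) if accepted else 1.0
    recall = true_positives / total_positives if total_positives else 0.0
    return precision, recall


# Hypothetical checker scores and human judgements for four candidate sentences.
scores = [0.9, 0.8, 0.4, 0.2]
labels = [1, 0, 1, 1]
curve = [precision_recall(scores, labels, t) for t in (0.0, 0.5, 0.85)]
```

Raising the threshold here trades capacity for security: fewer transformations are accepted, but a larger share of those accepted are ones a human would find natural.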
In addition to the transformation checkers, we demonstrate possible data encoding methods for each of the linguistic transformations. For lexical substitution, we propose a novel encoding method based on vertex colouring. For adjective deletion, we not only illustrate its use in the steganography application, but also show that the technique can be applied to a secret sharing scheme, in which the secret message is encoded in two versions of the carrier text, with different adjectives deleted in each version. For word ordering, we propose a ranking-based encoding method and also show how the technique can be integrated into existing translation-based embedding methods.
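The abstract does not describe the vertex colouring method in detail, so the following is only a generic sketch of how such an encoding might work, not the algorithm proposed in the thesis. It assumes that words are vertices, that two words are joined by an edge when they share a synonym set, and that a word's colour is the bit value it encodes. The names `colour_words` and `choose_word` and the synonym sets are hypothetical.

```python
def colour_words(synsets, n_colours):
    """Greedily colour words so that words sharing a synset get different codes.

    Returns a dict word -> colour in range(n_colours), or None when no colour
    is free (such a word would simply not be used for embedding).
    """
    neighbours = {}
    for synset in synsets:
        for word in synset:
            neighbours.setdefault(word, set()).update(w for w in synset if w != word)
    colours = {}
    # Colour high-degree words first; ties are broken alphabetically so the
    # sender and receiver derive the same assignment.
    for word in sorted(neighbours, key=lambda w: (-len(neighbours[w]), w)):
        used = {colours[n] for n in neighbours[word] if n in colours}
        free = [c for c in range(n_colours) if c not in used]
        colours[word] = free[0] if free else None
    return colours


def choose_word(synset, bit, colours):
    """Pick the synonym whose colour equals the secret bit, if there is one."""
    for word in sorted(synset):
        if colours.get(word) == bit:
            return word
    return None


# Hypothetical overlapping synonym sets; "large" belongs to both.
synsets = [{"big", "large"}, {"large", "great"}]
colours = colour_words(synsets, n_colours=2)
stego_word = choose_word({"big", "large"}, 1, colours)
```

Because a word that appears in several synonym sets receives a single colour, the receiver can decode a word from the shared colouring alone, without knowing which synonym set the sender had in mind.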