22 research outputs found
Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding
Linguistic steganography (LS) aims to embed secret information into a highly
encoded text for covert communication. It can be roughly divided to two main
categories, i.e., modification based LS (MLS) and generation based LS (GLS).
Unlike MLS that hides secret data by slightly modifying a given text without
impairing the meaning of the text, GLS uses a trained language model to
directly generate a text carrying secret data. A common disadvantage for MLS
methods is that the embedding payload is very low, whose return is well
preserving the semantic quality of the text. In contrast, GLS allows the data
hider to embed a high payload, which has to pay the high price of
uncontrollable semantics. In this paper, we propose a novel LS method to modify
a given text by pivoting it between two different languages and embed secret
data by applying a GLS-like information encoding strategy. Our purpose is to
alter the expression of the given text, enabling a high payload to be embedded
while keeping the semantic information unchanged. Experimental results have
shown that the proposed work not only achieves a high embedding payload, but
also shows superior performance in maintaining the semantic consistency and
resisting linguistic steganalysis
Novel linguistic steganography based on character-level text generation
With the development of natural language processing, linguistic steganography has become a research hotspot in the field of information security. However, most existing linguistic steganographic methods may suffer from the low embedding capacity problem. Therefore, this paper proposes a character-level linguistic steganographic method (CLLS) to embed the secret information into characters instead of words by employing a long short-term memory (LSTM) based language model. First, the proposed method utilizes the LSTM model and large-scale corpus to construct and train a character-level text generation model. Through training, the best evaluated model is obtained as the prediction model of generating stego text. Then, we use the secret information as the control information to select the right character from predictions of the trained character-level text generation model. Thus, the secret information is hidden in the generated text as the predicted characters having different prediction probability values can be encoded into different secret bit values. For the same secret information, the generated stego texts vary with the starting strings of the text generation model, so we design a selection strategy to find the highest quality stego text from a number of candidate stego texts as the final stego text by changing the starting strings. The experimental results demonstrate that compared with other similar methods, the proposed method has the fastest running speed and highest embedding capacity. Moreover, extensive experiments are conducted to verify the effect of the number of candidate stego texts on the quality of the final stego text. The experimental results show that the quality of the final stego text increases with the number of candidate stego texts increasing, but the growth rate of the quality will slow down
Recommended from our members
Secure digital documents using Steganography and QR Code
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University LondonWith the increasing use of the Internet several problems have arisen regarding the processing of electronic documents. These include content filtering, content retrieval/search. Moreover, document security has taken a centre stage including copyright protection, broadcast monitoring etc. There is an acute need of an effective tool which can find the identity, location and the time when the document was created so that it can be determined whether or not the contents of the document were tampered with after creation. Owing the sensitivity of the large amounts of data which is processed on a daily basis, verifying the authenticity and integrity of a document is more important now than it ever was. Unsurprisingly document authenticity verification has become the centre of attention in the world of research. Consequently, this research is concerned with creating a tool which deals with the above problem. This research proposes the use of a Quick Response Code as a message carrier for Text Key-print. The Text Key-print is a novel method which employs the basic element of the language (i.e. Characters of the alphabet) in order to achieve authenticity of electronic documents through the transformation of its physical structure into a logical structured relationship. The resultant dimensional matrix is then converted into a binary stream and encapsulated with a serial number or URL inside a Quick response Code (QR code) to form a digital fingerprint mark. For hiding a QR code, two image steganography techniques were developed based upon the spatial and the transform domains. In the spatial domain, three methods were proposed and implemented based on the least significant bit insertion technique and the use of pseudorandom number generator to scatter the message into a set of arbitrary pixels. These methods utilise the three colour channels in the images based on the RGB model based in order to embed one, two or three bits per the eight bit channel which results in three different hiding capacities. The second technique is an adaptive approach in transforming domain where a threshold value is calculated under a predefined location for embedding in order to identify the embedding strength of the embedding technique. The quality of the generated stego images was evaluated using both objective (PSNR) and Subjective (DSCQS) methods to ensure the reliability of our proposed methods. The experimental results revealed that PSNR is not a strong indicator of the perceived stego image quality, but not a bad interpreter also of the actual quality of stego images. Since the visual difference between the cover and the stego image must be absolutely imperceptible to the human visual system, it was logically convenient to ask human observers with different qualifications and experience in the field of image processing to evaluate the perceived quality of the cover and the stego image. Thus, the subjective responses were analysed using statistical measurements to describe the distribution of the scores given by the assessors. Thus, the proposed scheme presents an alternative approach to protect digital documents rather than the traditional techniques of digital signature and watermarking
Hiding Information in Reversible English Transforms for a Blind Receiver
This paper proposes a new technique for hiding secret messages in ordinary English text. The proposed technique exploits the redundancies existing in some English language constructs. Redundancies result from the flexibility in maneuvering certain statement constituents without altering the statement meaning or correctness. For example, one can say âshe went to sleep, because she was tiredâ or âBecause she was tired, she went to sleep.â The paper provides a number of such transformations that can be applied concurrently, while keeping the overall meaning and grammar intact. The proposed data hiding technique is blind since the receiver does not keep a copy of the original uncoded text (cover). Moreover, it can hide more than three bits per statement, which is higher than that achieved in the prior work. A secret key that is a function of the various transformations used is proposed to protect the confidentiality of the hidden message. Our security analysis shows that even if the attacker knows how the transforms are employed, the secret key provides enough security to protect the confidentiality of the hidden message. Moreover, we show that the proposed transformations do not affect the inconspicuousness of the transformed statements, and thus unlikely to draw suspicion
Hiding Information in Reversible English Transforms for a Blind Receiver
This paper proposes a new technique for hiding secret messages in ordinary English text. The proposed technique exploits the redundancies existing in some English language constructs. Redundancies result from the flexibility in maneuvering certain statement constituents without altering the statement meaning or correctness. For example, one can say "she went to sleep, because she was tired" or "Because she was tired, she went to sleep. " The paper provides a number of such transformations that can be applied concurrently, while keeping the overall meaning and grammar intact. The proposed data hiding technique is blind since the receiver does not keep a copy of the original uncoded text (cover). Moreover, it can hide more than three bits per statement, which is higher than that achieved in the prior work. A secret key that is a function of the various transformations used is proposed to protect the confidentiality of the hidden message. Our security analysis shows that even if the attacker knows how the transforms are employed, the secret key provides enough security to protect the confidentiality of the hidden message. Moreover, we show that the proposed transformations do not affect the inconspicuousness of the transformed statements, and thus unlikely to draw suspicion
Covert Channels Within IRC
The exploration of advanced information hiding techniques is important to understand and defend against illicit data extractions over networks. Many techniques have been developed to covertly transmit data over networks, each differing in their capabilities, methods, and levels of complexity. This research introduces a new class of information hiding techniques for use over Internet Relay Chat (IRC), called the Variable Advanced Network IRC Stealth Handler (VANISH) system. Three methods for concealing information are developed under this framework to suit the needs of an attacker. These methods are referred to as the Throughput, Stealth, and Baseline scenarios. Each is designed for a specific purpose: to maximize channel capacity, minimize shape-based detectability, or provide a baseline for comparison using established techniques applied to IRC. The effectiveness of these scenarios is empirically tested using public IRC servers in Chicago, Illinois and Amsterdam, Netherlands. The Throughput method exfiltrates covert data at nearly 800 bits per second (bps) compared to 18 bps with the Baseline method and 0.13 bps for the Stealth method. The Stealth method uses Reed-Solomon forward error correction to reduce bit errors from 3.1% to nearly 0% with minimal additional overhead. The Stealth method also successfully evades shape-based detection tests but is vulnerable to regularity-based tests
Recommended from our members
Steganography-based secret and reliable communications: Improving steganographic capacity and imperceptibility
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Unlike encryption, steganography hides the very existence of secret information rather than hiding its meaning only. Image based steganography is the most common system used since digital images are widely used over the Internet and Web. However, the capacity is mostly limited and restricted by the size of cover images. In addition, there is a tradeoff between both steganographic capacity and stego image quality. Therefore, increasing steganographic capacity and enhancing stego image quality are still challenges, and this is exactly our research main aim. Related to this, we also investigate hiding secret information in communication protocols, namely Simple Object Access Protocol (SOAP) message, rather than in conventional digital files.
To get a high steganographic capacity, two novel steganography methods were proposed. The first method was based on using 16x16 non-overlapping blocks and quantisation table for Joint Photographic Experts Group (JPEG) compression instead of 8x8. Then, the quality of JPEG stego images was enhanced by using optimised quantisation tables instead of the default tables. The second method, the hybrid method, was based on using optimised quantisation tables and two hiding techniques: JSteg along with our first proposed method. To increase the
steganographic capacity, the impact of hiding data within image chrominance was
investigated and explained. Since peak signal-to-noise ratio (PSNR) is extensively
used as a quality measure of stego images, the reliability of PSNR for stego images was also evaluated in the work described in this thesis. Finally, to eliminate any detectable traces that traditional steganography may leave in stego files, a novel and undetectable steganography method based on SOAP messages was proposed.
All methods proposed have been empirically validated as to indicate their utility
and value. The results revealed that our methods and suggestions improved the main aspects of image steganography. Nevertheless, PSNR was found not to be a
reliable quality evaluation measure to be used with stego image. On the other hand, information hiding in SOAP messages represented a distinctive way for undetectable and secret communication.The Ministry of Higher Education in Syria
and the University of Alepp
Hybrid Arabic text steganography
An improved method for Arabic text steganography is introduced in this paper. This method hides an Arabic text inside another based on a hybrid approach. Both Kashida and Arabic Diacritics are used to hide the Arabic text inside another text. In this improved method, the secret message is divided into two parts, the first part is to be hidden by the Kashida method, and the second is to be hidden by the Diacritics or Harakat method. For security purposes, we benefitted from the natural existence of Diacritics as a characteristic of Arabic written language, as used to represent vowel sounds. The paper exploits the possibility of hiding data in Fathah diacritic and Kashida punctuation marks, adjusting previously presented schemes that are based on a single method only. Here, the secret message is divided into two parts, the cover text is prepared, and then we apply the Harakat method on the first part. The Kashida method is applied on the second part, and then the two parts are combined. When the hidden âStegoTextâ is received, a split mechanism is used to recover the original message. The described hybrid Arabic StegoText showed higher capacity and security with promising results compared to other methods
Towards Code Watermarking with Dual-Channel Transformations
The expansion of the open source community and the rise of large language
models have raised ethical and security concerns on the distribution of source
code, such as misconduct on copyrighted code, distributions without proper
licenses, or misuse of the code for malicious purposes. Hence it is important
to track the ownership of source code, in wich watermarking is a major
technique. Yet, drastically different from natural languages, source code
watermarking requires far stricter and more complicated rules to ensure the
readability as well as the functionality of the source code. Hence we introduce
SrcMarker, a watermarking system to unobtrusively encode ID bitstrings into
source code, without affecting the usage and semantics of the code. To this
end, SrcMarker performs transformations on an AST-based intermediate
representation that enables unified transformations across different
programming languages. The core of the system utilizes learning-based embedding
and extraction modules to select rule-based transformations for watermarking.
In addition, a novel feature-approximation technique is designed to tackle the
inherent non-differentiability of rule selection, thus seamlessly integrating
the rule-based transformations and learning-based networks into an
interconnected system to enable end-to-end training. Extensive experiments
demonstrate the superiority of SrcMarker over existing methods in various
watermarking requirements.Comment: 16 page