
22 research outputs found

Semantic-Preserving Linguistic Steganography by Pivot Translation and Semantic-Aware Bins Coding

Author: Feng Guorui
Wu Hanzhou
Yang Tianyu
Yi Biao
Zhang Xinpeng
Publication date: 07/03/2022

Linguistic steganography (LS) aims to embed secret information into a highly encoded text for covert communication. It can be roughly divided into two main categories: modification-based LS (MLS) and generation-based LS (GLS). Unlike MLS, which hides secret data by slightly modifying a given text without impairing its meaning, GLS uses a trained language model to directly generate a text carrying secret data. A common disadvantage of MLS methods is that the embedding payload is very low; in return, the semantic quality of the text is well preserved. In contrast, GLS allows the data hider to embed a high payload, at the high price of uncontrollable semantics. In this paper, we propose a novel LS method that modifies a given text by pivoting it between two different languages and embeds secret data by applying a GLS-like information encoding strategy. Our purpose is to alter the expression of the given text, enabling a high payload to be embedded while keeping the semantic information unchanged. Experimental results show that the proposed work not only achieves a high embedding payload, but also shows superior performance in maintaining semantic consistency and resisting linguistic steganalysis.

arXiv.org e-Print Archive

Novel linguistic steganography based on character-level text generation

Author: Li Q
Liu Y
Xiang L
Yang S
Zhu C
Publication venue: MDPI AG
Publication date: 25/05/2021

With the development of natural language processing, linguistic steganography has become a research hotspot in the field of information security. However, most existing linguistic steganographic methods suffer from low embedding capacity. Therefore, this paper proposes a character-level linguistic steganographic method (CLLS) that embeds the secret information into characters instead of words by employing a long short-term memory (LSTM) based language model. First, the proposed method uses the LSTM model and a large-scale corpus to construct and train a character-level text generation model. Through training, the best evaluated model is obtained as the prediction model for generating stego text. Then, we use the secret information as control information to select the right character from the predictions of the trained character-level text generation model. Thus, the secret information is hidden in the generated text, as predicted characters with different prediction probability values can be encoded into different secret bit values. For the same secret information, the generated stego texts vary with the starting strings of the text generation model, so we design a selection strategy that changes the starting strings to find the highest quality stego text among a number of candidate stego texts. The experimental results demonstrate that, compared with other similar methods, the proposed method has the fastest running speed and highest embedding capacity. Moreover, extensive experiments are conducted to verify the effect of the number of candidate stego texts on the quality of the final stego text. The experimental results show that the quality of the final stego text increases as the number of candidate stego texts grows, but the rate of improvement slows down.

OPUS - University of Technology Sydney

Secure digital documents using Steganography and QR Code

Author: Hassanein Mohamed Sameh
Publication date: 01/01/2014

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. With the increasing use of the Internet, several problems have arisen regarding the processing of electronic documents. These include content filtering and content retrieval/search. Moreover, document security has taken centre stage, including copyright protection, broadcast monitoring etc. There is an acute need for an effective tool which can find the identity, location and the time when the document was created so that it can be determined whether or not the contents of the document were tampered with after creation. Owing to the sensitivity of the large amounts of data processed on a daily basis, verifying the authenticity and integrity of a document is more important now than it ever was. Unsurprisingly, document authenticity verification has become the centre of attention in the world of research. Consequently, this research is concerned with creating a tool which deals with the above problem. This research proposes the use of a Quick Response Code as a message carrier for Text Key-print. The Text Key-print is a novel method which employs the basic element of the language (i.e. characters of the alphabet) in order to achieve authenticity of electronic documents through the transformation of its physical structure into a logical structured relationship. The resultant dimensional matrix is then converted into a binary stream and encapsulated with a serial number or URL inside a Quick Response Code (QR code) to form a digital fingerprint mark. For hiding a QR code, two image steganography techniques were developed based upon the spatial and the transform domains. In the spatial domain, three methods were proposed and implemented based on the least significant bit insertion technique and the use of a pseudorandom number generator to scatter the message into a set of arbitrary pixels.
These methods utilise the three colour channels of the RGB model in order to embed one, two or three bits per eight-bit channel, which results in three different hiding capacities. The second technique is an adaptive approach in the transform domain where a threshold value is calculated under a predefined location for embedding in order to identify the embedding strength of the embedding technique. The quality of the generated stego images was evaluated using both objective (PSNR) and subjective (DSCQS) methods to ensure the reliability of our proposed methods. The experimental results revealed that PSNR is not a strong indicator of the perceived stego image quality, though it is not a poor indicator of the actual quality of stego images either. Since the visual difference between the cover and the stego image must be absolutely imperceptible to the human visual system, it was logically convenient to ask human observers with different qualifications and experience in the field of image processing to evaluate the perceived quality of the cover and the stego image. The subjective responses were then analysed using statistical measurements to describe the distribution of the scores given by the assessors. Thus, the proposed scheme presents an alternative approach to protecting digital documents rather than the traditional techniques of digital signature and watermarking.
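The spatial-domain idea in this abstract (least significant bit insertion at pseudorandomly scattered positions) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the flat list of 8-bit channel values standing in for an image, and the use of a shared integer seed as the key are all assumptions made for illustration, and only the one-bit-per-channel variant is shown.

```python
import random

def embed_lsb(channels, bits, seed):
    """Hide bits in the LSB of pseudorandomly chosen 8-bit channel values.

    `channels` is a flat list of ints in 0..255 (a hypothetical stand-in
    for the R, G, B values of an image); `seed` is an assumed shared key.
    """
    if len(bits) > len(channels):
        raise ValueError("message too long for this cover")
    stego = list(channels)
    # The same seed reproduces the same scattered positions on both sides.
    positions = random.Random(seed).sample(range(len(channels)), len(bits))
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit  # clear the LSB, then write the bit
    return stego

def extract_lsb(stego, n_bits, seed):
    """Recover n_bits by revisiting the same pseudorandom positions."""
    positions = random.Random(seed).sample(range(len(stego)), n_bits)
    return [stego[pos] & 1 for pos in positions]

rng = random.Random(0)
cover = [rng.randrange(256) for _ in range(64)]
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, message, seed=42)
recovered = extract_lsb(stego, len(message), seed=42)
```

Each modified value differs from the cover by at most 1, which is why this style of embedding is hard to see; embedding two or three bits per channel would widen the mask and raise both capacity and distortion.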

Brunel University Research Archive

Hiding Information in Reversible English Transforms for a Blind Receiver

Author: Ibrahim Kamel
Salma Banawan
Publication venue: Hindawi Limited
Publication date: 01/01/2015

This paper proposes a new technique for hiding secret messages in ordinary English text. The proposed technique exploits the redundancies existing in some English language constructs. Redundancies result from the flexibility in maneuvering certain statement constituents without altering the statement meaning or correctness. For example, one can say “she went to sleep, because she was tired” or “Because she was tired, she went to sleep.” The paper provides a number of such transformations that can be applied concurrently, while keeping the overall meaning and grammar intact. The proposed data hiding technique is blind since the receiver does not keep a copy of the original uncoded text (cover). Moreover, it can hide more than three bits per statement, which is higher than that achieved in the prior work. A secret key that is a function of the various transformations used is proposed to protect the confidentiality of the hidden message. Our security analysis shows that even if the attacker knows how the transforms are employed, the secret key provides enough security to protect the confidentiality of the hidden message. Moreover, we show that the proposed transformations do not affect the inconspicuousness of the transformed statements, and are thus unlikely to draw suspicion.
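The clause-reordering example in this abstract can be turned into a one-bit sketch. This is only an illustration of the general idea under stated assumptions: the function names and the bit convention (0 for main clause first, 1 for "Because" clause first) are hypothetical, and the paper's full set of transformations and its secret key are not reproduced.

```python
def encode_bit(main, reason, bit):
    """Carry one bit in the order of two clauses.

    Assumed convention: bit 0 puts the main clause first,
    bit 1 puts the 'Because' clause first.
    """
    if bit == 0:
        return f"{main[0].upper()}{main[1:]}, because {reason}."
    return f"Because {reason}, {main}."

def decode_bit(sentence):
    """Blind decoding: only the received sentence is inspected, no cover text."""
    return 1 if sentence.startswith("Because ") else 0

s0 = encode_bit("she went to sleep", "she was tired", 0)
s1 = encode_bit("she went to sleep", "she was tired", 1)
```

Both outputs are grammatical and mean the same thing, and the receiver recovers the bit from the sentence form alone, which is what makes the scheme blind. Applying several independent transformations to one statement is how more than one bit per statement becomes possible.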

Crossref

Directory of Open Access Journals

Hiding Information in Reversible English Transforms for a Blind Receiver

Author: Ibrahim Kamel
Salma Banawan
Publication date: 23/04/2020

This paper proposes a new technique for hiding secret messages in ordinary English text. The proposed technique exploits the redundancies existing in some English language constructs. Redundancies result from the flexibility in maneuvering certain statement constituents without altering the statement meaning or correctness. For example, one can say "she went to sleep, because she was tired" or "Because she was tired, she went to sleep." The paper provides a number of such transformations that can be applied concurrently, while keeping the overall meaning and grammar intact. The proposed data hiding technique is blind since the receiver does not keep a copy of the original uncoded text (cover). Moreover, it can hide more than three bits per statement, which is higher than that achieved in the prior work. A secret key that is a function of the various transformations used is proposed to protect the confidentiality of the hidden message. Our security analysis shows that even if the attacker knows how the transforms are employed, the secret key provides enough security to protect the confidentiality of the hidden message. Moreover, we show that the proposed transformations do not affect the inconspicuousness of the transformed statements, and are thus unlikely to draw suspicion.

CiteSeerX

Covert Channels Within IRC

Author: Henry Wayne C.
Publication venue: AFIT Scholar
Publication date: 11/03/2011

The exploration of advanced information hiding techniques is important to understand and defend against illicit data extractions over networks. Many techniques have been developed to covertly transmit data over networks, each differing in their capabilities, methods, and levels of complexity. This research introduces a new class of information hiding techniques for use over Internet Relay Chat (IRC), called the Variable Advanced Network IRC Stealth Handler (VANISH) system. Three methods for concealing information are developed under this framework to suit the needs of an attacker. These methods are referred to as the Throughput, Stealth, and Baseline scenarios. Each is designed for a specific purpose: to maximize channel capacity, minimize shape-based detectability, or provide a baseline for comparison using established techniques applied to IRC. The effectiveness of these scenarios is empirically tested using public IRC servers in Chicago, Illinois and Amsterdam, Netherlands. The Throughput method exfiltrates covert data at nearly 800 bits per second (bps) compared to 18 bps with the Baseline method and 0.13 bps for the Stealth method. The Stealth method uses Reed-Solomon forward error correction to reduce bit errors from 3.1% to nearly 0% with minimal additional overhead. The Stealth method also successfully evades shape-based detection tests but is vulnerable to regularity-based tests

AFIT Scholar (Air Force Institute of Technology)

Steganography-based secret and reliable communications: Improving steganographic capacity and imperceptibility

Author: Al-Mohammad Adel
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics Theses
Publication date: 01/01/2010

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Unlike encryption, steganography hides the very existence of secret information rather than hiding its meaning only. Image-based steganography is the most common system used, since digital images are widely used over the Internet and Web. However, the capacity is mostly limited and restricted by the size of cover images. In addition, there is a tradeoff between steganographic capacity and stego image quality. Therefore, increasing steganographic capacity and enhancing stego image quality are still challenges, and addressing them is the main aim of our research. Related to this, we also investigate hiding secret information in communication protocols, namely Simple Object Access Protocol (SOAP) messages, rather than in conventional digital files. To get a high steganographic capacity, two novel steganography methods were proposed. The first method was based on using 16x16 non-overlapping blocks and quantisation tables for Joint Photographic Experts Group (JPEG) compression instead of 8x8. Then, the quality of JPEG stego images was enhanced by using optimised quantisation tables instead of the default tables. The second method, the hybrid method, was based on using optimised quantisation tables and two hiding techniques: JSteg along with our first proposed method. To increase the steganographic capacity, the impact of hiding data within image chrominance was investigated and explained. Since peak signal-to-noise ratio (PSNR) is extensively used as a quality measure of stego images, the reliability of PSNR for stego images was also evaluated in the work described in this thesis. Finally, to eliminate any detectable traces that traditional steganography may leave in stego files, a novel and undetectable steganography method based on SOAP messages was proposed. All methods proposed have been empirically validated so as to indicate their utility and value.
The results revealed that our methods and suggestions improved the main aspects of image steganography. Nevertheless, PSNR was found not to be a reliable quality evaluation measure for stego images. On the other hand, information hiding in SOAP messages represented a distinctive way for undetectable and secret communication. The Ministry of Higher Education in Syria and the University of Alepp
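For readers unfamiliar with the PSNR measure evaluated in this thesis, a minimal sketch of the standard definition follows: PSNR = 10 * log10(MAX^2 / MSE), where MSE is the mean squared difference between cover and stego samples and MAX is the peak sample value (255 for 8-bit images). The function name and the flat-list representation of an image are assumptions for illustration; the thesis's own evaluation setup is not shown here.

```python
import math

def psnr(cover, stego, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)

value = psnr([10, 20, 30, 40], [11, 20, 30, 40])
```

Because PSNR averages squared pixel errors, it is blind to where in the image the changes fall, which is one plausible reason it can diverge from perceived quality.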

Brunel University Research Archive

Hybrid Arabic text steganography

Author: Alshahrani Haya Mesfer S
Weir George
Publication date: 30/11/2017

An improved method for Arabic text steganography is introduced in this paper. This method hides an Arabic text inside another based on a hybrid approach. Both Kashida and Arabic Diacritics are used to hide the Arabic text inside another text. In this improved method, the secret message is divided into two parts, the first part is to be hidden by the Kashida method, and the second is to be hidden by the Diacritics or Harakat method. For security purposes, we benefitted from the natural existence of Diacritics as a characteristic of Arabic written language, as used to represent vowel sounds. The paper exploits the possibility of hiding data in Fathah diacritic and Kashida punctuation marks, adjusting previously presented schemes that are based on a single method only. Here, the secret message is divided into two parts, the cover text is prepared, and then we apply the Harakat method on the first part. The Kashida method is applied on the second part, and then the two parts are combined. When the hidden ‘StegoText’ is received, a split mechanism is used to recover the original message. The described hybrid Arabic StegoText showed higher capacity and security with promising results compared to other methods
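The Kashida half of such a scheme can be sketched as follows. This is a simplified illustration, not the paper's method: the bit convention (a Kashida, U+0640, after a joining letter means 1, its absence means 0), the set of non-joining letters, the function names, and the assumption that the receiver knows the message length are all choices made here. The diacritic (Harakat) half and the splitting of the message are not shown.

```python
KASHIDA = "\u0640"  # ARABIC TATWEEL, the elongation character
# Assumed set of letters that do not join to the following letter.
NON_JOINING = set("\u0621\u0622\u0623\u0624\u0625\u0627\u0629"
                  "\u062f\u0630\u0631\u0632\u0648\u0649")

def is_letter(ch):
    """True for characters in the basic Arabic letter range, excluding Kashida."""
    return "\u0621" <= ch <= "\u064a" and ch != KASHIDA

def embed(cover, bits):
    """Insert a Kashida after a joining letter for each 1 bit, nothing for 0."""
    if KASHIDA in cover:
        raise ValueError("cover text must not already contain Kashida")
    out, queue = [], list(bits)
    for i, ch in enumerate(cover):
        out.append(ch)
        is_slot = (is_letter(ch) and ch not in NON_JOINING
                   and i + 1 < len(cover) and is_letter(cover[i + 1]))
        if is_slot and queue and queue.pop(0) == 1:
            out.append(KASHIDA)
    if queue:
        raise ValueError("cover text too short for this message")
    return "".join(out)

def extract(stego, n_bits):
    """Read one bit per joining letter: 1 if a Kashida follows it, else 0."""
    bits = []
    for i, ch in enumerate(stego):
        if len(bits) == n_bits:
            break
        if not is_letter(ch) or ch in NON_JOINING:
            continue
        nxt = stego[i + 1] if i + 1 < len(stego) else ""
        if nxt == KASHIDA:
            bits.append(1)
        elif nxt and is_letter(nxt):
            bits.append(0)
    return bits

cover = "\u0645\u062d\u0645\u062f"  # a four-letter word with three usable slots
stego = embed(cover, [1, 0, 1])
```

Removing every Kashida from the stego text gives back the cover text unchanged, so the visible wording is preserved and only letter elongation carries the data.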

University of Strathclyde Institutional Repository

Towards Code Watermarking with Dual-Channel Transformations

Author: Li Bo
Li Wei
Xiang Liyao
Yang Borui
Publication date: 02/09/2023

The expansion of the open source community and the rise of large language models have raised ethical and security concerns on the distribution of source code, such as misconduct on copyrighted code, distributions without proper licenses, or misuse of the code for malicious purposes. Hence it is important to track the ownership of source code, in which watermarking is a major technique. Yet, drastically different from natural languages, source code watermarking requires far stricter and more complicated rules to ensure the readability as well as the functionality of the source code. Hence we introduce SrcMarker, a watermarking system to unobtrusively encode ID bitstrings into source code, without affecting the usage and semantics of the code. To this end, SrcMarker performs transformations on an AST-based intermediate representation that enables unified transformations across different programming languages. The core of the system utilizes learning-based embedding and extraction modules to select rule-based transformations for watermarking. In addition, a novel feature-approximation technique is designed to tackle the inherent non-differentiability of rule selection, thus seamlessly integrating the rule-based transformations and learning-based networks into an interconnected system to enable end-to-end training. Extensive experiments demonstrate the superiority of SrcMarker over existing methods in various watermarking requirements. Comment: 16 page

arXiv.org e-Print Archive