51 research outputs found

    Improve Steganalysis by MWM Feature Selection

    CANVASS - A Steganalysis Forensic Tool for JPEG Images

    Steganography is a way to communicate a message such that no one except the sender and recipient suspects the existence of the message. This type of covert communication lends itself to a variety of purposes, such as spy-to-spy communication, exchange of pornographic material hidden in innocuous image files, and other illicit acts. Computer forensic personnel have an interest in testing for possible steganographic files, but often lack the technical and financial resources required to perform steganalysis effectively. This paper describes the results of an effort, funded by a grant from the National Institute of Justice, to develop a user-friendly and practical software program designed to meet the steganalysis needs of the Iowa Division of Criminal Investigation in Ankeny, Iowa. The software performs steganalysis on JPEG image files efficiently and effectively. JPEG images are widely used, and thus are naturally exploited for steganography. Commercially available software for detecting hidden messages is often expensive and does not fit the needs of smaller police forensic labs. Our software checks for the presence of hidden payloads for five different JPEG-embedding steganography algorithms, with the potential of identifying stego images generated by other (possibly unknown) embedding algorithms. Keywords: steganography, steganalysis, JPEG images, GUI software

    Double-Compressed JPEG Detection in a Steganalysis System

    The detection of hidden messages in JPEG images is a growing concern. Current detection of JPEG stego images must include detection of double compression: a JPEG image is double compressed if it has been compressed with one quality factor, uncompressed, and then re-compressed with a different quality factor. When detection of double compression is not included, erroneous detection rates are very high. The main contribution of this paper is an efficient double-compression detection algorithm whose features have relatively lower dimensionality, and whose detection step takes relatively less computational time, than current comparable classifiers. We use a model-based approach for creating features, using a subclass of Markov random fields called partially ordered Markov models (POMMs) to model the bit changes that occur in an image after steganography is applied. We model the embedding process as noise, and create features to capture this noise characteristic. We show that the nonparametric conditional probabilities modeled using a POMM can distinguish very well between an image that has been double compressed and one that has not, with lower overall computational cost. After double-compression detection, we analyze histogram patterns that identify the primary quality compression factor to classify the image as stego or cover. The latter is an analytic approach that requires no classifier training. We compare our results with another state-of-the-art double-compression detector. Keywords: steganalysis; steganography; JPEG; double compression; digital image forensics

    The role of side information in steganography

    The goal of digital steganography is to hide a secret communication in digital media. The common approach in steganography is to hide the secret messages in empirical cover objects. We are the first to define Steganographic Side Information (SSI). Our definition of SSI captures all relevant properties of SSI. We justify the definition in information-theoretic terms and explain the common usage of SSI. All recent steganographic schemes use SSI to identify suitable areas for the embedding change. We develop a targeted attack on four widely used variants of SSI, and show that our attack detects them almost perfectly. We argue that the steganographic competition must be framed by means of game theory. We present a game-theoretical framework that captures all relevant properties of such a steganographic system. We instantiate the framework with five different models and solve each of these models for game-theoretically optimal strategies. Inspired by our solutions, we give a new paradigm for secure adaptive steganography, the so-called equalizer embedding strategies.

    Natural Image Statistics for Digital Image Forensics

    We describe a set of natural image statistics that are built upon two multi-scale image decompositions, the quadrature mirror filter pyramid decomposition and the local angular harmonic decomposition. These image statistics consist of first- and higher-order statistics that capture certain statistical regularities of natural images. We propose to apply these image statistics, together with classification techniques, to three problems in digital image forensics: (1) differentiating photographic images from computer-generated photorealistic images, (2) generic steganalysis, and (3) rebroadcast image detection. We also apply these image statistics to traditional art authentication, for forgery detection and identification of the artists of an artwork. For each application we show the effectiveness of these image statistics and analyze their sensitivity and robustness.

    Detecting CNN-Generated Facial Images in Real-World Scenarios

    Artificial, CNN-generated images are now of such high quality that humans have trouble distinguishing them from real images. Several algorithmic detection methods have been proposed, but these appear to generalize poorly to data from unknown sources, making them infeasible for real-world scenarios. In this work, we present a framework for evaluating detection methods under real-world conditions, consisting of cross-model, cross-data, and post-processing evaluation, and we evaluate state-of-the-art detection methods using the proposed framework. Furthermore, we examine the usefulness of commonly used image pre-processing methods. Lastly, we evaluate human performance on detecting CNN-generated images, along with factors that influence this performance, by conducting an online survey. Our results suggest that CNN-based detection methods are not yet robust enough to be used in real-world scenarios. Comment: Accepted to the workshop on Media Forensics at CVPR 202

    Challenges and Open Questions of Machine Learning in Computer Security

    This habilitation thesis presents advancements in machine learning for computer security, arising from problems in network intrusion detection and steganography. The thesis puts an emphasis on explaining the traits shared by steganalysis, network intrusion detection, and other security domains, which make these domains different from computer vision, speech recognition, and other fields where machine learning is typically studied. Then, the thesis presents methods developed to at least partially solve the identified problems, with the overall goal of making machine-learning-based intrusion detection systems viable. Most of them are general in the sense that they can be used outside intrusion detection and steganalysis on problems with similar constraints. A common feature of all methods is that they are generally simple, yet surprisingly effective. According to large-scale experiments they almost always improve on the prior art, which is likely because they are tailored to security problems and designed for large volumes of data. Specifically, the thesis addresses the following problems: anomaly detection with low computational and memory complexity, such that efficient processing of large data is possible; multiple-instance anomaly detection, improving the signal-to-noise ratio by classifying larger groups of samples; supervised classification of tree-structured data, simplifying their encoding in neural networks; clustering of structured data; supervised training with an emphasis on the precision in the top p% of returned data; and finally explanation of anomalies, to help humans understand the nature of an anomaly and speed up their decisions. Many algorithms and methods presented in this thesis are deployed in a real intrusion detection system protecting millions of computers around the globe.
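
    The "precision in the top p% of returned data" mentioned in the abstract above can be illustrated with a small evaluation helper. This is only a sketch of one common reading of that metric, not the thesis's own definition or training method: the function name `precision_at_top_p`, the rounding rule (round up, at least one sample), and the example data are all assumptions made for illustration.

```python
import math


def precision_at_top_p(scores, labels, p):
    """Precision among the top p% highest-scored samples.

    scores: anomaly/classifier scores, higher means "more suspicious".
    labels: 1 for a true positive (e.g. real intrusion), 0 otherwise.
    p:      percentage in (0, 100].
    The rounding rule (ceil, minimum of one sample) is an assumption.
    """
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")

    # Number of samples that fall into the top p%.
    k = max(1, math.ceil(len(scores) * p / 100))

    # Rank samples by score, highest first, and keep the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]

    # Fraction of the returned samples that are true positives.
    return sum(label for _, label in top) / k


# Hypothetical example data: 10 samples, 2 of them true positives.
example_scores = [0.95, 0.90, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
example_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
```

    A metric like this rewards a detector for placing true positives at the very top of its ranking, which matters when an analyst can only inspect a small fraction of the alerts.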

    A forensics software toolkit for DNA steganalysis.

    Recent advances in genetic engineering have allowed the insertion of artificial DNA strands into the living cells of organisms. Several methods have been developed to insert information into a DNA sequence for the purpose of data storage, watermarking, or communication of secret messages. The ability to detect, extract, and decode messages from DNA is important for forensic data collection and for data security. We have developed a software toolkit that is able to detect the presence of a hidden message within a DNA sequence, extract that message, and then decode it. The toolkit is able to detect, extract, and decode messages that have been encoded with a variety of different coding schemes. The goal of this project is to enable our software toolkit to determine with which coding scheme a message has been encoded in DNA and then to decode it. The software package is able to decode messages that have been encoded with every variation of most of the coding schemes described in this document. The software toolkit has two different options for decoding that can be selected by the user. The first is a frequency analysis approach that is very commonly used in cryptanalysis. This approach is very fast, but is unable to decode messages shorter than 200 words accurately. The second option is using a Genetic Algorithm (GA) in combination with a Wisdom of Artificial Crowds (WoAC) technique. This approach is very time consuming, but can decode shorter messages with much higher accuracy
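
    The frequency analysis option mentioned in the abstract above is a classic cryptanalysis technique, and a minimal sketch may help show why it struggles with short messages. The code below is not the toolkit's implementation: it assumes the message has already been mapped from DNA back to letters, that it was encoded with a simple one-to-one letter substitution, and that `ENGLISH_BY_FREQUENCY` is an approximate, commonly cited ordering of English letters. All function names are hypothetical.

```python
from collections import Counter
import string

# Approximate ordering of English letters from most to least frequent.
# Published orderings differ slightly; this one is an assumption.
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def letter_ranking(ciphertext):
    """Return the ciphertext letters ordered from most to least frequent."""
    counts = Counter(
        ch for ch in ciphertext.lower() if ch in string.ascii_lowercase
    )
    # Sort by descending count; break ties alphabetically for determinism.
    return "".join(sorted(counts, key=lambda ch: (-counts[ch], ch)))


def guess_substitution_key(ciphertext):
    """Map each ciphertext letter to the English letter of the same rank."""
    return dict(zip(letter_ranking(ciphertext), ENGLISH_BY_FREQUENCY))


def apply_key(ciphertext, key):
    """Decode using the guessed key; characters without a mapping are kept."""
    return "".join(key.get(ch, ch) for ch in ciphertext.lower())
```

    Because the guess depends entirely on observed letter counts, a short message rarely matches the expected ranking, which is consistent with the abstract's remark that this approach is fast but inaccurate on messages shorter than 200 words.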