45 research outputs found

    Improve Steganalysis by MWM Feature Selection

    Multi-Class Classification for Identifying JPEG Steganography Embedding Methods

    Over 725 steganography tools are available over the Internet, each providing a method for covert transmission of secret messages. This research presents four steganalysis advancements that result in an algorithm that identifies the steganography tool used to embed a secret message in a JPEG image file. The algorithm includes feature generation, feature preprocessing, multi-class classification, and classifier fusion. The first contribution is a new feature generation method based on the decomposition of the discrete cosine transform (DCT) coefficients used in the JPEG image encoder. The generated features are better suited to identifying discrepancies in each area of the decomposed DCT coefficients. Second, the classification accuracy is further improved by a feature ranking technique, applied in the preprocessing stage, for the kernel Fisher's discriminant (KFD) and support vector machine (SVM) classifiers in the kernel space during training. Third, for the KFD and SVM two-class classifiers, a classification tree is designed from the kernel space to provide a multi-class classification solution for both methods. Fourth, by analyzing a set of classifiers, signature detectors, and multi-class classification methods, a classifier fusion system is developed to increase the accuracy of identifying the embedding method used to generate the steganography images. Based on classifying stego images created with research and commercial JPEG steganography techniques (the F5, JP Hide, JSteg, Model-based, Model-based Version 1.2, OutGuess, Steganos, StegHide, and UTSA embedding methods), the system shows a statistically significant increase in classification accuracy of 5%. In addition, the system provides a solution for identifying steganographic fingerprints as well as the ability to include future multi-class classification tools.

    Compressive sensing based secret signals recovery for effective image steganalysis in secure communications

    Conventional image steganalysis mainly focuses on detecting the presence of hidden data rather than recovering the original secret messages that were embedded in the host image. To address this issue, we propose an image steganalysis method that operates in the compressive sensing (CS) domain, where a block CS measurement matrix senses the transform coefficients of the stego-image to reflect the statistical differences between the cover and stego-images. With multi-hypothesis prediction in the CS domain, the hidden signals are reconstructed efficiently. Extensive experiments have been carried out on five diverse image databases and benchmarked against four typical steganographic algorithms. The comprehensive results demonstrate the efficacy of the proposed approach as a universal scheme for effective detection of steganography in secure communications, while greatly reducing the number of features required for secret signal reconstruction.
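To make the block CS measurement step concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes a random Gaussian measurement matrix, and the block size `b`, the number of measurements per block `m`, and all function names are hypothetical choices made for illustration. Each block of transform coefficients is flattened to a vector x and sensed as y = Φx.

```python
import random


def make_measurement_matrix(m, n, seed=0):
    """Random Gaussian m-by-n matrix (an assumed choice of sensing matrix)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(m)]


def split_into_blocks(coeffs, b):
    """Split a 2D coefficient array into flattened, non-overlapping b-by-b blocks."""
    rows, cols = len(coeffs), len(coeffs[0])
    blocks = []
    # Partial blocks at the right and bottom edges are dropped for simplicity.
    for r in range(0, rows - rows % b, b):
        for c in range(0, cols - cols % b, b):
            blocks.append([coeffs[r + i][c + j] for i in range(b) for j in range(b)])
    return blocks


def block_cs_measure(coeffs, b, m, seed=0):
    """Sense every block with the same matrix: y = phi * x, with m < b*b."""
    phi = make_measurement_matrix(m, b * b, seed)
    return [
        [sum(p * x for p, x in zip(row, block)) for row in phi]
        for block in split_into_blocks(coeffs, b)
    ]
```

Using one shared matrix for all blocks keeps the storage cost at m × b² values regardless of image size, which is the usual motivation for block-based sensing.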

    Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

    In [1], we explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytics process, i.e., we explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that combines feature selection techniques with evolutionary algorithms to optimize the selected features. Prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is a domain-independent approach. Data scientists mainly seek advanced ways to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As generated, measured, and sensed data from various sources increase, the analysis, manipulation, and illustration of data grow exponentially. For large-scale data sets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to addressing CoD and solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using a feature extraction optimization process. We then discuss practical examples of research studies that have successfully tackled application domains such as image processing, sentiment analysis, network traffic and anomaly analysis, credit score analysis, and the analysis of other benchmark functions and data sets.

    Classifiers and machine learning techniques for image processing and computer vision

    Advisor: Siome Klein Goldenstein. Thesis (doctorate), Universidade Estadual de Campinas, Instituto da Computação. Abstract: In this work, we propose the use of classifiers and machine learning techniques to extract useful information from data sets (e.g., images) to solve important problems in Image Processing and Computer Vision. We are particularly interested in two-class and multi-class image categorization, hidden message detection, discrimination between natural and forged images, authentication, and multi-classification. To start with, we present a comparative and critical survey of the state of the art in digital image forensics and hidden message detection. Our objective is to show the potential of the existing solutions and, more importantly, to point out their limitations. In this study, we show that most of these techniques strive to solve two common problems in Machine Learning: feature selection and the choice of classification techniques. Furthermore, we discuss the legal and ethical aspects of image forensics analysis, such as the use of digital images by criminals. We then introduce a technique for image forensics analysis, tested in the context of hidden message detection and of general image classification into categories such as indoors, outdoors, computer generated, and art works. This multi-class classification problem raises some important questions: how can a multi-class problem be solved so as to combine, for instance, several different features such as color, texture, shape, and silhouette without worrying too much about the pre-processing and normalization of the combined feature vector? How can we take advantage of different classifiers, each one custom tailored to a specific set of features or of classes in confusion? To cope with most of these problems, we present a feature and classifier fusion technique for the multi-class scenario based on combinations of binary classifiers. We validate our solution with a real application for the automatic classification of fruits and vegetables. Finally, we address another interesting problem: how can powerful binary classifiers be used more efficiently and effectively in the multi-class scenario? In this context, we present a technique for combining binary classifiers (called base classifiers) that boosts the efficiency and effectiveness of multi-class-from-binary approaches. Doctorate. Computer Engineering. Doctor of Computer Science.
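As a rough illustration of building a multi-class decision from binary classifiers, the sketch below uses plain one-vs-one majority voting. This is a generic, well-known scheme swapped in for illustration, not the fusion technique proposed in the thesis; the function name, the class labels, and the threshold-based toy classifiers are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, binary_classifiers):
    """Combine pairwise binary classifiers by majority vote.

    binary_classifiers maps each class pair (a, b) to a function that
    returns either a or b for the input x.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_classifiers[(a, b)](x)] += 1
    # On a tie, the class that received its first vote earliest wins.
    return votes.most_common(1)[0][0]


# Toy base classifiers on a single numeric feature (illustrative only).
classes = ["low", "mid", "high"]
base = {
    ("low", "mid"): lambda x: "low" if x < 1 else "mid",
    ("low", "high"): lambda x: "low" if x < 2 else "high",
    ("mid", "high"): lambda x: "mid" if x < 3 else "high",
}

print(one_vs_one_predict(0.5, classes, base))  # low
print(one_vs_one_predict(2.0, classes, base))  # mid
print(one_vs_one_predict(5.0, classes, base))  # high
```

Because each base classifier only sees two classes, it can in principle use whichever features separate that pair best, which is the kind of flexibility the abstract asks about.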

    Machine learning based digital image forensics and steganalysis

    The security and trustworthiness of digital images have become crucial issues due to the simplicity of malicious processing. Therefore, research on image steganalysis (determining whether a given image has secret information hidden inside) and image forensics (determining the origin and authenticity of a given image and revealing the processing history the image has gone through) has become crucial to the digital society. In this dissertation, the steganalysis and forensics of digital images are treated as pattern classification problems so as to make advanced machine learning (ML) methods applicable. Three topics are covered: (1) architectural design of convolutional neural networks (CNNs) for steganalysis, (2) statistical feature extraction for camera model classification, and (3) real-world tampering detection and localization. For covert communications, steganography is used to embed secret messages into images by altering pixel values slightly. Since advanced steganography alters pixel values in image regions where changes are hard to detect, traditional ML-based steganalytic methods, which rely heavily on sophisticated manual feature design, have been pushed to the limit. To overcome this difficulty, in-depth studies are conducted and reported in this dissertation so as to transfer the success achieved by CNNs in computer vision to steganalysis. The outcomes achieved and reported in this dissertation are: (1) a proposed CNN architecture incorporating the domain knowledge of steganography and steganalysis, and (2) ensemble methods of the CNNs for steganalysis. The proposed CNN is currently one of the best classifiers against steganography. Camera model classification from images aims at assigning a given image to its source camera model based on the statistics of image pixel values. For this, two types of statistical features are designed to capture the traces left by in-camera image processing algorithms. The first is Markov transition probabilities modeling block-DCT coefficients for JPEG images; the second is based on histograms of local binary patterns obtained in both the spatial and wavelet domains. The designed features serve as the input to train support vector machines, which had the best classification performance at the time the features were proposed. The last part of this dissertation documents the solutions delivered by the author’s team to The First Image Forensics Challenge organized by the Information Forensics and Security Technical Committee of the IEEE Signal Processing Society. In the competition, all the fake images involved were doctored with popular image-editing software to simulate the real-world scenario of tampering detection (determining whether a given image has been tampered with) and localization (determining which pixels have been tampered with). In Phase-1 of the Challenge, advanced steganalysis features were successfully migrated to tampering detection. In Phase-2 of the Challenge, an efficient copy-move detector equipped with PatchMatch as a fast approximate nearest neighbor search method was developed to identify duplicated regions within images. With these tools, the author’s team won the runner-up prizes in both phases of the Challenge.
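The idea of Markov transition probabilities over block-DCT coefficients can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's feature extractor: it works on one 2D array of quantized coefficients, takes horizontal differences of coefficient magnitudes, clips them to a hypothetical threshold `t`, and estimates a one-step transition matrix. The function name and default threshold are illustrative.

```python
def transition_matrix(coeffs, t=2):
    """Estimate P(next difference | current difference) along each row.

    coeffs: 2D list of integer (quantized DCT) coefficients.
    t: clipping threshold, so differences fall in [-t, t] (an assumed value).
    Returns a (2t+1) x (2t+1) row-normalized matrix; rows for states that
    never occur are left as all zeros.
    """
    size = 2 * t + 1
    counts = [[0] * size for _ in range(size)]
    for row in coeffs:
        # Horizontal difference of magnitudes, clipped to [-t, t].
        diffs = [
            max(-t, min(t, abs(row[j]) - abs(row[j + 1])))
            for j in range(len(row) - 1)
        ]
        # Count transitions between horizontally adjacent differences.
        for cur, nxt in zip(diffs, diffs[1:]):
            counts[cur + t][nxt + t] += 1
    probs = []
    for state_counts in counts:
        total = sum(state_counts)
        probs.append([c / total if total else 0.0 for c in state_counts])
    return probs
```

Flattening the returned matrix gives a (2t+1)² feature vector per direction; in a classification setting, such vectors would be the input to a classifier such as an SVM.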

    Exploring Biomedical Video Source Identification: Transitioning from Fuzzy-Based Systems to Machine Learning Models

    In recent years, the field of biomedical video source identification has witnessed a significant evolution driven by advances in both fuzzy-based systems and machine learning models. This paper presents a comprehensive survey of the current state of the art in this domain, highlighting the transition from traditional fuzzy-based approaches to the emerging dominance of machine learning techniques. Biomedical videos have become integral to various aspects of healthcare, from medical imaging and diagnostics to surgical procedures and patient monitoring. Accurately identifying the sources of these videos is of paramount importance for quality control, accountability, and the integrity of medical data. In this context, source identification plays a critical role in establishing the authenticity and origin of biomedical videos. This survey covers the evolution of source identification methods, including the foundational principles of fuzzy-based systems and their applications in the biomedical context. It explores how linguistic variables and expert knowledge were employed to model video sources, and discusses the strengths and limitations of these early approaches. By surveying existing methodologies and databases, this paper contributes to a broader understanding of the field’s progress and challenges.