11 research outputs found

    Adaptive spatial image steganography and steganalysis using perceptual modelling and machine learning

    Get PDF
    Image steganography is a method for communicating secret messages hidden inside cover images. A sender embeds the secret messages into a cover image according to an algorithm, and the resulting image is then sent to the receiver, who extracts the messages with the predefined algorithm. To counter this kind of technique, image steganalysis has been proposed to detect the presence of secret messages. After many years of development, current image steganography uses adaptive algorithms for embedding the secrets, which automatically find the complex areas in the cover source to avoid being noticed. Meanwhile, image steganalysis has also advanced to universal steganalysis, which does not require knowledge of the steganographic algorithm. With the development of computational hardware, such as Graphics Processing Units (GPUs), some computationally expensive techniques, such as Convolutional Neural Networks (CNNs), have become practical and bring a large improvement to detection tasks in image steganalysis. To defend against such attacks, new techniques are also being developed to improve the security of image steganography; these include designing more effective cost functions, the key element of adaptive steganography, and generating stego images using knowledge from the CNNs. This thesis makes several contributions to both image steganography and steganalysis.

    Firstly, inspired by the Ranking Priority Profile (RPP), a new cost function for adaptive image steganography is proposed, which uses two-dimensional Singular Spectrum Analysis (2D-SSA) and the Weighted Median Filter (WMF) in its design. The RPP comprises three rules for designing a cost function, i.e., the Complexity-First rule, the Clustering rule and the Spreading rule. The 2D-SSA is employed to select the key components and cluster the embedding positions, following the Complexity-First rule and the Clustering rule. The Spreading rule is then followed by smoothing the map produced by 2D-SSA with the WMF. The proposed algorithm outperforms four benchmarking approaches against non-shared selection channel attacks. It also provides comparable performance in selection-channel-aware scenarios, where the best results are observed when the relative payload is 0.3 bpp or larger, and it is much faster than other model-based methods.

    Secondly, for image steganalysis, to tackle more complex datasets that are closer to real scenarios and to push image steganalysis towards real-life applications, an Enhanced Residual Network with self-attention ability, i.e., ERANet, is proposed. By employing a more mathematically sophisticated way of extracting effective features from the images, together with a global self-attention technique, ERANet can capture the stego signal in the deeper layers, making it suitable for the more complex situations in the new datasets. The proposed Enhanced Low-Level Feature Representation Module can be easily mounted on other CNNs to select the most representative features. Although it comes with a slight extra computational cost, comprehensive experiments on the BOSSbase and ALASKA#2 datasets have demonstrated the effectiveness of the proposed methodology.
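    As a rough illustration of the kind of building block the ERANet description suggests, the following is a minimal, hypothetical PyTorch sketch of a residual convolution block combined with global self-attention over spatial positions. The layer sizes, the module name and the exact arrangement are assumptions for illustration only, not the actual ERANet architecture.

```python
import torch
import torch.nn as nn

class ResidualSelfAttentionBlock(nn.Module):
    """Hypothetical sketch: a residual convolution block followed by global
    self-attention over spatial positions (not the actual ERANet design)."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):
        # Residual convolution path: helps weak, noise-like stego features
        # survive into the deeper layers.
        y = torch.relu(x + self.conv(x))
        # Global self-attention: treat the H*W positions as a sequence so
        # every position can weigh evidence from the whole image.
        b, c, h, w = y.shape
        seq = y.flatten(2).transpose(1, 2)          # (B, H*W, C)
        attended, _ = self.attn(seq, seq, seq)
        return y + attended.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    block = ResidualSelfAttentionBlock(channels=16)
    out = block(torch.randn(2, 16, 32, 32))
    print(out.shape)                                # torch.Size([2, 16, 32, 32])
```

    The residual path and the global attention step reflect the two motivations stated in the abstract: preserving low-level features in deeper layers and capturing image-wide context.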
    Lastly, for image steganography, a novel post-cost-optimization algorithm is proposed that draws on knowledge from the CNNs. Without modifying the original stego image or the original cost function of the steganography, and without training a Generative Adversarial Network (GAN), the proposed method mainly uses the gradient maps from a well-trained CNN to represent the cost, while the original cost map of the steganography is used to indicate the embedding positions. The method smooths the gradient maps before adjusting the cost, which solves the boundary problem of CNNs that contain multiple subnets. Extensive experiments validate the effectiveness of the proposed method, which provides state-of-the-art performance and, compared to existing work, is also efficient in computing time.

    In short, this thesis makes three major contributions to image steganography and steganalysis using perceptual modelling and machine learning. A novel cost function and a post-cost-optimization function are proposed for adaptive spatial image steganography, which help protect the secret messages. For image steganalysis, a new CNN architecture is proposed, which utilizes multiple techniques to provide state-of-the-art performance. Future directions are also discussed to indicate potential research.
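    The post-cost-optimization idea described above can be pictured with a small sketch. The following is a hypothetical illustration, not the thesis's actual algorithm: a gradient map obtained elsewhere from a well-trained CNN is smoothed and then used to nudge the embedding costs, with the original cost map deciding which positions are eligible. The window size, the candidate threshold and the adjustment rule are all assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def post_cost_optimize(cost, grad, payload_fraction=0.4, strength=0.1):
    """Sketch of post-cost optimisation guided by a CNN gradient map.

    cost : original per-pixel embedding cost map (lower = preferred).
    grad : gradient of a detector's loss w.r.t. the image, taken from a
           well-trained CNN (computed elsewhere; assumed given here).
    """
    # Smooth the gradient map first, loosely mimicking the smoothing step
    # described in the abstract (the window size is arbitrary here).
    grad_s = uniform_filter(np.abs(grad), size=7)

    # Use the original cost map only to pick candidate embedding positions:
    # the fraction of pixels with the lowest cost.
    threshold = np.quantile(cost, payload_fraction)
    candidates = cost <= threshold

    # Lower the cost where the smoothed gradient is small (the detector is
    # less sensitive there) and raise it where the gradient is large.
    adjusted = cost.copy()
    g = (grad_s - grad_s.mean()) / (grad_s.std() + 1e-8)
    adjusted[candidates] *= 1.0 + strength * g[candidates]
    return np.clip(adjusted, 1e-6, None)
```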

    Pokročilé metody detekce steganografického obsahu (Advanced Methods for the Detection of Steganographic Content)

    Get PDF
    Steganography can be used for illegal activities, so it is essential to be prepared. To detect steganographic images, we have a counter-technique known as steganalysis. There are different types of steganalysis, depending on whether the original artifact (the cover work) is known and whether we know which algorithm was used for embedding. In terms of practical use, the most important are “blind steganalysis” methods, because they can be applied to image files when we do not have the original cover work for comparison. This doctoral thesis describes a methodology for image steganalysis. In this work, it is crucial to understand the behavior of the targeted steganography algorithm; we can then use its weaknesses to increase the detection capability and the success of categorization. We focus primarily on breaking the steganography algorithm OutGuess 2.0 and secondarily on breaking the F5 algorithm. We analyze the ability of a detector that utilizes a calibration process, a blockiness calculation, and a shallow neural network to detect the presence of a steganographic message in a suspected image. The new approach and results are discussed in this Ph.D. thesis.
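    The detector described above combines calibration, a blockiness measure and a shallow neural network. Below is a minimal, hypothetical sketch of the first two steps: compute a blockiness statistic on the suspect JPEG and on a calibrated copy (cropped by a few pixels and recompressed), and use their difference as a feature that a shallow classifier could consume. The crop size, the quality factor and the feature definition are assumptions, not the thesis's exact procedure.

```python
import io
import numpy as np
from PIL import Image

def blockiness(pixels):
    """Mean absolute discontinuity across 8x8 JPEG block boundaries."""
    v = np.abs(pixels[:, 7:-1:8].astype(int) - pixels[:, 8::8].astype(int)).mean()
    h = np.abs(pixels[7:-1:8, :].astype(int) - pixels[8::8, :].astype(int)).mean()
    return (v + h) / 2.0

def calibrated_feature(jpeg_path, crop=4, quality=75):
    """Blockiness difference between a suspect image and its calibrated copy."""
    img = Image.open(jpeg_path).convert("L")
    suspect = np.asarray(img)

    # Calibration: crop a few pixels to break the original block grid, then
    # recompress so the copy approximates the statistics of a cover image.
    cropped = img.crop((crop, crop, img.width, img.height))
    buf = io.BytesIO()
    cropped.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    calibrated = np.asarray(Image.open(buf).convert("L"))

    return blockiness(suspect) - blockiness(calibrated)
```

    A shallow network (even a single hidden layer) would then be trained on such features extracted from known cover and stego images.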

    Establishing the digital chain of evidence in biometric systems

    Get PDF
    Traditionally, a chain of evidence or chain of custody refers to the chronological documentation, or paper trail, showing the seizure, custody, control, transfer, analysis, and disposition of evidence, physical or electronic. Whether in the criminal justice system, military applications, or natural disasters, ensuring the accuracy and integrity of such chains is of paramount importance. Intentional or unintentional alteration, tampering, or fabrication of digital evidence can lead to undesirable effects. We find that, despite the consequences at stake, historically no unique protocol or standardized procedure has existed for establishing such chains; current practices rely on traditional paper trails and handwritten signatures as the foundation of chains of evidence. Copying, fabricating or deleting electronic data is easier than ever, and establishing equivalent digital chains of evidence has become both necessary and desirable. We propose to treat a chain of digital evidence as a multi-component validation problem that ensures the security of access control, confidentiality, integrity, and non-repudiation of origin. Our framework includes techniques from cryptography, keystroke analysis, digital watermarking, and hardware source identification. The work offers contributions to many of the fields used in the formation of the framework. Related to biometric watermarking, we provide a means for watermarking iris images without significantly impacting biometric performance. Specific to hardware fingerprinting, we establish the ability to verify the source of an image captured by biometric sensing devices such as fingerprint sensors and iris cameras. Related to keystroke dynamics, we establish that user stimulus familiarity is a driver of classification performance. Finally, example applications of the framework are demonstrated with data collected in crime scene investigations, people-screening activities at ports of entry, naval maritime interdiction operations, and mass fatality incident disaster responses.
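    The abstract does not give the protocol details, so the following is only a loose, hypothetical illustration of what a digital chain of evidence can look like at the data level: custody records linked by hashes and authenticated with an HMAC, giving integrity and a simple form of non-repudiation of origin. The record fields and the key handling are assumptions made for this sketch, not the authors' framework.

```python
import hashlib
import hmac
import json
import time

SECRET_KEY = b"replace-with-a-real-key"  # hypothetical signing key

def append_record(chain, actor, action, evidence_id):
    """Append a custody event, linking it to the previous record by hash."""
    prev_hash = chain[-1]["record_hash"] if chain else "0" * 64
    record = {
        "timestamp": time.time(),
        "actor": actor,
        "action": action,          # e.g. "seized", "transferred", "analysed"
        "evidence_id": evidence_id,
        "prev_hash": prev_hash,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["record_hash"] = hashlib.sha256(payload).hexdigest()
    record["hmac"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    chain.append(record)
    return chain

def verify(chain):
    """Check that each record links to its predecessor and has a valid HMAC."""
    prev = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k not in ("record_hash", "hmac")}
        payload = json.dumps(body, sort_keys=True).encode()
        if rec["prev_hash"] != prev:
            return False
        if hashlib.sha256(payload).hexdigest() != rec["record_hash"]:
            return False
        expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, rec["hmac"]):
            return False
        prev = rec["record_hash"]
    return True
```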

    Challenges and Open Questions of Machine Learning in Computer Security

    Get PDF
    This habilitation thesis presents advancements in machine learning for computer security, arising from problems in network intrusion detection and steganography. The thesis puts an emphasis on explaining the traits shared by steganalysis, network intrusion detection and other security domains, which make these domains different from computer vision, speech recognition and other fields where machine learning is typically studied. The thesis then presents methods developed to at least partially solve the identified problems, with the overall goal of making machine-learning-based intrusion detection systems viable. Most of them are general in the sense that they can be used outside intrusion detection and steganalysis on problems with similar constraints. A common feature of all the methods is that they are generally simple, yet surprisingly effective. In large-scale experiments they almost always improve on the prior art, which is likely because they are tailored to security problems and designed for large volumes of data. Specifically, the thesis addresses the following problems: anomaly detection with low computational and memory complexity, so that large volumes of data can be processed efficiently; multiple-instance anomaly detection, which improves the signal-to-noise ratio by classifying larger groups of samples; supervised classification of tree-structured data, simplifying its encoding in neural networks; clustering of structured data; supervised training with an emphasis on precision in the top p% of returned data; and finally, explanation of anomalies to help humans understand their nature and speed up their decisions. Many algorithms and methods presented in this thesis are deployed in a real intrusion detection system protecting millions of computers around the globe.
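    One of the problems listed above is multiple-instance anomaly detection, where scoring a whole bag of samples (for example, all traffic of one host) rather than individual samples improves the signal-to-noise ratio. The sketch below is a hypothetical, minimal version of that idea using a Mahalanobis-distance anomaly score; the score, the aggregation rule and the toy data are assumptions, not the thesis's methods.

```python
import numpy as np

def instance_scores(samples, mean, inv_cov):
    """Per-sample anomaly scores (squared Mahalanobis distance to 'normal')."""
    diff = samples - mean
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def bag_score(samples, mean, inv_cov, top_k=5):
    """Multiple-instance score: aggregate the most anomalous samples in a bag.

    Scoring a whole bag instead of single samples averages out noise and
    boosts the signal-to-noise ratio when only a few instances are malicious.
    """
    scores = instance_scores(samples, mean, inv_cov)
    return np.sort(scores)[-top_k:].mean()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    normal = rng.normal(size=(1000, 4))                  # 'clean' training data
    mean, inv_cov = normal.mean(0), np.linalg.inv(np.cov(normal.T))
    bag = rng.normal(size=(50, 4))
    bag[:3] += 6.0                                       # a few anomalous instances
    print(bag_score(bag, mean, inv_cov))
```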

    On the data hiding theory and multimedia content security applications

    Get PDF
    This dissertation is a comprehensive study of digital steganography for multimedia content protection. With the increasing development of Internet technology, the protection and enforcement of multimedia property rights has become a great concern to multimedia authors and distributors. Watermarking technologies provide a possible solution to this problem. The dissertation first briefly introduces current watermarking schemes, including their applications in video, image and audio. Most available embedding schemes are based on direct Spread Sequence (SS) modulation: a small-valued pseudo-random signature sequence is embedded into the host signal and the information is extracted via correlation. The correlation detection problem is discussed first, and it is concluded that the correlator is not optimal for oblivious detection. The Maximum Likelihood detector is derived and some feasible suboptimal detectors are also analyzed. Through the calculation of the extraction Bit Error Rate (BER), it is revealed that the SS scheme is not very efficient due to its poor host noise suppression. The watermark domain selection problem is addressed subsequently, and some implications for hiding capacity and reliability are also studied. The last topic in the SS modulation scheme is sequence selection. The relationship between sequence bandwidth and synchronization requirements is detailed in the work, and it is demonstrated that the white sequence commonly used in watermarking may not really boost watermark security. To address the host noise suppression problem, the hidden communication is modeled as a general hypothesis testing problem and a set partitioning scheme is proposed. Simulation studies and mathematical analysis confirm that it outperforms the SS schemes in host noise suppression, demonstrating improvement over the existing embedding schemes.

    Data hiding in audio signals is explored next. Audio data hiding is believed to be a more challenging task due to human sensitivity to audio artifacts and the advanced features of current compression techniques. The human psychoacoustic model and human music understanding are also covered in the work. Then, as a typical perceptual audio compression scheme, the popular MP3 compression is reviewed at some length. Several schemes, namely amplitude modulation, phase modulation and noise substitution, are presented together with some experimental results. As a case study, a music bitstream encryption scheme is proposed. In all these applications, the human psychoacoustic model plays a very important role. A more advanced audio analysis model is introduced to reveal implications for music understanding. In the last part, conclusions and future research are presented.
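    The correlation-detection discussion above can be made concrete with a toy sketch of the baseline scheme: embed a bit by adding a scaled pseudo-random signature to the host signal and detect it blindly by correlation. This is a generic spread-sequence illustration with made-up parameters, not the dissertation's proposed set-partitioning scheme; it mainly shows why the host signal itself acts as noise for the correlator.

```python
import numpy as np

def ss_embed(host, bit, prn, alpha=2.0):
    """Embed one bit by adding/subtracting a scaled pseudo-random sequence."""
    return host + (1 if bit else -1) * alpha * prn

def ss_detect(received, prn):
    """Blind correlation detector: sign of the correlation with the signature.

    The host counts as noise here, which is why the plain correlator is not
    optimal for oblivious (no-cover) detection, as the abstract points out.
    """
    return int(np.dot(received, prn) > 0)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    host = rng.normal(0, 10, size=4096)          # host signal (e.g. image block)
    prn = rng.choice([-1.0, 1.0], size=4096)     # pseudo-random signature
    stego = ss_embed(host, bit=1, prn=prn, alpha=0.5)
    print(ss_detect(stego, prn))                 # usually 1; host noise adds errors
```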

    Photo response non-uniformity based image forensics in the presence of challenging factors

    Get PDF
    With the ever-increasing prevalence of digital imaging devices and the rapid development of networks, the sharing of digital images has become ubiquitous in our daily life. However, the pervasiveness of powerful image-editing tools also makes digital images an easy target for malicious manipulation. Thus, to prevent people from falling victim to fake information and to trace criminal activities, digital image forensics methods such as source camera identification, source-oriented image clustering and image forgery detection have been developed. Photo response non-uniformity (PRNU), an intrinsic sensor noise that arises from the pixels' non-uniform response to the incident light, has been used as a powerful tool for imaging device fingerprinting, and the forensic community has developed a vast number of PRNU-based methods in different fields of digital image forensics. However, technological advances in digital photography, the emergence of photo-sharing social networking sites, and anti-forensics attacks targeting the PRNU bring new challenges to PRNU-based image forensics. For example, the performance of existing forensic methods may deteriorate under different camera exposure parameter settings, and the efficacy of PRNU-based methods can be directly challenged by image-editing tools from social network sites or by anti-forensics attacks.

    The objective of this thesis is to investigate and design effective methods to mitigate some of these challenges to PRNU-based image forensics. We found that the camera exposure parameter settings, especially the camera sensitivity, commonly known as the ISO speed, can influence PRNU-based image forgery detection. Hence, we first construct the Warwick Image Forensics Dataset, which contains images taken with diverse exposure parameter settings, to facilitate further studies. To address the impact of ISO speed on PRNU-based image forgery detection, an ISO speed-specific correlation prediction process is proposed, with a content-based ISO speed inference method to facilitate the process even if the ISO speed information is not available. We also propose a three-step framework that allows PRNU-based source-oriented clustering methods to perform successfully on Instagram images, even though some of Instagram's built-in image filters may significantly distort the PRNU. Additionally, for the binary classification task of detecting whether an image's PRNU has been attacked or not, we propose a generative adversarial network-based training strategy for a neural network-based classifier, which makes the classifier generalize better to images subject to previously unseen attacks. The proposed methods are evaluated on public benchmarking datasets and on our Warwick Image Forensics Dataset, which is released to the public as well. The experimental results validate the effectiveness of the methods proposed in this thesis.
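    For context, the PRNU-based methods referred to above generally follow a pipeline of this shape: estimate a camera fingerprint by averaging noise residuals of images known to come from that camera, then match a query image's residual against the fingerprint by normalized correlation. The sketch below is a deliberately simplified, hypothetical version: a mean filter stands in for the wavelet denoiser normally used, and none of the thesis's ISO-specific or attack-aware handling is included.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def noise_residual(image, size=3):
    """Residual = image minus a denoised version (mean filter as a stand-in)."""
    image = image.astype(float)
    return image - uniform_filter(image, size=size)

def estimate_fingerprint(images):
    """Rough PRNU estimate: average the residuals of images from one camera."""
    return np.mean([noise_residual(img) for img in images], axis=0)

def correlation(residual, fingerprint):
    """Normalized correlation used to decide whether the camera took the image."""
    a = residual - residual.mean()
    b = fingerprint - fingerprint.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```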

    Deteção de propagação de ameaças e exfiltração de dados em redes empresariais (Detection of threat propagation and data exfiltration in enterprise networks)

    Get PDF
    Modern corporations nowadays face multiple threats within their networks. In an era in which companies are tightly dependent on information, these threats can seriously compromise the safety and integrity of sensitive data. Unauthorized access and illicit programs offer a way of penetrating corporate networks, traversing and propagating to other terminals across the private network in search of confidential data and business secrets. The efficiency of traditional security defenses is being questioned given the number of data breaches occurring nowadays, making it essential to develop new active monitoring systems with artificial intelligence capable of achieving near-perfect detection in very short time frames. However, network monitoring and the storage of network activity records are restricted and limited by legislation and by privacy strategies, such as encryption, that aim to protect the confidentiality of private parties.

    This dissertation proposes methodologies to infer behavior patterns and disclose anomalies from network traffic analysis, detecting slight variations compared with the normal profile. Bounded by OSI layers 1 to 4, raw data are modeled into features representing network observations and subsequently processed by machine learning algorithms to classify network activity. Assuming that a network terminal will inevitably be compromised, this work comprises two scenarios: a self-spreading threat that propagates over the internal network and a data exfiltration payload that dispatches confidential information to the public network. Although the features and modeling processes were tested on these two cases, the approach is generic and can be used in more complex scenarios as well as in different domains. The last chapter describes the proof-of-concept scenario and how the data were generated, along with some evaluation metrics to assess the model's performance. The tests showed promising results, ranging from 96% to 99% for the propagation case and from 86% to 97% for data exfiltration.
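    As a rough, hypothetical illustration of the kind of pipeline described above (not the dissertation's actual feature set or models), the sketch below aggregates layer 1-4 packet fields from an observation window into a small feature vector and trains a standard scikit-learn classifier on synthetic windows that mimic ordinary traffic versus exfiltration-like bursts.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def flow_features(packets):
    """Aggregate one observation window of packets into a feature vector.

    packets: list of (size_bytes, dst_port, is_outbound) tuples, a stand-in
    for whatever layer 1-4 fields are extracted from the capture.
    """
    sizes = np.array([p[0] for p in packets], dtype=float)
    ports = {p[1] for p in packets}
    out_ratio = np.mean([p[2] for p in packets])
    return [len(packets), sizes.sum(), sizes.mean(), sizes.std(),
            len(ports), out_ratio]

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    # Synthetic windows: label 1 mimics exfiltration-like bursts of large
    # outbound packets, label 0 mimics ordinary mixed traffic.
    X, y = [], []
    for label in (0, 1):
        for _ in range(200):
            n = rng.integers(20, 80)
            size_scale = 1500 if label else 400
            pkts = [(rng.integers(60, size_scale), rng.integers(1, 1024),
                     label or rng.integers(0, 2)) for _ in range(n)]
            X.append(flow_features(pkts))
            y.append(label)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xtr, ytr)
    print("accuracy:", clf.score(Xte, yte))
```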

    Generalized transfer component analysis for mismatched JPEG steganalysis

    No full text