12 research outputs found

    Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

    Full text link
    Two interlocking research questions of growing interest and importance in privacy research are Authorship Attribution (AA) and Authorship Obfuscation (AO). Given an artifact, especially a text t in question, an AA solution aims to accurately attribute t to its true author out of many candidate authors while an AO solution aims to modify t to hide its true authorship. Traditionally, the notion of authorship and its accompanying privacy concern is only toward human authors. However, in recent years, due to the explosive advancements in Neural Text Generation (NTG) techniques in NLP, capable of synthesizing human-quality open-ended texts (so-called "neural texts"), one has to now consider authorships by humans, machines, or their combination. Due to the implications and potential threats of neural texts when used maliciously, it has become critical to understand the limitations of traditional AA/AO solutions and develop novel AA/AO solutions in dealing with neural texts. In this survey, therefore, we make a comprehensive review of recent literature on the attribution and obfuscation of neural text authorship from a Data Mining perspective, and share our view on their limitations and promising research directions.Comment: Accepted at ACM SIGKDD Explorations, Vol. 25, June 202

    Synthetic steganography: Methods for generating and detecting covert channels in generated media

    Get PDF
    Issues of privacy in communication are becoming increasingly important. For many people and businesses, the use of strong cryptographic protocols is sufficient to protect their communications. However, the overt use of strong cryptography may be prohibited or individual entities may be prohibited from communicating directly. In these cases, a secure alternative to the overt use of strong cryptography is required. One promising alternative is to hide the use of cryptography by transforming ciphertext into innocuous-seeming messages to be transmitted in the clear. ^ In this dissertation, we consider the problem of synthetic steganography: generating and detecting covert channels in generated media. We start by demonstrating how to generate synthetic time series data that not only mimic an authentic source of the data, but also hide data at any of several different locations in the reversible generation process. We then design a steganographic context-sensitive tiling system capable of hiding secret data in a variety of procedurally-generated multimedia objects. Next, we show how to securely hide data in the structure of a Huffman tree without affecting the length of the codes. Next, we present a method for hiding data in Sudoku puzzles, both in the solved board and the clue configuration. Finally, we present a general framework for exploiting steganographic capacity in structured interactions like online multiplayer games, network protocols, auctions, and negotiations. Recognizing that structured interactions represent a vast field of novel media for steganography, we also design and implement an open-source extensible software testbed for analyzing steganographic interactions and use it to measure the steganographic capacity of several classic games. ^ We analyze the steganographic capacity and security of each method that we present and show that existing steganalysis techniques cannot accurately detect the usage of the covert channels. We develop targeted steganalysis techniques which improve detection accuracy and then use the insights gained from those methods to improve the security of the steganographic systems. We find that secure synthetic steganography, and accurate steganalysis thereof, depends on having access to an accurate model of the cover media

    Multimedia Forensics

    Get PDF
    This book is open access. Media forensics has never been more relevant to societal life. Not only media content represents an ever-increasing share of the data traveling on the net and the preferred communications means for most users, it has also become integral part of most innovative applications in the digital information ecosystem that serves various sectors of society, from the entertainment, to journalism, to politics. Undoubtedly, the advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape powered by innovative imaging technologies and sophisticated tools, based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field

    Multimedia Forensics

    Get PDF
    This book is open access. Media forensics has never been more relevant to societal life. Not only media content represents an ever-increasing share of the data traveling on the net and the preferred communications means for most users, it has also become integral part of most innovative applications in the digital information ecosystem that serves various sectors of society, from the entertainment, to journalism, to politics. Undoubtedly, the advances in deep learning and computational imaging contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge in establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape powered by innovative imaging technologies and sophisticated tools, based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensics capabilities that relate to media attribution, integrity and authenticity verification, and counter forensics. Its content is developed to provide practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field

    Covert Communication in Autoencoder Wireless Systems

    Get PDF
    The broadcast nature of wireless communications presents security and privacy challenges. Covert communication is a wireless security practice that focuses on intentionally hiding transmitted information. Recently, wireless systems have experienced significant growth, including the emergence of autoencoder-based models. These models, like other DNN architectures, are vulnerable to adversarial attacks, highlighting the need to study their susceptibility to covert communication. While there is ample research on covert communication in traditional wireless systems, the investigation of autoencoder wireless systems remains scarce. Furthermore, many existing covert methods are either detectable analytically or difficult to adapt to diverse wireless systems. The first part of this thesis provides a comprehensive examination of autoencoder-based communication systems in various scenarios and channel conditions. It begins with an introduction to autoencoder communication systems, followed by a detailed discussion of our own implementation and evaluation results. This serves as a solid foundation for the subsequent part of the thesis, where we propose a GAN-based covert communication model. By treating the covert sender, covert receiver, and observer as generator, decoder, and discriminator neural networks, respectively, we conduct joint training in an adversarial setting to develop a covert communication scheme that can be integrated into any normal autoencoder. Our proposal minimizes the impact on ongoing normal communication, addressing previous works shortcomings. We also introduce a training algorithm that allows for the desired tradeoff between covertness and reliability. Numerical results demonstrate the establishment of a reliable and undetectable channel between covert users, regardless of the cover signal or channel condition, with minimal disruption to the normal system operation

    Digital Signal Processing (Second Edition)

    Get PDF
    This book provides an account of the mathematical background, computational methods and software engineering associated with digital signal processing. The aim has been to provide the reader with the mathematical methods required for signal analysis which are then used to develop models and algorithms for processing digital signals and finally to encourage the reader to design software solutions for Digital Signal Processing (DSP). In this way, the reader is invited to develop a small DSP library that can then be expanded further with a focus on his/her research interests and applications. There are of course many excellent books and software systems available on this subject area. However, in many of these publications, the relationship between the mathematical methods associated with signal analysis and the software available for processing data is not always clear. Either the publications concentrate on mathematical aspects that are not focused on practical programming solutions or elaborate on the software development of solutions in terms of working ‘black-boxes’ without covering the mathematical background and analysis associated with the design of these software solutions. Thus, this book has been written with the aim of giving the reader a technical overview of the mathematics and software associated with the ‘art’ of developing numerical algorithms and designing software solutions for DSP, all of which is built on firm mathematical foundations. For this reason, the work is, by necessity, rather lengthy and covers a wide range of subjects compounded in four principal parts. Part I provides the mathematical background for the analysis of signals, Part II considers the computational techniques (principally those associated with linear algebra and the linear eigenvalue problem) required for array processing and associated analysis (error analysis for example). Part III introduces the reader to the essential elements of software engineering using the C programming language, tailored to those features that are used for developing C functions or modules for building a DSP library. The material associated with parts I, II and III is then used to build up a DSP system by defining a number of ‘problems’ and then addressing the solutions in terms of presenting an appropriate mathematical model, undertaking the necessary analysis, developing an appropriate algorithm and then coding the solution in C. This material forms the basis for part IV of this work. In most chapters, a series of tutorial problems is given for the reader to attempt with answers provided in Appendix A. These problems include theoretical, computational and programming exercises. Part II of this work is relatively long and arguably contains too much material on the computational methods for linear algebra. However, this material and the complementary material on vector and matrix norms forms the computational basis for many methods of digital signal processing. Moreover, this important and widely researched subject area forms the foundations, not only of digital signal processing and control engineering for example, but also of numerical analysis in general. The material presented in this book is based on the lecture notes and supplementary material developed by the author for an advanced Masters course ‘Digital Signal Processing’ which was first established at Cranfield University, Bedford in 1990 and modified when the author moved to De Montfort University, Leicester in 1994. The programmes are still operating at these universities and the material has been used by some 700++ graduates since its establishment and development in the early 1990s. The material was enhanced and developed further when the author moved to the Department of Electronic and Electrical Engineering at Loughborough University in 2003 and now forms part of the Department’s post-graduate programmes in Communication Systems Engineering. The original Masters programme included a taught component covering a period of six months based on two semesters, each Semester being composed of four modules. The material in this work covers the first Semester and its four parts reflect the four modules delivered. The material delivered in the second Semester is published as a companion volume to this work entitled Digital Image Processing, Horwood Publishing, 2005 which covers the mathematical modelling of imaging systems and the techniques that have been developed to process and analyse the data such systems provide. Since the publication of the first edition of this work in 2003, a number of minor changes and some additions have been made. The material on programming and software engineering in Chapters 11 and 12 has been extended. This includes some additions and further solved and supplementary questions which are included throughout the text. Nevertheless, it is worth pointing out, that while every effort has been made by the author and publisher to provide a work that is error free, it is inevitable that typing errors and various ‘bugs’ will occur. If so, and in particular, if the reader starts to suffer from a lack of comprehension over certain aspects of the material (due to errors or otherwise) then he/she should not assume that there is something wrong with themselves, but with the author

    Deep Neural Networks and Data for Automated Driving

    Get PDF
    This open access book brings together the latest developments from industry and research on automated driving and artificial intelligence. Environment perception for highly automated driving heavily employs deep neural networks, facing many challenges. How much data do we need for training and testing? How to use synthetic data to save labeling costs for training? How do we increase robustness and decrease memory usage? For inevitably poor conditions: How do we know that the network is uncertain about its decisions? Can we understand a bit more about what actually happens inside neural networks? This leads to a very practical problem particularly for DNNs employed in automated driving: What are useful validation techniques and how about safety? This book unites the views from both academia and industry, where computer vision and machine learning meet environment perception for highly automated driving. Naturally, aspects of data, robustness, uncertainty quantification, and, last but not least, safety are at the core of it. This book is unique: In its first part, an extended survey of all the relevant aspects is provided. The second part contains the detailed technical elaboration of the various questions mentioned above

    JTIT

    Get PDF
    kwartalni
    corecore