18 research outputs found

    Natural Image Statistics for Digital Image Forensics

    Get PDF
    We describe a set of natural image statistics that are built upon two multi-scale image decompositions, the quadrature mirror filter pyramid decomposition and the local angular harmonic decomposition. These image statistics consist of first- and higher-order statistics that capture certain statistical regularities of natural images. We propose to apply these image statistics, together with classification techniques, to three problems in digital image forensics: (1) differentiating photographic images from computer-generated photorealistic images, (2) generic steganalysis; (3) rebroadcast image detection. We also apply these image statistics to the traditional art authentication for forgery detection and identification of artists in an art work. For each application we show the effectiveness of these image statistics and analyze their sensitivity and robustness

    Classifiers and machine learning techniques for image processing and computer vision

    Get PDF
    Orientador: Siome Klein GoldensteinTese (doutorado) - Universidade Estadual de Campinas, Instituto da ComputaçãoResumo: Neste trabalho de doutorado, propomos a utilizaçãoo de classificadores e técnicas de aprendizado de maquina para extrair informações relevantes de um conjunto de dados (e.g., imagens) para solução de alguns problemas em Processamento de Imagens e Visão Computacional. Os problemas de nosso interesse são: categorização de imagens em duas ou mais classes, detecçãao de mensagens escondidas, distinção entre imagens digitalmente adulteradas e imagens naturais, autenticação, multi-classificação, entre outros. Inicialmente, apresentamos uma revisão comparativa e crítica do estado da arte em análise forense de imagens e detecção de mensagens escondidas em imagens. Nosso objetivo é mostrar as potencialidades das técnicas existentes e, mais importante, apontar suas limitações. Com esse estudo, mostramos que boa parte dos problemas nessa área apontam para dois pontos em comum: a seleção de características e as técnicas de aprendizado a serem utilizadas. Nesse estudo, também discutimos questões legais associadas a análise forense de imagens como, por exemplo, o uso de fotografias digitais por criminosos. Em seguida, introduzimos uma técnica para análise forense de imagens testada no contexto de detecção de mensagens escondidas e de classificação geral de imagens em categorias como indoors, outdoors, geradas em computador e obras de arte. Ao estudarmos esse problema de multi-classificação, surgem algumas questões: como resolver um problema multi-classe de modo a poder combinar, por exemplo, caracteríisticas de classificação de imagens baseadas em cor, textura, forma e silhueta, sem nos preocuparmos demasiadamente em como normalizar o vetor-comum de caracteristicas gerado? Como utilizar diversos classificadores diferentes, cada um, especializado e melhor configurado para um conjunto de caracteristicas ou classes em confusão? Nesse sentido, apresentamos, uma tecnica para fusão de classificadores e caracteristicas no cenário multi-classe através da combinação de classificadores binários. Nós validamos nossa abordagem numa aplicação real para classificação automática de frutas e legumes. Finalmente, nos deparamos com mais um problema interessante: como tornar a utilização de poderosos classificadores binarios no contexto multi-classe mais eficiente e eficaz? Assim, introduzimos uma tecnica para combinação de classificadores binarios (chamados classificadores base) para a resolução de problemas no contexto geral de multi-classificação.Abstract: In this work, we propose the use of classifiers and machine learning techniques to extract useful information from data sets (e.g., images) to solve important problems in Image Processing and Computer Vision. We are particularly interested in: two and multi-class image categorization, hidden messages detection, discrimination among natural and forged images, authentication, and multiclassification. To start with, we present a comparative survey of the state-of-the-art in digital image forensics as well as hidden messages detection. Our objective is to show the importance of the existing solutions and discuss their limitations. In this study, we show that most of these techniques strive to solve two common problems in Machine Learning: the feature selection and the classification techniques to be used. Furthermore, we discuss the legal and ethical aspects of image forensics analysis, such as, the use of digital images by criminals. We introduce a technique for image forensics analysis in the context of hidden messages detection and image classification in categories such as indoors, outdoors, computer generated, and art works. From this multi-class classification, we found some important questions: how to solve a multi-class problem in order to combine, for instance, several different features such as color, texture, shape, and silhouette without worrying about the pre-processing and normalization of the combined feature vector? How to take advantage of different classifiers, each one custom tailored to a specific set of classes in confusion? To cope with most of these problems, we present a feature and classifier fusion technique based on combinations of binary classifiers. We validate our solution with a real application for automatic produce classification. Finally, we address another interesting problem: how to combine powerful binary classifiers in the multi-class scenario more effectively? How to boost their efficiency? In this context, we present a solution that boosts the efficiency and effectiveness of multi-class from binary techniques.DoutoradoEngenharia de ComputaçãoDoutor em Ciência da Computaçã

    Statistical Tools for Digital Image Forensics

    Get PDF
    A digitally altered image, often leaving no visual clues of having been tampered with, can be indistinguishable from an authentic image. The tampering, however, may disturb some underlying statistical properties of the image. Under this assumption, we propose five techniques that quantify and detect statistical perturbations found in different forms of tampered images: (1) re-sampled images (e.g., scaled or rotated); (2) manipulated color filter array interpolated images; (3) double JPEG compressed images; (4) images with duplicated regions; and (5) images with inconsistent noise patterns. These techniques work in the absence of any embedded watermarks or signatures. For each technique we develop the theoretical foundation, show its effectiveness on credible forgeries, and analyze its sensitivity and robustness to simple counter-attacks

    Handbook of Digital Face Manipulation and Detection

    Get PDF
    This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic, which address readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area

    Handbook of Digital Face Manipulation and Detection

    Get PDF
    This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics including contributions from academia and industry. Appealing to a broad readership, introductory chapters provide a comprehensive overview of the topic, which address readers wishing to gain a brief overview of the state-of-the-art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing at further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area

    Media Forensics and DeepFakes: an overview

    Full text link
    With the rapid progress of recent years, techniques that generate and manipulate multimedia content can now guarantee a very advanced level of realism. The boundary between real and synthetic media has become very thin. On the one hand, this opens the door to a series of exciting applications in different fields such as creative arts, advertising, film production, video games. On the other hand, it poses enormous security threats. Software packages freely available on the web allow any individual, without special skills, to create very realistic fake images and videos. So-called deepfakes can be used to manipulate public opinion during elections, commit fraud, discredit or blackmail people. Potential abuses are limited only by human imagination. Therefore, there is an urgent need for automated tools capable of detecting false multimedia content and avoiding the spread of dangerous false information. This review paper aims to present an analysis of the methods for visual media integrity verification, that is, the detection of manipulated images and videos. Special emphasis will be placed on the emerging phenomenon of deepfakes and, from the point of view of the forensic analyst, on modern data-driven forensic methods. The analysis will help to highlight the limits of current forensic tools, the most relevant issues, the upcoming challenges, and suggest future directions for research

    DEEP LEARNING FOR FASHION AND FORENSICS

    Get PDF
    Deep learning is the new electricity, which has dramatically reshaped people's everyday life. In this thesis, we focus on two emerging applications of deep learning - fashion and forensics. The ubiquity of online fashion shopping demands effective search and recommendation services for customers. To this end, we first propose an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites. We first fine-tune GoogleNet by jointly modeling clothing images and their corresponding descriptions in a visual-semantic embedding space. Then, for each attribute (word), we generate its spatially-aware representation by combining its semantic word vector representation with its spatial representation derived from the convolutional maps of the fine-tuned network. The resulting spatially-aware representations are further used to cluster attributes into multiple groups to form spatially-aware concepts (e.g., the neckline concept might consist of attributes like v-neck, round-neck}, \textit{etc}). Finally, we decompose the visual-semantic embedding space into multiple concept-specific subspaces, which facilitates structured browsing and attribute-feedback product retrieval by exploiting multimodal linguistic regularities. We conducted extensive experiments on our newly collected Fashion200K dataset, and results on clustering quality evaluation and attribute-feedback product retrieval task demonstrate the effectiveness of our automatically discovered spatially-aware concepts. For fashion recommendation tasks, we study two types of fashion recommendation: (i) suggesting an item that matches existing components in a set to form a stylish outfit (a collection of fashion items), and (ii) generating an outfit with multimodal (images/text) specifications from a user. To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion. More specifically, we consider a fashion outfit to be a sequence (usually from top to bottom and then accessories) and each item in the outfit as a time step. Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM) model to sequentially predict the next item conditioned on previous ones to learn their compatibility relationships. Further, we learn a visual-semantic space by regressing image features to their semantic representations aiming to inject attribute and category information as a regularization for training the LSTM. The trained network can not only perform the aforementioned recommendations effectively but also predict the compatibility of a given outfit. We conduct extensive experiments on our newly collected Polyvore dataset, and the results provide strong qualitative and quantitative evidence that our framework outperforms alternative methods. In addition to searching and recommendation, customers also would like to virtually try-on fashion items. We present an image-based VIirtual Try-On Network (VITON) without using 3D information in any form, which seamlessly transfers a desired clothing item onto the corresponding region of a person using a coarse-to-fine strategy. Conditioned upon a new clothing-agnostic yet descriptive person representation, our framework first generates a coarse synthesized image with the target clothing item overlaid on that same person in the same pose. We further enhance the initial blurry clothing area with a refinement network. The network is trained to learn how much detail to utilize from the target clothing item, and where to apply to the person in order to synthesize a photo-realistic image in which the target item deforms naturally with clear visual patterns. Experiments on our newly collected dataset demonstrate its promise in the image-based virtual try-on task over state-of-the-art generative models. Interestingly, VITON can be modified to swap faces instead of swapping clothing items. Conditioned on the landmarks of a face, generative adversarial networks can synthesize a target identity on to the original face keeping the original facial expression. We achieve this by introducing an identity preserving loss together with a perceptually-aware discriminator. The identity preserving loss tries to keep the synthesized face presents the same identity as the target, while the perceptually-aware discriminator ensures the generated face looks realistic. It is worth noticing that these face-swap techniques can be easily used to manipulated people's faces, and might cause serious social and political consequences. Researchers have developed powerful tools to detect these manipulations. In this dissertation, we utilize convolutional neural networks to boost the detection accuracy of tampered face or person in images. Firstly, a two-stream network is proposed to determine if a face has been tampered with. We train a GoogLeNet to detect tampering artifacts in a face classification stream, and train a patch based triplet network to leverage features capturing local noise residuals and camera characteristics as a second stream. In addition, we use two different online face swapping applications to create a new dataset that consists of 2010 tampered images, each of which contains a tampered face. We evaluate the proposed two-stream network on our newly collected dataset. Experimental results demonstrate the effectiveness of our method. Further, spliced people are also very common in image manipulation. We describe a tampering detection system containing multiple modules, which model different aspects of tampering traces. The system first detects faces in an image. Then, for each detected face, it enlarges the bounding box to include a portrait image of that person. Three models are fused to detect if this person (portrait) is tampered or not: (i) PortraintNet: A binary classifier fine-tuned on ImageNet pre-trained GoogLeNet. (ii) SegNet: A U-Net predicts tampered masks and boundaries, followed by a LeNet to classify if the predicted masks and boundaries indicating the image has been tampered with or not. (iii) EdgeNet: A U-Net predicts the edge mask of each portrait, and the extracted portrait edges are fed into a GoogLeNet for tampering classification. Experiments show that these three models are complementary and can be fused to effectively detect a spliced portrait image
    corecore