5 research outputs found

    multi-patch aggregation models for resampling detection

    Full text link
    Images captured nowadays are of varying dimensions with smartphones and DSLR's allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well for images of varying dimensions. However, in our experiments, we observed that many state-of-the-art forensic algorithms are sensitive to image size and their performance quickly degenerates when operated on images of diverse dimensions despite re-training them using multiple image sizes. To handle this issue, we propose a novel pooling strategy called ITERATIVE POOLING. This pooling strategy can dynamically adjust input tensors in a discrete without much loss of information as in ROI Max-pooling. This pooling strategy can be used with any of the existing deep models and for demonstration purposes, we show its utility on Resnet-18 for the case of resampling detection a fundamental operation for any image sought of image manipulation. Compared to existing strategies and Max-pooling it gives up to 7-8% improvement on public datasets.Comment: 6 pages; 6 tables; 4 figure

    Towards Effective Image Forensics via A Novel Computationally Efficient Framework and A New Image Splice Dataset

    Full text link
    Splice detection models are the need of the hour since splice manipulations can be used to mislead, spread rumors and create disharmony in society. However, there is a severe lack of image splicing datasets, which restricts the capabilities of deep learning models to extract discriminative features without overfitting. This manuscript presents two-fold contributions toward splice detection. Firstly, a novel splice detection dataset is proposed having two variants. The two variants include spliced samples generated from code and through manual editing. Spliced images in both variants have corresponding binary masks to aid localization approaches. Secondly, a novel Spatio-Compression Lightweight Splice Detection Framework is proposed for accurate splice detection with minimum computational cost. The proposed dual-branch framework extracts discriminative spatial features from a lightweight spatial branch. It uses original resolution compression data to extract double compression artifacts from the second branch, thereby making it 'information preserving.' Several CNNs are tested in combination with the proposed framework on a composite dataset of images from the proposed dataset and the CASIA v2.0 dataset. The best model accuracy of 0.9382 is achieved and compared with similar state-of-the-art methods, demonstrating the superiority of the proposed framework

    RemNet: Remnant Convolutional Neural Network for Camera Model Identification and Image Manipulation Detection

    Get PDF
    Camera model identification (CMI) and image manipulation detection are of paramount importance in image forensics as digitally altered images are becoming increasingly commonplace. In this thesis, we propose a novel convolutional neural network (CNN) architecture for performing these two crucial tasks. Our proposed Remnant Convolutional Neural Network (RemNet) is designed with emphasis given on the preprocessing task considered to be inevitable for removing the scene content that heavily obscures the camera model fingerprints and image manipulation artifacts. Unlike the conventional approaches where fixed filters are used for preprocessing, the proposed remnant blocks, when coupled with a classification block and trained end-to-end, learn to suppress the unnecessary image contents dynamically. This helps the classification block extract more robust images forensics features from the remnant of the image. We also propose a variant of the network titled L2-constrained Remnant Convolutional Neural Network (L2-constrained RemNet), where an L2 loss is applied to the output of the preprocessor block, and categorical crossentropy loss is calculated based on the output of the classification block. The whole network is trained in an end-to-end manner by minimizing the total loss, which is a combination of the L2 loss and the categorical crossentropy loss. The whole network, consisting of a preprocessing block and a shallow classification block, when trained on 18 models from the Dresden database, shows 100% accuracy for 16 camera models with an overall accuracy of 98.15% on test images from unseen devices and scenes, outperforming the state-of-the-art deep CNNs used in CMI. Furthermore, the proposed remnant blocks, when cascaded with the existing deep CNNs, e.g., ResNet, DenseNet, boost their performances by a large margin. The proposed approach proves to be very robust in identifying the source camera models, even if the original images are post-processed. It also achieves an overall accuracy of 95.49% on the IEEE Signal Processing Cup 2018 dataset, which indicates its generalizability. Furthermore, we attain an overall accuracy of 99.68% in image manipulation detection, which implies that it can be used as a general-purpose network for image forensic tasks
    corecore