Contextual cropping and scaling of TV productions
This is the author's accepted manuscript; the final publication is available at Springer via http://dx.doi.org/10.1007/s11042-011-0804-3. Copyright © Springer Science+Business Media, LLC 2011. In this paper, an application is presented which automatically adapts SDTV (Standard Definition Television) sports productions to smaller displays through intelligent cropping and scaling. It crops regions of interest of sports productions based on a smart combination of production metadata and systematic video analysis methods. This approach allows a context-based composition of cropped images and differentiates between the original SD version of the production and the processed one adapted to the requirements of mobile TV. The system has been comprehensively evaluated by comparing the outcome of the proposed method with manually and statically cropped versions, as well as with non-cropped versions. Integration of the tool into post-production and live workflows is envisaged.
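The abstract above describes a crop-then-scale adaptation step but not its implementation. As a minimal illustrative sketch (the region of interest is assumed to be already given, e.g. from production metadata; the function name and the nearest-neighbour scaling are my own choices, not the paper's):

```python
import numpy as np

def crop_and_scale(frame, roi, out_h, out_w):
    """Crop a region of interest from an SD frame and scale it to a
    smaller target resolution using nearest-neighbour sampling."""
    top, left, h, w = roi
    crop = frame[top:top + h, left:left + w]
    # Map each output pixel back to a source pixel inside the crop.
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return crop[rows][:, cols]

# Toy example: a 576x720 "SD" luma frame, cropped around a hypothetical
# region of interest and scaled to a 144x176 mobile-style resolution.
frame = (np.arange(576 * 720) % 256).astype(np.uint8).reshape(576, 720)
small = crop_and_scale(frame, roi=(100, 200, 288, 360), out_h=144, out_w=176)
print(small.shape)  # (144, 176)
```

A production system would use a proper resampling filter (bilinear or Lanczos) rather than nearest-neighbour, but the crop/scale decomposition is the same.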
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting manipulation techniques such as
copy-clone, object splicing, and removal, which mislead viewers. At the
same time, identifying these manipulations is a very challenging task,
as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long Short-Term Memory (LSTM) cells, and an
encoder-decoder network to segment manipulated regions from
non-manipulated ones. Resampling features are used to capture artifacts
such as JPEG quality loss, upsampling, downsampling, rotation, and
shearing. The proposed network exploits larger receptive fields (spatial
maps) and frequency-domain correlation to analyze the discriminative
characteristics between manipulated and non-manipulated regions by
incorporating an encoder and an LSTM network. Finally, a decoder network
learns the mapping from low-resolution feature maps to pixel-wise
predictions for image tamper localization. With the predicted mask
provided by the final (softmax) layer of the proposed architecture,
end-to-end training is performed to learn the network parameters through
back-propagation using ground-truth masks. Furthermore, a large image
splicing dataset is introduced to guide the training process. The
proposed method is capable of localizing image manipulations at the
pixel level with high precision, which is demonstrated through rigorous
experimentation on three diverse datasets.
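The abstract's final training step — a per-pixel softmax compared against a ground-truth tamper mask — can be sketched numerically. This is not the paper's network, just a toy illustration of pixel-wise softmax plus cross-entropy on a 2-class (pristine/manipulated) map; all array shapes and names here are hypothetical:

```python
import numpy as np

def softmax(logits):
    """Numerically stable per-pixel softmax over the class axis (last)."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pixelwise_cross_entropy(logits, mask):
    """Mean cross-entropy between predicted class probabilities and a
    binary ground-truth tamper mask (1 = manipulated, 0 = pristine)."""
    probs = softmax(logits)
    # Pick the probability assigned to the true class at every pixel.
    p_true = np.take_along_axis(probs, mask[..., None], axis=-1)[..., 0]
    return -np.mean(np.log(p_true + 1e-12))

# Toy 4x4 decoder output with 2 classes; the ground truth marks the
# top-left quadrant as manipulated.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 2))
mask = np.zeros((4, 4), dtype=np.int64)
mask[:2, :2] = 1
loss = pixelwise_cross_entropy(logits, mask)       # scalar to backprop
pred_mask = softmax(logits).argmax(axis=-1)        # hard localization mask
```

In the actual architecture the gradient of this loss would flow back through the decoder, LSTM, and encoder; here only the forward loss is shown.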
FaceForensics++: Learning to Detect Manipulated Facial Images
The rapid progress in synthetic image generation and manipulation has now
come to a point where it raises significant concerns about its
implications for society. At best, this leads to a loss of trust in digital content, but
could potentially cause further harm by spreading false information or fake
news. This paper examines the realism of state-of-the-art image manipulations,
and how difficult it is to detect them, either automatically or by humans. To
standardize the evaluation of detection methods, we propose an automated
benchmark for facial manipulation detection. In particular, the benchmark is
based on DeepFakes, Face2Face, FaceSwap and NeuralTextures as prominent
representatives for facial manipulations at random compression levels and sizes.
The benchmark is publicly available and contains a hidden test set as well as a
database of over 1.8 million manipulated images. This dataset is over an order
of magnitude larger than comparable, publicly available, forgery datasets.
Based on this data, we performed a thorough analysis of data-driven forgery
detectors. We show that the use of additional domain-specific knowledge improves
forgery detection to unprecedented accuracy, even in the presence of strong
compression, and clearly outperforms human observers. Comment: Video: https://youtu.be/x2g48Q2I2Z
Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization
The self-media era provides us with a tremendous amount of high-quality
video. Unfortunately, frequent video copyright infringements are now
seriously damaging the interests and enthusiasm of video creators.
Identifying infringing videos is therefore a compelling task. Current
state-of-the-art methods tend to simply feed high-dimensional mixed
video features into deep neural networks and count on the networks to
extract useful representations. Despite its simplicity, this paradigm
heavily relies on the original entangled features and lacks constraints
guaranteeing that useful task-relevant semantics are extracted from the
features.
In this paper, we seek to tackle the above challenges from two aspects: (1)
We propose to disentangle an original high-dimensional feature into
multiple exclusive lower-dimensional sub-features. We expect the
sub-features to encode non-overlapping semantics of the original feature
and to remove redundant information.
(2) On top of the disentangled sub-features, we further learn an auxiliary
feature to enhance them. We theoretically analyze the mutual information
between the label and the disentangled features, arriving at a loss that
maximizes the extraction of task-relevant information from the original
feature.
Extensive experiments on two large-scale benchmark datasets (i.e., SVD and
VCSL) demonstrate that our method achieves 90.1% TOP-100 mAP on the large-scale
SVD dataset and also sets the new state-of-the-art on the VCSL benchmark
dataset. Our code and model have been released at
https://github.com/yyyooooo/DMI/, hoping to contribute to the community. Comment: This paper is accepted by ACM MM 202
Image objects detection based on boosting neural network
This paper discusses the problem of object area detection in video frames. The goal is to design a pixel-accurate detector for grass, which could be used for object-adaptive video enhancement. A boosting neural network is used to create such a detector. The resulting detector uses both textural and color features of the frames.
A robust forgery detection method for copy-move and splicing attacks in images
Internet of Things (IoT) image sensors, social media, and smartphones generate huge volumes of digital images every day. The easy availability and usability of photo editing tools have made forgery attacks, primarily splicing and copy-move attacks, effortless, causing cybercrimes to be on the rise. While several models have been proposed in the literature for detecting these attacks, the robustness of those models has not been investigated when (i) a low number of tampered images is available for model building, or (ii) images from IoT sensors are distorted due to image rotation or scaling caused by unwanted or unexpected changes in the sensors' physical set-up. Moreover, further improvement in detection accuracy is needed for real-world security management systems. To address these limitations, in this paper, an innovative image forgery detection method is proposed based on the Discrete Cosine Transform (DCT) and the Local Binary Pattern (LBP), together with a new feature extraction method using the mean operator. First, images are divided into non-overlapping fixed-size blocks, and the 2D block DCT is applied to capture changes due to image forgery. Then LBP is applied to the magnitude of the DCT array to enhance forgery artifacts. Finally, the mean value of each cell position across all LBP blocks is computed, which yields a fixed number of features and presents a more computationally efficient method. Using a Support Vector Machine (SVM), the proposed method has been extensively tested on four well-known publicly available grayscale and color image forgery datasets, and additionally on an IoT-based image forgery dataset that we built. Experimental results reveal the superiority of our proposed method over recent state-of-the-art methods in terms of widely used performance metrics and computational time, and demonstrate robustness against low availability of forged training samples. This research was funded by the Research Priority Area (RPA) scholarship of Federation University Australia.
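The feature-extraction pipeline above (block DCT → LBP on the DCT magnitude → per-cell mean across blocks) is concrete enough to sketch. This is a simplified illustration, not the paper's implementation: the 16×16 block size and the basic 8-neighbour LBP variant are my assumptions, and the paper's exact parameters may differ.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so the 2D block DCT is D @ B @ D.T."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    d = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    d[0, :] = np.sqrt(1.0 / n)
    return d

def lbp(block):
    """Basic 8-neighbour Local Binary Pattern on interior pixels."""
    c = block[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = block[1 + dy:block.shape[0] - 1 + dy,
                   1 + dx:block.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def forgery_features(image, bs=16):
    """Block DCT -> LBP of DCT magnitude -> per-cell mean across blocks."""
    d = dct_matrix(bs)
    h, w = image.shape[0] // bs * bs, image.shape[1] // bs * bs
    codes = []
    for y in range(0, h, bs):
        for x in range(0, w, bs):
            block = image[y:y + bs, x:x + bs].astype(float)
            mag = np.abs(d @ block @ d.T)   # 2D block DCT magnitude
            codes.append(lbp(mag))
    # Mean of each cell position across all LBP blocks: fixed-length vector.
    return np.mean(np.stack(codes), axis=0).ravel()

img = (np.arange(64 * 64) % 251).reshape(64, 64).astype(np.uint8)
feat = forgery_features(img)
print(feat.shape)  # ((16 - 2) ** 2,) = (196,)
```

The fixed-length vector `feat` would then be fed to an SVM classifier; the dimensionality depends only on the block size, not on the image size, which is what makes the mean-operator step computationally attractive.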
A Novel Dataset for Non-Destructive Inspection of Handwritten Documents
Forensic handwriting examination is a branch of forensic science that aims to
examine handwritten documents in order to properly identify or hypothesize the
manuscript's author. This analysis involves comparing two or more (digitized)
documents through a comprehensive comparison of intrinsic local and global
features. If a correlation exists and specific best practices are satisfied,
then it is possible to affirm that the documents under analysis were
written by the same individual. The need to create sophisticated tools capable
of extracting and comparing significant features has led to the development of
cutting-edge software with almost entirely automated processes, improving the
forensic examination of handwriting and achieving increasingly objective
evaluations. This is made possible by algorithmic solutions based on purely
mathematical concepts. Machine Learning and Deep Learning models trained with
specific datasets could turn out to be the key elements to best solve the task
at hand. In this paper, we propose a new and challenging dataset consisting of
two subsets: the first consists of 21 documents written either with the classic
pen-and-paper approach (and later digitized) or directly acquired on common
devices such as tablets; the second consists of 362 handwritten manuscripts by
124 different people, acquired following a specific pipeline. Our study
pioneered a comparison between traditionally handwritten documents and those
produced with digital tools (e.g., tablets). Preliminary results on the
proposed datasets show that 90% classification accuracy can be achieved on the
first subset (documents written on paper with pen and later digitized, and
documents written directly on tablets) and 96% on the second portion of the
data. The datasets are available at
https://iplab.dmi.unict.it/mfs/forensic-handwriting-analysis/novel-dataset-2023/. Comment: arXiv admin note: text overlap with arXiv:2310.1121