4,046 research outputs found
Fast object detection in compressed JPEG Images
Object detection in still images has drawn a lot of attention over past few
years, and with the advent of Deep Learning impressive performances have been
achieved with numerous industrial applications. Most of these deep learning
models rely on RGB images to localize and identify objects in the image.
However in some application scenarii, images are compressed either for storage
savings or fast transmission. Therefore a time consuming image decompression
step is compulsory in order to apply the aforementioned deep models. To
alleviate this drawback, we propose a fast deep architecture for object
detection in JPEG images, one of the most widespread compression format. We
train a neural network to detect objects based on the blockwise DCT (discrete
cosine transform) coefficients {issued from} the JPEG compression algorithm. We
modify the well-known Single Shot multibox Detector (SSD) by replacing its
first layers with one convolutional layer dedicated to process the DCT inputs.
Experimental evaluations on PASCAL VOC and industrial dataset comprising images
of road traffic surveillance show that the model is about faster than
regular SSD with promising detection performances. To the best of our
knowledge, this paper is the first to address detection in compressed JPEG
images
Semantic Perceptual Image Compression using Deep Convolution Networks
It has long been considered a significant problem to improve the visual
quality of lossy image and video compression. Recent advances in computing
power together with the availability of large training data sets has increased
interest in the application of deep learning cnns to address image recognition
and image processing tasks. Here, we present a powerful cnn tailored to the
specific task of semantic image understanding to achieve higher visual quality
in lossy compression. A modest increase in complexity is incorporated to the
encoder which allows a standard, off-the-shelf jpeg decoder to be used. While
jpeg encoding may be optimized for generic images, the process is ultimately
unaware of the specific content of the image to be compressed. Our technique
makes jpeg content-aware by designing and training a model to identify multiple
semantic regions in a given image. Unlike object detection techniques, our
model does not require labeling of object positions and is able to identify
objects in a single pass. We present a new cnn architecture directed
specifically to image compression, which generates a map that highlights
semantically-salient regions so that they can be encoded at higher quality as
compared to background regions. By adding a complete set of features for every
class, and then taking a threshold over the sum of all feature activations, we
generate a map that highlights semantically-salient regions so that they can be
encoded at a better quality compared to background regions. Experiments are
presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset,
in which our algorithm achieves higher visual quality for the same compressed
size.Comment: Accepted to Data Compression Conference, 11 pages, 5 figure
CAS-CNN: A Deep Convolutional Neural Network for Image Compression Artifact Suppression
Lossy image compression algorithms are pervasively used to reduce the size of
images transmitted over the web and recorded on data storage media. However, we
pay for their high compression rate with visual artifacts degrading the user
experience. Deep convolutional neural networks have become a widespread tool to
address high-level computer vision tasks very successfully. Recently, they have
found their way into the areas of low-level computer vision and image
processing to solve regression problems mostly with relatively shallow
networks.
We present a novel 12-layer deep convolutional network for image compression
artifact suppression with hierarchical skip connections and a multi-scale loss
function. We achieve a boost of up to 1.79 dB in PSNR over ordinary JPEG and an
improvement of up to 0.36 dB over the best previous ConvNet result. We show
that a network trained for a specific quality factor (QF) is resilient to the
QF used to compress the input image - a single network trained for QF 60
provides a PSNR gain of more than 1.5 dB over the wide QF range from 40 to 76.Comment: 8 page
Localization of JPEG double compression through multi-domain convolutional neural networks
When an attacker wants to falsify an image, in most of cases she/he will
perform a JPEG recompression. Different techniques have been developed based on
diverse theoretical assumptions but very effective solutions have not been
developed yet. Recently, machine learning based approaches have been started to
appear in the field of image forensics to solve diverse tasks such as
acquisition source identification and forgery detection. In this last case, the
aim ahead would be to get a trained neural network able, given a to-be-checked
image, to reliably localize the forged areas. With this in mind, our paper
proposes a step forward in this direction by analyzing how a single or double
JPEG compression can be revealed and localized using convolutional neural
networks (CNNs). Different kinds of input to the CNN have been taken into
consideration, and various experiments have been carried out trying also to
evidence potential issues to be further investigated.Comment: Accepted to CVPRW 2017, Workshop on Media Forensic
An Evaluation of Popular Copy-Move Forgery Detection Approaches
A copy-move forgery is created by copying and pasting content within the same
image, and potentially post-processing it. In recent years, the detection of
copy-move forgeries has become one of the most actively researched topics in
blind image forensics. A considerable number of different algorithms have been
proposed focusing on different types of postprocessed copies. In this paper, we
aim to answer which copy-move forgery detection algorithms and processing steps
(e.g., matching, filtering, outlier detection, affine transformation
estimation) perform best in various postprocessing scenarios. The focus of our
analysis is to evaluate the performance of previously proposed feature sets. We
achieve this by casting existing algorithms in a common pipeline. In this
paper, we examined the 15 most prominent feature sets. We analyzed the
detection performance on a per-image basis and on a per-pixel basis. We created
a challenging real-world copy-move dataset, and a software framework for
systematic image manipulation. Experiments show, that the keypoint-based
features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA and
Zernike features perform very well. These feature sets exhibit the best
robustness against various noise sources and downsampling, while reliably
identifying the copied regions.Comment: Main paper: 14 pages, supplemental material: 12 pages, main paper
appeared in IEEE Transaction on Information Forensics and Securit
- …