327 research outputs found
Generalized Inpainting Method for Hyperspectral Image Acquisition
A recently designed hyperspectral imaging device enables multiplexed
acquisition of an entire data volume in a single snapshot thanks to
monolithically-integrated spectral filters. Such an agile imaging technique
comes at the cost of a reduced spatial resolution and the need for a
demosaicing procedure on its interleaved data. In this work, we address both
issues and propose an approach inspired by recent developments in compressed
sensing and analysis sparse models. We formulate our superresolution and
demosaicing task as a 3-D generalized inpainting problem. Interestingly, the
target spatial resolution can be adjusted for mitigating the compression level
of our sensing. The reconstruction procedure uses a fast greedy method called
Pseudo-inverse IHT. We also show on simulations that a random arrangement of
the spectral filters on the sensor is preferable to regular mosaic layout as it
improves the quality of the reconstruction. The efficiency of our technique is
demonstrated through numerical experiments on both synthetic and real data as
acquired by the snapshot imager.Comment: Keywords: Hyperspectral, inpainting, iterative hard thresholding,
sparse models, CMOS, Fabry-P\'ero
A Comprehensive Review on Digital Image Watermarking
The advent of the Internet led to the easy availability of digital data like
images, audio, and video. Easy access to multimedia gives rise to the issues
such as content authentication, security, copyright protection, and ownership
identification. Here, we discuss the concept of digital image watermarking with
a focus on the technique used in image watermark embedding and extraction of
the watermark. The detailed classification along with the basic
characteristics, namely visual imperceptibility, robustness, capacity, security
of digital watermarking is also presented in this work. Further, we have also
discussed the recent application areas of digital watermarking such as
healthcare, remote education, electronic voting systems, and the military. The
robustness is evaluated by examining the effect of image processing attacks on
the signed content and the watermark recoverability. The authors believe that
the comprehensive survey presented in this paper will help the new researchers
to gather knowledge in this domain. Further, the comparative analysis can
enkindle ideas to improve upon the already mentioned techniques
Maximum Energy Subsampling: A General Scheme For Multi-resolution Image Representation And Analysis
Image descriptors play an important role in image representation and analysis. Multi-resolution image descriptors can effectively characterize complex images and extract their hidden information.
Wavelets descriptors have been widely used in multi-resolution image analysis. However, making the wavelets transform shift and rotation invariant produces redundancy and requires complex matching processes. As to other multi-resolution descriptors, they usually depend on other theories or information, such as filtering function, prior-domain knowledge, etc.; that not only increases the computation complexity, but also generates errors.
We propose a novel multi-resolution scheme that is capable of transforming any kind of image descriptor into its multi-resolution structure with high computation accuracy and efficiency. Our multi-resolution scheme is based on sub-sampling an image into an odd-even image tree. Through applying image descriptors to the odd-even image tree, we get the relative multi-resolution image descriptors. Multi-resolution analysis is based on downsampling expansion with maximum energy extraction followed by upsampling reconstruction. Since the maximum energy usually retained in the lowest frequency coefficients; we do maximum energy extraction through keeping the lowest coefficients from each resolution level.
Our multi-resolution scheme can analyze images recursively and effectively without introducing artifacts or changes to the original images, produce multi-resolution representations, obtain higher resolution images only using information from lower resolutions, compress data, filter noise, extract effective image features and be implemented in parallel processing
Hybrid LSTM and Encoder-Decoder Architecture for Detection of Image Forgeries
With advanced image journaling tools, one can easily alter the semantic
meaning of an image by exploiting certain manipulation techniques such as
copy-clone, object splicing, and removal, which mislead the viewers. In
contrast, the identification of these manipulations becomes a very challenging
task as manipulated regions are not visually apparent. This paper proposes a
high-confidence manipulation localization architecture which utilizes
resampling features, Long-Short Term Memory (LSTM) cells, and encoder-decoder
network to segment out manipulated regions from non-manipulated ones.
Resampling features are used to capture artifacts like JPEG quality loss,
upsampling, downsampling, rotation, and shearing. The proposed network exploits
larger receptive fields (spatial maps) and frequency domain correlation to
analyze the discriminative characteristics between manipulated and
non-manipulated regions by incorporating encoder and LSTM network. Finally,
decoder network learns the mapping from low-resolution feature maps to
pixel-wise predictions for image tamper localization. With predicted mask
provided by final layer (softmax) of the proposed architecture, end-to-end
training is performed to learn the network parameters through back-propagation
using ground-truth masks. Furthermore, a large image splicing dataset is
introduced to guide the training process. The proposed method is capable of
localizing image manipulations at pixel level with high precision, which is
demonstrated through rigorous experimentation on three diverse datasets
Structured Sparsity: Discrete and Convex approaches
Compressive sensing (CS) exploits sparsity to recover sparse or compressible
signals from dimensionality reducing, non-adaptive sensing mechanisms. Sparsity
is also used to enhance interpretability in machine learning and statistics
applications: While the ambient dimension is vast in modern data analysis
problems, the relevant information therein typically resides in a much lower
dimensional space. However, many solutions proposed nowadays do not leverage
the true underlying structure. Recent results in CS extend the simple sparsity
idea to more sophisticated {\em structured} sparsity models, which describe the
interdependency between the nonzero components of a signal, allowing to
increase the interpretability of the results and lead to better recovery
performance. In order to better understand the impact of structured sparsity,
in this chapter we analyze the connections between the discrete models and
their convex relaxations, highlighting their relative advantages. We start with
the general group sparse model and then elaborate on two important special
cases: the dispersive and the hierarchical models. For each, we present the
models in their discrete nature, discuss how to solve the ensuing discrete
problems and then describe convex relaxations. We also consider more general
structures as defined by set functions and present their convex proxies.
Further, we discuss efficient optimization solutions for structured sparsity
problems and illustrate structured sparsity in action via three applications.Comment: 30 pages, 18 figure
Feature extraction using MPEG-CDVS and Deep Learning with application to robotic navigation and image classification
The main contributions of this thesis are the evaluation of MPEG Compact Descriptor for Visual Search in the context of indoor robotic navigation and the introduction of a new method for training Convolutional Neural Networks with applications to
object classification.
The choice for image descriptor in a visual navigation system is not straightforward. Visual descriptors must be distinctive enough to allow for correct localisation while still offering low matching complexity and short descriptor size for real-time applications. MPEG Compact Descriptor for Visual Search is a low complexity image descriptor that offers several levels of compromises between descriptor distinctiveness and size. In this work, we describe how these trade-offs can be used for efficient loop-detection in a typical indoor environment. We first describe a probabilistic approach to loop detection based on the standard’s suggested similarity metric. We then evaluate the performance of CDVS compression modes in terms
of matching speed, feature extraction, and storage requirements and compare them with the state of the art SIFT descriptor for five different types of indoor floors.
During the second part of this thesis we focus on the new paradigm to machine learning and computer vision called Deep Learning. Under this paradigm visual features are no longer extracted using fine-grained, highly engineered feature extractor, but rather using a Convolutional Neural Networks (CNN) that extracts hierarchical features learned directly from data at the cost of long training periods.
In this context, we propose a method for speeding up the training of Convolutional Neural Networks (CNN) by exploiting the spatial scaling property of convolutions. This is done by first training a pre-train CNN of smaller kernel resolutions for a few epochs, followed by properly rescaling its kernels to the target’s original dimensions and continuing training at full resolution. We show that the overall training time of a target CNN architecture can be reduced by exploiting the spatial scaling property of convolutions during early stages of learning. Moreover, by rescaling the kernels at different epochs, we identify a trade-off between total training time and maximum obtainable accuracy. Finally, we propose a method for choosing when to rescale kernels and evaluate our approach on recent architectures showing savings in training times of nearly 20% while test set accuracy is preserved
Oblivious data hiding : a practical approach
This dissertation presents an in-depth study of oblivious data hiding with the emphasis on quantization based schemes. Three main issues are specifically addressed:
1. Theoretical and practical aspects of embedder-detector design.
2. Performance evaluation, and analysis of performance vs. complexity tradeoffs.
3. Some application specific implementations.
A communications framework based on channel adaptive encoding and channel independent decoding is proposed and interpreted in terms of oblivious data hiding problem. The duality between the suggested encoding-decoding scheme and practical embedding-detection schemes are examined. With this perspective, a formal treatment of the processing employed in quantization based hiding methods is presented. In accordance with these results, the key aspects of embedder-detector design problem for practical methods are laid out, and various embedding-detection schemes are compared in terms of probability of error, normalized correlation, and hiding rate performance merits assuming AWGN attack scenarios and using mean squared error distortion measure.
The performance-complexity tradeoffs available for large and small embedding signal size (availability of high bandwidth and limitation of low bandwidth) cases are examined and some novel insights are offered. A new codeword generation scheme is proposed to enhance the performance of low-bandwidth applications. Embeddingdetection schemes are devised for watermarking application of data hiding, where robustness against the attacks is the main concern rather than the hiding rate or payload. In particular, cropping-resampling and lossy compression types of noninvertible attacks are considered in this dissertation work
An Evaluation of Popular Copy-Move Forgery Detection Approaches
A copy-move forgery is created by copying and pasting content within the same
image, and potentially post-processing it. In recent years, the detection of
copy-move forgeries has become one of the most actively researched topics in
blind image forensics. A considerable number of different algorithms have been
proposed focusing on different types of postprocessed copies. In this paper, we
aim to answer which copy-move forgery detection algorithms and processing steps
(e.g., matching, filtering, outlier detection, affine transformation
estimation) perform best in various postprocessing scenarios. The focus of our
analysis is to evaluate the performance of previously proposed feature sets. We
achieve this by casting existing algorithms in a common pipeline. In this
paper, we examined the 15 most prominent feature sets. We analyzed the
detection performance on a per-image basis and on a per-pixel basis. We created
a challenging real-world copy-move dataset, and a software framework for
systematic image manipulation. Experiments show, that the keypoint-based
features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA and
Zernike features perform very well. These feature sets exhibit the best
robustness against various noise sources and downsampling, while reliably
identifying the copied regions.Comment: Main paper: 14 pages, supplemental material: 12 pages, main paper
appeared in IEEE Transaction on Information Forensics and Securit
- …