A Novel Semantics and Feature Preserving Perspective for Content Aware Image Retargeting
There is an increasing requirement for efficient image retargeting techniques
to adapt the content to various forms of digital media. With the rapid growth of
mobile communications and dynamic web page layouts, one often needs to resize
the media content to adapt to the desired display sizes. For various layouts of
web pages and the typically small screens of handheld portable devices, the important content of the original image becomes obscured when it is resized by uniform scaling. Images therefore need to be resized in a content-aware manner that automatically discards irrelevant information and gives greater emphasis to the salient features. Several image retargeting techniques have been proposed that take the content of the input image into account. However, these techniques fail to be effective across diverse kinds of images and target sizes.
The major problem is the inability of these algorithms to process images with minimal visual distortion while retaining the meaning the image conveys. In this dissertation, we present a novel perspective for content-aware image retargeting that is implementable in real time. We
introduce a novel method of analysing semantic information within the input
image while also maintaining the important and visually significant features.
We present the various nuances of our algorithm mathematically and logically,
and show that our results surpass those of state-of-the-art techniques.
Comment: 74 pages, 46 figures, Masters Thesis
Hierarchical Watermarking Framework Based on Analysis of Local Complexity Variations
Increasing production and exchange of multimedia content has increased the
need for better protection of copyright by means of watermarking. Different
methods have been proposed to satisfy the tradeoff between imperceptibility and
robustness as two important characteristics in watermarking while maintaining
proper data-embedding capacity. Many watermarking methods use an image-independent set of parameters. Different images, however, possess different potential for robust and
transparent hosting of watermark data. To overcome this deficiency, in this
paper we have proposed a new hierarchical adaptive watermarking framework. At
the higher level of hierarchy, complexity of an image is ranked in comparison
with complexities of images of a dataset. For a typical dataset of images, the
statistical distribution of block complexities is found. At the lower level of
the hierarchy, for a single cover image that is to be watermarked, complexities
of blocks can be found. Local complexity variation (LCV) among a block and its
neighbors is used to adaptively control the watermark strength factor of each
block. Such local complexity analysis creates an adaptive embedding scheme,
which results in higher transparency by reducing blockiness effects. This two-level hierarchy enables our method to take advantage of all image blocks to
elevate the embedding capacity while preserving imperceptibility. For testing
the effectiveness of the proposed framework, contourlet transform (CT) in
conjunction with discrete cosine transform (DCT) is used to embed pseudo-random
binary sequences as watermark. Experimental results show that the proposed
framework improves the performance of the watermarking routine in terms of both robustness and transparency.
Comment: 12 pages, 14 figures, 8 tables
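As an illustration of the adaptive-strength idea, the sketch below ranks each block's complexity (approximated here by pixel variance, an assumption) against its 3x3 neighborhood and attenuates the watermark strength factor where the local complexity variation is high; the paper's exact complexity measure and weighting may differ.

```python
import numpy as np

def block_complexity(block):
    """Complexity of a block, approximated here by pixel variance."""
    return float(np.var(block))

def lcv_strength_map(image, block=8, base_alpha=0.05):
    """Per-block watermark strength: blocks whose complexity deviates
    little from their neighbors' (low LCV) keep a strength near
    base_alpha; high-LCV blocks are attenuated to reduce visible
    blockiness. The exact weighting rule is an assumption."""
    h, w = image.shape
    bh, bw = h // block, w // block
    comp = np.empty((bh, bw))
    for i in range(bh):
        for j in range(bw):
            comp[i, j] = block_complexity(
                image[i * block:(i + 1) * block, j * block:(j + 1) * block])
    alpha = np.empty_like(comp)
    for i in range(bh):
        for j in range(bw):
            nb = comp[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            lcv = abs(comp[i, j] - nb.mean()) / (nb.mean() + 1e-8)
            alpha[i, j] = base_alpha / (1.0 + lcv)
    return alpha
```

The returned map would then scale the embedded watermark coefficients block by block.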
Face Sketch Synthesis Style Similarity: A New Structure Co-occurrence Texture Measure
Existing face sketch synthesis (FSS) similarity measures are sensitive to
slight image degradation (e.g., noise, blur). However, human perception of the
similarity of two sketches will consider both structure and texture as
essential factors and is not sensitive to slight ("pixel-level") mismatches.
Consequently, the use of existing similarity measures can lead to better
algorithms receiving a lower score than worse algorithms. This unreliable
evaluation has significantly hindered the development of the FSS field. To
solve this problem, we propose a novel and robust style similarity measure
called Scoot-measure (Structure CO-Occurrence Texture Measure), which
simultaneously evaluates "block-level" spatial structure and co-occurrence
texture statistics. In addition, we further propose 4 new meta-measures and
create 2 new datasets to perform a comprehensive evaluation of several
widely-used FSS measures on two large databases. Experimental results
demonstrate that our measure not only provides a reliable evaluation but also
achieves significantly improved performance. Specifically, the study indicated
a higher degree (78.8%) of correlation between our measure and human judgment
than the best prior measure (58.6%). Our code will be made available.
Comment: 9 pages, 15 figures, conference
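A minimal sketch of the co-occurrence idea behind such a measure: quantize each sketch to a few gray levels, collect horizontal co-occurrence statistics, and compare them by histogram intersection. The quantization level, pixel offset, and comparison rule here are illustrative assumptions, not the Scoot-measure's actual definition.

```python
import numpy as np

def cooccurrence_hist(img, levels=8):
    """Quantize to `levels` gray levels and count horizontal
    co-occurrences; a crude stand-in for co-occurrence texture
    statistics. img: 2-D float array in [0, 1]."""
    q = np.minimum((img * levels).astype(int), levels - 1)
    hist = np.zeros((levels, levels))
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        hist[a, b] += 1
    return hist / hist.sum()

def scoot_like_similarity(x, y, levels=8):
    """Histogram-intersection similarity of co-occurrence statistics;
    returns 1.0 for identical images."""
    hx, hy = cooccurrence_hist(x, levels), cooccurrence_hist(y, levels)
    return float(np.minimum(hx, hy).sum())
```

A full measure would additionally pool such statistics over "block-level" spatial cells, as the abstract describes.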
Reducing the Model Variance of a Rectal Cancer Segmentation Network
In preoperative imaging, the demarcation of rectal cancer with magnetic
resonance images provides an important basis for cancer staging and treatment
planning. Recently, deep learning has greatly improved the state-of-the-art
method in automatic segmentation. However, limitations in data availability in
the medical field can cause large variance and consequent overfitting to
medical image segmentation networks. In this study, we propose methods to
reduce the model variance of a rectal cancer segmentation network by adding a
rectum segmentation task and performing data augmentation; the geometric
correlation between the rectum and rectal cancer motivated the former approach.
Moreover, we propose a method to perform a bias-variance analysis within an
arbitrary region-of-interest (ROI) of a segmentation network, which we applied
to assess the efficacy of our approaches in reducing model variance. As a
result, adding a rectum segmentation task reduced the model variance of the
rectal cancer segmentation network within tumor regions by a factor of 0.90;
data augmentation further reduced the variance by a factor of 0.89. These
approaches also reduced the training duration by a factor of 0.96 and a further
factor of 0.78, respectively. Our approaches will improve the quality of rectal
cancer staging by increasing the accuracy of its automatic demarcation and by
providing rectum boundary information since rectal cancer staging requires the
demarcation of both rectum and rectal cancer. Besides such clinical benefits,
our method also enables segmentation networks to be assessed with bias-variance
analysis within an arbitrary ROI, such as a cancerous region.
Comment: published in IEEE Access
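The variance side of such an ROI-restricted analysis can be sketched as follows, assuming an ensemble of trained models whose soft predictions stand in for repeated training draws; the paper's exact estimator is not reproduced here.

```python
import numpy as np

def roi_model_variance(predictions, roi_mask):
    """Mean per-voxel variance of an ensemble's soft predictions,
    restricted to an ROI mask.
    predictions: (n_models, H, W) soft segmentation maps in [0, 1].
    roi_mask:    (H, W) boolean region of interest."""
    var_map = predictions.var(axis=0)       # per-voxel variance
    return float(var_map[roi_mask].mean())  # averaged over the ROI
```

Passing a tumor mask as `roi_mask` restricts the analysis to the cancerous region, as described above.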
Forensic Similarity for Digital Images
In this paper we introduce a new digital image forensics approach called
forensic similarity, which determines whether two image patches contain the
same forensic trace or different forensic traces. One benefit of this approach
is that prior knowledge, e.g., training samples, of a forensic trace is not required to make a forensic similarity decision on it in the future. To do
this, we propose a two part deep-learning system composed of a CNN-based
feature extractor and a three-layer neural network, called the similarity
network. This system maps pairs of image patches to a score indicating whether
they contain the same or different forensic traces. We evaluated system
accuracy of determining whether two image patches were 1) captured by the same
or different camera model, 2) manipulated by the same or different editing
operation, and 3) manipulated by the same or different manipulation parameter,
given a particular editing operation. Experiments demonstrate applicability to
a variety of forensic traces, and importantly show efficacy on "unknown"
forensic traces that were not used to train the system. Experiments also show
that the proposed system significantly improves upon prior art, reducing error
rates by more than half. Furthermore, we demonstrated the utility of the
forensic similarity approach in two practical applications: forgery detection
and localization, and database consistency verification.
Comment: 16 pages, accepted for publication in IEEE T-IFS (IEEE Transactions on Information Forensics and Security, 2019)
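The two-part system can be caricatured in a few lines: a feature vector per patch (produced elsewhere by the CNN-based extractor) and a small three-layer similarity network mapping a pair of features to a same/different-trace score. The pairing scheme (concatenation plus elementwise product) and the layer sizes are assumptions, and the random weights below are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(d_feat, hidden=64):
    """Random (untrained) weights for a three-layer MLP on paired features."""
    dims = [3 * d_feat, hidden, hidden, 1]
    return [(rng.normal(0, 0.1, (o, i)), np.zeros(o))
            for i, o in zip(dims[:-1], dims[1:])]

def similarity_network(f1, f2, params):
    """Map a pair of patch features to a score in (0, 1): high for
    'same forensic trace', low for 'different' (after training)."""
    x = np.concatenate([f1, f2, f1 * f2])   # assumed pairing scheme
    for W, b in params[:-1]:
        x = np.maximum(W @ x + b, 0.0)      # ReLU hidden layers
    W, b = params[-1]
    return float(1.0 / (1.0 + np.exp(-(W @ x + b)[0])))  # sigmoid score
```

The point of the design is that the score depends only on the pair of traces, so unknown traces never seen in training can still be compared.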
Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks
We propose a novel video object segmentation algorithm based on pixel-level
matching using Convolutional Neural Networks (CNN). Our network aims to
distinguish the target area from the background on the basis of the pixel-level
similarity between two object units. The proposed network represents a target
object using features from different depth layers in order to take advantage of
both the spatial details and the category-level semantic information.
Furthermore, we propose a feature compression technique that drastically
reduces the memory requirements while maintaining the capability of feature
representation. Two-stage training (pre-training and fine-tuning) allows our
network to handle any target object regardless of its category (even if the
object's type does not belong to the pre-training data) or of variations in its
appearance through a video sequence. Experiments on large datasets demonstrate
the effectiveness of our model - against related methods - in terms of
accuracy, speed, and stability. Finally, we demonstrate the transferability of our network to different domains, such as the infrared data domain.
Comment: To appear at ICCV 2017
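A crude sketch of pixel-level matching between a target template and a new frame, using cosine similarity of per-pixel feature vectors; pooling the template into a single vector is a simplification of the paper's matching between object units.

```python
import numpy as np

def pixelwise_similarity(template_feats, frame_feats):
    """Cosine similarity between every frame pixel's feature vector and
    a pooled target template.
    template_feats: (C, Ht, Wt) target features.
    frame_feats:    (C, H, W) frame features.
    Returns an (H, W) similarity map in [-1, 1]."""
    t = template_feats.reshape(template_feats.shape[0], -1).mean(axis=1)
    t = t / (np.linalg.norm(t) + 1e-8)
    f = frame_feats / (np.linalg.norm(frame_feats, axis=0, keepdims=True) + 1e-8)
    return np.einsum('c,chw->hw', t, f)
```

Thresholding such a map separates the target area from the background, in the spirit of the abstract.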
InGAN: Capturing and Remapping the "DNA" of a Natural Image
Generative Adversarial Networks (GANs) typically learn a distribution of
images in a large image dataset, and are then able to generate new images from
this distribution. However, each natural image has its own internal statistics,
captured by its unique distribution of patches. In this paper we propose an
"Internal GAN" (InGAN) - an image-specific GAN - which trains on a single input
image and learns its internal distribution of patches. It is then able to
synthesize a plethora of new natural images of significantly different sizes,
shapes and aspect-ratios - all with the same internal patch-distribution (same
"DNA") as the input image. In particular, despite large changes in global
size/shape of the image, all elements inside the image maintain their local
size/shape. InGAN is fully unsupervised, requiring no additional data other
than the input image itself. Once trained on the input image, it can remap the
input to any size or shape in a single feedforward pass, while preserving the
same internal patch distribution. InGAN provides a unified framework for a
variety of tasks, bridging the gap between textures and natural images.
3D Surface Reconstruction of Underwater Objects
In this paper, we propose a novel technique to reconstruct 3D surface of an
underwater object using stereo images. Reconstructing the 3D surface of an
underwater object is a challenging task due to the degraded quality of underwater images. There are various reasons for this degradation, e.g., non-uniform illumination of light on the surface of objects, and scattering and absorption effects. Floating particles present underwater produce Gaussian noise in the captured images, which further degrades their quality. The degraded underwater images are preprocessed
by applying homomorphic, wavelet denoising and anisotropic filtering
sequentially. The uncalibrated rectification technique is applied to
preprocessed images to rectify the left and right images, so that they lie on a common image plane. To find the corresponding points in the left and right images, we apply a dense stereo matching technique, i.e., the graph-cut method. Finally, we estimate depth using triangulation. Experimental results show that the proposed method accurately reconstructs the 3D surface of underwater objects from captured stereo images.
Comment: International Journal of Computer Applications (2012)
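The final triangulation step for a rectified stereo pair reduces to Z = f * B / d, with focal length f (in pixels), baseline B (in meters), and per-pixel disparity d, as sketched below; the camera parameters are illustrative, not from the paper.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Depth by triangulation for a rectified stereo pair: Z = f * B / d.
    Pixels with zero (invalid) disparity are assigned infinite depth."""
    d = np.asarray(disparity, dtype=float)
    depth = np.full_like(d, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth
```

For example, with f = 800 px and B = 0.1 m, a disparity of 10 px corresponds to a depth of 8 m.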
Segmentation of Skin Lesions and their Attributes Using Multi-Scale Convolutional Neural Networks and Domain Specific Augmentations
Computer-aided diagnosis systems for the classification of different types of skin lesions have been an active field of research in recent decades. It has been shown that introducing lesion and attribute masks into the lesion classification pipeline can greatly improve performance. In this paper, we
propose a framework incorporating transfer learning for segmenting lesions and their attributes based on convolutional neural networks. The proposed framework is based on an encoder-decoder architecture that utilizes a variety of pre-trained networks in the encoding path and generates the prediction map by combining multi-scale information in the decoding path in a pyramid-pooling manner. To address the lack of training data and increase the generalization of the proposed model, an extensive set of novel domain-specific augmentation routines has been applied to simulate the real variations in dermoscopy images.
Finally, by performing broad experiments on three different data sets obtained
from International Skin Imaging Collaboration archive (ISIC2016, ISIC2017, and
ISIC2018 challenges data sets), we show that the proposed method outperforms other state-of-the-art approaches on the ISIC2016 and ISIC2017 segmentation tasks and achieved first rank on the leader-board of the ISIC2018 attribute detection task.
Comment: 18 pages
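Generic dermoscopy-style augmentations (flips, right-angle rotations, brightness jitter) can be sketched as below; the paper's novel domain-specific routines are more elaborate and are not reproduced here.

```python
import numpy as np

def augment_dermoscopy(image, rng):
    """Generic augmentation sketch for a dermoscopy image.
    image: (H, W, 3) float array in [0, 1]."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                 # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :]                 # vertical flip
    image = np.rot90(image, k=rng.integers(0, 4))  # right-angle rotation
    gain = 1.0 + rng.uniform(-0.2, 0.2)        # mild brightness jitter
    return np.clip(image * gain, 0.0, 1.0)
```

For segmentation, the same geometric transforms would be applied to the lesion and attribute masks.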
Saliency detection based on structural dissimilarity induced by image quality assessment model
The distinctiveness of image regions is widely used as the cue of saliency.
Generally, the distinctiveness is computed according to the absolute difference
of features. However, according to the image quality assessment (IQA) studies,
the human visual system is highly sensitive to structural changes rather than
absolute difference. Accordingly, we propose the computation of the structural
dissimilarity between image patches as the distinctiveness measure for saliency
detection. Similar to IQA models, the structural dissimilarity is computed
based on the correlation of the structural features. The global structural
dissimilarity of a patch to all the other patches represents saliency of the
patch. We adopt two widely used structural features, namely the local contrast
and gradient magnitude, into the structural dissimilarity computation in the
proposed model. Without any postprocessing, the proposed model based on the
correlation of either of the two structural features outperforms 11
state-of-the-art saliency models on three saliency databases.
Comment: For associated source code, see https://github.com/yangli-xjtu/SD
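One plausible instance of the correlation-based dissimilarity described above, using the gradient-magnitude feature: compute per-patch gradient magnitudes and take one minus their Pearson correlation. The feature choice matches the abstract; the constants and normalization are assumptions.

```python
import numpy as np

def gradient_magnitude(patch):
    """Gradient-magnitude feature of a 2-D patch."""
    gy, gx = np.gradient(patch.astype(float))
    return np.hypot(gx, gy)

def structural_dissimilarity(p, q):
    """1 minus the Pearson correlation of the patches'
    gradient-magnitude features; near 0 for structurally
    similar patches, up to 2 for anti-correlated ones."""
    a = gradient_magnitude(p).ravel()
    b = gradient_magnitude(q).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + 1e-8
    return 1.0 - (a * b).sum() / denom
```

Summing this dissimilarity from one patch to all other patches yields the global saliency of that patch, as the abstract describes.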