LIME: A Method for Low-light IMage Enhancement
When one captures images in low-light conditions, the images often suffer
from low visibility. This poor quality may significantly degrade the
performance of many computer vision and multimedia algorithms that are
primarily designed for high-quality inputs. In this paper, we propose a very
simple and effective method, named LIME, to enhance low-light images. More
concretely, the illumination of each pixel is first estimated individually by
finding the maximum value in R, G and B channels. Further, we refine the
initial illumination map by imposing a structure prior on it, as the final
illumination map. Having the well-constructed illumination map, the enhancement
can be achieved accordingly. Experiments on a number of challenging real-world
low-light images are presented to reveal the efficacy of LIME and show its
superiority over several state-of-the-art methods.
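As a rough sketch of the first step described in this abstract (per-pixel illumination taken as the maximum over R, G and B), the following Python estimates an initial illumination map and divides it out of the image. The structure-prior refinement is omitted, and the function names, the `eps` floor and the `gamma` exponent are illustrative assumptions rather than details stated in the abstract.

```python
def initial_illumination(image):
    """Per-pixel illumination estimate: the max of the R, G and B values.

    `image` is a list of rows, each row a list of (r, g, b) tuples in [0, 1].
    """
    return [[max(pixel) for pixel in row] for row in image]


def enhance(image, gamma=0.8, eps=1e-3):
    """Divide each pixel by its gamma-adjusted illumination estimate.

    `gamma` and `eps` are assumed parameters; the refinement of the
    illumination map with a structure prior is NOT implemented here.
    """
    illumination = initial_illumination(image)
    out = []
    for row, t_row in zip(image, illumination):
        out_row = []
        for pixel, t in zip(row, t_row):
            # Floor the illumination so black pixels do not divide by zero.
            scale = max(t, eps) ** gamma
            out_row.append(tuple(min(c / scale, 1.0) for c in pixel))
        out.append(out_row)
    return out
```

Calling `enhance(image)` on a dark image brightens each pixel in proportion to how low its estimated illumination is, while clipping results to 1.0.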
Survey: Machine Learning in Production Rendering
In the past few years, machine learning-based approaches have had considerable
success in rendering animated feature films. This survey summarizes several of
the most dramatic improvements in using deep neural networks over traditional
rendering methods, such as better image quality and lower computational
overhead. More specifically, this survey covers the fundamental principles of
machine learning and its applications, such as denoising, path guiding,
rendering participating media, and other notoriously difficult light transport
situations. Some of these techniques have already been used in recently
released animated films, while others are still under development by
researchers in both academia and movie studios. Although learning-based
rendering methods still have some open issues, they have already demonstrated
promising performance in multiple parts of the rendering pipeline, and
researchers continue to explore new approaches.
Comment: This was the survey I did for my PhD research exa
3D Surface Reconstruction of Underwater Objects
In this paper, we propose a novel technique to reconstruct the 3D surface of an
underwater object using stereo images. Reconstructing the 3D surface of an
underwater object is a challenging task because underwater images are of
degraded quality. The degradation has several causes, namely non-uniform
illumination of the object surface, scattering, and absorption. Floating
particles in the water also produce Gaussian noise in the captured images,
which degrades their quality further. The degraded underwater images are
preprocessed by applying homomorphic filtering, wavelet denoising, and
anisotropic filtering sequentially. An uncalibrated rectification technique is
then applied to the preprocessed images to rectify the left and right images,
so that the rectified left and right images lie on a common plane. To find
corresponding points in the left and right images, we apply a dense stereo
matching technique, namely the graph cut method. Finally, we estimate depth
using triangulation. The experimental results show that the proposed method
accurately reconstructs the 3D surface of underwater objects from captured
underwater stereo images.
Comment: International Journal of Computer Applications (2012
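The final triangulation step can be illustrated with the standard relation for a rectified stereo pair, Z = f * B / d. This is a minimal sketch that assumes a known focal length (in pixels) and baseline (in metres); the abstract describes an uncalibrated pipeline, so these parameters and the function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth of a point from its disparity in a rectified stereo pair.

    Uses the standard pinhole relation Z = f * B / d. Returns None where the
    disparity is not positive (no valid match, or a point at infinity).
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


def depth_map(disparity_map, focal_length_px, baseline_m):
    """Apply the relation to every entry of a 2D disparity map (list of rows)."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]
```

Larger disparities map to smaller depths, so nearby surfaces are resolved more finely than distant ones.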
A multi-layer image representation using Regularized Residual Quantization: application to compression and denoising
A learning-based framework for representation of domain-specific images is
proposed where joint compression and denoising can be done using a VQ-based
multi-layer network. While it learns to compress the images from a training
set, the compression performance generalizes well to images from a test
set. Moreover, when fed with noisy versions of the test set, since it has
priors from clean images, the network also efficiently denoises the test images
during the reconstruction. The proposed framework is a regularized version of
Residual Quantization (RQ), where at each stage the quantization error from
the previous stage is further quantized. Instead of learning codebooks with
k-means, which overfits for high-dimensional vectors, we show that
generating the codewords from a random but properly regularized distribution
suffices to compress the images globally, without resorting to
patch-based division of images. The experiments are done on the
CroppedYale-B set of facial images, and the method is compared with the
JPEG-2000 codec for compression and BM3D for denoising, showing promising
results.
Comment: At the International Conference on Image Processing 2017 (ICIP'17),
Beijing, Chin
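The stage-wise idea behind residual quantization can be sketched as follows: each stage picks the nearest codeword to the residual left by the previous stage. The Gaussian codebooks with a geometrically shrinking standard deviation are an assumption standing in for the regularized distribution, which the abstract does not specify; all names and parameters are illustrative.

```python
import random


def nearest(codebook, vector):
    """Return the codeword closest to `vector` in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(c, vector)))


def random_codebooks(dim, num_stages, codewords_per_stage, decay=0.5, seed=0):
    """Draw Gaussian codebooks whose standard deviation shrinks per stage.

    The Gaussian choice and the geometric `decay` are ASSUMPTIONS, not the
    regularization used in the paper.
    """
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, decay ** stage) for _ in range(dim)]
         for _ in range(codewords_per_stage)]
        for stage in range(num_stages)
    ]


def residual_quantize(vector, codebooks):
    """Quantize `vector` stage by stage; each stage encodes the previous residual."""
    residual = list(vector)
    reconstruction = [0.0] * len(vector)
    for codebook in codebooks:
        codeword = nearest(codebook, residual)
        reconstruction = [r + c for r, c in zip(reconstruction, codeword)]
        residual = [r - c for r, c in zip(residual, codeword)]
    return reconstruction, residual
```

By construction, the reconstruction plus the final residual always equals the input vector, so the residual norm measures what the chosen codewords failed to capture.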
Deep Retinex Decomposition for Low-Light Enhancement
The Retinex model is an effective tool for low-light image enhancement. It
assumes that observed images can be decomposed into the reflectance and
illumination. Most existing Retinex-based methods have carefully designed
hand-crafted constraints and parameters for this highly ill-posed
decomposition, which may be limited by model capacity when applied in various
scenes. In this paper, we collect a LOw-Light dataset (LOL) containing
low/normal-light image pairs and propose a deep Retinex-Net learned on this
dataset, including a Decom-Net for decomposition and an Enhance-Net for
illumination adjustment. In the training process for Decom-Net, there is no
ground truth of decomposed reflectance and illumination. The network is learned
with only key constraints including the consistent reflectance shared by paired
low/normal-light images, and the smoothness of illumination. Based on the
decomposition, subsequent lightness enhancement is conducted on illumination by
an enhancement network called Enhance-Net, and for joint denoising there is a
denoising operation on reflectance. The Retinex-Net is end-to-end trainable, so
that the learned decomposition is by nature good for lightness adjustment.
Extensive experiments demonstrate that our method not only achieves visually
pleasing quality for low-light enhancement but also provides a good
representation of image decomposition.
Comment: BMVC 2018 (Oral). Dataset and Project page:
https://daooshee.github.io/BMVC2018website
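To make the Retinex assumption concrete, the sketch below recomposes an image as the element-wise product of reflectance and illumination and computes two simple penalties matching the constraints named in the abstract. The plain total-variation smoothness term and the mean-absolute-difference consistency term are simplified stand-ins chosen for illustration, not the losses used to train Retinex-Net.

```python
def recompose(reflectance, illumination):
    """Element-wise product S = R * I for single-channel 2D lists."""
    return [[r * i for r, i in zip(r_row, i_row)]
            for r_row, i_row in zip(reflectance, illumination)]


def smoothness(illumination):
    """Sum of absolute horizontal and vertical differences (a plain TV penalty)."""
    total = 0.0
    for y, row in enumerate(illumination):
        for x, value in enumerate(row):
            if x + 1 < len(row):
                total += abs(row[x + 1] - value)
            if y + 1 < len(illumination):
                total += abs(illumination[y + 1][x] - value)
    return total


def reflectance_consistency(reflectance_low, reflectance_normal):
    """Mean absolute difference between the two reflectance estimates."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(reflectance_low, reflectance_normal)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)
```

A decomposition of a paired low/normal-light image would aim to keep both penalties small while `recompose` still reproduces each input image.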
Texture retrieval using periodically extended and adaptive curvelets
Image retrieval is an important problem in the area of multimedia processing.
This paper presents two new curvelet-based algorithms for texture retrieval
which are suitable for use in constrained-memory devices. The developed
algorithms are tested on three publicly available texture datasets: CUReT,
Mondial-Marmi, and STex-fabric. Our experiments confirm the effectiveness of
the proposed system. Furthermore, a weighted version of the retrieval
algorithm is introduced, which is shown to achieve promising results in the
classification of seismic activities.
Hierarchical Invariant Feature Learning with Marginalization for Person Re-Identification
This paper addresses the problem of matching pedestrians across multiple
camera views, known as person re-identification. Variations in lighting
conditions, environment and pose changes across camera views make
re-identification a challenging problem. Previous methods address these
challenges by designing specific features or by learning a distance function.
We propose a hierarchical feature learning framework that learns invariant
representations from labeled image pairs. A mapping is learned such that the
extracted features are invariant for images belonging to the same individual across
views. To learn robust representations and to achieve better generalization to
unseen data, the system has to be trained with a large amount of data.
Critically, most of the person re-identification datasets are small. Manually
augmenting the dataset by partial corruption of input data introduces
additional computational burden as it requires several training epochs to
converge. We propose a hierarchical network which incorporates a
marginalization technique that can reap the benefits of training on large
datasets without explicit augmentation. We compare our approach with several
baseline algorithms as well as popular linear and non-linear metric learning
algorithms and demonstrate improved performance on challenging publicly
available datasets, VIPeR, CUHK01, CAVIAR4REID and iLIDS. Our approach also
achieves state-of-the-art results on these datasets.
Parts for the Whole: The DCT Norm for Extreme Visual Recovery
Here we study the extreme visual recovery problem, in which over 90% of
pixel values in a given image are missing. Existing low rank-based algorithms
are only effective for recovering data with at most 90% missing values. Thus,
we exploit visual data's smoothness property to help solve this challenging
extreme visual recovery problem. Based on the Discrete Cosine Transformation
(DCT), we propose a novel DCT norm that involves all pixels and produces smooth
estimations in any view. Our theoretical analysis shows that the total
variation (TV) norm, which only achieves local smoothness, is a special case of
the proposed DCT norm. We also develop a new visual recovery algorithm by
minimizing the DCT and nuclear norms to achieve a more visually pleasing
estimation. Experimental results on a benchmark image dataset demonstrate that
the proposed approach is superior to state-of-the-art methods in terms of peak
signal-to-noise ratio and structural similarity.
Global-Local Face Upsampling Network
Face hallucination, which is the task of generating a high-resolution face
image from a low-resolution input image, is a well-studied problem that is
useful in widespread application areas. Face hallucination is particularly
challenging when the input face resolution is very low (e.g., 10 x 12 pixels)
and/or the image is captured in an uncontrolled setting with large pose and
illumination variations. In this paper, we revisit the algorithm introduced in
[1] and present a deep interpretation of this framework that achieves
state-of-the-art results under such challenging scenarios. In our deep network
architecture the global and local constraints that define a face can be
efficiently modeled and learned end-to-end using training data. Conceptually
our network design can be partitioned into two sub-networks: the first one
implements the holistic face reconstruction according to global constraints,
and the second one enhances face-specific details and enforces local patch
statistics. We optimize the deep network using a new loss function for
super-resolution that combines reconstruction error with a learned face quality
measure in an adversarial setting, producing improved visual results. We conduct
extensive experiments in both controlled and uncontrolled setups and show that
our algorithm improves the state of the art both numerically and visually.
Non-contact transmittance photoplethysmographic imaging (PPGI) for long-distance cardiovascular monitoring
Photoplethysmography (PPG) devices are widely used for monitoring
cardiovascular function. However, these devices require skin contact, which
restricts their use to at-rest short-term monitoring using single-point
measurements. Photoplethysmographic imaging (PPGI) has been recently proposed
as a non-contact monitoring alternative by measuring blood pulse signals across
a spatial region of interest. Existing systems operate in reflectance mode, of
which many are limited to short-distance monitoring and are prone to temporal
changes in ambient illumination. This paper is the first study to investigate
the feasibility of long-distance non-contact cardiovascular monitoring at the
supermeter level using transmittance PPGI. For this purpose, a novel PPGI
system was designed at the hardware and software level using ambient correction
via temporally coded illumination (TCI) and signal processing for PPGI signal
extraction. Experimental results show that the processing steps yield a
substantially more pulsatile PPGI signal than the raw acquired signal,
resulting in statistically significant increases in correlation to ground-truth
PPG in both short- and long-distance monitoring. The results support the hypothesis that
long-distance heart rate monitoring is feasible using transmittance PPGI,
allowing for new possibilities of monitoring cardiovascular function in a
non-contact manner.
Comment: 13 pages, 6 figures, submitted to Nature Scientific Reports, for
associated video files see
http://vip.uwaterloo.ca/publications/non-contact-transmittance-photoplethysmographic-imaging-ppgi-long-distanc
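One plausible reading of ambient correction with coded illumination is that frames captured with the source off are subtracted from frames captured with it on, after which a pulse signal is taken as the mean intensity over a region of interest. The sketch below follows that reading; the on/off pairing scheme, the function names and the rectangular region are assumptions and may differ from the system described in the abstract.

```python
def ambient_corrected(frames_on, frames_off):
    """Subtract each ambient-only frame from its paired illuminated frame.

    Frames are 2D lists of pixel intensities. Pairing consecutive source-on
    and source-off frames is an ASSUMED coding scheme, not taken from the
    paper. Negative differences are clipped to zero.
    """
    return [
        [[max(on - off, 0.0) for on, off in zip(row_on, row_off)]
         for row_on, row_off in zip(frame_on, frame_off)]
        for frame_on, frame_off in zip(frames_on, frames_off)
    ]


def roi_mean_signal(frames, top, left, height, width):
    """Average the pixels inside a rectangular region of interest per frame."""
    signal = []
    for frame in frames:
        values = [v for row in frame[top:top + height]
                  for v in row[left:left + width]]
        signal.append(sum(values) / len(values))
    return signal
```

The resulting one-dimensional signal would then be passed to whatever filtering is used to isolate the pulsatile component.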