286 research outputs found
Artificial Color Constancy via GoogLeNet with Angular Loss Function
Color Constancy is the ability of the human visual system to perceive colors
unchanged independently of the illumination. Giving a machine this feature will
be beneficial in many fields where chromatic information is used. In
particular, it significantly improves scene understanding and object
recognition. In this paper, we propose a transfer learning-based algorithm
with two main features: accuracy higher than that of many state-of-the-art
algorithms and simplicity of implementation. Although GoogLeNet was used in
the experiments, the approach may be applied to any CNN. Additionally, we
discuss the design of a new loss function tailored specifically to this
problem and propose several of the most suitable options.
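The abstract does not spell out the loss itself; a common angular-error formulation for illuminant estimation (given here only as an illustrative sketch, not necessarily the paper's exact loss) measures the angle between the predicted and ground-truth illuminant RGB vectors:

```python
import math

def angular_loss(pred, gt):
    """Angle in degrees between predicted and ground-truth illuminant RGB vectors."""
    dot = sum(p * g for p, g in zip(pred, gt))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_g = math.sqrt(sum(g * g for g in gt))
    # Clamp the cosine to [-1, 1] to guard against floating-point overshoot.
    cos = max(-1.0, min(1.0, dot / (norm_p * norm_g)))
    return math.degrees(math.acos(cos))

# Scale invariance: vectors with the same chromaticity give ~0 angular error,
# which is why this error is preferred over Euclidean distance for illuminants.
err = angular_loss([0.6, 0.3, 0.1], [1.2, 0.6, 0.2])
```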
Joint Learning of Intrinsic Images and Semantic Segmentation
Semantic segmentation of outdoor scenes is problematic when there are
variations in imaging conditions. It is known that albedo (reflectance) is
invariant to all kinds of illumination effects. Thus, using reflectance images
for the semantic segmentation task can be favorable. Additionally, not only
may segmentation benefit from reflectance, but segmentation may also be useful
for reflectance computation. Therefore, in this paper, the tasks of semantic
segmentation and intrinsic image decomposition are considered as a combined
process by exploring their mutual relationship in a joint fashion. To that end,
we propose a supervised end-to-end CNN architecture to jointly learn intrinsic
image decomposition and semantic segmentation. We analyze the gains of
addressing those two problems jointly. Moreover, new cascade CNN architectures
for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as
single tasks. Furthermore, a dataset of 35K synthetic images of natural
environments is created with corresponding albedo and shading (intrinsics), as
well as semantic labels (segmentation) assigned to each object/scene. The
experiments show that joint learning of intrinsic image decomposition and
semantic segmentation is beneficial for both tasks for natural scenes. Dataset
and models are available at: https://ivi.fnwi.uva.nl/cv/intrinseg
Comment: ECCV 201
Spectral Ray Tracing for Generation of Spatial Color Constancy Training Data
Computational color constancy is a fundamental step in digital cameras that estimates the chromaticity of illumination. Most automatic white balance (AWB) algorithms that perform computational color constancy assume that there is a single illuminant in the scene. This widely held assumption is frequently violated in the real world. It could be argued that the main reason for the single-illuminant assumption is the limited amount of available mixed-illuminant datasets and the laborious annotation process. Annotating mixed-illuminant images is orders of magnitude more laborious than the single-illuminant case, due to the spatial complexity that requires pixel-wise ground-truth illumination chromaticity in various ratios of the existing illuminants.
Spectral ray tracing is a 3D rendering method for creating physically realistic images and animations using the spectral representations of materials and light sources rather than a trichromatic representation such as red-green-blue (RGB). In this thesis, this physically correct image-signal generation method is used in the creation of a spatially varying mixed-illuminant image dataset with pixel-wise ground-truth illumination chromaticity. In complex 3D scenes, materials are defined based on a database of real-world spectral reflectance measurements, and light sources are defined based on the spectral power distribution definitions released by the International Commission on Illumination (CIE). Rendering is done using the Blender Cycles rendering engine in the visible-spectrum wavelengths from 395 nm to 705 nm in equal 5 nm bins, resulting in a 63-channel full-spectrum image. The resulting full-spectrum images can be turned into the raw response of any camera as long as the spectral sensitivity of the camera module is known. This is a big advantage of spectral ray tracing, since color constancy is largely camera-module-dependent. The pixel-wise white balance gain is calculated through the linear average of illuminant chromaticities, depending on their contribution to the mixed-illuminant raw image. The raw image signal and pixel-wise white balance gain are the fundamental requirements of a spatial color constancy dataset. This study implements an image generation pipeline that starts from the spectral definitions of illuminants and materials and ends with an sRGB image created from a 3D scene.
Six different 3D Blender scenes are created, each having 7 different virtual cameras located throughout the scene. 406 single-illuminant and 1015 spatially varying mixed-illuminant images are created, including their pixel-wise ground-truth illumination chromaticity. The created dataset can be used to improve mixed-illumination color constancy algorithms and paves the way for further research and testing in the field.
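The camera-conversion step described above can be sketched as a discrete integration of the 63-channel spectrum against per-channel spectral sensitivities. The flat sensitivity curves below are hypothetical placeholders; a real camera's measured curves would be substituted:

```python
N_BINS = 63  # 395 nm .. 705 nm, sampled in 5 nm bins (endpoints inclusive)

def spectral_to_raw(spectrum, sensitivities):
    """Project a full-spectrum pixel onto each channel's spectral sensitivity
    (a Riemann-sum stand-in for the continuous sensor integral)."""
    assert len(spectrum) == N_BINS
    return [sum(s * w for s, w in zip(spectrum, sens)) for sens in sensitivities]

# Hypothetical flat sensitivities, normalized so each channel sums to 1.
flat = [1.0 / N_BINS] * N_BINS
raw = spectral_to_raw([1.0] * N_BINS, [flat, flat, flat])  # equal-energy spectrum
```

With per-channel curves peaking at different wavelengths, the same function yields a camera-specific raw RGB triplet per pixel, which is what makes the rendered dataset reusable across camera modules.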
A Dataset of Multi-Illumination Images in the Wild
Collections of images under a single, uncontrolled illumination have enabled
the rapid advancement of core computer vision tasks like classification,
detection, and segmentation. But even with modern learning techniques, many
inverse problems involving lighting and material understanding remain too
severely ill-posed to be solved with single-illumination datasets. To fill this
gap, we introduce a new multi-illumination dataset of more than 1000 real
scenes, each captured under 25 lighting conditions. We demonstrate the richness
of this dataset by training state-of-the-art models for three challenging
applications: single-image illumination estimation, image relighting, and
mixed-illuminant white balance.
Comment: ICCV 201
Algorithms for the enhancement of dynamic range and colour constancy of digital images & video
One of the main objectives in digital imaging is to mimic the capabilities of the human eye, and perhaps go beyond them in certain aspects. However, the human visual system is so versatile, complex, and only partially understood that no imaging technology to date has been able to reproduce its capabilities accurately. Reproducing the extraordinary capabilities of the human eye has thus become a crucial challenge in digital imaging, since digital photography, video recording, and computer vision applications continue to demand more realistic and accurate image reproduction and analytic capabilities.
Over the decades, researchers have tried to solve the colour constancy problem and to extend the dynamic range of digital imaging devices by proposing a number of algorithms and instrumentation approaches. Nevertheless, no unique solution has been identified; this is partially due to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and to the complexity of the human visual system in achieving effective colour constancy and dynamic range capabilities.
The aim of the research presented in this thesis is to enhance the overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image-processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer electronics imaging devices.
The experiments conducted in this research show that the proposed algorithms outperform state-of-the-art methods in the fields of dynamic range and colour constancy. Moreover, this unique set of image-processing algorithms shows that, if used within an image signal processor, it enables digital camera devices to mimic the human visual system's dynamic range and colour constancy capabilities; the ultimate goal of any state-of-the-art technique or commercial imaging device.
Guided Curriculum Model Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
Most progress in semantic segmentation reports on daytime images taken under
favorable illumination conditions. We instead address the problem of semantic
segmentation of nighttime images and improve the state-of-the-art, by adapting
daytime models to nighttime without using nighttime annotations. Moreover, we
design a new evaluation framework to address the substantial uncertainty of
semantics in nighttime images. Our central contributions are: 1) a curriculum
framework to gradually adapt semantic segmentation models from day to night via
labeled synthetic images and unlabeled real images, both for progressively
darker times of day, which exploits cross-time-of-day correspondences for the
real images to guide the inference of their labels; 2) a novel
uncertainty-aware annotation and evaluation framework and metric for semantic
segmentation, designed for adverse conditions and including image regions
beyond human recognition capability in the evaluation in a principled fashion;
3) the Dark Zurich dataset, which comprises 2416 unlabeled nighttime and 2920
unlabeled twilight images with correspondences to their daytime counterparts
plus a set of 151 nighttime images with fine pixel-level annotations created
with our protocol, which serves as a first benchmark to perform our novel
evaluation. Experiments show that our guided curriculum adaptation
significantly outperforms state-of-the-art methods on real nighttime sets both
for standard metrics and our uncertainty-aware metric. Furthermore, our
uncertainty-aware evaluation reveals that selective invalidation of predictions
can lead to better results on data with ambiguous content such as our nighttime
benchmark and benefit safety-oriented applications which involve invalid inputs.
Comment: ICCV 2019 camera-ready
Reflectance Adaptive Filtering Improves Intrinsic Image Estimation
Separating an image into reflectance and shading layers poses a challenge for
learning approaches because no large corpus of precise and realistic ground
truth decompositions exists. The Intrinsic Images in the Wild~(IIW) dataset
provides a sparse set of relative human reflectance judgments, which serves as
a standard benchmark for intrinsic images. A number of methods use IIW to learn
statistical dependencies between the images and their reflectance layer.
Although learning plays an important role for high performance, we show that a
standard signal processing technique achieves performance on par with current
state-of-the-art. We propose a loss function for CNN learning of dense
reflectance predictions. Our results show a simple pixel-wise decision, without
any context or prior knowledge, is sufficient to provide a strong baseline on
IIW. This sets a competitive baseline which only two other approaches surpass.
We then develop a joint bilateral filtering method that implements strong prior
knowledge about reflectance constancy. This filtering operation can be applied
to any intrinsic image algorithm and we improve several previous results
achieving a new state of the art on IIW. Our findings suggest that the effect
of learning-based approaches may have been overestimated so far. Explicit
prior knowledge is still at least as important for obtaining high performance in
intrinsic image decompositions.
Comment: CVPR 201
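As a rough illustration of the filtering idea (a minimal sketch, not the paper's implementation), a joint bilateral filter smooths a reflectance estimate while preserving edges that appear in a guide image, so flat regions converge toward a constant reflectance. A 1-D version:

```python
import math

def joint_bilateral_1d(signal, guide, radius=2, sigma_s=1.0, sigma_r=0.1):
    """1-D joint (cross) bilateral filter: smooths `signal` while preserving
    edges that are present in `guide`."""
    out = []
    for i in range(len(signal)):
        wsum, acc = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            w_range = math.exp(-((guide[i] - guide[j]) ** 2) / (2.0 * sigma_r ** 2))
            w = w_space * w_range
            wsum += w
            acc += w * signal[j]
        out.append(acc / wsum)
    return out

# Noisy reflectance estimate with a sharp edge in the guide image:
guide = [0.0] * 4 + [1.0] * 4
reflectance = [0.10, 0.12, 0.08, 0.10, 0.90, 0.88, 0.92, 0.90]
smoothed = joint_bilateral_1d(reflectance, guide)
```

Because the range weight collapses across the guide's edge, noise is averaged out within each region without blending the two reflectance levels.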
Visible hyperspectral imaging for predicting intra-muscular fat content from sheep carcasses
Intramuscular fat (IMF) content plays a key role in the quality attributes of meat, such as sensory properties and health considerations. The tenderness, flavour and juiciness of meat are examples of sensory attributes influenced by IMF content. Traditionally, IMF content in meat was determined using destructive, time-consuming and at times unsuitable methods for industry applications. However, with the recent advancement of technology, there has been an interest in exploring ways to ascertain meat quality without damage. Hyperspectral imaging analysis is an emerging technology that combines the use of spectroscopy and computer imaging analysis to obtain both the spectral and spatial information of objects of interest. Hyperspectral imaging was initially developed for remote sensing, but has recently emerged as a powerful tool for non-destructive quality analysis in the food industry and has produced very accurate results in the prediction of meat qualities such as IMF content. In this thesis, we use a data set of 101 hyperspectral images of sheep carcasses to investigate the ability of multivariate statistical methods to accurately predict IMF content.
Non-parametric Methods for Automatic Exposure Control, Radiometric Calibration and Dynamic Range Compression
Imaging systems are essential to a wide range of modern day
applications. With the continuous advancement in imaging systems,
there is an ongoing need to adapt and improve the imaging
pipeline running inside the imaging systems.
In this thesis, methods are presented to improve the imaging
pipeline of digital cameras. Here we present three methods to
improve important phases of the imaging process, which are (i)
``Automatic exposure adjustment'', (ii) ``Radiometric
calibration'', and (iii) ``High dynamic range compression''. These
contributions touch the initial, intermediate and final stages of
imaging pipeline of digital cameras.
For exposure control, we propose two methods. The first makes use
of CCD-based equations to formulate the exposure control problem.
To estimate the exposure time, an initial image is acquired for
each wavelength channel, to which contrast adjustment techniques
are applied. This helps to recover a reference cumulative
distribution function of image brightness at each channel. The
second method proposed for automatic exposure control is an
iterative method applicable for a broad range of imaging systems.
It uses spectral sensitivity functions such as the photopic
response functions for the generation of a spectral power image
of the captured scene. A target image is then generated using the
spectral power image by applying histogram equalization. The
exposure time is hence calculated iteratively by minimizing the
squared difference between the target and the current spectral power
image. Here we further analyze the method by performing its
stability and controllability analysis using a state space
representation used in control theory. The applicability of the
proposed method for exposure-time calculation was shown on real-world
scenes using cameras with varying architectures.
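The iterative scheme described above can be caricatured as a simple closed loop, assuming (purely for illustration) a sensor whose mean brightness is linear in exposure time; the thesis's actual update minimizes the squared difference over the full spectral power image rather than a scalar mean:

```python
def update_exposure(t, current_mean, target_mean, gain=0.5):
    """One iteration: nudge the exposure time toward the target brightness,
    assuming mean brightness is roughly linear in exposure time."""
    error = target_mean - current_mean
    return t * (1.0 + gain * error / max(current_mean, 1e-9))

# Toy closed loop: a linear sensor viewing a scene of mean radiance 2.0.
scene_radiance = 2.0
target_mean = 0.5
t = 1.0
for _ in range(50):
    t = update_exposure(t, scene_radiance * t, target_mean)
# t converges to target_mean / scene_radiance = 0.25
```

Viewing the loop as a discrete-time system in `t` is exactly what makes the stability and controllability analysis mentioned above possible: the update is a linear map whose fixed point is the desired exposure.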
Radiometric calibration is the estimation of the non-linear mapping
of the input radiance map to the output brightness values. The
radiometric mapping is represented by the camera response
function with which the radiance map of the scene is estimated.
Our radiometric calibration method employs an L1 cost function by
taking advantage of the Weiszfeld optimization scheme. The proposed
calibration works with multiple input images of the scene with
varying exposures. It can also perform calibration using a single
input image with a few constraints. The proposed method outperforms,
quantitatively and qualitatively, various alternative methods
found in the radiometric calibration literature.
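The Weiszfeld scheme mentioned above is the classical iteratively re-weighted average for L1 (least-absolute-deviation) minimization. A minimal sketch on 2-D points shows the core iteration; the thesis applies the same idea to fitting the camera response function rather than to point sets:

```python
import math

def weiszfeld(points, iters=100, eps=1e-12):
    """Geometric median via Weiszfeld's scheme: iteratively re-weighted
    averaging that minimizes the L1 sum of Euclidean distances to `points`."""
    x = [sum(p[0] for p in points) / len(points),
         sum(p[1] for p in points) / len(points)]  # start at the mean
    for _ in range(iters):
        wsum, ax, ay = 0.0, 0.0, 0.0
        for px, py in points:
            d = math.hypot(px - x[0], py - x[1])
            w = 1.0 / max(d, eps)  # inverse-distance weights (eps avoids /0)
            wsum += w
            ax += w * px
            ay += w * py
        x = [ax / wsum, ay / wsum]
    return x

# For collinear points the geometric median is the middle point, (1, 0):
median = weiszfeld([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)])
```

The inverse-distance weights are what make the L1 objective tractable with repeated least-squares-style averages, and they give the method its robustness to outliers compared with an L2 fit.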
Finally, to realistically represent the estimated radiance maps
on low dynamic range (LDR) display devices, we propose a method
for dynamic range compression. Radiance maps generally have a
higher dynamic range (HDR) than the widely used display
devices. Thus, for display purposes, dynamic range compression is
required on HDR images. Our proposed method generates a few LDR
images from the HDR radiance map by clipping its values at
different exposures. Using the contrast information of each
generated LDR image, the method uses an energy-minimization
approach to estimate the probability map of each LDR image. These
probability maps are then used as a label set to form the final
compressed dynamic range image for the display device. The
results of our method were compared qualitatively and
quantitatively with those produced by widely cited and
professionally used methods.
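A much-simplified sketch of the pipeline above: clip the HDR map at several exposures, then blend the resulting LDR images per pixel. Here a hypothetical well-exposedness weight stands in for the thesis's contrast-based probability maps, and a normalized per-pixel blend stands in for its energy-minimization labeling:

```python
def clip_exposure(hdr, exposure):
    """Simulate one LDR capture: scale by exposure and clip to [0, 1]."""
    return [min(1.0, v * exposure) for v in hdr]

def well_exposedness(v):
    """Hypothetical per-pixel weight favoring mid-range values; a stand-in
    for the contrast-based probability maps estimated in the thesis."""
    return max(1e-6, 1.0 - 2.0 * abs(v - 0.5))

def compress(hdr, exposures):
    """Blend the clipped LDR images per pixel with normalized weights."""
    ldrs = [clip_exposure(hdr, e) for e in exposures]
    out = []
    for i in range(len(hdr)):
        weights = [well_exposedness(ldr[i]) for ldr in ldrs]
        total = sum(weights)
        out.append(sum(w * ldr[i] for w, ldr in zip(weights, ldrs)) / total)
    return out

hdr = [0.05, 0.5, 4.0, 40.0]      # toy HDR radiance values spanning ~3 decades
ldr = compress(hdr, [0.1, 1.0, 10.0])
```

The per-pixel independent blend is exactly what the thesis's energy-minimization step improves upon: a global labeling adds spatial smoothness that this local sketch lacks.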