Estimating Reflectance Layer from A Single Image: Integrating Reflectance Guidance and Shadow/Specular Aware Learning
Estimating the reflectance layer from a single image is a challenging task. It
becomes more challenging when the input image contains shadows or specular
highlights, which often render an inaccurate estimate of the reflectance layer.
Therefore, we propose a two-stage learning method, including reflectance
guidance and a Shadow/Specular-Aware (S-Aware) network to tackle the problem.
In the first stage, an initial reflectance layer free from shadows and
specularities is obtained with the constraint of novel losses that are guided
by prior-based shadow-free and specular-free images. To further enforce the
reflectance layer to be independent from shadows and specularities in the
second-stage refinement, we introduce an S-Aware network that distinguishes the
reflectance image from the input image. Our network employs a classifier to
categorize shadow/shadow-free, specular/specular-free classes, enabling the
activation features to function as attention maps that focus on shadow/specular
regions. Our quantitative and qualitative evaluations show that our method
outperforms state-of-the-art methods in estimating a reflectance layer free
from shadows and specularities.
Comment: Accepted to AAAI202
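The attention mechanism described above follows the class-activation-map idea: the classifier's per-class weights re-weight the activation features so that shadow/specular regions light up. A minimal CAM-style sketch (the function name and exact normalization are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def class_activation_map(features, weights, cls):
    # features: (C, H, W) activations from the last conv layer.
    # weights: (n_classes, C) classifier weights; cls indexes the target
    # class (e.g. "shadow"), so the map highlights shadow/specular regions.
    cam = np.tensordot(weights[cls], features, axes=1)  # -> (H, W)
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)  # normalize to [0, 1]
```

The resulting map can be multiplied into the refinement features so the second stage focuses on the distinguishing regions.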
A Review of Remote Sensing Image Dehazing
Remote sensing (RS) is one of the data-collection technologies that help explore Earth's surface in greater detail. However, RS data captured by satellites are susceptible to particles suspended during the imaging process, especially data in the visible-light band. To compensate for this deficiency, numerous dehazing efforts have been made recently, whose strategy is to restore a single hazy image directly, without any extra information. In this paper, we first classify the currently available algorithms into three categories, i.e., image enhancement, physical dehazing, and data-driven methods. The advantages and disadvantages of each type of algorithm are then summarized in detail. Finally, the evaluation indicators used to rank recovery performance and the application scenarios of the RS haze-removal technique are discussed, respectively. In addition, some common deficiencies of current methods and future research directions are elaborated.
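The physical-dehazing category above typically relies on the atmospheric scattering model I(x) = J(x)·t(x) + A·(1 − t(x)), where J is the scene radiance, t the transmission, and A the airlight. A minimal sketch of inverting that model, assuming t and A have already been estimated (how they are estimated is exactly where the algorithms in the review differ):

```python
import numpy as np

def dehaze(hazy, transmission, airlight, t_min=0.1):
    # Invert I = J * t + A * (1 - t) for the scene radiance J.
    # hazy: (H, W, 3) image, transmission: (H, W), airlight: (3,).
    t = np.maximum(transmission, t_min)[..., None]  # clamp to limit noise
    return (hazy - airlight) / t + airlight
```

Clamping the transmission is a common safeguard: where t is near zero, the division would amplify sensor noise.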
Removal of visual disruption caused by rain using cycle-consistent generative adversarial networks
This paper addresses the problem of removing rain disruption from images without blurring scene content, thereby retaining the visual quality of the image. This is particularly important for maintaining the performance of outdoor vision systems, which deteriorates as rain increasingly degrades the visual quality of the image. In this paper, the Cycle-Consistent Generative Adversarial Network (CycleGAN) is proposed as a more promising rain-removal algorithm than the state-of-the-art Image De-raining Conditional Generative Adversarial Network (ID-CGAN). One of the main advantages of the CycleGAN is its ability to learn the underlying relationship between the rain and rain-free domains without the need for paired domain examples, which is essential for rain removal since it is not possible to obtain the rain-free image under dynamic outdoor conditions. Based on the physical properties and the various types of rain phenomena [10], five broad categories of real rain distortions are proposed, which cover the majority of outdoor rain conditions. For a fair comparison, both ID-CGAN and CycleGAN were trained on the same set of 700 synthesized rain-and-ground-truth image pairs. Subsequently, both networks were tested on real rain images falling broadly under these five categories. A comparison of the performance of the CycleGAN and the ID-CGAN demonstrated that the CycleGAN is superior in removing real rain distortions.
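The unpaired training the abstract relies on is enforced by CycleGAN's cycle-consistency loss: each image must survive a round trip through both generators. A minimal sketch of that loss (generator names and the λ = 10 weight follow the original CycleGAN convention, not this paper's specific settings):

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F, lam=10.0):
    # G: rainy -> rain-free generator, F: rain-free -> rainy generator.
    # Unpaired training only requires that each translation round-trips:
    # F(G(x)) ~ x and G(F(y)) ~ y, measured with an L1 penalty.
    forward = np.abs(F(G(x)) - x).mean()
    backward = np.abs(G(F(y)) - y).mean()
    return lam * (forward + backward)
```

Because the loss compares each image only with its own reconstruction, no rainy/rain-free image pair is ever needed.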
Joint Depth Estimation and Mixture of Rain Removal From a Single Image
Rainy weather significantly deteriorates the visibility of scene objects,
particularly when images are captured through outdoor camera lenses or
windshields. Through careful observation of numerous rainy photos, we have
found that the images are generally affected by various rainwater artifacts
such as raindrops, rain streaks, and rainy haze, which impact the image quality
from both near and far distances, resulting in a complex and intertwined
process of image degradation. However, current deraining techniques are limited
in their ability to address only one or two types of rainwater, which poses a
challenge in removing the mixture of rain (MOR). In this study, we propose an
effective image deraining paradigm for Mixture of rain REmoval, called
DEMore-Net, which takes full account of the MOR effect. Going beyond the
existing deraining wisdom, DEMore-Net is a joint learning paradigm that
integrates depth estimation and MOR removal tasks to achieve superior rain
removal. The depth information can offer additional meaningful guidance
information based on distance, thus better helping DEMore-Net remove different
types of rainwater. Moreover, this study explores normalization approaches in
image deraining tasks and introduces a new Hybrid Normalization Block (HNB) to
enhance the deraining performance of DEMore-Net. Extensive experiments
conducted on synthetic datasets and real-world MOR photos fully validate the
superiority of the proposed DEMore-Net. Code is available at
https://github.com/yz-wang/DEMore-Net.
Comment: 11 pages, 7 figures, 5 tables
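The abstract does not spell out the Hybrid Normalization Block's internals, but a common "hybrid normalization" pattern in restoration networks blends instance-level statistics (per-sample, good for style/rain variation) with batch-level statistics (stable across the dataset). A sketch of that blend under this assumption; DEMore-Net's actual HNB may differ:

```python
import numpy as np

def hybrid_norm(x, alpha=0.5, eps=1e-5):
    # x: (N, C, H, W). alpha weights the instance-norm branch.
    inst_mu = x.mean(axis=(2, 3), keepdims=True)
    inst_var = x.var(axis=(2, 3), keepdims=True)
    inst = (x - inst_mu) / np.sqrt(inst_var + eps)   # per-sample stats
    batch_mu = x.mean(axis=(0, 2, 3), keepdims=True)
    batch_var = x.var(axis=(0, 2, 3), keepdims=True)
    batch = (x - batch_mu) / np.sqrt(batch_var + eps)  # per-batch stats
    return alpha * inst + (1 - alpha) * batch
```

In a trainable block, alpha would typically be a learned parameter rather than a fixed constant.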
Improving Lens Flare Removal with General Purpose Pipeline and Multiple Light Sources Recovery
When taking images against strong light sources, the resulting images often
contain heterogeneous flare artifacts. These artifacts can significantly
degrade image visual quality and downstream computer vision tasks. Because
collecting real pairs of flare-corrupted/flare-free images for training
flare-removal models is challenging, current methods synthesize data with a
direct-add approach. However, these methods do not consider automatic exposure
and tone mapping in the image signal processing pipeline (ISP), which limits
the generalization of deep models trained on such data. Moreover, existing
methods struggle to handle multiple light sources, whose sizes, shapes, and
illuminance vary. In this paper, we propose a solution that improves lens
flare removal by revisiting the ISP, remodeling the principle of automatic
exposure in the synthesis pipeline, and designing a more reliable light-source
recovery strategy.
The new pipeline approaches realistic imaging by discriminating the local and
global illumination through convex combination, avoiding global illumination
shifting and local over-saturation. Our strategy for recovering multiple light
sources convexly averages the input and output of the neural network based on
illuminance levels, thereby avoiding the need for a hard threshold in
identifying light sources. We also contribute a new flare removal testing
dataset containing the flare-corrupted images captured by ten types of consumer
electronics. The dataset facilitates the verification of the generalization
capability of flare removal methods. Extensive experiments show that our
solution can effectively improve the performance of lens flare removal and push
the frontier toward more general situations.
Comment: ICCV 202
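The threshold-free recovery strategy described above can be sketched as a per-pixel convex blend of the network's input and output, weighted by illuminance: bright pixels (likely light sources) keep the input, dark pixels keep the flare-removed output, and the transition is linear. The blend endpoints below are illustrative assumptions; the paper's exact weighting is not given in the abstract:

```python
import numpy as np

def recover_light_sources(inp, out, low=0.9, high=0.99):
    # inp/out: (H, W, 3) network input and flare-removed output in [0, 1].
    lum = inp.mean(axis=-1, keepdims=True)              # per-pixel illuminance
    w = np.clip((lum - low) / (high - low), 0.0, 1.0)   # soft mask, no hard cut
    return w * inp + (1 - w) * out                      # convex combination
```

Because w varies smoothly with illuminance, no binary light-source mask is ever computed, which matches the stated goal of avoiding a hard threshold.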
Medical image synthesis using generative adversarial networks: towards photo-realistic image synthesis
This work addresses photo-realism for synthetic images. We introduce a modified generative adversarial network, StencilGAN: a perceptually-aware generative adversarial network that synthesizes images based on overlaid labelled masks. This technique can be a prominent solution to the scarcity of resources in the healthcare sector.
Estimation of the QoE for video streaming services based on facial expressions and gaze direction
As multimedia technologies evolve, the need to control their quality becomes ever more important, making Quality of Experience (QoE) measurement a key priority. Machine Learning (ML) can support this task by providing models to analyse the information extracted from the multimedia content. ML model applications can be divided into the following categories:
1) QoE modelling: ML is used to define QoE models which provide an output (e.g., perceived QoE score) for any given input (e.g., QoE influence factor).
2) QoE monitoring in case of encrypted traffic: ML is used to analyze passive traffic monitored data to obtain insight into degradations perceived by end-users.
3) Big data analytics: ML is used for the extraction of meaningful and useful information from the collected data, which can further be converted to actionable knowledge and utilized in managing QoE.
The QoE estimation task can be carried out using two approaches: the objective approach and the subjective one. As the names suggest, they refer to the kind of information the model analyses. The objective approach analyses objective features extracted from the network connection and from the media itself; among the objective parameters, the state of the art includes approaches that also use features extracted from human behaviour. The subjective approach, instead, derives from the rating approach, where participants are asked to rate the perceived quality on different scales. This approach is time-consuming, and for this reason not all users agree to complete the questionnaire. The natural evolution of this approach is therefore the adoption of ML models: a model can replace the questionnaire and evaluate the QoE from the data it analyses. By modelling the human response to perceived multimedia quality, QoE researchers have found that various signals can be extracted from users, such as the Electroencephalogram (EEG), the Electrocardiogram (ECG), and brain waves. The main problem with these techniques is the hardware: the user must wear electrodes for ECG and EEG, and even if the results obtained from these methods are relevant, their use in a real context may not be feasible. For this reason, my studies have focused on developing a completely unobtrusive Machine Learning framework based on facial reactions.
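The unobtrusive framework described above ultimately maps per-user observations to a quality score. A minimal sketch of that mapping as a least-squares regression from facial-reaction features to Mean Opinion Scores (the feature choice and the linear model are illustrative assumptions, not the thesis's actual model):

```python
import numpy as np

def fit_qoe_model(features, mos):
    # features: (n_samples, n_features), e.g. facial-expression intensities
    # and gaze angles (illustrative inputs); mos: subjective ratings.
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # bias column
    w, *_ = np.linalg.lstsq(X, mos, rcond=None)
    return w

def predict_qoe(w, features):
    # Apply the fitted weights to new feature vectors.
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ w
```

In practice the regressor would be replaced by a richer model, but the pipeline shape — camera features in, QoE score out, no wearable sensors — is the same.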