A Joint Intensity and Depth Co-Sparse Analysis Model for Depth Map Super-Resolution
High-resolution depth maps can be inferred from low-resolution depth
measurements and an additional high-resolution intensity image of the same
scene. To that end, we introduce a bimodal co-sparse analysis model, which is
able to capture the interdependency of registered intensity and depth
information. This model is based on the assumption that the co-supports of
corresponding bimodal image structures are aligned when computed by a suitable
pair of analysis operators. No analytic form of such operators exists, and we
propose a method for learning them from a set of registered training signals.
This learning process is done offline and returns a bimodal analysis operator
that is universally applicable to natural scenes. We then exploit the bimodal
co-sparse analysis model as a prior for solving inverse problems, which leads
to an efficient algorithm for depth map super-resolution. Comment: 13 pages, 4 figures
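The learned bimodal operators themselves are beyond a short example, but the way a co-sparse analysis prior enters an inverse problem can be sketched. The 1-D toy below is a hypothetical stand-in, not the authors' method: it uses a fixed finite-difference operator in place of the learned pair, and mimics the co-support alignment idea by shrinking the analysis penalty wherever the registered intensity has an edge, so depth discontinuities are permitted exactly where intensity discontinuities occur.

```python
import numpy as np

def cosparse_sr(depth_lr, intensity_hr, omega, lam=0.1, eps=1e-2,
                step=0.1, iters=1000):
    """Toy 1-D depth super-resolution with an analysis-sparsity prior.

    Minimizes  ||D x - d||^2 + lam * sum_k w_k * sqrt((omega x)_k^2 + eps)
    by gradient descent, where D averages sample pairs (2x downsampling),
    d is the low-res depth, and omega is an analysis operator. The weights
    w_k are small where the registered intensity has edges.
    """
    x = np.repeat(depth_lr, 2).astype(float)    # initialize by replication
    w = np.exp(-np.abs(omega @ intensity_hr))   # low penalty at intensity edges
    for _ in range(iters):
        # data term gradient: D^T (D x - d)
        res = x.reshape(-1, 2).mean(axis=1) - depth_lr
        g_data = np.repeat(res, 2) / 2.0
        # smoothed-L1 analysis prior gradient
        ox = omega @ x
        g_prior = omega.T @ (w * ox / np.sqrt(ox ** 2 + eps))
        x -= step * (g_data + lam * g_prior)
    return x

# Piecewise-constant depth with a registered intensity edge in the same place.
true_depth = np.concatenate([np.full(8, 1.0), np.full(8, 3.0)])
intensity = np.concatenate([np.full(8, 0.2), np.full(8, 0.8)])
depth_lr = true_depth.reshape(-1, 2).mean(axis=1)   # 2x-downsampled measurement
omega = np.eye(16, k=1)[:-1] - np.eye(16)[:-1]      # finite-difference operator
depth_hr = cosparse_sr(depth_lr, intensity, omega)
```

Because the intensity-derived weight is small across the depth discontinuity, the prior smooths flat regions without blurring the edge, which is the qualitative behavior the bimodal model is designed to achieve.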
A signal conditioning approach for the extraction of the oscillatory potential from the electroretinogram
The oscillatory potential (OP), a signal component of the electroretinogram (ERG), was investigated to determine the correlation between the OP and pathological conditions of the inner retina. The ERG is characterized by large transients. Such transients excite a filter's natural response, and since these responses can co-occur with the OP, a distorted OP will be extracted. A signal windowing and padding technique for conditioning the ERG signal has been implemented to extract a minimally distorted OP.
Windowing is used to capture only the OP period. The windowed ERG signal is then conditioned to generate initial values for the filter's state variables. Such correct initial conditions eliminate the perturbations created by filtering the windowed ERG. OPs were successfully extracted from a database of fifty human ERGs. The extracted OPs did not display any filter-induced oscillations and provided some indication of the retina's pathology.
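The conditioning idea can be illustrated with standard tools. The sketch below uses a synthetic ERG-like signal and assumed parameters (1 kHz sampling, a 75-300 Hz OP band); it reproduces only the state-initialization step, not the paper's exact filter design or padding scheme. The window captures the OP period, and the filter's state variables are initialized from the first windowed sample so the window edge does not excite the filter's natural response.

```python
import numpy as np
from scipy.signal import butter, lfilter, lfilter_zi

# Synthetic ERG-like signal: a large slow transient (a/b-wave stand-in)
# plus a small high-frequency oscillatory component standing in for the OP.
fs = 1000.0                                         # sampling rate, Hz (assumed)
t = np.arange(0, 0.25, 1 / fs)
slow = 200 * np.exp(-((t - 0.05) / 0.02) ** 2)      # large transient
op = 5 * np.sin(2 * np.pi * 120 * t) * np.exp(-((t - 0.06) / 0.03) ** 2)
erg = slow + op

# Band-pass filter for the OP band (roughly 75-300 Hz; assumed values).
b, a = butter(2, [75 / (fs / 2), 300 / (fs / 2)], btype="band")

# Window only the OP period, then initialize the filter's state variables
# from the first sample so filtering the window edge does not inject a
# spurious natural response into the extracted OP.
win = slice(int(0.03 * fs), int(0.12 * fs))
seg = erg[win]
zi = lfilter_zi(b, a) * seg[0]       # steady-state initial conditions for x[0]
op_est, _ = lfilter(b, a, seg, zi=zi)
```

Without the `zi` initialization, the filter starts from zero state and its startup transient rides on top of the extracted OP; with it, the output reflects the oscillatory component alone.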
Mutual-Guided Dynamic Network for Image Fusion
Image fusion aims to generate a high-quality image from multiple images
captured under varying conditions. The key problem of this task is to preserve
complementary information while filtering out irrelevant information for the
fused result. However, existing methods address this problem by leveraging
static convolutional neural networks (CNNs), which suffer from two inherent
limitations during feature extraction: they cannot handle spatial-variant
content, and they lack guidance from the multiple inputs. In this paper, we propose a
novel mutual-guided dynamic network (MGDN) for image fusion, which allows for
effective information utilization across different locations and inputs.
Specifically, we design a mutual-guided dynamic filter (MGDF) for adaptive
feature extraction, composed of a mutual-guided cross-attention (MGCA) module
and a dynamic filter predictor, where the former incorporates additional
guidance from different inputs and the latter generates spatial-variant kernels
for different locations. In addition, we introduce a parallel feature fusion
(PFF) module to effectively fuse local and global information of the extracted
features. To further reduce the redundancy among the extracted features while
simultaneously preserving their shared structural information, we devise a
novel loss function that combines the minimization of normalized mutual
information (NMI) with an estimated gradient mask. Experimental results on five
benchmark datasets demonstrate that our proposed method outperforms existing
methods on four image fusion tasks. The code and model are publicly available
at: https://github.com/Guanys-dar/MGDN. Comment: ACMMM 2023 accepted
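The core operation of a dynamic filter, applying a different kernel at every spatial location, can be sketched independently of how the kernels are produced. The minimal numpy version below takes the per-pixel kernels as given; the cross-attention module and kernel predictor that generate them in MGDN are not modeled here.

```python
import numpy as np

def apply_dynamic_filters(feat, kernels):
    """Apply a spatially-variant (per-pixel) filter to a feature map.

    feat    : (H, W) feature map
    kernels : (H, W, k, k) per-pixel kernels, e.g. predicted by a small
              network from guidance features (prediction not shown here)
    """
    H, W = feat.shape
    k = kernels.shape[-1]
    r = k // 2
    padded = np.pad(feat, r, mode="reflect")
    out = np.empty_like(feat)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + k, j:j + k]   # neighborhood around (i, j)
            out[i, j] = np.sum(patch * kernels[i, j])
    return out

# Identity kernels (1 at the center) reproduce the input exactly.
H, W, k = 4, 5, 3
ident = np.zeros((H, W, k, k))
ident[..., k // 2, k // 2] = 1.0
x = np.arange(H * W, dtype=float).reshape(H, W)
assert np.allclose(apply_dynamic_filters(x, ident), x)
```

In contrast, a static CNN would apply the same `kernels[i, j]` everywhere; the per-location kernels are what let the filter adapt to spatial-variant content.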
Probabilistic Pixel-Adaptive Refinement Networks
Encoder-decoder networks have found widespread use in various dense
prediction tasks. However, the strong reduction of spatial resolution in the
encoder leads to a loss of location information as well as boundary artifacts.
To address this, image-adaptive post-processing methods have proven beneficial,
leveraging the high-resolution input image(s) as guidance data. We extend
such approaches by considering an important orthogonal source of information:
the network's confidence in its own predictions. We introduce probabilistic
pixel-adaptive convolutions (PPACs), which not only depend on image guidance
data for filtering, but also respect the reliability of per-pixel predictions.
As such, PPACs allow for image-adaptive smoothing while simultaneously
propagating predictions from high-confidence pixels into less reliable
regions, all while respecting object boundaries. We demonstrate their utility in refinement
networks for optical flow and semantic segmentation, where PPACs lead to a
clear reduction in boundary artifacts. Moreover, our proposed refinement step
is able to substantially improve the accuracy on various widely used
benchmarks. Comment: To appear at CVPR 2020
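The interplay of guidance weighting and confidence weighting can be illustrated in one dimension. This is not the actual PPAC layer, which generalizes pixel-adaptive convolutions inside a trained refinement network; the Gaussian guidance kernel and all parameters below are assumptions chosen for clarity.

```python
import numpy as np

def ppac_filter(pred, conf, guide, radius=2, sigma=0.1):
    """Confidence- and guidance-weighted smoothing (1-D sketch).

    pred  : noisy per-pixel predictions (e.g. flow values)
    conf  : per-pixel reliability in [0, 1]
    guide : high-resolution guidance signal (e.g. image intensity)
    Each output pixel is a normalized neighborhood sum weighted by
    guidance similarity AND neighbor confidence, so confident pixels
    propagate into unreliable regions without crossing guidance
    (object) boundaries.
    """
    n = len(pred)
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        w = np.exp(-((guide[lo:hi] - guide[i]) ** 2) / (2 * sigma ** 2))
        w = w * conf[lo:hi]                  # reliability gates each neighbor
        out[i] = np.sum(w * pred[lo:hi]) / (np.sum(w) + 1e-12)
    return out

# Two guidance segments; one prediction is an unreliable outlier.
guide = np.array([0.0] * 5 + [1.0] * 5)
pred = np.array([1.0, 1.0, 9.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0])
conf = np.ones(10)
conf[2] = 0.01                               # the outlier is low-confidence
ref = ppac_filter(pred, conf, guide)
```

The low-confidence outlier is overwritten by its confident neighbors, while the step in the guidance keeps the two segments from bleeding into each other, which is the qualitative behavior claimed for PPACs at object boundaries.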
FaceShop: Deep Sketch-based Face Image Editing
We present a novel system for sketch-based face image editing, enabling users
to edit images intuitively by sketching a few strokes on a region of interest.
Our interface features tools to express a desired image manipulation by
providing both geometry and color constraints as user-drawn strokes. As an
alternative to the direct user input, our proposed system naturally supports a
copy-paste mode, which allows users to edit a given image region by using parts
of another exemplar image without any need for hand-drawn sketching. The
proposed interface runs in real-time and facilitates an interactive and
iterative workflow to quickly express the intended edits. Our system is based
on a novel sketch domain and a convolutional neural network trained end-to-end
to automatically learn to render image regions corresponding to the input
strokes. To achieve high quality and semantically consistent results we train
our neural network on two simultaneous tasks, namely image completion and image
translation. To the best of our knowledge, we are the first to combine these
two tasks in a unified framework for interactive image editing. Our results
show that the proposed sketch domain, network architecture, and training
procedure generalize well to real user input and enable high quality synthesis
results without additional post-processing. Comment: 13 pages, 20 figures
Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning
We show that denoising of 3D point clouds can be learned unsupervised,
directly from noisy 3D point cloud data only. This is achieved by extending
recent ideas from learning of unsupervised image denoisers to unstructured 3D
point clouds. Unsupervised image denoisers operate under the assumption that a
noisy pixel observation is a random realization of a distribution around a
clean pixel value, which allows appropriate learning on this distribution to
eventually converge to the correct value. Regrettably, this assumption is not
valid for unstructured points: 3D point clouds are subject to total noise,
i.e., deviations in all coordinates, with no reliable pixel grid. Thus, an
observation can be the realization of an entire manifold of clean 3D points,
which makes a naïve extension of unsupervised image denoisers to 3D point
clouds impractical. To overcome this, we introduce a spatial prior term that
steers convergence to the unique closest of the many possible modes on the
manifold. Our results demonstrate unsupervised denoising performance similar to
that of supervised learning with clean data when given enough training
examples, without requiring any pairs of noisy and clean training data. Comment: Proceedings of ICCV 2019
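The effect of a spatial prior on where a noisy point should converge can be sketched in numpy. This toy is an assumption-laden stand-in, not the authors' training procedure: it uses a Gaussian spatial weighting and a plane instead of a general manifold, and forms a denoising target for one noisy point by averaging its neighbors with weights concentrated near the query, i.e., toward the locally closest mode.

```python
import numpy as np

def denoise_step(points, query_idx, sigma_spatial=0.1):
    """One spatially-weighted denoising 'target' for a noisy 3D point.

    Averages neighboring noisy points with a Gaussian spatial prior
    centered on the query, so the target is pulled toward the nearest
    mode of the underlying surface rather than the whole manifold.
    """
    q = points[query_idx]
    d2 = np.sum((points - q) ** 2, axis=1)
    w = np.exp(-d2 / (2 * sigma_spatial ** 2))     # strictly positive weights
    return (w[:, None] * points).sum(axis=0) / w.sum()

# Noisy samples of the plane z = 0: xy uniform, z is Gaussian noise.
rng = np.random.default_rng(0)
n = 200
pts = np.column_stack([rng.uniform(0, 1, n),
                       rng.uniform(0, 1, n),
                       rng.normal(0, 0.05, n)])
qi = int(np.argmax(np.abs(pts[:, 2])))             # most off-surface point
target = denoise_step(pts, qi)
```

Because the weighted average mixes in neighbors that lie closer to the surface, the target's distance to the plane is strictly smaller than the query's, illustrating how a spatial prior can steer convergence without any clean supervision.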