Deep Quantigraphic Image Enhancement via Comparametric Equations
Most recent methods of deep image enhancement can be generally classified
into two types: decompose-and-enhance and illumination estimation-centric. The
former is usually less efficient, and the latter is constrained by a strong
assumption regarding image reflectance as the desired enhancement result. To
alleviate this constraint while retaining high efficiency, we propose a novel
trainable module that diversifies the conversion from the low-light image and
illumination map to the enhanced image. It formulates image enhancement as a
comparametric equation parameterized by a camera response function and an
exposure compensation ratio. By incorporating this module in an illumination
estimation-centric DNN, our method improves the flexibility of deep image
enhancement, limits the computational burden to illumination estimation, and
allows for fully unsupervised learning adaptable to the diverse demands of
different tasks.
Comment: Published in ICASSP 2023. For GitHub code, see
https://github.com/nttcslab/con
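As a rough illustration of the comparametric idea in the abstract above, the following sketch assumes a simple gamma-type camera response function f(q) = q^gamma, for which the comparametric equation is g(f, k) = f(kq) = k^gamma * f(q). The abstract does not state which response function or parameterization the paper uses, so the function name, the choice k = 1 / illumination, and the default gamma are illustrative assumptions only.

```python
from typing import List


def comparametric_enhance(
    pixels: List[float],
    illumination: List[float],
    gamma: float = 0.8,
    eps: float = 1e-4,
) -> List[float]:
    """Brighten intensities in [0, 1] with a gamma-type comparametric equation.

    Assumption (not from the abstract): the camera response is f(q) = q**gamma,
    so re-exposing by a ratio k gives g(f, k) = k**gamma * f.
    """
    enhanced = []
    for p, l in zip(pixels, illumination):
        # Assumed exposure compensation ratio: darker illumination, larger k.
        k = 1.0 / max(l, eps)
        value = (k ** gamma) * p
        # Clip to the valid intensity range.
        enhanced.append(min(1.0, max(0.0, value)))
    return enhanced


if __name__ == "__main__":
    low_light = [0.05, 0.10, 0.20]
    illum = [0.25, 0.50, 0.50]
    print(comparametric_enhance(low_light, illum, gamma=0.8))
    # gamma = 1.0 reduces to plain division by the illumination map.
    print(comparametric_enhance(low_light, illum, gamma=1.0))
```

With gamma = 1.0 the sketch reduces to dividing the image by its illumination map, which corresponds to treating reflectance as the enhancement result; other values of gamma give a different conversion from the same illumination estimate.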
OTRE: Where Optimal Transport Guided Unpaired Image-to-Image Translation Meets Regularization by Enhancing
Non-mydriatic retinal color fundus photography (CFP) is widely available
because it does not require pupillary dilation; however, it is prone to poor
quality due to operators, systemic imperfections, or patient-related causes.
Optimal retinal image quality is mandated for accurate medical diagnoses and
automated analyses. Herein, we leveraged the Optimal Transport (OT) theory to
propose an unpaired image-to-image translation scheme for mapping low-quality
retinal CFPs to high-quality counterparts. Furthermore, to improve the
flexibility, robustness, and applicability of our image enhancement pipeline in
clinical practice, we generalized a state-of-the-art model-based image
reconstruction method, regularization by denoising, by plugging in priors
learned by our OT-guided image-to-image translation network. We named this approach
regularization by enhancing (RE). We validated the integrated framework, OTRE,
on three publicly available retinal image datasets by assessing image quality
after enhancement and the performance of the enhanced images on various downstream tasks, including
diabetic retinopathy grading, vessel segmentation, and diabetic lesion
segmentation. The experimental results demonstrated the superiority of our
proposed framework over some state-of-the-art unsupervised competitors and a
state-of-the-art supervised method.
Comment: Accepted as a conference paper to The 28th biennial international
conference on Information Processing in Medical Imaging (IPMI 2023
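The abstract above describes generalizing regularization by denoising by swapping the denoiser for a learned enhancer. A minimal sketch of that idea follows, assuming the usual regularization-by-denoising gradient form lam * (x - D(x)) (which holds only under conditions on the operator D) and a simple quadratic data term 0.5 * ||x - y||^2. The `toy_enhancer` function is a hypothetical stand-in for the OT-guided network, and the step size, weight, and iteration count are illustrative, not values from the paper.

```python
import math
from typing import Callable, List

Vector = List[float]


def toy_enhancer(x: Vector) -> Vector:
    """Hypothetical stand-in for a learned enhancer: square-root brightening."""
    return [math.sqrt(max(v, 0.0)) for v in x]


def re_gradient_descent(
    y: Vector,
    enhancer: Callable[[Vector], Vector],
    lam: float = 1.0,
    step: float = 0.5,
    iters: int = 100,
) -> Vector:
    """Gradient descent on 0.5 * ||x - y||^2 plus an enhancer-based regularizer.

    Each update uses (x - y) for the data term and lam * (x - E(x)) for the
    regularizer, mirroring the regularization-by-denoising gradient with the
    denoiser replaced by the enhancer E.
    """
    x = list(y)  # start from the observed low-quality image
    for _ in range(iters):
        e = enhancer(x)
        x = [
            xi - step * ((xi - yi) + lam * (xi - ei))
            for xi, yi, ei in zip(x, y, e)
        ]
    return x


if __name__ == "__main__":
    observed = [0.04, 0.25, 0.49]
    print(re_gradient_descent(observed, toy_enhancer))
```

The weight `lam` trades fidelity to the observed image against agreement with the enhancer's output; at lam = 0 the iteration simply returns the observation.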
Semi-supervised source localization in reverberant environments with deep generative modeling
We propose a semi-supervised approach to acoustic source localization in
reverberant environments based on deep generative modeling. Localization in
reverberant environments remains an open challenge. Even with large data
volumes, the number of labels available for supervised learning in reverberant
environments is usually small. We address this issue by performing
semi-supervised learning (SSL) with convolutional variational autoencoders
(VAEs) on reverberant speech signals recorded with microphone arrays. The VAE
is trained to generate the phase of relative transfer functions (RTFs) between
microphones, in parallel with a direction of arrival (DOA) classifier based on
RTF-phase. These models are trained using both labeled and unlabeled RTF-phase
sequences. In learning to perform these tasks, the VAE-SSL explicitly learns to
separate the physical causes of the RTF-phase (i.e., source location) from
distracting signal characteristics such as noise and speech activity. Relative
to existing semi-supervised localization methods in acoustics, VAE-SSL is
effectively an end-to-end processing approach which relies on minimal
preprocessing of RTF-phase features. As far as we are aware, our paper presents
the first approach to modeling the physics of acoustic propagation using deep
generative modeling. The VAE-SSL approach is compared with two signal
processing-based approaches, steered response power with phase transform
(SRP-PHAT) and MUltiple SIgnal Classification (MUSIC), as well as fully
supervised CNNs. We find that VAE-SSL can outperform the conventional
approaches and the CNN in label-limited scenarios. Further, the trained VAE-SSL
system can generate new RTF-phase samples, which shows the VAE-SSL approach
learns the physics of the acoustic environment. The generative modeling in
VAE-SSL thus provides a means of interpreting the learned representations.
Comment: Revision, submitted to IEEE Acces
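To make the RTF-phase feature mentioned in the abstract above more concrete, the sketch below computes the phase of a relative transfer function between two microphone signals for a single frame, using a plain DFT and the cross-spectrum X_other * conj(X_ref). This is an assumed, simplified estimator for illustration; the abstract does not specify how the RTFs are estimated, and the function names are hypothetical. A practical system would typically average over several short-time frames.

```python
import cmath
import math
from typing import List


def dft(x: List[float]) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def rtf_phase(
    reference: List[float], other: List[float], eps: float = 1e-12
) -> List[float]:
    """Per-bin phase of the transfer function from `reference` to `other`.

    The phase of X_other / X_ref equals the phase of X_other * conj(X_ref),
    which avoids dividing by small reference-spectrum values.
    """
    ref_spec = dft(reference)
    oth_spec = dft(other)
    phases = []
    for r, o in zip(ref_spec, oth_spec):
        cross = o * r.conjugate()
        # Bins with no energy carry no phase information; report 0.0 there.
        phases.append(cmath.phase(cross) if abs(cross) > eps else 0.0)
    return phases


if __name__ == "__main__":
    n = 16
    mic1 = [1.0] + [0.0] * (n - 1)          # impulse at sample 0
    mic2 = [0.0, 1.0] + [0.0] * (n - 2)     # same impulse, one sample later
    print(rtf_phase(mic1, mic2)[:4])
```

For a pure delay of d samples the phase in bin k is -2*pi*k*d/n (before wrapping), so the RTF phase encodes the inter-microphone delay that depends on the source direction.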