SINCO: A Novel structural regularizer for image compression using implicit neural representations
Implicit neural representations (INR) have recently been proposed as deep
learning (DL)-based solutions for image compression. An image can be compressed
by training an INR model with fewer weights than the number of image pixels to
map the coordinates of the image to corresponding pixel values. While
traditional training approaches for INRs are based on enforcing pixel-wise
image consistency, we propose to further improve image quality by using a new
structural regularizer. We present structural regularization for INR
compression (SINCO) as a novel INR method for image compression. SINCO imposes
structural consistency of the compressed images with the ground truth by using a
segmentation network to penalize the discrepancy of segmentation masks
predicted from compressed images. We validate SINCO on brain MRI images by
showing that it can achieve better performance than some recent INR methods.
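At a high level, the objective described above can be sketched as a pixel-wise loss plus a segmentation-consistency regularizer; the function name, the `seg_net` callable, and the weighting are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sinco_loss(pred_img, gt_img, seg_net, weight=0.1):
    """Illustrative SINCO-style objective: pixel-wise consistency plus a
    structural regularizer that penalizes the discrepancy between
    segmentation masks of the compressed and ground-truth images.
    `seg_net` stands in for a pre-trained segmentation network."""
    pixel_loss = np.mean((pred_img - gt_img) ** 2)                   # pixel-wise consistency
    seg_loss = np.mean((seg_net(pred_img) - seg_net(gt_img)) ** 2)   # structural term
    return pixel_loss + weight * seg_loss
```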
Convergence of Nonconvex PnP-ADMM with MMSE Denoisers
Plug-and-Play Alternating Direction Method of Multipliers (PnP-ADMM) is a
widely-used algorithm for solving inverse problems by integrating physical
measurement models and convolutional neural network (CNN) priors. PnP-ADMM has
been theoretically proven to converge for convex data-fidelity terms and
nonexpansive CNNs. It has, however, been observed that PnP-ADMM often empirically
converges even for expansive CNNs. This paper presents a theoretical
explanation for the observed stability of PnP-ADMM based on the interpretation
of the CNN prior as a minimum mean-squared error (MMSE) denoiser. Our
explanation parallels a similar argument recently made for the iterative
shrinkage/thresholding algorithm variant of PnP (PnP-ISTA) and relies on the
connection between MMSE denoisers and proximal operators. We also numerically
evaluate the performance gap between PnP-ADMM using a nonexpansive DnCNN
denoiser and an expansive DRUNet denoiser, thus motivating the use of expansive
CNNs.
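The PnP-ADMM iteration the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: the denoiser replaces the proximal operator of the prior, and the x-update is solved inexactly with a few gradient steps for simplicity.

```python
import numpy as np

def pnp_admm(y, A, At, denoise, rho=1.0, iters=20, inner=10, lr=0.1):
    """Illustrative PnP-ADMM for y = A(x) + noise. The z-update swaps the
    prior's proximal operator for a denoiser; the x-update is an inexact
    proximal step on the data-fidelity term."""
    x = At(y)
    z = x.copy()
    u = np.zeros_like(x)
    for _ in range(iters):
        v = z - u
        for _ in range(inner):  # gradient steps on 0.5||y - Ax||^2 + (rho/2)||x - v||^2
            grad = At(A(x) - y) + rho * (x - v)
            x = x - lr * grad
        z = denoise(x + u)      # CNN prior replaced by a generic denoiser callable
        u = u + x - z           # scaled dual update
    return x
```

With an identity measurement operator and an identity "denoiser", the iteration simply returns the observation, which makes the fixed-point structure easy to verify.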
A Plug-and-Play Image Registration Network
Deformable image registration (DIR) is an active research topic in biomedical
imaging. There is a growing interest in developing DIR methods based on deep
learning (DL). A traditional DL approach to DIR is based on training a
convolutional neural network (CNN) to estimate the registration field between
two input images. While conceptually simple, this approach comes with a
limitation that it exclusively relies on a pre-trained CNN without explicitly
enforcing fidelity between the registered image and the reference. We present
plug-and-play image registration network (PIRATE) as a new DIR method that
addresses this issue by integrating an explicit data-fidelity penalty and a CNN
prior. PIRATE pre-trains a CNN denoiser on the registration field and "plugs"
it into an iterative method as a regularizer. We additionally present PIRATE+
that fine-tunes the CNN prior in PIRATE using deep equilibrium models (DEQ).
PIRATE+ interprets the fixed-point iteration of PIRATE as a network with
effectively infinite layers and then trains the resulting network end-to-end,
enabling it to learn more task-specific information and boosting its
performance. Our numerical results on the OASIS and CANDI datasets show that our
methods achieve state-of-the-art performance on DIR.
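The "plug" step described above can be sketched as a simple alternation between a data-fidelity gradient step and a denoiser applied to the registration field; the function names and the step schedule are illustrative assumptions.

```python
import numpy as np

def pirate_iteration(phi0, fidelity_grad, denoise, step=0.1, iters=200):
    """Illustrative PIRATE-style iteration: alternate a gradient step on the
    registration data-fidelity term with a learned denoiser applied to the
    registration field phi. `denoise` stands in for the pre-trained CNN."""
    phi = phi0
    for _ in range(iters):
        phi = phi - step * fidelity_grad(phi)  # enforce fidelity to the reference
        phi = denoise(phi)                     # regularize the registration field
    return phi
```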
Block Coordinate Plug-and-Play Methods for Blind Inverse Problems
Plug-and-play (PnP) prior is a well-known class of methods for solving
imaging inverse problems by computing fixed-points of operators combining
physical measurement models and learned image denoisers. While PnP methods have
been extensively used for image recovery with known measurement operators,
there is little work on PnP for solving blind inverse problems. We address this
gap by presenting a new block-coordinate PnP (BC-PnP) method that efficiently
solves this joint estimation problem by introducing learned denoisers as priors
on both the unknown image and the unknown measurement operator. We present a
new convergence theory for BC-PnP compatible with blind inverse problems by
considering nonconvex data-fidelity terms and expansive denoisers. Our theory
analyzes the convergence of BC-PnP to a stationary point of an implicit
function associated with an approximate minimum mean-squared error (MMSE)
denoiser. We numerically validate our method on two blind inverse problems:
automatic coil sensitivity estimation in magnetic resonance imaging (MRI) and
blind image deblurring. Our results show that BC-PnP provides an efficient and
principled framework for using denoisers as PnP priors for jointly estimating
measurement operators and images.
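The block-coordinate structure described above can be sketched as alternating PnP updates on the two unknown blocks; the scalar toy problem and the identity denoisers below are illustrative assumptions, not the paper's setup.

```python
def bc_pnp(x0, theta0, grad_x, grad_theta, denoise_x, denoise_theta,
           step=0.1, iters=500):
    """Illustrative block-coordinate PnP: alternate PnP updates on the
    unknown image x and the unknown measurement-operator parameters theta,
    each block with its own denoiser acting as a learned prior."""
    x, theta = x0, theta0
    for _ in range(iters):
        x = denoise_x(x - step * grad_x(x, theta))                  # image block
        theta = denoise_theta(theta - step * grad_theta(x, theta))  # operator block
    return x, theta
```

On the toy bilinear problem y = theta * x, the alternation recovers a pair whose product matches the measurement.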
FLAIR: A Conditional Diffusion Framework with Applications to Face Video Restoration
Face video restoration (FVR) is a challenging but important problem where one
seeks to recover perceptually realistic face videos from low-quality inputs.
While diffusion probabilistic models (DPMs) have been shown to achieve
remarkable performance for face image restoration, they often fail to produce
temporally coherent, high-quality videos, compromising the fidelity of
reconstructed faces. We present a new conditional diffusion framework called
FLAIR for FVR. FLAIR ensures temporal consistency across frames in a
computationally efficient fashion by converting a traditional image DPM into a
video DPM. The proposed conversion uses a recurrent video refinement layer and
a temporal self-attention at different scales. FLAIR also uses a conditional
iterative refinement process to balance the perceptual and distortion quality
during inference. This process consists of two key components: a
data-consistency module that analytically ensures that the generated video
precisely matches its degraded observation and a coarse-to-fine image
enhancement module specifically for facial regions. Our extensive experiments
show superiority of FLAIR over the current state-of-the-art (SOTA) for video
super-resolution, deblurring, JPEG restoration, and space-time frame
interpolation on two high-quality face video datasets.
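For a masking-type degradation, a hard data-consistency step like the one the framework describes can be sketched as overwriting the observed entries of the generated frame with the measurements; the masking operator is an assumption here, since the exact analytic operator depends on the restoration task.

```python
import numpy as np

def data_consistency(x, y, mask):
    """Illustrative hard data-consistency step for a masking degradation:
    replace the observed entries of the generated frame x with the
    measurements y, so the output exactly matches its degraded observation."""
    return np.where(mask, y, x)
```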
PtychoDV: Vision Transformer-Based Deep Unrolling Network for Ptychographic Image Reconstruction
Ptychography is an imaging technique that captures multiple overlapping
snapshots of a sample, illuminated coherently by a moving localized probe. The
image recovery from ptychographic data is generally achieved via an iterative
algorithm that solves a nonlinear phase retrieval problem derived from measured
diffraction patterns. However, these iterative approaches have high
computational cost. In this paper, we introduce PtychoDV, a novel deep
model-based network designed for efficient, high-quality ptychographic image
reconstruction. PtychoDV comprises a vision transformer that generates an
initial image from the set of raw measurements, taking into consideration their
mutual correlations. This is followed by a deep unrolling network that refines
the initial image using learnable convolutional priors and the ptychography
measurement model. Experimental results on simulated data demonstrate that
PtychoDV outperforms existing deep learning methods for this problem and
significantly reduces computational cost compared to iterative methodologies
while maintaining competitive performance.
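The deep-unrolling refinement stage described above can be sketched as a fixed number of layers, each pairing a measurement-model gradient step with a learned prior; the callables below are placeholders, not the paper's transformer or convolutional modules.

```python
def unrolled_refinement(x0, fidelity_grad, priors, step=0.5):
    """Illustrative deep unrolling: a fixed number of layers, each taking a
    gradient step on the measurement-model fidelity followed by a learned
    prior (`priors` holds one callable per unrolled layer)."""
    x = x0
    for prior in priors:
        x = prior(x - step * fidelity_grad(x))
    return x
```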
Self-Supervised Deep Equilibrium Models for Inverse Problems with Theoretical Guarantees
Deep equilibrium models (DEQ) have emerged as a powerful alternative to deep
unfolding (DU) for image reconstruction. DEQ models, implicit neural networks
with an effectively infinite number of layers, were shown to achieve
state-of-the-art image reconstruction without the memory complexity associated
with DU. While the performance of DEQ has been widely investigated, the
existing work has primarily focused on settings where ground-truth data is
available for training. We present self-supervised deep equilibrium model
(SelfDEQ) as the first self-supervised reconstruction framework for training
model-based implicit networks from undersampled and noisy MRI measurements. Our
theoretical results show that SelfDEQ can compensate for unbalanced sampling
across multiple acquisitions and match the performance of fully supervised DEQ.
Our numerical results on in-vivo MRI data show that SelfDEQ leads to
state-of-the-art performance using only undersampled and noisy training data.
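The "effectively infinite layers" idea underlying DEQ models can be sketched as a fixed-point forward pass: iterate the layer map until it stops changing. This is a generic sketch of the mechanism, not SelfDEQ's training procedure.

```python
import numpy as np

def deq_forward(f, x0, tol=1e-9, max_iter=1000):
    """Fixed-point forward pass of a deep equilibrium model: iterate the
    layer map f until x = f(x), i.e. the effectively-infinite-depth limit."""
    x = x0
    for _ in range(max_iter):
        x_new = f(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```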
DDPET-3D: Dose-aware Diffusion Model for 3D Ultra Low-dose PET Imaging
As PET imaging is accompanied by substantial radiation exposure and cancer
risk, reducing radiation dose in PET scans is an important topic. Recently,
diffusion models have emerged as the new state-of-the-art generative model to
generate high-quality samples and have demonstrated strong potential for
various tasks in medical imaging. However, it is difficult to extend diffusion
models to 3D image reconstruction due to the memory burden. Directly stacking
2D slices together to create 3D image volumes would result in severe
inconsistencies between slices. Previous works tried either to apply a penalty
term along the z-axis to remove inconsistencies or to reconstruct the 3D image
volumes with two pre-trained perpendicular 2D diffusion models. Nonetheless,
these previous methods failed to produce satisfactory results in challenging
cases for PET image denoising. In addition to the administered dose, the noise
levels in PET images are affected by several other clinical factors, such as
scan time, medical history, and patient size and weight.
Therefore, a method that can simultaneously denoise PET images with different
noise levels is needed. Here, we propose a Dose-aware Diffusion model for 3D
low-dose PET imaging (DDPET-3D) to address these challenges. We extensively
evaluated DDPET-3D on 100 patients with 6 different low-dose levels (a total of
600 testing studies), and demonstrated superior performance over previous
diffusion models for 3D imaging problems as well as previous noise-aware
medical image denoising models. The code is available at:
https://github.com/xxx/xxx.
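For context, the reverse (denoising) step of a standard DDPM sampler can be sketched as below; in a dose-aware model like the one described, the network producing `eps_hat` would additionally be conditioned on the administered dose or noise level, which is an assumption of this sketch.

```python
import numpy as np

def ddpm_reverse_step(x_t, eps_hat, alpha_t, alpha_bar_t, sigma_t, noise):
    """One standard DDPM reverse step: remove the predicted noise eps_hat
    from the current sample x_t, then add fresh noise scaled by sigma_t."""
    mean = (x_t - (1.0 - alpha_t) / np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_t)
    return mean + sigma_t * noise
```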