Multimodal image super-resolution via joint sparse representations induced by coupled dictionaries
Real-world data processing problems often involve various image modalities associated with a certain scene, including RGB images, infrared images, or multispectral images. The fact that different image modalities often share certain attributes, such as edges, textures, and other structure primitives, represents an opportunity to enhance various image processing tasks. This paper proposes a new approach to construct a high-resolution (HR) version of a low-resolution (LR) image, given another HR image modality as guidance, based on joint sparse representations induced by coupled dictionaries. The proposed approach captures complex dependencies, including similarities and disparities, between different image modalities in a learned sparse feature domain in lieu of the original image domain. It consists of two phases: a coupled dictionary learning phase and a coupled super-resolution phase. The learning phase learns a set of dictionaries from the training dataset to couple different image modalities together in the sparse feature domain. In turn, the super-resolution phase leverages such dictionaries to construct an HR version of the LR target image with another related image modality for guidance. In the advanced version of our approach, a multistage strategy and neighbourhood regression are introduced to further improve the model capacity and performance. Extensive guided image super-resolution experiments on real multimodal images demonstrate that the proposed approach offers distinct advantages over state-of-the-art approaches, for example, overcoming the texture copying artifacts commonly resulting from inconsistency between the guidance and target images. Of particular relevance, the proposed model demonstrates much better robustness than competing deep models in a range of noisy scenarios.
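The two-phase idea above can be illustrated with a minimal sketch of the super-resolution phase only. It assumes the coupled dictionaries already exist and, as a simplification, that a single shared sparse code explains all three patches; the abstract's model also captures modality-specific disparities, which this sketch leaves out. The function names, the use of orthogonal matching pursuit, and the synthetic dictionaries are all assumptions made for illustration, not the paper's algorithm.

```python
import numpy as np

def omp(D, y, k):
    """Greedy orthogonal matching pursuit: approximate y with k columns of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    z = np.zeros(D.shape[1])
    z[support] = coef
    return z

def guided_sr_patch(y_lr, g_hr, D_lr, D_guide, D_hr, k):
    """Estimate an HR patch from an LR patch and an HR guidance patch."""
    # Stack both observations so one sparse code must explain both modalities.
    D_joint = np.vstack([D_lr, D_guide])
    y_joint = np.concatenate([y_lr, g_hr])
    z = omp(D_joint, y_joint, k)
    # Map the shared code through the HR dictionary of the target modality.
    return D_hr @ z

# Toy check with synthetic (not learned) dictionaries.
rng = np.random.default_rng(0)
n_atoms, k = 12, 3
D_joint, _ = np.linalg.qr(rng.standard_normal((16 + 64, n_atoms)))
D_lr, D_guide = D_joint[:16], D_joint[16:]
D_hr = rng.standard_normal((64, n_atoms))
z_true = np.zeros(n_atoms)
z_true[[1, 5, 9]] = [2.0, -1.5, 1.0]
x_hat = guided_sr_patch(D_lr @ z_true, D_guide @ z_true, D_lr, D_guide, D_hr, k)
print(np.allclose(x_hat, D_hr @ z_true))
```

The toy dictionaries are built with orthonormal columns so that the greedy search recovers the code exactly; with learned, overcomplete dictionaries the recovery would only be approximate.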
Cross-Paced Representation Learning with Partial Curricula for Sketch-based Image Retrieval
In this paper we address the problem of learning robust cross-domain
representations for sketch-based image retrieval (SBIR). While most SBIR
approaches focus on extracting low- and mid-level descriptors for direct
feature matching, recent works have shown the benefit of learning coupled
feature representations to describe data from two related sources. However,
cross-domain representation learning methods are typically cast into non-convex
minimization problems that are difficult to optimize, leading to unsatisfactory
performance. Inspired by self-paced learning, a learning methodology designed
to overcome convergence issues related to local optima by exploiting the
samples in a meaningful order (i.e. easy to hard), we introduce the cross-paced
partial curriculum learning (CPPCL) framework. Compared with existing
self-paced learning methods which only consider a single modality and cannot
deal with prior knowledge, CPPCL is specifically designed to assess the
learning pace by jointly handling data from dual sources and modality-specific
prior information provided in the form of partial curricula. Additionally,
thanks to the learned dictionaries, we demonstrate that the proposed CPPCL
embeds robust coupled representations for SBIR. Our approach is extensively
evaluated on four publicly available datasets (i.e. CUFS, Flickr15K, QueenMary
SBIR and TU-Berlin Extension datasets), showing superior performance over
competing SBIR methods.
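The easy-to-hard ordering behind self-paced learning can be sketched on a toy regression problem. This is the generic single-modality scheme with hard 0/1 sample weights, not the CPPCL framework itself; the function names, the pace parameter `lam`, and its growth factor are assumptions chosen for illustration.

```python
import numpy as np

def self_paced_weights(losses, lam):
    """Hard self-paced weights: keep samples whose loss is below the pace lam."""
    return (np.asarray(losses) < lam).astype(float)

def self_paced_least_squares(X, y, lam=1.0, growth=1.5, rounds=5):
    """Alternate between sample selection and weighted least squares."""
    w = np.zeros(X.shape[1])
    v = np.zeros(X.shape[0])
    for _ in range(rounds):
        losses = (X @ w - y) ** 2
        v = self_paced_weights(losses, lam)
        if v.sum() == 0:
            # Nothing counts as "easy" yet: relax the pace and try again.
            lam *= growth
            continue
        # Fit the model using only the currently selected samples.
        sw = np.sqrt(v)
        w, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        # Admit harder samples in the next round.
        lam *= growth
    return w, v

# Three consistent samples and one gross outlier.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 2.0, 3.0, 40.0])
w, v = self_paced_least_squares(X, y)
print(w, v)
```

In this toy run the outlier never becomes "easy" within the five rounds, so it is excluded from the fit. A cross-paced variant would, per the abstract, assess the pace jointly over two sources and take partial curricula into account.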
Multi-modal dictionary learning for image separation with application in art investigation
In support of art investigation, we propose a new source separation method
that unmixes a single X-ray scan acquired from double-sided paintings. In this
problem, the X-ray signals to be separated have similar morphological
characteristics, which brings previous source separation methods to their
limits. Our solution is to use photographs taken from the front and back-side
of the panel to drive the separation process. The crux of our approach relies
on the coupling of the two imaging modalities (photographs and X-rays) using a
novel coupled dictionary learning framework able to capture both common and
disparate features across the modalities using parsimonious representations;
the common component models features shared by the multi-modal images, whereas
the innovation component captures modality-specific information. As such, our
model enables the formulation of appropriately regularized convex optimization
procedures that lead to the accurate separation of the X-rays. Our dictionary
learning framework can be tailored both to a single- and a multi-scale
framework, with the latter leading to a significant performance improvement.
Moreover, to improve further on the visual quality of the separated images, we
propose to train coupled dictionaries that ignore certain parts of the painting
corresponding to craquelure. Experimentation on synthetic and real data - taken
from digital acquisition of the Ghent Altarpiece (1432) - confirms the
superiority of our method against the state-of-the-art morphological component
analysis technique that uses either fixed or trained dictionaries to perform
image separation. Comment: submitted to IEEE Transactions on Image Processing
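One hedged way to read the common and innovation components described above is as the following coupled sparse model. All symbols are illustrative assumptions and are not taken from the paper: y_i is a patch from the photograph of side i, x_i the corresponding unmixed X-ray patch, m the observed mixed X-ray patch, z_i the sparse code shared across modalities, and v_i the X-ray-specific innovation code.

```latex
\begin{aligned}
y_i &= \Psi^{c} z_i, \qquad i \in \{1, 2\}, \\
x_i &= \Phi^{c} z_i + \Phi v_i, \\
m   &= x_1 + x_2 .
\end{aligned}
```

Under these assumptions, a convex separation step could take the form

```latex
\min_{z_1, z_2, v_1, v_2} \; \sum_{i=1}^{2} \bigl( \|z_i\|_1 + \|v_i\|_1 \bigr)
\quad \text{s.t.} \quad
m = \sum_{i=1}^{2} \bigl( \Phi^{c} z_i + \Phi v_i \bigr), \qquad
y_i = \Psi^{c} z_i, \; i \in \{1, 2\}.
```

The photographs pin down the shared codes z_i, which is what makes the otherwise ambiguous split of m into x_1 and x_2 tractable.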
Multimodal Deep Unfolding for Guided Image Super-Resolution
The reconstruction of a high resolution image given a low resolution
observation is an ill-posed inverse problem in imaging. Deep learning methods
rely on training data to learn an end-to-end mapping from a low-resolution
input to a high-resolution output. Unlike existing deep multimodal models that
do not incorporate domain knowledge about the problem, we propose a multimodal
deep learning design that incorporates sparse priors and allows the effective
integration of information from another image modality into the network
architecture. Our solution relies on a novel deep unfolding operator,
performing steps similar to an iterative algorithm for convolutional sparse
coding with side information; therefore, the proposed neural network is
interpretable by design. The deep unfolding architecture is used as a core
component of a multimodal framework for guided image super-resolution. An
alternative multimodal design is investigated by employing residual learning to
improve the training efficiency. The presented multimodal approach is applied
to super-resolution of near-infrared and multi-spectral images as well as depth
upsampling using RGB images as side information. Experimental results show that
our model outperforms state-of-the-art methods.
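The link between an iterative sparse coding algorithm and a network layer can be sketched with plain ISTA. This is a simplified, non-convolutional version without side information, so it is not the paper's operator; the matrices `W` and `S` and the threshold are fixed here from the dictionary, whereas in an unfolded network they would be learned parameters. All names are assumptions.

```python
import numpy as np

def soft_threshold(x, theta):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def unfolded_ista(y, D, lam, n_layers):
    """Run n_layers ISTA steps; each step has the shape of one network layer."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    W = D.T / L                            # input weight (learned in an unfolded net)
    S = np.eye(D.shape[1]) - D.T @ D / L   # recurrent weight (learned in an unfolded net)
    z = np.zeros(D.shape[1])
    for _ in range(n_layers):
        # Linear map followed by a shrinkage nonlinearity.
        z = soft_threshold(S @ z + W @ y, lam / L)
    return z

def lasso_objective(y, D, z, lam):
    """Objective that ISTA minimizes: data fit plus l1 sparsity penalty."""
    return 0.5 * np.sum((y - D @ z) ** 2) + lam * np.sum(np.abs(z))

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 30))
y = rng.standard_normal(20)
for n in (0, 1, 10):
    print(n, lasso_objective(y, D, unfolded_ista(y, D, 0.1, n), 0.1))
```

Each layer is a linear map followed by soft thresholding, which is why a network built this way can be read as a truncated optimization algorithm.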
Spectral Superresolution of Multispectral Imagery with Joint Sparse and Low-Rank Learning
Extensive attention has been paid to enhancing the spatial resolution of
hyperspectral (HS) images with the aid of multispectral (MS) images in remote
sensing. However, the fusion of HS and MS images remains to be
improved, particularly in large-scale scenes, due to the limited acquisition of
HS images. Alternatively, we super-resolve MS images in the spectral domain by
means of partially overlapped HS images, yielding a novel and promising
topic: spectral superresolution (SSR) of MS imagery. This is a challenging and
less investigated task due to its high ill-posedness in inverse imaging. To
this end, we develop a simple but effective method, called joint sparse and
low-rank learning (J-SLoL), to spectrally enhance MS images by jointly learning
low-rank HS-MS dictionary pairs from overlapped regions. J-SLoL infers and
recovers the unknown hyperspectral signals over a larger coverage by sparse
coding on the learned dictionary pair. We validate the SSR
performance on three HS-MS datasets (two for classification and one for
unmixing) in terms of reconstruction, classification, and unmixing by comparison
with several existing state-of-the-art baselines, showing the effectiveness and
superiority of the proposed J-SLoL algorithm. Furthermore, the codes and
datasets will be available at:
https://github.com/danfenghong/IEEE_TGRS_J-SLoL, contributing to the RS
community.
Cross-Modality Feature Learning for Three-Dimensional Brain Image Synthesis
Multi-modality medical imaging is increasingly used for comprehensive assessment of complex diseases, either in diagnostic examinations or as part of medical research trials. Different imaging modalities provide complementary information about living tissues. However, multi-modal examinations are not always possible due to adverse factors such as patient discomfort, increased cost, prolonged scanning time and scanner unavailability. In addition, in large imaging studies, incomplete records are not uncommon owing to image artifacts, data corruption or data loss, which compromise the potential of multi-modal acquisitions. Moreover, regardless of how good an imaging system is, the performance of the imaging equipment is ultimately limited by its physical devices. Additional interferences arise (particularly for medical imaging systems), for example, limited acquisition times, sophisticated and costly equipment and patients with severe medical conditions, which also cause image degradation. The acquisitions can therefore be considered degraded versions of the original high-quality images.
In this dissertation, we explore the problems of image super-resolution and cross-modality synthesis for one Magnetic Resonance Imaging (MRI) modality from an image of another MRI modality of the same subject using an image synthesis framework for reconstructing the missing/complex modality data. We develop models and techniques that allow us to connect the domain of source modality data and the domain of target modality data, enabling transformation between elements of the two domains. In particular, we first introduce the models that project both source modality data and target modality data into a common multi-modality feature space in a supervised setting. This common space then allows us to connect related cross-modality features and to apply the learned association function to synthesize any target modality image. Moreover, we develop a weakly-supervised method that takes a few registered multi-modality image pairs as training data and generates the desired modality data without requiring a large collection of well-processed (e.g., skull-stripped and strictly registered) multi-modality brain images. Finally, we propose an approach that provides a generic way of learning a dual mapping between source and target domains while considering both visually high-fidelity synthesis and task-practicability. We demonstrate that this model can be used to take any arbitrary modality and efficiently synthesize the desirable modality data in an unsupervised manner.
We show that these proposed models advance the state of the art on image super-resolution and cross-modality synthesis tasks that require joint processing of multi-modality images, and that the algorithms can be designed to generate data that are practically beneficial to medical image analysis.
Self-Supervised Super-Resolution Approach for Isotropic Reconstruction of 3D Electron Microscopy Images from Anisotropic Acquisition
Three-dimensional electron microscopy (3DEM) is an essential technique to
investigate volumetric tissue ultra-structure. Due to technical limitations and
high imaging costs, samples are often imaged anisotropically, where resolution
in the axial direction (z) is lower than in the lateral directions (x, y).
This anisotropy in 3DEM can hamper subsequent analysis and visualization tasks. To
overcome this limitation, we propose a novel deep-learning (DL)-based
self-supervised super-resolution approach that computationally reconstructs
isotropic 3DEM from the anisotropic acquisition. The proposed DL-based
framework is built upon the U-shape architecture incorporating
vision-transformer (ViT) blocks, enabling high-capability learning of local and
global multi-scale image dependencies. To train the tailored network, we employ
a self-supervised approach. Specifically, we generate pairs of anisotropic and
isotropic training datasets from the given anisotropic 3DEM data. By feeding
the given anisotropic 3DEM dataset into the trained network through our proposed
framework, the isotropic 3DEM is obtained. Importantly, this isotropic
reconstruction approach relies solely on the given anisotropic 3DEM dataset and
does not require pairs of co-registered anisotropic and isotropic 3DEM training
datasets. To evaluate the effectiveness of the proposed method, we conducted
experiments using three 3DEM datasets acquired from the brain. The experimental
results demonstrated that our proposed framework could successfully reconstruct
isotropic 3DEM from the anisotropic acquisition.
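The self-supervised pair generation mentioned above can be sketched as follows. The abstract does not state how the pairs are produced, so this assumes one common strategy: lateral slices are already sharp in both directions, so one lateral axis is degraded synthetically to imitate the coarser axial sampling. The function name, the (z, y, x) axis order, and block averaging as the degradation model are all assumptions.

```python
import numpy as np

def make_training_pairs(volume, factor):
    """Build (degraded, target) pairs from an anisotropic (z, y, x) volume."""
    n_z, n_y, n_x = volume.shape
    n_y -= n_y % factor                      # crop so that y divides evenly
    targets = volume[:, :n_y, :]             # lateral slices at full resolution
    # Average groups of `factor` rows to mimic the coarser axial sampling.
    degraded = targets.reshape(n_z, n_y // factor, factor, n_x).mean(axis=2)
    return degraded, targets

# Toy volume: 2 slices of 7 x 4 pixels, with a 3x anisotropy factor.
volume = np.arange(2 * 7 * 4, dtype=float).reshape(2, 7, 4)
degraded, targets = make_training_pairs(volume, factor=3)
print(degraded.shape, targets.shape)
```

A network trained to map `degraded` back to `targets` could then be applied along the axial direction of the same volume, which is why no separately acquired isotropic data would be needed under this scheme.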