M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients
Early and accurate prediction of overall survival (OS) time can help to
obtain better treatment planning for brain tumor patients. Although many OS
time prediction methods have been developed and achieve promising results,
several issues remain. First, conventional prediction methods rely on
radiomic features at the local lesion area of a magnetic resonance (MR) volume,
which may not represent the full image or model complex tumor patterns. Second,
different types of scanners (i.e., multi-modal data) are sensitive to different
brain regions, which makes it challenging to effectively exploit the
complementary information across multiple modalities and also preserve the
modality-specific properties. Third, existing methods focus on prediction
models, ignoring complex data-to-label relationships. To address the above
issues, we propose an end-to-end OS time prediction model, namely the
Multi-modal Multi-channel Network (M2Net). Specifically, we first project the 3D MR volume
onto 2D images in different directions, which reduces computational costs,
while preserving important information and enabling pre-trained models to be
transferred from other tasks. Then, we use a modality-specific network to
extract implicit and high-level features from different MR scans. A multi-modal
shared network is built to fuse these features using a bilinear pooling model,
exploiting their correlations to provide complementary information. Finally, we
integrate the outputs from each modality-specific network and the multi-modal
shared network to generate the final prediction result. Experimental results
demonstrate the superiority of our M2Net model over other methods.
Comment: Accepted by MICCAI'2
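The projection and fusion steps described in the abstract can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: `project_volume` and `bilinear_pool` are hypothetical helpers, assuming maximum-intensity projection along each volume axis and a plain outer-product form of bilinear pooling on feature vectors (M2Net itself operates on learned CNN features).

```python
import numpy as np

def project_volume(vol):
    """Project a 3D MR volume onto 2D images along each axis direction
    (a simplified stand-in for M2Net's multi-direction projection)."""
    return [vol.max(axis=a) for a in range(3)]  # one 2D view per direction

def bilinear_pool(f1, f2):
    """Bilinear pooling: outer product of two modality feature vectors,
    flattened, so all pairwise feature interactions are exposed."""
    return np.outer(f1, f2).ravel()
```

Projecting an `(8, 16, 32)` volume yields three 2D views of shapes `(16, 32)`, `(8, 32)` and `(8, 16)`; pooling a length-4 with a length-5 feature vector yields a length-20 fused feature.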
Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration
Owing to differences in viewing range, resolution, and relative position, the
multi-modality sensing module composed of infrared and visible cameras needs to
be registered to achieve more accurate scene perception. In practice, manual
calibration-based registration is the most widely used process, and it must be
recalibrated regularly to maintain accuracy, which is time-consuming and
labor-intensive. To cope with these problems, we propose a scene-adaptive
infrared and visible image registration method. Specifically, to handle the
discrepancy between multi-modality images, an invertible translation process is
developed to establish a modality-invariant domain, which comprehensively
embraces the feature intensity and distribution of both infrared and visible
modalities. We employ homography to simulate the deformation between different
planes and develop a hierarchical framework to rectify the deformation inferred
from the proposed latent representation in a coarse-to-fine manner. Here, the
advanced perception ability, coupled with residual estimation, is conducive to
regressing sparse offsets, and an alternate correlation search facilitates
more accurate correspondence matching. Moreover, we propose the first
misaligned infrared and visible image dataset with available ground truth,
involving three synthetic sets and one real-world set. Extensive experiments
validate the effectiveness of the proposed method against state-of-the-art
methods, advancing subsequent applications.
Comment: 10 pages, 11 figure
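The homography-based deformation model mentioned above can be illustrated with a minimal sketch; `apply_homography` is a hypothetical helper, not the paper's network, assuming a plain 3x3 projective transform applied to 2D points in homogeneous coordinates:

```python
import numpy as np

def apply_homography(H, pts):
    """Warp an (N, 2) array of 2D points with a 3x3 homography by lifting
    to homogeneous coordinates and dividing out the projective scale."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # (N, 3) homogeneous
    warped = pts_h @ H.T
    return warped[:, :2] / warped[:, 2:3]

# a coarse-to-fine scheme can compose two estimates: H = H_fine @ H_coarse
```

With a pure-translation homography (identity rotation block, offsets in the last column), the points simply shift by the offset, which is a quick sanity check for the convention used.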
KidneyRegNet: A Deep Learning Method for 3DCT-2DUS Kidney Registration during Breathing
This work proposes a novel deep registration pipeline for 3D CT and 2D U/S
kidney scans under free breathing, which consists of a feature network and a
3D-2D CNN-based registration network. The feature network uses handcrafted
texture feature layers to reduce the semantic gap. The registration network is
an encoder-decoder structure with a feature-image-motion (FIM) loss, which
enables hierarchical regression at the decoder layers and avoids concatenating
multiple networks. It was first pretrained on retrospective datasets with a
training-data generation strategy, then adapted to specific patient data via
unsupervised one-cycle transfer learning during onsite application. The experiment
was on 132 U/S sequences, 39 multiple phase CT and 210 public single phase CT
images, and 25 pairs of CT and U/S sequences. It resulted in mean contour
distance (MCD) of 0.94 mm between kidneys on CT and U/S images and MCD of 1.15
mm on CT and reference CT images. For datasets with small transformations, it
resulted in MCD of 0.82 and 1.02 mm respectively. For large transformations, it
resulted in MCD of 1.10 and 1.28 mm respectively. This work addressed
difficulties in 3DCT-2DUS kidney registration during free breathing via novel
network structures and training strategy.
Comment: 15 pages, 8 figures, 9 table
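The mean contour distance (MCD) figures reported above can be computed, in simplified form, as the symmetric average nearest-neighbour distance between two sampled contours. This is a hedged sketch assuming contours given as 2D point sets, not the authors' exact evaluation code:

```python
import numpy as np

def mean_contour_distance(a, b):
    """Symmetric mean contour distance between point sets a (N, 2) and
    b (M, 2): average each point's distance to its nearest neighbour on
    the other contour, in both directions."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M)
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())
```

For two parallel segments one unit apart, every nearest-neighbour distance is 1, so the MCD is 1.0; identical contours give 0.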
LDDMM and GANs: Generative Adversarial Networks for Diffeomorphic Registration.
Diffeomorphic image registration is a key problem in many applications of Computational Anatomy. Traditionally, deformable image registration has been formulated as a variational problem, solvable by costly numerical optimization methods. Over the last decade, contributions in the form of new methods based on traditional formulations have been decreasing, while more deep-learning-based models are being developed to learn deformable image registrations. In this work we contribute to this new trend by proposing a novel LDDMM method for diffeomorphic 3D image registration based on generative adversarial networks. We combine the best-performing generator and discriminator architectures for deformable registration with the LDDMM paradigm. We have successfully implemented three models for different parameterizations of diffeomorphisms, which show competitive results compared with both traditional and deep-learning-based state-of-the-art methods.
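For context, the variational energy that the classical LDDMM formulation minimizes (image-matching form, with matching weight sigma) is typically written as:

```latex
E(v) = \frac{1}{2}\int_0^1 \lVert v_t \rVert_V^2 \, dt
     + \frac{1}{\sigma^2} \left\lVert I_0 \circ \varphi_1^{-1} - I_1 \right\rVert_{L^2}^2 ,
\qquad \partial_t \varphi_t = v_t \circ \varphi_t , \quad \varphi_0 = \mathrm{Id} ,
```

where v is a time-dependent velocity field in a reproducing-kernel space V and the diffeomorphism varphi is its flow; learning-based variants such as the one above parameterize v (or varphi directly) with a network instead of optimizing this energy numerically per image pair.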
Unsupervised image registration towards enhancing performance and explainability in cardiac and brain image analysis
Magnetic Resonance Imaging (MRI) typically acquires multiple sequences (defined here as “modalities”). As each modality is designed to offer different anatomical and functional clinical information, there are evident disparities in imaging content across modalities. Inter- and intra-modality affine and non-rigid image registration is an essential medical image analysis process in clinical imaging, for example before imaging biomarkers are derived and clinically evaluated across different MRI modalities, time phases and slices. Although commonly needed in real clinical scenarios, affine and non-rigid image registration has not been extensively investigated using a single unsupervised model architecture. In our work, we present an unsupervised deep learning registration methodology that can accurately model affine and non-rigid transformations simultaneously. Moreover, inverse-consistency is a fundamental inter-modality registration property that is not considered in deep learning registration algorithms. To address inverse consistency, our methodology performs bi-directional cross-modality image synthesis to learn modality-invariant latent representations, and involves two factorised transformation networks (one per encoder-decoder channel) and an inverse-consistency loss to learn topology-preserving anatomical transformations. Overall, our model (named “FIRE”) shows improved performance against the reference standard baseline method (i.e., Symmetric Normalization implemented using the ANTs toolbox) on multi-modality brain 2D and 3D MRI and intra-modality cardiac 4D MRI data experiments. We focus on explaining model-data components to enhance model explainability in medical image registration. In computational time experiments, we show that the FIRE model operates in a memory-saving mode, as it can inherently learn topology-preserving image registration directly in the training phase.
We therefore demonstrate an efficient and versatile registration technique that can have merit in multi-modal image registration in the clinical setting.
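The inverse-consistency property can be sketched as a round-trip penalty: mapping a point A-to-B and back B-to-A should return it to where it started. `inverse_consistency_loss` below is a hypothetical illustration on point coordinates, not the FIRE loss itself, which operates on dense spatial transformations:

```python
import numpy as np

def inverse_consistency_loss(phi_ab, phi_ba, pts):
    """Penalise the deviation of phi_ba(phi_ab(x)) from x.
    phi_ab, phi_ba: callables mapping an (N, 2) point array to (N, 2)."""
    round_trip = phi_ba(phi_ab(pts))
    return np.mean(np.sum((round_trip - pts) ** 2, axis=-1))
```

When the two mappings are exact inverses (e.g. a shift and its negation) the loss is zero; any mismatch in the round trip contributes a positive penalty.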
Medical Image Registration: Statistical Models of Performance in Relation to the Statistical Characteristics of the Image Data
For image-guided interventions, the imaging task often pertains to registering preoperative and intraoperative images within a common coordinate system. While the accuracy of the registration is directly tied to the accuracy of targeting in the intervention (and presumably the success of the medical outcome), there is relatively little quantitative understanding of the fundamental factors that govern image registration accuracy.
A statistical framework is presented that relates models of image noise and spatial resolution to the task of registration, giving theoretical limits on registration accuracy and providing guidance for the selection of image acquisition and post-processing parameters. The framework is further shown to model the confounding influence of soft-tissue deformation in rigid image registration, accurately predicting the resulting reduction in registration accuracy and revealing similarity metrics that are robust against such effects. Furthermore, the framework is shown to provide conceptual guidance in the development of a novel CT-to-radiograph registration method that accounts for deformation.
The work also examines a learning-based method for deformable registration to investigate how the statistical characteristics of the training data affect the ability of the model to generalize to test data with differing statistical characteristics. The analysis provides insight into the benefits of statistically diverse training data for the generalizability of a neural network and is further applied to the development of a learning-based MR-to-CT synthesis method.
Overall, the work yields a quantitative approach to theoretically and experimentally relate the accuracy of image registration to the statistical characteristics of the image data, providing a rigorous guide to the development of new registration methods.
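As a toy instance of relating image statistics to registration accuracy, one can estimate a 1D translation by FFT-based circular cross-correlation and then study how the estimate degrades as noise is added to the moving signal. `estimate_shift` is an illustrative sketch under that assumption, not the dissertation's statistical framework:

```python
import numpy as np

def estimate_shift(ref, moving):
    """Estimate an integer circular shift between two 1D signals as the
    argmax of their circular cross-correlation, computed via the FFT."""
    corr = np.fft.ifft(np.fft.fft(moving) * np.conj(np.fft.fft(ref))).real
    return int(np.argmax(corr))

ref = np.zeros(32)
ref[3], ref[4], ref[10] = 1.0, 2.0, 0.5   # a toy "image" profile
moving = np.roll(ref, 7)                   # ground-truth shift of 7 samples
# adding noise, e.g. moving + sigma * np.random.randn(32), lets one measure
# how the registration error grows with the noise level sigma
```

Sweeping sigma and recording the error of `estimate_shift` against the known shift reproduces, in miniature, the kind of noise-versus-accuracy relationship the statistical framework formalizes.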