Learning Deep Similarity Metric for 3D MR-TRUS Registration
Purpose: The fusion of transrectal ultrasound (TRUS) and magnetic resonance
(MR) images for guiding targeted prostate biopsy has significantly improved the
biopsy yield of aggressive cancers. A key component of MR-TRUS fusion is image
registration. However, it is very challenging to obtain a robust automatic
MR-TRUS registration due to the large appearance difference between the two
imaging modalities. The work presented in this paper aims to tackle this
problem by addressing two challenges: (i) the definition of a suitable
similarity metric and (ii) the determination of a suitable optimization
strategy.
Methods: This work proposes the use of a deep convolutional neural network to
learn a similarity metric for MR-TRUS registration. We also use a composite
optimization strategy that explores the solution space in order to search for a
suitable initialization for the second-order optimization of the learned
metric. Further, a multi-pass approach is used in order to smooth the metric
for optimization.
Results: The learned similarity metric outperforms the classical mutual
information and also the state-of-the-art MIND feature based methods. The
results indicate that the overall registration framework has a large capture
range. The proposed deep similarity metric based approach obtained a mean TRE
of 3.86mm (with an initial TRE of 16mm) for this challenging problem.
Conclusion: A similarity metric that is learned using a deep neural network
can be used to assess the quality of any given image registration and can be
used in conjunction with the aforementioned optimization framework to perform
automatic registration that is robust to poor initialization.
Comment: To appear on IJCAR
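The abstract above reports its results as a target registration error (TRE) but does not define it. A minimal sketch, assuming the common definition (mean Euclidean distance between corresponding landmarks after registration); the function name and the landmark coordinates are illustrative and not taken from the paper:

```python
import math


def target_registration_error(fixed_landmarks, moved_landmarks):
    """Mean Euclidean distance between paired landmarks.

    Assumption: both lists hold 3D points in the same units (e.g. mm),
    and point i in one list corresponds to point i in the other.
    """
    if not fixed_landmarks or len(fixed_landmarks) != len(moved_landmarks):
        raise ValueError("need two non-empty landmark lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(fixed_landmarks, moved_landmarks)]
    return sum(distances) / len(distances)


# Hypothetical landmarks: each pair is exactly 5 units apart.
fixed = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
moved = [(3.0, 4.0, 0.0), (10.0, 0.0, 5.0)]
tre = target_registration_error(fixed, moved)
```

Under this definition, a drop from an initial TRE of 16 mm to 3.86 mm means the landmarks end up, on average, about 12 mm closer to their targets.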
On Recognizing Transparent Objects in Domestic Environments Using Fusion of Multiple Sensor Modalities
Current object recognition methods fail on object sets that mix diffuse,
reflective and transparent materials, although such sets are very common in
domestic scenarios. We show that a combination of cues from multiple sensor
modalities, including specular reflectance and unavailable depth information,
allows us to capture a larger subset of household objects by extending a
state-of-the-art object recognition method. This leads to a significant increase in
robustness of recognition over a larger set of commonly used objects.
Comment: 12 page
A comparative evaluation of 3 different free-form deformable image registration and contour propagation methods for head and neck MRI : the case of parotid changes radiotherapy
Purpose: To validate and compare the deformable image registration and parotid contour propagation process for head and neck magnetic resonance imaging in patients treated with radiotherapy using 3 different approaches: the commercial MIM, the open-source Elastix software, and an optimized version of it.
Materials and Methods: Twelve patients with head and neck cancer previously treated with radiotherapy were considered. Deformable image registration and parotid contour propagation were evaluated by considering the magnetic resonance images acquired before and after the end of the treatment. Deformable image registration, based on free-form deformation method, and contour propagation available on MIM were compared to Elastix. Two different contour propagation approaches were implemented for Elastix software, a conventional one (DIR_Trx) and an optimized homemade version, based on mesh deformation (DIR_Mesh). The accuracy of these 3 approaches was estimated by comparing propagated to manual contours in terms of average symmetric distance, maximum symmetric distance, Dice similarity coefficient, sensitivity, and inclusiveness.
Results: Good agreement was generally found between the manual contours and the propagated ones, with no differences among the 3 methods; in a few critical cases with complex deformations, DIR_Mesh proved to be more accurate, having the lowest values of average symmetric distance and maximum symmetric distance and the highest value of Dice similarity coefficient, although the differences were not significant. The average propagation errors with respect to the reference contours are lower than the voxel diagonal (2 mm), and the Dice similarity coefficient is around 0.8 for all 3 methods.
Conclusion: The 3 free-form deformation approaches were not significantly different in terms of deformable image registration accuracy and can be safely adopted for the registration and parotid contour propagation during radiotherapy on magnetic resonance imaging. More optimized approaches (such as DIR_Mesh) could be preferable for critical deformations.
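Two of the overlap measures named in the abstract above, the Dice similarity coefficient and sensitivity, can be sketched as follows. This is a minimal illustration, assuming each contour has been rasterized into a set of voxel indices; the function names and the toy voxel sets are hypothetical and the study's own implementation is not described in the abstract:

```python
def dice_coefficient(reference, propagated):
    """Dice similarity coefficient: 2 * |A & B| / (|A| + |B|)."""
    reference, propagated = set(reference), set(propagated)
    if not reference and not propagated:
        return 1.0  # two empty masks are treated as identical
    overlap = len(reference & propagated)
    return 2.0 * overlap / (len(reference) + len(propagated))


def sensitivity(reference, propagated):
    """Fraction of the reference (manual) voxels covered by the propagated contour."""
    reference, propagated = set(reference), set(propagated)
    if not reference:
        raise ValueError("reference contour is empty")
    return len(reference & propagated) / len(reference)


# Hypothetical voxel sets: 4 voxels each, 3 of them shared.
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
auto = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
dsc = dice_coefficient(manual, auto)
sens = sensitivity(manual, auto)
```

A Dice value around 0.8, as reported for all 3 methods, corresponds to a shared volume that is 80% of the average of the two contour volumes.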
Integration of multimodal data based on surface registration
The paper proposes and evaluates a strategy for the alignment of
anatomical and functional data of the brain. The method takes as an
input two different sets of images of the same patient: MR data and
SPECT. It proceeds in four steps: first, it constructs two voxel
models from the two image sets; next, it extracts from the two voxel
models the surfaces of regions of interest; in the third step, the
surfaces are interactively aligned by corresponding pairs; finally a
unique volume model is constructed by selectively applying the
geometrical transformations associated to the regions and weighting
their contributions. The main advantages of this strategy are (i) that
it can be applied retrospectively, (ii) that it is three-dimensional,
and (iii) that it is local. Its main disadvantage with regard to
previously published methods is that it requires the extraction of
surfaces. However, this step is often required for other stages of the
multimodal analysis, such as visualization, and therefore its cost
can be accounted for in the global cost of the process.
Postprint (published version
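The third step of the abstract above aligns surfaces from corresponding point pairs. The abstract does not say how the transformation is computed, so the following is only a sketch under a strong simplification: a least-squares rigid fit (rotation plus translation) in 2D, where the optimal angle has a closed form. The paper works in 3D, and the function name and sample points are hypothetical:

```python
import math


def fit_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Assumption: source[i] corresponds to target[i]; both are 2D points.
    """
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("need at least two corresponding point pairs")
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(q[0] for q in target) / n
    ty = sum(q[1] for q in target) / n
    # Accumulate the cross and dot terms of the centered point pairs.
    cross = dot = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        px, py, qx, qy = px - sx, py - sy, qx - tx, qy - ty
        cross += px * qy - py * qx
        dot += px * qx + py * qy
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    shift = (tx - (c * sx - s * sy), ty - (s * sx + c * sy))
    return theta, shift


# Hypothetical pairs: the source rotated by 90 degrees, then shifted by (2, 3).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(2.0, 3.0), (2.0, 4.0), (1.0, 3.0)]
angle, offset = fit_rigid_2d(src, dst)
```

A 3D version would typically replace the closed-form angle with a singular value decomposition of the cross-covariance matrix of the centered pairs.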
Adversarial Deformation Regularization for Training Image Registration Neural Networks
We describe an adversarial learning approach to constrain convolutional
neural network training for image registration, replacing heuristic smoothness
measures of displacement fields often used in these tasks. Using
minimally-invasive prostate cancer intervention as an example application, we
demonstrate the feasibility of utilizing biomechanical simulations to
regularize a weakly-supervised anatomical-label-driven registration network for
aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural
transrectal ultrasound (TRUS) images. A discriminator network is optimized to
distinguish the registration-predicted displacement fields from the motion data
simulated by finite element analysis. During training, the registration network
simultaneously aims to maximize similarity between anatomical labels that
drives image alignment and to minimize an adversarial generator loss that
measures divergence between the predicted and simulated deformations. The
end-to-end trained network enables efficient and fully-automated registration
that only requires an MR and TRUS image pair as input, without anatomical
labels or simulated data during inference. 108 pairs of labelled MR and TRUS
images from 76 prostate cancer patients and 71,500 nonlinear finite-element
simulations from 143 different patients were used for this study. We show that,
with only gland segmentation as training labels, the proposed method can help
predict physically plausible deformation without any other smoothness penalty.
Based on cross-validation experiments using 834 pairs of independent validation
landmarks, the proposed adversarial-regularized registration achieved a target
registration error of 6.3 mm that is significantly lower than those from
several other regularization methods.
Comment: Accepted to MICCAI 201
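The training objective described in the abstract above combines a label-similarity term with an adversarial generator term. The abstract gives no formulas, so the sketch below rests on assumptions: a soft Dice overlap for label similarity, the standard non-saturating generator loss, and a hypothetical weight `adv_weight`. None of these choices is confirmed by the source:

```python
import math


def soft_dice(warped_label, fixed_label, eps=1e-6):
    """Soft Dice overlap between two flattened label maps with values in [0, 1]."""
    overlap = sum(w * f for w, f in zip(warped_label, fixed_label))
    return (2.0 * overlap + eps) / (sum(warped_label) + sum(fixed_label) + eps)


def generator_loss(discriminator_score, eps=1e-12):
    """Non-saturating generator loss, -log D(predicted field) (assumed form)."""
    return -math.log(max(discriminator_score, eps))


def registration_loss(warped_label, fixed_label, discriminator_score, adv_weight=0.1):
    """Label dissimilarity plus the weighted adversarial term.

    adv_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    label_term = 1.0 - soft_dice(warped_label, fixed_label)
    return label_term + adv_weight * generator_loss(discriminator_score)


# Perfect label overlap and a fully fooled discriminator give zero loss.
perfect = registration_loss([1, 1, 0, 0], [1, 1, 0, 0], discriminator_score=1.0)
# A lower discriminator score (field judged implausible) raises the loss.
penalized = registration_loss([1, 1, 0, 0], [1, 1, 0, 0], discriminator_score=0.5)
```

In this form the adversarial term takes the place of a hand-written smoothness penalty, which is the substitution the abstract describes.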
Learning semantic sentence representations from visually grounded language without lexical knowledge
Current approaches to learning semantic representations of sentences often
use prior word-level knowledge. The current study aims to leverage visual
information in order to capture sentence level semantics without the need for
word embeddings. We use a multimodal sentence encoder trained on a corpus of
images with matching text captions to produce visually grounded sentence
embeddings. Deep Neural Networks are trained to map the two modalities to a
common embedding space such that for an image the corresponding caption can be
retrieved and vice versa. We show that our model achieves results comparable to
the current state-of-the-art on two popular image-caption retrieval benchmark
data sets: MSCOCO and Flickr8k. We evaluate the semantic content of the
resulting sentence embeddings using the data from the Semantic Textual
Similarity benchmark task and show that the multimodal embeddings correlate
well with human semantic similarity judgements. The system achieves
state-of-the-art results on several of these benchmarks, which shows that a
system trained solely on multimodal data, without assuming any word
representations, is able to capture sentence level semantics. Importantly, this
result shows that we do not need prior knowledge of lexical level semantics in
order to model sentence level semantics. These findings demonstrate the
importance of visual information in semantics.
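Image-caption retrieval in a common embedding space, as described in the abstract above, is commonly scored with recall@k. A minimal sketch, assuming cosine similarity as the ranking function and that image i matches caption i; the embeddings below are made-up 2D vectors, not model outputs:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recall_at_k(image_embeddings, caption_embeddings, k):
    """Fraction of images whose matching caption ranks among the top k captions."""
    hits = 0
    for i, image in enumerate(image_embeddings):
        ranked = sorted(
            range(len(caption_embeddings)),
            key=lambda j: cosine_similarity(image, caption_embeddings[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(image_embeddings)


# Hypothetical embeddings in a shared 2D space.
images = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
captions = [(0.9, 0.1), (0.2, 0.8), (1.0, 0.1)]
r1 = recall_at_k(images, captions, k=1)
r3 = recall_at_k(images, captions, k=3)
```

Caption-to-image retrieval can be scored the same way by swapping the two arguments.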
How can a multimodal approach to primate communication help us understand the evolution of communication?
Scientists studying the communication of non-human animals often aim to better understand the evolution of human communication, including human language. Some scientists take a phylogenetic perspective, where the goal is to trace the evolutionary history of communicative traits, while others take a functional perspective, where the goal is to understand the selection pressures underpinning specific traits. Both perspectives are necessary to fully understand the evolution of communication, but it is important to understand how the two perspectives differ and what they can and cannot tell us. Here, we suggest that integrating phylogenetic and functional questions can be fruitful in better understanding the evolution of communication. We also suggest that adopting a multimodal approach to communication might help to integrate phylogenetic and functional questions, and provide an interesting avenue for research into language evolution.