A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed. Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201
BInGo: Bayesian Intrinsic Groupwise Registration via Explicit Hierarchical Disentanglement
Multimodal groupwise registration aligns internal structures in a group of
medical images. Current approaches to this problem involve developing
similarity measures over the joint intensity profile of all images, which may
be computationally prohibitive for large image groups and unstable under
various conditions. To tackle these issues, we propose BInGo, a general
unsupervised hierarchical Bayesian framework based on deep learning, to learn
intrinsic structural representations to measure the similarity of multimodal
images. Particularly, a variational auto-encoder with a novel posterior is
proposed, which facilitates the disentanglement learning of structural
representations and spatial transformations, and characterizes the imaging
process from the common structure with shape transition and appearance
variation. Notably, BInGo can be trained on small groups and then tested on
large-scale groupwise registration, thus significantly reducing
computational costs. We compared BInGo with five iterative or deep learning
methods on three public intrasubject and intersubject datasets, i.e. BraTS,
MS-CMR of the heart, and Learn2Reg abdomen MR-CT, and demonstrated its superior
accuracy and computational efficiency, even for very large group sizes (e.g.,
over 1300 2D images from MS-CMR in each group).
Toward an object-based semantic memory for long-term operation of mobile service robots
Throughout a lifetime of operation, a mobile service robot needs to acquire, store and update its knowledge of a working environment. This includes the ability to identify and track objects in different places, as well as to use this information for interaction with humans. This paper introduces a long-term updating mechanism, inspired by the modal model of human memory, to enable a mobile robot to maintain its knowledge of a changing environment. The memory model is integrated with a hybrid map that represents the global topology and local geometry of the environment, as well as the respective 3D location of objects. We aim to enable the robot to use this knowledge to help humans by suggesting the most likely locations of specific objects in its map. An experiment using omni-directional vision demonstrates the ability to track the movements of several objects in a dynamic environment over an extended period of time.
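The idea of suggesting the most likely location of an object from accumulated observations can be illustrated with a small sketch. This is not the paper's memory model (which is inspired by the modal model of human memory and tied to a hybrid map); it is a hypothetical decayed-evidence ranking, and the class name `ObjectLocationMemory`, the `decay` parameter, and the place labels are all assumptions made for illustration.

```python
from collections import defaultdict


class ObjectLocationMemory:
    """Hypothetical sketch: rank places for an object by decayed sighting evidence."""

    def __init__(self, decay=0.9):
        # decay < 1 makes older sightings count less than recent ones (assumed value)
        self.decay = decay
        self.scores = defaultdict(lambda: defaultdict(float))

    def observe(self, obj, place):
        """Record that `obj` was seen at `place`."""
        places = self.scores[obj]
        # Fade all existing evidence for this object, then reinforce the new sighting.
        for known_place in places:
            places[known_place] *= self.decay
        places[place] += 1.0

    def most_likely(self, obj):
        """Return the place with the highest evidence, or None if never seen."""
        places = self.scores.get(obj)
        if not places:
            return None
        return max(places, key=places.get)


memory = ObjectLocationMemory()
for place in ["kitchen", "kitchen", "office", "office", "office"]:
    memory.observe("mug", place)
suggestion = memory.most_likely("mug")
```

With this sighting sequence the later office observations outweigh the earlier kitchen ones, so the suggestion shifts as the environment changes.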
SwinCross: Cross-modal Swin Transformer for Head-and-Neck Tumor Segmentation in PET/CT Images
Radiotherapy (RT) combined with cetuximab is the standard treatment for
patients with inoperable head and neck cancers. Segmentation of head and neck
(H&N) tumors is a prerequisite for radiotherapy planning but a time-consuming
process. In recent years, deep convolutional neural networks have become the de
facto standard for automated image segmentation. However, due to the expensive
computational cost associated with enlarging the field of view in DCNNs, their
ability to model long-range dependency is still limited, and this can result in
sub-optimal segmentation performance for objects with background context
spanning over long distances. On the other hand, Transformer models have
demonstrated excellent capabilities in capturing such long-range information in
several semantic segmentation tasks performed on medical images. Inspired by
the recent success of Vision Transformers and advances in multi-modal image
analysis, we propose a novel segmentation model, dubbed Cross-Modal Swin
Transformer (SwinCross), with a cross-modal attention (CMA) module to incorporate
cross-modal feature extraction at multiple resolutions. To validate the
effectiveness of the proposed method, we performed experiments on the HECKTOR
2021 challenge dataset and compared it with the nnU-Net (the backbone of the
top-5 methods in HECKTOR 2021) and other state-of-the-art transformer-based
methods such as UNETR and Swin UNETR. The proposed method is experimentally
shown to outperform these competing methods thanks to the ability of the CMA
module to capture better inter-modality complementary feature representations
between PET and CT for the task of head-and-neck tumor segmentation. Comment: 9 pages, 3 figures. Med Phys. 202
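As a rough illustration of cross-modal attention between PET and CT features, the sketch below lets tokens from one modality attend to tokens from the other using plain scaled dot-product attention. It is not the SwinCross CMA module: it omits learned query/key/value projections, windowing, and multi-resolution stages, and the function names `cross_attention` and `cross_modal_fuse` as well as the residual-sum fusion are assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries from one modality, keys/values from the other."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return attended


def cross_modal_fuse(pet_tokens, ct_tokens):
    """Hypothetical fusion: each modality adds what it attends to in the other (residual sum)."""
    pet_from_ct = cross_attention(pet_tokens, ct_tokens, ct_tokens)
    ct_from_pet = cross_attention(ct_tokens, pet_tokens, pet_tokens)
    fused_pet = [[a + b for a, b in zip(p, c)] for p, c in zip(pet_tokens, pet_from_ct)]
    fused_ct = [[a + b for a, b in zip(c, p)] for c, p in zip(ct_tokens, ct_from_pet)]
    return fused_pet, fused_ct


# Toy token features (made-up numbers): 3 PET tokens and 2 CT tokens, 2 channels each.
pet = [[0.5, -1.0], [2.0, 0.0], [1.0, 1.0]]
ct = [[1.0, 2.0], [1.0, 2.0]]
fused_pet, fused_ct = cross_modal_fuse(pet, ct)
```

Each fused token keeps its own features and adds a weighted summary of the other modality, which is the general intuition behind capturing complementary PET and CT information.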
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote sensing
scientists to bring their expertise into deep learning and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization. Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin