SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
In this paper, we address semantic segmentation of road-objects from 3D LiDAR
point clouds. In particular, we wish to detect and categorize instances of
interest, such as cars, pedestrians, and cyclists. We formulate this as a
point-wise classification problem, and propose an end-to-end pipeline called
SqueezeSeg based on convolutional neural networks (CNN): the CNN takes a
transformed LiDAR point cloud as input and directly outputs a point-wise label
map, which is then refined by a conditional random field (CRF) implemented as a
recurrent layer. Instance-level labels are then obtained by conventional
clustering algorithms. Our CNN model is trained on LiDAR point clouds from the
KITTI dataset, and our point-wise segmentation labels are derived from 3D
bounding boxes from KITTI. To obtain extra training data, we built a LiDAR
simulator into Grand Theft Auto V (GTA-V), a popular video game, to synthesize
large amounts of realistic training data. Our experiments show that SqueezeSeg
achieves high accuracy with astonishingly fast and stable runtime (8.7 ms per
frame), highly desirable for autonomous driving applications. Furthermore,
additionally training on synthesized data boosts validation accuracy on
real-world data. Our source code and synthesized data will be open-sourced.
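The "transformed LiDAR point cloud" that the CNN consumes is a dense 2D representation of the sparse 3D points; a common choice is a spherical (range-image) projection. Below is a minimal NumPy sketch of such a projection; the 64x512 grid and the field-of-view bounds are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def spherical_projection(points, h=64, w=512, fov_up=3.0, fov_down=-25.0):
    """Project (N, 3) LiDAR points onto an h x w range image.

    fov_up / fov_down are vertical field-of-view bounds in degrees
    (illustrative values, roughly Velodyne HDL-64E-like).
    Returns a 2D array holding the range of the point that lands in
    each cell (0 where no point falls).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2) + 1e-8
    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / r, -1.0, 1.0))    # elevation

    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    # Normalize both angles to [0, 1] image coordinates.
    u = 0.5 * (1.0 - yaw / np.pi)                    # column fraction
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) # row fraction

    col = np.clip((u * w).astype(int), 0, w - 1)
    row = np.clip((v * h).astype(int), 0, h - 1)

    img = np.zeros((h, w), dtype=np.float32)
    img[row, col] = r    # later points overwrite earlier ones
    return img
```

Each cell of the resulting image can carry additional channels (x, y, z, intensity) before being fed to the CNN; range alone is shown here for brevity.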
3D Anisotropic Hybrid Network: Transferring Convolutional Features from 2D Images to 3D Anisotropic Volumes
While deep convolutional neural networks (CNN) have been successfully applied
for 2D image analysis, it is still challenging to apply them to 3D anisotropic
volumes, especially when the within-slice resolution is much higher than the
between-slice resolution and when the amount of 3D volumes is relatively small.
On one hand, directly training a CNN with 3D convolution kernels suffers from
a lack of data and is likely to generalize poorly, while insufficient GPU
memory limits the model size and representational power. On the other hand,
applying a 2D CNN with generalizable features to individual slices ignores
between-slice information, and coupling a 2D network with an LSTM to capture
that information is suboptimal because LSTMs are difficult to train. To overcome
the above challenges, we propose a 3D Anisotropic Hybrid Network (AH-Net) that
transfers convolutional features learned from 2D images to 3D anisotropic
volumes. Such a transfer inherits the desired strong generalization capability
for within-slice information while naturally exploiting between-slice
information for more effective modelling. The focal loss is further employed
to improve end-to-end learning. We experiment with the proposed 3D AH-Net on
two medical image analysis tasks, namely lesion detection from a Digital Breast
Tomosynthesis volume, and liver and liver-tumor segmentation from a Computed
Tomography volume, and obtain state-of-the-art results.
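The focal loss mentioned above down-weights easy, well-classified examples so that training concentrates on hard ones. A minimal NumPy sketch of the binary focal loss follows; the gamma = 2 and alpha = 0.25 defaults are the commonly cited values from the original focal-loss paper, assumed here rather than taken from this abstract:

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-8):
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p : predicted foreground probabilities in (0, 1)
    y : binary ground-truth labels (0 or 1)
    The (1 - p_t)**gamma factor shrinks the loss of confident,
    correct predictions, focusing gradient on hard examples.
    """
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary binary cross-entropy, which is a quick sanity check on any implementation.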
VesselMorph: Domain-Generalized Retinal Vessel Segmentation via Shape-Aware Representation
Due to the absence of a single standardized imaging protocol, domain shift
between data acquired from different sites is an inherent property of medical
images and has become a major obstacle for large-scale deployment of
learning-based algorithms. For retinal vessel images, domain shift usually
presents as the variation of intensity, contrast and resolution, while the
basic tubular shape of vessels remains unaffected. Thus, taking advantage of
such domain-invariant morphological features can greatly improve the
generalizability of deep models. In this study, we propose a method named
VesselMorph which generalizes the 2D retinal vessel segmentation task by
synthesizing a shape-aware representation. Inspired by the traditional Frangi
filter and the diffusion tensor imaging literature, we introduce a
Hessian-based bipolar tensor field to depict the morphology of the vessels so
that the shape information is taken into account. We map the intensity image
and the tensor field to a latent space for feature extraction. Then we fuse the
two latent representations via a weight-balancing trick and feed the result to
a segmentation network. We evaluate VesselMorph on six public datasets of
fundus and OCT angiography images from diverse patient populations. It achieves
superior generalization performance compared with competing methods across
different domain-shift scenarios.
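The Frangi filter that inspires VesselMorph's Hessian-based tensor field scores each pixel by the eigenvalues of the image Hessian: a bright tubular structure has one small and one large negative eigenvalue. A minimal single-scale 2D sketch is below; the finite-difference Hessian, the beta and c parameters, and the absence of multi-scale Gaussian smoothing are simplifications for illustration, not VesselMorph's actual shape representation:

```python
import numpy as np

def frangi_vesselness(img, beta=0.5, c=15.0):
    """Single-scale 2D Frangi-style vesselness from the image Hessian.

    The response combines the anisotropy ratio Rb = |l1| / |l2| with
    the structure strength S = sqrt(l1^2 + l2^2), where |l1| <= |l2|
    are the Hessian eigenvalue magnitudes at each pixel.
    """
    gy, gx = np.gradient(img.astype(float))
    hyy, _ = np.gradient(gy)
    hxy, hxx = np.gradient(gx)

    # Eigenvalues of the symmetric 2x2 Hessian at every pixel.
    tmp = np.sqrt((hxx - hyy) ** 2 + 4.0 * hxy ** 2)
    l1 = 0.5 * (hxx + hyy - tmp)
    l2 = 0.5 * (hxx + hyy + tmp)

    # Sort so that |l1| <= |l2| everywhere.
    swap = np.abs(l1) > np.abs(l2)
    l1, l2 = np.where(swap, l2, l1), np.where(swap, l1, l2)

    rb = np.abs(l1) / (np.abs(l2) + 1e-10)
    s = np.sqrt(l1 ** 2 + l2 ** 2)
    v = np.exp(-rb ** 2 / (2 * beta ** 2)) * (1 - np.exp(-s ** 2 / (2 * c ** 2)))
    v[l2 > 0] = 0.0   # bright vessels require lambda2 < 0
    return v
```

Running this on an image containing a single bright line yields a strong response along the line and near-zero response in flat background, which is the domain-invariant tubular cue the abstract appeals to.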
- …