Multi-region segmentation of bladder cancer structures in MRI with progressive dilated convolutional networks
Precise segmentation of bladder walls and tumor regions is an essential step
towards non-invasive identification of tumor stage and grade, which is critical
for treatment decision and prognosis of patients with bladder cancer (BC).
However, the automatic delineation of bladder walls and tumors in magnetic
resonance images (MRI) is a challenging task, due to significant variations in
bladder shape, strong intensity inhomogeneity in urine, and very high
variability across the population, particularly in tumor appearance. To tackle
these issues,
we propose to use a deep fully convolutional neural network. The proposed
network includes dilated convolutions to increase the receptive field without
incurring extra cost or degrading performance. Furthermore, we introduce
progressive dilations in each convolutional block, thereby enabling extensive
receptive fields without the need for large dilation rates. The proposed
network is evaluated on 3.0T T2-weighted MRI scans from 60 pathologically
confirmed patients with BC. Experiments show the proposed model to achieve
high accuracy, with a mean Dice similarity coefficient of 0.98, 0.84 and 0.69
for inner wall, outer wall and tumor region, respectively. These results
represent a very good agreement with reference contours and an increase in
performance compared to existing methods. In addition, inference times are less
than a second for a whole 3D volume, which is two to three orders of magnitude
faster than related state-of-the-art methods for this application. We showed
that a CNN can yield precise segmentation of bladder walls and tumors in
bladder cancer patients on MRI. The whole segmentation process is
fully-automatic and yields results in very good agreement with the reference
standard, demonstrating the viability of deep learning models for the automatic
multi-region segmentation of bladder cancer in MRI.
Comment: Published in the journal Medical Physics
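As a hedged illustration of the core idea, a convolutional block with progressively increasing dilation rates might look as follows in PyTorch. The framework choice and the rates (1, 2, 4, 8) are assumptions for illustration, not the paper's exact configuration:

```python
# Minimal sketch of a "progressive dilation" convolutional block in PyTorch.
# The rates (1, 2, 4, 8) are illustrative assumptions; the paper's exact
# architecture may differ.
import torch
import torch.nn as nn

class ProgressiveDilationBlock(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4, 8)):
        super().__init__()
        layers = []
        ch = in_ch
        for r in rates:
            # padding == dilation keeps the spatial size unchanged for 3x3 kernels
            layers += [
                nn.Conv2d(ch, out_ch, kernel_size=3, padding=r, dilation=r),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            ]
            ch = out_ch
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)

x = torch.randn(1, 1, 128, 128)           # one single-channel MRI slice
y = ProgressiveDilationBlock(1, 32)(x)    # -> torch.Size([1, 32, 128, 128])
```

Stacking small dilated kernels this way grows the receptive field multiplicatively while each individual layer stays cheap, which is the stated motivation for avoiding single large dilation rates.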
Volumetric Instance-Aware Semantic Mapping and 3D Object Discovery
To autonomously navigate and plan interactions in real-world environments,
robots require the ability to robustly perceive and map complex, unstructured
surrounding scenes. Besides building an internal representation of the observed
scene geometry, the key insight toward a truly functional understanding of the
environment is the usage of higher-level entities during mapping, such as
individual object instances. We propose an approach to incrementally build
volumetric object-centric maps during online scanning with a localized RGB-D
camera. First, a per-frame segmentation scheme combines an unsupervised
geometric approach with instance-aware semantic object predictions. This allows
us to detect and segment elements both from the set of known classes and from
other, previously unseen categories. Next, a data association step tracks the
predicted instances across the different frames. Finally, a map integration
strategy fuses information about their 3D shape, location, and, if available,
semantic class into a global volume. Evaluation on a publicly available dataset
shows that the proposed approach for building instance-level semantic maps is
competitive with state-of-the-art methods, while additionally able to discover
objects of unseen categories. The system is further evaluated within a
real-world robotic mapping setup, for which qualitative results highlight the
online nature of the method.
Comment: 8 pages, 4 figures. To be published in IEEE Robotics and Automation
Letters (RA-L) and 2019 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS). Accompanying video material can be found at
http://youtu.be/Jvl42VJmYx
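The data-association step described above can be illustrated with a minimal sketch that matches per-frame instance masks to existing object tracks by greedy mask-IoU. The threshold and the greedy strategy are assumptions; the paper's actual criterion may differ:

```python
# Hedged sketch of a data-association step: match per-frame instance masks
# to existing tracks by greedy mask-IoU. Threshold and greedy matching are
# assumptions for illustration.
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def associate(tracks, detections, thresh=0.3):
    """tracks: dict id -> mask from earlier frames; detections: list of masks.
    Returns dict mapping detection index -> track id (new ids for unmatched)."""
    assignment, used = {}, set()
    next_id = max(tracks, default=-1) + 1
    for d_idx, det in enumerate(detections):
        best_id, best_iou = None, thresh
        for t_id, t_mask in tracks.items():
            if t_id in used:
                continue
            iou = mask_iou(det, t_mask)
            if iou > best_iou:
                best_id, best_iou = t_id, iou
        if best_id is None:              # unmatched -> start a new object track
            best_id = next_id
            next_id += 1
        assignment[d_idx] = best_id
        used.add(best_id)
        tracks[best_id] = det            # update track with the latest mask
    return assignment
```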
Analysis and Modeling of 3D Indoor Scenes
We live in a 3D world, performing activities and interacting with objects in
the indoor environments everyday. Indoor scenes are the most familiar and
essential environments in everyone's life. In the virtual world, 3D indoor
scenes are also ubiquitous in 3D games and interior design. With the fast
development of VR/AR devices and emerging applications, the demand for
realistic 3D indoor scenes keeps growing rapidly. Currently, designing detailed
3D indoor scenes requires proficient 3D designing and modeling skills and is
often time-consuming. For novice users, creating realistic and complex 3D
indoor scenes is even more difficult and challenging.
Many efforts have been made in different research communities, e.g. computer
graphics, vision and robotics, to capture, analyze and generate 3D indoor
data. This report mainly focuses on the recent research progress in graphics on
geometry, structure and semantic analysis of 3D indoor data and different
modeling techniques for creating plausible and realistic indoor scenes. We
first review works on understanding and semantic modeling of scenes from
captured 3D data of the real world. Then, we focus on the virtual scenes
composed of 3D CAD models and study methods for 3D scene analysis and
processing. After that, we survey various modeling paradigms for creating 3D
indoor scenes and investigate human-centric scene analysis and modeling, which
bridge indoor scene studies across graphics, vision, and robotics. Finally, we
discuss open problems in indoor scene processing that may be of interest to
graphics and all related communities.
Radiological images and machine learning: trends, perspectives, and prospects
The application of machine learning to radiological images is an increasingly
active research area that is expected to grow in the next five to ten years.
Recent advances in machine learning have the potential to recognize and
classify complex patterns from different radiological imaging modalities such
as x-rays, computed tomography, magnetic resonance imaging and positron
emission tomography imaging. In many applications, machine learning based
systems have shown comparable performance to human decision-making. The
applications of machine learning are the key ingredients of future clinical
decision making and monitoring systems. This review covers the fundamental
concepts behind various machine learning techniques and their applications in
several radiological imaging areas, such as medical image segmentation, brain
function studies and neurological disease diagnosis, as well as computer-aided
systems, image registration, and content-based image retrieval systems.
We also briefly discuss current challenges and future directions regarding
the application of machine learning in radiological imaging. By giving insight
into how to take advantage of machine learning powered applications, we expect
that clinicians can prevent and diagnose diseases more accurately and
efficiently.
Comment: 13 figures
ConvPoint: Continuous Convolutions for Point Cloud Processing
Point clouds are unstructured and unordered data, as opposed to images. Thus,
most machine learning approaches developed for images cannot be directly
transferred to point clouds. In this paper, we propose a generalization of
discrete convolutional neural networks (CNNs) in order to deal with point
clouds by replacing discrete kernels with continuous ones. This formulation is
simple, allows arbitrary point cloud sizes and can easily be used for designing
neural networks similarly to 2D CNNs. We present experimental results with
various architectures, highlighting the flexibility of the proposed approach.
We obtain competitive results compared to the state-of-the-art on shape
classification, part segmentation and semantic segmentation for large-scale
point clouds.
Comment: 12 pages
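A minimal sketch of the continuous-convolution idea, with kernel weights produced by a small MLP over relative point positions; the MLP kernel and the mean aggregation are illustrative assumptions, not ConvPoint's exact formulation:

```python
# Continuous convolution sketch: kernel weights are a learned function of
# relative point positions instead of a discrete grid. The MLP kernel and
# mean aggregation are assumptions for illustration.
import torch
import torch.nn as nn

class ContinuousConv(nn.Module):
    def __init__(self, in_ch, out_ch, hidden=32):
        super().__init__()
        # maps a 3D offset to an (in_ch x out_ch) weight matrix
        self.kernel = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, in_ch * out_ch),
        )
        self.in_ch, self.out_ch = in_ch, out_ch

    def forward(self, xyz, feats, center):
        # xyz: (N, 3) neighbor coordinates, feats: (N, in_ch), center: (3,)
        w = self.kernel(xyz - center)               # (N, in_ch*out_ch)
        w = w.view(-1, self.in_ch, self.out_ch)     # (N, in_ch, out_ch)
        out = torch.einsum('ni,nio->o', feats, w)   # weighted sum over neighbors
        return out / xyz.shape[0]                   # normalize by neighborhood size

conv = ContinuousConv(in_ch=4, out_ch=8)
out = conv(torch.randn(16, 3), torch.randn(16, 4), torch.zeros(3))  # -> (8,)
```

Because the kernel is evaluated at arbitrary offsets, the operator accepts any neighborhood size, which is what makes the formulation usable on unordered point clouds of varying density.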
Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control
In the context of deep learning for robotics, we show an effective method for
training a real robot to grasp a tiny sphere (1.37 cm in diameter) with an
original combination of system design choices. We decompose the end-to-end
system into a vision module and a closed-loop controller module. The two
modules use target object segmentation as their common interface. The vision
module extracts information from the robot end-effector camera, in the form of
a binary segmentation mask of the target. We train it to achieve effective
domain transfer by composing real background images with simulated images of
the target. The controller module takes as input the binary segmentation mask,
and thus is agnostic to visual discrepancies between simulated and real
environments. We train our closed-loop controller in simulation using imitation
learning and show it is robust with respect to discrepancies between the
dynamic model of the simulated and real robot: when combined with eye-in-hand
observations, we achieve a 90% success rate in grasping a tiny sphere with a
real robot. The controller can generalize to unseen scenarios where the target
is moving and even learns to recover from failures.
Comment: Neural Information Processing Systems (NIPS) 2017 Workshop on Acting
and Interacting in the Real World: Challenges in Robot Learning
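The described domain-transfer trick, compositing simulated renderings of the target onto real background photos using the simulator's ground-truth mask, can be sketched as follows (array shapes and conventions are assumptions):

```python
# Sketch of the domain-transfer trick: paste simulated renderings of the
# target onto real background photos using the simulator's ground-truth mask.
# Shapes and the hard (non-alpha) compositing are assumptions.
import numpy as np

def composite(sim_rgb, sim_mask, real_bg):
    """sim_rgb: (H, W, 3) simulated frame; sim_mask: (H, W) bool target mask;
    real_bg: (H, W, 3) real photo. Returns a training image and its label."""
    image = np.where(sim_mask[..., None], sim_rgb, real_bg)
    label = sim_mask.astype(np.uint8)    # binary segmentation target
    return image, label
```

Training the vision module on such composites exposes it to real image statistics in the background while the target's appearance and label come for free from simulation.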
Learning to Sample
Processing large point clouds is a challenging task. Therefore, the data is
often sampled to a size that can be processed more easily. The question is how
to sample the data? A popular sampling technique is Farthest Point Sampling
(FPS). However, FPS is agnostic to a downstream application (classification,
retrieval, etc.). The underlying assumption seems to be that minimizing the
farthest point distance, as done by FPS, is a good proxy to other objective
functions.
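For concreteness, FPS itself can be written in a few lines; this is a plain NumPy sketch, not the authors' code:

```python
# Farthest Point Sampling (FPS): greedily pick the point farthest from
# everything chosen so far. Plain NumPy sketch for illustration.
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """points: (N, 3). Returns indices of k sampled points."""
    n = points.shape[0]
    chosen = np.zeros(k, dtype=int)      # first sample: point 0 (arbitrary)
    dist = np.full(n, np.inf)
    for i in range(1, k):
        # distance of every point to its nearest already-chosen sample
        d = np.linalg.norm(points - points[chosen[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        chosen[i] = int(np.argmax(dist))
    return chosen
```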
We show that it is better to learn how to sample. To do that, we propose a
deep network to simplify 3D point clouds. The network, termed S-NET, takes a
point cloud and produces a smaller point cloud that is optimized for a
particular task. The simplified point cloud is not guaranteed to be a subset of
the original point cloud. Therefore, we match it to a subset of the original
points in a post-processing step. We contrast our approach with FPS by
experimenting on two standard data sets and show significantly better results
for a variety of applications. Our code is publicly available at:
https://github.com/orendv/learning_to_sample
Comment: CVPR 2019
Automatic Cardiac Disease Assessment on cine-MRI via Time-Series Segmentation and Domain Specific Features
Cardiac magnetic resonance imaging improves on diagnosis of cardiovascular
diseases by providing images at high spatiotemporal resolution. Manual
evaluation of these time-series, however, is expensive and prone to biased and
non-reproducible outcomes. In this paper, we present a method that addresses
these limitations by integrating segmentation and disease classification into a
fully automatic processing pipeline. We use an ensemble of UNet inspired
architectures for segmentation of cardiac structures such as the left and right
ventricular cavity (LVC, RVC) and the left ventricular myocardium (LVM) on each
time instance of the cardiac cycle. For the classification task, information is
extracted from the segmented time-series in the form of comprehensive features
handcrafted to reflect diagnostic clinical procedures. Based on these features
we train an ensemble of heavily regularized multilayer perceptrons (MLP) and a
random forest classifier to predict the pathologic target class. We evaluated
our method on the ACDC dataset (4 pathology groups, 1 healthy group) and
achieve Dice scores of 0.945 (LVC), 0.908 (RVC) and 0.905 (LVM) in a
cross-validation over the training set (100 cases) and 0.950 (LVC), 0.923 (RVC)
and 0.911 (LVM) on the test set (50 cases). We report a classification accuracy
of 94% on a training set cross-validation and 92% on the test set. Our results
underpin the potential of machine learning methods for accurate, fast and
reproducible segmentation and computer-assisted diagnosis (CAD).
Comment: To appear in the STACOM 2017 proceedings
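One plausible example of such a handcrafted clinical feature is the ejection fraction of a cavity, derived from its end-diastolic and end-systolic volumes in the segmented time-series; the voxel spacing below is made up for illustration:

```python
# Hedged illustration of one clinically motivated feature derivable from the
# segmented time-series: the ejection fraction of a cavity. The voxel spacing
# values are invented for the example.
import numpy as np

def cavity_volume(mask: np.ndarray, spacing=(1.25, 1.25, 8.0)) -> float:
    """mask: (Z, Y, X) binary segmentation; spacing in mm. Volume in ml."""
    return mask.sum() * np.prod(spacing) / 1000.0

def ejection_fraction(masks_over_time) -> float:
    """masks_over_time: sequence of 3D masks, one per cardiac phase."""
    volumes = [cavity_volume(m) for m in masks_over_time]
    edv, esv = max(volumes), min(volumes)    # end-diastolic / end-systolic
    return 100.0 * (edv - esv) / edv
```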
IsMo-GAN: Adversarial Learning for Monocular Non-Rigid 3D Reconstruction
The majority of the existing methods for non-rigid 3D surface regression from
monocular 2D images require an object template or point tracks over multiple
frames as an input, and are still far from real-time processing rates. In this
work, we present the Isometry-Aware Monocular Generative Adversarial Network
(IsMo-GAN) - an approach for direct 3D reconstruction from a single image,
trained for the deformation model in an adversarial manner on a light-weight
synthetic dataset. IsMo-GAN reconstructs surfaces from real images under
varying illumination, camera poses, textures and shading at over 250 Hz. In
multiple experiments, it consistently outperforms several approaches in the
reconstruction accuracy, runtime, generalisation to unknown surfaces and
robustness to occlusions. In comparison to the state-of-the-art, we reduce the
reconstruction error by 10-30% including the textureless case and our surfaces
evince fewer artefacts qualitatively.
Comment: 13 pages, 11 figures, 4 tables, 6 sections, 73 references
MeshCNN: A Network with an Edge
Polygonal meshes provide an efficient representation for 3D shapes. They
explicitly capture both shape surface and topology, and leverage non-uniformity
to represent large flat regions as well as sharp, intricate features. This
non-uniformity and irregularity, however, inhibits mesh analysis efforts using
neural networks that combine convolution and pooling operations. In this paper,
we utilize the unique properties of the mesh for a direct analysis of 3D shapes
using MeshCNN, a convolutional neural network designed specifically for
triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized
convolution and pooling layers that operate on the mesh edges, by leveraging
their intrinsic geodesic connections. Convolutions are applied on edges and the
four edges of their incident triangles, and pooling is applied via an edge
collapse operation that retains surface topology, thereby, generating new mesh
connectivity for the subsequent convolutions. MeshCNN learns which edges to
collapse, thus forming a task-driven process where the network exposes and
expands the important features while discarding the redundant ones. We
demonstrate the effectiveness of our task-driven pooling on various learning
tasks applied to 3D meshes.
Comment: For a two-minute explanation video see https://bit.ly/meshcnnvide
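A rough sketch of an edge-based convolution in this spirit: each edge aggregates features from the four edges of its two incident triangles, combined symmetrically to remove the ordering ambiguity of the two triangles. Shapes and the single-matrix formulation are simplifications, not MeshCNN's exact layers:

```python
# Edge-based convolution sketch: each edge combines its own features with
# symmetric functions of its four neighboring edges. Simplified single-matrix
# form for illustration; not MeshCNN's exact implementation.
import torch

def edge_conv(edge_feats, neighbors, weight):
    """edge_feats: (E, C); neighbors: (E, 4) indices of the four adjacent
    edges; weight: (5*C, C_out) learned matrix."""
    a, b, c, d = (edge_feats[neighbors[:, i]] for i in range(4))
    # symmetric features are invariant to swapping the two incident triangles
    sym = torch.cat([edge_feats, a + c, torch.abs(a - c),
                     b + d, torch.abs(b - d)], dim=1)   # (E, 5*C)
    return sym @ weight                                 # (E, C_out)

E, C, C_out = 10, 8, 16
feats = torch.randn(E, C)
nbrs = torch.randint(0, E, (E, 4))
out = edge_conv(feats, nbrs, torch.randn(5 * C, C_out))  # -> (10, 16)
```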