Semi-Supervised Deep Learning for Fully Convolutional Networks
Deep learning usually requires large amounts of labeled training data, but
annotating data is costly and tedious. The framework of semi-supervised
learning provides the means to use both labeled data and arbitrary amounts of
unlabeled data for training. Recently, semi-supervised deep learning has been
intensively studied for standard CNN architectures. However, Fully
Convolutional Networks (FCNs) set the state-of-the-art for many image
segmentation tasks. To the best of our knowledge, there is no existing
semi-supervised learning method for such FCNs yet. We lift the concept of
auxiliary manifold embedding for semi-supervised learning to FCNs with the help
of Random Feature Embedding. In our experiments on the challenging task of MS
Lesion Segmentation, we leverage the proposed framework for the purpose of
domain adaptation and report substantial improvements over the baseline model.
Comment: 9 pages, 6 figures
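As a rough illustration of the idea, an auxiliary embedding loss over randomly sampled pixel features might look like the following NumPy sketch. Everything here (the function name, the pair sampling, the hinge margin) is a hypothetical simplification for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_feature_embedding_loss(features, labels, n_pairs=64, margin=1.0):
    """Contrastive-style auxiliary embedding loss on randomly sampled
    pixel features (a hypothetical simplification of Random Feature
    Embedding). features: (H, W, C) dense FCN feature map;
    labels: (H, W) integer class map from the labeled data."""
    H, W, _ = features.shape
    # Sampling random pixel pairs keeps the embedding loss tractable
    # for dense FCN feature maps instead of embedding every pixel.
    ys = rng.integers(0, H, size=(n_pairs, 2))
    xs = rng.integers(0, W, size=(n_pairs, 2))
    loss = 0.0
    for (y1, y2), (x1, x2) in zip(ys, xs):
        f1, f2 = features[y1, x1], features[y2, x2]
        d = np.linalg.norm(f1 - f2)
        if labels[y1, x1] == labels[y2, x2]:
            loss += d ** 2                       # pull same-class pairs together
        else:
            loss += max(0.0, margin - d) ** 2    # push different classes apart
    return loss / n_pairs
```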
Fast, Simple Calcium Imaging Segmentation with Fully Convolutional Networks
Calcium imaging is a technique for observing neuron activity as a series of
images showing indicator fluorescence over time. Manually segmenting neurons is
time-consuming, leading to research on automated calcium imaging segmentation
(ACIS). We evaluated several deep learning models for ACIS on the Neurofinder
competition datasets and report our best model: U-Net2DS, a fully convolutional
network that operates on 2D mean summary images. U-Net2DS requires minimal
domain-specific pre/post-processing and parameter adjustment, and predictions
are made on full images at 9K images per minute. It
ranks third in the Neurofinder competition and is the best model
to exclusively use deep learning. We also demonstrate useful segmentations on
data from outside the competition. The model's simplicity, speed, and quality
results make it a practical choice for ACIS and a strong baseline for more
complex models in the future.
Comment: Accepted to 3rd Workshop on Deep Learning in Medical Image Analysis
(http://cs.adelaide.edu.au/~dlmia3/)
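The 2D mean summary image that U-Net2DS operates on can be computed directly from the movie; a minimal sketch (the min-max normalization is an assumption for illustration, not necessarily the exact preprocessing used in the competition entry):

```python
import numpy as np

def mean_summary_image(stack):
    """Collapse a calcium-imaging movie of shape (T, H, W) into the
    2D mean summary image that U-Net2DS-style models segment,
    min-max normalized to [0, 1]."""
    m = stack.mean(axis=0)
    lo, hi = m.min(), m.max()
    return (m - lo) / (hi - lo) if hi > lo else np.zeros_like(m)
```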
Discriminative Region Proposal Adversarial Networks for High-Quality Image-to-Image Translation
Great progress has been made in image-to-image translation by embracing
Generative Adversarial Networks (GANs). However, translation tasks that demand
high quality remain very challenging, especially at high resolution and with
photorealism. In this paper, we present Discriminative Region Proposal
Adversarial Networks (DRPAN) for high-quality image-to-image translation. We
decompose the image-to-image translation procedure into three iterated steps:
first, generate an image with a plausible global structure but some local
artifacts (via a GAN); second, use our DRPnet to propose the most fake region
of the generated image; and third, perform "image inpainting" on that most
fake region through a reviser to obtain a more realistic result. In this way,
the system (DRPAN) is gradually optimized to synthesize images with particular
attention to the most artifact-ridden local parts. Experiments on a variety of
image-to-image translation tasks and datasets validate that our method
outperforms the state of the art in producing high-quality translation
results, in terms of both human perceptual studies and automatic quantitative
measures.
Comment: ECCV 201
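The "propose the most fake region" step can be illustrated with a brute-force sliding-window search over a per-pixel realism score map. This is a toy stand-in for DRPnet; the window size, the scoring convention (lower = more fake), and the function name are all assumptions:

```python
import numpy as np

def most_fake_region(score_map, win=8):
    """Toy region proposer: slide a fixed window over a per-pixel
    realism score map (lower score = more fake) and return the
    (y, x, h, w) box whose mean score is lowest, i.e. the region
    the reviser would inpaint next."""
    H, W = score_map.shape
    best, best_box = np.inf, None
    for y in range(0, H - win + 1):
        for x in range(0, W - win + 1):
            s = score_map[y:y + win, x:x + win].mean()
            if s < best:
                best, best_box = s, (y, x, win, win)
    return best_box
```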
Supervised Versus Unsupervised Deep Learning Based Methods for Skin Lesion Segmentation in Dermoscopy Images
Image segmentation is considered a crucial step in automatic dermoscopic image analysis, as it affects the accuracy of subsequent steps. The huge progress in deep learning has recently revolutionized the image recognition and computer vision domains. In this paper, we compare a supervised deep learning based approach with an unsupervised deep learning based approach for the task of skin lesion segmentation in dermoscopy images. Results show that, using the default parameter settings and network configurations proposed in the original approaches, although the unsupervised approach could detect fine structures of skin lesions on some occasions, the supervised approach shows much higher accuracy in terms of Dice coefficient and Jaccard index (77.7% vs. 40% and 67.2% vs. 30.4%, respectively). With a proposed modification to the unsupervised approach, the Dice and Jaccard values improved to 54.3% and 44%, respectively.
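The two reported metrics are the standard overlap measures for binary segmentation masks; a minimal sketch:

```python
import numpy as np

def dice_jaccard(pred, gt):
    """Dice coefficient and Jaccard index for binary masks.
    Dice = 2|A∩B| / (|A| + |B|); Jaccard = |A∩B| / |A∪B|."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    jacc = inter / union
    return dice, jacc
```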
Changes of Structure and Bonding with Thickness in Chalcogenide Thin Films
Extreme miniaturization is known to be detrimental for certain properties, such as ferroelectricity in perovskite oxide films below a critical thickness. Remarkably, few-layer crystalline films of monochalcogenides display robust in-plane ferroelectricity with potential applications in nanoelectronics. These applications critically depend on the electronic properties and the nature of bonding in the 2D limit. A fundamental open question is thus to what extent bulk properties persist in thin films. Here, this question is addressed by a first-principles study of the structural, electronic, and ferroelectric properties of selected monochalcogenides (GeSe, GeTe, SnSe, and SnTe) as a function of film thickness up to 18 bilayers. While in selenides a few bilayers are sufficient to recover the bulk behavior, the Te-based compounds deviate strongly from the bulk, irrespective of the slab thickness. These results are explained in terms of depolarizing fields in Te-based slabs and the different nature of the chemical bond in selenides and tellurides. It is shown that GeTe and SnTe slabs inherit metavalent bonding of the bulk phase, despite structural and electronic properties being strongly modified in thin films. This understanding of the nature of bonding in few-layer structures offers a powerful tool to tune materials properties for applications in information technology.
A Weakly Supervised Approach for Estimating Spatial Density Functions from High-Resolution Satellite Imagery
We propose a neural network component, the regional aggregation layer, that
makes it possible to train a pixel-level density estimator using only
coarse-grained density aggregates, which reflect the number of objects in an
image region. Our approach is simple to use and does not require
domain-specific assumptions about the nature of the density function. We
evaluate our approach on several synthetic datasets. In addition, we use this
approach to learn to estimate high-resolution population and housing density
from satellite imagery. In all cases, we find that our approach results in
better density estimates than a commonly used baseline. We also show how our
housing density estimator can be used to classify buildings as residential or
non-residential.
Comment: 10 pages, 8 figures. ACM SIGSPATIAL 2018, Seattle, US
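The forward pass of such a regional aggregation layer reduces to summing the predicted pixel-level density within each region, so that the result can be compared against the coarse per-region counts; a minimal NumPy sketch (the function name and the integer region-ID encoding are assumptions):

```python
import numpy as np

def regional_aggregation(density, region_ids, n_regions):
    """Regional aggregation layer, forward pass only: sum the
    pixel-level density map within each region. The per-region sums
    are what coarse-grained object counts can supervise directly."""
    return np.bincount(region_ids.ravel(), weights=density.ravel(),
                       minlength=n_regions)
```

Because the sum is linear, gradients from a loss on the aggregates flow back uniformly to every pixel in the region, which is what makes training a pixel-level estimator from coarse labels possible.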
Separating Reflection and Transmission Images in the Wild
The reflections caused by common semi-reflectors, such as glass windows, can
impact the performance of computer vision algorithms. State-of-the-art methods
can remove reflections on synthetic data and in controlled scenarios. However,
they are based on strong assumptions and do not generalize well to real-world
images. Contrary to a common misconception, real-world images are challenging
even when polarization information is used. We present a deep learning approach
to separate the reflected and the transmitted components of the recorded
irradiance, which explicitly uses the polarization properties of light. To
train it, we introduce an accurate synthetic data generation pipeline, which
simulates realistic reflections, including those generated by curved and
non-ideal surfaces, non-static scenes, and high-dynamic-range scenes.
Comment: accepted at ECCV 201
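As a toy illustration of why polarization helps, consider an idealized per-pixel model in which each of two shots is a known linear combination of the reflected and transmitted components; real semi-reflectors violate these assumptions (which is precisely why the paper resorts to learning), and all names and coefficients below are hypothetical:

```python
import numpy as np

def separate_two_shots(i1, i2, a1, b1, a2, b2):
    """Toy per-pixel unmixing: given two observations
    i_k = a_k * R + b_k * T with known mixing coefficients (an
    idealized stand-in for what different polarizer angles provide),
    solve the 2x2 linear system for reflection R and transmission T."""
    A = np.array([[a1, b1], [a2, b2]], dtype=float)
    Ainv = np.linalg.inv(A)                     # fails if shots are degenerate
    obs = np.stack([i1.ravel(), i2.ravel()])    # (2, N) stacked observations
    RT = Ainv @ obs
    return RT[0].reshape(i1.shape), RT[1].reshape(i1.shape)
```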
Geometry meets semantics for semi-supervised monocular depth estimation
Depth estimation from a single image represents a very exciting challenge in
computer vision. While other image-based depth sensing techniques leverage the
geometry between different viewpoints (e.g., stereo or structure from motion),
the lack of these cues within a single image renders the monocular depth
estimation task ill-posed. For inference, state-of-the-art encoder-decoder
architectures for monocular depth estimation rely on effective feature
representations learned at training time. For unsupervised training of these
models, geometry has been effectively exploited through suitable image-warping
losses computed from views acquired by a stereo rig or a moving camera. In
this paper, we take a further step forward, showing that learning semantic
information from images also effectively improves monocular depth estimation.
In particular, by leveraging semantically labeled images together with
unsupervised signals obtained from geometry through an image-warping loss, we
propose a deep learning approach for joint semantic segmentation and depth
estimation. Our overall learning framework is semi-supervised, as we deploy
ground-truth data only in the semantic domain. At training time, our network
learns a common feature representation for both tasks, and a novel cross-task
loss function is proposed. The experimental findings show that jointly
tackling depth prediction and semantic segmentation improves depth estimation
accuracy. In particular, on the KITTI dataset our network outperforms
state-of-the-art methods for monocular depth estimation.
Comment: 16 pages, Accepted to ACCV 201
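A heavily simplified sketch of such a semi-supervised objective is given below: an unsupervised photometric (image-warping) residual drives the depth branch, a supervised cross-entropy term drives the semantic branch, and a weight balances the two. The weighting and the exact terms are assumptions for illustration, not the paper's cross-task loss:

```python
import numpy as np

def joint_loss(photo_residual, sem_logits, sem_labels, w_sem=0.1):
    """Illustrative semi-supervised joint objective.
    photo_residual: (H, W) pixel-wise difference between an image and
    its warp from another view (geometric, unsupervised).
    sem_logits: (H, W, C) per-pixel class scores; sem_labels: (H, W)
    ground-truth class indices (semantic, supervised)."""
    photo = np.abs(photo_residual).mean()
    # numerically stable softmax cross-entropy over the class axis
    z = sem_logits - sem_logits.max(axis=-1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    ce = -np.take_along_axis(logp, sem_labels[..., None], axis=-1).mean()
    return photo + w_sem * ce
```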
Concurrent Segmentation and Localization for Tracking of Surgical Instruments
Real-time instrument tracking is a crucial requirement for various
computer-assisted interventions. In order to overcome problems such as specular
reflections and motion blur, we propose a novel method that takes advantage of
the interdependency between localization and segmentation of the surgical tool.
In particular, we reformulate the 2D instrument pose estimation as heatmap
regression and thereby enable a concurrent, robust and near real-time
regression of both tasks via deep learning. As demonstrated by our experimental
results, this modeling leads to significantly better performance than
directly regressing the tool position and allows our method to outperform the
state of the art on a Retinal Microsurgery benchmark and the MICCAI EndoVis
Challenge 2015.
Comment: I. Laina and N. Rieke contributed equally to this work. Accepted to
MICCAI 201
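Casting 2D pose estimation as heatmap regression means the network is trained to reproduce a Gaussian blob rendered at the instrument keypoint, and the coordinate is recovered by an argmax at test time; a minimal sketch (the sigma and the simple argmax decoding are assumptions):

```python
import numpy as np

def keypoint_heatmap(h, w, cy, cx, sigma=2.0):
    """Render the Gaussian heatmap used as the regression target when
    a 2D keypoint at (cy, cx) is cast as heatmap regression."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def decode_heatmap(hm):
    """Recover the keypoint location as the heatmap's argmax."""
    return np.unravel_index(np.argmax(hm), hm.shape)
```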
Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Depth estimation from a single image is a fundamental problem in computer
vision. In this paper, we propose a simple yet effective convolutional spatial
propagation network (CSPN) to learn the affinity matrix for depth prediction.
Specifically, we adopt an efficient linear propagation model, where the
propagation is performed in the manner of a recurrent convolutional operation
and the affinity among neighboring pixels is learned through a deep
convolutional neural network (CNN). We apply the designed CSPN to two depth
estimation tasks given a single image: (1) refining the depth output of
existing state-of-the-art (SOTA) methods; and (2) converting sparse depth
samples to a dense depth map by embedding the depth samples within the
propagation procedure. The second task is inspired by the availability of
LiDARs, which provide sparse but accurate depth measurements. We evaluate the
proposed CSPN on two popular benchmarks for depth estimation, i.e., NYU v2
and KITTI, showing that our approach improves over prior SOTA methods in both
quality (e.g., 30% further reduction in depth error) and speed (e.g., 2 to 5
times faster).
Comment: 14 pages, 8 figures, ECCV 201
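One linear propagation step in the spirit of CSPN can be sketched as an affinity-weighted convex combination of each pixel with its 4-neighborhood. The normalization scheme below is an assumption chosen so the update is stable (weights sum to one); it is a simplification of the paper's recurrent convolution, not its exact formulation:

```python
import numpy as np

def cspn_step(depth, affinity):
    """One propagation step: each pixel becomes a convex combination
    of itself and its 4 neighbors, with neighbor weights taken from a
    (learned) affinity map of shape (H, W, 4) ordered
    [up, down, left, right]. Edge pixels reuse their own value."""
    a = np.abs(affinity)
    a = a / (a.sum(axis=-1, keepdims=True) + 1.0)   # keep mass for self-term
    self_w = 1.0 - a.sum(axis=-1)
    up    = np.pad(depth, ((1, 0), (0, 0)), mode='edge')[:-1]
    down  = np.pad(depth, ((0, 1), (0, 0)), mode='edge')[1:]
    left  = np.pad(depth, ((0, 0), (1, 0)), mode='edge')[:, :-1]
    right = np.pad(depth, ((0, 0), (0, 1)), mode='edge')[:, 1:]
    return (self_w * depth + a[..., 0] * up + a[..., 1] * down
            + a[..., 2] * left + a[..., 3] * right)
```

Iterating this step spreads sparse depth samples into their neighborhood, which is how the sparse-to-dense task embeds LiDAR measurements in the propagation.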