RDCNet: Instance segmentation with a minimalist recurrent residual network
Instance segmentation is a key step for quantitative microscopy. While
several machine learning based methods have been proposed for this problem,
most of them rely on computationally complex models that are trained on
surrogate tasks. Building on recent developments towards end-to-end trainable
instance segmentation, we propose a minimalist recurrent network called
recurrent dilated convolutional network (RDCNet), consisting of a shared
stacked dilated convolution (sSDC) layer that iteratively refines its output
and thereby generates interpretable intermediate predictions. It is
light-weight and has few critical hyperparameters, which can be related to
physical aspects such as object size or density. We perform a sensitivity
analysis of its main parameters and we demonstrate its versatility on 3 tasks
with different imaging modalities: nuclear segmentation of H&E slides, of 3D
anisotropic stacks from light-sheet fluorescence microscopy and leaf
segmentation of top-view images of plants. It achieves state-of-the-art results
on 2 of the 3 datasets.
Comment: Accepted at the MICCAI-MLMI 2020 workshop.
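The recurrent refinement idea can be illustrated with a toy 1-D sketch. The kernel weights, dilations, and iteration count below are hypothetical; the actual sSDC layer operates on multi-channel 2-D/3-D feature maps.

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """'same'-padded single-channel 1-D dilated convolution."""
    k = len(w)
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        for j in range(k):
            out[i] += w[j] * xp[i + j * dilation]
    return out

def ssdc_step(state, weights, dilations):
    """One pass through the shared stacked dilated convolution layer,
    applied as a residual update of the recurrent state."""
    y = state
    for w, d in zip(weights, dilations):
        y = np.tanh(dilated_conv1d(y, w, d))
    return state + y

# the *same* layer is iterated, so each pass yields an intermediate prediction
x = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
weights = [np.array([0.2, 0.5, 0.2])] * 3   # hypothetical shared kernels
dilations = [1, 2, 4]                        # growing receptive field per stack
state = x.copy()
for _ in range(3):
    state = ssdc_step(state, weights, dilations)
```

Because the weights are shared across iterations, the parameter count stays constant no matter how many refinement steps are run.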
Instance Segmentation of Biological Images Using Harmonic Embeddings
We present a new instance segmentation approach tailored to biological
images, where instances may correspond to individual cells, organisms or plant
parts. Unlike instance segmentation for user photographs or road scenes, in
biological data object instances may be particularly densely packed, the
appearance variation may be particularly low, the processing power may be
restricted, while, on the other hand, the variability of sizes of individual
instances may be limited. The proposed approach successfully addresses these
peculiarities.
Our approach describes each object instance using an expectation of a limited
number of sine waves with frequencies and phases adjusted to particular object
sizes and densities. At train time, a fully-convolutional network is trained to
predict the object embeddings at each pixel using a simple pixelwise regression
loss, while at test time the instances are recovered using clustering in the
embedding space. In the experiments, we show that our approach outperforms
previous embedding-based instance segmentation approaches on a number of
biological datasets, achieving state-of-the-art on a popular CVPPP benchmark.
This excellent performance is combined with computational efficiency that is
needed for deployment to domain specialists.
The source code of the approach is available at
https://github.com/kulikovv/harmonic
Comment: Accepted as an oral at CVPR 2020.
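A minimal sketch of the embedding idea, with hypothetical frequencies standing in for the ones the paper adjusts to object size and density: nearby pixels of one instance map to similar sine-wave features, while a distant instance separates in embedding space.

```python
import numpy as np

def harmonic_embedding(coords, freqs):
    """Sine-wave features of normalized (x, y) pixel coordinates."""
    feats = []
    for f in freqs:
        feats.append(np.sin(2 * np.pi * f * coords[:, 0]))
        feats.append(np.sin(2 * np.pi * f * coords[:, 1]))
    return np.stack(feats, axis=1)

# two tiny "instances" of nearby pixels (normalized image coordinates)
a = np.array([[0.10, 0.10], [0.12, 0.11]])
b = np.array([[0.80, 0.80], [0.82, 0.79]])
freqs = [0.5, 1.0]   # hypothetical; chosen per object size/density in the paper
ea, eb = harmonic_embedding(a, freqs), harmonic_embedding(b, freqs)
intra = np.linalg.norm(ea[0] - ea[1])            # same-instance distance
inter = np.linalg.norm(ea.mean(0) - eb.mean(0))  # between-instance distance
```

At test time, clustering in this feature space then separates the instances, since intra-instance distances are much smaller than inter-instance ones.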
Object Discovery with a Copy-Pasting GAN
We tackle the problem of object discovery, where objects are segmented for a
given input image, and the system is trained without using any direct
supervision whatsoever. A novel copy-pasting GAN framework is proposed, where
the generator learns to discover an object in one image by compositing it into
another image such that the discriminator cannot tell that the resulting image
is fake. After carefully addressing subtle issues, such as preventing the
generator from `cheating', this game results in the generator learning to
select objects, as copy-pasting objects is most likely to fool the
discriminator. The system is shown to work well on four very different
datasets, including ones with large variations in object appearance against
challenging cluttered backgrounds.
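The generator's core move, compositing a masked object from one image into another, can be sketched with toy arrays and a hypothetical soft mask:

```python
import numpy as np

def copy_paste(src, dst, mask):
    """Composite the masked region of src onto dst: the generator's move."""
    return mask[..., None] * src + (1.0 - mask[..., None]) * dst

src = np.full((4, 4, 3), 0.9)   # image containing the object
dst = np.full((4, 4, 3), 0.1)   # target background image
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0            # the generator's predicted object mask
fake = copy_paste(src, dst, mask)   # composite shown to the discriminator
```

The mask is the only degree of freedom the generator controls, so fooling the discriminator pressures the mask toward whole, well-delineated objects.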
Unsupervised Object Segmentation with Explicit Localization Module
In this paper, we propose a novel architecture that iteratively discovers and
segments out the objects of a scene based on the image reconstruction quality.
Different from other approaches, our model uses an explicit localization module
that localizes objects of the scene based on the pixel-level reconstruction
qualities at each iteration, where simpler objects tend to be reconstructed
better at earlier iterations and thus are segmented out first. We show that our
localization module improves the quality of the segmentation, especially on
challenging backgrounds.
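A toy sketch of reconstruction-driven localization, using a hypothetical exhaustive patch search in place of the learned module: the region the current reconstruction explains worst is the one attended to next.

```python
import numpy as np

def localize_worst_patch(image, recon, patch=2):
    """Top-left corner of the patch with the highest reconstruction error."""
    err = (image - recon) ** 2
    h, w = err.shape
    best, best_pos = -1.0, (0, 0)
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            s = err[i:i + patch, j:j + patch].sum()
            if s > best:
                best, best_pos = s, (i, j)
    return best_pos

img = np.zeros((5, 5))
img[3:5, 3:5] = 1.0        # an object the model has not yet explained
recon = np.zeros((5, 5))   # current (empty) reconstruction
pos = localize_worst_patch(img, recon)
```

Simple objects are reconstructed well early, so their error vanishes and later iterations localize the remaining, harder objects.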
Improving Pixel Embedding Learning through Intermediate Distance Regression Supervision for Instance Segmentation
As a proposal-free approach, instance segmentation through pixel embedding
learning and clustering is gaining increasing attention. Compared with bounding box
refinement approaches, such as Mask R-CNN, it has potential advantages in
handling complex shapes and dense objects. In this work, we propose a simple,
yet highly effective, architecture for object-aware embedding learning. A
distance regression module is incorporated into our architecture to generate
seeds for fast clustering. At the same time, we show that the features learned
by the distance regression module are able to promote the accuracy of learned
object-aware embeddings significantly. By simply concatenating features of the
distance regression module to the images as inputs of the embedding module, the
mSBD scores on the CVPPP Leaf Segmentation Challenge can be further improved by
more than 8% compared to the identical set-up without concatenation, yielding
the best overall result on the CodaLab leaderboard.
Comment: ECCV 2020 Workshop on Computer Vision Problems in Plant Phenotyping
(CVPPP 2020).
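The input stacking can be sketched as follows. The distance transform here is a naive stand-in for the learned distance regression module, so this only illustrates how its output is concatenated onto the embedding network's input.

```python
import numpy as np

def distance_map(mask):
    """Naive 4-neighbour distance-to-background transform (illustration only)."""
    d = np.where(mask > 0, np.inf, 0.0)
    h, w = mask.shape
    for _ in range(h + w):   # sweep until stable for this toy size
        for i in range(h):
            for j in range(w):
                if mask[i, j]:
                    nb = [d[x, y] for x, y in ((i - 1, j), (i + 1, j),
                                               (i, j - 1), (i, j + 1))
                          if 0 <= x < h and 0 <= y < w]
                    d[i, j] = min(d[i, j], min(nb) + 1)
    return d

def embedding_net_input(image, dist):
    """Stack the distance map onto the image channels, as in the set-up above."""
    return np.stack([image, dist], axis=0)

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1
x = embedding_net_input(mask.astype(float), distance_map(mask))
```

Local maxima of the distance map also serve as seeds for the fast clustering step described in the abstract.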
Learning to Drive from Simulation without Real World Labels
Simulation can be a powerful tool for understanding machine learning systems
and designing methods to solve real-world problems. Training and evaluating
methods purely in simulation is often "doomed to succeed" at the desired task
in a simulated environment, but the resulting models are incapable of operation
in the real world. Here we present and evaluate a method for transferring a
vision-based lane following driving policy from simulation to operation on a
rural road without any real-world labels. Our approach leverages recent
advances in image-to-image translation to achieve domain transfer while jointly
learning a single-camera control policy from simulation control labels. We
assess the driving performance of this method using both open-loop regression
metrics and closed-loop performance while operating an autonomous vehicle on
rural and urban roads.
SOLO: Segmenting Objects by Locations
We present a new, embarrassingly simple approach to instance segmentation in
images. Compared to many other dense prediction tasks, e.g., semantic
segmentation, it is the arbitrary number of instances that has made instance
segmentation much more challenging. In order to predict a mask for each
instance, mainstream approaches either follow the 'detect-thensegment' strategy
as used by Mask R-CNN, or predict category masks first then use clustering
techniques to group pixels into individual instances. We view the task of
instance segmentation from a completely new perspective by introducing the
notion of "instance categories", which assigns categories to each pixel within
an instance according to the instance's location and size, thus nicely
converting instance mask segmentation into a classification-solvable problem.
Now instance segmentation is decomposed into two classification tasks. We
demonstrate a much simpler and more flexible instance segmentation framework
with strong performance, achieving accuracy on par with Mask R-CNN and
outperforming recent single-shot instance segmenters. We hope that this very
simple and strong framework can serve as a baseline for many instance-level
recognition tasks besides instance segmentation.
Comment: Accepted to Proc. Eur. Conf. Computer Vision (ECCV) 2020. Code is
available at https://git.io/AdelaiDe
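The location-based "instance category" can be sketched as a grid-cell index. The toy S=5 grid below is hypothetical; SOLO assigns instances across multiple feature-pyramid levels by both location and size.

```python
def instance_category(center, image_size, grid=5):
    """SOLO-style 'instance category': the index of the S x S grid cell
    containing the instance centre, so mask prediction becomes classification."""
    cy, cx = center
    h, w = image_size
    row = min(int(cy / h * grid), grid - 1)
    col = min(int(cx / w * grid), grid - 1)
    return row * grid + col

cat_a = instance_category((10, 10), (100, 100))   # top-left instance
cat_b = instance_category((90, 50), (100, 100))   # bottom-centre instance
```

Each of the S*S categories then owns one predicted mask channel, which is how the arbitrary number of instances is handled without detection or clustering.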
Object Discovery in Videos as Foreground Motion Clustering
We consider the problem of providing dense segmentation masks for object
discovery in videos. We formulate the object discovery problem as foreground
motion clustering, where the goal is to cluster foreground pixels in videos
into different objects. We introduce a novel pixel-trajectory recurrent neural
network that learns feature embeddings of foreground pixel trajectories linked
across time. By clustering the pixel trajectories using the learned feature
embeddings, our method establishes correspondences between foreground object
masks across video frames. To demonstrate the effectiveness of our framework
for object discovery, we conduct experiments on commonly used datasets for
motion segmentation, where we achieve state-of-the-art performance.
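Recovering object masks from trajectory embeddings can be sketched with a greedy grouping step. The threshold and 2-D embeddings are hypothetical; the paper learns the embeddings with a recurrent network over pixel trajectories.

```python
import numpy as np

def cluster_trajectories(embeddings, thresh=0.5):
    """Greedily group trajectory embeddings: a trajectory joins the first
    cluster whose centre is within `thresh`, else it starts a new object."""
    labels, centres = [], []
    for e in embeddings:
        for k, c in enumerate(centres):
            if np.linalg.norm(e - c) < thresh:
                labels.append(k)
                break
        else:
            centres.append(e)
            labels.append(len(centres) - 1)
    return labels

emb = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
labels = cluster_trajectories(emb)   # two foreground objects recovered
```

Because each trajectory is linked across time, a single cluster label directly yields a consistent object mask in every frame it spans.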
Rethinking Task and Metrics of Instance Segmentation on 3D Point Clouds
Instance segmentation on 3D point clouds is one of the most extensively
researched areas toward the realization of autonomous cars and robots. Certain
existing studies have split input point clouds into small regions such as
1 m x 1 m; one reason for this is that the models in those studies cannot
consume a large number of points because of their large space complexity.
However, because such
small regions occasionally include a very small number of instances belonging
to the same class, an evaluation using existing metrics such as mAP is largely
affected by the category recognition performance. To address these problems, we
propose a new method with space complexity O(Np) such that large regions can be
consumed, as well as novel metrics for tasks that are independent of the
categories or size of the inputs. Our method learns a mapping from input point
clouds to an embedding space, where the embeddings form clusters for each
instance and distinguish instances using these clusters during testing. Our
method achieves state-of-the-art performance using both existing and the
proposed metrics. Moreover, we show that our new metric can evaluate the
performance of a task without being affected by any other condition.
Comment: The 4th Workshop on Geometry Meets Deep Learning (ICCV Workshop 2019).
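The "clusters per instance" objective behind such embedding methods is commonly a pull/push loss; a sketch under assumed margins d_pull and d_push (the paper's exact loss may differ):

```python
import numpy as np

def discriminative_loss(embeddings, labels, d_pull=0.1, d_push=1.0):
    """Pull embeddings toward their instance centroid; push centroids apart."""
    centroids = {int(l): embeddings[labels == l].mean(0)
                 for l in set(labels.tolist())}
    pull = np.mean([max(np.linalg.norm(e - centroids[int(l)]) - d_pull, 0.0) ** 2
                    for e, l in zip(embeddings, labels)])
    keys = sorted(centroids)
    push, pairs = 0.0, 0
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            gap = np.linalg.norm(centroids[keys[i]] - centroids[keys[j]])
            push += max(d_push - gap, 0.0) ** 2
            pairs += 1
    return pull + (push / pairs if pairs else 0.0)

emb = np.array([[0.0, 0.0], [0.0, 0.2], [3.0, 3.0], [3.0, 3.2]])
lab = np.array([0, 0, 1, 1])
loss = discriminative_loss(emb, lab)   # well-separated clusters -> near zero
```

Since the loss is computed per point and per centroid, memory scales with the number of input points, consistent with the O(Np) space complexity claimed above.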
Learning Gaussian Instance Segmentation in Point Clouds
This paper presents a novel method for instance segmentation of 3D point
clouds. The proposed method is called Gaussian Instance Center Network (GICN),
which can approximate the distributions of instance centers scattered in the
whole scene as Gaussian center heatmaps. Based on the predicted heatmaps, a
small number of center candidates can be easily selected for the subsequent
predictions with efficiency, including i) predicting the instance size of each
center to decide a range for extracting features, ii) generating bounding boxes
for centers, and iii) producing the final instance masks. GICN is a
single-stage, anchor-free, and end-to-end architecture that is easy to train
and efficient at inference. Benefiting from the center-dictated
mechanism with adaptive instance size selection, our method achieves
state-of-the-art performance in the task of 3D instance segmentation on ScanNet
and S3DIS datasets.
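The Gaussian center-heatmap idea can be sketched by rendering ground-truth centres as peaks (sigma and grid size are hypothetical; GICN predicts such heatmaps from the point cloud rather than rendering them):

```python
import numpy as np

def center_heatmap(centers, shape, sigma=1.0):
    """Render instance centres as Gaussian peaks (max over instances)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    heat = np.zeros(shape)
    for cy, cx in centers:
        g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
        heat = np.maximum(heat, g)
    return heat

heat = center_heatmap([(2, 2), (6, 6)], (9, 9))
peaks = np.argwhere(heat > 0.99)   # candidate centres for later predictions
```

Thresholding the heatmap yields the small set of center candidates from which instance size, boxes, and masks are then predicted.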