2,703 research outputs found
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.
Recursive Training of 2D-3D Convolutional Networks for Neuronal Boundary Detection
Efforts to automate the reconstruction of neural circuits from 3D electron
microscopic (EM) brain images are critical for the field of connectomics. An
important computation for reconstruction is the detection of neuronal
boundaries. Images acquired by serial section EM, a leading 3D EM technique,
are highly anisotropic, with inferior quality along the third dimension. For
such images, the 2D max-pooling convolutional network has set the standard for
performance at boundary detection. Here we achieve a substantial gain in
accuracy through three innovations. First, following the trend towards deeper networks
for object recognition, we use a much deeper network than previously employed
for boundary detection. Second, we incorporate 3D as well as 2D filters, to
enable computations that use 3D context. Finally, we adopt a recursively
trained architecture in which a first network generates a preliminary boundary
map that is provided as input along with the original image to a second network
that generates a final boundary map. Backpropagation training is accelerated by
ZNN, a new implementation of 3D convolutional networks that uses multicore CPU
parallelism for speed. Our hybrid 2D-3D architecture could be more generally
applicable to other types of anisotropic 3D images, including video, and our
recursive framework for any image labeling problem.
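
The recursive arrangement described in this abstract (a first network produces a preliminary boundary map, which is passed together with the original image to a second network) can be sketched as a data flow. This is a minimal illustration only: `recursive_boundary_map`, `toy_first_net`, and `toy_second_net` are hypothetical names, the two "networks" are trivial stand-ins rather than convolutional networks, and nothing here reflects ZNN or the authors' actual implementation.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]   # 2D grid of pixel intensities
Channels = List[Image]      # a stack of same-sized 2D grids

def recursive_boundary_map(
    image: Image,
    first_net: Callable[[Channels], Image],
    second_net: Callable[[Channels], Image],
) -> Tuple[Image, Image]:
    """Run two stages; the second sees the image plus the first stage's output."""
    preliminary = first_net([image])             # stage 1: image only
    final = second_net([image, preliminary])     # stage 2: image + preliminary map
    return preliminary, final

# Hypothetical stand-in for stage 1: horizontal intensity difference.
def toy_first_net(channels: Channels) -> Image:
    img = channels[0]
    return [
        [abs(row[min(x + 1, len(row) - 1)] - row[x]) for x in range(len(row))]
        for row in img
    ]

# Hypothetical stand-in for stage 2: per-pixel mean over all input channels.
def toy_second_net(channels: Channels) -> Image:
    height, width = len(channels[0]), len(channels[0][0])
    return [
        [sum(c[y][x] for c in channels) / len(channels) for x in range(width)]
        for y in range(height)
    ]

image = [[0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]
preliminary, final = recursive_boundary_map(image, toy_first_net, toy_second_net)
```

In a real system each stand-in would be a trained network; the point of the sketch is only that the second stage receives two input channels instead of one.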
Multi-stage Multi-recursive-input Fully Convolutional Networks for Neuronal Boundary Detection
In the field of connectomics, neuroscientists seek to identify cortical
connectivity comprehensively. Neuronal boundary detection from the Electron
Microscopy (EM) images is often done to assist the automatic reconstruction of
neuronal circuits. However, the segmentation of EM images is a challenging
problem, as it requires the detector to detect both thin, filament-like and
thick, blob-like membranes while suppressing ambiguous intracellular
structures. In this paper, we propose multi-stage multi-recursive-input fully
convolutional networks to address this problem. The multiple recursive inputs
for one stage, i.e., the multiple side outputs with different receptive field
sizes learned from the lower stage, provide multi-scale contextual boundary
information for the consecutive learning. This design is
biologically plausible, as it resembles the way the human visual system compares
different possible segmentation solutions to resolve ambiguous boundaries. Our
multi-stage networks are trained end-to-end. They achieve promising results on
two publicly available EM segmentation datasets, the mouse piriform cortex
dataset and the ISBI 2012 EM dataset. Comment: Accepted by ICCV201
Incremental Learning on Food Instance Segmentation
Food instance segmentation is essential to estimate the serving size of
dishes in a food image. The recent cutting-edge techniques for instance
segmentation are deep learning networks with impressive segmentation quality
and fast computation. Nonetheless, they are hungry for data and expensive for
annotation. This paper proposes an incremental learning framework to optimize
the model performance given a limited data labelling budget. The power of the
framework is a novel difficulty assessment model, which forecasts how
challenging an unlabelled sample is to the latest trained instance segmentation
model. The data collection procedure is divided into several stages, in each
of which a new sample package is collected. The framework allocates the labelling
budget to the most difficult samples. The unlabelled samples that meet a
certain qualification from the assessment model are used to generate
pseudo-labels. Eventually, the manual labels and pseudo-labels are sent to the
training data to improve the instance segmentation model. On four large-scale
food datasets, our proposed framework outperforms current incremental learning
benchmarks and achieves competitive performance with the model trained on fully
annotated samples.
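
One stage of the budget allocation described in this abstract might look like the sketch below. It rests on assumptions: the abstract does not say what "qualification" a sample must meet to receive a pseudo-label, so a low-difficulty threshold is assumed here, and `allocate_stage`, `budget`, and `pseudo_threshold` are hypothetical names. The difficulty scores stand in for the output of the difficulty assessment model, which is not reproduced.

```python
from typing import Dict, List, Tuple

def allocate_stage(
    difficulty: Dict[str, float],
    budget: int,
    pseudo_threshold: float,
) -> Tuple[List[str], List[str], List[str]]:
    """Split one stage's unlabelled samples into three groups.

    difficulty:       sample id -> predicted difficulty (higher = harder)
    budget:           number of samples that can be labelled manually
    pseudo_threshold: ASSUMED rule; samples at or below this difficulty
                      are considered safe to pseudo-label
    """
    # Hardest samples first, so the manual budget goes to them.
    ranked = sorted(difficulty, key=difficulty.get, reverse=True)
    manual = ranked[:budget]
    remaining = ranked[budget:]
    # Easy leftovers get pseudo-labels; the rest wait for a later stage.
    pseudo = [s for s in remaining if difficulty[s] <= pseudo_threshold]
    deferred = [s for s in remaining if difficulty[s] > pseudo_threshold]
    return manual, pseudo, deferred

scores = {"a": 0.9, "b": 0.1, "c": 0.6, "d": 0.3}
manual, pseudo, deferred = allocate_stage(scores, budget=1, pseudo_threshold=0.35)
```

With these example scores, the single manual label goes to the hardest sample, the two easiest samples are pseudo-labelled, and the mid-difficulty sample is left for a later stage.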