Efficient Yet Deep Convolutional Neural Networks for Semantic Segmentation
Semantic segmentation using deep convolutional neural networks poses a more
complex challenge for any GPU-intensive task. Because it has to compute millions of
parameters, it results in huge memory consumption. Moreover, extracting finer
features and conducting supervised training tends to increase the complexity.
Since the introduction of the Fully Convolutional Neural Network, which uses finer
strides and utilizes deconvolutional layers for upsampling, it has been a go-to
for any image segmentation task. In this paper, we propose two segmentation
architectures which not only need one-third of the parameters to compute but also
give better accuracy than similar architectures. The model weights were
transferred from popular neural networks such as VGG19 and VGG16, which were trained
on the ImageNet classification dataset. We then transform all the fully connected
layers into convolutional layers and use dilated convolution to decrease the
parameters. Lastly, we add finer strides and attach four skip architectures,
which are element-wise summed with the deconvolutional layers in steps. We
train and test on different sparse and fine datasets such as Pascal VOC2012,
Pascal-Context and NYUDv2 and show how much better our model performs in these tasks.
In addition, our model has a faster inference time and consumes less
memory for training and testing on NVIDIA Pascal GPUs, making it a more efficient
and less memory-consuming architecture for pixel-wise segmentation.
Comment: 8 page
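The abstract mentions dilated convolution as a way to keep the parameter count down. The following is a minimal one-dimensional sketch of that general idea, not the authors' implementation: the function names `dilated_conv1d` and `receptive_field` are hypothetical, and the example only illustrates that a kernel with the same number of weights covers a wider input span when its taps are spaced apart.

```python
def receptive_field(kernel_size, dilation):
    """Input span covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation - 1` gaps between taps.

    Illustrative only: real segmentation networks apply the same idea in 2D
    with learned weights.
    """
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Each kernel tap i reads the input at offset i * dilation.
        value = sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        outputs.append(value)
    return outputs


signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 1, 1]  # three weights in both cases below

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output sees 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output sees a span of 5
```

With dilation 2, the three-weight kernel spans five input positions, the same coverage a dense kernel would need five weights to reach.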
PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
Contexts play an important role in the saliency detection task. However,
given a context region, not all contextual information is helpful for the final
task. In this paper, we propose a novel pixel-wise contextual attention
network, i.e., the PiCANet, to learn to selectively attend to informative
context locations for each pixel. Specifically, for each pixel, it can generate
an attention map in which each attention weight corresponds to the contextual
relevance at each context location. An attended contextual feature can then be
constructed by selectively aggregating the contextual information. We formulate
the proposed PiCANet in both global and local forms to attend to global and
local contexts, respectively. Both models are fully differentiable and can be
embedded into CNNs for joint training. We also incorporate the proposed models
with the U-Net architecture to detect salient objects. Extensive experiments
show that the proposed PiCANets can consistently improve saliency detection
performance. The global and local PiCANets facilitate learning global contrast
and homogeneousness, respectively. As a result, our saliency model can detect
salient objects more accurately and uniformly, thus performing favorably
against the state-of-the-art methods.
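The per-pixel attention described in this abstract can be sketched for a single pixel as follows. This is a simplified illustration under stated assumptions, not the PiCANet code: it assumes the attention weights come from a softmax over raw relevance scores (the abstract does not say how they are normalized), and the names `softmax` and `attend` are hypothetical helpers.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(context_features, scores):
    """Aggregate context features for one pixel.

    context_features: one feature vector per context location.
    scores: one raw relevance score per context location (assumed input).
    Returns the attention weights and the weighted sum of the features.
    """
    weights = softmax(scores)
    dim = len(context_features[0])
    attended = [
        sum(w * feat[d] for w, feat in zip(weights, context_features))
        for d in range(dim)
    ]
    return weights, attended


# Two context locations with two-dimensional features.
features = [[1.0, 0.0], [3.0, 2.0]]
uniform_weights, uniform_mix = attend(features, [0.0, 0.0])
skewed_weights, skewed_mix = attend(features, [0.0, 5.0])
```

Equal scores reduce the aggregation to a plain average, while a higher score for the second location pulls the attended feature toward that location's vector.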
INSTA-BEEER: Explicit Error Estimation and Refinement for Fast and Accurate Unseen Object Instance Segmentation
Efficient and accurate segmentation of unseen objects is crucial for robotic
manipulation. However, it remains challenging due to over- or
under-segmentation. Although existing refinement methods can enhance the
segmentation quality, they fix only minor boundary errors or are not
sufficiently fast. In this work, we propose INSTAnce Boundary Explicit Error
Estimation and Refinement (INSTA-BEEER), a novel refinement model that allows
for adding and deleting instances and sharpening boundaries. Leveraging an
error-estimation-then-refinement scheme, the model first estimates the
pixel-wise boundary explicit errors: true positive, true negative, false
positive, and false negative pixels of the instance boundary in the initial
segmentation. It then refines the initial segmentation using these error
estimates as guidance. Experiments show that the proposed model significantly
enhances segmentation, achieving state-of-the-art performance. Furthermore,
with a fast runtime (less than 0.1 s), the model consistently improves
performance across various initial segmentation methods, making it highly
suitable for practical robotic applications.
Comment: 8 pages, 5 figure
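The four boundary error categories named in this abstract can be illustrated with a small sketch. It only shows how the categories are defined when a reference boundary is available; in the described method a model estimates them from the initial segmentation. The function name `boundary_error_map` and the binary-mask representation are assumptions made for this example.

```python
def boundary_error_map(predicted, reference):
    """Label every pixel of a predicted boundary mask against a reference mask.

    Both inputs are lists of rows holding 0 (no boundary) or 1 (boundary).
    Returns a grid of "TP", "TN", "FP" or "FN" strings of the same shape.
    """
    labels = {
        (1, 1): "TP",  # boundary predicted and present
        (0, 0): "TN",  # no boundary predicted, none present
        (1, 0): "FP",  # boundary predicted but absent (suggests over-segmentation)
        (0, 1): "FN",  # boundary missed (suggests under-segmentation)
    }
    return [
        [labels[(p, r)] for p, r in zip(pred_row, ref_row)]
        for pred_row, ref_row in zip(predicted, reference)
    ]


predicted = [[1, 0],
             [1, 0]]
reference = [[1, 1],
             [0, 0]]
errors = boundary_error_map(predicted, reference)
# errors == [["TP", "FN"], ["FP", "TN"]]
```

A refinement step could then treat "FP" pixels as boundaries to delete and "FN" pixels as boundaries to add.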