Pixelwise Instance Segmentation with a Dynamically Instantiated Network
Semantic segmentation and object detection research have recently achieved
rapid progress. However, the former task has no notion of different instances
of the same object, and the latter operates at a coarse, bounding-box level. We
propose an Instance Segmentation system that produces a segmentation map where
each pixel is assigned an object class and instance identity label. Most
approaches adapt object detectors to produce segments instead of boxes. In
contrast, our method is based on an initial semantic segmentation module, which
feeds into an instance subnetwork. This subnetwork uses the initial
category-level segmentation, along with cues from the output of an object
detector, within an end-to-end CRF to predict instances. This part of our model
is dynamically instantiated to produce a variable number of instances per
image. Our end-to-end approach requires no post-processing and considers the
image holistically, instead of processing independent proposals. Therefore,
unlike some related work, a pixel cannot belong to multiple instances.
Furthermore, far more precise segmentations are achieved, as shown by our
state-of-the-art results (particularly at high IoU thresholds) on the Pascal
VOC and Cityscapes datasets.
Comment: CVPR 201
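The output format described above, one instance label per pixel, and the IoU measure used in the evaluation can be illustrated with a small sketch. This is not the authors' code: the helper names `instance_mask` and `mask_iou` and the toy label maps are assumptions made for illustration only.

```python
# Toy illustration: a label map stores exactly one instance id per pixel,
# so two instances can never overlap. Helper names are hypothetical.

def instance_mask(label_map, instance_id):
    """Return the set of (row, col) pixels assigned to one instance id."""
    return {
        (r, c)
        for r, row in enumerate(label_map)
        for c, value in enumerate(row)
        if value == instance_id
    }


def mask_iou(mask_a, mask_b):
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = mask_a | mask_b
    if not union:
        return 0.0
    return len(mask_a & mask_b) / len(union)


# 0 is background; 1 and 2 are two instances of the same class.
predicted = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
ground_truth = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
    [0, 0, 2, 2],
]

iou_1 = mask_iou(instance_mask(predicted, 1), instance_mask(ground_truth, 1))
iou_2 = mask_iou(instance_mask(predicted, 2), instance_mask(ground_truth, 2))
print(iou_1, iou_2)
```

A prediction counts as correct at a given IoU threshold only if its overlap with a ground-truth instance meets that threshold, which is why precise boundaries matter most at high thresholds.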
Holistic, Instance-Level Human Parsing
Object parsing -- the task of decomposing an object into its semantic parts
-- has traditionally been formulated as a category-level segmentation problem.
Consequently, when there are multiple objects in an image, current methods
cannot count the number of objects in the scene, nor can they determine which
part belongs to which object. We address this problem by segmenting the parts
of objects at the instance level, such that each pixel in the image is assigned
a part label, as well as the identity of the object it belongs to. Moreover, we
show how this approach benefits us in obtaining segmentations at coarser
granularities as well. Our proposed network is trained end-to-end given
detections, and begins with a category-level segmentation module. Thereafter, a
differentiable Conditional Random Field, defined over a variable number of
instances for every input image, reasons about the identity of each part by
associating it with a human detection. In contrast to other approaches, our
method can handle the varying number of people in each image and our holistic
network produces state-of-the-art results in instance-level part and human
segmentation, together with competitive results in category-level part
segmentation, all achieved by a single forward pass through our neural network.
Comment: Poster at BMVC 201
Discovering Class-Specific Pixels for Weakly-Supervised Semantic Segmentation
We propose an approach to discover class-specific pixels for the
weakly-supervised semantic segmentation task. We show that properly combining
saliency and attention maps allows us to obtain reliable cues capable of
significantly boosting the performance. First, we propose a simple yet powerful
hierarchical approach to discover the class-agnostic salient regions, obtained
using a salient object detector, which otherwise would be ignored. Second, we
use fully convolutional attention maps to reliably localize the class-specific
regions in a given image. We combine these two cues to discover class-specific
pixels which are then used as an approximate ground truth for training a CNN.
While solving the weakly supervised semantic segmentation task, we ensure that
the image-level classification task is also solved in order to force the CNN
to assign at least one pixel to each object present in the image.
Experimentally, on the PASCAL VOC12 val and test sets, we obtain mIoU of
60.8% and 61.9%, gains of 5.1% and 5.2% over the published
state-of-the-art results. The code is made publicly available.
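The idea of merging a class-agnostic saliency map with class-specific attention maps into approximate ground truth can be sketched as follows. The fusion rule, the thresholds and the name `fuse_cues` are assumptions chosen for illustration; the abstract does not state the exact combination used in the paper.

```python
# Hypothetical fusion rule (an assumption, not the paper's method):
# a pixel receives the class with the strongest attention if either the
# attention is confident or the pixel is salient; otherwise it is background.

BACKGROUND = 0


def fuse_cues(attention, saliency, att_thresh=0.5, sal_thresh=0.5):
    """attention: dict class_id -> 2D list of scores; saliency: 2D list of scores."""
    height, width = len(saliency), len(saliency[0])
    labels = [[BACKGROUND] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Strongest class-specific response at this pixel.
            best_class, best_score = max(
                ((cls, amap[r][c]) for cls, amap in attention.items()),
                key=lambda pair: pair[1],
            )
            if best_score >= att_thresh or saliency[r][c] >= sal_thresh:
                labels[r][c] = best_class
    return labels


# One-row toy image with two classes present (ids 1 and 2).
attention = {
    1: [[0.9, 0.2, 0.1]],
    2: [[0.1, 0.3, 0.1]],
}
saliency = [[0.2, 0.8, 0.1]]

labels = fuse_cues(attention, saliency)
print(labels)
```

In this toy case the middle pixel has weak attention but is salient, so saliency recovers it for the class with the strongest response there; this is the kind of region that attention alone would miss.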
Alpha MAML: Adaptive Model-Agnostic Meta-Learning
Model-agnostic meta-learning (MAML) is a meta-learning technique to train a
model on a multitude of learning tasks in a way that primes the model for
few-shot learning of new tasks. The MAML algorithm performs well on few-shot
learning problems in classification, regression, and fine-tuning of policy
gradients in reinforcement learning, but comes with the need for costly
hyperparameter tuning for training stability. We address this shortcoming by
introducing an extension to MAML, called Alpha MAML, to incorporate an online
hyperparameter adaptation scheme that eliminates the need to tune meta-learning
and learning rates. Our results with the Omniglot database demonstrate a
substantial reduction in the need to tune MAML training hyperparameters and
improvement to training stability with less sensitivity to hyperparameter
choice.
Comment: 6th ICML Workshop on Automated Machine Learning (2019)
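What online adaptation of a learning rate can look like is sketched below on a one-dimensional quadratic. The sketch uses hypergradient descent, a known scheme that updates the step size from the product of consecutive gradients; whether Alpha MAML uses exactly this rule is not stated in the abstract, and the function names, the toy objective and the constants are assumptions.

```python
# Hypergradient-style step-size adaptation on a toy problem (an assumed
# stand-in for the online scheme; not the Alpha MAML implementation).

def hypergradient_descent(grad_fn, w, alpha, beta, steps):
    """Gradient descent whose learning rate alpha is itself adapted online.

    beta is the learning rate of the learning rate; beta = 0 gives plain
    gradient descent with a fixed alpha.
    """
    prev_grad = 0.0
    for _ in range(steps):
        grad = grad_fn(w)
        # d loss / d alpha = -grad * prev_grad, so descend on alpha:
        alpha += beta * grad * prev_grad
        w -= alpha * grad
        prev_grad = grad
    return w, alpha


def quadratic_grad(w):
    """Gradient of 0.5 * (w - 3)^2, minimised at w = 3."""
    return w - 3.0


# Same deliberately small initial learning rate in both runs.
w_fixed, alpha_fixed = hypergradient_descent(quadratic_grad, 0.0, 0.01, 0.0, 200)
w_adapt, alpha_adapt = hypergradient_descent(quadratic_grad, 0.0, 0.01, 0.001, 200)

print(abs(w_fixed - 3.0), abs(w_adapt - 3.0), alpha_adapt)
```

With a poorly chosen initial learning rate, the fixed-rate run is still far from the minimum after 200 steps, while the adaptive run increases its own step size and converges. This is the sense in which such a scheme reduces sensitivity to the initial hyperparameter choice.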
Straight to Shapes: Real-time Detection of Encoded Shapes
Current object detection approaches predict bounding boxes, but these provide
little instance-specific information beyond location, scale and aspect ratio.
In this work, we propose to directly regress to objects' shapes in addition to
their bounding boxes and categories. It is crucial to find an appropriate shape
representation that is compact and decodable, and in which objects can be
compared for higher-order concepts such as view similarity, pose variation and
occlusion. To achieve this, we use a denoising convolutional auto-encoder to
establish an embedding space, and place the decoder after a fast end-to-end
network trained to regress directly to the encoded shape vectors. This yields
what to the best of our knowledge is the first real-time shape prediction
network, running at ~35 FPS on a high-end desktop. With higher-order shape
reasoning well-integrated into the network pipeline, the network shows the
useful practical quality of generalising to unseen categories similar to the
ones in the training set, something that most existing approaches fail to
handle.
Comment: 16 pages including appendix; published at CVPR 201
Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference
We propose a Branch-and-Cut (B&C) method for solving general MAP-MRF
inference problems. The core of our method is a very efficient bounding
procedure, which combines scalable semidefinite programming (SDP) and a
cutting-plane method for seeking violated constraints. In order to further
speed up the computation, several strategies have been exploited, including
model reduction, warm start and removal of inactive constraints.
We analyze the performance of the proposed method under different settings,
and demonstrate that our method either outperforms or performs on par with
state-of-the-art approaches. Especially when the connectivities are dense or
when the relative magnitudes of the unary costs are low, we achieve the best
reported results. Experiments show that the proposed algorithm achieves better
approximation than the state-of-the-art methods within a variety of time
budgets on challenging non-submodular MAP-MRF inference problems.
Comment: 21 pages
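The branch-and-bound skeleton behind such a method can be sketched on a tiny MRF. The sketch replaces the SDP and cutting-plane bound of the paper with a much weaker bound (each unassigned term is replaced by its minimum), so it shows only the search structure. All names and the toy costs are assumptions.

```python
# Minimal branch-and-bound for MAP-MRF energy minimisation. The bound is a
# simple term-wise minimum, NOT the SDP bound proposed in the paper.
from itertools import product


def energy(labels, unary, pairwise):
    """Total energy of a full labelling."""
    total = sum(unary[i][labels[i]] for i in range(len(labels)))
    total += sum(cost[labels[i]][labels[j]] for (i, j), cost in pairwise.items())
    return total


def lower_bound(partial, unary, pairwise):
    """Exact cost of assigned terms plus the minimum of every other term."""
    k = len(partial)
    bound = sum(unary[i][partial[i]] for i in range(k))
    bound += sum(min(unary[i]) for i in range(k, len(unary)))
    for (i, j), cost in pairwise.items():
        if i < k and j < k:
            bound += cost[partial[i]][partial[j]]
        elif i < k:
            bound += min(cost[partial[i]])
        elif j < k:
            bound += min(row[partial[j]] for row in cost)
        else:
            bound += min(min(row) for row in cost)
    return bound


def branch_and_bound(unary, pairwise):
    """Depth-first search over labellings, pruning with lower_bound."""
    n, num_labels = len(unary), len(unary[0])
    best = {"energy": float("inf"), "labels": None}

    def visit(partial):
        if lower_bound(partial, unary, pairwise) >= best["energy"]:
            return  # this subtree cannot improve on the incumbent
        if len(partial) == n:
            best["energy"] = energy(partial, unary, pairwise)
            best["labels"] = list(partial)
            return
        for label in range(num_labels):
            visit(partial + [label])

    visit([])
    return best["labels"], best["energy"]


# Three binary variables; edge (0, 1) penalises agreement (non-submodular).
unary = [[0, 2], [1, 0], [2, 0]]
pairwise = {
    (0, 1): [[3, 0], [0, 3]],
    (1, 2): [[0, 2], [2, 0]],
    (0, 2): [[1, 0], [0, 1]],
}

best_labels, best_energy = branch_and_bound(unary, pairwise)
brute_force = min(energy(list(l), unary, pairwise) for l in product(range(2), repeat=3))
print(best_labels, best_energy, brute_force)
```

The tighter the bound, the more subtrees are pruned, which is why the quality of the bounding procedure dominates the running time of this kind of search.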