99 research outputs found

    BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks

    We present a simple and effective framework for simultaneous semantic segmentation and instance segmentation with Fully Convolutional Networks (FCNs). The method, called BiSeg, predicts instance segmentation as a posterior in Bayesian inference, where semantic segmentation is used as a prior. We extend the idea of position-sensitive score maps used in recent methods to a fusion of multiple score maps at different scales and partition modes, and adopt it as a robust likelihood for instance segmentation inference. As both Bayesian inference and map fusion are performed per pixel, BiSeg is a fully convolutional end-to-end solution that inherits all the advantages of FCNs. We demonstrate state-of-the-art instance segmentation accuracy on PASCAL VOC. Comment: BMVC 2017.
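    The per-pixel Bayesian fusion described above can be illustrated with a short sketch: the semantic-segmentation probabilities act as the prior and the fused position-sensitive score maps act as the likelihood, so the (unnormalized) instance posterior is their elementwise product. This is not the authors' code; tensor shapes and names are assumptions for illustration.

```python
# Minimal sketch (not the authors' code): per-pixel Bayesian fusion where a
# semantic-segmentation probability map is the prior and a fused
# position-sensitive instance score map is the likelihood.
import torch

def fuse_instance_posterior(semantic_prior: torch.Tensor,
                            instance_likelihood: torch.Tensor) -> torch.Tensor:
    """semantic_prior:      (C, H, W) per-pixel class probabilities (prior)
    instance_likelihood: (C, H, W) per-pixel instance scores (likelihood)
    returns:             (C, H, W) per-pixel posterior, normalized over classes
    """
    posterior = semantic_prior * instance_likelihood                 # elementwise Bayes rule (unnormalized)
    posterior = posterior / posterior.sum(dim=0, keepdim=True).clamp_min(1e-8)  # normalize per pixel
    return posterior
```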

    Deeply Learning the Messages in Message Passing Inference

    Deep structured output learning shows great promise in tasks like semantic image segmentation. We propose a new, efficient deep structured model learning scheme, in which we show how deep Convolutional Neural Networks (CNNs) can be used to estimate the messages in message passing inference for structured prediction with Conditional Random Fields (CRFs). With such CNN message estimators, we obviate the need to learn or evaluate potential functions for message calculation. This confers significant efficiency for learning, since otherwise, when performing structured learning for a CRF with CNN potentials, it is necessary to undertake expensive inference for every stochastic gradient iteration. The network output dimension for message estimation is the same as the number of classes, in contrast to the network output for general CNN potential functions in CRFs, which is exponential in the order of the potentials. Hence CNN message learning has fewer network parameters and is more scalable in cases where a large number of classes are involved. We apply our method to semantic image segmentation on the PASCAL VOC 2012 dataset. We achieve an intersection-over-union score of 73.4 on its test set, which is the best reported result for methods using the VOC training images alone. This impressive performance demonstrates the effectiveness and usefulness of our CNN message learning method. Comment: 11 pages. Appearing in Proc. the Twenty-Ninth Annual Conference on Neural Information Processing Systems (NIPS), 2015, Montreal, Canada.
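    The scaling argument in the abstract (message outputs grow with the number of classes K, whereas general potential tables grow exponentially in the order of the potential) can be seen in a minimal, assumed sketch of a CNN message-estimator head; the layer widths and names below are illustrative, not taken from the paper.

```python
# Minimal sketch (assumed, not the paper's implementation): a CNN "message
# estimator" head whose output has one channel per class, illustrating why
# message learning scales with num_classes rather than with
# num_classes ** order as a general potential table would.
import torch.nn as nn

num_classes = 21      # e.g. PASCAL VOC 2012 (20 object classes + background)
feat_channels = 256   # assumed backbone feature width

message_head = nn.Sequential(
    nn.Conv2d(feat_channels, feat_channels, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    # K outputs per pixel: the estimated message into the per-pixel label distribution
    nn.Conv2d(feat_channels, num_classes, kernel_size=1),
)

# By contrast, a head predicting a full pairwise potential table would need
# num_classes ** 2 outputs per pixel pair (exponential in the potential's order).
```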

    ParseNet: Looking Wider to See Better

    We present a technique for adding global context to deep convolutional networks for semantic segmentation. The approach is simple, using the average feature for a layer to augment the features at each location. In addition, we study several idiosyncrasies of training, significantly increasing the performance of baseline networks (e.g. from FCN). When we add our proposed global feature and a technique for learning normalization parameters, accuracy increases consistently, even over our improved versions of the baselines. Our proposed approach, ParseNet, achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over baselines, and near current state-of-the-art performance on PASCAL VOC 2012 semantic segmentation with a simple approach. Code is available at https://github.com/weiliu89/caffe/tree/fcn. Comment: ICLR 2016 submission.
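    A minimal sketch of the global-context idea described above, under the commonly described ParseNet-style design (global average pooling, L2 normalization with learnable per-channel scales, then broadcasting the global feature to every location and concatenating); the module structure, names, and initial scale value are assumptions, not the released Caffe code.

```python
# Minimal sketch (assumptions, not the released code): ParseNet-style global
# context fusion with learned normalization scales.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalContextFusion(nn.Module):
    def __init__(self, channels: int, init_scale: float = 10.0):
        super().__init__()
        # learnable per-channel scales applied after L2 normalization
        self.local_scale = nn.Parameter(torch.full((channels,), init_scale))
        self.global_scale = nn.Parameter(torch.full((channels,), init_scale))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (N, C, H, W)
        n, c, h, w = x.shape
        g = F.adaptive_avg_pool2d(x, 1)                                     # (N, C, 1, 1) global average feature
        local_n = F.normalize(x, dim=1) * self.local_scale.view(1, c, 1, 1)   # L2-normalize + scale local features
        global_n = F.normalize(g, dim=1) * self.global_scale.view(1, c, 1, 1) # L2-normalize + scale global feature
        global_n = global_n.expand(n, c, h, w)                              # "unpool" global feature to every location
        return torch.cat([local_n, global_n], dim=1)                        # (N, 2C, H, W) augmented features
```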