669 research outputs found
Exploiting surroundedness for saliency detection: a boolean map approach
We demonstrate the usefulness of surroundedness for eye fixation prediction by proposing a Boolean Map based Saliency model (BMS). In our formulation, an image is characterized by a set of binary images, which are generated by randomly thresholding the image's feature maps in a whitened feature space. Based on a Gestalt principle of figure-ground segregation, BMS computes a saliency map by discovering surrounded regions via topological analysis of Boolean maps. Furthermore, we draw a connection between BMS and the Minimum Barrier Distance to provide insight into why and how BMS can properly captures the surroundedness cue via Boolean maps. The strength of BMS is verified by its simplicity, efficiency and superior performance compared with 10 state-of-the-art methods on seven eye tracking benchmark datasets.US National Science Foundation; 1059218; 1029430http://cs-people.bu.edu/jmzhang/BMS/BMS_iccv13_preprint.pdfAccepted manuscrip
Lucid Data Dreaming for Video Object Segmentation
Convolutional networks reach top quality in pixel-level video object
segmentation but require a large amount of training data (1k~100k) to deliver
such results. We propose a new training strategy which achieves
state-of-the-art results across three evaluation datasets while using 20x~1000x
less annotated data than competing methods. Our approach is suitable for both
single and multiple object segmentation. Instead of using large training sets
hoping to generalize across domains, we generate in-domain training data using
the provided annotation on the first frame of each video to synthesize ("lucid
dream") plausible future video frames. In-domain per-video training data allows
us to train high quality appearance- and motion-based models, as well as tune
the post-processing stage. This approach allows to reach competitive results
even when training from only a single annotated frame, without ImageNet
pre-training. Our results indicate that using a larger training set is not
automatically better, and that for the video object segmentation task a smaller
training set that is closer to the target domain is more effective. This
changes the mindset regarding how many training samples and general
"objectness" knowledge are required for the video object segmentation task.Comment: Accepted in International Journal of Computer Vision (IJCV
- …