Semantic Video CNNs through Representation Warping

Authors: Gadde Raghudeep; Gehler Peter V.; Jampani Varun
Publication date: 01/01/2017

In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures with very little extra computational cost. This module is called NetWarp, and we demonstrate its use for a range of network architectures. The main design principle is to use the optical flow of adjacent frames to warp internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that the proposed approach incurs only a small extra computational cost while improving performance when video streams are available. We achieve new state-of-the-art results on the CamVid and Cityscapes benchmark datasets and show consistent improvements over different baseline networks. Our code and models will be available at http://segmentation.is.tue.mpg.de
Comment: ICCV 201
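The warping idea in this abstract can be illustrated with a small sketch. This is not the authors' NetWarp module: it is a minimal, pure-Python illustration of backward-warping one feature channel with a given flow field using bilinear sampling. The function names `warp_features` and `combine`, the flow convention, and the fixed blending weights are all assumptions made for this example (in a trained network such weights would more plausibly be learned).

```python
import math


def warp_features(prev_feat, flow):
    """Backward-warp one feature channel of the previous frame into the current frame.

    prev_feat: H x W nested list of floats (previous frame's features).
    flow: H x W nested list of (dx, dy) pairs. Assumed convention: the value for
          current pixel (x, y) is sampled from (x + dx, y + dy) in prev_feat.
    """
    h, w = len(prev_feat), len(prev_feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position to the borders of the feature map.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            ax, ay = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring values.
            top = (1 - ax) * prev_feat[y0][x0] + ax * prev_feat[y0][x1]
            bottom = (1 - ax) * prev_feat[y1][x0] + ax * prev_feat[y1][x1]
            out[y][x] = (1 - ay) * top + ay * bottom
    return out


def combine(curr_feat, warped_prev, w_curr=0.5, w_prev=0.5):
    """Blend current features with warped previous features (fixed weights, assumed)."""
    return [
        [w_curr * c + w_prev * p for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(curr_feat, warped_prev)
    ]
```

With a zero flow field the warp returns the previous features unchanged, which is a quick sanity check before plugging in real flow estimates.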

arXiv.org e-Print Archive

MPG.PuRe

Deep Extreme Cut: From Extreme Points to Object Segmentation

Authors: Caelles Sergi; Maninis Kevis-Kokitsi; Pont-Tuset Jordi; Van Gool Luc
Publication date: 27/03/2018

This paper explores the use of extreme points of an object (left-most, right-most, top, and bottom pixels) as input to obtain precise object segmentation for images and videos. We do so by adding an extra channel to the image at the input of a convolutional neural network (CNN); this channel contains a Gaussian centered on each of the extreme points. The CNN learns to transform this information into a segmentation of an object that matches those extreme points. We demonstrate the usefulness of this approach for guided segmentation (grabcut-style), interactive segmentation, video object segmentation, and dense segmentation annotation. We show that we obtain the most precise results to date, also with less user input, in an extensive and varied selection of benchmarks and datasets. All our models and code are publicly available at http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr/.
Comment: CVPR 2018 camera ready. Project webpage and code: http://www.vision.ee.ethz.ch/~cvlsegmentation/dextr
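The extra input channel described in this abstract can be sketched as follows. This is an illustration only, not the released DEXTR code: the function names, the value of `sigma`, and the choice to merge the four Gaussians with a per-pixel maximum are assumptions made for the example.

```python
import math


def extreme_point_heatmap(height, width, points, sigma=10.0):
    """Build an H x W heatmap with a Gaussian bump centered on each (x, y) point.

    Overlapping bumps are merged with a per-pixel maximum (an assumption).
    """
    heat = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best = 0.0
            for px, py in points:
                d2 = (x - px) ** 2 + (y - py) ** 2
                best = max(best, math.exp(-d2 / (2.0 * sigma ** 2)))
            heat[y][x] = best
    return heat


def add_channel(image, heat):
    """Append the heatmap as a fourth channel to an H x W image of (r, g, b) tuples."""
    return [
        [tuple(pixel) + (value,) for pixel, value in zip(img_row, heat_row)]
        for img_row, heat_row in zip(image, heat)
    ]
```

The resulting four-channel array would then be what a segmentation network receives in place of the plain RGB image.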

arXiv.org e-Print Archive

Crossref

Lucid Data Dreaming for Video Object Segmentation

Authors: Benenson Rodrigo; Brox Thomas; Ilg Eddy; Khoreva Anna; Schiele Bernt
Publication date: 01/01/2019

Convolutional networks reach top quality in pixel-level video object segmentation but require a large amount of training data (1k~100k) to deliver such results. We propose a new training strategy which achieves state-of-the-art results across three evaluation datasets while using 20x~1000x less annotated data than competing methods. Our approach is suitable for both single and multiple object segmentation. Instead of using large training sets in the hope of generalizing across domains, we generate in-domain training data using the provided annotation on the first frame of each video to synthesize ("lucid dream") plausible future video frames. In-domain per-video training data allows us to train high-quality appearance- and motion-based models, as well as tune the post-processing stage. This approach reaches competitive results even when training from only a single annotated frame, without ImageNet pre-training. Our results indicate that using a larger training set is not automatically better, and that for the video object segmentation task a smaller training set that is closer to the target domain is more effective. This changes the mindset regarding how many training samples and how much general "objectness" knowledge are required for the video object segmentation task.
Comment: Accepted in International Journal of Computer Vision (IJCV
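The idea of synthesizing in-domain training pairs from a single annotated frame can be illustrated with a deliberately simplified sketch. It is not the authors' pipeline: here the annotated object is only translated by a random offset, and the hole it leaves is filled with a constant value. The function name `synthesize_pair`, the `max_shift` and `fill` parameters, and the translation-only transformation are all assumptions for this example.

```python
import random


def synthesize_pair(frame, mask, max_shift=2, fill=0, rng=None):
    """Create a new (frame, mask) pair by translating the annotated object.

    frame: H x W nested list of pixel values (first video frame).
    mask:  H x W nested list of 0/1 labels (first-frame annotation).
    """
    rng = rng or random.Random()
    h, w = len(frame), len(frame[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    # Background: the original frame with the object region replaced by `fill`.
    new_frame = [
        [fill if mask[y][x] else frame[y][x] for x in range(w)] for y in range(h)
    ]
    new_mask = [[0] * w for _ in range(h)]
    # Paste the object at its shifted location; pixels leaving the frame are dropped.
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ty, tx = y + dy, x + dx
                if 0 <= ty < h and 0 <= tx < w:
                    new_frame[ty][tx] = frame[y][x]
                    new_mask[ty][tx] = 1
    return new_frame, new_mask
```

Calling the function repeatedly with different random offsets yields a small per-video training set in which every synthetic frame comes with an exact mask.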

arXiv.org e-Print Archive

MPG.PuRe

Piano Crossing - Walking on a Keyboard

Authors: Batagelj Borut; Kverh Bojan; Lipanje Matevž; Solina Franc
Publication venue: Faculty of Graphic Arts, University of Zagreb
Publication date: 01/01/2010

Piano Crossing is an interactive art installation which turns a pedestrian crossing marked with white stripes into a piano keyboard so that pedestrians can generate music by walking over it. Matching tones are generated when a pedestrian is over a particular stripe or key. A digital camera is directed at the crossing from above. A special computer vision application was developed that maps the stripes of the pedestrian crossing to piano keys and detects, at any given moment, which key lies under the center of gravity of each pedestrian in the image. Special black stripes are added to the crossing to represent the black piano keys. The application consists of two parts: (1) initialization, where the model of the abstract piano keyboard is mapped to the image of the pedestrian crossing, and (2) the detection of pedestrians on the crossing so that musical tones can be generated according to their locations. The art installation Piano Crossing was presented to the public for the first time during the 51st Jazz Festival in Ljubljana in July 2010.
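The detection step described in this abstract (finding the key under each pedestrian's center of gravity) can be sketched as follows. This is an assumed simplification, not the installation's software: each key is modeled as an axis-aligned rectangle in image coordinates, and a pedestrian is given as a list of foreground pixel coordinates. The names `centroid` and `key_under` and the rectangle layout are hypothetical.

```python
def centroid(pixels):
    """Return the center of gravity (x, y) of a non-empty list of (x, y) pixels."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def key_under(point, keys):
    """Return the name of the key whose rectangle contains the point, else None.

    keys: list of (name, (x0, y0, x1, y1)) with x0 <= x < x1 and y0 <= y < y1.
    """
    x, y = point
    for name, (x0, y0, x1, y1) in keys:
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


# Hypothetical two-key layout and one detected pedestrian blob.
keys = [("C", (0, 0, 20, 10)), ("D", (20, 0, 40, 10))]
pedestrian = [(10, 5), (12, 5), (11, 7)]
note = key_under(centroid(pedestrian), keys)
```

In a real setup the rectangles would come from the initialization step that maps the keyboard model onto the camera image, and a perspective-distorted crossing would call for polygons rather than rectangles.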

Directory of Open Access Journals

ePrints.FRI

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia