User-Centric Learning and Evaluation of Interactive Segmentation Systems

Author: A. Blake
A. Sorokin
B. C. Russell
B. Taskar
C. Rother
Carsten Rother
Christoph Rhemann
D. Batra
D. Singaraju
E. N. Mortensen
H. Nickisch
Hannes Nickisch
I. Tsochantaridis
J. Liu
K. McGuinness
L. von Ahn
L. Grady
L. Wasserman
M. Szummer
O. Duchenne
P. Kohli
Pushmeet Kohli
R. Szeliski
S. Nowozin
S. Vicente
S. Vijayanarasimhan
T. Finley
V. Gulshan
X. Bai
Y. Boykov
Y. Li
Publication venue: 'Springer Science and Business Media LLC'

Learning to Singulate Objects using a Push Proposal Network

Author: Wolfram Burgard
Andreas Eitel
Nico Hauff
Publication date: 05/02/2018

Learning to act in unstructured environments, such as cluttered piles of objects, poses a substantial challenge for manipulation robots. We present a novel neural network-based approach that separates unknown objects in clutter by selecting favourable push actions. Our network is trained from data collected through autonomous interaction of a PR2 robot with randomly organized tabletop scenes. The model is designed to propose meaningful push actions based on over-segmented RGB-D images. We evaluate our approach by singulating up to 8 unknown objects in clutter. We demonstrate that our method enables the robot to perform the task with a high success rate and a low number of required push actions. Our results based on real-world experiments show that our network is able to generalize to novel objects of various sizes and shapes, as well as to arbitrary object configurations. Videos of our experiments can be viewed at http://robotpush.cs.uni-freiburg.de

Comment: International Symposium on Robotics Research (ISRR) 2017, videos: http://robotpush.cs.uni-freiburg.d

arXiv.org e-Print Archive

Crossref

Click Carving: Segmenting Objects in Video with Point Clicks

Author: Kristen Grauman
Suyog Dutt Jain
Publication date: 05/07/2016

We present a novel form of interactive video object segmentation where a few clicks by the user help the system produce a full spatio-temporal segmentation of the object of interest. Whereas conventional interactive pipelines take the user's initialization as a starting point, we show the value in the system taking the lead even in initialization. In particular, for a given video frame, the system precomputes a ranked list of thousands of possible segmentation hypotheses (also referred to as object region proposals) using image and motion cues. Then, the user looks at the top-ranked proposals and clicks on the object boundary to carve away erroneous ones. This process iterates (typically 2-3 times), and each time the system revises the top-ranked proposal set, until the user is satisfied with the resulting segmentation mask. Finally, the mask is propagated across the video to produce a spatio-temporal object tube. On three challenging datasets, we provide extensive comparisons with both existing work and simpler alternative methods. In all, the proposed Click Carving approach strikes an excellent balance of accuracy and human effort. It outperforms all similarly fast methods, and is competitive with or better than those requiring 2 to 12 times the effort.

Comment: A preliminary version of the material in this document was filed as University of Texas technical report no. UT AI16-0
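The carving loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes proposals are plain sets of (row, column) pixels, and it uses a hypothetical consistency rule in which a proposal survives only if its boundary passes within `tol` pixels of every click. The names `boundary`, `near_boundary`, `carve`, and `rect` are invented for this illustration.

```python
from typing import Iterable, List, Set, Tuple

Pixel = Tuple[int, int]


def boundary(mask: Set[Pixel]) -> Set[Pixel]:
    """Pixels of the mask that have at least one 4-neighbour outside it."""
    edge = set()
    for r, c in mask:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) not in mask:
                edge.add((r, c))
                break
    return edge


def near_boundary(mask: Set[Pixel], click: Pixel, tol: int = 1) -> bool:
    """True if some boundary pixel lies within tol pixels of the click."""
    cr, cc = click
    return any(abs(r - cr) <= tol and abs(c - cc) <= tol
               for r, c in boundary(mask))


def carve(ranked: List[Set[Pixel]], clicks: Iterable[Pixel],
          tol: int = 1) -> List[Set[Pixel]]:
    """Keep, in rank order, the proposals consistent with every click.

    Assumption: filtering by boundary proximity stands in for whatever
    re-ranking the actual system performs.
    """
    clicks = list(clicks)
    return [m for m in ranked
            if all(near_boundary(m, p, tol) for p in clicks)]


def rect(r0: int, c0: int, r1: int, c1: int) -> Set[Pixel]:
    """Hypothetical helper: a filled rectangle [r0, r1) x [c0, c1)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Three ranked proposals; two boundary clicks carve away the wrong ones.
proposals = [rect(0, 0, 10, 10), rect(2, 2, 6, 6), rect(2, 2, 8, 8)]
survivors = carve(proposals, clicks=[(2, 4), (7, 4)], tol=0)
```

With these toy rectangles, the first click alone leaves two candidates and the second click narrows them to one, which mirrors the iterate-until-satisfied interaction in miniature.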

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Crowdsourcing in Computer Vision

Author: Fei-Fei Li
Kristen Grauman
Adriana Kovashka
Olga Russakovsky
Publication venue: 'Now Publishers'
Publication date: 01/01/2016

Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision.

Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

arXiv.org e-Print Archive

Crossref