3 research outputs found

    Source-free Depth for Object Pop-out

    Full text link
    Depth cues are known to be useful for visual perception. However, direct measurement of depth is often impractical. Fortunately, modern learning-based methods offer promising depth maps by inference in the wild. In this work, we adapt such depth inference models for object segmentation using the objects' “pop-out” prior in 3D. The “pop-out” prior is a simple composition prior which assumes that objects reside on the background surface. Such a compositional prior allows us to reason about objects in 3D space. More specifically, we adapt the inferred depth maps so that objects can be localized using only 3D information. This separation, however, requires knowledge of the contact surface, which we learn using the weak supervision of the segmentation mask. Our intermediate representation of the contact surface, and thereby our reasoning about objects purely in 3D, allows us to better transfer depth knowledge into semantics. The proposed adaptation method uses only the depth model, without needing the source data used for training, which makes the learning process efficient and practical. Our experiments on eight datasets across two challenging tasks, namely camouflaged object detection and salient object detection, consistently demonstrate the benefit of our method in terms of both performance and generalizability.
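    To make the pop-out prior concrete, here is a minimal Python/NumPy sketch under a strong simplifying assumption: the contact surface is approximated by a single least-squares plane fit to the inferred depth map (the paper instead learns this surface with weak supervision), and pixels that sit closer to the camera than the fitted plane by more than a threshold are marked as object. The function name popout_segmentation and the threshold value are hypothetical.

    import numpy as np

    def popout_segmentation(depth, thresh=0.05):
        # Sketch of the pop-out prior: objects rest on a background
        # surface, so they appear closer to the camera than it does.
        # ASSUMPTION: the contact surface is a least-squares plane;
        # the paper learns it from weak segmentation supervision.
        h, w = depth.shape
        ys, xs = np.mgrid[0:h, 0:w]
        # Fit the plane z = a*x + b*y + c to the depth map.
        A = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=1)
        coeffs, *_ = np.linalg.lstsq(A, depth.ravel(), rcond=None)
        surface = (A @ coeffs).reshape(h, w)
        # Pixels closer than the fitted surface "pop out" as objects.
        return (surface - depth) > thresh

    Such a mask can be computed from any off-the-shelf monocular depth network's output, which is what makes the adaptation source-free: only the depth model's predictions are needed, never its training data.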

    Object Discovery with a Mobile Robot

    Get PDF
    The world is full of objects: cups, phones, computers, books, and countless other things. For many tasks, robots need to understand that this object is a stapler, that object is a textbook, and this other object is a gallon of milk. The classic approach to this problem is object recognition, which classifies each observation into one of several previously-defined classes. While modern object recognition algorithms perform well, they require extensive supervised training: in a standard benchmark, the training data average more than four hundred images of each object class.

    The cost of manually labeling the training data prohibits these techniques from scaling to general environments. Homes and workplaces can contain hundreds of unique objects, and the objects in one environment may not appear in another.

    We propose a different approach: object discovery. Rather than rely on manual labeling, we describe unsupervised algorithms that leverage the unique capabilities of a mobile robot to discover the objects (and classes of objects) in an environment. Because our algorithms are unsupervised, they scale gracefully to large, general environments over long periods of time. To validate our results, we collected 67 robotic runs through a large office environment. This dataset, which we have made available to the community, is the largest of its kind.

    At each step, we treat the problem as one of robotics, not disembodied computer vision. The scale and quality of our results demonstrate the merit of this perspective, and prove the practicality of long-term large-scale object discovery.

    Dissertation

    Image composition for object pop-out

    No full text
    We propose a new data-driven framework for novel object detection and segmentation, or “object pop-out”. Traditionally, this task is approached via background subtraction, which requires continuous observation from a stationary camera. Instead, we consider this an image matching problem. We detect novel objects in the scene using an unordered, sparse database of previously captured images of the same general environment. The problem is formulated in a new image composition framework: 1) given an input image, we find a small set of similar matching images; 2) each of the matches is aligned with the input by proposing a set of homography transformations; 3) regions from different transformed matches are stitched together into a single composite image that best matches the input; 4) the difference between the input and the composite is used to “pop out” new or changed objects.
    Figure 1. Given a single input image (a), we are able to “explain” it with bits and pieces of similar images taken previously (b), so as to generate a faithful representation of the input image (c) and detect the novel object (d).
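    The four-step pipeline can be sketched compactly in Python with OpenCV, assuming the matching images from step 1 have already been retrieved and that all images are grayscale uint8 arrays of equal size. ORB features with RANSAC homography estimation stand in for the paper's alignment procedure, and a greedy per-pixel composite stands in for its stitching step; the function names and the threshold are hypothetical.

    import cv2
    import numpy as np

    def align_match(input_img, match_img):
        # Step 2: align one database match to the input via a homography.
        # ASSUMPTION: ORB + RANSAC; the paper proposes a set of
        # homography transformations per matching image.
        orb = cv2.ORB_create(2000)
        kp_in, des_in = orb.detectAndCompute(input_img, None)
        kp_m, des_m = orb.detectAndCompute(match_img, None)
        bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        pairs = bf.match(des_m, des_in)  # match image -> input image
        src = np.float32([kp_m[p.queryIdx].pt for p in pairs]).reshape(-1, 1, 2)
        dst = np.float32([kp_in[p.trainIdx].pt for p in pairs]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        h, w = input_img.shape
        return cv2.warpPerspective(match_img, H, (w, h))

    def pop_out(input_img, match_imgs, thresh=40):
        # Steps 3-4: stitch a composite that best matches the input by
        # taking, per pixel, the aligned value closest to the input,
        # then threshold the input-composite difference.
        aligned = np.stack([align_match(input_img, m) for m in match_imgs])
        diffs = np.abs(aligned.astype(np.int16) - input_img.astype(np.int16))
        best = np.argmin(diffs, axis=0)  # (H, W) index of best match per pixel
        composite = np.take_along_axis(aligned, best[None], axis=0)[0]
        residual = np.abs(composite.astype(np.int16) - input_img.astype(np.int16))
        return residual > thresh, composite

    Choosing the per-pixel aligned value closest to the input makes the composite “explain” as much of the input as possible, so the remaining residual concentrates on genuinely new or changed objects.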