Context-driven Object Detection and Segmentation with Auxiliary Information
One fundamental problem in computer vision and robotics is to
localize objects of interest in an image. The task can be
formulated either as an object detection problem, if the objects
are described by a set of pose parameters, or as an object
segmentation problem, if the object boundary is to be recovered
precisely. A key issue in
object detection and segmentation concerns exploiting the spatial
context, as local evidence is often insufficient to determine
object pose in the presence of heavy occlusions or large object
appearance variations. This thesis addresses the object detection
and segmentation problem in such adverse conditions with
auxiliary depth data provided by RGBD cameras. We focus on four
main issues in context-aware object detection and segmentation:
1) What are effective context representations? 2) How can we
work with limited and imperfect depth data? 3) How can we design
depth-aware features and integrate depth cues into conventional
visual inference tasks? 4) How can we exploit unlabeled data to
relax the labeling requirements for training data?
We discuss three object detection and segmentation scenarios
based on varying amounts of available auxiliary information. In
the first case, depth data are available for model training but
not available for testing. We propose a structured Hough voting
method for detecting objects with heavy occlusion in indoor
environments, in which we extend the Hough hypothesis space to
include both the object's location and its visibility pattern.
We design a new score function that accumulates votes for object
detection and occlusion prediction. In addition, we explore the
correlation between objects and their environment, building a
depth-encoded object-context model based on RGBD data. In the
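The extended Hough space above can be sketched roughly as follows. This is an illustrative assumption of how location votes and per-part visibility evidence might be accumulated jointly; the vote format, score function, and threshold are hypothetical, not the thesis' actual formulation.

```python
import numpy as np

def structured_hough_voting(patches, votes, grid_shape, n_parts):
    """Accumulate votes in an extended Hough space: object centre
    location plus a per-part visibility pattern.

    Hypothetical sketch: each patch casts (dy, dx, part_id, weight)
    votes, as if learned from training data with depth supervision.
    """
    location_score = np.zeros(grid_shape)
    visibility_score = np.zeros(grid_shape + (n_parts,))
    for (py, px), patch_votes in zip(patches, votes):
        for dy, dx, part, w in patch_votes:
            y, x = py + dy, px + dx
            if 0 <= y < grid_shape[0] and 0 <= x < grid_shape[1]:
                location_score[y, x] += w          # detection evidence
                visibility_score[y, x, part] += w  # per-part support
    # Best hypothesis: strongest location; a part is declared visible
    # when its support is a sizable fraction of the strongest part's.
    best = np.unravel_index(np.argmax(location_score), grid_shape)
    visible = visibility_score[best] > 0.5 * visibility_score[best].max()
    return best, visible
```

Jointly scoring location and visibility lets a detector explain away occluded parts instead of penalizing them, which is the intuition behind extending the hypothesis space.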
second case, we address the problem of localizing glass objects
with noisy and incomplete depth data. Our method integrates the
intensity and depth information from a single viewpoint, and
builds a Markov Random Field that predicts glass boundary and
region jointly. In addition, we propose a nonparametric,
data-driven label transfer scheme for local glass boundary
estimation. A weighted voting scheme based on a joint feature
manifold is adopted to integrate depth and appearance cues, and
we learn a distance metric on the depth-encoded feature manifold.
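A distance-weighted label transfer of this kind can be sketched as below. The Mahalanobis-style matrix `M` stands in for the learned metric on the depth-encoded feature manifold, and the Gaussian weighting is an assumption for illustration, not the thesis' exact voting scheme.

```python
import numpy as np

def label_transfer(query, bank_feats, bank_labels, M, k=3):
    """Nonparametric label transfer: retrieve the k nearest training
    patches under a learned metric M on the joint depth+appearance
    feature space, then take a distance-weighted vote on their labels.
    """
    diffs = bank_feats - query                  # (N, d) offsets
    # Squared Mahalanobis distances d_i = (x_i - q)^T M (x_i - q).
    d2 = np.einsum('nd,de,ne->n', diffs, M, diffs)
    nn = np.argsort(d2)[:k]                     # k nearest neighbours
    w = np.exp(-d2[nn])                         # soft distance weights
    # Weighted vote over neighbour labels (e.g. boundary vs. not).
    return float(np.dot(w, bank_labels[nn]) / w.sum())
```

Because the metric is learned jointly over depth and appearance, neighbours that agree in both cues dominate the vote even when raw depth is noisy or missing.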
In the third case, we make use of unlabeled data to relax the
annotation requirements for object detection and segmentation,
and propose a novel data-dependent margin distribution learning
criterion for boosting, which utilizes the intrinsic geometric
structure of datasets. One key aspect of this method is that it
can seamlessly incorporate unlabeled data by including a graph
Laplacian regularizer. We demonstrate the performance of our
models and compare with baseline methods on several real-world
object detection and segmentation tasks, including indoor object
detection, glass object segmentation, and foreground segmentation
in video.
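The graph Laplacian regularizer mentioned above has a standard form that can be sketched as follows. This is a minimal illustration of the penalty term only, over a dense Gaussian affinity graph; the thesis' margin distribution criterion and boosting machinery are not reproduced.

```python
import numpy as np

def laplacian_regularizer(X, f, sigma=1.0):
    """Graph Laplacian smoothness penalty f^T L f over a similarity
    graph built on all data points, labeled and unlabeled alike.
    """
    # Gaussian affinities W_ij over all pairs (dense, for clarity).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(1)) - W              # unnormalized Laplacian
    # f^T L f = 0.5 * sum_ij W_ij (f_i - f_j)^2: small when the
    # classifier output f varies smoothly along the data manifold.
    return float(f @ L @ f)
```

Because the penalty needs only the data geometry, not labels, unlabeled examples contribute to it directly, which is how such a term relaxes the annotation requirements.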
Regularized Boost for Semi-Supervised Learning
Semi-supervised inductive learning concerns how to learn a decision rule from a data set containing both labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes local smoothness constraints among data into account during ensemble learning. In this paper, we introduce a local smoothness regularizer to semi-supervised boosting algorithms based on the universal optimization framework of margin cost functionals. Our regularizer is applicable to existing semi-supervised boosting algorithms to improve their generalization and speed up their training. Comparative results on synthetic, benchmark and real world tasks demonstrate the effectiveness of our local smoothness regularizer. We discuss relevant issues and relate our regularizer to previous work.
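One way to picture a local smoothness regularizer inside boosting is as a per-example penalty on the current ensemble margins, which the next weak learner is then pressured to reduce. The sketch below shows only that penalty; the affinity graph and its use inside a margin cost functional are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def smoothness_penalty(X, margins, sigma=1.0):
    """Per-example local smoothness penalty: a point whose current
    ensemble output F(x) disagrees with its neighbours' outputs
    receives a larger cost, labeled or not.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))     # Gaussian affinities
    np.fill_diagonal(W, 0.0)
    # penalty_i = sum_j W_ij (F(x_i) - F(x_j))^2 over the neighbourhood.
    diff2 = (margins[:, None] - margins[None, :]) ** 2
    return (W * diff2).sum(1)
```

Added to a margin cost functional, such a term both regularizes generalization and, since it concentrates effort on locally inconsistent points, can speed up training, echoing the claims in the abstract.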