Human detection, tracking and segmentation from low-level to high-level vision

Abstract

The goal of this research is to detect, segment and track a human body, and to estimate its limb configuration, against a cluttered background. These are fundamental research issues that have attracted intensive attention in the computer vision community because of their wide applications. They also remain among the most challenging research issues, largely because of the ubiquitous visual ambiguities in images and videos and the ill-posed nature of the problems. Inspired by recent findings in cognitive psychology, we adopt several biologically plausible approaches to attack these problems. This dissertation provides a comprehensive study of human detection, tracking and segmentation that covers research issues spanning low-, middle-, and high-level vision.

In low-level vision, we investigate video segmentation, where the main challenge is the non-convex classification problem. We develop a cascaded multi-layer segmentation framework in which non-convex classification problems are addressed in a split-and-merge paradigm that combines the merits of statistical modeling and graph theory.

In middle-level vision, we propose a segmentation-based hypothesis-and-test paradigm to achieve joint localization and segmentation that exploits the complementary nature of region-based and edge-based shape priors. In addition, we integrate both priors into a graph-cut framework to improve the segmentation results.

In high-level vision, our research has two related parts. First, we propose a hybrid body representation that embraces part-whole shape priors and a part-based spatial prior for integrated pose recognition, localization and segmentation in a given image. Second, we combine spatial and temporal priors in an integrated online learning and inference framework, where body parts can be detected, localized and segmented simultaneously from a video sequence. Both parts are supported by the preceding low-level and mid-level vision tasks.

Experimental results show that the proposed algorithms achieve accurate and robust tracking, localization and segmentation for different walking subjects with significant appearance and motion variability against cluttered backgrounds.
