Perception in cluttered scenes remains a critical challenge for robots that
perform autonomous sequential manipulation tasks. In this paper, we
propose a probabilistic approach for robust sequential scene estimation and
manipulation: Sequential Scene Understanding and Manipulation (SUM). SUM
accounts for the uncertainty of discriminative object detection and recognition
in its generative estimation of the most likely object poses, which is
maintained over time to achieve robust scene estimation under heavy occlusion
and in unstructured environments. Our method uses candidates from a discriminative
object detector and recognizer to guide the generative sampling of scene
hypotheses, and each scene hypothesis is evaluated against the observations.
SUM also maintains a belief over scene hypotheses across the robot's physical
actions, which improves estimation and adds robustness to noisy detections. We conduct
extensive experiments to show that our approach performs robust estimation
and manipulation.
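
As an illustrative sketch only, and not the implementation evaluated in this paper, the detection-guided sampling and belief update described above can be approximated with a particle-filter-style loop. All names here (`Detection`, `sample_hypothesis`, `likelihood`, `update_belief`, `most_likely`), the 2D position state, and the Gaussian observation model are assumptions introduced for illustration; a real system would score hypotheses against sensor data such as rendered depth images.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector output: a labeled 2D position with a score."""
    label: str
    x: float
    y: float
    confidence: float  # detector score, assumed to lie in (0, 1]


def sample_hypothesis(detections, rng, noise=0.05):
    """Draw one scene hypothesis (one position per object label).

    Candidates are chosen in proportion to detector confidence, so
    low-confidence detections are still explored, then perturbed with
    Gaussian noise.
    """
    by_label = {}
    for det in detections:
        by_label.setdefault(det.label, []).append(det)
    hypothesis = {}
    for label, cands in by_label.items():
        chosen = rng.choices(cands, weights=[c.confidence for c in cands], k=1)[0]
        hypothesis[label] = (chosen.x + rng.gauss(0.0, noise),
                             chosen.y + rng.gauss(0.0, noise))
    return hypothesis


def likelihood(hypothesis, observation, sigma=0.1):
    """Assumed Gaussian observation model over per-object positions."""
    score = 1.0
    for label, (ox, oy) in observation.items():
        if label not in hypothesis:
            score *= 1e-6  # small penalty for an unexplained observation
            continue
        hx, hy = hypothesis[label]
        sq_dist = (hx - ox) ** 2 + (hy - oy) ** 2
        score *= math.exp(-sq_dist / (2.0 * sigma ** 2))
    return score


def update_belief(particles, weights, observation):
    """Reweight scene hypotheses by the observation and normalize."""
    new_weights = [w * likelihood(p, observation)
                   for p, w in zip(particles, weights)]
    total = sum(new_weights)
    if total == 0.0:
        # Every hypothesis was ruled out; fall back to a uniform belief.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in new_weights]


def most_likely(particles, weights):
    """Return the hypothesis with the highest belief weight."""
    best_index = max(range(len(particles)), key=lambda i: weights[i])
    return particles[best_index]
```

In this sketch the belief is carried across actions by calling `update_belief` again with the previous weights after each new observation, so a single noisy detection does not immediately override earlier evidence.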