Abstract—We present a novel local spatiotemporal approach to produce motion segmentation and dense temporal trajectories from an image sequence. A common representation of image sequences is a 3D spatiotemporal volume ðx; y; tÞ, and its corresponding mathematical formalism is the fiber bundle. However, directly enforcing the spatiotemporal smoothness constraint is difficult in the fiber bundle representation. Thus, we convert the representation into a new 5D space ðx; y; t; vx;vyÞ with an additional velocity domain, where each moving object produces a separate 3D smooth layer. The smoothness constraint is now enforced by extracting 3D layers using the tensor voting framework in a single step that solves both correspondence and segmentation simultaneously. Motion segmentation is achieved by identifying those layers and the dense temporal trajectories are obtained by converting the layers back into the fiber bundle representation. We proceed to address three applications (tracking, mosaic, and 3D reconstruction) that are hard to solve from the video stream directly because of the segmentation and dense matching steps but become straightforward with our framework. The approach does not make restrictive assumptions about the observed scene or camera motion and is therefore generally applicable. We present results on a number of data sets. Index Terms—Motion analysis, tensor voting, optical flow, segmentation, mosaicking. Ç
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.