We propose a three-scale hierarchical representation of scenes and objects and show that it suits both computer-vision capture of models from images and efficient photo-realistic graphics rendering. The model consists of: (1) a conventional triangulated geometry at the macro scale; (2) a displacement map at the meso scale, introducing pixel-wise depth with respect to each planar model facet (triangle); (3) a photo-realistic micro-structure represented by an appearance basis spanning viewpoint variation in texture space. We implement a capture and rendering system for this model. Conventional Shape-From-Silhouette or Structure-From-Motion captures the coarse macro geometry, variational shape and reflectance estimation recovers the meso level, and texture basis optimization recovers the micro level. For efficiency, the meso- and micro-level routines are both hardware-accelerated. Photo-realistic capture of complex scenes is thus possible in a few minutes using budget cameras and PCs, and rendering is real-time. Experimental results and videos show models captured from regular images of humans and objects.
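The three scales can be illustrated with a minimal sketch. It is not the system described above: it assumes that the meso-level depth offsets a surface point along the unit normal of its facet, and that the micro-level appearance basis is combined linearly with view-dependent coefficients. All function names (`unit_normal`, `displaced_point`, `blend_texture`) are hypothetical and chosen for illustration only.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(v0, v1, v2):
    """Unit normal of a macro-scale triangle (counter-clockwise winding)."""
    n = cross(sub(v1, v0), sub(v2, v0))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def displaced_point(tri, bary, depth):
    """Meso scale (assumed form): offset the point with barycentric
    coordinates `bary` on the planar facet by `depth` along the normal."""
    v0, v1, v2 = tri
    n = unit_normal(v0, v1, v2)
    base = tuple(bary[0] * a + bary[1] * b + bary[2] * c
                 for a, b, c in zip(v0, v1, v2))
    return tuple(p + depth * ni for p, ni in zip(base, n))

def blend_texture(basis, coeffs):
    """Micro scale (assumed linear basis): a texture for one viewpoint is a
    weighted sum of basis textures, each stored as a flat list of texels."""
    n_texels = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis))
            for i in range(n_texels)]

# Usage: one facet in the z = 0 plane, one displaced point, two basis textures.
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
point = displaced_point(tri, (0.5, 0.25, 0.25), 0.5)
texture = blend_texture([[1.0, 2.0], [0.0, 4.0]], [1.0, 0.5])
```

Keeping the displacement relative to each facet means the coarse mesh can stay small while depth detail lives in a per-facet map, and keeping the basis in texture space means view dependence reduces to choosing a few coefficients per viewpoint.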