Multi-View Inpainting for Image-Based Scene Editing and Rendering
We propose a method to remove objects such as people and cars from multi-view urban image datasets, enabling free-viewpoint Image-Based Rendering (IBR) in the edited scenes. Our method combines information from multi-view 3D reconstruction with image inpainting techniques, by formulating the problem as an optimization of a global patch-based objective function. We use IBR techniques to reproject information from neighboring views, and 3D multi-view stereo reconstruction to perform multi-view coherent initialization for inpainting of pixels not filled by reprojection. Our algorithm performs multi-view consistent inpainting for color and 3D by blending reprojections with patch-based image inpainting. We run our algorithm on casually captured datasets and Google Street View data, removing objects such as cars, people and pillars, showing that our approach produces results of sufficient quality for free-viewpoint IBR on "cleaned up" scenes, as well as IBR scene editing, such as limited displacement of real objects.
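The two-stage idea described above (reproject from neighboring views first, then complete the remaining hole with patch-based inpainting) can be sketched as follows. This is a minimal single-channel illustration, not the paper's implementation: the function name, greedy best-SSD patch search, and simple masking are all assumptions standing in for the global patch-based optimization the abstract describes.

```python
import numpy as np

def multiview_fill(target, reprojected, hole, patch=3):
    """Illustrative sketch: fill masked pixels of `target` from a
    reprojected neighbour view where available, then complete the
    rest with a greedy best-SSD patch copy from the known region."""
    h, w = target.shape
    out = target.astype(float).copy()
    # 1) Reprojection pass: pixels the neighbour view can see fill directly.
    seen = hole & ~np.isnan(reprojected)
    out[seen] = reprojected[seen]
    known = ~hole | seen
    r = patch // 2
    # 2) Patch-based completion for pixels reprojection missed.
    for y, x in zip(*np.where(hole & ~seen)):
        if y < r or x < r or y >= h - r or x >= w - r:
            continue  # skip border pixels in this toy version
        tgt = out[y - r:y + r + 1, x - r:x + r + 1]
        msk = known[y - r:y + r + 1, x - r:x + r + 1]
        best, best_cost = out[known].mean(), np.inf
        for yy in range(r, h - r):
            for xx in range(r, w - r):
                if not known[yy - r:yy + r + 1, xx - r:xx + r + 1].all():
                    continue  # only match against fully known patches
                src = out[yy - r:yy + r + 1, xx - r:xx + r + 1]
                cost = np.sum((src[msk] - tgt[msk]) ** 2)
                if cost < best_cost:
                    best_cost, best = cost, src[r, r]
        out[y, x] = best
    return out
```

The actual method additionally enforces multi-view consistency in both color and 3D across all input views, which this per-image sketch omits.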
Leveraging Depth for 3D Scene Perception
3D scene perception aims to understand the geometric and semantic information of the surrounding environment. It is crucial in many downstream applications, such as autonomous driving, robotics, AR/VR, and human-computer interaction. Despite its significance, understanding the 3D scene has been a challenging task, due to the complex interactions between objects, heavy occlusions, cluttered indoor environments, major appearance, viewpoint and scale changes, etc. The study of 3D scene perception has been significantly reshaped by powerful deep learning models. These models are capable of leveraging large-scale training data to achieve outstanding performance. Learning-based models unlock new challenges and opportunities in the field. In this dissertation, we first present learning-based approaches to estimate depth maps, a crucial input to many 3D scene perception models. We describe two overlooked challenges in learning monocular depth estimators and present our proposed solutions. Specifically, we address the high-level domain gap between real and synthetic training data and the shift in camera pose distribution between training and testing data. Following that, we present two application-driven works that leverage depth maps to achieve better 3D scene perception. We explore in detail the tasks of reference-based image inpainting and 3D object instance tracking in scenes from egocentric videos.
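The reason depth maps are so central to 3D scene perception is that, given camera intrinsics, a depth map lifts directly to a 3D point cloud via the pinhole model X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy. The sketch below shows this generic back-projection step; it is standard geometry, not a specific method from the dissertation, and the intrinsics fx, fy, cx, cy are assumed known.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (H, W) to camera-frame 3D points (H, W, 3)
    using the pinhole camera model. Zero depth yields the origin."""
    v, u = np.indices(depth.shape)      # pixel row (v) and column (u) grids
    z = depth.astype(float)
    x = (u - cx) * z / fx               # X = (u - cx) * Z / fx
    y = (v - cy) * z / fy               # Y = (v - cy) * Z / fy
    return np.stack([x, y, z], axis=-1)
```

Downstream tasks such as reference-based inpainting or instance tracking can then reason about occlusion and geometry directly on these points rather than on 2D pixels alone.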
Automatic semantic and geometric enrichment of CityGML 3D building models of varying architectural styles with HOG-based template matching
While the number of 3D geo-spatial digital models of buildings of cultural heritage interest is burgeoning, most lack the semantic annotation that could be used to inform users of mobile and desktop applications about the architectural features and origins of the buildings. Additionally, while automated reconstruction of 3D building models is an active research area, the labelling of architectural features (objects) is comparatively less well researched, while distinguishing between different architectural styles is less well researched still. Meanwhile, the successful automatic identification of architectural objects on façades with a comparatively less symmetrical or less regular distribution of objects, particularly on older buildings, has so far eluded researchers.
This research has addressed these issues by automating the semantic and geometric enrichment of existing 3D building models using Histogram of Oriented Gradients (HOG)-based template matching. The methods are applied to the texture maps of 3D building models of 20th-century styles, of the Georgian-Regency (1715-1830) style and of the Norman (1066 to late 12th century) style, where the amalgam of styles present on buildings of the latter style necessitates detection of styles of the Gothic tradition (late 12th century to present day).
The most successful results were obtained when applying a set of heuristics including the use of real-world dimensions, while a Support Vector Machine (SVM)-based machine learning approach was found effective in obviating the need for thresholds on match scores when making detection decisions.
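The pipeline above can be illustrated in miniature: compute a HOG descriptor for an image window, then make the detection decision with a linear-SVM-style score f(x) = w . x + b instead of a hand-tuned threshold on a raw match score. This is a hedged sketch, not the thesis's implementation: real HOG uses a grid of cells with block normalisation (Dalal-Triggs), and the weights w, b here would come from training rather than being chosen by hand.

```python
import numpy as np

def hog_descriptor(img, bins=9):
    """Minimal single-cell HOG: weight an unsigned-orientation
    histogram by gradient magnitude, then L2-normalise."""
    gy, gx = np.gradient(img.astype(float))        # row then column gradients
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientations
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, 180.0), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

def detect(window, w, b):
    """Linear-SVM-style decision on the HOG descriptor: positive
    score means 'architectural object present'. No fixed threshold
    on a raw match score is needed; the trained (w, b) absorb it."""
    return float(np.dot(w, hog_descriptor(window)) + b) > 0.0
```

In the thesis's setting the windows come from the texture maps of the 3D building models, and a positive detection triggers the semantic and geometric enrichment of the corresponding façade region.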