653 research outputs found

    SurfelMeshing: Online Surfel-Based Mesh Reconstruction

    Full text link
    We address the problem of mesh reconstruction from live RGB-D video, assuming a calibrated camera and poses provided externally (e.g., by a SLAM system). In contrast to most existing approaches, we do not fuse depth measurements in a volume but in a dense surfel cloud. We asynchronously (re)triangulate the smoothed surfels to reconstruct a surface mesh. This novel approach enables to maintain a dense surface representation of the scene during SLAM which can quickly adapt to loop closures. This is possible by deforming the surfel cloud and asynchronously remeshing the surface where necessary. The surfel-based representation also naturally supports strongly varying scan resolution. In particular, it reconstructs colors at the input camera's resolution. Moreover, in contrast to many volumetric approaches, ours can reconstruct thin objects since objects do not need to enclose a volume. We demonstrate our approach in a number of experiments, showing that it produces reconstructions that are competitive with the state-of-the-art, and we discuss its advantages and limitations. The algorithm (excluding loop closure functionality) is available as open source at https://github.com/puzzlepaint/surfelmeshing .Comment: Version accepted to IEEE Transactions on Pattern Analysis and Machine Intelligenc

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    Get PDF
    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications

    Plenoptic Signal Processing for Robust Vision in Field Robotics

    Get PDF
    This thesis proposes the use of plenoptic cameras for improving the robustness and simplicity of machine vision in field robotics applications. Dust, rain, fog, snow, murky water and insufficient light can cause even the most sophisticated vision systems to fail. Plenoptic cameras offer an appealing alternative to conventional imagery by gathering significantly more light over a wider depth of field, and capturing a rich 4D light field structure that encodes textural and geometric information. The key contributions of this work lie in exploring the properties of plenoptic signals and developing algorithms for exploiting them. It lays the groundwork for the deployment of plenoptic cameras in field robotics by establishing a decoding, calibration and rectification scheme appropriate to compact, lenslet-based devices. Next, the frequency-domain shape of plenoptic signals is elaborated and exploited by constructing a filter which focuses over a wide depth of field rather than at a single depth. This filter is shown to reject noise, improving contrast in low light and through attenuating media, while mitigating occluders such as snow, rain and underwater particulate matter. Next, a closed-form generalization of optical flow is presented which directly estimates camera motion from first-order derivatives. An elegant adaptation of this "plenoptic flow" to lenslet-based imagery is demonstrated, as well as a simple, additive method for rendering novel views. Finally, the isolation of dynamic elements from a static background is considered, a task complicated by the non-uniform apparent motion caused by a mobile camera. Two elegant closed-form solutions are presented dealing with monocular time-series and light field image pairs. This work emphasizes non-iterative, noise-tolerant, closed-form, linear methods with predictable and constant runtimes, making them suitable for real-time embedded implementation in field robotics applications

    One at A Time: Multi-step Volumetric Probability Distribution Diffusion for Depth Estimation

    Full text link
    Recent works have explored the fundamental role of depth estimation in multi-view stereo (MVS) and semantic scene completion (SSC). They generally construct 3D cost volumes to explore geometric correspondence in depth, and estimate such volumes in a single step relying directly on the ground truth approximation. However, such problem cannot be thoroughly handled in one step due to complex empirical distributions, especially in challenging regions like occlusions, reflections, etc. In this paper, we formulate the depth estimation task as a multi-step distribution approximation process, and introduce a new paradigm of modeling the Volumetric Probability Distribution progressively (step-by-step) following a Markov chain with Diffusion models (VPDD). Specifically, to constrain the multi-step generation of volume in VPDD, we construct a meta volume guidance and a confidence-aware contextual guidance as conditional geometry priors to facilitate the distribution approximation. For the sampling process, we further investigate an online filtering strategy to maintain consistency in volume representations for stable training. Experiments demonstrate that our plug-and-play VPDD outperforms the state-of-the-arts for tasks of MVS and SSC, and can also be easily extended to different baselines to get improvement. It is worth mentioning that we are the first camera-based work that surpasses LiDAR-based methods on the SemanticKITTI dataset

    Advances in 3D Neural Stylization: A Survey

    Full text link
    Modern artificial intelligence provides a novel way of producing digital art in styles. The expressive power of neural networks enables the realm of visual style transfer methods, which can be used to edit images, videos, and 3D data to make them more artistic and diverse. This paper reports on recent advances in neural stylization for 3D data. We provide a taxonomy for neural stylization by considering several important design choices, including scene representation, guidance data, optimization strategies, and output styles. Building on such taxonomy, our survey first revisits the background of neural stylization on 2D images, and then provides in-depth discussions on recent neural stylization methods for 3D data, where we also provide a mini-benchmark on artistic stylization methods. Based on the insights gained from the survey, we then discuss open challenges, future research, and potential applications and impacts of neural stylization.Comment: 26 page

    Widening the view angle of auto-multiscopic display, denoising low brightness light field data and 3D reconstruction with delicate details

    Get PDF
    This doctoral thesis will present the results of my work into widening the viewing angle of the auto-multiscopic display, denoising light filed data the enhancement of captured light filed data captured in low light circumstance, and the attempts on reconstructing the subject surface with delicate details from microscopy image sets. The automultiscopic displays carefully control the distribution of emitted light over space, direction (angle) and time so that even a static image displayed can encode parallax across viewing directions (light field). This allows simultaneous observation by multiple viewers, each perceiving 3D from their own (correct) perspective. Currently, the illusion can only be effectively maintained over a narrow range of viewing angles. We propose and analyze a simple solution to widen the range of viewing angles for automultiscopic displays that use parallax barriers. We insert a refractive medium, with a high refractive index, between the display and parallax barriers. The inserted medium warps the exitant lightfield in a way that increases the potential viewing angle. We analyze the consequences of this warp and build a prototype with a 93% increase in the effective viewing angle. Additionally, we developed an integral images synthesis method that can address the refraction introduced by the inserted medium efficiently without the use of ray tracing. Capturing light field image with a short exposure time is preferable for eliminating the motion blur but it also leads to low brightness in a low light environment, which results in a low signal noise ratio. Most light field denoising methods apply regular 2D image denoising method to the sub-aperture images of a 4D light field directly, but it is not suitable for focused light field data whose sub-aperture image resolution is too low to be applied regular denoising methods. Therefore, we propose a deep learning denoising method based on micro lens images of focused light field to denoise the depth map and the original micro lens image set simultaneously, and achieved high quality total focused images from the low focused light field data. In areas like digital museum, remote researching, 3D reconstruction with delicate details of subjects is desired and technology like 3D reconstruction based on macro photography has been used successfully for various purposes. We intend to push it further by using microscope rather than macro lens, which is supposed to be able to capture the microscopy level details of the subject. We design and implement a scanning method which is able to capture microscopy image set from a curve surface based on robotic arm, and the 3D reconstruction method suitable for the microscopy image set

    Ray Tracing Gems

    Get PDF
    This book is a must-have for anyone serious about rendering in real time. With the announcement of new ray tracing APIs and hardware to support them, developers can easily create real-time applications with ray tracing as a core component. As ray tracing on the GPU becomes faster, it will play a more central role in real-time rendering. Ray Tracing Gems provides key building blocks for developers of games, architectural applications, visualizations, and more. Experts in rendering share their knowledge by explaining everything from nitty-gritty techniques that will improve any ray tracer to mastery of the new capabilities of current and future hardware. What you'll learn: The latest ray tracing techniques for developing real-time applications in multiple domains Guidance, advice, and best practices for rendering applications with Microsoft DirectX Raytracing (DXR) How to implement high-performance graphics for interactive visualizations, games, simulations, and more Who this book is for: Developers who are looking to leverage the latest APIs and GPU technology for real-time rendering and ray tracing Students looking to learn about best practices in these areas Enthusiasts who want to understand and experiment with their new GPU

    Towards Intelligent Telerobotics: Visualization and Control of Remote Robot

    Get PDF
    Human-machine cooperative or co-robotics has been recognized as the next generation of robotics. In contrast to current systems that use limited-reasoning strategies or address problems in narrow contexts, new co-robot systems will be characterized by their flexibility, resourcefulness, varied modeling or reasoning approaches, and use of real-world data in real time, demonstrating a level of intelligence and adaptability seen in humans and animals. The research I focused is in the two sub-field of co-robotics: teleoperation and telepresence. We firstly explore the ways of teleoperation using mixed reality techniques. I proposed a new type of display: hybrid-reality display (HRD) system, which utilizes commodity projection device to project captured video frame onto 3D replica of the actual target surface. It provides a direct alignment between the frame of reference for the human subject and that of the displayed image. The advantage of this approach lies in the fact that no wearing device needed for the users, providing minimal intrusiveness and accommodating users eyes during focusing. The field-of-view is also significantly increased. From a user-centered design standpoint, the HRD is motivated by teleoperation accidents, incidents, and user research in military reconnaissance etc. Teleoperation in these environments is compromised by the Keyhole Effect, which results from the limited field of view of reference. The technique contribution of the proposed HRD system is the multi-system calibration which mainly involves motion sensor, projector, cameras and robotic arm. Due to the purpose of the system, the accuracy of calibration should also be restricted within millimeter level. The followed up research of HRD is focused on high accuracy 3D reconstruction of the replica via commodity devices for better alignment of video frame. Conventional 3D scanner lacks either depth resolution or be very expensive. We proposed a structured light scanning based 3D sensing system with accuracy within 1 millimeter while robust to global illumination and surface reflection. Extensive user study prove the performance of our proposed algorithm. In order to compensate the unsynchronization between the local station and remote station due to latency introduced during data sensing and communication, 1-step-ahead predictive control algorithm is presented. The latency between human control and robot movement can be formulated as a linear equation group with a smooth coefficient ranging from 0 to 1. This predictive control algorithm can be further formulated by optimizing a cost function. We then explore the aspect of telepresence. Many hardware designs have been developed to allow a camera to be placed optically directly behind the screen. The purpose of such setups is to enable two-way video teleconferencing that maintains eye-contact. However, the image from the see-through camera usually exhibits a number of imaging artifacts such as low signal to noise ratio, incorrect color balance, and lost of details. Thus we develop a novel image enhancement framework that utilizes an auxiliary color+depth camera that is mounted on the side of the screen. By fusing the information from both cameras, we are able to significantly improve the quality of the see-through image. Experimental results have demonstrated that our fusion method compares favorably against traditional image enhancement/warping methods that uses only a single image
    • …
    corecore