9,981 research outputs found

    Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

    Get PDF
    We propose a real-time RGB-based pipeline for object detection and 6D pose estimation. Our novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization. This so-called Augmented Autoencoder has several advantages over existing methods: It does not require real, pose-annotated training data, generalizes to various test sensors and inherently handles object and view symmetries. Instead of learning an explicit mapping from input images to object poses, it provides an implicit representation of object orientations defined by samples in a latent space. Our pipeline achieves state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D domain. We also evaluate on the LineMOD dataset where we can compete with other synthetically trained approaches. We further increase performance by correcting 3D orientation estimates to account for perspective errors when the object deviates from the image center and show extended results.Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode

    A framework for realistic 3D tele-immersion

    Get PDF
    Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite differ- ent from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity. Analogous to how talking on the telephone does not replicate the experi- ence of talking in person. Several causes for these differences have been identified and we propose inspiring and innova- tive solutions to these hurdles in attempt to provide a more realistic, believable and engaging online conversational expe- rience. We present the distributed and scalable framework REVERIE that provides a balanced mix of these solutions. Applications build on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic ex- periences to a multitude of users that for them will feel much more similar to having face to face meetings than the expe- rience offered by conventional teleconferencing systems

    Plane-Based Optimization of Geometry and Texture for RGB-D Reconstruction of Indoor Scenes

    Full text link
    We present a novel approach to reconstruct RGB-D indoor scene with plane primitives. Our approach takes as input a RGB-D sequence and a dense coarse mesh reconstructed by some 3D reconstruction method on the sequence, and generate a lightweight, low-polygonal mesh with clear face textures and sharp features without losing geometry details from the original scene. To achieve this, we firstly partition the input mesh with plane primitives, simplify it into a lightweight mesh next, then optimize plane parameters, camera poses and texture colors to maximize the photometric consistency across frames, and finally optimize mesh geometry to maximize consistency between geometry and planes. Compared to existing planar reconstruction methods which only cover large planar regions in the scene, our method builds the entire scene by adaptive planes without losing geometry details and preserves sharp features in the final mesh. We demonstrate the effectiveness of our approach by applying it onto several RGB-D scans and comparing it to other state-of-the-art reconstruction methods.Comment: in International Conference on 3D Vision 2018; Models and Code: see https://github.com/chaowang15/plane-opt-rgbd. arXiv admin note: text overlap with arXiv:1905.0885

    Bio-Inspired Multi-Spectral Image Sensor and Augmented Reality Display for Near-Infrared Fluorescence Image-Guided Surgery

    Get PDF
    Background: Cancer remains a major public health problem worldwide and poses a huge economic burden. Near-infrared (NIR) fluorescence image-guided surgery (IGS) utilizes molecular markers and imaging instruments to identify and locate tumors during surgical resection. Unfortunately, current state-of-the-art NIR fluorescence imaging systems are bulky, costly, and lack both fluorescence sensitivity under surgical illumination and co-registration accuracy between multimodal images. Additionally, the monitor-based display units are disruptive to the surgical workflow and are suboptimal at indicating the 3-dimensional position of labeled tumors. These major obstacles have prevented the wide acceptance of NIR fluorescence imaging as the standard of care for cancer surgery. The goal of this dissertation is to enhance cancer treatment by developing novel image sensors and presenting the information using holographic augmented reality (AR) display to the physician in intraoperative settings. Method: By mimicking the visual system of the Morpho butterfly, several single-chip, color-NIR fluorescence image sensors and systems were developed with CMOS technologies and pixelated interference filters. Using a holographic AR goggle platform, an NIR fluorescence IGS display system was developed. Optoelectronic evaluation was performed on the prototypes to evaluate the performance of each component, and small animal models and large animal models were used to verify the overall effectiveness of the integrated systems at cancer detection. Result: The single-chip bio-inspired multispectral logarithmic image sensor I developed has better main performance indicators than the state-of-the-art NIR fluorescence imaging instruments. The image sensors achieve up to 140 dB dynamic range. The sensitivity under surgical illumination achieves 6108 V/(mW/cm2), which is up to 25 times higher. The signal-to-noise ratio is up to 56 dB, which is 11 dB greater. These enable high sensitivity fluorescence imaging under surgical illumination. The pixelated interference filters enable temperature-independent co-registration accuracy between multimodal images. Pre-clinical trials with small animal model demonstrate that the sensor can achieve up to 95% sensitivity and 94% specificity with tumor-targeted NIR molecular probes. The holographic AR goggle provides the physician with a non-disruptive 3-dimensional display in the clinical setup. This is the first display system that co-registers a virtual image with human eyes and allows video rate image transmission. The imaging system is tested in the veterinary science operating room on canine patients with naturally occurring cancers. In addition, a time domain pulse-width-modulation address-event-representation multispectral image sensor and a handheld multispectral camera prototype are developed. Conclusion: The major problems of current state-of-the-art NIR fluorescence imaging systems are successfully solved. Due to enhanced performance and user experience, the bio-inspired sensors and augmented reality display system will give medical care providers much needed technology to enable more accurate value-based healthcare
    • 

    corecore