Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
We propose a real-time RGB-based pipeline for object detection and 6D pose
estimation. Our novel 3D orientation estimation is based on a variant of the
Denoising Autoencoder that is trained on simulated views of a 3D model using
Domain Randomization. This so-called Augmented Autoencoder has several
advantages over existing methods: It does not require real, pose-annotated
training data, generalizes to various test sensors and inherently handles
object and view symmetries. Instead of learning an explicit mapping from input
images to object poses, it provides an implicit representation of object
orientations defined by samples in a latent space. Our pipeline achieves
state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D
domain. We also evaluate on the LineMOD dataset where we can compete with other
synthetically trained approaches. We further increase performance by correcting
3D orientation estimates to account for perspective errors when the object
deviates from the image center and show extended results.
Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode
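One way to picture the implicit representation described in this abstract is as a lookup against stored latent codes. The sketch below is only an illustration under assumptions not stated in the abstract: the function names, the toy three-dimensional codes, the orientation labels, and the use of cosine similarity as the matching score are all hypothetical. In a real system the codes would come from a trained encoder applied to rendered views.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two latent vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def lookup_orientation(query_code, codebook):
    """Return the label of the stored code most similar to the query.

    codebook is a list of (orientation_label, latent_code) pairs; both the
    labels and the codes here are hypothetical placeholders.
    """
    best_label, best_score = None, -2.0  # cosine similarity is always >= -1
    for label, code in codebook:
        score = cosine_similarity(query_code, code)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Toy codebook: each sampled orientation is represented only by its latent code.
codebook = [
    ("yaw_0", [1.0, 0.0, 0.0]),
    ("yaw_90", [0.0, 1.0, 0.0]),
    ("yaw_180", [-1.0, 0.0, 0.0]),
]

# A query code, standing in for the encoding of a test image crop.
label, score = lookup_orientation([0.1, 0.9, 0.0], codebook)
print(label, round(score, 3))
```

Because orientations are matched through their latent codes rather than regressed directly, views that look identical (for example under object symmetry) simply map to similar codes, which is consistent with the abstract's claim about handling symmetries.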
A framework for realistic 3D tele-immersion
Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite different from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity, analogous to how talking on the telephone does not replicate the experience of talking in person. Several causes for these differences have been identified, and we propose inspiring and innovative solutions to these hurdles in an attempt to provide a more realistic, believable and engaging online conversational experience. We present the distributed and scalable framework REVERIE, which provides a balanced mix of these solutions. Applications built on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic experiences to a multitude of users that will feel much more similar to face-to-face meetings than the experience offered by conventional teleconferencing systems.
Plane-Based Optimization of Geometry and Texture for RGB-D Reconstruction of Indoor Scenes
We present a novel approach to reconstructing RGB-D indoor scenes with plane
primitives. Our approach takes as input an RGB-D sequence and a dense coarse
mesh reconstructed by some 3D reconstruction method on the sequence, and
generates a lightweight, low-polygonal mesh with clear face textures and sharp
features without losing geometry details from the original scene. To achieve
this, we first partition the input mesh with plane primitives, simplify it
into a lightweight mesh next, then optimize plane parameters, camera poses and
texture colors to maximize the photometric consistency across frames, and
finally optimize mesh geometry to maximize consistency between geometry and
planes. Compared to existing planar reconstruction methods which only cover
large planar regions in the scene, our method builds the entire scene by
adaptive planes without losing geometry details and preserves sharp features in
the final mesh. We demonstrate the effectiveness of our approach by applying it
to several RGB-D scans and comparing it to other state-of-the-art
reconstruction methods.
Comment: in International Conference on 3D Vision 2018; Models and Code: see
https://github.com/chaowang15/plane-opt-rgbd. arXiv admin note: text overlap
with arXiv:1905.0885
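The final step in this abstract, making mesh geometry consistent with the fitted planes, rests on a simple geometric quantity: the signed distance from a vertex to its plane. The sketch below shows only that quantity and the corresponding projection. It is not the paper's optimization, which the abstract describes as jointly involving plane parameters, camera poses and texture colors. The function name and the plane parameterization (normal . x + offset = 0) are assumptions made for illustration.

```python
import math

def project_onto_plane(point, normal, offset):
    """Project a 3D point onto the plane {x : normal . x + offset = 0}.

    Returns the projected point and the signed distance of the original
    point from the plane. The normal need not be unit length.
    """
    length = math.sqrt(sum(n * n for n in normal))
    unit = [n / length for n in normal]
    # Signed distance along the unit normal.
    signed_dist = sum(u * p for u, p in zip(unit, point)) + offset / length
    # Move the point back along the normal by that distance.
    projected = [p - signed_dist * u for p, u in zip(point, unit)]
    return projected, signed_dist

# Hypothetical vertex lying 0.3 units above the plane z = 0.
vertex = [1.0, 2.0, 0.3]
projected, dist = project_onto_plane(vertex, [0.0, 0.0, 2.0], 0.0)
print(projected, dist)
```

A geometry-to-plane consistency term could, for instance, penalize the squared signed distance of each vertex assigned to a plane; whether the paper uses exactly this form is not stated in the abstract.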
Bio-Inspired Multi-Spectral Image Sensor and Augmented Reality Display for Near-Infrared Fluorescence Image-Guided Surgery
Background: Cancer remains a major public health problem worldwide and poses a huge economic burden. Near-infrared (NIR) fluorescence image-guided surgery (IGS) utilizes molecular markers and imaging instruments to identify and locate tumors during surgical resection. Unfortunately, current state-of-the-art NIR fluorescence imaging systems are bulky, costly, and lack both fluorescence sensitivity under surgical illumination and co-registration accuracy between multimodal images. Additionally, the monitor-based display units are disruptive to the surgical workflow and are suboptimal at indicating the 3-dimensional position of labeled tumors. These major obstacles have prevented the wide acceptance of NIR fluorescence imaging as the standard of care for cancer surgery. The goal of this dissertation is to enhance cancer treatment by developing novel image sensors and presenting the information to the physician on a holographic augmented reality (AR) display in intraoperative settings.
Method: By mimicking the visual system of the Morpho butterfly, several single-chip, color-NIR fluorescence image sensors and systems were developed with CMOS technologies and pixelated interference filters. Using a holographic AR goggle platform, an NIR fluorescence IGS display system was developed. Optoelectronic evaluation was performed on the prototypes to evaluate the performance of each component, and small animal models and large animal models were used to verify the overall effectiveness of the integrated systems at cancer detection.
Result: The single-chip bio-inspired multispectral logarithmic image sensor I developed has better main performance indicators than the state-of-the-art NIR fluorescence imaging instruments. The image sensors achieve up to 140 dB dynamic range. The sensitivity under surgical illumination reaches 6108 V/(mW/cm²), which is up to 25 times higher. The signal-to-noise ratio is up to 56 dB, which is 11 dB greater. These enable high-sensitivity fluorescence imaging under surgical illumination. The pixelated interference filters enable temperature-independent co-registration accuracy between multimodal images. Pre-clinical trials with small animal models demonstrate that the sensor can achieve up to 95% sensitivity and 94% specificity with tumor-targeted NIR molecular probes. The holographic AR goggle provides the physician with a non-disruptive 3-dimensional display in the clinical setup. This is the first display system that co-registers a virtual image with human eyes and allows video-rate image transmission. The imaging system was tested in the veterinary science operating room on canine patients with naturally occurring cancers. In addition, a time-domain pulse-width-modulation address-event-representation multispectral image sensor and a handheld multispectral camera prototype were developed.
Conclusion: The major problems of current state-of-the-art NIR fluorescence imaging systems are successfully solved. Due to enhanced performance and user experience, the bio-inspired sensors and augmented reality display system will give medical care providers much-needed technology to enable more accurate value-based healthcare.
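The decibel figures in the Result paragraph can be translated into plain ratios with a short calculation. The sketch below assumes the 20·log10 (amplitude) convention that is common for image-sensor dynamic range and signal-to-noise ratio; the abstract does not state which convention the author used, so the ratios are illustrative only. The function name is hypothetical.

```python
def db_to_amplitude_ratio(db):
    """Convert decibels to a linear ratio, assuming dB = 20 * log10(ratio)."""
    return 10 ** (db / 20)

# 140 dB dynamic range: largest to smallest resolvable signal.
dynamic_range_ratio = db_to_amplitude_ratio(140)

# "56 dB, which is 11 dB greater" implies a comparison value of 45 dB.
baseline_snr_db = 56 - 11

# An 11 dB improvement expressed as a linear factor.
snr_gain_ratio = db_to_amplitude_ratio(11)

print(f"dynamic range ratio: {dynamic_range_ratio:.3g}")
print(f"implied baseline SNR: {baseline_snr_db} dB")
print(f"11 dB as a linear factor: {snr_gain_ratio:.2f}")
```

Under this assumed convention, 140 dB corresponds to a ratio of ten million to one, and the 11 dB signal-to-noise improvement to a factor of roughly 3.5.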