
    Real time trinocular stereo for tele-immersion

    Tele-immersion is a technology that augments the user's space with real-time 3D projections of remote spaces, letting people in different places interact in virtually the same environment. It combines 3D scene recovery from computer vision with rendering and interaction from computer graphics. We describe real-time 3D scene acquisition using a new algorithm for trinocular stereo, and we extend this method in time by combining motion and stereo to increase speed and robustness.
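    The abstract does not reproduce the algorithm; as a rough orientation only, a minimal (hypothetical) trinocular block-matching sketch in Python shows the core idea of summing matching costs from two rectified pairs so that each pair disambiguates the other:

        import numpy as np
        from scipy.ndimage import uniform_filter

        def trinocular_disparity(ref, right, bottom, max_disp=64, win=5):
            # ref/right share horizontal epipolar lines, ref/bottom vertical
            # ones (L-shaped rig); all inputs are float32 grayscale arrays.
            h, w = ref.shape
            costs = np.empty((max_disp, h, w), dtype=np.float32)
            for d in range(max_disp):
                # np.roll wraps at the image border; acceptable for a sketch.
                cost_h = np.abs(ref - np.roll(right, -d, axis=1))
                cost_v = np.abs(ref - np.roll(bottom, -d, axis=0))
                # Summing the two pairwise costs before window aggregation
                # is what makes the match trinocular.
                costs[d] = uniform_filter(cost_h + cost_v, size=win)
            return np.argmin(costs, axis=0)  # winner-take-all disparity map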

    Robust visual servoing in 3d reaching tasks

    This paper describes a novel approach to the problem of reaching an object in space under visual guidance. The approach is highly robust to calibration errors, to the point that virtually no calibration is required. Servoing is based on binocular vision: a continuous measure of the end-effector motion field, derived from real-time computation of the binocular optical flow over the stereo images, is compared with the actual position of the target, and the error in the end-effector trajectory is continuously corrected. The paper outlines the general framework of the approach, shows how the visual measures are obtained, and discusses the synthesis of the controller along with its stability analysis. Real-time experiments demonstrate the applicability of the approach in real 3-D tasks.
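    As a rough sketch of the control idea (not the paper's actual controller, whose synthesis and stability analysis are developed in the text): the end-effector is localized in the images by integrating the measured optical flow, and a velocity command proportional to the remaining image-space error is issued, so no calibrated kinematic model is needed.

        import numpy as np

        def servo_step(ee_pos, flow_at_ee, target_pos, gain=0.5, dt=0.04):
            # ee_pos, target_pos, flow_at_ee: (2, 2) arrays, one (u, v) row
            # per stereo view. The end-effector position is obtained by
            # integrating its measured binocular optical flow rather than
            # from a robot model, which is why calibration errors wash out.
            ee_pos = ee_pos + flow_at_ee * dt   # integrate image motion
            error = target_pos - ee_pos         # error in both image planes
            return ee_pos, gain * error         # proportional visual command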

    Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality

    Real-time stereo matching is a cornerstone algorithm for many Extended Reality (XR) applications, such as indoor 3D understanding, video pass-through, and mixed-reality games. Despite significant advancements in deep stereo methods, achieving real-time depth inference with high accuracy on a low-power device remains a major challenge. One major difficulty is the lack of high-quality indoor video stereo training datasets captured by head-mounted VR/AR glasses. To address this issue, we introduce a novel synthetic video stereo dataset comprising photorealistic renderings of various indoor scenes with realistic camera motion captured from a 6-DoF moving VR/AR head-mounted display (HMD). This facilitates the evaluation of existing approaches and promotes further research on indoor augmented reality scenarios. The new dataset also enables us to propose a video-based stereo matching approach tailored for XR applications, which achieves real-time inference at 134 FPS on a standard desktop computer, or 30 FPS on a battery-powered HMD. Our key insight is that disparity and contextual information are highly correlated and redundant between consecutive stereo frames. By unrolling an iterative cost aggregation in time (i.e., in the temporal dimension), we distribute and reuse the aggregated features over time, substantially reducing computation without sacrificing accuracy. Extensive evaluations and comparisons demonstrate that our method outperforms the current state of the art, making it a strong contender for real-time stereo matching in VR/AR applications.
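    A hedged sketch of the temporal-unrolling idea (the component names below are stand-ins, not the paper's modules): rather than running K aggregation iterations from scratch on every frame, the aggregated state is warped to the current frame and updated once per frame.

        def video_stereo(frames, build_cost_volume, update_step, warp_to_current):
            # frames yields (left, right, motion) tuples; the three callables
            # stand in for the cost-volume, aggregation-update, and temporal
            # warping components (hypothetical interfaces).
            state, disparity = None, None
            for left, right, motion in frames:
                cost = build_cost_volume(left, right)
                if state is not None:
                    # Consecutive frames are largely redundant, so the last
                    # frame's aggregated features remain valid once warped.
                    state = warp_to_current(state, disparity, motion)
                # One update per frame replaces K per-frame iterations,
                # amortizing the aggregation over the video stream.
                state, disparity = update_step(cost, state)
                yield disparity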

    Multi Camera Stereo and Tracking Patient Motion for SPECT Scanning Systems

    Patient motion, which causes artifacts in reconstructed images, can be a serious problem in Single Photon Emission Computed Tomography (SPECT) imaging. If patient motion can be detected and quantified, the reconstruction algorithm can compensate for it. A real-time multi-threaded Visual Tracking System (VTS) using optical cameras, suitable for deployment in clinical trials, is under development. The VTS tracks patients across multiple video images using image-processing techniques, computing patient motion in three-dimensional space. This research aimed to develop and implement an algorithm for feature matching and stereo location computation using multiple cameras. Feature matching is based on the epipolar geometry constraint for a pair of images and is extended to the multiple-view case with an iterative algorithm. Stereo locations of the matches are then computed using the sum of squared distances from the projected 3D lines in SPECT coordinates as the error metric. This information from the VTS, when coupled with motion assessment from the emission data itself, can provide robust compensation for patient motion as part of reconstruction.
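    The stated error metric, the sum of squared distances from projected 3D lines, corresponds to the classic least-squares intersection of viewing rays; a minimal sketch, assuming camera centers and unit ray directions are already expressed in SPECT coordinates:

        import numpy as np

        def triangulate_from_rays(origins, directions):
            # Solve for the 3D point x minimizing
            #   sum_i || (I - d_i d_i^T) (x - o_i) ||^2,
            # i.e. the summed squared distances from x to each ray.
            A = np.zeros((3, 3))
            b = np.zeros(3)
            for o, d in zip(origins, directions):
                d = d / np.linalg.norm(d)
                P = np.eye(3) - np.outer(d, d)  # projects off the ray direction
                A += P
                b += P @ o
            return np.linalg.solve(A, b)  # stereo location of the feature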

    Self-Attention Dense Depth Estimation Network for Unrectified Video Sequences

    Dense depth estimation of a 3D scene has numerous applications, mainly in robotics and surveillance. LiDAR and radar sensors are the usual hardware solutions for real-time depth estimation, but they produce sparse depth maps and are sometimes unreliable. In recent years, research on depth estimation from a single 2D image has received a lot of attention, and deep-learning-based self-supervised methods trained on rectified stereo and monocular video frames have shown promising results. We propose a self-attention-based depth and ego-motion network for unrectified images, and we introduce the camera's non-differentiable distortion into the training pipeline. Our approach performs competitively compared with established approaches that require rectified images for depth estimation.
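    The abstract does not specify the distortion model; a common choice for unrectified cameras is the Brown-Conrady radial/tangential model, sketched below on normalized image coordinates as it might enter a reprojection-based training loss (an assumption, not the paper's stated pipeline):

        import numpy as np

        def distort_normalized(pts, k1, k2, p1, p2):
            # Brown-Conrady distortion on normalized coordinates (N, 2):
            # radial coefficients k1, k2 and tangential coefficients p1, p2.
            x, y = pts[:, 0], pts[:, 1]
            r2 = x * x + y * y
            radial = 1.0 + k1 * r2 + k2 * r2 * r2
            xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
            yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
            return np.stack([xd, yd], axis=1)  # distorted coordinates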