    RPNet: an End-to-End Network for Relative Camera Pose Estimation

    This paper addresses the task of relative camera pose estimation from raw image pixels by means of deep neural networks. The proposed RPNet network takes pairs of images as input and directly infers the relative poses, without requiring the camera intrinsic or extrinsic parameters. While state-of-the-art systems based on SIFT + RANSAC are able to recover the translation vector only up to scale, RPNet is trained to produce the full translation vector in an end-to-end way. Experiments on the Cambridge Landmark dataset show very promising results regarding the recovery of the full translation vector. They also show that RPNet produces more accurate and more stable results than traditional approaches, especially for hard images (repetitive textures, textureless images, etc.). To the best of our knowledge, RPNet is the first attempt to recover full translation vectors in relative pose estimation.
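
    The "only up to scale" limitation mentioned in the abstract can be illustrated with the epipolar constraint used by classical two-view pipelines. The sketch below is not RPNet and not the paper's code; it assumes calibrated (normalized) image coordinates, and the helper names (`skew`, `project_pair`, `epipolar_residual`) and the numeric values are illustrative. It shows that an essential matrix built from any rescaled copy of the true translation still satisfies the constraint, so the constraint alone cannot fix the translation magnitude.

```python
import math

def skew(t):
    """Cross-product matrix [t]x, so that [t]x v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_y(angle):
    """Rotation about the y axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project_pair(R, t, X):
    """Normalized image coordinates of X in camera 1 and camera 2 (X2 = R X + t)."""
    X2 = [a + b for a, b in zip(matvec(R, X), t)]
    x1 = [X[0] / X[2], X[1] / X[2], 1.0]
    x2 = [X2[0] / X2[2], X2[1] / X2[2], 1.0]
    return x1, x2

def epipolar_residual(R, t_candidate, x1, x2):
    """Value of x2^T E x1 with E = [t_candidate]x R."""
    E = matmul(skew(t_candidate), R)
    return sum(a * b for a, b in zip(x2, matvec(E, x1)))

# Illustrative relative pose and scene point (assumed values).
R = rot_y(0.1)
t_true = [0.5, 0.0, 0.1]
x1, x2 = project_pair(R, t_true, [0.3, -0.2, 4.0])

# Rescaling the translation leaves the constraint satisfied.
residuals = [epipolar_residual(R, [s * c for c in t_true], x1, x2)
             for s in (0.1, 1.0, 10.0)]
# A translation with a different direction violates it.
wrong_direction = epipolar_residual(R, [0.0, 0.5, 0.1], x1, x2)
```

    All three entries of `residuals` vanish up to floating-point error, while `wrong_direction` does not: the image pair constrains the direction of the translation but not its length.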

    Low-Cost Compressive Sensing for Color Video and Depth

    A simple and inexpensive (low-power and low-bandwidth) modification is made to a conventional off-the-shelf color video camera, from which we recover multiple color frames for each of the original measured frames, and each of the recovered frames can be focused at a different depth. The recovery of multiple frames for each measured frame is made possible via high-speed coding, manifested via translation of a single coded aperture; the inexpensive translation is constituted by mounting the binary code on a piezoelectric device. To simultaneously recover depth information, a liquid lens is modulated at high speed, via a variable voltage. Consequently, during the aforementioned coding process, the liquid lens allows the camera to sweep the focus through multiple depths. In addition to designing and implementing the camera, fast recovery is achieved by an anytime algorithm exploiting the group-sparsity of wavelet/DCT coefficients. Comment: 8 pages, CVPR 201

    Learning to Translate in Real-time with Neural Machine Translation

    Translating in real-time, a.k.a. simultaneous translation, outputs translation words before the input sentence ends, which is a challenging problem for conventional machine translation methods. We propose a neural machine translation (NMT) framework for simultaneous translation in which an agent learns to make decisions on when to translate from the interaction with a pre-trained NMT environment. To trade off quality and delay, we extensively explore various targets for delay and design a beam-search method applicable in the simultaneous MT setting. Experiments against state-of-the-art baselines on two language pairs demonstrate the efficacy of the proposed framework both quantitatively and qualitatively. Comment: 10 pages, camera read
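
    The quality/delay trade-off described in the abstract comes from interleaving "read another source word" and "write a target word" decisions. The sketch below is only a toy illustration of that interleaving: the learned agent is replaced by a hand-written fixed-lag rule, the NMT model by a word-for-word callback, and the delay measure is a simple assumed one (the fraction of the source already read when each target word is written). None of these stand-ins is taken from the paper, and the names `simultaneous_decode` and `mean_delay_fraction` are hypothetical.

```python
def simultaneous_decode(source_words, translate_word, lag):
    """Interleave READ and WRITE actions using a fixed-lag rule.

    Assumptions (not from the paper): one target word per source word,
    monotone word order, and a hand-written policy instead of a learned agent.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    read = 0          # number of source words consumed so far
    outputs = []      # target words emitted so far
    actions = []      # trace of READ / WRITE decisions
    delays = []       # source words read at the time of each WRITE
    while len(outputs) < len(source_words):
        ahead = read - len(outputs)
        if read < len(source_words) and ahead < lag:
            read += 1
            actions.append("READ")
        else:
            outputs.append(translate_word(source_words[len(outputs)]))
            delays.append(read)
            actions.append("WRITE")
    return outputs, actions, delays

def mean_delay_fraction(delays, n_source):
    """Average fraction of the source that had been read at each WRITE."""
    return sum(delays) / (len(delays) * n_source)

source = ["a", "b", "c"]
eager = simultaneous_decode(source, str.upper, lag=1)    # write as soon as possible
patient = simultaneous_decode(source, str.upper, lag=3)  # read everything first
```

    With `lag=1` the trace alternates READ and WRITE, whereas `lag=3` reads the whole sentence before writing anything, which is the conventional full-sentence setting and has the largest delay under this measure.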

    Intention recognition for gaze controlled robotic minimally invasive laser ablation

    Eye tracking technology has shown promising results for allowing hands-free control of robotically-mounted cameras and tools. However, existing systems present only limited capabilities in allowing the full range of camera motions in a safe, intuitive manner. This paper introduces a framework for the recognition of surgeon intention, allowing activation and control of the camera through natural gaze behaviour. The system is resistant to noise such as blinking, while allowing the surgeon to look away safely at any time. Furthermore, this paper presents a novel approach to control the translation of the camera along its optical axis using a combination of eye tracking and stereo reconstruction. Combining eye tracking and stereo reconstruction allows the system to determine which point in 3D space the user is fixating, enabling a translation of the camera to achieve the optimal viewing distance. In addition, the eye tracking information is used to perform automatic laser targeting for laser ablation. The desired target point of the laser, mounted on a separate robotic arm, is determined by eye tracking, thus removing the need to manually adjust the laser's target point before starting each new ablation. The calibration methodology used to obtain millimetre precision for the laser targeting without the aid of visual servoing is described. Finally, a user study validating the system is presented, showing clear improvement, with median task times under half of those of a manually controlled robotic system.
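
    The step from a gaze position to a 3D fixation point and an axial camera move can be sketched with textbook stereo geometry. This is not the paper's method: the sketch assumes a rectified stereo pair with a pinhole model, a gaze point already matched in both images, and illustrative names and values (`StereoRig`, `fixation_point`, `axial_correction`, the focal length and baseline).

```python
from dataclasses import dataclass

@dataclass
class StereoRig:
    """Rectified pinhole stereo rig (all values are illustrative assumptions)."""
    focal_px: float     # focal length in pixels
    baseline_mm: float  # distance between the two camera centres
    cx: float           # principal point, x (pixels)
    cy: float           # principal point, y (pixels)

def fixation_point(rig, u_left, v_left, u_right):
    """Triangulate the fixated point from its column in the left and right images.

    Uses depth = focal * baseline / disparity, valid for a rectified pair.
    Returns (x, y, z) in millimetres in the left-camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    z = rig.focal_px * rig.baseline_mm / disparity
    x = (u_left - rig.cx) * z / rig.focal_px
    y = (v_left - rig.cy) * z / rig.focal_px
    return (x, y, z)

def axial_correction(point, target_distance_mm):
    """Signed move along the optical axis that brings the point to the target distance.

    Positive means the camera should advance towards the scene.
    """
    return point[2] - target_distance_mm

rig = StereoRig(focal_px=800.0, baseline_mm=5.0, cx=320.0, cy=240.0)
point = fixation_point(rig, u_left=360.0, v_left=240.0, u_right=320.0)
move = axial_correction(point, target_distance_mm=60.0)
```

    Here the 40-pixel disparity places the fixated point 100 mm from the camera, so a 60 mm viewing distance would call for a 40 mm advance along the optical axis.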

    Camera motion estimation through planar deformation determination

    In this paper, we propose a global method for estimating the motion of a camera filming a static scene. Our approach is direct, fast and robust, and deals with adjacent frames of a sequence. It is based on a quadratic approximation of the deformation between two images, in the case of a scene with constant depth in the camera coordinate system. This condition is very restrictive, but we show that, provided the translation and the variations of inverse depth are small enough, the optical-flow error introduced by approximating the depths by a constant is small. In this context, we propose a new model of camera motion that separates the image deformation into a similarity and a "purely" projective mapping, due to the change of optical axis direction. This model leads to a quadratic approximation of image deformation that we estimate with an M-estimator; we can immediately deduce the camera motion parameters. Comment: 21 pages, revised version accepted on 20 March 200
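
    The role of the M-estimator can be illustrated on a much smaller problem than the paper's quadratic deformation model. The sketch below fits a straight line by iteratively reweighted least squares with Huber weights; the line model, the fixed threshold `k`, the iteration count and the data are all assumptions made for illustration, and the paper's actual estimator and parameterization may differ.

```python
def weighted_line_fit(xs, ys, ws):
    """Closed-form weighted least squares for y = a + b * x."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    b = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    a = (sy - b * sx) / sw
    return a, b

def huber_weight(residual, k):
    """Huber weight: 1 for small residuals, k / |r| for large ones."""
    r = abs(residual)
    return 1.0 if r <= k else k / r

def irls_line_fit(xs, ys, k=1.0, iterations=20):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    ws = [1.0] * len(xs)  # the first pass is ordinary least squares
    for _ in range(iterations):
        a, b = weighted_line_fit(xs, ys, ws)
        ws = [huber_weight(y - (a + b * x), k) for x, y in zip(xs, ys)]
    return weighted_line_fit(xs, ys, ws)

# Points on y = 1 + 2x with one gross outlier (illustrative data).
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[5] = 100.0

ols = weighted_line_fit(xs, ys, [1.0] * len(xs))
robust = irls_line_fit(xs, ys)
```

    The ordinary least-squares slope in `ols` is pulled well away from 2 by the single outlier, while the slope in `robust` stays close to it, which is the property that makes M-estimators attractive when some image measurements violate the motion model.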

    Improved docking alignment system

    Improved techniques are provided for the alignment of two objects. The present invention is particularly suited for 3-D translation and 3-D rotational alignment of objects in outer space. A camera is affixed to one object, such as a remote manipulator arm of the spacecraft, while a planar reflective surface is affixed to the other object, such as a grapple fixture. A monitor displays real-time images from the camera, such that the monitor displays both the reflected image of the camera and visible marking on the planar reflective surface when the objects are in proper alignment. The monitor may thus be viewed by the operator and the arm manipulated so that the reflective surface is perpendicular to the optical axis of the camera, the roll of the reflective surface is at a selected angle with respect to the camera, and the camera is spaced a pre-selected distance from the reflective surface.