29 research outputs found

    Exploring Convolutional Networks for End-to-End Visual Servoing

    Present image-based visual servoing approaches rely on extracting hand-crafted visual features from an image. Choosing the right set of features is important, as it directly affects the performance of any approach. Motivated by recent breakthroughs in the performance of data-driven methods on recognition and localization tasks, we aim to learn visual feature representations suitable for servoing tasks in unstructured and unknown environments. In this paper, we present an end-to-end learning-based approach for visual servoing in diverse scenes where knowledge of camera parameters and scene geometry is not available a priori. This is achieved by training a convolutional neural network on color images with synchronised camera poses. Through experiments performed in simulation and on a quadrotor, we demonstrate the efficacy and robustness of our approach for a wide range of camera poses in both indoor and outdoor environments. Comment: IEEE ICRA 201
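
    The abstract gives no architectural details; as a rough illustration only, the following minimal PyTorch sketch shows what an end-to-end servoing network of this general kind could look like. The layer sizes, the stacked image-pair input, and the 6-DoF relative-pose output are assumptions for illustration, not the authors' design.

```python
# Minimal sketch (assumed architecture, not the paper's): a CNN that maps a pair
# of color images (current view, desired view) to a 6-DoF relative pose
# (3 translations + 3 rotations), which a velocity controller could then track.
import torch
import torch.nn as nn

class ServoNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional encoder applied to the stacked image pair (6 channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regression head producing the 6-DoF relative pose.
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 6)
        )

    def forward(self, current_img, desired_img):
        x = torch.cat([current_img, desired_img], dim=1)  # (B, 6, H, W)
        return self.head(self.encoder(x))                 # (B, 6) pose estimate

# Training would minimise e.g. an L2 loss against the camera poses
# synchronised with the images:
#   loss = ((ServoNet()(cur, des) - gt_relative_pose) ** 2).mean()
```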

    Avancées récentes en asservissement visuel (Recent advances in visual servoing)

    The French community is very active in the field of visual servoing. This article presents its recent advances, both on theoretical aspects (modelling of visual features and design of control laws ensuring various properties of robustness, invariance, stability, decoupling, etc.) and on the new applications addressed (in medical robotics, on aerial vehicles, etc.).

    Visual Servoing from Straight Lines

    In this paper we consider the problem of controlling a robotic system by using the projection of 3D straight lines in the image plane of central catadioptric systems. Most of the effort in visual servoing is devoted to points; only a few works have investigated the use of lines in visual servoing with traditional cameras, and none has explored the case of omnidirectional cameras. First, a generic central catadioptric interaction matrix for the projection of 3D straight lines is derived from the projection model of an entire class of cameras. Then an image-based control law is designed and validated through simulation results and real experiments with a mobile robot.
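
    The abstract does not reproduce the control law itself; for context, image-based schemes of this kind are conventionally written in the following generic form (a standard formulation from the visual servoing literature, not a detail taken from this paper), with the projected line parameters stacked into the feature vector and the interaction matrix derived in the paper playing the role of L_s.

```latex
% Generic image-based visual servoing law (standard form):
% s  : visual features (here, parameters of the projected lines)
% s* : desired features, v_c : camera velocity screw
\dot{s} = L_s \, v_c, \qquad e = s - s^{*}, \qquad v_c = -\lambda \, \widehat{L_s}^{+} \, e
% \widehat{L_s}^{+} is the Moore--Penrose pseudo-inverse of an estimate of the
% interaction matrix and \lambda > 0 a gain enforcing exponential decrease of e.
```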

    Refractive Geometry for Underwater Domes

    Underwater cameras are typically placed behind glass windows to protect them from the water. Spherical glass, a dome port, is well suited for high water pressures at great depth, allows for a large field of view, and avoids refraction if a pinhole camera is positioned exactly at the sphere’s center. Adjusting a real lens perfectly to the dome center is a challenging task, both in terms of how to guide the centering process (e.g. by visual servoing), how to measure the alignment quality, and how to mechanically perform the alignment. Consequently, such systems are prone to being decentered by some offset, leading to challenging refraction patterns at the sphere that invalidate the pinhole camera model. We show that the overall camera system becomes an axial camera, even for thick domes as used for deep-sea exploration, and provide a non-iterative way to compute the center of refraction without requiring knowledge of exact air, glass or water properties. We also analyze the refractive geometry at the sphere, looking at effects such as forward vs. backward decentering and iso-refraction curves, and obtain a 6th-degree polynomial equation for the forward projection of 3D points in thin domes. We then propose a purely underwater calibration procedure to estimate the decentering from multiple images. This estimate can either be used during adjustment to guide the mechanical positioning of the lens, or can be accounted for in photogrammetric underwater applications.
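
    For readers unfamiliar with the geometry, the basic building block behind dome-port refraction is Snell's law applied at a spherical interface. The sketch below is textbook vector-form refraction, not the paper's non-iterative center-of-refraction method; the refractive indices, dome radius and ray are illustrative values.

```python
# Minimal sketch: vector-form Snell's law at the inner surface of a dome port.
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n (pointing toward
    the incident medium); eta = n_incident / n_transmitted."""
    cos_i = -np.dot(n, d)
    sin2_t = eta**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:                       # total internal reflection
        return None
    return eta * d + (eta * cos_i - np.sqrt(1.0 - sin2_t)) * n

# Ray leaving a (possibly decentered) pinhole camera and hitting the inner dome
# surface at point p; the sphere normal at p is (p - center) / |p - center|.
dome_center = np.array([0.0, 0.0, 0.0])
p = np.array([0.0, 0.02, 0.098])                         # point on inner sphere (radius ~0.1 m)
n_outward = (p - dome_center) / np.linalg.norm(p - dome_center)
d_air = np.array([0.0, 0.2, 0.98])
d_air /= np.linalg.norm(d_air)
d_glass = refract(d_air, -n_outward, 1.0 / 1.52)         # air -> glass
# A second refract() call at the outer sphere (glass -> water, eta = 1.52 / 1.33)
# would yield the ray actually traced through the water; when the camera sits
# exactly at dome_center, both refractions leave the ray direction unchanged.
```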

    Robust Visual Servo Control and Tracking for the Manipulator of a Planetary Exploration Rover

    To collect samples and handle tools, planetary exploration rovers commonly employ lightweight robotic manipulators. These can suffer from undesirable positioning imprecision due to erroneous end-effector pose estimates obtained from the manipulator's kinematics, leading to failure of the manipulation task. This thesis presents a vision-based end-effector pose correction pipeline to improve the positioning precision of the end-effector during manipulation tasks. Our approach corrects the end-effector pose by fusing the estimates obtained from the manipulator's kinematics with information obtained from monocular vision data. We propose a gradient-based method to track a set of active markers within the image stream, which provides us with additional information on the covariance of the retrieved image points. In order to recover the 3D pose estimate of the end-effector, we make use of the maximum likelihood perspective-n-point algorithm, allowing us to propagate the image point uncertainties to their 3D pose covariances. Based on evaluations using recorded ground-truth data, we show that our tracking method leads to a reduction of the kinematic position error by up to 77%. To operate outdoors and under changing illumination conditions, the robustness of the tracking approach is paramount. Based on the propagated covariance information, we employ an error-state Kalman filter for the rejection of pose outliers and the reduction of pose jitter. Its smoothing capabilities are confirmed in simulation. We further show the application of the vision-based correction pipeline as part of a visual servoing scheme designed for the collection of payload boxes by the manipulator of the Lightweight Rover Unit 2, developed at the Institute for Robotics and Mechatronics of the German Aerospace Center. We propose a switching control scheme that applies a position-based visual servo (PBVS) for movements of the end-effector in free space and switches to a PBVS-based hybrid impedance visual servoing scheme for movements in close proximity to, or in direct contact with, the coupling partner, to ensure safe interaction between the manipulator and the payload. The implemented PBVS approach is lastly evaluated in simulation. We show its successful execution and stability in the vicinity of singularities, as well as the avoidance of joint position and velocity limits.
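
    To make the switching idea concrete, the sketch below shows a generic position-based visual servo step with a simple proximity-based switch to a compliant mode. It is a standard PBVS formulation from the literature, not the thesis' actual controller; the gains, the contact threshold and the frame conventions are illustrative assumptions.

```python
# Minimal sketch of a generic PBVS step with a proximity-based mode switch.
import numpy as np

def axis_angle(R):
    """Axis-angle vector theta*u of rotation matrix R."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    u = np.array([R[2, 1] - R[1, 2],
                  R[0, 2] - R[2, 0],
                  R[1, 0] - R[0, 1]]) / (2.0 * np.sin(theta))
    return theta * u

def pbvs_step(t_err, R_err, lam=0.5, contact_dist=0.02):
    """Velocity command from the (vision-corrected) end-effector pose error
    w.r.t. the goal frame; requests the compliant/impedance mode when close."""
    v = -lam * t_err                    # linear velocity command
    w = -lam * axis_angle(R_err)        # angular velocity command
    use_impedance = np.linalg.norm(t_err) < contact_dist
    return v, w, use_impedance
```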

    A Unified Hybrid Formulation for Visual SLAM

    Visual Simultaneous Localization and Mapping (Visual SLAM, or VSLAM) is the process of estimating the six-degrees-of-freedom ego-motion of a camera from its video feed while simultaneously constructing a 3D model of the observed environment. Extensive research in the field over the past two decades has yielded real-time and efficient algorithms for VSLAM, allowing various interesting applications in augmented reality, cultural heritage, robotics and the automotive industry, to name a few. The underlying formula behind VSLAM is a mixture of image processing, geometry, graph theory, optimization and machine learning; the theoretical and practical development of these building blocks has led to a wide variety of algorithms, each leveraging different assumptions to achieve superiority under its presumed conditions of operation. An exhaustive survey on the topic outlined seven main components in a generic VSLAM pipeline, namely: the matching paradigm, visual initialization, data association, pose estimation, topological/metric map generation, optimization, and global localization. Before VSLAM can be claimed a solved problem, numerous challenging subjects pertaining to robustness in each of the aforementioned components have to be addressed, namely: resilience to a wide variety of scenes (poorly textured or self-repeating scenarios), resilience to dynamic changes (moving objects), and scalability for long-term operation (computational resource awareness and management). Furthermore, current state-of-the-art VSLAM pipelines are tailored towards static, basic point cloud reconstructions, an impediment to perception applications such as path planning, obstacle avoidance and object tracking. To address these limitations, this work proposes a hybrid scene representation, where different sources of information extracted solely from the video feed are fused in a hybrid VSLAM system. The proposed pipeline allows for the seamless integration of data from pixel-based intensity measurements and geometric entities to produce and make use of a coherent scene representation. The goal is threefold: 1) increase camera tracking accuracy under challenging motions, 2) improve robustness in challenging, poorly textured environments and under varying illumination conditions, and 3) ensure scalability and long-term operation by efficiently maintaining a global reusable map representation.
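
    As a rough illustration of what fusing pixel-based intensity measurements with geometric entities can mean for camera tracking, the sketch below combines a reprojection (geometric) residual with a photometric residual in one objective. This is a generic hybrid formulation, not the thesis' actual pipeline; project and sample_patch are hypothetical helper functions assumed for the example, and the weights are arbitrary.

```python
# Minimal sketch of a hybrid photometric + geometric tracking objective.
import numpy as np

def hybrid_cost(pose, landmarks_3d, keypoints_2d, patch_refs, image_cur,
                project, sample_patch, w_geo=1.0, w_photo=0.1):
    """Weighted sum, for a candidate camera pose, of reprojection errors
    (geometric entities) and intensity differences between reference patches
    and the current image sampled at the reprojected locations (pixel-based
    measurements). project(pose, X) -> 2D pixel and sample_patch(image, u)
    -> patch are assumed helpers, not part of any real VSLAM library."""
    cost = 0.0
    for X, x, ref_patch in zip(landmarks_3d, keypoints_2d, patch_refs):
        u = project(pose, X)                                   # reprojected pixel location
        cost += w_geo * np.sum((u - x) ** 2)                   # geometric residual
        cur_patch = sample_patch(image_cur, u)                 # patch around u in current frame
        cost += w_photo * np.sum((cur_patch - ref_patch) ** 2) # photometric residual
    return cost
```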