622 research outputs found

    Markerless visual servoing on unknown objects for humanoid robot platforms

    Full text link
    To precisely reach for an object with a humanoid robot, it is of central importance to have good knowledge of both end-effector, object pose and shape. In this work we propose a framework for markerless visual servoing on unknown objects, which is divided in four main parts: I) a least-squares minimization problem is formulated to find the volume of the object graspable by the robot's hand using its stereo vision; II) a recursive Bayesian filtering technique, based on Sequential Monte Carlo (SMC) filtering, estimates the 6D pose (position and orientation) of the robot's end-effector without the use of markers; III) a nonlinear constrained optimization problem is formulated to compute the desired graspable pose about the object; IV) an image-based visual servo control commands the robot's end-effector toward the desired pose. We demonstrate effectiveness and robustness of our approach with extensive experiments on the iCub humanoid robot platform, achieving real-time computation, smooth trajectories and sub-pixel precisions

    Visual Servoing from Deep Neural Networks

    Get PDF
    We present a deep neural network-based method to perform high-precision, robust and real-time 6 DOF visual servoing. The paper describes how to create a dataset simulating various perturbations (occlusions and lighting conditions) from a single real-world image of the scene. A convolutional neural network is fine-tuned using this dataset to estimate the relative pose between two images of the same scene. The output of the network is then employed in a visual servoing control scheme. The method converges robustly even in difficult real-world settings with strong lighting variations and occlusions.A positioning error of less than one millimeter is obtained in experiments with a 6 DOF robot.Comment: fixed authors lis

    Cooperative tasks between humans and robots in industrial environments

    Get PDF
    Collaborative tasks between human operators and robotic manipulators can improve the performance and flexibility of industrial environments. Nevertheless, the safety of humans should always be guaranteed and the behaviour of the robots should be modified when a risk of collision may happen. This paper presents the research that the authors have performed in recent years in order to develop a human-robot interaction system which guarantees human safety by precisely tracking the complete body of the human and by activating safety strategies when the distance between them is too small. This paper not only summarizes the techniques which have been implemented in order to develop this system, but it also shows its application in three real human-robot interaction tasks.The research leading to these results has received funding from the European Communityʹs Seventh Framework Programme (FP7/2007‐2013) under Grant Agreement no. 231640 and the project HANDLE. This research has also been supported by the Spanish Ministry of Education and Science through the research project DPI2011‐22766

    Visual 3-D SLAM from UAVs

    Get PDF
    The aim of the paper is to present, test and discuss the implementation of Visual SLAM techniques to images taken from Unmanned Aerial Vehicles (UAVs) outdoors, in partially structured environments. Every issue of the whole process is discussed in order to obtain more accurate localization and mapping from UAVs flights. Firstly, the issues related to the visual features of objects in the scene, their distance to the UAV, and the related image acquisition system and their calibration are evaluated for improving the whole process. Other important, considered issues are related to the image processing techniques, such as interest point detection, the matching procedure and the scaling factor. The whole system has been tested using the COLIBRI mini UAV in partially structured environments. The results that have been obtained for localization, tested against the GPS information of the flights, show that Visual SLAM delivers reliable localization and mapping that makes it suitable for some outdoors applications when flying UAVs

    Suivi Multi-Locuteurs avec des Informations Audio-Visuelles pour la Perception des Robots

    Get PDF
    Robot perception plays a crucial role in human-robot interaction (HRI). Perception system provides the robot information of the surroundings and enables the robot to give feedbacks. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where are the people, who are speaking, or what are they talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. We use different modalities of the robot’s perception system to achieve the goal. Like seeing and hearing for a human-being, audio and visual information are the critical cues for a robot in a conversational scenario. The advancement of computer vision and audio processing of the last decade has revolutionized the robot perception abilities. In this thesis, we have the following contributions: we first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework gives closed-form tractable problem solutions, which makes the tracking process efficient. The framework is first applied to visual multiple-person tracking. Birth and death process are built jointly with the framework to deal with the varying number of the people in the scene. Furthermore, we exploit the complementarity of vision and robot motorinformation. On the one hand, the robot’s active motion can be integrated into the visual tracking system to stabilize the tracking. On the other hand, visual information can be used to perform motor servoing. Moreover, audio and visual information are then combined in the variational framework, to estimate the smooth trajectories of speaking people, and to infer the acoustic status of a person- speaking or silent. In addition, we employ the model to acoustic-only speaker localization and tracking. Online dereverberation techniques are first applied then followed by the tracking system. Finally, a variant of the acoustic speaker tracking model based on von-Mises distribution is proposed, which is specifically adapted to directional data. All the proposed methods are validated on datasets according to applications.La perception des robots joue un rôle crucial dans l’interaction homme-robot (HRI). Le système de perception fournit les informations au robot sur l’environnement, ce qui permet au robot de réagir en consequence. Dans un scénario de conversation, un groupe de personnes peut discuter devant le robot et se déplacer librement. Dans de telles situations, les robots sont censés comprendre où sont les gens, ceux qui parlent et de quoi ils parlent. Cette thèse se concentre sur les deux premières questions, à savoir le suivi et la diarisation des locuteurs. Nous utilisons différentes modalités du système de perception du robot pour remplir cet objectif. Comme pour l’humain, l’ouie et la vue sont essentielles pour un robot dans un scénario de conversation. Les progrès de la vision par ordinateur et du traitement audio de la dernière décennie ont révolutionné les capacités de perception des robots. Dans cette thèse, nous développons les contributions suivantes : nous développons d’abord un cadre variationnel bayésien pour suivre plusieurs objets. Le cadre bayésien variationnel fournit des solutions explicites, rendant le processus de suivi très efficace. Cette approche est d’abord appliqué au suivi visuel de plusieurs personnes. Les processus de créations et de destructions sont en adéquation avecle modèle probabiliste proposé pour traiter un nombre variable de personnes. De plus, nous exploitons la complémentarité de la vision et des informations du moteur du robot : d’une part, le mouvement actif du robot peut être intégré au système de suivi visuel pour le stabiliser ; d’autre part, les informations visuelles peuvent être utilisées pour effectuer l’asservissement du moteur. Par la suite, les informations audio et visuelles sont combinées dans le modèle variationnel, pour lisser les trajectoires et déduire le statut acoustique d’une personne : parlant ou silencieux. Pour experimenter un scenario où l’informationvisuelle est absente, nous essayons le modèle pour la localisation et le suivi des locuteurs basé sur l’information acoustique uniquement. Les techniques de déréverbération sont d’abord appliquées, dont le résultat est fourni au système de suivi. Enfin, une variante du modèle de suivi des locuteurs basée sur la distribution de von-Mises est proposée, celle-ci étant plus adaptée aux données directionnelles. Toutes les méthodes proposées sont validées sur des bases de données specifiques à chaque application

    Planar Object Tracking in the Wild: A Benchmark

    Full text link
    Planar object tracking is an actively studied problem in vision-based robotic applications. While several benchmarks have been constructed for evaluating state-of-the-art algorithms, there is a lack of video sequences captured in the wild rather than in constrained laboratory environment. In this paper, we present a carefully designed planar object tracking benchmark containing 210 videos of 30 planar objects sampled in the natural environment. In particular, for each object, we shoot seven videos involving various challenging factors, namely scale change, rotation, perspective distortion, motion blur, occlusion, out-of-view, and unconstrained. The ground truth is carefully annotated semi-manually to ensure the quality. Moreover, eleven state-of-the-art algorithms are evaluated on the benchmark using two evaluation metrics, with detailed analysis provided for the evaluation results. We expect the proposed benchmark to benefit future studies on planar object tracking.Comment: Accepted by ICRA 201
    corecore