15 research outputs found

    Towards binocular active vision in a robot head system

    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space, without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yield a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a highly cluttered environment. Finally, a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the field of view resulting from any individual saccade.
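    The paper reports vergence accuracy as a pixel error, which suggests a simple closed-loop scheme: measure the residual horizontal disparity of the fixated object between the two cameras and servo the vergence angle until it vanishes. The Python sketch below is illustrative only, assuming matched keypoint coordinates (N x 2 arrays of x, y pixel positions) are already available from the symbolic matching stage; the names and gain value are hypothetical, not from the paper.

        import numpy as np

        def vergence_error_px(left_kp: np.ndarray, right_kp: np.ndarray) -> float:
            """Residual horizontal disparity (pixels) between the matched
            keypoints of the fixated object in the left and right images;
            zero means the cameras are verged on the object."""
            return float(np.mean(left_kp[:, 0]) - np.mean(right_kp[:, 0]))

        def vergence_step(error_px: float, gain_rad_per_px: float = 0.001) -> float:
            """Proportional correction to the vergence angle (radians);
            iterate until |error_px| falls below a tolerance."""
            return -gain_rad_per_px * error_px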

    On the Challenges of Open World Recognition under Shifting Visual Domains

    Robotic visual systems operating in the wild must act in unconstrained scenarios, under different environmental conditions, while facing a variety of semantic concepts, including unknown ones. To this end, recent works have tried to empower visual object recognition methods with the capability to i) detect unseen concepts and ii) extend their knowledge over time, as images of new semantic classes arrive. This setting, called Open World Recognition (OWR), aims to produce systems capable of breaking the semantic limits present in the initial training set. However, this training set imposes on the system not only its own semantic limits, but also environmental ones, due to its bias toward certain acquisition conditions that do not necessarily reflect the high variability of the real world. This discrepancy between training and test distributions is called domain shift. This work investigates whether OWR algorithms are effective under domain shift, presenting the first benchmark setup for fairly assessing the performance of OWR algorithms, with and without domain shift. We then use this benchmark to conduct analyses in various scenarios, showing how existing OWR algorithms indeed suffer a severe performance degradation when training and test distributions differ. Our analysis shows that this degradation is only slightly mitigated by coupling OWR with domain generalization techniques, indicating that merely plugging in existing algorithms is not enough to recognize new and unknown categories in unseen domains. Our results clearly point toward open issues and future research directions that need to be investigated for building robot visual systems able to function reliably under these challenging yet very real conditions. Code available at https://github.com/DarioFontanel/OWR-VisualDomains. Comment: RAL/ICRA 202
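    Although this paper benchmarks existing OWR algorithms rather than proposing a new one, much OWR work builds on a nearest-class-mean classifier with a rejection threshold, which both detects unknowns and can be extended incrementally as new classes arrive. A minimal Python sketch of that idea, with illustrative names and thresholds (not the paper's code):

        import numpy as np

        class NCMOpenWorld:
            """Nearest-class-mean classifier with rejection: samples farther
            than reject_dist from every class mean are labeled unknown."""

            def __init__(self, reject_dist: float):
                self.means = {}              # class label -> mean feature vector
                self.reject_dist = reject_dist

            def add_class(self, label: str, feats: np.ndarray) -> None:
                # Extend knowledge over time as images of a new class arrive.
                self.means[label] = feats.mean(axis=0)

            def predict(self, feat: np.ndarray) -> str:
                if not self.means:
                    return "unknown"
                label, dist = min(
                    ((l, np.linalg.norm(feat - m)) for l, m in self.means.items()),
                    key=lambda p: p[1],
                )
                return label if dist <= self.reject_dist else "unknown"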

    ROVIS: RObust Machine VIsion for Service Robotic System FRIEND

    In this paper, the vision architecture of the robotic system FRIEND, named ROVIS, is presented. The main concept of ROVIS is the inclusion of feedback structures between different components of the vision system, as well as between the vision system and other modules of the robot, to achieve high robustness against external influences on the individual system units as well as on the system as a whole. The novelty of this work lies in the inclusion of feedback control at different levels of the 2D object recognition system to provide reliable inputs to the 3D object reconstruction and object manipulation modules of FRIEND. The idea behind this approach is to change the processing parameters in a closed-loop manner so that the current image processing result at a particular processing level is driven toward a desired result. The effectiveness of ROVIS is demonstrated through experimental results on 3D reconstruction of different objects from the FRIEND environment.
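    The closed-loop idea of adjusting processing parameters until the result at a given level matches a desired one can be made concrete with a toy example. The sketch below bisects a binary-segmentation threshold until the measured foreground matches an expected value; it is a generic illustration of feedback-controlled image processing under assumed monotonicity, not the actual ROVIS code, and all names are hypothetical.

        def closed_loop_threshold(image, measure, target, lo=0.0, hi=255.0, iters=12):
            """Drive a segmentation result toward a desired measurement
            (e.g. the expected object area) by bisecting the threshold.
            Assumes image is a NumPy grayscale array and that measure(mask)
            decreases as the threshold rises."""
            for _ in range(iters):
                t = (lo + hi) / 2.0
                result = measure(image > t)   # current processing result
                if result > target:
                    lo = t                    # too much foreground: raise threshold
                else:
                    hi = t
            return (lo + hi) / 2.0

        # e.g. closed_loop_threshold(gray, np.count_nonzero, expected_area)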

    Autonomous navigation for guide following in crowded indoor environments

    The requirements for assisted living are rapidly changing as the number of elderly patients over the age of 60 continues to increase. This rise places a high level of stress on nurse practitioners, who must care for more patients than they are capable of handling. As this trend is expected to continue, new technology will be required to help care for patients. Mobile robots present an opportunity to alleviate the stress on nurse practitioners by monitoring elderly patients and performing remedial tasks for them. In order to produce mobile robots with the ability to perform these tasks, however, many challenges must be overcome. The hospital environment requires a high level of safety to prevent patient injury; any facility that uses mobile robots must therefore be able to ensure that no harm will come to patients in a care environment. This requires the robot to build a high-level understanding of the environment and of the people in close proximity to it. Hitherto, most mobile robots have used vision-based sensors or 2D laser range finders. 3D time-of-flight sensors have recently been introduced and provide dense 3D point clouds of the environment at real-time frame rates, giving mobile robots previously unavailable dense information in real time. In this thesis, I investigate the use of time-of-flight cameras for mobile robot navigation in crowded environments. A unified framework that allows the robot to follow a guide through an indoor environment safely and efficiently is presented, and each component of the framework is analyzed in detail, with real-world scenarios illustrating its practical use. Time-of-flight cameras are relatively new sensors and therefore have inherent problems that must be overcome to obtain consistent and accurate data; I propose a novel and practical probabilistic framework that overcomes many of these problems by fusing multiple depth maps with color information, forming a reliable and consistent view of the world. In order for the robot to interact with the environment, contextual information is required. To this end, I propose a region-growing segmentation algorithm that groups points based on surface characteristics: surface normal and surface curvature. The segmentation process creates a distinct set of surfaces; however, only a limited amount of contextual information is available to allow for interaction. Therefore, a novel classifier is proposed, using spherical harmonics to differentiate people from all other objects. The added ability to identify people allows the robot to find potential candidates to follow. However, for safe navigation, the robot must continuously track all visible objects to obtain positional and velocity information. A multi-object tracking system is investigated that tracks visible objects reliably using multiple cues: shape and color. The tracking system allows the robot to react to the dynamic nature of people by building an estimate of the motion flow, which provides the robot with the information needed to determine where, and at what speeds, it is safe to drive. In addition, a novel search strategy is proposed to allow the robot to recover a guide who has left the field of view. To achieve this, a search map is constructed in which areas of the environment are ranked according to how likely they are to reveal the guide’s true location; the robot can then approach the most likely search area to recover the guide. Finally, all components presented are combined to follow a guide through an indoor environment. The results achieved demonstrate the efficacy of the proposed components.
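    The region-growing segmentation described in this abstract (grouping points by surface normal and curvature) follows a well-known pattern: seed at the flattest point, then flood-fill across neighbors whose normals agree within a tolerance. A minimal Python sketch under assumed inputs (unit normals, per-point curvature, a precomputed neighbor list) with illustrative thresholds, not the thesis implementation:

        import numpy as np
        from collections import deque

        def region_grow(normals, curvature, neighbors,
                        angle_thresh=np.deg2rad(10.0), curv_thresh=0.05):
            """Label each point with a region id; a point joins a region when
            its normal is within angle_thresh of the current point's normal
            and its curvature is below curv_thresh."""
            n = len(normals)
            labels = np.full(n, -1)
            next_region = 0
            for seed in np.argsort(curvature):      # flattest points first
                if labels[seed] != -1:
                    continue
                labels[seed] = next_region
                queue = deque([seed])
                while queue:
                    p = queue.popleft()
                    for q in neighbors[p]:          # precomputed k-NN indices
                        if labels[q] != -1:
                            continue
                        if (abs(np.dot(normals[p], normals[q])) >= np.cos(angle_thresh)
                                and curvature[q] < curv_thresh):
                            labels[q] = next_region
                            queue.append(q)
                next_region += 1
            return labels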

    Development of a library for stereoscopic vision systems for mobile robotics

    Master's dissertation, Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Engenharia Elétrica. The demand for mobile robotics applications has grown considerably in recent years. Regardless of their nature or purpose, autonomous mobile robots must interact with the world to achieve their goals, and to do so they need to obtain information about the environment. Among the different existing approaches, good results have been achieved with stereoscopic vision systems. A system of this kind makes it possible to extract three-dimensional information about the environment in which the robot operates. This three-dimensional information can then be used to guide the robot's actions, whether for navigation, recognition or manipulation. This work presents the development of a library for stereoscopic vision systems for mobile robotics. To this end, different problems of stereoscopy were addressed, with the aim of providing depth maps with enough environmental detail for the general operation of a mobile robot that has a stereoscopic vision system as its main source of information. In this context, models, methods and solutions are evaluated and proposed for different problems such as camera calibration, image rectification, reconstruction and, above all, the generation of dense depth maps. The results obtained demonstrate the effectiveness of the provided infrastructure and of the proposed methods in the development of mobile robotics applications.
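    The core geometric relation behind the dense depth maps this dissertation targets is the standard rectified-stereo formula Z = f * B / d: depth equals focal length times baseline over disparity. A minimal sketch of that relation (illustrative names, not the library's API):

        import numpy as np

        def depth_from_disparity(disparity_px, focal_px, baseline_m):
            """Depth map in meters from a disparity map in pixels, given the
            focal length in pixels and the stereo baseline in meters; invalid
            (non-positive) disparities become NaN."""
            d = np.asarray(disparity_px, dtype=float)
            safe = np.where(d > 0, d, np.nan)   # mask out invalid matches
            return focal_px * baseline_m / safe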