
    Towards Active Event Recognition

    Directing robot attention to recognise activities and to anticipate events such as goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, intrinsic time constraints, the spatially distributed nature of the relevant information sources, and a multitude of unobservable states affecting the system, such as latent intentions, have long made such skills elusive. The problem tests the limits of current attention-control systems: it requires an integrated solution for tracking, exploration and recognition, which have traditionally been treated as separate problems in active vision. We propose a probabilistic generative framework based on a mixture of Kalman filters and information-gain maximisation that uses predictions both for recognition and for attention control. The framework can efficiently use observations of one element in a dynamic environment to provide information about other elements, and consequently enables guided exploration. Notably, the sensor-control policy, derived directly from first principles, captures the intuitive trade-off between seeking the most discriminative cues and maintaining overall awareness. Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvements in recognition time and precision over baseline systems.
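The information-gain attention policy described in this abstract can be illustrated with a toy sketch: each tracked element gets its own Kalman filter, and attention goes to the element whose next observation would most reduce entropy. Everything below (the 1-D constant-position filters, the noise values, the `attend` helper) is a hypothetical reconstruction for illustration, not the authors' implementation.

```python
import math

class Kalman1D:
    """Minimal 1-D constant-position Kalman filter (illustrative only)."""
    def __init__(self, x, p, q, r):
        # state estimate, state variance, process noise, measurement noise
        self.x, self.p, self.q, self.r = x, p, q, r

    def predict(self):
        self.p += self.q  # uncertainty grows while the target goes unobserved

    def update(self, z):
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)

    def expected_info_gain(self):
        # Entropy reduction (in nats) from one observation of this target:
        # 0.5 * log(prior_var / posterior_var) = 0.5 * log((p + r) / r)
        return 0.5 * math.log((self.p + self.r) / self.r)

def attend(filters):
    """Greedy policy: look at the target whose observation is most informative."""
    return max(range(len(filters)), key=lambda i: filters[i].expected_info_gain())

# Two targets; the second has drifted unobserved, so its variance is larger.
targets = [Kalman1D(x=0.0, p=0.5, q=0.01, r=0.1),
           Kalman1D(x=3.0, p=4.0, q=0.01, r=0.1)]
for f in targets:
    f.predict()
print(attend(targets))  # -> 1: the policy fixates the more uncertain target
```

The same gain formula also explains the trade-off noted in the abstract: once a target has been observed, its variance (and hence its information gain) drops, so attention naturally rotates back to neglected targets, maintaining overall awareness.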

    Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

    It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that reduce its uncertainty about the unobserved portions of its environment. Based on this principle, we develop a recurrent neural network-based approach to perform active completion of panoramic natural scenes and 3D object shapes. Crucially, the learned policies are not tied to any recognition task nor to the particular semantic content seen during training. As a result, 1) the learned "look around" behavior is relevant even for new tasks in unseen environments, and 2) training data acquisition involves no manual labeling. Through tests in diverse settings, we demonstrate that our approach learns useful generic policies that transfer to new unseen tasks and environments. Completion episodes are shown at https://goo.gl/BgWX3W.
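The reward principle in this abstract, crediting the agent for reducing uncertainty about unobserved portions of the scene, can be sketched in miniature. Below is a hypothetical toy version: uncertainty is a per-cell score over a small panorama, and the reward for a glimpse is the total uncertainty it removes. The paper itself scores pixel-wise completion of scenes with a learned model; the visibility-count proxy here is an assumption for illustration.

```python
def reward(belief, view):
    """Reward = drop in total uncertainty after adding a view.

    belief: dict mapping panorama cell -> uncertainty in [0, 1]
    view:   iterable of cells the new glimpse reveals
    """
    before = sum(belief.values())
    for cell in view:
        belief[cell] = 0.0  # an observed cell is no longer uncertain
    return before - sum(belief.values())

# A 4-cell panorama, fully unknown at the start.
belief = {c: 1.0 for c in range(4)}
r1 = reward(belief, view=[0, 1])  # first glimpse reveals two new cells
r2 = reward(belief, view=[1, 2])  # overlapping glimpse reveals only one
print(r1, r2)  # -> 2.0 1.0
```

Note how the overlapping second glimpse earns less reward: under this objective, an efficient policy learns to spread its glimpses so that each one covers territory the previous views did not.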

    Combined object recognition approaches for mobile robotics

    There are numerous solutions to simple object recognition problems when the machine operates under strict environmental conditions (such as controlled lighting). Object recognition in real-world environments, however, poses greater difficulty. Ideally, mobile robots will function in real-world environments without the aid of fiduciary identifiers, so more robust methods are needed to perform object recognition reliably. A combined approach using multiple techniques improves recognition results. Active vision and peripheral-foveal vision, systems designed to improve the information gathered for object recognition, are examined. In addition, five object recognition methods that either make use of some form of active vision or could leverage active vision and/or peripheral-foveal vision systems are investigated: affine-invariant image patches, perceptual organization, 3D morphable models (3DMMs), active viewpoint, and adaptive color segmentation. The current state of the art in these areas of vision research and observations on areas of future research are presented. Examples of state-of-the-art methods employed in other vision applications that have not yet been used for object recognition are also mentioned. Lastly, the future direction of the research field is hypothesized.

    An Innovative SIFT-Based Method for Rigid Video Object Recognition

    This paper presents an innovative SIFT-based method for rigid video object recognition (hereafter called RVO-SIFT). As in the human visual system, the method unifies object recognition and feature updating into a single process, using both trajectory and feature matching. It can therefore learn new features not only in the training stage but also during recognition, which greatly improves the completeness of the video object's feature set and, in turn, substantially increases the correct-recognition rate. Experimental results on real video sequences demonstrate its robustness and efficiency.
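The recognise-and-update loop described above can be sketched abstractly: match a frame's descriptors against the object model, and if enough of them match, absorb the unmatched descriptors as newly learned features. Descriptors are plain coordinate tuples and the matching is brute-force nearest-neighbour; real SIFT descriptors, the trajectory matching, and the thresholds (`thresh`, `accept_ratio`) are all stand-in assumptions, not the RVO-SIFT algorithm itself.

```python
def match(model, frame_feats, thresh=0.5):
    """Greedy nearest-neighbour matching of frame descriptors to the model."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    matched, unmatched = [], []
    for f in frame_feats:
        best = min(model, key=lambda m: dist(m, f), default=None)
        if best is not None and dist(best, f) <= thresh:
            matched.append(f)
        else:
            unmatched.append(f)
    return matched, unmatched

def recognise_and_update(model, frame_feats, accept_ratio=0.5):
    """If enough features match, declare recognition and absorb the rest."""
    matched, unmatched = match(model, frame_feats)
    recognised = len(matched) >= accept_ratio * len(frame_feats)
    if recognised:
        model.extend(unmatched)  # learn new features during recognition
    return recognised

model = [(0.0, 0.0), (1.0, 1.0)]
frame = [(0.1, 0.0), (1.0, 0.9), (5.0, 5.0)]  # two known features + one new view
ok = recognise_and_update(model, frame)
print(ok, len(model))  # -> True 3: recognised, and the model grew by one feature
```

The key idea the abstract emphasises is visible in the last line: because the frame was confidently recognised, the previously unseen descriptor is added to the model, so later frames showing that side of the object will also match.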

    Peripheral-foveal vision for real-time object recognition and tracking in video

    Human object recognition in a physical 3-D environment is still far superior to that of any robotic vision system. We believe that one reason (out of many) for this, one that has not heretofore been significantly exploited in the artificial-vision literature, is that humans use a fovea to fixate on or near an object, obtaining a very high-resolution image of the object and rendering it easy to recognize. In this paper, we present a novel method for identifying and tracking objects in multi-resolution digital video of partially cluttered environments. Our method is motivated by biological vision systems and uses a learned "attentive" interest map on a low-resolution data stream to direct a high-resolution "fovea". Objects recognized in the fovea can then be tracked using peripheral vision. Because object recognition runs only on a small foveal image, our system achieves real-time object recognition and tracking performance well beyond that of simpler systems.
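The peripheral-to-foveal handoff in this abstract can be shown with a minimal sketch: score a low-resolution stream, fixate the most salient cell, and crop the corresponding high-resolution patch for recognition. The salience measure here is just raw intensity, a hypothetical stand-in for the learned attentive interest map, and the images are plain nested lists.

```python
def foveate(low_res, high_res, fovea=2):
    """Pick the most 'interesting' low-res cell and crop the matching
    high-res patch. Salience = raw intensity (a stand-in for the paper's
    learned interest map)."""
    h, w = len(low_res), len(low_res[0])
    # argmax over the low-resolution interest map
    r, c = max(((r, c) for r in range(h) for c in range(w)),
               key=lambda rc: low_res[rc[0]][rc[1]])
    scale = len(high_res) // h          # low-res to high-res coordinate scale
    r0, c0 = r * scale, c * scale
    return [row[c0:c0 + fovea] for row in high_res[r0:r0 + fovea]]

low = [[0, 1],
       [9, 2]]                           # peripheral stream: 2x2 interest map
high = [[i * 4 + j for j in range(4)]    # 4x4 full-resolution frame
        for i in range(4)]
patch = foveate(low, high)
print(patch)  # -> [[8, 9], [12, 13]]: crop around the most salient cell
```

This is also where the real-time claim comes from: the expensive recognizer only ever sees the small foveal crop, while the cheap interest map runs over the whole (downsampled) frame.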