Irregular pyramid: a hierarchical structure for exploratory vision
We present a new class of hierarchical structures known as the irregular pyramid. This pyramid is characterized by non-regular subsampling defined as a function of the spatial location of the points. The structure is made as consistent as possible with the human visual mapping, resembling the retina with its fovea and periphery. It allows fine sampling in the focus area and coarse sampling elsewhere in the scene, which yields smaller images that are faster to process. Examples are shown, as well as a validation of the approach on dynamic images in an exploratory vision application to motion detection.
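The idea of position-dependent subsampling can be illustrated with a minimal Python sketch. It is not the authors' irregular pyramid: the function name `foveate`, the three fixed cell sizes, and the Chebyshev-distance thresholds are all assumptions chosen for illustration. The sketch keeps single pixels near the focus and averages progressively larger cells farther away, so the scene is summarized by far fewer samples than pixels.

```python
def foveate(image, focus, fine_radius=4, block=4):
    """Return (row, col, size, mean) samples of a 2D grayscale image.

    Hypothetical illustration: cells are 1x1 near `focus`, (block//2) square
    in a middle ring, and block x block in the periphery.
    """
    height, width = len(image), len(image[0])
    samples = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Chebyshev distance from the block centre to the focus point.
            dist = max(abs(top + block / 2 - focus[0]),
                       abs(left + block / 2 - focus[1]))
            if dist <= fine_radius:
                size = 1                   # full resolution in the focus area
            elif dist <= 2 * fine_radius:
                size = max(1, block // 2)  # intermediate resolution
            else:
                size = block               # coarse resolution in the periphery
            # Split the block into cells of the chosen size and average each one.
            for r in range(top, min(top + block, height), size):
                for c in range(left, min(left + block, width), size):
                    cell = [image[y][x]
                            for y in range(r, min(r + size, height))
                            for x in range(c, min(c + size, width))]
                    samples.append((r, c, size, sum(cell) / len(cell)))
    return samples


# Usage: a synthetic 32x32 gradient image with the focus at its centre.
image = [[(x + y) % 256 for x in range(32)] for y in range(32)]
samples = foveate(image, focus=(16, 16))
print(len(samples), "samples instead of", 32 * 32, "pixels")
```

Moving `focus` between calls would mimic a change of fixation: the fine region follows the point of interest while the total number of samples stays small.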
Implementing Selective Attention in Machines: The Case of Touch-Driven Saccades
Recent paradigms in robotics and machine perception have emphasized the importance of selective attention mechanisms for perceiving and interacting with the environment. For a system whose operations require physical interaction with its surroundings, the ability to respond attentively to tactile events plays a major role. By performing somatosensory saccades, the nature of the cutaneous stimulation can be assessed and new motor actions can be planned. However, the study of touch-driven attention has been almost entirely neglected by robotics researchers. This paper investigates the development of visuo-cutaneous coordination for the production of somatosensory saccades and proposes a general architecture for integrating different kinds of attentive mechanisms. The system autonomously discovers the sensorimotor transformation that links tactile events to visual saccades, on the basis of multisensory consistencies and basic, built-in motor reflexes. Results obtained with both simulations and robotic experiments are analyzed.
Foveated Vision Models for Search and Recognition
Computer vision has made significant progress in recent years thanks to advances in neural network architectures and computing power. At the sensory level, current machine vision systems sample the visual data uniformly to make predictions about the scene. This is in contrast with the human visual system, which has high visual acuity only in a small central region, the fovea, and much coarser sampling away from the center. There has been renewed interest, particularly in the context of active vision for robot navigation and scene exploration, in developing biologically motivated methods that can leverage such foveated computations. While foveated vision offers computational savings at or near the region of interest, it requires eye movements to scan the scene for effective image understanding. The hypothesis is that methods that combine non-uniform sampling of the field of view with eye movements will lead to a new class of active vision systems that are computationally optimized for specific tasks of interest.
Inspired by the above observations, this research provides, for the first time, a comprehensive study of human visual search in the constrained setting of person identification in the wild. A novel video database is created that systematically tests how different parts of a person contribute to eye movements and person identification. Our study shows that search errors can dominate the overall recognition accuracy in human subject experiments. This calls for new strategies for integrating eye tracking with foveated image representations. Towards this, two specific approaches are investigated further.
In the first approach, a deep neural network based method is developed to model eye movements, using a long short-term memory (LSTM) network to model the successive fixations. The proposed method outperforms the state of the art while simplifying the feature extraction procedure.
The second approach focuses on a foveated image model that leverages multiple fixations. A convolutional neural network method is proposed that works directly with the foveated input images and achieves competitive recognition rates compared to standard neural networks operating on the same number of input pixels. Overall, the thesis investigates the requirements and implementations that could support active foveated vision, and lays the groundwork for future studies in this area.