Methods and Apparatus for Autonomous Robotic Control
Sensory processing of visual, auditory, and other sensor information (e.g., visual imagery, LIDAR, RADAR) is conventionally based on "stovepiped," or isolated, processing, with little interaction between modules. Biological systems, on the other hand, fuse multi-sensory information to identify nearby objects of interest more quickly, more efficiently, and with higher signal-to-noise ratios. Similarly, examples of the OpenSense technology disclosed herein use neurally inspired processing to identify and locate objects in a robot's environment. This enables the robot to navigate its environment more quickly and with lower computational and power requirements.
Doctor of Philosophy
dissertation
We propose to examine a representation which features combined action and perception signals, i.e., instead of having a purely geometric representation of the perceptual data, we include the motor actions, e.g., aiming a camera at an object, which are al
A hierarchical active binocular robot vision architecture for scene exploration and object appearance learning
This thesis presents an investigation of a computational model of hierarchical visual behaviours within an active binocular robot vision architecture. The robot vision system is able to localise multiple instances of the same object class, while simultaneously maintaining vergence and directing its gaze to attend to and recognise objects within cluttered, complex scenes. This is achieved by implementing all image analysis in an egocentric symbolic space without creating explicit pixel-space maps and without the need for calibration or other knowledge of the camera geometry. An important requirement of the active binocular vision paradigm is that visual features in both cameras be bound together in order to drive visual search to saccade to, locate and recognise putative objects or salient locations in the robot's field of view. The system structure is based on the “attentional spotlight” metaphor of biological systems and a collection of abstract and reactive visual behaviours arranged in a hierarchical structure.
Several studies have shown that the human brain represents and learns objects for recognition as snapshots of 2-dimensional views of the imaged scene that happens to contain the object of interest during active interaction with (exploration of) the environment. Likewise, psychophysical findings indicate that the primate visual cortex represents common everyday objects by a hierarchical structure of their parts or sub-features and, consequently, recognises them by simple but imperfect 2D view approximations of object parts. This thesis incorporates the above observations into an active visual learning behaviour in the hierarchical active binocular robot vision architecture. By actively exploring the object viewing sphere (as higher mammals do), the robot vision system automatically synthesises its own part-based object representation from multiple observations while a human teacher indicates the object and supplies a classification name. It is proposed to adopt the computational concepts of a visual learning exploration mechanism that controls the accumulation of visual evidence and directs attention towards the spatially salient object parts.
The behavioural structure of the binocular robot vision architecture is loosely modelled on the WHAT and WHERE visual streams. The WHERE stream maintains and binds spatial attention on the object part coordinates that egocentrically characterise the location of the object of interest, and extracts spatio-temporal properties of feature coordinates and descriptors. The WHAT stream either determines the identity of an object or triggers a learning behaviour that stores view-invariant feature descriptions of the object part. The robot vision system is therefore capable of performing a collection of specific visual tasks such as vergence, detection, discrimination, recognition, localisation and multiple same-instance identification. This classification of tasks enables the robot vision system to execute and fulfil specified high-level tasks, e.g. autonomous scene exploration and active object appearance learning.
Advanced Knowledge Application in Practice
The integration and interdependency of the world economy leads towards the creation of a global market that offers more opportunities, but is also more complex and competitive than ever before. Therefore widespread research activity is necessary if one is to remain successful on the market. This book is the result of research and development activities from a number of researchers worldwide, covering concrete fields of research
Space-variant picture coding
PhD
Space-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first of these directions is in foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model, and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy smoothly space-variant Gaussian blurring, using a specialised filter bank which approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to the brute force approach of employing a separate low-pass filter at each image location. The second direction is that of artificially increasing the depth-of-field of an image, an idea borrowed from photography with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth-of-field algorithms are presented herein, with the desirable properties of aiming to mimic occlusion effects as occur in natural blurring, and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach have been investigated by subjective experiments to compare it with single-viewer foveated image coding. The results found the depth-based preblurring to generally be significantly preferable to the same level of foveation blurring.
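The filter-bank idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's specialised filter bank: it swaps in the simplest common approximation, which blurs the signal at a few fixed widths and then linearly interpolates, sample by sample, between the two bank outputs nearest to the width requested by the blur map. The function names, the 1-D signal, the 3-sigma kernel radius and the edge clamping are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel; sigma <= 0 gives the identity."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))  # assumed 3-sigma support
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_uniform(signal, sigma):
    """Blur a 1-D signal with one Gaussian width, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def blur_space_variant(signal, blur_map, bank_sigmas):
    """Approximate a per-sample blur width by interpolating filter-bank outputs.

    bank_sigmas must be strictly increasing (hypothetical bank, e.g. [0, 1, 2, 4]).
    """
    bank = [blur_uniform(signal, s) for s in bank_sigmas]
    out = []
    for i, target in enumerate(blur_map):
        # Clamp the requested width to the range covered by the bank.
        target = min(max(target, bank_sigmas[0]), bank_sigmas[-1])
        hi = next(idx for idx, s in enumerate(bank_sigmas) if s >= target)
        lo = max(hi - 1, 0)
        if hi == lo:
            out.append(bank[lo][i])
            continue
        t = (target - bank_sigmas[lo]) / (bank_sigmas[hi] - bank_sigmas[lo])
        out.append((1 - t) * bank[lo][i] + t * bank[hi][i])
    return out
```

The cost is one uniform blur per bank level plus a per-sample blend, rather than a separately designed low-pass filter at every location. Blending two Gaussians is only an approximation of a Gaussian of intermediate width, so accuracy in this sketch depends on how finely the bank is spaced.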
From Robot Arm to Intentional Agent: the Articulated Head
Robot arms have come a long way from the humble beginnings of the first Unimate robot, installed at a General Motors plant to unload parts from a die-casting machine, to the flexible and versatile tools that are now ubiquitous and indispensable in many fields of industrial production. The other chapters of this book attest to the progress in the field and the plenitude of applications of robot arms. It is still fair, however, to say that industrial robot arms are currently applied primarily in continuously repeated manufacturing tasks for which they are pre-programmed. They are known for their precision and reliability, but in general they use only limited sensory input, and the changes in the execution of their task due to varying environmental factors are minimal. If one were to compare a robot arm with an animal, even a very simple one, this property of robot arm applications would immediately stand out as one of the most striking differences. Living organisms must sense changes in the environment that are crucial to their survival and must have some flexibility to adjust their behaviour. In most robot arm contexts, such a comparison is currently at best of academic interest, though it might gain relevance very quickly in the future if robot arms are to be used to assist humans to a larger extent than at present. If robot arms are to work in close proximity with humans and directly support them in accomplishing a task, it becomes inevitable for the control system of the robot to have far-reaching situational awareness and the capability to adjust its ‘behaviour’ according to the acquired situational information. In addition, robot perception and action have to conform to a large degree to the expectations of the human co-worker.
Foveated Vision Models for Search and Recognition
Computer vision has made significant progress in recent years thanks to advances in neural network architectures and computing power. At the sensory level, current machine vision systems sample the visual data uniformly to make predictions about the scene. This is in contrast with the human vision system, which has high visual acuity only in a small central region, the fovea, and much coarser sampling away from the center. There has been renewed interest, particularly in the context of active vision for robotic navigation and scene exploration, in developing biologically motivated methods that can leverage such foveated computations. While foveated vision offers computational savings at or near the region of interest, it requires eye movements to scan the scene for effective image understanding. The hypothesis is that methods that can leverage non-uniform sampling of the field of view together with eye movements will lead to a new class of active vision systems that are optimized computationally for specific tasks of interest.
Inspired by the above observations, this research provides, for the first time, a comprehensive study of human visual search in the constrained setting of person identification in the wild. A novel video database is created that systematically tests how different parts of a person contribute towards eye movements and person identification. Our study shows that search errors can dominate the overall recognition accuracy in human subject experiments. This calls for new strategies for integrating eye tracking with foveated image representations. Towards this, two specific approaches are investigated further.
In the first approach, a deep neural network based method is developed to model eye movements, using a long short-term memory (LSTM) network to model the successive fixations. The proposed method outperforms the state of the art while simplifying the feature extraction procedure.
The second approach focuses on a foveated image model that leverages multiple fixations. A convolutional neural network method is proposed that works directly with the foveated input images and achieves competitive recognition rates compared to standard neural networks operating on the same number of input pixels. Overall, the thesis investigates the requirements and implementations that could support active foveated vision, and lays the groundwork for future studies in this area.
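The contrast between uniform and foveated sampling described in this abstract can be made concrete with a small sketch. It is not the thesis's image model: it assumes a simple layout in which every pixel inside a circular fovea is kept, while the periphery is sampled on rings whose radius grows geometrically with a fixed number of samples per ring. The function name and all parameter values are hypothetical.

```python
import math

def foveated_sample_points(width, height, fixation,
                           fovea_radius=4, growth=1.3, per_ring=16):
    """Return sorted (x, y) sample positions: dense at the fovea, sparse outside."""
    if fovea_radius < 1 or growth <= 1:
        raise ValueError("fovea_radius must be >= 1 and growth must be > 1")
    fx, fy = fixation
    points = set()
    # Fovea: keep every pixel within fovea_radius of the fixation point.
    for y in range(max(0, fy - fovea_radius), min(height, fy + fovea_radius + 1)):
        for x in range(max(0, fx - fovea_radius), min(width, fx + fovea_radius + 1)):
            if (x - fx) ** 2 + (y - fy) ** 2 <= fovea_radius ** 2:
                points.add((x, y))
    # Periphery: ring radius grows geometrically, so sample spacing
    # increases roughly in proportion to eccentricity.
    max_radius = math.hypot(max(fx, width - 1 - fx), max(fy, height - 1 - fy))
    radius = fovea_radius * growth
    while radius <= max_radius:
        for k in range(per_ring):
            angle = 2 * math.pi * k / per_ring
            x = int(round(fx + radius * math.cos(angle)))
            y = int(round(fy + radius * math.sin(angle)))
            if 0 <= x < width and 0 <= y < height:
                points.add((x, y))
        radius *= growth
    return sorted(points)
```

Calling `foveated_sample_points(64, 64, (32, 32))` yields far fewer positions than the 4096 pixels of the full grid. Moving the `fixation` argument plays the role of an eye movement that brings a different region under dense sampling.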
Conference on Intelligent Robotics in Field, Factory, Service, and Space (CIRFFSS 1994), volume 1
The AIAA/NASA Conference on Intelligent Robotics in Field, Factory, Service, and Space (CIRFFSS '94) was originally proposed because of the strong belief that America's problems of global economic competitiveness and job creation and preservation can partly be solved by the use of intelligent robotics, which are also required for human space exploration missions. Individual sessions addressed nuclear industry, agile manufacturing, security/building monitoring, on-orbit applications, vision and sensing technologies, situated control and low-level control, robotic systems architecture, environmental restoration and waste management, robotic remanufacturing, and healthcare applications