
    Towards Active Event Recognition

    Directing robot attention to recognise activities and to anticipate events such as goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, intrinsic time constraints, the spatially distributed nature of the relevant information sources, and a multitude of unobservable states affecting the system, such as latent intentions, have long made such skills elusive. The problem tests the limits of current attention-control systems: it requires an integrated solution for tracking, exploration and recognition, which have traditionally been treated as separate problems in active vision. We propose a probabilistic generative framework based on a mixture of Kalman filters and information-gain maximisation that uses predictions both for recognition and for attention control. The framework can efficiently use observations of one element in a dynamic environment to provide information about other elements, and consequently enables guided exploration. Notably, the sensor-control policy, derived directly from first principles, captures the intuitive trade-off between seeking the most discriminative cues and maintaining overall awareness. Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvements in recognition time and precision over baseline systems.
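The information-gain attention policy described in this abstract can be illustrated with a toy sketch: each tracked element gets its own Kalman filter, and attention goes to the element whose next observation would most reduce entropy. Everything below (the 1-D constant-position filters, the noise values, the `attend` helper) is a hypothetical reconstruction for illustration, not the authors' implementation.

```python
import math

class Kalman1D:
    """Minimal 1-D constant-position Kalman filter (illustrative only)."""
    def __init__(self, x, p, q, r):
        # state estimate, state variance, process noise, measurement noise
        self.x, self.p, self.q, self.r = x, p, q, r

    def predict(self):
        self.p += self.q  # uncertainty grows while the target goes unobserved

    def update(self, z):
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)

    def expected_info_gain(self):
        # Entropy reduction (in nats) from one observation of this target:
        # 0.5 * log(prior_var / posterior_var) = 0.5 * log((p + r) / r)
        return 0.5 * math.log((self.p + self.r) / self.r)

def attend(filters):
    """Greedy policy: look at the target whose observation is most informative."""
    return max(range(len(filters)), key=lambda i: filters[i].expected_info_gain())

# Two targets; the second has drifted unobserved, so its variance is larger.
targets = [Kalman1D(x=0.0, p=0.5, q=0.01, r=0.1),
           Kalman1D(x=3.0, p=4.0, q=0.01, r=0.1)]
for f in targets:
    f.predict()
print(attend(targets))  # -> 1: the policy fixates the more uncertain target
```

The same gain formula also explains the trade-off noted in the abstract: once a target has been observed, its variance (and hence its information gain) drops, so attention naturally rotates back to neglected targets, maintaining overall awareness.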

    Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

    It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that reduce its uncertainty about the unobserved portions of its environment. Based on this principle, we develop a recurrent neural network-based approach to perform active completion of panoramic natural scenes and 3D object shapes. Crucially, the learned policies are not tied to any recognition task nor to the particular semantic content seen during training. As a result, 1) the learned "look around" behavior is relevant even for new tasks in unseen environments, and 2) training data acquisition involves no manual labeling. Through tests in diverse settings, we demonstrate that our approach learns useful generic policies that transfer to new unseen tasks and environments. Completion episodes are shown at https://goo.gl/BgWX3W.
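The reward principle in this abstract, crediting the agent for reducing uncertainty about unobserved portions of the scene, can be sketched in miniature. Below is a hypothetical toy version: uncertainty is a per-cell score over a small panorama, and the reward for a glimpse is the total uncertainty it removes. The paper itself scores pixel-wise completion of scenes with a learned model; the visibility-count proxy here is an assumption for illustration.

```python
def reward(belief, view):
    """Reward = drop in total uncertainty after adding a view.

    belief: dict mapping panorama cell -> uncertainty in [0, 1]
    view:   iterable of cells the new glimpse reveals
    """
    before = sum(belief.values())
    for cell in view:
        belief[cell] = 0.0  # an observed cell is no longer uncertain
    return before - sum(belief.values())

# A 4-cell panorama, fully unknown at the start.
belief = {c: 1.0 for c in range(4)}
r1 = reward(belief, view=[0, 1])  # first glimpse reveals two new cells
r2 = reward(belief, view=[1, 2])  # overlapping glimpse reveals only one
print(r1, r2)  # -> 2.0 1.0
```

Note how the overlapping second glimpse earns less reward: under this objective, an efficient policy learns to spread its glimpses so that each one covers territory the previous views did not.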

    Combined object recognition approaches for mobile robotics

    There are numerous solutions to simple object recognition problems when the machine operates under strict environmental conditions (such as controlled lighting). Object recognition in real-world environments, however, poses greater difficulty. Ideally, mobile robots will function in real-world environments without the aid of fiduciary identifiers, so more robust methods are needed to perform object recognition reliably. A combined approach using multiple techniques improves recognition results. Active vision and peripheral-foveal vision, systems designed to improve the information gathered for object recognition, are examined. In addition, five object recognition methods that either make use of some form of active vision or could leverage active vision and/or peripheral-foveal vision systems are investigated: affine-invariant image patches, perceptual organization, 3D morphable models (3DMMs), active viewpoint, and adaptive color segmentation. The current state of the art in these areas of vision research and observations on areas of future research are presented. Examples of state-of-the-art methods employed in other vision applications that have not yet been used for object recognition are also mentioned. Lastly, the future direction of the research field is hypothesized.

    An Innovative SIFT-Based Method for Rigid Video Object Recognition

    This paper presents an innovative SIFT-based method for rigid video object recognition (hereafter called RVO-SIFT). As in the human visual system, the method unifies object recognition and feature updating into a single process, using both trajectory and feature matching. It can therefore learn new features not only in the training stage but also during recognition, which greatly improves the completeness of the video object's feature set and, in turn, substantially increases the correct-recognition rate. Experimental results on real video sequences demonstrate its robustness and efficiency.
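The recognise-and-update loop described above can be sketched abstractly: match a frame's descriptors against the object model, and if enough of them match, absorb the unmatched descriptors as newly learned features. Descriptors are plain coordinate tuples and the matching is brute-force nearest-neighbour; real SIFT descriptors, the trajectory matching, and the thresholds (`thresh`, `accept_ratio`) are all stand-in assumptions, not the RVO-SIFT algorithm itself.

```python
def match(model, frame_feats, thresh=0.5):
    """Greedy nearest-neighbour matching of frame descriptors to the model."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    matched, unmatched = [], []
    for f in frame_feats:
        best = min(model, key=lambda m: dist(m, f), default=None)
        if best is not None and dist(best, f) <= thresh:
            matched.append(f)
        else:
            unmatched.append(f)
    return matched, unmatched

def recognise_and_update(model, frame_feats, accept_ratio=0.5):
    """If enough features match, declare recognition and absorb the rest."""
    matched, unmatched = match(model, frame_feats)
    recognised = len(matched) >= accept_ratio * len(frame_feats)
    if recognised:
        model.extend(unmatched)  # learn new features during recognition
    return recognised

model = [(0.0, 0.0), (1.0, 1.0)]
frame = [(0.1, 0.0), (1.0, 0.9), (5.0, 5.0)]  # two known features + one new view
ok = recognise_and_update(model, frame)
print(ok, len(model))  # -> True 3: recognised, and the model grew by one feature
```

The key idea the abstract emphasises is visible in the last line: because the frame was confidently recognised, the previously unseen descriptor is added to the model, so later frames showing that side of the object will also match.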

    Peripheral-foveal vision for real-time object recognition and tracking in video

    Human object recognition in a physical 3-D environment is still far superior to that of any robotic vision system. We believe that one reason (out of many) for this, one that has not heretofore been significantly exploited in the artificial-vision literature, is that humans use a fovea to fixate on or near an object, obtaining a very high-resolution image of the object and rendering it easy to recognize. In this paper, we present a novel method for identifying and tracking objects in multi-resolution digital video of partially cluttered environments. Our method is motivated by biological vision systems and uses a learned "attentive" interest map on a low-resolution data stream to direct a high-resolution "fovea". Objects recognized in the fovea can then be tracked using peripheral vision. Because object recognition runs only on a small foveal image, our system achieves real-time object recognition and tracking performance well beyond that of simpler systems.
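The peripheral-to-foveal handoff in this abstract can be shown with a minimal sketch: score a low-resolution stream, fixate the most salient cell, and crop the corresponding high-resolution patch for recognition. The salience measure here is just raw intensity, a hypothetical stand-in for the learned attentive interest map, and the images are plain nested lists.

```python
def foveate(low_res, high_res, fovea=2):
    """Pick the most 'interesting' low-res cell and crop the matching
    high-res patch. Salience = raw intensity (a stand-in for the paper's
    learned interest map)."""
    h, w = len(low_res), len(low_res[0])
    # argmax over the low-resolution interest map
    r, c = max(((r, c) for r in range(h) for c in range(w)),
               key=lambda rc: low_res[rc[0]][rc[1]])
    scale = len(high_res) // h          # low-res to high-res coordinate scale
    r0, c0 = r * scale, c * scale
    return [row[c0:c0 + fovea] for row in high_res[r0:r0 + fovea]]

low = [[0, 1],
       [9, 2]]                           # peripheral stream: 2x2 interest map
high = [[i * 4 + j for j in range(4)]    # 4x4 full-resolution frame
        for i in range(4)]
patch = foveate(low, high)
print(patch)  # -> [[8, 9], [12, 13]]: crop around the most salient cell
```

This is also where the real-time claim comes from: the expensive recognizer only ever sees the small foveal crop, while the cheap interest map runs over the whole (downsampled) frame.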