Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes
We present the Semantic Robot Programming (SRP) paradigm as a convergence of
robot programming by demonstration and semantic mapping. In SRP, a user can
directly program a robot manipulator by demonstrating a snapshot of their
intended goal scene in workspace. The robot then parses this goal as a scene
graph comprised of object poses and inter-object relations, assuming known
object geometries. Task and motion planning is then used to realize the user's
goal from an arbitrary initial scene configuration. Even when faced with
different initial scene configurations, SRP enables the robot to seamlessly
adapt to reach the user's demonstrated goal. For scene perception, we propose
the Discriminatively-Informed Generative Estimation of Scenes and Transforms
(DIGEST) method to infer the initial and goal states of the world from RGBD
images. The efficacy of SRP with DIGEST perception is demonstrated for the task
of tray-setting with a Michigan Progress Fetch robot. Scene perception and task
execution are evaluated with a public household occlusion dataset and our
cluttered scene dataset. Comment: published in ICRA 201
Perceiving pictures
I aim to give a new account of picture perception: of the way our visual system functions when we see something in a picture. My argument relies on the functional distinction between the ventral and dorsal visual subsystems. I propose that it is constitutive of picture perception that our ventral subsystem attributes properties to the depicted scene, whereas our dorsal subsystem attributes properties to the picture surface. This duality elucidates Richard Wollheim's concept of the "twofoldness" of our experience of pictures: the "visual awareness not only of what is represented but also of the surface qualities of the representation." I argue for the following four claims: (a) the depicted scene is represented by ventral perception, (b) the depicted scene is not represented by dorsal perception, (c) the picture surface is represented by dorsal perception, and (d) the picture surface is not necessarily represented by ventral perception.
Longer fixation duration while viewing face images
The spatio-temporal properties of saccadic eye movements can be influenced by the cognitive demand and the characteristics of the observed scene. Probably due to its crucial role in social communication, it is argued that face perception may involve different cognitive processes compared with non-face object or scene perception. In this study, we investigated whether and how face and natural scene images can influence the patterns of visuomotor activity. We recorded monkeys' saccadic eye movements as they freely viewed monkey face and natural scene images. The face and natural scene images attracted a similar number of fixations, but viewing of faces was accompanied by longer fixations compared with natural scenes. These longer fixations were dependent on the context of facial features. The duration of fixations directed at facial contours decreased when the face images were scrambled, and increased at the later stage of normal face viewing. The results suggest that face and natural scene images can generate different patterns of visuomotor activity. The extra fixation duration on faces may be correlated with the detailed analysis of facial features.
DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving
Today, there are two major paradigms for vision-based autonomous driving
systems: mediated perception approaches that parse an entire scene to make a
driving decision, and behavior reflex approaches that directly map an input
image to a driving action by a regressor. In this paper, we propose a third
paradigm: a direct perception approach to estimate the affordance for driving.
We propose to map an input image to a small number of key perception indicators
that directly relate to the affordance of a road/traffic state for driving. Our
representation provides a set of compact yet complete descriptions of the scene
to enable a simple controller to drive autonomously. Falling in between the two
extremes of mediated perception and behavior reflex, we argue that our direct
perception representation provides the right level of abstraction. To
demonstrate this, we train a deep Convolutional Neural Network using recordings
from 12 hours of human driving in a video game and show that our model can work
well to drive a car in a very diverse set of virtual environments. We also
train a model for car distance estimation on the KITTI dataset. Results show
that our direct perception approach can generalize well to real driving images.
Source code and data are available on our project website
A Theoretical and Experimental Analysis of the Outside World Perception Process
The outside scene is often an important source of information for manual control tasks. Important examples of these are car driving and aircraft control. This paper deals with modelling this visual scene perception process on the basis of linear perspective geometry and the relative motion cues. Model predictions utilizing psychophysical threshold data from base-line experiments and literature of a variety of visual approach tasks are compared with experimental data. Both the performance and workload results illustrate that the model provides a meaningful description of the outside world perception process, with a useful predictive capability
Natural scene statistics mediate the perception of image complexity
Humans are sensitive to complexity and regularity in patterns. The subjective
perception of pattern complexity is correlated to algorithmic
(Kolmogorov-Chaitin) complexity as defined in computer science, but also to the
frequency of naturally occurring patterns. However, the possible mediational
role of natural frequencies in the perception of algorithmic complexity remains
unclear. Here we reanalyze Hsu et al. (2010) through a mediational analysis,
and complement their results in a new experiment. We conclude that human
perception of complexity seems partly shaped by natural scene statistics,
thereby establishing a link between the perception of complexity and the effect
of natural scene statistics