    Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes

    We present the Semantic Robot Programming (SRP) paradigm as a convergence of robot programming by demonstration and semantic mapping. In SRP, a user can directly program a robot manipulator by demonstrating a snapshot of their intended goal scene in the workspace. The robot then parses this goal as a scene graph composed of object poses and inter-object relations, assuming known object geometries. Task and motion planning is then used to realize the user's goal from an arbitrary initial scene configuration. Even when faced with different initial scene configurations, SRP enables the robot to seamlessly adapt to reach the user's demonstrated goal. For scene perception, we propose the Discriminatively-Informed Generative Estimation of Scenes and Transforms (DIGEST) method to infer the initial and goal states of the world from RGBD images. The efficacy of SRP with DIGEST perception is demonstrated for the task of tray-setting with a Michigan Progress Fetch robot. Scene perception and task execution are evaluated with a public household occlusion dataset and our cluttered scene dataset. Comment: published in ICRA 201

    Perceiving pictures

    I aim to give a new account of picture perception: of the way our visual system functions when we see something in a picture. My argument relies on the functional distinction between the ventral and dorsal visual subsystems. I propose that it is constitutive of picture perception that our ventral subsystem attributes properties to the depicted scene, whereas our dorsal subsystem attributes properties to the picture surface. This duality elucidates Richard Wollheim’s concept of the “twofoldness” of our experience of pictures: the “visual awareness not only of what is represented but also of the surface qualities of the representation.” I argue for the following four claims: (a) the depicted scene is represented by ventral perception, (b) the depicted scene is not represented by dorsal perception, (c) the picture surface is represented by dorsal perception, and (d) the picture surface is not necessarily represented by ventral perception.

    Longer fixation duration while viewing face images

    The spatio-temporal properties of saccadic eye movements can be influenced by the cognitive demand and the characteristics of the observed scene. Probably due to its crucial role in social communication, it is argued that face perception may involve different cognitive processes compared with non-face object or scene perception. In this study, we investigated whether and how face and natural scene images can influence the patterns of visuomotor activity. We recorded monkeys’ saccadic eye movements as they freely viewed monkey face and natural scene images. The face and natural scene images attracted a similar number of fixations, but viewing of faces was accompanied by longer fixations compared with natural scenes. These longer fixations were dependent on the context of facial features. The duration of fixations directed at facial contours decreased when the face images were scrambled, and increased at the later stage of normal face viewing. The results suggest that face and natural scene images can generate different patterns of visuomotor activity. The extra fixation duration on faces may be correlated with the detailed analysis of facial features.

    DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving

    Today, there are two major paradigms for vision-based autonomous driving systems: mediated perception approaches that parse an entire scene to make a driving decision, and behavior reflex approaches that directly map an input image to a driving action by a regressor. In this paper, we propose a third paradigm: a direct perception approach to estimate the affordance for driving. We propose to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving. Our representation provides a set of compact yet complete descriptions of the scene to enable a simple controller to drive autonomously. Falling in between the two extremes of mediated perception and behavior reflex, we argue that our direct perception representation provides the right level of abstraction. To demonstrate this, we train a deep Convolutional Neural Network using recordings from 12 hours of human driving in a video game and show that our model can work well to drive a car in a very diverse set of virtual environments. We also train a model for car distance estimation on the KITTI dataset. Results show that our direct perception approach can generalize well to real driving images. Source code and data are available on our project website.

    A Theoretical and Experimental Analysis of the Outside World Perception Process

    The outside scene is often an important source of information for manual control tasks. Important examples of these are car driving and aircraft control. This paper deals with modelling this visual scene perception process on the basis of linear perspective geometry and relative motion cues. Model predictions, which use psychophysical threshold data from baseline experiments and from the literature, are compared with experimental data for a variety of visual approach tasks. Both the performance and workload results illustrate that the model provides a meaningful description of the outside world perception process, with a useful predictive capability.

    Natural scene statistics mediate the perception of image complexity

    Humans are sensitive to complexity and regularity in patterns. The subjective perception of pattern complexity is correlated with algorithmic (Kolmogorov-Chaitin) complexity as defined in computer science, but also with the frequency of naturally occurring patterns. However, the possible mediational role of natural frequencies in the perception of algorithmic complexity remains unclear. Here we reanalyze Hsu et al. (2010) through a mediational analysis, and complement their results in a new experiment. We conclude that human perception of complexity seems partly shaped by natural scene statistics, thereby establishing a link between the perception of complexity and the effect of natural scene statistics.