
    MANIPULATION ACTION UNDERSTANDING FOR OBSERVATION AND EXECUTION

    Modern intelligent agents will need to learn the actions that humans perform: they will need to recognize these actions when they see them, and they will need to perform these actions themselves. We propose a cognitive system that interprets human manipulation actions from perceptual information (image and depth data) and consists of perceptual modules and reasoning modules that interact with each other. The contributions of this work address two core problems at the heart of action understanding: (a) the grounding of relevant information about actions in perception (the perception-action integration problem), and (b) the organization of perceptual and high-level symbolic information for interpreting the actions (the sequencing problem). At the high level, actions are represented with the Manipulation Action Context-Free Grammar (MACFG), a syntactic grammar with associated parsing algorithms, which organizes actions as a sequence of sub-events. Each sub-event is described by the hand (including grasp type), the movements (actions), and the objects and tools involved; the relevant information about these quantities is obtained from biologically inspired perception modules. These modules track the hands and objects, recognize the hand grasp and the actions, segment the action stream, and detect action consequences. Furthermore, a probabilistic semantic parsing framework based on Combinatory Categorial Grammar (CCG) theory is adopted to model the semantic meaning of human manipulation actions. Additionally, the lesson from the findings on mirror neurons is that the two processes of interpreting visually observed actions and generating actions should share the same underlying cognitive process. Recent studies have shown that grammatical structures underlie the representation of manipulation actions and are used both to understand and to execute these actions. By analogy, understanding manipulation actions is like understanding language, while executing them is like generating language. Experiments on two tasks, (1) a robot observing people performing manipulation actions, and (2) a robot then executing manipulation actions accordingly, are presented to validate the formalism. The technical parts of this thesis are devoted to the experimental setting of task (1), while task (2) is given as a live demonstration.
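
    To make the flavor of such a grammar concrete, the sketch below gives a toy context-free grammar in the spirit of the MACFG: an action derives a sequence of sub-events, and each sub-event pairs a hand (with grasp type) with a movement and an object. The rule set, terminal names, and the naive recursive-descent check are illustrative assumptions, not the thesis's actual grammar or parser.

        # Toy grammar in the spirit of the MACFG (illustrative, not the thesis's).
        # An ACTION is a sequence of SUBEVENTs; each SUBEVENT pairs a hand/grasp
        # with a movement and an object. The longer ACTION production is tried
        # first so that multi-sub-event sequences parse.
        GRAMMAR = {
            "ACTION":   [["SUBEVENT", "ACTION"], ["SUBEVENT"]],
            "SUBEVENT": [["HAND", "MOVEMENT", "OBJECT"]],
            "HAND":     [["left_hand:power_grasp"], ["right_hand:precision_grasp"]],
            "MOVEMENT": [["reach"], ["grasp"], ["cut"], ["release"]],
            "OBJECT":   [["knife"], ["cucumber"], ["bowl"]],
        }

        def derive(symbol, tokens, pos=0):
            """Return the position reached after parsing `symbol`, or None on failure."""
            if symbol not in GRAMMAR:  # terminal symbol: must match the next token
                return pos + 1 if pos < len(tokens) and tokens[pos] == symbol else None
            for production in GRAMMAR[symbol]:  # try each production in turn
                p = pos
                for part in production:
                    p = derive(part, tokens, p)
                    if p is None:
                        break
                else:
                    return p
            return None

        # One sub-event: the right hand, in a precision grasp, cuts the cucumber.
        tokens = ["right_hand:precision_grasp", "cut", "cucumber"]
        print(derive("ACTION", tokens) == len(tokens))  # True -> the sequence parses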

    Learning the Semantics of Manipulation Action

    In this paper we present a formal computational framework for modeling manipulation actions. The introduced formalism leads to a semantics of manipulation action and has applications both to observing and understanding human manipulation actions and to executing them with a robotic mechanism (e.g. a humanoid robot). It is based on a Combinatory Categorial Grammar. The goals of the introduced framework are to: (1) represent manipulation actions with both syntactic and semantic parts, where the semantic part employs λ-calculus; (2) enable a probabilistic semantic parsing schema to learn the λ-calculus representation of manipulation actions from an annotated action corpus of videos; (3) use (1) and (2) to develop a system that visually observes manipulation actions and understands their meaning, while reasoning beyond observations using propositional logic and axiom schemata. The experiments conducted on a publicly available large manipulation action dataset validate the theoretical framework and our implementation.
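
    As a hedged illustration of point (1), the sketch below encodes CCG-style lexical entries as λ-terms using plain Python lambdas, plus a toy axiom schema for reasoning beyond observation. The predicate names and the axiom are invented for exposition; they are not the paper's actual lexicon.

        # A CCG-style lexical entry pairs a syntactic category with a lambda term,
        # e.g. "cut" := (S\NP)/NP : λy.λx. cut(x, y). Curried lambdas mirror that.
        cut = lambda patient: lambda agent: ("cut", agent, patient)
        grasp = lambda patient: lambda agent: ("grasp", agent, patient)  # second entry

        # Function application mirrors CCG combination: the verb consumes its
        # object first, then its subject, yielding a grounded logical form.
        logical_form = cut("cucumber")("right_hand")
        print(logical_form)  # ('cut', 'right_hand', 'cucumber')

        # Toy axiom schema (assumed, for illustration): cut(x, y) -> divided(y).
        def apply_axioms(fact):
            predicate, agent, patient = fact
            return [("divided", patient)] if predicate == "cut" else []

        print(apply_axioms(logical_form))  # [('divided', 'cucumber')]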

    Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes

    We present the Semantic Robot Programming (SRP) paradigm as a convergence of robot programming by demonstration and semantic mapping. In SRP, a user can directly program a robot manipulator by demonstrating a snapshot of their intended goal scene in the workspace. The robot then parses this goal as a scene graph comprised of object poses and inter-object relations, assuming known object geometries. Task and motion planning is then used to realize the user's goal from an arbitrary initial scene configuration. Even when faced with different initial scene configurations, SRP enables the robot to seamlessly adapt to reach the user's demonstrated goal. For scene perception, we propose the Discriminatively-Informed Generative Estimation of Scenes and Transforms (DIGEST) method to infer the initial and goal states of the world from RGBD images. The efficacy of SRP with DIGEST perception is demonstrated for the task of tray-setting with a Michigan Progress Fetch robot. Scene perception and task execution are evaluated with a public household occlusion dataset and our cluttered scene dataset. (Comment: published in ICRA 2018)
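
    The sketch below illustrates, under stated assumptions, what such a scene-graph goal representation might look like: object poses plus inter-object relations, with planning reduced to achieving whichever goal relations the current scene lacks. The class, the relation names, and the unsatisfied helper are hypothetical, not the authors' SRP/DIGEST API.

        from dataclasses import dataclass, field

        @dataclass
        class SceneGraph:
            poses: dict = field(default_factory=dict)    # object -> (x, y, z, yaw)
            relations: set = field(default_factory=set)  # (relation, obj_a, obj_b)

        # Goal scene demonstrated by the user: plate on a tray, fork left of plate.
        goal = SceneGraph(
            poses={"tray": (0.50, 0.00, 0.80, 0.0), "plate": (0.50, 0.00, 0.82, 0.0)},
            relations={("on", "plate", "tray"), ("left_of", "fork", "plate")},
        )

        def unsatisfied(current, goal):
            """Goal relations the planner still has to achieve from the current scene."""
            return goal.relations - current.relations

        initial = SceneGraph(relations={("on", "plate", "tray")})
        print(unsatisfied(initial, goal))  # {('left_of', 'fork', 'plate')}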

    Neuronal bases of structural coherence in contemporary dance observation

    The neuronal processes underlying dance observation have been the focus of an increasing number of brain imaging studies over the past decade. However, the existing literature has mainly dealt with effects of motor and visual expertise, whereas the neural and cognitive mechanisms that underlie the interpretation of dance choreographies remain unexplored. Hence, much attention has been given to the Action Observation Network (AON), whereas the role of other potentially relevant neuro-cognitive mechanisms, such as mentalizing (theory of mind) or language (narrative comprehension), in dance understanding is yet to be elucidated. We report the results of an fMRI study in which the structural coherence of short contemporary dance choreographies was manipulated parametrically using the same taped movement material. Our participants were all trained dancers. The whole-brain analysis indicates that the interpretation of structurally coherent dance phrases involves a subpart of the AON (superior parietal cortex) as well as mentalizing regions in the dorsomedial prefrontal cortex. An ROI analysis based on a similar study using linguistic materials (Pallier et al. 2011) suggests that structural processing in language and dance might share certain neural mechanisms.

    Data-Driven Grasp Synthesis - A Survey

    We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar, or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of similarity matching to a set of previously encountered objects. Finally, for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations. (Comment: 20 pages, 30 figures, submitted to IEEE Transactions on Robotics)
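
    The sample-and-rank pattern shared by many of the surveyed methods can be sketched as follows; the grasp parameterization and the hand-written score function below are placeholders for what real systems learn from data or evaluate in simulation.

        import random

        def sample_candidate_grasps(n=100):
            """Sample grasp hypotheses as (approach angle, wrist roll, gripper opening)."""
            return [(random.uniform(0.0, 3.14),    # approach angle (rad)
                     random.uniform(0.0, 3.14),    # wrist roll (rad)
                     random.uniform(0.02, 0.10))   # gripper opening (m)
                    for _ in range(n)]

        def score(grasp):
            """Placeholder ranking: prefer near-top-down grasps with mid-range opening."""
            approach, _, width = grasp
            return -abs(approach - 1.57) - abs(width - 0.06)

        best = max(sample_candidate_grasps(), key=score)
        print("best candidate:", best)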