
    Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

    Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g. pushing) and prehensile (e.g. grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free. In this work, we demonstrate that it is possible to discover and learn these synergies from scratch through model-free deep reinforcement learning. Our method involves training two fully convolutional networks that map from visual observations to actions: one infers the utility of pushes for a dense pixel-wise sampling of end effector orientations and locations, while the other does the same for grasping. Both networks are trained jointly in a Q-learning framework and are entirely self-supervised by trial and error, where rewards are provided from successful grasps. In this way, our policy learns pushing motions that enable future grasps, while learning grasps that can leverage past pushes. During picking experiments in both simulation and real-world scenarios, we find that our system quickly learns complex behaviors amid challenging cases of clutter, and achieves better grasping success rates and picking efficiencies than baseline alternatives after only a few hours of training. We further demonstrate that our method is capable of generalizing to novel objects. Qualitative results (videos), code, pre-trained models, and simulation environments are available at http://vpg.cs.princeton.edu.
    Comment: To appear at the International Conference On Intelligent Robots and Systems (IROS) 2018. Project webpage: http://vpg.cs.princeton.edu Summary video: https://youtu.be/-OkyX7Zlhi
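
    The abstract describes two networks that each score a dense, pixel-wise grid of end effector orientations and locations. The following is a minimal sketch of how one action might be chosen from such outputs, assuming a greedy policy that takes the single highest-valued entry across both maps. The abstract does not state the selection rule, so the greedy choice, the tie-break, the names `best_in_map` and `select_action`, and the toy values are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical layout: one Q value per [rotation][row][col].
QMap = List[List[List[float]]]


def best_in_map(q_map: QMap) -> Tuple[float, int, int, int]:
    """Return (q_value, rotation, row, col) of the largest entry in one Q map."""
    best = (float("-inf"), -1, -1, -1)
    for rot, plane in enumerate(q_map):
        for row, values in enumerate(plane):
            for col, q in enumerate(values):
                if q > best[0]:
                    best = (q, rot, row, col)
    return best


def select_action(push_q: QMap, grasp_q: QMap) -> Tuple[str, int, int, int]:
    """Greedy choice (an assumption): primitive, rotation and pixel with the highest Q."""
    push_best = best_in_map(push_q)
    grasp_best = best_in_map(grasp_q)
    # Ties go to grasping; this tie-break is arbitrary for the sketch.
    if push_best[0] > grasp_best[0]:
        return ("push",) + push_best[1:]
    return ("grasp",) + grasp_best[1:]


# Toy maps with 2 rotations and 2x2 pixels, standing in for network outputs.
push_q = [[[0.1, 0.2], [0.0, 0.3]], [[0.4, 0.1], [0.2, 0.0]]]
grasp_q = [[[0.0, 0.1], [0.9, 0.2]], [[0.3, 0.3], [0.1, 0.5]]]
action = select_action(push_q, grasp_q)
print(action)
```

    In this toy case the largest value sits in the grasping map, so the sketch returns a grasp at rotation index 0, row 1, column 0. During training, a Q-learning system would normally mix such greedy choices with exploration, which is left out here.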

    Adaptive intermittent control: A computational model explaining motor intermittency observed in human behavior

    How the brain performs a given motor task in real time with a slow sensorimotor system is a fundamental question. Computational theory has proposed the influential idea of feed-forward control, but it has mainly been applied to ballistic movements (such as reaching), because the motor commands must be calculated before movement execution. As a possible mechanism for operating feed-forward control in continuous motor tasks (such as target tracking), we propose a control model called "adaptive intermittent control" or "segmented control," in which the brain adaptively divides the continuous time axis into discrete segments and executes feed-forward control in each segment. The idea of intermittent control has been proposed in the fields of control theory, biological modeling and nonlinear dynamical systems. Compared with these previous models, the key feature of the proposed model is that the system speculatively determines the segmentation based on its prediction of the future and the uncertainty of that prediction. Computer simulations showed that the proposed model achieved faithful visuo-manual tracking with realistic sensorimotor delays and at lower computational cost (i.e., with fewer segments). Furthermore, it replicated "motor intermittency", that is, the intermittent discontinuities commonly observed in human movement trajectories. We argue that temporally segmented control is an inevitable strategy for a brain that has to achieve a given task at small computational (or cognitive) cost, using a slow control system in an uncertain, variable environment, and that motor intermittency is a side effect of this strategy.

    The very same thing: Extending the object token concept to incorporate causal constraints on individual identity

    The contributions of feature recognition, object categorization, and recollection of episodic memories to the re-identification of a perceived object as the very same thing encountered in a previous perceptual episode are well understood in terms of both cognitive-behavioral phenomenology and neurofunctional implementation. Human beings do not, however, rely solely on features and context to re-identify individuals; in the presence of featural change and similarly-featured distractors, people routinely employ causal constraints to establish object identities. Based on available cognitive and neurofunctional data, the standard object-token based model of individual re-identification is extended to incorporate the construction of unobserved and hence fictive causal histories (FCHs) of observed objects by the pre-motor action planning system. Cognitive-behavioral and implementation-level predictions of this extended model and methods for testing them are outlined. It is suggested that functional deficits in the construction of FCHs are associated with clinical outcomes in both Autism Spectrum Disorders and later-stage Alzheimer's disease.