    Temporal Segmentation of Pair-Wise Interaction Phases in Sequential Manipulation Demonstrations

    We consider the problem of learning from complex sequential demonstrations. We propose to analyze demonstrations in terms of the concurrent interaction phases which arise between pairs of involved bodies (hand-object and object-object). These interaction phases are the key to decomposing a full demonstration into its atomic manipulation actions and to extracting their respective consequences. In particular, one may assume that the goal of each interaction phase is to achieve specific geometric constraints between objects. This generalizes previous Learning from Demonstration approaches by considering not just the motion of the end-effector but also the relational properties of the objects' motion. We present a linear-chain Conditional Random Field model to detect the pair-wise interaction phases and extract the geometric constraints that are established in the environment, which represent a high-level, task-oriented description of the demonstrated manipulation. We test our system on single- and multi-agent demonstrations of assembly tasks, respectively of a wooden toolbox and a plastic chair.
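    As a rough illustration of the kind of model described (not the authors' implementation), the sketch below tags each frame of a demonstration with a pair-wise interaction-phase label using a linear-chain CRF from the sklearn-crfsuite library; the feature names and phase labels are invented for the example.

```python
# Minimal sketch: linear-chain CRF tagging each frame of a demonstration with a
# pair-wise interaction phase. Features and labels are illustrative placeholders.
import sklearn_crfsuite

def frame_features(frame):
    """Turn per-frame geometric measurements (a dict) into CRF features."""
    return {
        "hand_obj_dist": frame["hand_obj_dist"],   # distance hand <-> object
        "obj_obj_dist": frame["obj_obj_dist"],     # distance object <-> object
        "rel_velocity": frame["rel_velocity"],     # relative speed of the pair
        "contact": frame["hand_obj_dist"] < 0.02,  # crude contact indicator
    }

def demo_to_sequence(demo):
    """A demonstration is a list of per-frame measurement dicts."""
    return [frame_features(f) for f in demo]

def train_phase_crf(demos, labels):
    """labels: one list of per-frame phase names per demo, e.g. ["free", "grasp", ...]."""
    crf = sklearn_crfsuite.CRF(
        algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100,
        all_possible_transitions=True,
    )
    crf.fit([demo_to_sequence(d) for d in demos], labels)
    return crf

# phases = train_phase_crf(demos, labels).predict([demo_to_sequence(new_demo)])[0]
```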

    Online Robot Introspection via Wrench-based Action Grammars

    Robotic failure is all too common in unstructured robot tasks. Despite well-designed controllers, robots often fail due to unexpected events. How do robots measure unexpected events? Many do not. Most robots are driven by the sense-plan-act paradigm; more recently, however, robots have begun to adopt a sense-plan-act-verify paradigm. In this work, we present a principled methodology to bootstrap online robot introspection for contact tasks. In effect, we are trying to enable the robot to answer the questions: what did I do? Is my behavior as expected or not? To this end, we analyze noisy wrench data and postulate that it inherently contains patterns that can be effectively represented by a vocabulary. The vocabulary is generated by segmenting and encoding the data. When the wrench information represents a sequence of sub-tasks, we can think of the vocabulary as forming a sentence (a set of words with grammar rules) for a given sub-task, allowing the latter to be uniquely represented. The grammar, which can also include unexpected events, was classified in offline and online scenarios, as well as in simulated and real robot experiments. Multiclass Support Vector Machines (SVMs) were used offline, while online probabilistic SVMs are used to give temporal confidence to the introspection result. The contribution of our work is a generalizable online semantic scheme that enables a robot to understand its high-level state, whether nominal or abnormal. It is shown to work in offline and online scenarios for a particularly challenging contact task: snap assemblies. We perform the snap assembly in simulated and real one-arm experiments and in a simulated two-arm experiment. This verification mechanism can be used by high-level planners or reasoning systems to enable intelligent failure recovery or to determine the next optimal manipulation skill to be used.
    Comment: arXiv admin note: substantial text overlap with arXiv:1609.0494
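    To make the general pipeline concrete, here is a minimal sketch under assumed conventions: a wrench (force/torque) stream is encoded as a "sentence" of primitive symbols and the resulting sub-task state is classified with a probabilistic multiclass SVM (scikit-learn's SVC with probability=True). The segmentation rule and symbol set are placeholders, not the encoding used in the paper.

```python
# Sketch: encode a (T, 6) wrench signal as coarse symbols, then classify the
# sub-task state ("nominal", "jammed", ...) with a probabilistic multiclass SVM.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def wrench_to_sentence(wrench, window=50):
    """Map fixed windows of the wrench signal to coarse trend symbols."""
    words = []
    for start in range(0, len(wrench) - window, window):
        seg = wrench[start:start + window]
        slope = (seg[-1] - seg[0]).mean()   # average change over the window
        if abs(slope) < 0.05:
            words.append("steady")
        elif slope > 0:
            words.append("rising")
        else:
            words.append("falling")
    return " ".join(words)

def train_introspection_svm(sentences, labels):
    """sentences: encoded sub-task executions; labels: high-level state names."""
    clf = make_pipeline(CountVectorizer(), SVC(probability=True))
    clf.fit(sentences, labels)
    return clf

# clf.predict_proba([wrench_to_sentence(live_wrench)]) yields per-class confidence
# that a high-level planner could use to trigger failure recovery.
```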

    Robots Learning Manipulation Tasks from Demonstrations and Practice

    Developing personalized cognitive robots that help with everyday tasks is one of the ongoing topics in robotics research. Such robots should have the capability to learn skills and perform tasks in new situations. In this thesis, we study three research problems that explore learning methods for robots in the setting of manipulation tasks. In the first problem, we investigate hand movement learning from human demonstrations. For practical purposes, we propose a system for learning hand actions from markerless demonstrations, which are captured using the Kinect sensor. The algorithm autonomously segments an example trajectory into multiple action units, each described by a movement primitive, and forms a task-specific model. With that, similar movements for different scenarios can be generated and performed on Baxter robots. The second problem addresses learning robot movement adaptation under various environmental constraints. A common approach is to adopt motion primitives to generate target motions from demonstrations. However, their generalization capability is weak for novel environments. Additionally, traditional motion generation methods do not consider the varied constraints imposed by different users, tasks, and environments. In this work, we propose a co-active learning framework for learning to adapt the movement of robot end-effectors for manipulation tasks. It is designed to adapt the original imitation trajectories, which are learned from demonstrations, to novel situations with different constraints. The framework also considers user feedback on the adapted trajectories, and it learns to adapt movement through human-in-the-loop interactions. Experiments on a humanoid platform validate the effectiveness of our approach. To further adapt robots to more complex manipulation tasks, the third problem investigates a framework in which the robot can not only plan and execute a sequential task in a new environment, but also refine its actions by learning subgoals through re-planning and re-execution during practice. A sequential task is naturally considered as a sequence of pre-learned action primitives, where each action primitive has its own goal parameters corresponding to a subgoal. We propose a system that learns the subgoal distribution of a given task model using reinforcement learning, iteratively updating the parameters over trials. As a result, by considering the learned subgoal distribution in sequential motion planning, the proposed framework can adaptively select better subgoals and generate movements for the robot to execute the task successfully. We implement the framework for the task of "opening a microwave", involving a sequence of primitive actions and subgoals, and validate it on the Baxter platform.
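    The third problem can be pictured with a small sketch: each primitive's subgoal is modeled as a Gaussian over goal parameters and nudged toward samples that earned high reward across practice trials. The Gaussian parameterization, reward-weighted update, and function names below are assumptions made for illustration, not the thesis' exact formulation.

```python
# Sketch: iteratively refine per-primitive subgoal distributions from practice trials.
import numpy as np

class SubgoalDistribution:
    def __init__(self, mean, std):
        self.mean = np.asarray(mean, dtype=float)
        self.std = np.asarray(std, dtype=float)

    def sample(self, rng):
        return rng.normal(self.mean, self.std)

    def update(self, samples, rewards, lr=0.5):
        """Shift the mean toward subgoal samples that earned high reward."""
        w = np.exp(rewards - np.max(rewards))          # softmax-style weights
        w /= w.sum()
        self.mean = (1 - lr) * self.mean + lr * (w[:, None] * samples).sum(axis=0)

def practice(subgoals, execute_task, n_iters=10, batch=8, seed=0):
    """subgoals: one SubgoalDistribution per action primitive.
    execute_task(candidate): runs the primitive sequence with the sampled subgoal
    parameters (on the robot or in simulation) and returns a scalar reward."""
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        candidates = [[g.sample(rng) for g in subgoals] for _ in range(batch)]
        rewards = np.array([execute_task(c) for c in candidates])
        for i, g in enumerate(subgoals):
            g.update(np.array([c[i] for c in candidates]), rewards)
    return subgoals
```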

    Robot Programming from Demonstration, Feedback and Transfer

    This paper presents a novel approach to robot instruction for assembly tasks. We consider that robot programming can be made more efficient, precise, and intuitive if we leverage the advantages of complementary approaches such as learning from demonstration, learning from feedback, and knowledge transfer. Starting from low-level demonstrations of assembly tasks, the system is able to extract a high-level relational plan of the task. A graphical user interface (GUI) then allows the user to iteratively correct the acquired knowledge by refining the high-level plans and the low-level geometric knowledge of the task. This combination leads to a programming phase that is faster, more precise than demonstrations alone, and more intuitive than a GUI alone. A final process allows high-level task knowledge to be reused for similar tasks in a transfer-learning fashion. Finally, we present a user study illustrating the advantages of this approach.
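    One way to picture the "high-level relational plan" that a GUI could let the user edit is as a list of action steps, each carrying the geometric relations it should establish. The representation, predicate names, and toolbox steps below are invented for illustration and are not the paper's data structures.

```python
# Sketch of a relational plan as an editable data structure.
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    action: str                                       # e.g. "place", "insert", "screw"
    objects: tuple                                    # bodies involved in the step
    constraints: list = field(default_factory=list)   # geometric relations to achieve

@dataclass
class RelationalPlan:
    steps: list = field(default_factory=list)

    def replace_constraint(self, step_idx, old, new):
        """The kind of low-level correction a GUI could expose to the user."""
        c = self.steps[step_idx].constraints
        c[c.index(old)] = new

plan = RelationalPlan(steps=[
    PlanStep("place", ("side_panel", "base"), [("perpendicular", "side_panel", "base")]),
    PlanStep("insert", ("screw_1", "side_panel"), [("coaxial", "screw_1", "hole_1")]),
])
plan.replace_constraint(0, ("perpendicular", "side_panel", "base"),
                        ("parallel", "side_panel", "base"))
```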

    From Line Drawings to Human Actions: Deep Neural Networks for Visual Data Representation

    In recent years, deep neural networks have been very successful in computer vision, speech recognition, and artificial intelligence systems. The rapid growth of data and fast-increasing computational tools provide solid foundations for applications that rely on learning large-scale deep neural networks with millions of parameters. Deep learning approaches have proved able to learn powerful representations of the inputs in various tasks, such as image classification, object recognition, and scene understanding. This thesis demonstrates the generality and capacity of deep learning approaches through a series of case studies including image matching and human activity understanding. In these studies, I explore combinations of neural network models with existing machine learning techniques and extend the deep learning approach for each task. Four related tasks are investigated: 1) image matching through similarity learning; 2) human action prediction; 3) finger force estimation in manipulation actions; and 4) bimodal learning for human action understanding. Deep neural networks have been shown to be very efficient in supervised learning. Further, in some tasks, one would like the features of samples in the same category to be grouped close to each other, in addition to learning a discriminative representation. Such properties are desired in a number of applications, such as semantic retrieval, image quality measurement, and social network analysis. My first study develops a similarity learning method based on deep neural networks for image matching between sketch images and 3D models. In this task, I propose to use a Siamese network to learn similarities of sketches and develop a novel method for sketch-based 3D shape retrieval. The proposed method can successfully learn the representations of sketch images as well as the similarities; the 3D shape retrieval problem can then be solved with off-the-shelf nearest-neighbor methods. After studying representation learning methods for static inputs, my focus turns to learning representations of sequential data. Specifically, I focus on manipulation actions, because they are widely used in daily life and play an important part in human-robot collaboration systems. Deep neural networks have been shown to be powerful at representing short video clips [Donahue et al., 2015]. However, most existing methods consider the action recognition problem as a classification task. These methods assume the inputs are pre-segmented videos and the outputs are category labels. In scenarios such as human-robot collaboration, the ability to predict ongoing human actions at an early stage is highly important. I first address this issue with a fast manipulation action prediction method. I then build an action prediction model based on the Long Short-Term Memory (LSTM) architecture. The proposed approach processes the sequential inputs as continuous signals and keeps updating its prediction of the intended action based on the learned action representations. Further, I study the relationship between visual inputs and the physical information, such as finger forces, involved in manipulation actions. This is motivated by recent studies in cognitive science which show that the subject's intention is strongly related to the hand movements during action execution.
Human observers can interpret others' actions in terms of movements and forces, which can be used to repeat the observed actions. If a robot system has the ability to estimate force feedback, it can learn how to manipulate an object by watching human demonstrations. In this work, the finger forces are estimated by watching only the movement of the hands. A modified LSTM model is used to regress the finger forces from video frames. To facilitate this study, a specially designed sensor glove has been used to collect finger-force data, and a new dataset has been collected to provide synchronized streams of video and finger forces. Last, I investigate the usefulness of physical information in human action recognition, an application of bimodal learning where both the vision inputs and the additional information are used to learn the action representation. My study demonstrates that, by combining additional information with the vision inputs, the accuracy of human action recognition can be improved consistently. I extend the LSTM architecture to accept both video frames and sensor data as bimodal inputs to predict the action. A hallucination network is jointly trained to approximate the representations of the additional inputs. During the testing stage, the hallucination network generates approximated representations that are used for classification. In this way, the proposed method does not rely on the additional inputs at test time.
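The streaming-prediction idea, i.e. re-estimating the intended action after every frame so a guess is available before the action finishes, can be sketched as below. This is a generic PyTorch illustration; the layer sizes, feature extractor, and class names are assumptions, not the models trained in the thesis.

```python
# Sketch: an LSTM consumes per-frame features and updates its action prediction online.
import torch
import torch.nn as nn

class OnlineActionPredictor(nn.Module):
    def __init__(self, feat_dim=128, hidden=256, n_actions=10):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, frames, state=None):
        """frames: (batch, time, feat_dim) per-frame visual features."""
        out, state = self.lstm(frames, state)
        return self.head(out), state      # a prediction for every time step

model = OnlineActionPredictor()
state = None
for frame_feat in torch.randn(30, 1, 1, 128):   # stand-in for a streamed video
    logits, state = model(frame_feat, state)    # carry the LSTM state across frames
    current_guess = logits[:, -1].softmax(dim=-1).argmax(dim=-1)
```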
