3 research outputs found

    Robot Learning and Execution of Collaborative Manipulation Plans from YouTube Cooking Videos

    People often watch videos on the web to learn how to cook new recipes, assemble furniture, or repair a computer. We wish to enable robots with the very same capability. This is challenging: manipulation actions vary widely, and some videos even involve multiple people who collaborate by sharing and exchanging objects and tools. Furthermore, the learned representations need to be general enough to transfer to robotic systems. At the same time, previous work has shown that the space of human manipulation actions has a linguistic, hierarchical structure that relates actions to manipulated objects and tools. Building upon this theory of language for action, we propose a framework for understanding and executing demonstrated action sequences from full-length, unconstrained cooking videos on the web. The framework takes as input a cooking video annotated with object labels and bounding boxes, and outputs a collaborative manipulation action plan for one or more robotic arms. We demonstrate the performance of the system on a standardized dataset of 100 YouTube cooking videos, as well as on three full-length YouTube videos that include collaborative actions between two participants. We additionally propose an open-source platform for executing the learned plans both in a simulation environment and on an actual robotic arm.
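    The abstract describes the plan output only at a high level. As a rough illustration, here is a minimal Python sketch of what a "collaborative manipulation action plan for one or more robotic arms" might look like as a data structure; every class, field, and example value is a hypothetical assumption for illustration, not the authors' actual representation or their open-source platform's API.

```python
# Hypothetical sketch of a collaborative manipulation action plan.
# Names and fields are illustrative assumptions, not the paper's API.
from dataclasses import dataclass, field

@dataclass
class ManipulationAction:
    verb: str                 # e.g. "pour", "cut", "hand-over"
    target_object: str        # object label from the annotated video
    tool: str | None = None   # optional tool, e.g. "knife"
    arm: int = 0              # robotic arm assigned to execute the action

@dataclass
class ActionPlan:
    actions: list[ManipulationAction] = field(default_factory=list)

    def arms_used(self) -> set[int]:
        return {a.arm for a in self.actions}

# Example: a two-arm collaborative fragment, with arm 1 handing a tool to arm 0.
plan = ActionPlan([
    ManipulationAction("grasp", "knife", arm=1),
    ManipulationAction("hand-over", "knife", arm=1),
    ManipulationAction("cut", "tomato", tool="knife", arm=0),
])
print(plan.arms_used())  # {0, 1}
```

    The hand-over step mirrors the kind of collaboration the abstract mentions, where participants share and exchange objects and tools.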

    Robust human motion prediction for safe and efficient human-robot interaction

    No full text
    Thesis: Ph.D. in Autonomous Systems, Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2019. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 175-188).

    From robotic co-workers in factories to assistive robots in homes, human-robot interaction (HRI) has the potential to revolutionize a large array of domains by enabling robotic assistance where it was previously not possible. Introducing robots into human-occupied domains, however, requires strong consideration for the safety and efficiency of the interaction. One particularly effective method of supporting safe and efficient human-robot interaction is the use of human motion prediction. By predicting where a person might reach or walk toward in the upcoming moments, a robot can adjust its motions to proactively resolve motion conflicts and avoid impeding the person's movements.

    Current approaches to human motion prediction, however, often lack the robustness required for real-world deployment. Many methods are designed to predict specific types of tasks and motions and do not necessarily generalize well to other domains. It is also possible that no single predictor is suitable for a given scenario and that multiple predictors are needed. Due to these drawbacks, prediction is difficult to deploy on real robotic systems without expert knowledge in the field. Another key limitation of current approaches lies in partial trajectory alignment: aligning a partially executed motion to a representative trajectory is a key enabling technology for many goal-based prediction methods, yet current alignment approaches do not provide satisfactory alignments for many real-world trajectories. Specifically, because they rely on Euclidean distance metrics, overlapping trajectory regions and temporary stops lead to large alignment errors.

    In this thesis, I introduce two frameworks designed to improve the robustness of human motion prediction in order to facilitate its use for safe and efficient human-robot interaction. First, I introduce the Multiple-Predictor System (MPS), a data-driven approach that uses given task and motion data to synthesize a high-performing predictor by automatically identifying informative prediction features and combining the strengths of complementary prediction methods. Using three distinct human motion datasets, I show that the MPS yields lower prediction error in a variety of HRI scenarios and enables accurate prediction over a range of time horizons. Second, to address the drawbacks of prior alignment techniques, I introduce the Bayesian ESTimator for Partial Trajectory Alignment (BEST-PTA), a Bayesian estimation framework that combines optimization, supervised learning, and unsupervised learning components trained and synthesized from a given set of example trajectories. In an evaluation on three human motion datasets, I show that BEST-PTA reduces alignment error compared to state-of-the-art baselines, and that this improved alignment in turn reduces human motion prediction error.

    Lastly, to assess the utility of the developed methods for improving safety and efficiency in HRI, I introduce an integrated framework that combines prediction with robot planning in time, and I describe an implementation and evaluation of this framework on a real physical system. Through this demonstration, I show that the developed approach leads to automatically derived adaptive robot behavior, and a simulated evaluation shows that the framework improves quantitative metrics of safety and efficiency.

    "Funded by the NASA Space Technology Research Fellowship Program and the National Science Foundation"--Page 6

    by Przemyslaw A. Lasota. Ph.D. in Autonomous Systems, Massachusetts Institute of Technology, Department of Aeronautics and Astronautics.
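    The alignment failure mode described in the abstract is concrete enough to sketch. Below is a minimal Python illustration of the kind of Euclidean-distance, goal-based prediction baseline the thesis critiques; the greedy alignment scheme and all names are assumptions for illustration, not the thesis's MPS or BEST-PTA.

```python
# Illustrative baseline: align a partially observed motion to a representative
# trajectory per goal, then rank goals by alignment cost. This is a sketch of
# the Euclidean approach the abstract says breaks down, not BEST-PTA itself.
import numpy as np

def align_partial(partial: np.ndarray, reference: np.ndarray) -> float:
    """Greedy monotone alignment of a partial trajectory (T x D) to a
    reference trajectory (S x D), scored by mean Euclidean distance.
    Because it relies purely on Euclidean distance, overlapping trajectory
    regions and temporary stops can produce large alignment errors -- the
    weakness the thesis's BEST-PTA is designed to address."""
    j = 0
    cost = 0.0
    for point in partial:
        dists = np.linalg.norm(reference[j:] - point, axis=1)
        j += int(np.argmin(dists))  # monotone: never move backward
        cost += float(dists.min())
    return cost / len(partial)

def predict_goal(partial: np.ndarray, references: dict[str, np.ndarray]) -> str:
    """Return the goal whose representative trajectory best explains the
    observed partial motion."""
    return min(references, key=lambda g: align_partial(partial, references[g]))
```

    Per the abstract, the MPS would replace reliance on a single hand-built predictor like this one by combining complementary predictors with automatically identified features, while BEST-PTA would replace the purely Euclidean alignment step with a trained Bayesian estimator.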