
    Generative and predictive models for robust manipulation

    Probabilistic modelling of manipulation skills, perception and uncertainty poses many challenges at different stages of a typical robot manipulation pipeline. This thesis is about devising algorithms and strategies for improving robustness in object manipulation skills acquired from demonstration and derived from learnt physical models in non-prehensile tasks such as pushing. Manipulation skills can be made robust in different ways: first, by improving the time performance of grasp synthesis; second, by employing active perceptual strategies that exploit generated grasp hypotheses to gather task-relevant information for grasp generation more efficiently; and finally, by exploiting predictive uncertainty in learnt physical models. Hence, robust manipulation skills emerge from the interplay of a triad of capabilities: generative modelling for action synthesis, active perception, and learning and exploiting uncertainty in physical interactions. This thesis addresses these problems by:
    • showing how parametric models for approximating multimodal distributions can be used as a computationally faster method for generative grasp synthesis;
    • exploiting generative methods for dexterous grasp synthesis and investigating how active vision strategies can improve grasp execution safety and success rate while requiring fewer camera views of an object for grasp generation;
    • outlining methods to model and exploit predictive uncertainty from learnt forward models to achieve robust, uncertainty-averse non-prehensile manipulation, such as push manipulation.
    In particular, the thesis: (i) presents a framework for generative grasp synthesis with applications to real-time grasp synthesis suitable for multi-fingered robot hands; (ii) describes a sensorisation method for under-actuated hands, such as the Pisa/IIT SoftHand, which allows the aforementioned grasp synthesis framework to be deployed on this type of robotic hand; (iii) provides an active vision approach for view selection that uses generative grasp synthesis methods to perform perceptual predictions and thereby improve grasp performance, taking into account grasp execution safety and contact information; and (iv) finally, going beyond prehensile skills, provides an approach to model and exploit predictive uncertainty from learnt physics applied to push manipulation. Experimental results are presented in simulation and on real robot platforms to validate the proposed methods.
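    A minimal sketch of the parametric, multimodal generative idea behind (i), assuming demonstrated grasps are encoded as fixed-length pose vectors; the model choice, dimensions and names below are illustrative, not the thesis's actual implementation:

```python
# Hedged sketch: fit a parametric multimodal model (a Gaussian mixture) to
# demonstrated grasps, then synthesise candidates by sampling and ranking.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical training set: 200 demonstrated grasps, each a 7-D vector
# (3-D wrist position + 4-D quaternion). Placeholder random data here;
# in practice these would come from demonstrations on the target hand.
demo_grasps = rng.normal(size=(200, 7))

# A small number of components keeps sampling fast at synthesis time.
gmm = GaussianMixture(n_components=5, covariance_type="full").fit(demo_grasps)

# Generative synthesis: draw candidate grasps and rank by model density,
# so the most demonstration-like candidates are tried first.
candidates, _ = gmm.sample(n_samples=100)
scores = gmm.score_samples(candidates)
best = candidates[np.argsort(scores)[::-1][:10]]  # top-10 candidates
```

    Sampling and scoring a fitted parametric mixture is cheap compared with non-parametric density approximations, which is the plausible source of the claimed real-time speed-up.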

    Probabilistic mixture-based image modelling

    During the last decade we have introduced probabilistic mixture models into the image modelling area, which presents highly atypical and extremely demanding applications for these models. This difficulty arises from the necessity to model tens of thousands of correlated data points simultaneously and to reliably learn such unusually complex mixture models. This paper surveys these novel generative colour image models based on multivariate discrete, Gaussian or Bernoulli mixtures, respectively, and demonstrates their major advantages and drawbacks on texture modelling applications. Our mixture models are restricted to representing two-dimensional visual information. Thus a measured 3D multi-spectral texture is spectrally factorised, and the corresponding multivariate mixture models are learned from the individual orthogonal mono-spectral components and used to synthesise and enlarge these mono-spectral factor components. Texture synthesis is based on the easy computation of arbitrary conditional distributions from the model. Finally, the individual synthesised mono-spectral texture planes are transformed into the required synthetic multi-spectral texture. Such models can easily serve not only for texture enlargement but also for segmentation, restoration, and retrieval, or to model single factors in the unusually complex seven-dimensional Bidirectional Texture Function (BTF) space models. The strengths and weaknesses of the presented discrete, Gaussian and Bernoulli mixture based approaches are demonstrated on several colour texture examples.
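    The synthesis step rests on conditioning a learned joint mixture on known neighbouring pixels. A minimal sketch of the standard conditioning identity for a multivariate Gaussian mixture, with illustrative variable names and a toy usage (the paper's discrete and Bernoulli variants follow analogous formulas):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_conditional(weights, means, covs, x_a, idx_a, idx_b):
    """Conditional mixture p(x_b | x_a) of a Gaussian mixture p(x).

    Each component k of p(x_b | x_a) is Gaussian with
      mean = mu_b + S_ba S_aa^{-1} (x_a - mu_a)
      cov  = S_bb - S_ba S_aa^{-1} S_ab
    and weight proportional to w_k * N(x_a; mu_a, S_aa).
    """
    new_w, new_mu, new_S = [], [], []
    for w, mu, S in zip(weights, means, covs):
        mu_a, mu_b = mu[idx_a], mu[idx_b]
        S_aa = S[np.ix_(idx_a, idx_a)]
        S_ab = S[np.ix_(idx_a, idx_b)]
        S_ba = S[np.ix_(idx_b, idx_a)]
        S_bb = S[np.ix_(idx_b, idx_b)]
        gain = S_ba @ np.linalg.inv(S_aa)
        new_w.append(w * multivariate_normal.pdf(x_a, mu_a, S_aa))
        new_mu.append(mu_b + gain @ (x_a - mu_a))
        new_S.append(S_bb - gain @ S_ab)
    new_w = np.array(new_w) / np.sum(new_w)
    return new_w, new_mu, new_S

# Toy usage: condition a 2-component, 3-D mixture on the first two
# dimensions (e.g. known neighbour pixels) to predict the third.
w = [0.5, 0.5]
mu = [np.zeros(3), np.ones(3)]
S = [np.eye(3), np.eye(3) * 2.0]
cw, cmu, cS = gmm_conditional(w, mu, S, x_a=np.array([0.2, -0.1]),
                              idx_a=np.array([0, 1]), idx_b=np.array([2]))
```

    Synthesising a pixel then amounts to sampling from this conditional mixture given its already-generated neighbourhood.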

    Generative Models for Learning Robot Manipulation Skills from Humans

    A long-standing goal in artificial intelligence is to make robots seamlessly interact with humans when performing everyday manipulation skills. Learning from demonstrations, or imitation learning, provides a promising route to bridge this gap. In contrast to direct trajectory learning from demonstrations, many problems arise in interactive robotic applications that require a higher, contextual-level understanding of the environment. This requires learning invariant mappings in the demonstrations that can generalise across different environmental situations such as the size, position and orientation of objects, the viewpoint of the observer, etc. In this thesis, we address this challenge by encapsulating invariant patterns in the demonstrations using probabilistic learning models for acquiring dexterous manipulation skills. We learn the joint probability density function of the demonstrations with a hidden semi-Markov model, and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The model exploits the invariant segments (also termed sub-goals, options or actions) in the demonstrations and adapts the movement to external environmental situations, such as the size, position and orientation of objects in the environment, using a task-parameterised formulation. We incorporate high-dimensional sensory data for skill acquisition by parsimoniously representing the demonstrations using statistical subspace clustering methods and exploiting the coordination patterns in the latent space. To adapt the models on the fly and/or teach new manipulation skills online from streaming data, we formulate a non-parametric, scalable online sequence clustering algorithm with Bayesian non-parametric mixture models to avoid the model selection problem while ensuring tractability under small-variance asymptotics. We exploit the developed generative models to perform manipulation skills with remotely operated vehicles over satellite communication in the presence of communication delays and limited bandwidth. A set of task-parameterised generative models is learned from the demonstrations of different manipulation skills provided by the teleoperator. The model captures the intention of the teleoperator on the one hand, and on the other provides assistance in performing remote manipulation tasks under varying environmental situations. The assistance is formulated under time-independent shared control, where the model continuously corrects the remote arm movement based on the current state of the teleoperator, and/or time-dependent autonomous control, where the model synthesises the movement of the remote arm for autonomous skill execution. Using the proposed methodology with the two-armed Baxter robot as a mock-up for semi-autonomous teleoperation, we are able to learn manipulation skills such as opening a valve, picking and placing an object while avoiding obstacles, hot-stabbing (a specialised underwater task akin to peg-in-hole), screwdriver target snapping, and tracking a carabiner, in as few as 4-8 demonstrations. Our study shows that the proposed manipulation assistance formulations improve the performance of the teleoperator by reducing task errors and execution time, while catering for environmental differences in performing remote manipulation tasks with limited bandwidth and communication delays.
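    A minimal sketch of the generate-then-track idea, assuming a pre-learned sequence of HSMM state means and durations serves as a step-wise reference; tracking is simplified here to time-varying LQR feedback on the error about that reference, a reduced form of the linear quadratic tracking controller named in the abstract, with purely illustrative numbers:

```python
import numpy as np

dt, T = 0.01, 300
A = np.array([[1, dt], [0, 1]])   # double-integrator state: [position, velocity]
B = np.array([[0], [dt]])
Q = np.diag([1e3, 1e0])           # penalise deviation from the reference
R = np.array([[1e-2]])            # penalise control effort

# Hypothetical HSMM output: per-state means and durations (in time steps),
# standing in for the generated sequence of states in the abstract.
state_means = [0.0, 0.5, 0.8]
durations = [100, 100, 100]
ref = np.repeat(state_means, durations)   # step-wise reference trajectory

# Backward Riccati recursion for finite-horizon, time-varying gains.
P = Q.copy()
gains = []
for _ in range(T):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)
    gains.append(K)
gains.reverse()

# Forward pass: smoothly follow the step-wise reference.
x = np.array([0.0, 0.0])
for t in range(T):
    e = x - np.array([ref[t], 0.0])   # tracking error w.r.t. current sub-goal
    u = -gains[t] @ e                 # LQR feedback on the error
    x = A @ x + (B @ u).ravel()
```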

    Efficient Belief Propagation for Perception and Manipulation in Clutter

    Autonomous service robots are required to perform tasks in common human indoor environments. To achieve the goals associated with these tasks, the robot should continually perceive, reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging of these stages, as common indoor environments typically pose problems in recognizing objects under the inherent occlusions and physical interactions among them. Despite recent progress in the field of robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and needs to be addressed to achieve the desired autonomy. In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution, as a belief, over possible scene hypotheses. This belief representation captures the uncertainty resulting from inter-object occlusions and physical interactions, which are inherently present in cluttered indoor environments. The research efforts presented in this thesis are directed towards developing appropriate state representations and inference techniques to generate and maintain such a belief over contextually plausible scene states. We focus on providing the following features to generative inference while addressing the challenges due to occlusions: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space, which typically grows exponentially with the number of objects in a scene, and 3) preserving scene hypotheses over continual observations. To generate and maintain plausible scene hypotheses, we propose physics-informed scene estimation methods that combine a Newtonian physics engine with a particle-based generative inference framework. The proposed variants of our method, with and without a Monte Carlo step, showed promising results in generating and maintaining plausible hypotheses under complete occlusions. We show that estimating such scenarios would not be possible with the commonly adopted 3D registration methods, which lack the notion of physical context that our method provides. To scale context-informed inference to a larger number of objects, we describe a factorization of the scene state into objects and object parts to perform collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, which caters to the high-dimensional, multimodal nature of cluttered scenes while remaining computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates the poses of articulated objects under various simulated occlusion scenarios. To extend our PMPNBP algorithm to tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, where hypotheses are drawn from various proposals, followed by the selection, using PMPNBP, of a subset that explains the current state of the objects. We discuss and analyze our augmentation-selection method alongside its counterparts in the belief propagation literature.
Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. In our experiments, we show that our proposed pipeline can effectively maintain belief and track articulated objects over a sequence of observations under occlusion.
    PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd
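    A minimal sketch of the "pull" message update at the heart of particle-based belief propagation, assuming particle-set messages between two pose nodes (e.g. neighbouring object parts); the Gaussian potentials and dimensions are stand-ins, not the dissertation's actual models. The pull direction means each candidate particle for the target node gathers its own weight from the sender's particles, rather than having density pushed forward from the sender:

```python
import numpy as np

rng = np.random.default_rng(1)

def pairwise_potential(x_t, x_s):
    # Hypothetical compatibility between target pose x_t and source pose x_s.
    return np.exp(-0.5 * np.sum((x_t - x_s) ** 2))

def unary_likelihood(x_t):
    # Hypothetical observation likelihood for the target node.
    return np.exp(-0.5 * np.sum(x_t ** 2))

def pull_message(target_particles, source_particles, source_weights):
    """Weight each target particle by marginalising over source particles."""
    w = np.array([
        sum(sw * pairwise_potential(xt, xs)
            for xs, sw in zip(source_particles, source_weights))
        for xt in target_particles
    ])
    return w / w.sum()

# Belief at the source node (e.g. an object part) as weighted particles.
src = rng.normal(size=(50, 3))
src_w = np.full(50, 1.0 / 50)

# Candidate particles for the target node, reweighted by the pulled
# message and the target's own observation likelihood.
tgt = rng.normal(size=(50, 3))
msg_w = pull_message(tgt, src, src_w)
belief_w = msg_w * np.array([unary_likelihood(x) for x in tgt])
belief_w /= belief_w.sum()
```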

    Parametric Human Movements: Learning, Synthesis, Recognition, and Tracking


    Object-centric generative models for robot perception and action

    A robot manipulation system involves a pipeline consisting of the perception of objects in the environment and the planning of actions in 3D space. Deep learning approaches are employed to segment scenes into component objects and then learn object-centric features to predict actions for downstream tasks. Despite having achieved promising performance in several manipulation tasks, supervised approaches lack inductive biases related to general properties of objects. Recent advances show that by encoding and reconstructing scenes in an object-centric fashion, a model can discover object-like entities from raw data without human supervision. Moreover, by reconstructing the discovered objects, the model can learn a variational latent space that captures the various shapes and textures of the objects, regularised by a chosen prior distribution. In this thesis, we investigate the properties of this learned object-centric latent space and develop novel object-centric generative models (OCGMs) that can be applied to real-world robotics scenarios. In the first part of this thesis, we investigate a tool-synthesis task which leverages a learned latent space to optimise a wide range of tools applied to a reaching task. Given an image that illustrates the obstacles and the reaching target in the scene, an affordance predictor is trained to predict the feasibility of a tool for the given task. To imitate human tool-use experience, feasibility labels are acquired from simulated trial-and-error of the reaching task. We found that by employing an activation-maximisation step, the model can synthesise appropriate tools for the given tasks with high accuracy. Moreover, the tool-synthesis process indicates the existence of a task-relevant trajectory in the learned latent space that can be found by a trained affordance predictor. The second part of this thesis focuses on the development of novel OCGMs and their applications to robotic tasks. We first introduce a 2D OCGM that is deployed on robot manipulation datasets in both simulated and real-world scenarios. Despite the intensive interactions between the robot arm and objects, we find the model discovers meaningful object entities from the raw observations without any human supervision. We next upgrade the 2D OCGM to 3D by leveraging NeRFs as decoders to explicitly model the 3D geometry of objects and the background. To disentangle an object's spatial information from its appearance information, we propose a minimum-volume principle for unsupervised 6D pose estimation of objects. Considering occlusion in the scene, we further improve the pose estimation by introducing a shape completion module to imagine the unobserved parts of objects before the pose estimation step. In the end, we successfully apply the model in real-world robotics scenarios and compare its performance against several baselines on tasks including 3D reconstruction, object-centric latent representation learning, and 6D pose estimation for object rearrangement. We find that despite being an unsupervised approach, our model achieves improved performance across a range of real-world tasks.
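    A minimal sketch of an activation-maximisation step of the kind used in the tool-synthesis task: gradient ascent on a trained affordance predictor's feasibility score with respect to the tool's latent code, then decoding the optimised code into a tool. Both networks below are untrained stand-ins, not the thesis's decoder and affordance classifier:

```python
import torch
import torch.nn as nn

latent_dim = 16
# Stand-in decoder (latent code -> flattened tool image) and affordance
# predictor (tool image -> feasibility score); in practice both are trained.
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 32 * 32))
affordance = nn.Sequential(nn.Linear(32 * 32, 64), nn.ReLU(), nn.Linear(64, 1))

z = torch.randn(1, latent_dim, requires_grad=True)   # initial tool code
opt = torch.optim.Adam([z], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    tool_image = decoder(z)               # decode latent into a candidate tool
    feasibility = affordance(tool_image)  # predicted task feasibility
    loss = -feasibility.mean()            # ascend the feasibility score
    loss.backward()
    opt.step()

synthesised_tool = decoder(z).detach().reshape(32, 32)
```

    The optimisation traces a path through the latent space, which is consistent with the abstract's observation of a task-relevant trajectory recoverable by the affordance predictor.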

    Generative neural data synthesis for autonomous systems

    A significant number of machine learning methods for automation currently rely on data-hungry training techniques. The lack of accessible training data often represents an insurmountable obstacle, especially in the fields of robotics and automation, where acquiring new data can be far from trivial. Additional data acquisition is not only often expensive and time-consuming, but occasionally is not even an option. Furthermore, real-world applications sometimes have commercial sensitivity issues associated with the distribution of the raw data. This doctoral thesis explores bypassing the aforementioned difficulties by synthesising new realistic and diverse datasets using Generative Adversarial Networks (GANs). The success of this approach is demonstrated empirically through solving a variety of case-specific data-hungry problems via the application of novel GAN-based techniques and architectures. Specifically, it starts with exploring the use of GANs for the realistic simulation of extremely high-dimensional underwater acoustic imagery for the purpose of training both teleoperators and autonomous target recognition systems. We have developed a method capable of generating realistic sonar data of any chosen dimension using image-translation GANs with a Markov principle. Following this, we apply GAN-based models to robot behavioural repertoire generation, enabling a robot manipulator to successfully overcome unforeseen impedances, such as unknown sets of obstacles and random broken-joint scenarios. Finally, we consider dynamical system identification for articulated robot arms. We show how using diversity-driven GAN models to generate exploratory trajectories allows dynamic parameters to be identified more efficiently and accurately than with conventional optimisation approaches. Together, these results show that GANs have the potential to benefit a variety of robotics learning problems where training data is currently a bottleneck.
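    A minimal sketch of the adversarial training loop underlying such data-synthesis pipelines, on toy one-dimensional data standing in for sonar imagery or exploratory trajectories; architectures and hyperparameters are illustrative only, not the thesis's models:

```python
import torch
import torch.nn as nn

# Generator maps 8-D noise to a 1-D sample; discriminator scores realness.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0   # stand-in "real" data distribution
    fake = G(torch.randn(64, 8))

    # Discriminator update: separate real from generated samples.
    opt_d.zero_grad()
    loss_d = (bce(D(real), torch.ones(64, 1))
              + bce(D(fake.detach()), torch.zeros(64, 1)))
    loss_d.backward()
    opt_d.step()

    # Generator update: fool the discriminator.
    opt_g.zero_grad()
    loss_g = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()
```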

    Principles, opportunism and seeing in design: a computational approach

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture; and (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1991. Includes bibliographical references (leaves 100-103).
    This thesis introduces elements of a theory of design activity and a computational framework for developing design systems. The theory stresses the opportunistic nature of designing and the complementary roles of focus and distraction, the interdependence of evaluation and generation, the multiplicity of ways of seeing over the history of a design session versus the exclusivity of a given way of seeing over an arbitrarily short period, and the incommensurability of the criteria used to evaluate a design. The thesis argues for a principle-based rather than rule-based approach to designing design systems, and highlights the manifest nature of design documents. The Discursive Generator is presented as a computational framework for implementing specific design systems, and a simple system for arranging blocks according to a set of formal principles is developed by way of illustration. Both shape grammars and constraint-based systems are used to contrast current trends in design automation with the discursive approach advocated in the thesis. The Discursive Generator is shown to have some important properties lacking in other types of system, such as dynamism, robustness and the ability to deal with partial designs. When studied in terms of a search metaphor, the Discursive Generator is shown to exhibit behavior which is radically different from some traditional search techniques, and to avoid some of the well-known difficulties associated with them.
    by Pegor H. Papazian. M.S.