249 research outputs found

    Structured machine learning models for robustness against different factors of variability in robot control

    An important feature of human sensorimotor skill is our ability to learn to reuse skills across different environmental contexts, in part due to our understanding of the attributes of variability in these environments. This thesis explores how the structure of models used within learning for robot control could similarly help autonomous robots cope with variability, hence achieving skill generalisation. The overarching approach is to develop modular architectures that judiciously combine different forms of inductive bias for learning. In particular, we consider how models and policies should be structured in order to achieve robust behaviour in the face of different factors of variation, whether in the environment, in objects or in other internal parameters of a policy, with the end goal of more robust, accurate and data-efficient skill acquisition and adaptation. At a high level, variability in skill is determined by variations in the constraints presented by the external environment, and in task-specific perturbations that affect the specification of optimal action. A typical example of environmental perturbation would be variation in lighting and illumination, affecting the noise characteristics of perception. An example of task perturbation would be variation in object geometry, mass or friction, and in the specification of costs associated with speed or smoothness of execution. We counteract these factors of variation by exploring three forms of structuring: utilising separate data sets curated according to the relevant factor of variation, building neural network models that incorporate this factorisation into the very structure of the networks, and learning structured loss functions. The thesis comprises four projects exploring this theme within robotics planning and prediction tasks. Firstly, in the setting of trajectory prediction in crowded scenes, we explore a modular architecture for learning static and dynamic environmental structure. We show that factorising the prediction problem from the individual representations allows for robust and label-efficient forward modelling, and relaxes the need for full model re-training in new environments. This modularity explicitly allows trajectory prediction models to be adapted, in a more flexible and interpretable way, to build on pre-trained state-of-the-art models. We show that this results in more efficient motion prediction and allows for performance comparable to state-of-the-art supervised 2D trajectory prediction. Next, in the domain of contact-rich robotic manipulation, we consider a modular architecture that combines model-free learning from demonstration, in particular dynamic movement primitives (DMPs), with modern model-free reinforcement learning (RL), using both on-policy and off-policy approaches. We show that factorising the skill learning problem into skill acquisition and error correction through policy adaptation strategies, such as residual learning, can help improve the overall performance of policies in the context of contact-rich manipulation. We empirically evaluate how best to do this with DMPs and propose "residual Learning from Demonstration" (rLfD), a framework that combines DMPs with RL to learn a residual correction policy. Our evaluations, performed both in simulation and on a physical system, suggest that applying residual learning directly in task space and operating on the full pose of the robot can significantly improve the overall performance of DMPs.
We show that rLfD offers a solution that is gentle on the joints and improves the task success and generalisation of DMPs. Last but not least, our study shows that the extracted correction policies can be transferred to different geometries and frictions through few-shot task adaptation. Third, we employ meta-learning to learn time-invariant reward functions, wherein both the objectives of a task (i.e., the reward functions) and the policy for performing that task optimally are learnt simultaneously. We propose a novel inverse reinforcement learning (IRL) formulation that allows us to 1) vary the length of execution by learning time-invariant costs, and 2) relax the temporal alignment requirements for learning from demonstration. We apply our method to two different types of cost formulations and evaluate their performance in the context of learning reward functions for simulated placement and peg-in-hole tasks executed on a 7-DoF KUKA iiwa arm. Our results show that our approach enables learning temporally invariant rewards from misaligned demonstrations that can also generalise spatially to out-of-distribution tasks. Finally, we employ our observations to evaluate adversarial robustness in the context of transfer learning from a source network trained on CIFAR-100 to a target network trained on CIFAR-10. Specifically, we study the effects of using robust optimisation in the source and target networks. This allows us to identify transfer learning strategies under which adversarial defences are successfully retained, in addition to revealing potential vulnerabilities. We study the extent to which adversarially robust features can preserve their defence properties against black- and white-box attacks under three different transfer learning strategies. Our empirical evaluations give insights into how well adversarial robustness under transfer learning can generalise.
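The residual composition behind rLfD can be illustrated with a short sketch. The following is a minimal, hypothetical Python example, not the thesis code: a bare-bones one-dimensional DMP provides the base action, and the placeholder `residual_policy` stands in for the RL-trained correction that is applied in task space.

```python
# Minimal sketch (not the authors' code): composing a DMP base policy with a learned
# residual correction in task space, in the spirit of the rLfD idea described above.
# `residual_policy` is a hypothetical stand-in for a trained residual RL policy.
import numpy as np

def dmp_step(y, dy, goal, forcing, tau=1.0, alpha=25.0, beta=6.25, dt=0.01):
    """One Euler step of a DMP transformation system: ddy = alpha*(beta*(g - y) - dy) + f."""
    ddy = (alpha * (beta * (goal - y) - dy) + forcing) / tau
    dy = dy + ddy * dt
    y = y + dy * dt
    return y, dy

def residual_policy(error):
    # Placeholder for pi_res(s) -> small task-space correction (assumption: toy term).
    return 0.05 * np.tanh(-error)

y, dy, goal = 0.0, 0.0, 1.0
for t in range(200):
    y_base, dy = dmp_step(y, dy, goal, forcing=0.0)   # base action from the demonstration-fit DMP
    y = y_base + residual_policy(y_base - goal)       # residual correction added in task space
print(f"final position: {y:.3f} (goal {goal})")
```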

    Robot manipulator skill learning and generalising through teleoperation

    Robot manipulators have been widely used for simple, repetitive and accurate tasks in industrial plants, such as pick-and-place, assembly and welding, but they are still hard to deploy in human-centred environments for dexterous manipulation tasks, such as medical examination and robot-assisted healthcare. These tasks involve not only motion planning and control but also the compliant interaction behaviour of robots, e.g. simultaneous motion control, force regulation and impedance adaptation in dynamic and unknown environments. Recently, with the development of collaborative robots (cobots) and machine learning, robot skill learning and generalising have attracted increasing attention from the robotics, machine learning and neuroscience communities. Nevertheless, learning complex and compliant manipulation skills, such as manipulating deformable objects, scanning the human body and folding clothes, is still challenging for robots. On the other hand, teleoperation, also known as remote operation or telerobotics, has been studied since the 1950s, with applications including space exploration, telemedicine, marine vehicles and emergency response. One of its advantages is to combine the precise control of robots with human intelligence to perform dexterous and safety-critical tasks from a distance. In addition, telepresence allows remote operators to feel the actual interaction between the robot and the environment, including visual, sound and haptic feedback. Especially with the development of augmented reality (AR), virtual reality (VR) and wearable devices, intuitive and immersive teleoperation has received increasing attention from the robotics and computer science communities. Thus, various human-robot collaboration (HRC) interfaces based on the above technologies have been developed to integrate robot control and telemanipulation by human operators so that robots can learn skills from human beings. In this context, robot skill learning could benefit teleoperation by automating repetitive and tedious tasks, while teleoperated demonstration and interaction by human teachers also allow the robot to learn progressively and interactively. Therefore, in this dissertation, we study human-robot skill transfer and generalising through intuitive teleoperation interfaces for contact-rich manipulation tasks, including medical examination, manipulating deformable objects, grasping soft objects and composite layup in manufacturing. The motivation and objectives of this thesis are introduced in Chapter 1. In Chapter 2, a literature review on manipulation skill acquisition through teleoperation is carried out, and the motivation and objectives of this thesis are discussed subsequently. Overall, the main content of this thesis is organised into three parts: Part 1 (Chapter 3) introduces the development and controller design of teleoperation systems with multimodal feedback, which is the foundation of this project for robot learning from human demonstration and interaction. In Part 2 (Chapters 4, 5, 6 and 7), we studied primitive skill library theory, a behaviour-tree-based modular method, and a perception-enhanced method to improve the generalisation capability of learning from human demonstrations; several applications were employed to evaluate the effectiveness of these methods. In Part 3 (Chapter 8), we studied deep multimodal neural networks to encode the manipulation skill, especially the multimodal perception information.
In this part, physical experiments were conducted on robot-assisted ultrasound scanning applications. Chapter 9 summarises the contributions and potential directions of this thesis. Keywords: Learning from demonstration; Teleoperation; Multimodal interface; Human-in-the-loop; Compliant control; Human-robot interaction; Robot-assisted sonography
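The primitive-skill library and behaviour-tree-based modular method mentioned in Part 2 can be pictured with a small sketch. The following Python example is illustrative only, not the thesis implementation: the skill names, the context dictionary and the success logic are assumptions; a behaviour-tree "sequence" node runs primitives from a library in order and fails fast.

```python
# Minimal sketch (assumptions throughout): a primitive-skill library composed by a
# behaviour-tree sequence node, illustrating the modular structure named above.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Skill:
    name: str
    execute: Callable[[dict], bool]   # returns True on success

def approach(ctx):
    ctx["pose"] = "pre-contact"
    return True

def make_contact(ctx):
    ctx["force"] = 2.0
    return True

def scan_surface(ctx):
    return ctx.get("force", 0.0) > 1.0   # only succeeds if contact was established

SKILL_LIBRARY: Dict[str, Skill] = {
    s.name: s for s in [
        Skill("approach", approach),
        Skill("make_contact", make_contact),
        Skill("scan_surface", scan_surface),
    ]
}

def run_sequence(skill_names: List[str], ctx: dict) -> bool:
    """Behaviour-tree sequence node: run children in order, fail fast."""
    for name in skill_names:
        if not SKILL_LIBRARY[name].execute(ctx):
            print(f"sequence failed at '{name}'")
            return False
    return True

ctx = {}
print("task succeeded:", run_sequence(["approach", "make_contact", "scan_surface"], ctx))
```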

    Behavior-specific proprioception models for robotic force estimation: a machine learning approach

    Robots that support humans in physically demanding tasks require accurate force sensing capabilities. A common way to achieve this is by monitoring the interaction with the environment directly with dedicated force sensors. Major drawbacks of such special-purpose sensors are the increased costs and the reduced payload of the robot platform. Instead, this thesis investigates how the functionality of such sensors can be approximated by utilizing force estimation approaches. Most of today's robots are equipped with rich proprioceptive sensing capabilities, where even a robotic arm, e.g., the UR5, provides access to more than a hundred sensor readings. Following this trend, it is becoming feasible to utilize a wide variety of sensors for force estimation purposes. Human proprioception allows estimating forces such as the weight of an object by prior experience about sensory-motor patterns. Applying a similar approach to robots enables them to learn from previous demonstrations without the need for dedicated force sensors. This thesis introduces Behavior-Specific Proprioception Models (BSPMs), a novel concept for enhancing robotic behavior with estimates of the expected proprioceptive feedback. A main methodological contribution is the operationalization of the BSPM approach using data-driven machine learning techniques. During a training phase, the behavior is continuously executed while recording proprioceptive sensor readings. The training data acquired from these demonstrations represents ground truth about behavior-specific sensory-motor experiences, i.e., the influence of performed actions and environmental conditions on the proprioceptive feedback. This data acquisition procedure does not require expert knowledge about the particular robot platform, e.g., kinematic chains or mass distribution, which is a major advantage over analytical approaches. The training data is then used to learn BSPMs, e.g., using lazy learning techniques or artificial neural networks. At runtime, the BSPMs provide estimates of the proprioceptive feedback that can be compared to actual sensations. The BSPM approach thus extends classical programming-by-demonstration methods, where only movement data is learned, and enables robots to accurately estimate forces during behavior execution.
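A minimal sketch of the BSPM idea, under simplifying assumptions: a lazy-learning regressor (here, scikit-learn's KNeighborsRegressor) is trained to predict the proprioceptive signal expected during a behaviour, and deviations between measured and predicted readings at runtime are treated as evidence of external forces. The toy dynamics and synthetic data below stand in for recorded demonstrations.

```python
# Minimal sketch of a behavior-specific proprioception model: predict the expected
# joint torque from the joint state, then interpret runtime deviations as external load.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor   # a "lazy learning" regressor

rng = np.random.default_rng(0)

# Training: joint position/velocity -> torque observed during unloaded executions (synthetic).
q  = rng.uniform(-1.0, 1.0, size=(500, 1))          # joint position
dq = rng.uniform(-0.5, 0.5, size=(500, 1))          # joint velocity
tau = 3.0 * np.sin(q) + 0.4 * dq + rng.normal(0, 0.02, size=(500, 1))  # toy dynamics

model = KNeighborsRegressor(n_neighbors=5)
model.fit(np.hstack([q, dq]), tau.ravel())

# Runtime: compare the measured torque with the behavior-specific prediction.
q_now, dq_now = 0.3, 0.1
tau_expected = model.predict([[q_now, dq_now]])[0]
tau_measured = 3.0 * np.sin(q_now) + 0.4 * dq_now + 0.8   # +0.8 Nm from an external contact
external_torque_estimate = tau_measured - tau_expected
print(f"estimated external torque: {external_torque_estimate:.2f} Nm")
```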

    Modeling variation of human motion

    The synthesis of realistic human motion with large variations and different styles is of growing interest for simulation applications such as the game industry, psychological experiments, and ergonomic analysis. In our motion synthesis framework, statistical generative models are used by motion controllers to create new animations for different scenarios. Data-driven motion synthesis approaches are powerful tools for producing high-fidelity character animations. With the development of motion capture technologies, more and more motion data are now publicly available. However, how to efficiently reuse a large amount of motion data to create new motions for arbitrary scenarios poses challenges, especially for unsupervised motion synthesis. This thesis presents a series of works that analyze and model the variations of human motion data. The goal is to learn statistical generative models to create any number of new human animations with rich variations and styles. The work of the thesis is presented in three main chapters. We first explore how variation is represented in motion data. Learning a compact latent space that can expressively contain motion variation is essential for modeling motion data. We propose a novel motion latent space learning approach that can intrinsically tackle the spatio-temporal properties of motion data. Secondly, we present our Morphable Graph framework for human motion modeling and synthesis for assembly workshop scenarios. A series of studies have been conducted to apply statistical motion modeling and synthesis approaches to complex assembly workshop use cases. Learning the distribution of motion data can provide a compact representation of motion variations and convert motion synthesis tasks into optimization problems. Finally, we show how the style variations of human activities can be modeled with a limited number of examples. Natural human movements display a rich repertoire of styles and personalities. However, it is difficult to get enough examples for data-driven approaches. We propose a conditional variational autoencoder (CVAE) to combine the large variations in a neutral motion database with style information from a limited number of examples. We show that our approach can generate an arbitrary number of natural-looking variations of human motion with a style similar to the target.
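A conditional VAE of the kind described above can be sketched compactly. The following PyTorch example is illustrative rather than the thesis model: the flattened pose-sequence representation, the dimensions and the KL weight are assumptions; it shows an encoder and decoder both conditioned on a style label, with the usual reparameterisation trick.

```python
# Minimal sketch of a conditional VAE over fixed-length pose sequences, conditioned
# on a style label. Dimensions, data and the KL weight are illustrative assumptions.
import torch
import torch.nn as nn

class MotionCVAE(nn.Module):
    def __init__(self, motion_dim=60 * 63, style_dim=4, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(motion_dim + style_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + style_dim, 256), nn.ReLU(), nn.Linear(256, motion_dim)
        )

    def forward(self, motion, style):
        h = self.enc(torch.cat([motion, style], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        recon = self.dec(torch.cat([z, style], dim=-1))
        return recon, mu, logvar

def cvae_loss(recon, target, mu, logvar):
    rec = nn.functional.mse_loss(recon, target)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + 1e-3 * kld   # KL weight is an assumption

model = MotionCVAE()
motion = torch.randn(8, 60 * 63)                                    # batch of flattened pose sequences
style = nn.functional.one_hot(torch.randint(0, 4, (8,)), 4).float() # one-hot style labels
recon, mu, logvar = model(motion, style)
print("loss:", cvae_loss(recon, motion, mu, logvar).item())
```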

    Real-Time Hybrid Visual Servoing of a Redundant Manipulator via Deep Reinforcement Learning

    Fixtureless assembly may be necessary in some manufacturing tasks and environments due to various constraints but poses challenges for automation due to non-deterministic characteristics not favoured by traditional approaches to industrial automation. Visual servoing methods of robotic control could be effective for sensitive manipulation tasks where the desired end-effector pose can be ascertained via visual cues. Visual data is complex and computationally expensive to process, but deep reinforcement learning has shown promise for robotic control in vision-based manipulation tasks. However, these methods are rarely used in industry due to the resources and expertise required to develop application-specific systems and prohibitive training costs. Training reinforcement learning models in simulated environments offers a number of benefits for the development of robust robotic control algorithms by reducing training time and costs, and by providing repeatable benchmarks on which algorithms can be tested, developed and eventually deployed in real robotic control environments. In this work, we present a new simulated reinforcement learning environment for developing accurate robotic manipulation control systems in fixtureless environments. Our environment incorporates a contemporary collaborative industrial robot, the KUKA LBR iiwa, with the goal of positioning its end effector in a generic fixtureless environment based on a visual cue. Observational inputs comprise the robotic joint positions and velocities, as well as two cameras whose positioning reflects hybrid visual servoing, with one camera attached to the robotic end-effector and another observing the workspace. We propose a state-of-the-art deep reinforcement learning approach to solving the task environment and make preliminary assessments of the efficacy of this approach to hybrid visual servoing methods for the defined problem environment. We also conduct a series of experiments exploring the hyperparameter space in the proposed reinforcement learning method. Although we could not prove the efficacy of a deep reinforcement learning approach to solving the task environment with our initial results, we remain confident that such an approach could be feasible for solving this industrial manufacturing challenge and that our contributions in this work, in terms of the novel software, provide a good basis for the exploration of reinforcement learning approaches to hybrid visual servoing in accurate manufacturing contexts.
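The hybrid visual servoing observation and action interface described above can be sketched as a simulated environment definition. The following Gymnasium-style Python example is a stand-in, not the environment from this work: the image sizes, joint count and dummy dynamics are assumptions; it only illustrates an observation space combining joint positions and velocities with an eye-in-hand and a workspace camera.

```python
# Minimal sketch of an observation/action interface for hybrid visual servoing.
# All names, shapes and the placeholder dynamics are illustrative assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class HybridVisualServoEnv(gym.Env):
    def __init__(self, n_joints=7, img_hw=(84, 84)):
        super().__init__()
        h, w = img_hw
        self.observation_space = spaces.Dict({
            "joint_pos": spaces.Box(-np.pi, np.pi, shape=(n_joints,), dtype=np.float32),
            "joint_vel": spaces.Box(-2.0, 2.0, shape=(n_joints,), dtype=np.float32),
            "eye_in_hand": spaces.Box(0, 255, shape=(h, w, 3), dtype=np.uint8),
            "workspace_cam": spaces.Box(0, 255, shape=(h, w, 3), dtype=np.uint8),
        })
        self.action_space = spaces.Box(-1.0, 1.0, shape=(n_joints,), dtype=np.float32)

    def _obs(self):
        # Placeholder observations; a real environment would render cameras and read joint state.
        return {k: np.zeros(s.shape, dtype=s.dtype) for k, s in self.observation_space.spaces.items()}

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return self._obs(), {}

    def step(self, action):
        reward = -float(np.linalg.norm(action))   # placeholder reward: penalise large commands
        return self._obs(), reward, False, False, {}

env = HybridVisualServoEnv()
obs, _ = env.reset()
print(sorted(obs.keys()))
```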

    On the Use of Large Area Tactile Feedback for Contact Data Processing and Robot Control

    The progress in microelectronics and embedded systems has recently enabled the realization of devices for robots functionally similar to the human skin, providing large area tactile feedback over the whole robot body. The availability of such systems, commonly referred to as robot skins, makes it possible to measure the contact pressure distribution applied on the robot body over an arbitrary area. Large area tactile systems open new scenarios for contact processing, both for control and for cognitive-level processing, enabling the interpretation of physical contacts. This thesis addresses these topics by proposing techniques exploiting large area tactile feedback for: (i) contact data processing and classification; (ii) robot control.
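Contact data processing over a large-area tactile array can be illustrated with a short sketch. The following Python example is hypothetical and not from the thesis: given a 2-D pressure map from a patch of robot skin, it extracts simple contact features (contact area, total normal force and pressure-weighted centroid); the taxel pitch, threshold and synthetic contact are assumptions.

```python
# Minimal sketch: simple contact-feature extraction from a 2-D taxel pressure map.
import numpy as np

def contact_features(pressure_map, taxel_pitch_m=0.005, threshold=0.1):
    """Return contact area [m^2], total normal force [N] and contact centroid [m]."""
    active = pressure_map > threshold                          # taxels considered in contact
    if not active.any():
        return 0.0, 0.0, None
    area = active.sum() * taxel_pitch_m ** 2
    force = pressure_map[active].sum() * taxel_pitch_m ** 2    # pressure [Pa] times taxel area
    ys, xs = np.nonzero(active)
    weights = pressure_map[ys, xs]
    centroid = (np.average(xs, weights=weights) * taxel_pitch_m,
                np.average(ys, weights=weights) * taxel_pitch_m)
    return area, force, centroid

# Synthetic 16x16 taxel patch with a small circular contact.
yy, xx = np.mgrid[0:16, 0:16]
pressure = 2000.0 * np.exp(-((xx - 8) ** 2 + (yy - 8) ** 2) / 6.0)   # Pa
print(contact_features(pressure))
```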

    Centralized learning and planning: for cognitive robots operating in human domains


    Architectures for online simulation-based inference applied to robot motion planning

    Robotic systems have enjoyed significant adoption in industrial and field applications in structured environments, where clear specifications of the task and observations are available. Deploying robots in unstructured and dynamic environments remains a challenge, and is being addressed through emerging advances in machine learning. The key open issues in this area include the difficulty of achieving coverage of all factors of variation in the domain of interest and of satisfying safety constraints. One tool that has played a crucial role in addressing these issues is simulation, which is used to generate data and sometimes serves as a world representation within the decision-making loop. When physical simulation modules are used in this way, a number of computational problems arise. Firstly, a simulation representation and fidelity suitable for the specific task of interest is required. Secondly, we need to perform parameter inference over the physical variables used in the simulation models. Thirdly, there is the need for data assimilation, which must be achieved in real time if the resulting model is to be used within the online decision-making loop. These are the motivating problems for this thesis. In the first section of the thesis, we tackle the inference problem with respect to a fluid simulation model, where a sensorised UAV performs path planning with the objective of acquiring data, including gas concentration/identity and IMU-based wind estimation readings. The task for the UAV is to localise the source of a gas leak, while accommodating the subsequent dispersion of the gas in windy conditions. We present a formulation of this problem that allows us to perform online and real-time active inference efficiently through problem-specific simplifications. In the second section of the thesis, we explore the problem of robot motion planning when the true state is not fully observable, and actions influence how much of the state is subsequently observed. This is motivated by the practical problem of a robot performing suction in the surgical automation setting. The objective is the efficient removal of liquid while respecting a safety constraint: not to touch the underlying tissue if possible. If the problem were represented in full generality, as one of planning under uncertainty and hidden state, it could be hard to find computationally efficient solutions. Once again, we make problem-specific simplifications. Crucially, instead of reasoning in general about fluid flows and arbitrary surfaces, we exploit the observations that the decision can be informed by the contour tree skeleton of the volume, and by the configurations in which the fluid would come to rest if unperturbed. This allows us to address the problem as one of iterative shortest path computation, whose costs are informed by a model estimating the shape of the underlying surface. In the third and final section of the thesis, we propose a model for real-time parameter estimation directly from raw pixel observations. Through the use of a Variational Recurrent Neural Network model, where the latent space is further structured by penalising for fit to data from a physical simulation, we devise an efficient online inference scheme. This is first shown in the context of a representative dynamic manipulation task for a robot. This task involves reasoning about a bouncing ball that the robot must catch, using as input the raw video from an environment-mounted camera and accommodating noise and variations in the object and environmental conditions.
We then show that the same architecture lends itself to solving inference problems involving more complex dynamics, by applying it to the measurement inversion of ultrafast X-ray scattering data to infer molecular geometry.
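The idea of structuring a learned latent state with a physical simulator, as in the third section above, can be sketched in a few lines. The following PyTorch example is a simplified stand-in for the Variational Recurrent Neural Network described in the thesis: a GRU encodes raw frames, a head reads out a physical parameter, and the training loss adds a penalty for disagreement between a differentiable toy bouncing-ball simulator and the observed trajectory. The simulator, dimensions and data are assumptions, and the full model's ELBO terms are omitted.

```python
# Minimal sketch: structure a recurrent latent state with a physics-simulation
# consistency penalty. Everything here (simulator, sizes, data) is illustrative.
import torch
import torch.nn as nn

def simulate_ball(height0, n_steps=20, dt=0.05, g=9.81, restitution=0.8):
    """Toy bouncing-ball simulator returning a height trajectory (differentiable)."""
    h, v, traj = height0, torch.zeros_like(height0), []
    for _ in range(n_steps):
        v = v - g * dt
        h = h + v * dt
        v = torch.where(h < 0, -restitution * v, v)   # bounce at the floor
        h = torch.clamp(h, min=0.0)
        traj.append(h)
    return torch.stack(traj, dim=-1)

class PixelToParam(nn.Module):
    def __init__(self, frame_dim=32 * 32, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(frame_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)               # estimate of the ball's initial height

    def forward(self, frames):                         # frames: (B, T, frame_dim)
        _, h_last = self.rnn(frames)
        return nn.functional.softplus(self.head(h_last[-1])).squeeze(-1)

model = PixelToParam()
frames = torch.randn(4, 20, 32 * 32)                   # stand-in for raw video
observed_heights = simulate_ball(torch.tensor([1.0, 1.5, 2.0, 0.5]))
estimated_h0 = model(frames)
sim_penalty = nn.functional.mse_loss(simulate_ball(estimated_h0), observed_heights)
loss = sim_penalty                                      # plus the VRNN ELBO terms in the full model
loss.backward()
print("simulation-consistency penalty:", sim_penalty.item())
```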