    Shared Autonomy via Hindsight Optimization

    In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the history of inputs. Ideally, the robot assists the user by solving for an action which minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while utilizing less input. However, when asked to rate each system, users were mixed in their assessment, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
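
As a rough illustration of the two steps described in this abstract (inferring the goal, then acting on the belief), the Python sketch below uses a toy 1-D world. The exponentiated-cost likelihood is a common simplification of maximum entropy inverse optimal control, and the belief-weighted cost-to-go stands in for the hindsight optimization approximation. The unit step cost, the distance-based cost-to-go, and the function names `goal_posterior` and `hindsight_action` are assumptions made for illustration, not the authors' implementation.

```python
import math


def goal_posterior(prior, path_cost, value_now, value_start):
    """Posterior over goals given the path travelled so far.

    Assumed likelihood: p(path | g) is proportional to
    exp(-(path_cost + V_g(x_t) - V_g(x_0))), where V_g is the cost-to-go to goal g.
    """
    weights = [
        p * math.exp(-(path_cost + v_now - v_start))
        for p, v_now, v_start in zip(prior, value_now, value_start)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def hindsight_action(belief, state, goals, actions):
    """Pick the action minimizing the belief-weighted cost-to-go sum_g b(g) * Q_g(x, a)."""

    def q_value(goal, action):
        # Toy model: every step costs 1, plus the remaining distance to the goal.
        return 1.0 + abs(goal - (state + action))

    def expected_cost(action):
        return sum(b * q_value(g, action) for b, g in zip(belief, goals))

    return min(actions, key=expected_cost)
```

For example, with goals at -5 and 5, a start at 0, and a user who has moved to position 2 at a path cost of 2, `goal_posterior([0.5, 0.5], 2.0, [7.0, 3.0], [5.0, 5.0])` places most of the belief on the goal at 5, and `hindsight_action` then selects the step toward that goal.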

    Generative Models for Learning Robot Manipulation Skills from Humans

    A long-standing goal in artificial intelligence is to make robots seamlessly interact with humans in performing everyday manipulation skills. Learning from demonstrations or imitation learning provides a promising route to bridge this gap. In contrast to direct trajectory learning from demonstrations, many problems arise in interactive robotic applications that require a higher-level contextual understanding of the environment. This requires learning invariant mappings in the demonstrations that can generalize across different environmental situations such as size, position, orientation of objects, viewpoint of the observer, etc. In this thesis, we address this challenge by encapsulating invariant patterns in the demonstrations using probabilistic learning models for acquiring dexterous manipulation skills. We learn the joint probability density function of the demonstrations with a hidden semi-Markov model, and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The model exploits the invariant segments (also termed sub-goals, options or actions) in the demonstrations and adapts the movement in accordance with the external environmental situations such as size, position and orientation of the objects in the environment using a task-parameterized formulation. We incorporate high-dimensional sensory data for skill acquisition by parsimoniously representing the demonstrations using statistical subspace clustering methods and exploit the coordination patterns in latent space. To adapt the models on the fly and/or teach new manipulation skills online with the streaming data, we formulate a non-parametric scalable online sequence clustering algorithm with Bayesian non-parametric mixture models to avoid the model selection problem while ensuring tractability under small variance asymptotics.
We exploit the developed generative models to perform manipulation skills with remotely operated vehicles over satellite communication in the presence of communication delays and limited bandwidth. A set of task-parameterized generative models is learned from the demonstrations of different manipulation skills provided by the teleoperator. The model captures the intention of the teleoperator on one hand and provides assistance in performing remote manipulation tasks on the other hand under varying environmental situations. The assistance is formulated under time-independent shared control, where the model continuously corrects the remote arm movement based on the current state of the teleoperator; and/or time-dependent autonomous control, where the model synthesizes the movement of the remote arm for autonomous skill execution. Using the proposed methodology with the two-armed Baxter robot as a mock-up for semi-autonomous teleoperation, we are able to learn manipulation skills such as opening a valve, picking and placing an object while avoiding obstacles, hot-stabbing (a specialized underwater task akin to a peg-in-a-hole task), screw-driver target snapping, and tracking a carabiner in as few as 4 to 8 demonstrations. Our study shows that the proposed manipulation assistance formulations improve the performance of the teleoperator by reducing the task errors and the execution time, while catering for the environmental differences in performing remote manipulation tasks with limited bandwidth and communication delays.

    A task learning mechanism for the telerobots

    Telerobotic systems have attracted growing attention because of their superiority in dangerous or unknown interaction tasks. It is very challenging to exploit such systems to implement complex tasks in an autonomous way. In this paper, we propose a task learning framework to represent the manipulation skill demonstrated by a remotely controlled robot. A Gaussian mixture model is utilized to encode and parametrize the smooth task trajectory according to the observations from the demonstrations. After encoding the demonstrated trajectory, a new task trajectory is generated based on the variability information of the learned model. Experimental results have demonstrated the feasibility of the proposed method.

    A robot learning method with physiological interface for teleoperation systems

    The human operator largely relies on the perception of remote environmental conditions to make timely and correct decisions in a prescribed task when the robot is teleoperated in a remote place. However, due to the unknown and dynamic working environments, the manipulator's performance and the efficiency of the human-robot interaction in the tasks may degrade significantly. In this study, a novel method of human-centric interaction through a physiological interface was presented to capture the information details of the remote operation environments. Simultaneously, in order to relieve the workload of the human operator and to improve the efficiency of the teleoperation system, an updated regression method was proposed to build up a nonlinear model of demonstration for the prescribed task. Considering that the demonstration data were of various lengths, a dynamic time warping algorithm was employed first to synchronize the data over time before proceeding with other steps. The novelty of this method lies in the fact that both the task-specific information and the muscle parameters from the human operator have been taken into account in a single task; therefore, a more natural and safer interaction between the human and the robot could be achieved. The feasibility of the proposed method was demonstrated by experimental results.
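
The time-synchronization step mentioned in this abstract can be sketched with the textbook dynamic time warping recursion. The Python version below aligns two 1-D sequences of different lengths. The absolute-difference local cost and the function name `dtw_align` are assumptions for illustration, and the abstract does not say which DTW variant was used.

```python
def dtw_align(a, b):
    """Return (total_cost, path) aligning sequences a and b with classic DTW.

    path is a list of (i, j) index pairs, from (0, 0) to (len(a)-1, len(b)-1).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance only a
                cost[i][j - 1],      # advance only b
            )
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path
```

Resampling each demonstration along the returned path gives sequences of equal length, which is the precondition for fitting a single regression model across demonstrations.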

    Task-oriented joint design of communication and computing for Internet of Skills

    Nowadays, the internet is taking a revolutionary step forward, which is known as the Internet of Skills. The Internet of Skills is a concept that refers to a network of sensors, actuators, and machines that enable knowledge, skills, and expertise delivery between people and machines, regardless of their geographical locations. This concept allows immersive remote operation and access to expertise through virtual and augmented reality, haptic communications, robotics, and other cutting-edge technologies with various applications, including remote surgery and diagnosis in healthcare, remote laboratory and training in education, remote driving in transportation, and advanced manufacturing in Industry 4.0. In this thesis, we investigate three fundamental communication requirements of Internet of Skills applications, namely ultra-low latency, ultra-high reliability, and wireless resource utilization efficiency. Although 5G communications provide cutting-edge solutions for achieving ultra-low latency and ultra-high reliability with good resource utilization efficiency, meeting these requirements is difficult, particularly in long-distance communications where the distance between source and destination is more than 300 km, considering delays and reliability issues in networking components as well as physical limits of the speed of light. Furthermore, resource utilization efficiency must be improved further to accommodate the rapidly increasing number of mobile devices. Therefore, new design techniques that take into account both communication and computing systems with the task-oriented approach are urgently needed to satisfy conflicting latency and reliability requirements while improving resource utilization efficiency. First, we design and implement a 5G-based teleoperation prototype for Internet of Skills applications. We present two emerging Internet of Skills use cases in healthcare and education.
We conducted extensive experiments evaluating local and long-distance communication latency and reliability to gain insights into the current capabilities and limitations. From our local experiments in a laboratory environment where both operator and robot are in the same room, we observed that communication latency is around 15 ms with a 99.9% packet reception rate (communication reliability). However, communication latency increases up to 2 seconds in long-distance scenarios (between the UK and China), while it is around 50-300 ms within the UK experiments. In addition, our observations revealed that communication reliability and overall system performance do not exhibit a direct correlation. Instead, the number of consecutive packet drops emerged as the decisive factor influencing the overall system performance and user quality of experience. In light of these findings, we proposed a two-way timeout approach. We discarded stale packets to mitigate waiting times effectively and, in turn, reduce the latency. Nevertheless, we observed that the proposed approach reduced latency at the expense of reliability, thus verifying the challenge of the conflicting latency and reliability requirements. Next, we propose a task-oriented prediction and communication co-design framework to meet conflicting latency and reliability requirements. The proposed framework demonstrates the task-oriented joint design of communication and computing systems, where we considered packet losses in communications and prediction errors in prediction algorithms to derive the upper bound for overall system reliability. We revealed the tradeoff between overall system reliability and resource utilization efficiency, where we consider 5G NR as an example communication system. The proposed framework is evaluated with real-data samples and generated synthetic data samples.
From the results, the proposed framework achieves a better latency and reliability tradeoff with a 77.80% resource utilization efficiency improvement compared to a task-agnostic benchmark. In addition, we demonstrate that deploying a predictor at the receiver side achieves better overall reliability compared to a system with the predictor at the transmitter. Finally, we propose an intelligent mode-switching framework to address the resource utilization challenge. We jointly design the communication, user intention recognition, and mode-switching systems to reduce communication load subject to joint task completion probability. We reveal the tradeoff between task prediction accuracy and task observation length, showing that higher prediction accuracy can be achieved when the task observation length increases. The proposed framework achieves more than 90% task prediction accuracy with 60% observation length. We train a DRL agent with real-world data from our teleoperation prototype for mode-switching between teleoperation and autonomous modes. Our results show that the proposed framework achieves up to 50% communication load reduction with similar task completion probability compared to conventional teleoperation.
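
The latency and reliability tradeoff of discarding stale packets, and the role of consecutive packet drops, can be illustrated with a small Python sketch. This is one possible one-directional reading of a timeout rule, not the thesis's two-way timeout implementation. The packet representation as `(send_ms, recv_ms)` pairs and the function names are hypothetical.

```python
def apply_timeout(packets, timeout_ms):
    """Mark each packet as usable (True) or dropped (False).

    packets: list of (send_ms, recv_ms) pairs; recv_ms is None if the packet was lost.
    A packet that arrives later than timeout_ms after sending is treated as stale
    and discarded, which caps latency but lowers the reception rate.
    """
    return [
        recv is not None and (recv - sent) <= timeout_ms
        for sent, recv in packets
    ]


def reception_rate(usable):
    """Fraction of packets that were usable (communication reliability)."""
    return sum(usable) / len(usable)


def longest_drop_run(usable):
    """Length of the longest run of consecutive dropped packets."""
    longest = run = 0
    for ok in usable:
        run = 0 if ok else run + 1
        longest = max(longest, run)
    return longest
```

Two traces with the same reception rate can have very different values of `longest_drop_run`, which is consistent with the abstract's observation that consecutive drops, rather than average reliability, drive perceived performance.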

    Towards Skill Transfer via Learning-Based Guidance in Human-Robot Interaction

    This thesis presents learning-based guidance (LbG) approaches that aim to transfer skills from human to robot. The approaches capture the temporal and spatial information of human motions and teach a robot to assist a human in human-robot collaborative tasks. In such physical human-robot interaction (pHRI) environments, learning from demonstrations (LfD) enables this skill transfer. Demonstrations can be provided through kinesthetic teaching and/or teleoperation. In kinesthetic teaching, humans directly guide the robot’s body to perform a task, while in teleoperation, demonstrations can be done through motion/vision-based systems or haptic devices. In this work, the LbG approaches are developed through kinesthetic teaching and teleoperation in both virtual and physical environments. First, this thesis compares and analyzes the capability of two types of statistical models, generative and discriminative, to generate haptic guidance (HG) forces as well as segment and recognize gestures for pHRI that can be used in virtual minimally invasive surgery (MIS) training. In this learning-based approach, the knowledge and experience of experts are modeled to improve the unpredictable motions of novice trainees. Two statistical models, hidden Markov model (HMM) and hidden Conditional Random Fields (HCRF), are used to learn gestures from demonstrations in a virtual MIS related task. The models are developed to automatically recognize and segment gestures as well as generate guidance forces. In the practice phase, the guidance forces are adaptively calculated in real time according to the gesture similarities between the user motion and the gesture models. Both statistical models can successfully capture the gestures of the user and provide adaptive HG; however, results show the superiority of HCRF, as a discriminative method, compared to HMM, as a generative method, in terms of user performance.
In addition, LbG approaches are developed for kinesthetic HRI simulations that aim to transfer the skills of expert surgeons to resident trainees. The discriminative nature of HCRF is incorporated into the approach to produce LbG forces and discriminate the skill levels of users. To experimentally evaluate this kinesthetic-based approach, a femur bone drilling simulation is developed in which residents are provided haptic feedback based on real computed tomography (CT) data that enable them to feel the variable stiffness of bone layers. Orthopaedic surgeons need to adjust the drilling force since bone layers have different stiffness. In the learning phase, using the simulation, an expert HCRF model is trained from expert surgeons' demonstrations to learn the stiffness variations of different bone layers. A novice HCRF model is also developed from the demonstrations of novice residents to discriminate the skill levels of a new trainee. During the practice phase, the learning-based approach, which encodes the stiffness variations, guides the trainees to perform training tasks similarly to the experts' motions. Finally, in contrast to other parts of the thesis, an LbG approach is developed through teleoperation in a physical environment. The approach assists operators in navigating a teleoperated robot through a haptic steering wheel and a haptic gas pedal. A set of expert operator demonstrations is used to develop a maneuvering skill model. The temporal and spatial variations of the demonstrations are learned using an HMM as the skill model. A modified Gaussian Mixture regression (GMR) in combination with the HMM is also developed to robustly produce the motion during reproduction. The GMR calculates outcome motions from a joint probability density function of the data rather than directly modeling the regression function.
In addition, the distance between the robot and obstacles is incorporated into the impedance control to generate guidance forces that also assist operators with avoiding obstacle collisions. Using different forms of variable impedance control, guidance forces are computed in real time with respect to the similarities between the maneuver of users and the skill model. This encourages users to navigate a robot similarly to the expert operators. The results show that user performance is improved in terms of number of collisions, task completion time, and average closeness to obstacles.
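
The idea of computing motions from a joint density rather than from a direct regression function can be sketched with standard Gaussian mixture regression. The Python example below conditions a joint mixture over (time, position) on time. It shows plain GMR only, not the modified GMR or the HMM coupling developed in the thesis, and the component tuple layout and the function name `gmr_mean` are assumptions for illustration.

```python
import math


def gmr_mean(t, components):
    """Conditional mean E[x | t] under a joint Gaussian mixture over (t, x).

    components: list of (weight, mu_t, mu_x, var_t, cov_tx) tuples, one per Gaussian.
    """
    responsibilities = []
    conditional_means = []
    for weight, mu_t, mu_x, var_t, cov_tx in components:
        # Marginal density of t under this component, scaled by its mixture weight.
        density = math.exp(-0.5 * (t - mu_t) ** 2 / var_t) / math.sqrt(
            2.0 * math.pi * var_t
        )
        responsibilities.append(weight * density)
        # Mean of x given t for a single 2-D Gaussian.
        conditional_means.append(mu_x + (cov_tx / var_t) * (t - mu_t))
    total = sum(responsibilities)
    return sum(r * m for r, m in zip(responsibilities, conditional_means)) / total
```

Evaluating `gmr_mean` over a grid of time values yields a smooth reference trajectory. In practice the mixture parameters would be fitted to the demonstrations, for example with expectation-maximization.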
