9 research outputs found

    Gaze-based, context-aware robotic system for assisted reaching and grasping

    Assistive robotic systems endeavour to support those with movement disabilities, enabling them to move again and regain functionality. The main issue with these systems is the complexity of their low-level control, and how to translate this into simpler, higher-level commands that are easy and intuitive for a human user to interact with. We have created a multi-modal system, consisting of different sensing, decision-making and actuating modalities, leading to intuitive, human-in-the-loop assistive robotics. The system takes its cue from the user's gaze to decode their intentions and implement low-level motion actions to achieve high-level tasks. As a result, the user simply has to look at the objects of interest for the robotic system to assist them in reaching for those objects, grasping them, and using them to interact with other objects. We present our method for 3D gaze estimation, and a grammars-based implementation of sequences of action with the robotic system. The 3D gaze estimation is evaluated with 8 subjects, showing an overall accuracy of 4.68±0.14 cm. The full system is tested with 5 subjects, showing successful implementation of 100% of reach-to-gaze-point actions, and full implementation of pick-and-place tasks in 96% and pick-and-pour tasks in 76% of cases. Finally, we present a discussion of our results and of the future work needed to improve the system.

    Dot-to-Dot: Explainable Hierarchical Reinforcement Learning for Robotic Manipulation

    Robotic systems are ever more capable of automation and fulfilment of complex tasks, particularly with reliance on recent advances in intelligent systems, deep learning and artificial intelligence. However, as robots and humans come closer in their interactions, the interpretability, or explainability, of robot decision-making processes for the human grows in importance. Successful interaction and collaboration will only take place through mutual understanding of the underlying representations of the environment and the task at hand. This is currently a challenge in deep learning systems. We present a hierarchical deep reinforcement learning system, consisting of a low-level agent that handles the large action/state space of a robotic system efficiently by following the directives of a high-level agent, which learns the high-level dynamics of the environment and task. This high-level agent forms a representation of the world and the task at hand that is interpretable for a human operator. The method, which we call Dot-to-Dot, is tested on a MuJoCo-based model of the Fetch Robotics Manipulator, as well as a Shadow Hand. Results show efficient learning of complex action/state spaces by the low-level agent, and an interpretable representation of the task and decision-making process learned by the high-level agent.

    Context change and triggers for human intention recognition

    In human-robot interaction, understanding human intention is important for smooth interaction between humans and robots. Proactive human-robot interaction is the current trend, and it relies on recognising human intentions to complete tasks. The reasoning is based on the current human state, the environment and context, and human intention recognition and prediction. Many factors may affect human intention, including clues that are difficult to recognise directly from the action but may be perceived from a change in the environment or context. The changes that affect human intention are the triggers, and they serve as strong evidence for identifying human intention. Therefore, detecting such changes and identifying such triggers is a promising approach to assist in human intention recognition. This paper discusses the current state of the art in human intention recognition in human-computer interaction and illustrates the importance of context change and triggers for human intention recognition through a variety of examples.

    MIDAS: Deep learning human action intention prediction from natural eye movement patterns

    Eye movements have long been studied as a window into the attentional mechanisms of the human brain and made accessible as novelty-style human-machine interfaces. However, not everything that we gaze upon is something we want to interact with; this is known as the Midas Touch problem for gaze interfaces. To overcome the Midas Touch problem, present interfaces tend not to rely on natural gaze cues, but rather use dwell time or gaze gestures. Here we present an entirely data-driven approach to decode human intention for object manipulation tasks based solely on natural gaze cues. We run data collection experiments where 16 participants are given manipulation and inspection tasks to be performed on various objects on a table in front of them. The subjects' eye movements are recorded using wearable eye-trackers, allowing the participants to freely move their head and gaze upon the scene. We use our Semantic Fovea, a convolutional neural network model, to obtain the objects in the scene and their relation to gaze traces at every frame. We then evaluate the data and examine several ways to model the classification task for intention prediction. Our evaluation shows that intention prediction is not a naive result of the data, but rather relies on non-linear temporal processing of gaze cues. We model the task as a time series classification problem and design a bidirectional Long Short-Term Memory (LSTM) network architecture to decode intentions. Our results show that we can decode human intention of motion purely from natural gaze cues and object relative position, with 91.9% accuracy. Our work demonstrates the feasibility of natural gaze as a Zero-UI interface for human-machine interaction, i.e., users will only need to act naturally, and do not need to interact with the interface itself or deviate from their natural eye movement patterns.
