Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
In this paper, we propose the Interactive Text2Pickup (IT2P) network for
human-robot collaboration which enables an effective interaction with a human
user despite the ambiguity in the user's commands. We focus on the task where a
robot is expected to pick up an object instructed by a human, and to interact
with the human when the given instruction is vague. The proposed network
understands the command from the human user and estimates the position of the
desired object first. To handle the inherent ambiguity in human language
commands, a suitable question which can resolve the ambiguity is generated. The
user's answer to the question is combined with the initial command and given
back to the network, resulting in more accurate estimation. The experiment
results show that given unambiguous commands, the proposed method can estimate
the position of the requested object with an accuracy of 98.49% based on our
test dataset. Given ambiguous language commands, we show that the accuracy of
the pick-up task increases by a factor of 1.94 after incorporating the information
obtained from the interaction.
Comment: 8 pages, 9 figures
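The interaction loop described in this abstract (estimate, ask a clarifying question if the command is ambiguous, merge the answer into the command, estimate again) can be illustrated with a toy sketch. This is not the IT2P network: the word-overlap scoring, the `SceneObject` type, and the `ask_user` callback are all hypothetical stand-ins chosen only to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SceneObject:
    # Hypothetical scene representation, not taken from the paper.
    name: str
    color: str
    position: Tuple[float, float]


def score_objects(command: str, scene: List[SceneObject]) -> List[int]:
    """Toy scorer: count command words matching each object's name or color."""
    words = set(command.lower().split())
    return [len(words & {obj.name, obj.color}) for obj in scene]


def best_candidates(command: str, scene: List[SceneObject]) -> List[SceneObject]:
    """Return every object tied for the highest score."""
    scores = score_objects(command, scene)
    top = max(scores)
    return [obj for obj, s in zip(scene, scores) if s == top]


def pick_with_interaction(
    command: str,
    scene: List[SceneObject],
    ask_user: Callable[[str], str],
) -> Tuple[float, float]:
    """Estimate the target position, asking one question if the command is ambiguous."""
    candidates = best_candidates(command, scene)
    if len(candidates) > 1:
        options = ", ".join(f"{o.color} {o.name}" for o in candidates)
        answer = ask_user(f"Which one do you mean: {options}?")
        # Combine the answer with the initial command and estimate again.
        candidates = best_candidates(f"{command} {answer}", scene)
    return candidates[0].position
```

With a scene holding a red block and a blue block, the command "pick up the block" ties both objects, so the sketch asks a question and re-scores with the answer appended; "pick up the red block" resolves without any interaction.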
Mutual-cognition for proactive human-robot collaboration: A mixed reality-enabled visual reasoning-based method
Human-Robot Collaboration (HRC) is key to achieving the flexible automation required by the mass personalization trend, especially towards human-centric intelligent manufacturing. Nevertheless, existing HRC systems suffer from poor task understanding and poor ergonomic satisfaction, which impede empathetic teamwork skills in task execution. To overcome the bottleneck, a Mixed Reality (MR) and visual reasoning-based method is proposed in this research, providing mutual-cognitive task assignment for human and robotic agents' operations. Firstly, an MR-enabled mutual-cognitive HRC architecture is proposed, with the characteristic of monitoring Digital Twins states, reasoning co-working strategies, and providing cognitive services. Secondly, a visual reasoning approach is introduced, which learns scene interpretation from the visual perception of each agent's actions and environmental changes to make task planning strategies satisfying human-robot operation needs. Lastly, a safe, ergonomic, and proactive robot motion planning algorithm is proposed to let a robot execute generated co-working strategies, while a human operator is supported with intuitive task operation guidance in the MR environment, achieving empathetic collaboration. Through a demonstration of a disassembly task of aging Electric Vehicle Batteries, the experimental result facilitates cognitive intelligence in Proactive HRC for flexible automation.
CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents
In this paper, we focus on inferring whether the given user command is clear,
ambiguous, or infeasible in the context of interactive robotic agents utilizing
large language models (LLMs). To tackle this problem, we first present an
uncertainty estimation method for LLMs to classify whether the command is
certain (i.e., clear) or not (i.e., ambiguous or infeasible). Once the command
is classified as uncertain, we further distinguish between ambiguous and
infeasible commands, leveraging LLMs with situationally aware context in a
zero-shot manner. For ambiguous commands, we disambiguate the command by
interacting with users via question generation with LLMs. We believe that
proper recognition of the given commands could lead to a decrease in
malfunction and undesired actions of the robot, enhancing the reliability of
interactive robot agents. We present a dataset for robotic situational
awareness, consisting of high-level commands paired with scene descriptions and
labels of command type (i.e., clear, ambiguous, or infeasible). We validate the
proposed method on the collected dataset and in a pick-and-place tabletop simulation.
Finally, we demonstrate the proposed approach in real-world human-robot
interaction experiments, i.e., handover scenarios.
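The two-stage decision in this abstract (first certain versus uncertain, then ambiguous versus infeasible) can be sketched as below. The abstract does not say how uncertainty is estimated, so this sketch assumes one common proxy: the entropy over several sampled LLM responses to the same command. The `threshold` value and the boolean `is_feasible` flag, standing in for the zero-shot LLM feasibility check, are hypothetical and not taken from the paper.

```python
import math
from collections import Counter
from typing import List


def answer_entropy(samples: List[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution over sampled responses."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def classify_command(samples: List[str], is_feasible: bool, threshold: float = 0.5) -> str:
    """Label a command as 'clear', 'ambiguous', or 'infeasible'.

    samples:     responses sampled for the same command (assumed uncertainty proxy)
    is_feasible: stand-in for the situationally aware feasibility judgement
    threshold:   hypothetical entropy cut-off between certain and uncertain
    """
    if answer_entropy(samples) <= threshold:
        return "clear"
    return "ambiguous" if is_feasible else "infeasible"
```

When all samples agree, the entropy is zero and the command is treated as clear; when the samples disagree, the command is uncertain and the feasibility flag decides between asking the user a question (ambiguous) and declining (infeasible).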