Sample-Efficient Training of Robotic Guide Using Human Path Prediction Network
Training a robot that engages with people is challenging, because it is
expensive to involve people in a robot training process requiring numerous data
samples. This paper proposes a human path prediction network (HPPN) and an
evolution strategy-based robot training method using virtual human movements
generated by the HPPN, which compensates for this sample inefficiency problem.
We applied the proposed method to the training of a robotic guide for visually
impaired people, which was designed to collect multimodal human response data
and reflect such data when selecting the robot's actions. We collected 1,507
real-world episodes for training the HPPN and then generated over 100,000
virtual episodes for training the robot policy. User test results indicate that
our trained robot accurately guides blindfolded participants along a goal path.
In addition, because the reward used during robot policy training was designed
to pursue both guidance accuracy and human comfort, our robot produces smoother
human motion while maintaining the accuracy of the guidance. This
sample-efficient training method is expected to be widely applicable to
robots and computing machinery that physically interact with humans.
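
The training loop summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy 2-D setting, a hand-written stand-in for the HPPN (the hypothetical virtual_human_step), a linear policy, a basic Gaussian-perturbation evolution strategy, and an illustrative reward with a hypothetical smoothness weight of 0.1. None of these names or constants come from the paper.

```python
import numpy as np

def demo_path(n=30):
    # Hypothetical goal path: a gentle curve in the plane.
    t = np.linspace(0.0, 3.0, n)
    return np.stack([t, 0.5 * np.sin(t)], axis=1)

def virtual_human_step(human_pos, robot_action, rng):
    # Stand-in for the HPPN (assumption): the virtual human follows the
    # robot's action with some lag plus small random deviation.
    return human_pos + 0.8 * robot_action + rng.normal(0.0, 0.01, size=2)

def rollout(theta, goal_path, rng):
    # One virtual episode with a linear policy: action = W @ (waypoint - position).
    W = theta.reshape(2, 2)
    pos = goal_path[0].copy()
    prev_step = np.zeros(2)
    reward = 0.0
    for target in goal_path[1:]:
        action = W @ (target - pos)
        new_pos = virtual_human_step(pos, action, rng)
        step = new_pos - pos
        reward -= np.linalg.norm(new_pos - target)        # guidance accuracy term
        reward -= 0.1 * np.linalg.norm(step - prev_step)  # smoothness (comfort proxy)
        pos, prev_step = new_pos, step
    return reward

def train_es(goal_path, iterations=100, pop=32, sigma=0.1, lr=0.02, seed=0):
    # Basic evolution strategy: perturb the parameters, score each perturbation
    # on a virtual episode, and move along the reward-weighted noise directions.
    rng = np.random.default_rng(seed)
    theta = np.zeros(4)
    for _ in range(iterations):
        eps = rng.normal(size=(pop, theta.size))
        returns = np.array([rollout(theta + sigma * e, goal_path, rng) for e in eps])
        adv = (returns - returns.mean()) / (returns.std() + 1e-8)
        theta = theta + lr / (pop * sigma) * (eps.T @ adv)
    return theta

if __name__ == "__main__":
    path = demo_path()
    theta = train_es(path)
    print("untrained reward:", rollout(np.zeros(4), path, np.random.default_rng(1)))
    print("trained reward:  ", rollout(theta, path, np.random.default_rng(1)))
```

Every rollout in this sketch uses the simulated human only, which mirrors the idea of training the policy on virtual episodes while reserving real-world episodes for fitting the human model.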