Using eye-gaze to forecast human pose in everyday pick and place actions

Abstract

Collaborative robots that operate alongside humans require the ability to understand human intent and forecast human pose. Among the various indicators of intent, eye gaze is particularly important because it signals action towards the gazed object. By observing a person’s gaze, one can effectively predict the object of interest and subsequently forecast the person’s pose. We leverage this and present a method that forecasts human pose using gaze information for everyday pick and place actions in a home environment. Our method first attends to fixations to locate the coordinates of the object of interest and then inputs these coordinates to a pose forecasting network. Experiments on the MoGaze dataset show that our gaze network lowers the errors of existing pose forecasting methods and that incorporating a prior in the form of textual instructions further lowers the errors by a significant amount. Furthermore, the use of eye gaze allows a simple multilayer perceptron network to directly forecast the keypose.
