In the context of human-supervised autonomy, we study the problem of optimal
fidelity selection for a human operator performing an underwater visual search
task. Human performance depends on various cognitive factors such as workload
and fatigue. We conduct human experiments in which participants perform two
tasks simultaneously: a primary task, which is subject to evaluation, and a
secondary task to estimate their workload. The primary task requires
participants to search for underwater mines in videos, while the secondary task
involves a simple visual test where they respond when a green light displayed
on the side of their screens turns red. Videos arrive according to a Poisson
process and are placed in a queue to be serviced by the human operator. The
operator can choose to watch each video at either normal or high fidelity, with
normal-fidelity videos playing at three times the speed of high-fidelity ones.
Participants are rewarded for their mine-detection accuracy on each primary
task and penalized based on the number of videos waiting in the queue.
We consider the workload of the operator as a hidden state and model the
workload dynamics as an Input-Output Hidden Markov Model (IOHMM). We use a
Partially Observable Markov Decision Process (POMDP) to learn an optimal
fidelity selection policy, where the objective is to maximize total rewards.
Our results demonstrate improved performance when videos are serviced based on
the optimal fidelity policy compared to a baseline where humans choose the
fidelity level themselves.
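
To make the hidden-workload idea concrete, the following is a minimal sketch of a single belief update in an IOHMM, where the input is the chosen fidelity and the output is a secondary-task response. It is an illustration under stated assumptions, not the model fitted in the study: the two workload states, the binary fast/slow observation, and every probability below are hypothetical placeholders.

```python
# Hypothetical two-state workload model. All names and numbers here are
# illustrative assumptions, not values estimated from the experiments.
STATES = ("normal_workload", "high_workload")
OBSERVATIONS = ("fast_response", "slow_response")

# TRANSITION[a][s][s2] = P(next workload = s2 | workload = s, fidelity = a)
TRANSITION = {
    "normal_fidelity": [[0.7, 0.3], [0.2, 0.8]],
    "high_fidelity": [[0.9, 0.1], [0.4, 0.6]],
}
# EMISSION[a][s2][o] = P(observation = o | next workload = s2, fidelity = a)
EMISSION = {
    "normal_fidelity": [[0.8, 0.2], [0.3, 0.7]],
    "high_fidelity": [[0.8, 0.2], [0.3, 0.7]],
}


def update_belief(belief, action, observation):
    """One step of the forward filter: predict, then correct by Bayes' rule."""
    o = OBSERVATIONS.index(observation)
    trans = TRANSITION[action]
    emit = EMISSION[action]
    n = len(STATES)
    # Predict: push the belief through the input-dependent transition matrix.
    predicted = [
        sum(belief[s] * trans[s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correct: weight each state by the likelihood of the observed response.
    unnormalized = [predicted[s2] * emit[s2][o] for s2 in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Start from a uniform belief, then observe a slow secondary-task response
# after a normal-fidelity video.
belief = update_belief([0.5, 0.5], "normal_fidelity", "slow_response")
print(dict(zip(STATES, belief)))
```

A POMDP policy would map a belief of this kind, together with the current queue length, to a fidelity choice; how that policy is computed is not specified here and is left out of the sketch.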