In this paper we focus on the problem of learning an optimal policy for
Active Visual Search (AVS) of objects in known indoor environments in an
online setup. Our POMP method takes as input the current pose of an agent (e.g.
a robot) and an RGB-D frame. The task is to plan the next move that brings the
agent closer to the target object. We model this problem as a Partially
Observable Markov Decision Process solved by a Monte-Carlo planning approach.
This allows us to make decisions on the next moves by iterating over the known
scenario at hand, exploring the environment and searching for the object at the
same time. Unlike the current state of the art in Reinforcement
Learning, POMP does not require extensive and expensive (in time and
computation) labelled data, making it very agile in solving AVS in small and
medium-sized real scenarios. We only require the floor map of the
environment, information that is usually available or can easily be extracted
from a single a priori exploration run. We validate our method on the publicly
available AVD benchmark, achieving an average success rate of 0.76 with an
average path length of 17.1, performing close to the state of the art but
without requiring any training. Additionally, we experimentally show the
robustness of our method when the quality of the object detection degrades from
ideal to faulty.
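
To make the planning idea concrete, the sketch below illustrates Monte-Carlo planning for active visual search on a known floor map. It is not the paper's implementation: the grid map, motion model, toy detector, reward values and uniform belief are illustrative assumptions, and POMP itself relies on a POMCP-style tree search rather than this simplified rollout planner.

```python
# Minimal sketch of Monte-Carlo planning for active visual search on a known
# 2-D floor map. Illustrative assumptions only; not the authors' POMP code.
import random

FREE, WALL = 0, 1

# Hypothetical 5x5 floor map (1 = obstacle), known a priori.
FLOOR_MAP = [
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def free_cells(grid):
    return [(r, c) for r, row in enumerate(grid)
            for c, v in enumerate(row) if v == FREE]

def step(pose, action):
    """Deterministic motion model: move one cell unless blocked."""
    r, c = pose
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(FLOOR_MAP) and 0 <= nc < len(FLOOR_MAP[0]) \
            and FLOOR_MAP[nr][nc] == FREE:
        return (nr, nc)
    return pose  # bumped into a wall or the boundary

def observe(pose, target):
    """Toy detector: the target is 'seen' when the agent stands on its cell."""
    return pose == target

def rollout(pose, target, horizon=20, gamma=0.95):
    """Return of a random rollout: +1 when the target is found, small step cost."""
    ret, discount = 0.0, 1.0
    for _ in range(horizon):
        if observe(pose, target):
            return ret + discount * 1.0
        pose = step(pose, random.choice(list(ACTIONS)))
        ret += discount * (-0.01)   # movement cost
        discount *= gamma
    return ret

def plan_next_move(pose, belief, n_sims=200):
    """Monte-Carlo planning: sample target hypotheses from the belief,
    simulate rollouts after each candidate first action, pick the best."""
    best_action, best_value = None, float("-inf")
    for action in ACTIONS:
        value = 0.0
        for _ in range(n_sims):
            target = random.choice(belief)   # sampled target location
            value += rollout(step(pose, action), target)
        value /= n_sims
        if value > best_value:
            best_action, best_value = action, value
    return best_action

if __name__ == "__main__":
    pose = (0, 0)
    belief = free_cells(FLOOR_MAP)           # uniform belief over free cells
    for t in range(10):
        action = plan_next_move(pose, belief)
        pose = step(pose, action)
        belief = [cell for cell in belief if cell != pose]  # visited, not there
        print(f"t={t}: move {action} -> pose {pose}")
```

In this toy setting the planner trades off exploring unvisited cells against reaching likely target locations, which mirrors the explore-and-search behaviour described above; the real method additionally uses the detector's output as the observation in the POMDP.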