Perceiving the surrounding environment in terms of objects is useful for any
general purpose intelligent agent. In this paper, we investigate a fundamental
mechanism making object perception possible, namely the identification of
spatio-temporally invariant structures in the sensorimotor experience of an
agent. We take inspiration from the Sensorimotor Contingencies Theory to define
a computational model of this mechanism through a sensorimotor, unsupervised
and predictive approach. Our model is based on processing the unsupervised
interaction of an artificial agent with its environment. We show how
spatio-temporally invariant structures in the environment induce regularities
in the sensorimotor experience of an agent, and how this agent, while building
a predictive model of its sensorimotor experience, can capture them as densely
connected subgraphs in a graph of sensory states connected by motor commands.
Our approach is focused on elementary mechanisms, and is illustrated with a set
of simple experiments in which an agent interacts with an environment. We show
how the agent can build an internal model of moving but spatio-temporally
invariant structures by performing a Spectral Clustering of the graph modeling
its overall sensorimotor experiences. We systematically examine properties of
the model, shedding light more globally on the specificities of the paradigm
with respect to methods based on the supervised processing of collections of
static images.Comment: 24 pages, 10 figures, published in Frontiers Robotics and A