End-to-End Pixel-Based Deep Active Inference for Body Perception and Action
We present a pixel-based deep active inference algorithm (PixelAI) inspired
by human body perception and action. Our algorithm combines the free-energy
principle from neuroscience, rooted in variational inference, with deep
convolutional decoders to scale the algorithm to directly deal with raw visual
input and provide online adaptive inference. Our approach is validated by
studying body perception and action in a simulated and a real Nao robot.
Results show that our approach allows the robot to perform 1) dynamical body
estimation of its arm using only monocular camera images and 2) autonomous
reaching to "imagined" arm poses in the visual space. This suggests that robot
and human body perception and action can be efficiently solved by viewing both
as an active inference problem guided by ongoing sensory input.
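The perceptual half of this idea can be illustrated with a minimal sketch. It is not the PixelAI implementation: the convolutional decoder is replaced by a hypothetical two-value function `decoder`, the state is a single joint angle, and free-energy minimization is reduced to gradient descent on squared prediction error (which corresponds to assuming unit-variance Gaussian noise and a flat prior). All names and parameter values are assumptions made for illustration.

```python
import math

def decoder(mu):
    """Hypothetical stand-in for a learned decoder: joint angle -> 2 'pixel' values."""
    return [math.cos(mu), math.sin(mu)]

def decoder_jacobian(mu):
    """Derivative of each decoder output with respect to the joint angle."""
    return [-math.sin(mu), math.cos(mu)]

def infer_state(observation, mu=0.0, step=0.5, iterations=200):
    """Estimate the joint angle by descending the squared prediction error."""
    for _ in range(iterations):
        prediction = decoder(mu)
        error = [o - p for o, p in zip(observation, prediction)]
        # Back-project the sensory prediction error onto the state.
        gradient = sum(j * e for j, e in zip(decoder_jacobian(mu), error))
        mu += step * gradient
    return mu

true_angle = 0.8
estimate = infer_state(decoder(true_angle))
```

Here the belief `mu` is driven toward the angle whose predicted observation matches the sensed one. In the abstract's setting, the decoder output would be a full camera image and the same error signal would also drive action.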
Robot in the mirror: toward an embodied computational model of mirror self-recognition
Self-recognition or self-awareness is a capacity attributed typically only to
humans and a few other species. The definitions of these concepts vary and little
is known about the mechanisms behind them. However, there is a Turing test-like
benchmark: the mirror self-recognition test, which consists of covertly putting a
mark on the face of the tested subject, placing her in front of a mirror, and
observing the reactions. In this work, first, we provide a mechanistic
decomposition, or process model, of what components are required to pass this
test. Based on these, we provide suggestions for empirical research. In
particular, in our view, the way infants or animals reach for the mark
should be studied in detail. Second, we develop a model to enable the humanoid
robot Nao to pass the test. The core of our technical contribution is learning
the appearance representation and visual novelty detection by means of learning
the generative model of the face with deep auto-encoders and exploiting the
prediction error. The mark is identified as a salient region on the face and a
reaching action is triggered, relying on a previously learned mapping to arm
joint angles. The architecture is tested on two robots with completely
different faces.
Comment: To appear in KI - Künstliche Intelligenz - German Journal of
Artificial Intelligence - Springe
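The novelty-detection step described in this abstract (flagging the mark as a region where the learned face model fails to predict the image) can be sketched as follows. This is only an illustration under strong assumptions: the deep auto-encoder is replaced by a per-pixel mean of clean training images, images are tiny nested lists of grayscale values, and the helper names and the threshold are hypothetical.

```python
def mean_image(images):
    """Stand-in 'generative model': per-pixel mean of the clean training images."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]

def error_map(image, reconstruction):
    """Absolute per-pixel prediction error between an image and its reconstruction."""
    return [[abs(p - q) for p, q in zip(row_i, row_r)]
            for row_i, row_r in zip(image, reconstruction)]

def salient_centroid(errors, threshold=0.5):
    """Centroid (row, col) of pixels whose error exceeds the threshold, or None."""
    hits = [(r, c) for r, row in enumerate(errors)
            for c, e in enumerate(row) if e > threshold]
    if not hits:
        return None
    return (sum(r for r, _ in hits) / len(hits),
            sum(c for _, c in hits) / len(hits))

# Two clean 4x4 'faces' and one copy with a bright mark at row 1, column 2.
clean_faces = [[[0.2] * 4 for _ in range(4)], [[0.3] * 4 for _ in range(4)]]
model = mean_image(clean_faces)
marked = [row[:] for row in clean_faces[0]]
marked[1][2] = 1.0

mark_location = salient_centroid(error_map(marked, model))
```

The returned centroid is the kind of image-space target that, per the abstract, would then be passed to a previously learned mapping to arm joint angles; that mapping is not sketched here.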