We present a biologically motivated integrated vision system capable of online learning of several objects and faces in a unified representation. Training is unconstrained in the sense that arbitrary objects can be freely presented in front of a stereo camera system and labeled by speech input. We combine biological principles such as appearance-based representation in topographical feature detection hierarchies and context-driven transfer between different levels of object memory. Learning is driven by attention shared interactively between user and system. It is fully online and avoids an artificial separation of the interaction into training and test phases.