Introduction: The Fourth International Workshop on Epigenetic Robotics
As in the previous editions, this workshop aims to be a forum for multi-disciplinary research ranging from developmental psychology to neuroscience (in its widest sense) and robotics, including computational studies. The aim is two-fold: on the one hand, understanding the brain through engineering embodied systems and, on the other hand, building artificial epigenetic systems. The term “epigenetic” carries the idea that we are interested in studying development through interaction with the environment. This idea entails the embodiment of the system, its situatedness in the environment, and of course a prolonged period of postnatal development during which this interaction can actually take place. This is still a relatively new endeavor, although the seeds of the developmental robotics community have been in the air since the nineties (Berthouze and Kuniyoshi, 1998; Metta et al., 1999; Brooks et al., 1999; Breazeal, 2000; Kozima and Zlatev, 2000). A few had the intuition that intelligence could not possibly be engineered simply by copying systems that are “ready made”, but rather that the development of the system plays a major role (see Lungarella et al., 2003, for a comprehensive review). This integration of disciplines raises the important issue of learning on the multiple scales of developmental time, that is, how to build systems that can eventually learn in any environment rather than programming them for a specific environment. On the other hand, the hope is that robotics might become a new tool for brain science, similar to what simulation and modeling have become for the study of the motor system. Our community is still very much evolving and “under construction”, and for this reason we tried to encourage submissions from the psychology community. Additionally, we invited four neuroscientists and no roboticists for the keynote lectures.
We received a record number of submissions (more than 50), and given the overall size and duration of the workshop together with our desire to maintain a single-track format, we had to be more selective than ever in the review process (a 20% acceptance rate on full papers). This is, if not an index of quality, at least an index of the interest that gravitates around this still new discipline.
Mining Object Parts from CNNs via Active Question-Answering
Given a convolutional neural network (CNN) that is pre-trained for object
classification, this paper proposes to use active question-answering to
semanticize neural patterns in conv-layers of the CNN and mine part concepts.
For each part concept, we mine neural patterns in the pre-trained CNN, which
are related to the target part, and use these patterns to construct an And-Or
graph (AOG) to represent a four-layer semantic hierarchy of the part. As an
interpretable model, the AOG associates different CNN units with different
explicit object parts. We use active human-computer communication to
incrementally grow such an AOG on the pre-trained CNN as follows. We allow the
computer to actively identify objects whose neural patterns cannot be
explained by the current AOG. Then, the computer asks a human about the
unexplained objects, and uses the answers to automatically discover certain CNN
patterns corresponding to the missing knowledge. We incrementally grow the AOG
to encode new knowledge discovered during the active-learning process. In
experiments, our method exhibits high learning efficiency. Our method uses
about 1/6-1/3 of the part annotations for training, but achieves similar or
better part-localization performance than fast-RCNN methods.
Comment: Published in CVPR 201
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
This paper presents a self-supervised method for visual detection of the
active speaker in a multi-person spoken interaction scenario. Active speaker
detection is a fundamental prerequisite for any artificial cognitive system
attempting to acquire language in social settings. The proposed method is
intended to complement the acoustic detection of the active speaker, thus
improving the system's robustness in noisy conditions. The method can detect an
arbitrary number of possibly overlapping active speakers based exclusively on
visual information about their faces. Furthermore, the method does not rely on
external annotations, in keeping with cognitive development. Instead, the
method uses information from the auditory modality to support learning in the
visual domain. This paper reports an extensive evaluation of the proposed
method using a large multi-person face-to-face interaction dataset. The results
show good performance in a speaker-dependent setting. However, in a
speaker-independent setting the proposed method yields significantly lower
performance. We believe that the proposed method represents an essential
component of any artificial cognitive system or robotic platform engaging in
social interactions.
Comment: 10 pages, IEEE Transactions on Cognitive and Developmental System