Toward Abstraction from Multi-modal Data: Empirical Studies on Multiple Time-scale Recurrent Models
Abstraction tasks are challenging for multi-modal sequences, as they require a
deeper semantic understanding of the data and novel text generation. Although
recurrent neural networks (RNNs) can model the context of time-sequences, in
most cases the long-term dependencies of multi-modal data cause the gradients
of back-propagation through time (BPTT) training to vanish in the time domain.
Recently, inspired by the Multiple Time-scale Recurrent Neural Network (MTRNN),
an extension of the Gated Recurrent Unit (GRU), called the Multiple Time-scale
Gated Recurrent Unit (MTGRU), has been proposed to learn long-term dependencies
in natural language processing. In particular, it can also accomplish the
abstraction task for paragraphs, provided that the time constants are well
defined. In this paper, we compare the MTRNN and MTGRU in terms of their
learning performance as well as their abstraction representations at the higher
level (with slower neural activation). This was done by conducting two studies
based on a smaller data-set (two-dimensional time sequences from non-linear
functions) and a relatively large data-set (43-dimensional time sequences from
iCub manipulation tasks with multi-modal data). We conclude that gated
recurrent mechanisms may be necessary for learning long-term dependencies in
high-dimensional multi-modal data-sets (e.g. learning of robot manipulation),
even when natural language commands are not involved. For smaller learning
tasks with simple time-sequences, however, generic recurrent models such as the
MTRNN are sufficient to accomplish the abstraction task.
Comment: Accepted by IJCNN 201
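As a hedged illustration of the multiple time-scale idea, the following PyTorch sketch wraps a standard GRU cell in an MTRNN-style leaky-integrator update with a time constant tau. This is one plausible reading, not the paper's exact MTGRU formulation; the class name and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MTGRUCell(nn.Module):
    """Illustrative multiple time-scale GRU cell (a sketch, not the
    paper's exact gating): a standard GRU proposal is blended with the
    previous hidden state through a fixed time constant tau. A larger
    tau yields slower-changing, more abstract activations."""

    def __init__(self, input_size: int, hidden_size: int, tau: float = 2.0):
        super().__init__()
        assert tau >= 1.0, "tau = 1 recovers the plain GRU update"
        self.gru = nn.GRUCell(input_size, hidden_size)
        self.tau = tau

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        h_fast = self.gru(x, h)  # ordinary single-step GRU update
        # MTRNN-style leaky integration over the time constant tau
        return (1.0 - 1.0 / self.tau) * h + (1.0 / self.tau) * h_fast

# Usage idea: stack a fast cell (tau = 1) below a slow cell (tau = 8)
# to obtain the two-level abstraction hierarchy the abstract discusses.
```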
Semantic memory
The Encyclopedia of Human Behavior, Second Edition is a comprehensive three-volume reference source on human action and reaction, and the thoughts, feelings, and physiological functions behind those actions
The Mechanics of Embodiment: A Dialogue on Embodiment and Computational Modeling
Embodied theories are increasingly challenging traditional views of cognition by arguing that conceptual representations that constitute our knowledge are grounded in sensory and motor experiences, and processed at this sensorimotor level, rather than being represented and processed abstractly in an amodal conceptual system. Given the established empirical foundation, and the relatively underspecified theories to date, many researchers are extremely interested in embodied cognition but are clamouring for more mechanistic implementations. What is needed at this stage is a push toward explicit computational models that implement sensory-motor grounding as intrinsic to cognitive processes. In this article, six authors from varying backgrounds and approaches address issues concerning the construction of embodied computational models, and illustrate what they view as the critical current and next steps toward mechanistic theories of embodiment. The first part has the form of a dialogue between two fictional characters: Ernest, the 'experimenter', and Mary, the 'computational modeller'. The dialogue consists of an interactive sequence of questions, requests for clarification, challenges, and (tentative) answers, and touches on the most important aspects of grounded theories that should inform computational modelling and, conversely, the impact that computational modelling could have on embodied theories. The second part of the article discusses the most important open challenges for embodied computational modelling
Supervised cross-modal factor analysis for multiple modal data classification
In this paper we study the problem of learning from multi-modal data for the
purpose of document classification. In this problem, each document is composed
of two different modalities of data, i.e., an image and a text. Cross-modal
factor analysis (CFA) has been proposed to project the two modalities of data
to a shared data space, so that the classification of an image or a text can be
performed directly in this space. A disadvantage of CFA is that it ignores the
supervision information. In this paper, we improve CFA by incorporating the
supervision information to represent and classify both the image and text
modalities of documents. We project both image and text data to a shared data
space by factor analysis, and then train a class label predictor in the shared
space to use the class label information. The factor analysis parameters and
the predictor parameters are learned jointly by solving one single objective
function. With this objective function, we minimize the distance between the
projections of the image and text of the same document, and the classification
error of the projections, measured by the hinge loss function. The objective
function is optimized by an alternating optimization strategy in an iterative
algorithm. Experiments on two multi-modal document data sets show the advantage
of the proposed algorithm over other CFA methods
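The objective described above, a pairing-distance term plus a hinge loss solved by alternating updates, can be sketched in a few lines of NumPy. This is a minimal illustrative reading, not the authors' implementation: the function name, the step sizes, and the choice to apply the hinge loss to the image projection are all assumptions.

```python
import numpy as np

def supervised_cfa(X_img, X_txt, y, k=10, lam=1.0, lr=1e-3, iters=200, seed=0):
    """Hypothetical sketch of a supervised-CFA-style objective: learn
    projections W_i, W_t into a shared k-dim space plus a linear
    predictor w by alternating (sub)gradient steps. Labels y are in
    {-1, +1}; X_img and X_txt hold one row per document."""
    rng = np.random.default_rng(seed)
    n = X_img.shape[0]
    W_i = 0.01 * rng.standard_normal((X_img.shape[1], k))
    W_t = 0.01 * rng.standard_normal((X_txt.shape[1], k))
    w = np.zeros(k)
    for _ in range(iters):
        # Step 1: update the projections with the predictor fixed.
        Z_i, Z_t = X_img @ W_i, X_txt @ W_t
        diff = Z_i - Z_t                        # pairing-distance term
        active = (1.0 - y * (Z_i @ w)) > 0      # hinge-active documents
        g_Zi = 2.0 * diff - lam * (active * y)[:, None] * w
        W_i -= lr * (X_img.T @ g_Zi) / n
        W_t -= lr * (X_txt.T @ (-2.0 * diff)) / n
        # Step 2: update the predictor with the projections fixed.
        Z_i = X_img @ W_i
        active = (1.0 - y * (Z_i @ w)) > 0
        w -= lr * lam * (-(active * y)[:, None] * Z_i).mean(axis=0)
    return W_i, W_t, w
```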
Learning Multi-Object Symbols for Manipulation with Attentive Deep Effect Predictors
In this paper, we propose a concept learning architecture that enables a
robot to build symbols through self-exploration by interacting with a varying
number of objects. Our aim is to allow a robot to learn concepts without
constraints, such as a fixed number of interacted objects or pre-defined
symbolic structures. As such, the sought architecture should be able to build
symbols for objects such as single objects that can be grasped, object stacks
that cannot be grasped together, or other composite dynamic structures. Towards
this end, we propose a novel architecture, a self-attentive predictive
encoder-decoder network with binary activation layers. We show the validity of
the proposed network through a robotic manipulation setup involving a varying
number of rigid objects. The continuous sensorimotor experience of the robot is
used by the proposed network to form effect predictors and symbolic structures
that describe the interaction of the robot in a discrete way. We show that
the robot acquires reasoning capabilities to encode the interaction dynamics of
a varying number of objects in different configurations using the discovered
symbols. For example, the robot can reason that (possibly multiple) objects on
top of another object will move together if the object below is moved by the
robot. We also show that the discovered symbols can be used for planning to
reach goals, by training a higher-level neural network that performs purely
symbolic reasoning.
Comment: 7 pages, 7 figure
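A common way to realize the binary activation layers mentioned above is a straight-through estimator; the PyTorch sketch below is one minimal reading of such an effect predictor and omits the self-attention over a varying number of objects. Class names and layer sizes are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class STEBinarize(torch.autograd.Function):
    """Binarize activations in the forward pass; pass gradients
    straight through in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

class BinaryEffectPredictor(nn.Module):
    """Sketch of an encoder-decoder effect predictor with a binary
    bottleneck: object features -> discrete symbol -> predicted effect.
    The paper's self-attention over multiple objects is omitted."""
    def __init__(self, obj_dim: int, code_dim: int, effect_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obj_dim, 64), nn.ReLU(),
                                     nn.Linear(64, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 64), nn.ReLU(),
                                     nn.Linear(64, effect_dim))

    def forward(self, obj_features: torch.Tensor):
        symbol = STEBinarize.apply(self.encoder(obj_features))
        return self.decoder(symbol), symbol  # predicted effect + symbol
```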
Objects, words, and actions. Some reasons why embodied models are badly needed in cognitive psychology
In the present chapter we report experiments on the relationships between visual objects and actions, and between words and actions. Results show that seeing an object activates motor information, and that language, too, is grounded in perceptual and motor systems. They are discussed within the framework of embodied cognitive science. We argue that models able to reproduce the experiments should be embodied organisms, whose brain is simulated with neural networks and whose body is as similar as possible to the human body. We also claim that embodied models are badly needed in cognitive psychology, as they could help to solve some open issues. Finally, we discuss potential implications of the use of embodied models for embodied theories of cognition
Anchoring Knowledge in Interaction: Towards a Harmonic Subsymbolic/Symbolic Framework and Architecture of Computational Cognition
We outline a proposal for a research program leading to a new paradigm, architectural framework, and prototypical implementation, for the cognitively inspired anchoring of an agent’s learning, knowledge formation, and higher reasoning abilities in real-world interactions: Learning through interaction in real-time in a real environment triggers the incremental accumulation and repair of knowledge that leads to the formation of theories at a higher level of abstraction. The transformations at this higher level filter down and inform the learning process as part of a permanent cycle of learning through experience, higher-order deliberation, theory formation and revision.
The envisioned framework will provide a precise computational theory, algorithmic descriptions, and an implementation in cyber-physical systems, addressing the lifting of action patterns from the subsymbolic to the symbolic knowledge level, effective methods for theory formation, adaptation, and evolution, the anchoring of knowledge-level objects, real-world interactions and manipulations, and the realization and evaluation of such a system in different scenarios. The expected results can provide new foundations for future agent architectures, multi-agent systems, robotics, and cognitive systems, and can facilitate a deeper understanding of the development and interaction in human-technological settings
TOWARDS THE GROUNDING OF ABSTRACT CATEGORIES IN COGNITIVE ROBOTS
The grounding of language in humanoid robots is a fundamental problem, especially
in social scenarios which involve the interaction of robots with human beings. Indeed,
natural language represents the most natural interface for humans to interact
and exchange information about concrete entities like KNIFE, HAMMER and abstract
concepts such as MAKE, USE. This research domain is very important not
only for the advances that it can produce in the design of human-robot communication
systems, but also for the implications that it can have for cognitive science.
Abstract words are used in daily conversations among people to describe events and
situations that occur in the environment. Many scholars have suggested that the
distinction between concrete and abstract words is a continuum along which
all entities vary in their level of abstractness.
The work presented herein aimed to ground abstract concepts, similarly to concrete
ones, in perception and action systems. This made it possible to investigate how different
behavioural and cognitive capabilities can be integrated in a humanoid robot in
order to bootstrap the development of higher-order skills such as the acquisition of
abstract words. To this end, three neuro-robotics models were implemented.
The first neuro-robotics experiment consisted of training a humanoid robot to
perform a set of motor primitives (e.g. PUSH, PULL, etc.) that, when
hierarchically combined, led to the acquisition of higher-order words (e.g.
ACCEPT, REJECT). The implementation of this model, based on feed-forward
artificial neural networks,
permitted the assessment of the training methodology adopted for the grounding of
language in humanoid robots.
In the second experiment, the architecture used for carrying out the first study
was reimplemented employing recurrent artificial neural networks that enabled the
temporal specification of the action primitives to be executed by the robot. This
made it possible to increase the number of action combinations that could be
taught to the robot for the generation of more complex movements.
For the third experiment, a model based on recurrent neural networks that integrated
multi-modal inputs (i.e. language, vision and proprioception) was implemented for
the grounding of abstract action words (e.g. USE, MAKE). Abstract representations
of actions ("one-hot" encoding) used in the other two experiments, were replaced
with the joints values recorded from the iCub robot sensors.
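As a hedged sketch of what such a multi-modal recurrent model might look like, the following PyTorch module embeds each modality, concatenates the embeddings, and feeds them to a recurrent core whose output predicts the next joint configuration. Module names and layer sizes are illustrative assumptions, and an LSTM stands in for whichever recurrent core the thesis actually used.

```python
import torch
import torch.nn as nn

class MultiModalRNN(nn.Module):
    """Illustrative recurrent model integrating language, vision and
    proprioception; the readout predicts the next joint values, since
    one-hot action codes were replaced by recorded joint values."""
    def __init__(self, lang_dim: int, vis_dim: int, prop_dim: int,
                 hidden: int = 128):
        super().__init__()
        self.embed_lang = nn.Linear(lang_dim, 32)
        self.embed_vis = nn.Linear(vis_dim, 32)
        self.embed_prop = nn.Linear(prop_dim, 32)
        self.core = nn.LSTM(3 * 32, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, prop_dim)  # next joint values

    def forward(self, lang, vis, prop):
        # Each input is a (batch, time, dim) sequence for one modality.
        z = torch.cat([torch.tanh(self.embed_lang(lang)),
                       torch.tanh(self.embed_vis(vis)),
                       torch.tanh(self.embed_prop(prop))], dim=-1)
        h, _ = self.core(z)
        return self.readout(h)
```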
Experimental results showed that motor primitives have different activation
patterns according to the action sequence in which they are embedded.
Furthermore, the simulations suggested that acquiring the concepts related to
abstract action words requires the reactivation of internal representations
similar to those activated during the acquisition of the basic concepts, which
are directly grounded in perceptual and sensorimotor knowledge and contained in
the hierarchical structure of the words used to ground the abstract action
words.
This study was financed by the EU project RobotDoC (235065) from the Seventh
Framework Programme (FP7), Marie Curie Actions Initial Training Network