1,949 research outputs found
The Mechanics of Embodiment: A Dialogue on Embodiment and Computational Modeling
Embodied theories are increasingly challenging traditional views of cognition by arguing that conceptual representations that constitute our knowledge are grounded in sensory and motor experiences, and processed at this sensorimotor level, rather than being represented and processed abstractly in an amodal conceptual system. Given the established empirical foundation, and the relatively underspecified theories to date, many researchers are extremely interested in embodied cognition but are clamouring for more mechanistic implementations. What is needed at this stage is a push toward explicit computational models that implement sensory-motor grounding as intrinsic to cognitive processes. In this article, six authors from varying backgrounds and approaches address issues concerning the construction of embodied computational models, and illustrate what they view as the critical current and next steps toward mechanistic theories of embodiment. The first part has the form of a dialogue between two fictional characters: Ernest, the �experimenter�, and Mary, the �computational modeller�. The dialogue consists of an interactive sequence of questions, requests for clarification, challenges, and (tentative) answers, and touches the most important aspects of grounded theories that should inform computational modeling and, conversely, the impact that computational modeling could have on embodied theories. The second part of the article discusses the most important open challenges for embodied computational modelling
Visually Grounded Language Learning: a review of language games, datasets, tasks, and models
In recent years, several machine learning models have been proposed. They are
trained with a language modelling objective on large-scale text-only data. With
such pretraining, they can achieve impressive results on many Natural Language
Understanding and Generation tasks. However, many facets of meaning cannot be
learned by ``listening to the radio" only. In the literature, many
Vision+Language (V+L) tasks have been defined with the aim of creating models
that can ground symbols in the visual modality. In this work, we provide a
systematic literature review of several tasks and models proposed in the V+L
field. We rely on Wittgenstein's idea of `language games' to categorise such
tasks into 3 different families: 1) discriminative games, 2) generative games,
and 3) interactive games. Our analysis of the literature provides evidence that
future work should be focusing on interactive games where communication in
Natural Language is important to resolve ambiguities about object referents and
action plans and that physical embodiment is essential to understand the
semantics of situations and events. Overall, these represent key requirements
for developing grounded meanings in neural models.Comment: Preprint for JAIR before copyeditin
Language-conditioned Learning for Robotic Manipulation: A Survey
Language-conditioned robotic manipulation represents a cutting-edge area of
research, enabling seamless communication and cooperation between humans and
robotic agents. This field focuses on teaching robotic systems to comprehend
and execute instructions conveyed in natural language. To achieve this, the
development of robust language understanding models capable of extracting
actionable insights from textual input is essential. In this comprehensive
survey, we systematically explore recent advancements in language-conditioned
approaches within the context of robotic manipulation. We analyze these
approaches based on their learning paradigms, which encompass reinforcement
learning, imitation learning, and the integration of foundational models, such
as large language models and vision-language models. Furthermore, we conduct an
in-depth comparative analysis, considering aspects like semantic information
extraction, environment & evaluation, auxiliary tasks, and task representation.
Finally, we outline potential future research directions in the realm of
language-conditioned learning for robotic manipulation, with the topic of
generalization capabilities and safety issues. The GitHub repository of this
paper can be found at
https://github.com/hk-zh/language-conditioned-robot-manipulation-model
Vocal Interactivity in-and-between Humans, Animals, and Robots
Almost all animals exploit vocal signals for a range of ecologically motivated purposes: detecting predators/prey and marking territory, expressing emotions, establishing social relations, and sharing information. Whether it is a bird raising an alarm, a whale calling to potential partners, a dog responding to human commands, a parent reading a story with a child, or a business-person accessing stock prices using Siri, vocalization provides a valuable communication channel through which behavior may be coordinated and controlled, and information may be distributed and acquired. Indeed, the ubiquity of vocal interaction has led to research across an extremely diverse array of fields, from assessing animal welfare, to understanding the precursors of human language, to developing voice-based human–machine interaction. Opportunities for cross-fertilization between these fields abound; for example, using artificial cognitive agents to investigate contemporary theories of language grounding, using machine learning to analyze different habitats or adding vocal expressivity to the next generation of language-enabled autonomous social agents. However, much of the research is conducted within well-defined disciplinary boundaries, and many fundamental issues remain. This paper attempts to redress the balance by presenting a comparative review of vocal interaction within-and-between humans, animals, and artificial agents (such as robots), and it identifies a rich set of open research questions that may benefit from an interdisciplinary analysis
Symbol Emergence in Cognitive Developmental Systems: a Survey
OAPA Humans use signs, e.g., sentences in a spoken language, for communication and thought. Hence, symbol systems like language are crucial for our communication with other agents and adaptation to our real-world environment. The symbol systems we use in our human society adaptively and dynamically change over time. In the context of artificial intelligence (AI) and cognitive systems, the symbol grounding problem has been regarded as one of the central problems related to symbols. However, the symbol grounding problem was originally posed to connect symbolic AI and sensorimotor information and did not consider many interdisciplinary phenomena in human communication and dynamic symbol systems in our society, which semiotics considered. In this paper, we focus on the symbol emergence problem, addressing not only cognitive dynamics but also the dynamics of symbol systems in society, rather than the symbol grounding problem. We first introduce the notion of a symbol in semiotics from the humanities, to leave the very narrow idea of symbols in symbolic AI. Furthermore, over the years, it became more and more clear that symbol emergence has to be regarded as a multifaceted problem. Therefore, secondly, we review the history of the symbol emergence problem in different fields, including both biological and artificial systems, showing their mutual relations. We summarize the discussion and provide an integrative viewpoint and comprehensive overview of symbol emergence in cognitive systems. Additionally, we describe the challenges facing the creation of cognitive systems that can be part of symbol emergence systems
- …