ETHER: Aligning Emergent Communication for Hindsight Experience Replay
Natural language instruction following is paramount to enable collaboration
between artificial agents and human beings. Natural language-conditioned
reinforcement learning (RL) agents have shown how natural languages'
properties, such as compositionality, can provide a strong inductive bias to
learn complex policies. Previous architectures like HIGhER combine the benefit
of language-conditioning with Hindsight Experience Replay (HER) to deal with
sparse-reward environments. Yet, like HER, HIGhER relies on an oracle
predicate function to provide a feedback signal highlighting which linguistic
description is valid for which state. This reliance on an oracle limits its
application. Additionally, HIGhER only leverages the linguistic information
contained in successful RL trajectories, thus hurting its final performance and
data-efficiency. Without early successful trajectories, HIGhER is no better
than DQN upon which it is built. In this paper, we propose the Emergent Textual
Hindsight Experience Replay (ETHER) agent, which builds on HIGhER and addresses
both of its limitations by means of (i) a discriminative visual referential
game, commonly studied in the subfield of Emergent Communication (EC), used
here as an unsupervised auxiliary task and (ii) a semantic grounding scheme to
align the emergent language with the natural language of the
instruction-following benchmark. We show that the referential game's agents
make an artificial language emerge that is aligned with the natural-like
language used to describe goals in the BabyAI benchmark and that it is
expressive enough to also describe unsuccessful RL trajectories and thus
provide feedback to the RL agent to leverage the linguistic, structured
information contained in all trajectories. Our work shows that EC is a viable
unsupervised auxiliary task for RL and provides missing pieces to make HER more
widely applicable.
Comment: work in progress
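The hindsight-relabeling idea that HER, HIGhER, and ETHER share can be sketched as follows. This is a minimal illustration, not the authors' implementation; the `describe` argument stands in for the component that maps a reached state to a linguistic goal description (an oracle predicate in HIGhER, a learned emergent-language speaker in ETHER):

```python
def hindsight_relabel(trajectory, describe, buffer):
    """Relabel a trajectory with the goal it actually achieved.

    trajectory: list of (state, action, next_state) tuples
    describe:   maps a state to a linguistic goal description
                (oracle predicate in HIGhER; learned emergent-language
                speaker in ETHER)
    buffer:     replay buffer receiving the relabeled transitions
    """
    final_state = trajectory[-1][2]
    achieved_goal = describe(final_state)  # the hindsight goal
    for state, action, next_state in trajectory:
        # Sparse reward: 1 only on the transition reaching the achieved goal.
        reward = 1.0 if next_state == final_state else 0.0
        buffer.append((state, action, achieved_goal, reward, next_state))
    return achieved_goal
```

In HIGhER the description function is only reliable for successful trajectories; the abstract's point is that an emergent language expressive enough to describe failures lets every trajectory, successful or not, yield relabeled transitions like these.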
Autonomous Acquisition of Natural Language
An important part of human intelligence is the ability to use language. Humans learn how to use language in a society of language users, which is probably the most effective way to learn a language from the ground up. Principles that might allow artificial agents to learn language this way are not known at present. Here we present a framework which begins to address this challenge. Our auto-catalytic, endogenous, reflective architecture (AERA) supports the creation of agents that can learn natural language by observation. We present results from two experiments where our S1 agent learns human communication by observing two humans interacting in a real-time mock television interview, using gesture and situated language. Results show that S1 can learn multimodal complex language and multimodal communicative acts, using a vocabulary of 100 words with numerous sentence formats, by observing unscripted interaction between the humans, with no grammar being provided to it a priori, and only high-level information about the format of the human interaction in the form of high-level goals of the interviewer and interviewee and a small ontology. The agent learns the pragmatics, semantics, and syntax of complex sentences spoken by the human subjects on the topic of recycling of objects such as aluminum cans, glass bottles, plastic, and wood, as well as the use of manual deictic reference and anaphora.
Design of an Artificial Language for Human-Machine Interaction
Bachelor's thesis (TFG) presented in English. The objective of this project is to motivate
the search for, and development of, an optimal general-purpose artificial
language whose long-term aim is effective and unambiguous human-machine
communication. Following this idea, we ask how one could study which basic
elements of a language are actually necessary and which are unnecessary or
even counterproductive. Given the optimality objective that motivates this
work, general-purpose language must be approached constructively (starting
from the absence of language and building languages up progressively).
This work does not aim at understanding any specific natural language, so we
will neither study the complexities of the human language capacity nor force
our resulting languages to match any natural language. We will, however, use
as study subjects three basic elements of natural languages (subject, verb,
and object). We propose a sufficiently simple clustering problem performed by
agents with partial information, which they share by means of unambiguous
languages of increasing syntactic and semantic complexity. We also define
mathematical tools to evaluate the results, contrasting the effectiveness of
each language in order to empirically establish which language is better
suited to the proposed problem.
Emoji as a Proxy of Emotional Communication
Nowadays, emoji play a fundamental role in human computer-mediated communication, allowing users to convey body language, objects, symbols, or ideas in text messages using Unicode-standardized pictographs and logographs. Emoji allow people to express emotions and their personalities more “authentically” by increasing the semantic content of visual messages. The relationship between language, emoji, and emotions is now being studied by several disciplines such as linguistics, psychology, natural language processing (NLP), and machine learning (ML). In particular, the last two are employed for the automatic detection of emotions and personality traits and for building emoji sentiment lexicons, as well as for endowing artificial agents with the ability to express emotions through emoji. In this chapter, we introduce the concept of emoji, review the main challenges in using emoji as a proxy of language and emotions and the ML and NLP techniques used for the classification and detection of emotions using emoji, and present new trends for the exploitation of discovered emotional patterns in robotic emotional communication.
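An emoji sentiment lexicon of the kind mentioned above can be used for lightweight emotion scoring. A minimal sketch follows; the tiny lexicon and its polarity values are illustrative placeholders, not drawn from any published lexicon:

```python
# Illustrative emoji sentiment lexicon: emoji -> polarity in [-1, 1].
# Real lexicons, built from large labeled corpora, contain hundreds of entries.
EMOJI_SENTIMENT = {
    "😀": 0.8, "😍": 0.9, "😢": -0.7, "😡": -0.9, "👍": 0.6,
}

def emoji_sentiment(text):
    """Average polarity over the emoji found in a message; 0.0 if none."""
    scores = [EMOJI_SENTIMENT[ch] for ch in text if ch in EMOJI_SENTIMENT]
    return sum(scores) / len(scores) if scores else 0.0
```

Such a lookup only handles single-codepoint emoji; multi-codepoint sequences (skin tones, ZWJ combinations) require segmenting the text by Unicode emoji sequences rather than by character.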
The Self-Organization of Speech Sounds
The speech code is a vehicle of language: it defines
a set of forms used by a community to carry information.
Such a code is necessary to support the linguistic
interactions that allow humans to communicate.
How then may a speech code be formed prior to the
existence of linguistic interactions?
Moreover, the human speech code is discrete and compositional,
shared by all the individuals of a community but different
across communities, and phoneme inventories are characterized by
statistical regularities. How can a speech code with these properties form?
We try to approach these questions in the paper,
using the "methodology of the artificial". We
build a society of artificial agents, and detail a mechanism that
shows the formation of a discrete speech code without pre-supposing
the existence of linguistic capacities or of coordinated interactions.
The mechanism is based on a low-level model of
sensory-motor interactions. We show that the integration of certain very
simple and non language-specific neural devices
leads to the formation of a speech code that
has properties similar to the human speech code.
This result relies on the self-organizing properties of a generic
coupling between perception and production
within agents, and on the interactions between agents.
The artificial system helps us to develop better intuitions on how speech
might have appeared, by showing how self-organization
might have helped natural selection to find speech.
Towards a more natural and intelligent interface with embodied conversation agent
Conversational agents, also known as chatterbots, are computer programs designed to converse like a human as far as their intelligence allows. In many ways, they are the embodiment of Turing's vision. The ability of computers to converse with human users in natural language would arguably increase their usefulness. Recent advances in Natural Language Processing (NLP) and Artificial Intelligence (AI) in general have advanced this field toward realizing the vision of a more humanoid interactive system. This paper presents and discusses the use of an embodied conversation agent (ECA) for imitation games. It also presents the technical design of our ECA and its performance. In the interactive media industry, it can also be observed that ECAs are becoming popular.
No Grice: Computers that Lie, Deceive and Conceal
In the future our daily life interactions with other people, with computers, robots and smart environments will be recorded and interpreted by computers or embedded intelligence in environments, furniture, robots, displays, and wearables. These sensors record our activities, our behavior, and our interactions. Fusion of such information and reasoning about such information makes it possible, using computational models of human behavior and activities, to provide context- and person-aware interpretations of human behavior and activities, including determination of attitudes, moods, and emotions. Sensors include cameras, microphones, eye trackers, position and proximity sensors, tactile or smell sensors, et cetera. Sensors can be embedded in an environment, but they can also move around, for example, if they are part of a mobile social robot or if they are part of devices we carry around or are embedded in our clothes or body.

Our daily life behavior and daily life interactions are recorded and interpreted. How can we use such environments and how can such environments use us? Do we always want to cooperate with these environments; do these environments always want to cooperate with us? In this paper we argue that there are many reasons that users, or rather human partners, of these environments do want to keep information about their intentions and their emotions hidden from these smart environments. On the other hand, their artificial interaction partner may have similar reasons not to give away all the information it has, or to treat its human partner as an opponent rather than someone who has to be supported by smart technology.

This will be elaborated in this paper. We will survey examples of human-computer interaction where being explicit about intentions and feelings is not necessarily a goal. In subsequent sections we will look at (1) the computer as a conversational partner, (2) the computer as a butler or diary companion, (3) the computer as a teacher or a trainer acting in a virtual training environment (a serious game), (4) sports applications (which are not necessarily different from serious-game or education environments), and (5) games and entertainment applications.