7 research outputs found

    Interactive Learning from Unlabeled Instructions

    Interactive learning deals with the problem of learning and solving tasks using human instructions. It is common in human-robot interaction, tutoring systems, and human-computer interfaces such as brain-computer interfaces. In most cases, learning these tasks is possible because the signals are predefined or because an ad hoc calibration procedure maps signals to specific meanings. In this paper, we address the problem of simultaneously solving a task under human feedback and learning the associated meanings of the feedback signals. This has important practical applications, since the user can start controlling a device from scratch, without needing an expert to define the meaning of signals or to carry out a calibration phase. The paper proposes an algorithm that simultaneously assigns meanings to signals while solving a sequential task, under the assumption that both human and machine share the same prior over the possible instruction meanings and the possible tasks. Furthermore, we show, using synthetic and real EEG data from a brain-computer interface, that taking into account the uncertainty of both the task and the signal is necessary for the machine to actively plan how to solve the task efficiently.

    SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

    Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI. Within the Deep Reinforcement Learning (DRL) field, this objective motivated multiple works on embodied language use. However, current approaches focus on language as a communication tool in very simplified and non-diverse social situations: the "naturalness" of language is reduced to the concept of high vocabulary size and variability. In this paper, we argue that aiming towards human-level AI requires a broader set of key social skills: 1) language use in complex and variable social contexts; 2) beyond language, complex embodied communication in multimodal settings within constantly evolving social worlds. We explain how concepts from cognitive sciences could help AI to draw a roadmap towards human-like intelligence, with a focus on its social dimensions. As a first step, we propose to expand current research to a broader set of core social skills. To do this, we present SocialAI, a benchmark to assess the acquisition of social skills of DRL agents using multiple grid-world environments featuring other (scripted) social agents. We then study the limits of a recent SOTA DRL approach when tested on SocialAI and discuss important next steps towards proficient social agents. Videos and code are available at https://sites.google.com/view/socialai.
    Comment: under review. This paper extends and generalizes work in arXiv:2104.1320

    Pragmatic Frames for Teaching and Learning in Human-Robot interaction: Review and Challenges

    Vollmer A-L, Wrede B, Rohlfing KJ, Oudeyer P-Y. Pragmatic Frames for Teaching and Learning in Human-Robot interaction: Review and Challenges. Frontiers in Neurorobotics. 2016;10: 10.
    One of the big challenges in robotics today is to learn from human users who are inexperienced in interacting with robots yet are often used to teaching skills flexibly to other humans, and to children in particular. A potential route toward natural and efficient learning and teaching in Human-Robot Interaction (HRI) is to leverage the social competences of humans and the underlying interactional mechanisms. From this perspective, this article discusses the importance of pragmatic frames as flexible interaction protocols that provide important contextual cues, enabling learners to infer new action or language skills and teachers to convey these cues. After defining and discussing the concept of pragmatic frames, grounded in decades of research in developmental psychology, we study a selection of HRI work in the literature that has focused on learning-teaching interaction, and we analyze the interactional and learning mechanisms that were used in the light of pragmatic frames. This allows us to show that many of these works have already used in practice, though not always explicitly, basic elements of the pragmatic frames machinery. However, we also show that pragmatic frames have so far been used in a very restricted way compared to how they are used in human-human interaction, and we argue that this has been an obstacle preventing robust, natural multi-task learning and teaching in HRI. In particular, we explain that two central features of human pragmatic frames, mostly absent from existing HRI studies, are that (1) social peers use rich repertoires of frames, potentially combined together, to convey and infer multiple kinds of cues; (2) new frames can be learnt continually, building on existing ones, and guiding the interaction toward higher levels of complexity and expressivity.
To conclude, we give an outlook on future research directions, describing the key challenges that need to be solved in order to leverage pragmatic frames for robot learning and teaching.

    Towards Teachable Autonomous Agents

    Autonomous discovery and direct instruction are two extreme sources of learning in children, but educational sciences have shown that intermediate approaches such as assisted discovery or guided play result in better acquisition of skills. When turning to Artificial Intelligence, the above dichotomy can be translated into the distinction between autonomous agents, which learn in isolation from their own signals, and interactive learning agents, which can be taught by social partners but generally lack autonomy. In between should stand teachable autonomous agents: agents that learn from both internal and teaching signals to benefit from the higher efficiency of assisted discovery processes. Designing such agents could result in progress in two ways. First, very concretely, it would offer non-expert users in the real world a way to drive the learning behavior of agents towards their expectations. Second, more fundamentally, it might be a key step to endow agents with the necessary capabilities to reach general intelligence. The purpose of this paper is to elucidate the key obstacles standing in the way of designing such agents. We proceed in four steps. First, we build on a seminal work of Bruner to extract relevant features of the assisted discovery processes happening between a child and a tutor. Second, we highlight how current research on intrinsically motivated agents is paving the way towards teachable and autonomous agents. In particular, we focus on autotelic agents, i.e., agents equipped with forms of intrinsic motivations that enable them to represent, self-generate and pursue their own goals. We argue that such autotelic capabilities on the learner side are key in the discovery process. Third, we adopt a social learning perspective on the interaction between a tutor and a learner to highlight some components that are currently missing from these agents before they can be taught by ordinary people using natural pedagogy.
Finally, we provide a list of specific research questions that emerge from the perspective of extending these agents with assisted learning capabilities.

    Making friends on the fly : advances in ad hoc teamwork

    Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations. The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available.
Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones. We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. 
PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
    Computer Science

    Machine Learning from Casual Conversation

    Human social learning is an effective process that has inspired many existing machine learning techniques, such as learning from observation and learning by demonstration. In this dissertation, we introduce another form of social learning, Learning from a Casual Conversation (LCC). LCC is an open-ended machine learning system in which an artificially intelligent agent learns from an extended dialog with a human. Our system enables the agent to incorporate changes into its knowledge base, based on the human's conversational text input. This system emulates how humans learn from each other through a dialog. LCC closes the gap in the current research that is focused on teaching specific tasks to computer agents. Furthermore, LCC aims to provide an easy way to enhance the knowledge of the system without requiring the involvement of a programmer. This system does not require the user to enter specific information; instead, the user can chat naturally with the agent. LCC identifies the inputs that contain information relevant to its knowledge base in the learning process. LCC's architecture consists of multiple sub-systems combined to perform the task. Its learning component can add new knowledge to existing information in the knowledge base, confirm existing information, and/or update existing information found to be related to the user input. The test results indicate that the prototype was successful in learning from a conversation. The LCC system functionality was assessed using different evaluation methods. This includes tests performed by the developer, as well as by 130 human test subjects. Thirty of those test subjects interacted directly with the system and completed a survey of 13 questions/statements that asked the user about his/her experience using LCC. A second group of 100 human test subjects evaluated the dialogue logs of a subset of the first group of human testers.
The collected results were all found to be acceptable and within the range of our expectations.