8 research outputs found

    A Review on Learning Planning Action Models for Socio-Communicative HRI

    No full text
    National audience. For social robots to come into widespread use in companionship, caregiving, and domestic help, they must demonstrate social intelligence: to be acceptable, they must exhibit socio-communicative skills. Classic approaches that program HRI from observed human-human interactions fail to capture both the subtlety of multimodal interactions and the key structural differences between robots and humans. The former difficulty arises from quantifying and coding multimodal behaviors; the latter from the differing degrees of freedom of a robot and a human. However, reverse engineering the robot's underlying behavioral blueprint from multimodal HRI traces is an option worth exploring. In this spirit, the entire HRI can be seen as a goal-driven sequence of speech-act exchanges between the robot and the human, each act treated as an action. The interaction is then a sequence of actions propelling it from an initial state to a goal state, known in AI planning as a plan, and the action sequence produced by executing a plan can be represented as a trace. AI techniques such as machine learning can then learn, from these multimodal traces, behavioral models (known as symbolic action models in AI) intended to be reusable for AI planning. This article reviews recent machine learning techniques for learning planning action models that can be applied to HRI with the intent of rendering robots socio-communicative.
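The "interaction as a plan" view described above can be illustrated with a minimal STRIPS-style sketch: each speech act is an action with preconditions and add/delete effects, and executing a plan drives the state from initial to goal. The speech acts, fluents, and dialog below are invented for illustration and are not taken from any reviewed system.

```python
def apply(state, action):
    """Apply a STRIPS-style action to a state if its preconditions hold."""
    pre, add, delete = action["pre"], action["add"], action["del"]
    if not pre <= state:
        raise ValueError(f"preconditions {pre - state} not satisfied")
    return (state - delete) | add

# Each speech act is an action with preconditions and effects (all invented).
ACTIONS = {
    "greet":       {"pre": set(),        "add": {"greeted"},  "del": set()},
    "ask_request": {"pre": {"greeted"},  "add": {"asked"},    "del": set()},
    "give_answer": {"pre": {"asked"},    "add": {"answered"}, "del": {"asked"}},
    "say_goodbye": {"pre": {"answered"}, "add": {"closed"},   "del": set()},
}

def execute(plan, state=frozenset()):
    """Run a plan (a sequence of action names) and return the final state."""
    state = set(state)
    for name in plan:
        state = apply(state, ACTIONS[name])
    return state

# One dialog exchange, driven from the initial state to the goal state.
final = execute(["greet", "ask_request", "give_answer", "say_goodbye"])
print(final)  # the goal fluent 'closed' now holds
```

The recorded sequence of action names is exactly the kind of trace from which the reviewed techniques try to reconstruct the action model.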

    Action Model Learning for Socio-Communicative Human Robot Interaction

    No full text
    Driven by the objective of rendering robots socio-communicative, there has been heightened interest in techniques that endow robots with social skills and "commonsense" to make them acceptable. This social intelligence, or "commonsense", is what ultimately determines a robot's long-term social acceptability. Commonsense, however, is not that common: robots can only learn to be acceptable through experience. Yet teaching a humanoid the subtleties of a social interaction is not straightforward. Even a standard dialogue exchange integrates the widest possible panel of signs that intervene in communication and are difficult to codify (synchronization between body expression, facial expression, tone of voice, etc.). In such a scenario, learning the robot's behavioral model is a promising approach, and this learning can be performed with AI techniques. This study addresses the problem of learning robot behavioral models in the Automated Planning and Scheduling (APS) paradigm of AI. In APS, intelligent agents require an action model (blueprints of actions whose interleaved executions effectuate transitions of the system state) in order to plan and solve real-world problems. In this thesis, we introduce two new learning systems that facilitate the learning of action models, and extend their scope to learning robot behavioral models. These techniques fall into two categories: non-optimal and optimal. Non-optimal techniques are more classical in the domain, have been worked on for years, and are symbolic in nature; however, they have their share of quirks, resulting in a less-than-desired learning rate. The optimal techniques build on recent advances in deep learning, in particular the Long Short-Term Memory (LSTM) family of recurrent neural networks, and produce higher learning rates. Both techniques are tested on AI benchmarks to evaluate their performance, and then applied to HRI traces to estimate the quality of the learnt robot behavioral model. This serves the long-term objective of introducing behavioral autonomy in robots, so that they can communicate autonomously with humans without the need for "wizard" intervention.

    Learning Robot Speech Models to Predict Speech Acts in HRI

    No full text
    International audience. In order to be acceptable and able to "camouflage" into their physio-social context in the long run, robots need to be not just functional, but autonomously psycho-affective as well. This motivates the long-term necessity of introducing behavioral autonomy in robots, so they can communicate autonomously with humans without the need for "wizard" intervention. This paper proposes a technique to learn robot speech models from human-robot dialog exchanges. It views the entire exchange in the Automated Planning (AP) paradigm, representing dialog sequences (speech acts) as action sequences that modify the state of the world upon execution, gradually propelling the state towards a desired goal. We then exploit intra-action and inter-action dependencies, encoding them as constraints. We satisfy these constraints using a weighted maximum satisfiability model known as MAX-SAT, and convert the solution into a speech model. This model has many potential uses, such as planning fresh dialogs. In this study, the learnt model is used to predict speech acts in dialog sequences using the sequence-labeling (predicting future acts from previously seen ones) capabilities of the Long Short-Term Memory (LSTM) class of recurrent neural networks. Encouraging empirical results demonstrate the utility of the learnt model and its long-term potential to facilitate autonomous behavioral planning of robots, an aspect to be explored in future work.
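The sequence-labeling idea (predicting the next speech act from previously seen ones) can be pictured with a deliberately simple stand-in: the paper uses an LSTM, but a bigram-frequency model over invented dialog traces already shows the shape of the task.

```python
from collections import Counter, defaultdict

# Toy dialog traces of speech acts (invented for illustration).
traces = [
    ["greet", "ask", "inform", "thank", "bye"],
    ["greet", "ask", "clarify", "inform", "bye"],
    ["greet", "inform", "thank", "bye"],
]

# Count which act follows which: a bigram model, a frequency-based
# stand-in for the LSTM sequence labeler used in the paper.
follows = defaultdict(Counter)
for trace in traces:
    for prev, nxt in zip(trace, trace[1:]):
        follows[prev][nxt] += 1

def predict_next(act):
    """Predict the most frequent successor of a speech act."""
    return follows[act].most_common(1)[0][0]

print(predict_next("greet"))  # 'ask' (observed twice, vs 'inform' once)
print(predict_next("thank"))  # 'bye'
```

Unlike this bigram sketch, an LSTM conditions each prediction on the whole preceding dialog, which is what makes it suitable for the longer-range dependencies in real exchanges.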

    Action Model Acquisition using Sequential Pattern Mining

    No full text
    International audience. This paper presents an approach to learn agents' action models (action blueprints orchestrating transitions of the system state) from plan execution sequences. It represents intra-action and inter-action dependencies as a maximum satisfiability (MAX-SAT) problem, and solves it with a MAX-SAT solver to reconstruct the underlying action model. Unlike previous MAX-SAT-driven approaches, our chosen dependencies exploit the relationship between consecutive actions, yielding more accurately learnt models.
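The dependency-encoding idea can be sketched as a toy weighted MAX-SAT instance: soft clauses vote on which preconditions and effects an action has, and the assignment maximizing satisfied weight is read back as the model. The clauses, weights, and variable names below are invented, and the brute-force search merely stands in for a real MAX-SAT solver.

```python
from itertools import product

# A toy weighted MAX-SAT instance: each clause is (weight, [literals]),
# a literal is (variable, polarity). In the paper, such soft clauses
# encode intra-action and inter-action dependencies mined from traces.
clauses = [
    (5, [("pre_clear", True)]),                         # strong evidence
    (3, [("eff_holding", True)]),
    (2, [("pre_clear", False), ("eff_clear", False)]),  # consecutive-action link
    (1, [("eff_clear", True)]),                         # weaker, conflicting
]
variables = sorted({v for _, lits in clauses for v, _ in lits})

def weight(assignment):
    """Total weight of clauses satisfied by a truth assignment."""
    return sum(w for w, lits in clauses
               if any(assignment[v] == pol for v, pol in lits))

# Brute-force over all assignments (a real solver scales far better).
best = max((dict(zip(variables, bits))
            for bits in product([False, True], repeat=len(variables))),
           key=weight)
print(best, weight(best))
```

Here the weight-2 consecutive-action clause outvotes the weight-1 clause, so the solution sets eff_clear to false: exactly the kind of conflict resolution the MAX-SAT formulation performs at scale.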

    A Review of Learning Planning Action Models

    No full text
    International audience. Automated planning has been a continuous field of study since the 1960s, since the notion of accomplishing a task using an ordered set of actions resonates with almost every known activity domain. However, as we move from toy domains closer to the complex real world, these actions become increasingly difficult to codify. The reasons range from intense laborious effort to intricacies so barely identifiable that programming them becomes a challenge much later in the process. In such domains, planners now leverage recent advances in machine learning to learn action models, i.e. blueprints of all the actions whose execution effectuates transitions in the system. This learning lets the model evolve towards a version more consistent with and better adapted to its environment, increasing the probability that plans succeed. It is also a conscious effort to decrease laborious manual coding and increase quality. This paper presents a survey of machine learning techniques for learning planning action models. It first describes the characteristics of learning systems, then details the learning techniques used in the literature over the past decades, and finally presents some open issues.
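One simple way to picture "learning action models from traces" is to approximate each action's preconditions as the fluents true in every state where it was applied, and its effects as the fluents that consistently appear or disappear. The sketch below, with invented fluents and observations, intersects pre-states and differences post-states; real learners handle noise, partial observability, and lifted (parameterized) schemas.

```python
def learn(observations):
    """observations: list of (state_before, action, state_after) triples."""
    model = {}
    for before, action, after in observations:
        before, after = set(before), set(after)
        if action not in model:
            model[action] = {"pre": set(before),
                             "add": after - before,
                             "del": before - after}
        else:
            m = model[action]
            m["pre"] &= before          # keep fluents always true beforehand
            m["add"] &= after - before  # keep effects seen in every execution
            m["del"] &= before - after
    return model

# Two invented observations of the same grounded action.
obs = [
    ({"clear_b", "handempty"}, "pickup_b", {"holding_b"}),
    ({"clear_b", "handempty", "on_table_c"}, "pickup_b",
     {"holding_b", "on_table_c"}),
]
model = learn(obs)
print(model["pickup_b"])
```

Note how the second observation prunes the irrelevant fluent on_table_c from the precondition set: more traces generally mean a tighter, more accurate model.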