11 research outputs found

    Using Natural Language Feedback in a Neuro-inspired Integrated Multimodal Robotic Architecture

    In this paper we present a multi-modal human-robot interaction architecture that combines information coming from different sensory inputs and generates feedback for the user, implicitly teaching him/her how to interact with the robot. The system combines vision, speech and language with inference and feedback. The system environment consists of a Nao robot that has to learn objects situated on a table solely by understanding absolute and relative object locations uttered by the user, and afterwards points at a desired object to show what it has learned. The results of a user study and performance test show the usefulness of the feedback produced by the system and justify its usage in real-world applications, as its classification accuracy on multi-modal input is around 80.8%. In the experiments, the system detected inconsistent input coming from different sensory modules in all cases and could generate useful feedback for the user from this information.
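A minimal sketch of how such a cross-modal inconsistency check might look (the fusion rule, labels, and threshold below are illustrative assumptions, not the paper's actual method):

```python
def fuse(speech, vision, threshold=0.25):
    """Combine per-object scores from two modalities; flag inconsistency.

    speech, vision: dicts mapping object label -> confidence in [0, 1].
    Returns (best_label, feedback) where feedback is None when the
    modalities agree. Hypothetical rule, for illustration only.
    """
    # Sum the evidence from both modalities for every candidate object.
    combined = {k: speech.get(k, 0.0) + vision.get(k, 0.0)
                for k in set(speech) | set(vision)}
    best = max(combined, key=combined.get)

    # The modalities are inconsistent if each prefers a different object
    # and neither clearly dominates the other.
    s_best = max(speech, key=speech.get)
    v_best = max(vision, key=vision.get)
    if s_best != v_best and abs(speech[s_best] - vision[v_best]) < threshold:
        return best, f"I heard '{s_best}' but I see '{v_best}'. Which did you mean?"
    return best, None
```

For example, `fuse({"ball": 0.9, "cup": 0.1}, {"cup": 0.8, "ball": 0.2})` returns the fused best guess together with a clarification question, which is the kind of implicit teaching feedback the abstract describes.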

    From Phonemes to Sentence Comprehension: A Neurocomputational Model of Sentence Processing for Robots

    There has been important progress in speech recognition systems in recent years. The word recognition error rate went down with the arrival of deep learning methods. However, if one uses a cloud speech API and integrates it into a robotic architecture, one faces a non-negligible number of wrongly recognized sentences. Thus speech recognition cannot be considered solved, because many sentences taken out of their context are ambiguous. We believe that contextual solutions (i.e. adaptable and trainable on different HRI applications) have to be found. In this perspective, the way children learn language and how our brains process utterances may help us improve how robots process language. Taking inspiration from language acquisition theories and from how the brain processes sentences, we previously developed a neuro-inspired model of sentence processing. In this study, we investigate how this model can process different levels of abstraction as input: sequences of phonemes, sequences of words, or grammatical constructions. We see that even though the model was previously tested only on grammatical constructions, it performs better with word and phoneme inputs.

    Recurrent Neural Network for Syntax Learning with Flexible Representations

    We present a Recurrent Neural Network (RNN), namely an Echo State Network (ESN), that performs sentence comprehension and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (e.g. predicates). We have previously shown that this ESN is able to generalize to unknown sentence structures in English and French. The meaning representations it can learn to produce are flexible: one can use any kind of "series of slots" (or, more generally, any vector representation); they are not limited to predicates. Moreover, preliminary work has shown that the model can be trained fully incrementally. Thus, it enables the exploration of language acquisition in a developmental approach. Furthermore, an "inverse" version of the model has also been studied, which produces sentence structures from meaning representations. Therefore, if these two models are combined in the same agent, one can investigate the emergence of language (and in particular syntax) through agent-based simulations. This model has been encapsulated in a ROS module, which enables one to use it in a cognitive robotic architecture or in a distributed agent simulation.
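As a rough illustration of the reservoir-computing principle behind such an ESN (not the authors' implementation: the vocabulary, network sizes, and two-sentence corpus below are invented), only the linear readout is trained while the input and recurrent weights stay fixed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary with one-hot word inputs (hypothetical, for illustration).
vocab = ["the", "ball", "cat", "push", "hit", "left", "right"]
idx = {w: i for i, w in enumerate(vocab)}

N_IN, N_RES = len(vocab), 100           # output: 2 "slots" (agent id, object id)

W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))   # fixed random input weights
W = rng.uniform(-0.5, 0.5, (N_RES, N_RES))     # fixed random recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))      # scale spectral radius below 1

def run_reservoir(sentence):
    """Drive the reservoir word by word; return the final state vector."""
    x = np.zeros(N_RES)
    for w in sentence.split():
        u = np.zeros(N_IN)
        u[idx[w]] = 1.0
        x = np.tanh(W_in @ u + W @ x)          # leak-free update, for brevity
    return x

# Tiny training set: sentence -> (agent id, object id). Made-up corpus.
data = [("the cat push the ball", (idx["cat"], idx["ball"])),
        ("the ball hit the cat", (idx["ball"], idx["cat"]))]

X = np.stack([run_reservoir(s) for s, _ in data])   # reservoir states
Y = np.array([t for _, t in data], dtype=float)     # target slot vectors

# Ridge-regression readout: the only trained part in reservoir computing.
ridge = 1e-6
W_out = Y.T @ X @ np.linalg.inv(X.T @ X + ridge * np.eye(N_RES))

pred = X @ W_out.T
print(np.round(pred).astype(int))   # recovers the (agent, object) targets
```

The "series of slots" flexibility mentioned above corresponds to the fact that `Y` can be any vector representation: swapping in a different target matrix changes nothing else in the training procedure.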

    Recurrent Neural Network for Syntax Learning with Flexible Predicates for Robotic Architectures

    We present a Recurrent Neural Network (RNN), namely an Echo State Network (ESN), that performs sentence comprehension and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (i.e. predicates). We have previously shown that this ESN is able to generalize to unknown sentence structures. Moreover, it is able to learn English, French, or both at the same time. There are two novelties presented here: (1) the encapsulation of this RNN in a ROS module enables one to use it in a robotic architecture such as the Nao humanoid robot, and (2) the flexibility of the predicates it can learn to produce (e.g. extracting adjectives) enables one to use the model to explore language acquisition in a developmental approach.

    Recurrent Neural Network Sentence Parser for Multiple Languages with Flexible Meaning Representations for Home Scenarios

    We present a Recurrent Neural Network (RNN), namely an Echo State Network (ESN), that performs sentence comprehension and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (i.e. predicates). We have previously shown that this ESN is able to generalize to unknown sentence structures in English and French. The flexibility of the predicates it can learn to produce enables one to use the model to explore language acquisition in a developmental approach. This RNN has been encapsulated in a ROS module, which enables one to use it in a cognitive robotic architecture. Here, for the first time, we show that it can be trained to parse sentences related to home scenarios with highly flexible predicate representations and variable sentence structures. Moreover, we apply it to various languages, including some never tried with the architecture before, namely German and Spanish. We conclude that the representations are not limited to predicates: other types of representations can be used.

    Learning to Parse Grounded Language using Reservoir Computing

    New models for language processing and learning using Reservoir Computing have recently become popular. However, these models are typically not grounded in sensorimotor systems and robots. In this paper, we develop a Reservoir Computing model called Reservoir Parser (ResPars) for learning to parse natural language from grounded data coming from humanoid robots. Previous work showed that ResPars is able to perform syntactic generalization over different sentences (surface structure) with the same meaning (deep structure). We argue that such an ability is key to guiding linguistic generalization in a grounded architecture. We show that ResPars is able to generalize on grounded compositional semantics by combining it with Incremental Recruitment Language (IRL). Additionally, we show that ResPars is able to generalize on the same sentences processed not word by word, but as an unsegmented sequence of phonemes. This ability enables the architecture not to rely only on the words recognized by a speech recognizer, but to process the sub-word level directly. We additionally test the model's robustness to word recognition errors.

    Which Input Abstraction is Better for a Robot Syntax Acquisition Model? Phonemes, Words or Grammatical Constructions?

    Corresponding code at https://github.com/neuronalX/Hinaut2018_icdl-epirob. There has been considerable progress in speech recognition systems in recent years [13]. The word recognition error rate went down with the arrival of deep learning methods. However, if one uses a cloud-based speech API and integrates it inside a robotic architecture [33], one still encounters a considerable number of wrongly recognized sentences. Thus speech recognition cannot be considered solved, especially when an utterance is considered in isolation from its context. Particular solutions, which can be adapted to different Human-Robot Interaction applications and contexts, have to be found. In this perspective, the way children learn language and how our brains process utterances may help us improve how robots process language. Taking inspiration from language acquisition theories and from how the brain processes sentences, we previously developed a neuro-inspired model of sentence processing. In this study, we investigate how this model can process different levels of abstraction as input: sequences of phonemes, sequences of words, or grammatical constructions. We see that even though the model was previously tested only on grammatical constructions, it performs better with word and phoneme inputs.

    Cross-Situational Learning with Reservoir Computing for Language Acquisition Modelling

    Understanding the mechanisms that enable children to rapidly learn word-to-meaning mappings through cross-situational learning in uncertain conditions is still a matter of debate. In particular, many models look only at the word level, not at full sentence comprehension. We present a model of language acquisition that applies cross-situational learning to Recurrent Neural Networks within the Reservoir Computing paradigm. Using the co-occurrences between words and visual perceptions, the model learns to ground a complex sentence, describing a scene involving different objects, in a perceptual representation space. The model processes sentences describing scenes it perceives simultaneously via a simulated vision module: sentences are the inputs, and the simulated vision provides the target outputs of the RNN. Evaluations of the model show its capacity to extract the semantics of virtually hundreds of thousands of possible sentence combinations (based on a context-free grammar); remarkably, the model generalizes after only a few hundred partially described scenes seen via cross-situational learning. Furthermore, it handles polysemous and synonymous words, and deals with complex sentences where word order is crucial for understanding. Finally, further improvements of the model are discussed with the aim of reaching proper reinforced and self-supervised learning schemes, with the goal of enabling robots to acquire and ground language by themselves (with no oracle supervision).
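The co-occurrence intuition behind cross-situational learning can be caricatured with simple counting (a deliberately simplified, count-based sketch; the paper's model instead learns this mapping inside an RNN, and the scenes below are made up):

```python
from collections import defaultdict

# Word-referent co-occurrence counts accumulated across ambiguous scenes.
counts = defaultdict(lambda: defaultdict(int))

scenes = [  # (utterance, objects visible in the scene) -- invented data
    ("the ball", {"ball", "cup"}),
    ("the cup", {"cup", "dog"}),
    ("the ball", {"ball", "dog"}),
]

# Each scene is ambiguous on its own; regularity emerges across scenes.
for utterance, objects in scenes:
    for word in utterance.split():
        for obj in objects:
            counts[word][obj] += 1

def best_referent(word):
    """Most frequently co-occurring referent for a word."""
    return max(counts[word], key=counts[word].get)

print(best_referent("ball"))  # "ball": co-occurs twice, other objects once
```

Note that a function word such as "the" co-occurs equally with everything and so stays ambiguous under pure counting, which is one reason the abstract's sentence-level RNN approach goes beyond the word level.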

    Teach Your Robot Your Language! Trainable Neural Parser for Modelling Human Sentence Processing: Examples for 15 Languages

    We present a Recurrent Neural Network (RNN) that performs thematic role assignment and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (e.g. predicates). Previously, we have shown that the model is able to generalize on English and French corpora. In this study, we investigate its ability to adapt to various languages originating from Asia or Europe. We show that it can successfully learn to parse sentences related to home scenarios in fifteen languages: English, German, French, Spanish, Catalan, Basque, Portuguese, Italian, Bulgarian, Turkish, Persian, Hindi, Marathi, Malay and Mandarin Chinese. Moreover, we have deliberately included variably complex sentences in the corpora in order to explore the flexibility of the predicate-like output representations. This demonstrates that (1) the learning principle of our model is not limited to a particular language (or to particular sentence structures) but is more generic in nature, and (2) it can deal with various kinds of representations (not only predicates), which enables users to adapt it to their own needs. As the model is inspired by neuroscience and language acquisition theories, this generic and language-independent aspect makes it a good candidate for modelling human sentence processing. It is especially relevant when the model is implemented in grounded multimodal robotic architectures.

    Teach Your Robot Your Language! Trainable Neural Parser for Modelling Human Sentence Processing: Examples for 15 Languages

    We present a Recurrent Neural Network (RNN) that performs thematic role assignment and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (e.g. predicates). Previously, we have shown that the model is able to generalize on English and French corpora. In this study, we investigate its ability to adapt to various languages originating from Asia or Europe. We show that it can successfully learn to parse sentences related to home scenarios in fifteen languages: English, German, French, Spanish, Catalan, Basque, Portuguese, Italian, Bulgarian, Turkish, Persian, Hindi, Marathi, Malay and Mandarin Chinese. Moreover, we have deliberately included variably complex sentences in the corpora in order to explore the flexibility of the predicate-like output representations. This demonstrates that (1) the learning principle of our model is not limited to a particular language (or to particular sentence structures) but is more generic in nature, and (2) it can deal with various kinds of representations (not only predicates), which enables users to adapt it to their own needs. As the model is inspired by neuroscience and language acquisition theories, this generic and language-independent aspect makes it a good candidate for modelling human sentence processing. Finally, we discuss the potential implementation of the model in a grounded robotic architecture.