
    HUMAN-ROBOT INTERACTION: LANGUAGE ACQUISITION WITH NEURAL NETWORK

    ABSTRACT This paper gives an overview of two natural language processing methods for human-robot interaction. Echo State Networks, a Recurrent Neural Network architecture based on the principle of supervised learning, and stochastic learning grammar, a probabilistic grammar framework, are explored in order to understand how human natural language is generated and how these methods could be integrated to make communication between robot and robot, or between robot and human, more natural in a dialogic syntactic language game. Integrating the two methods could bring several benefits, such as more efficient communication and more naturally constructed sentences. How to cite: Fazrie, A.R. (2018). HUMAN-ROBOT INTERACTION: LANGUAGE ACQUISITION WITH NEURAL NETWORK. Jurnal Teknik Informatika, 11(1), 75-84. DOI: http://dx.doi.org/10.15408/jti.v11i1.6093
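The core of an Echo State Network, as the abstract describes it, is a fixed random recurrent reservoir whose state evolves with the input; only a linear readout is trained (the supervised-learning part). A minimal sketch of the reservoir update, with illustrative dimensions and scaling that are not taken from the paper:

```python
import numpy as np

# Minimal Echo State Network reservoir sketch (hypothetical sizes and
# hyperparameters): fixed random weights, leaky-integrator tanh units.
rng = np.random.default_rng(0)
n_in, n_res = 5, 100   # input and reservoir sizes (illustrative)
leak = 0.3             # leak rate of the integrator units

W_in = rng.uniform(-1, 1, (n_res, n_in))    # fixed input weights
W = rng.uniform(-0.5, 0.5, (n_res, n_res))  # fixed recurrent weights
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # scale spectral radius below 1

def esn_step(x, u):
    """One reservoir update: leaky integration of a tanh nonlinearity."""
    return (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)

# Drive the reservoir with a short random input sequence.
x = np.zeros(n_res)
for t in range(10):
    x = esn_step(x, rng.uniform(-1, 1, n_in))
print(x.shape)  # (100,)
```

Because the recurrent weights stay fixed, only the readout from `x` would need to be learned, which is what makes ESN training cheap compared to full backpropagation through time.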

    Which Input Abstraction is Better for a Robot Syntax Acquisition Model? Phonemes, Words or Grammatical Constructions?

    Corresponding code at https://github.com/neuronalX/Hinaut2018_icdl-epirob. There has been considerable progress in recent years in speech recognition systems [13]. The word recognition error rate went down with the arrival of deep learning methods. However, if one uses a cloud-based speech API and integrates it inside a robotic architecture [33], one still encounters many wrongly recognized sentences. Thus speech recognition cannot be considered solved, especially when an utterance is considered in isolation from its context. Particular solutions, which can be adapted to different human-robot interaction applications and contexts, have to be found. In this perspective, the way children learn language and how our brains process utterances may help us improve how robots process language. Taking inspiration from language acquisition theories and from how the brain processes sentences, we previously developed a neuro-inspired model of sentence processing. In this study, we investigate how this model can process different levels of abstraction as input: sequences of phonemes, sequences of words, or grammatical constructions. We find that even though the model had only been tested on grammatical constructions before, it performs better with word and phoneme inputs.
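The three input abstractions compared in the abstract can be made concrete with one example sentence. The vocabulary, slot marker, and phoneme transcription below are illustrative inventions, not the paper's actual corpus: the point is only that the same utterance becomes a longer sequence at finer granularity.

```python
# Encode one sentence at the three abstraction levels the paper compares.
sentence = "put the ball on the left"

# 1. Grammatical construction: content words replaced by a slot marker,
#    so sentences sharing a structure map to the same input sequence.
content_words = {"put", "ball", "left"}
construction = [w if w not in content_words else "SW"  # SW = semantic word
                for w in sentence.split()]

# 2. Word level: the raw token sequence.
words = sentence.split()

# 3. Phoneme level: each word expanded to phonemes (toy transcription).
lexicon = {"put": ["p", "U", "t"], "the": ["D", "@"],
           "ball": ["b", "O:", "l"], "on": ["Q", "n"],
           "left": ["l", "E", "f", "t"]}
phonemes = [p for w in words for p in lexicon[w]]

for name, seq in [("construction", construction),
                  ("words", words), ("phonemes", phonemes)]:
    print(f"{name:12s} len={len(seq)}: {seq}")
```

Feeding the finer-grained sequences to the same model is what lets the authors ask which abstraction the reservoir handles best.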

    Reservoir SMILES: Towards SensoriMotor Interaction of Language and Embodiment of Symbols with Reservoir Architectures

    Language involves several hierarchical levels of abstraction. Most models focus on a particular level of abstraction, making them unable to model both bottom-up and top-down processes. Moreover, we do not know how the brain grounds symbols in perceptions, nor how these symbols emerge throughout development. Experimental evidence suggests that perception and action shape one another (e.g. motor areas are activated during speech perception), but the precise mechanisms involved in this action-perception shaping at various levels of abstraction are still largely unknown. My previous and current work includes the modelling of language comprehension, language acquisition from a robotic perspective, sensorimotor models, and extended models of Reservoir Computing for working memory and hierarchical processing. I propose to create a new generation of neural-based computational models of language processing and production; to use biologically plausible learning mechanisms relying on recurrent neural networks; to create novel sensorimotor mechanisms to account for action-perception shaping; to build hierarchical models from the sensorimotor level to the sentence level; and to embody such models in robots.

    From Phonemes to Robot Commands with a Neural Parser

    The understanding of how children acquire language [1][2], from phonemes to syntax, could be improved by computational models, in particular when they are integrated in robots [3]: e.g. by interacting with users [4] or grounding language cues [5]. Recently, speech recognition systems have greatly improved thanks to deep learning. However, for specific domain applications like human-robot interaction, generic recognition tools such as the Google API often return words that are unknown to the robotic system, when not simply irrelevant [6]. Additionally, such recognition systems do not tell us much about how our brains acquire or process these phonemes, words or grammatical constructions (i.e. sentence templates). Moreover, to our knowledge they do not provide useful tools for learning from the small corpora from which a child may bootstrap. Here, we propose a neuro-inspired approach that processes sentences word by word, or phoneme by phoneme, with no prior knowledge of the semantics of the words. Previously, we demonstrated that this RNN-based model was able to generalize over grammatical constructions [7], even with unknown words (i.e. words outside the vocabulary of the training data) [8]. In this preliminary study, in order to overcome word misrecognition, we test whether the same architecture can solve the same task by processing phonemes directly instead of grammatical constructions [9]. Applied to a small corpus, the model shows similar (if slightly weaker) performance when using phonemes as inputs instead of grammatical constructions. We speculate that this phoneme version could outperform the previous model when dealing with real, noisy phoneme inputs, thus improving its performance in real-time human-robot interaction.
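In reservoir models of this family, the only trained component is a linear readout from the recorded reservoir states, typically fit in closed form by ridge regression. A sketch with synthetic placeholder data (the dimensions and targets are invented, not the paper's corpus):

```python
import numpy as np

# Ridge-regression readout: map recorded reservoir states X to teacher
# outputs Y. All data here is random stand-in material for illustration.
rng = np.random.default_rng(1)
T, n_res, n_out = 200, 50, 3          # timesteps, reservoir size, outputs

X = rng.standard_normal((T, n_res))   # stand-in for reservoir states
Y = rng.standard_normal((T, n_out))   # stand-in for target role labels

ridge = 1e-2                          # Tikhonov regularization strength
# Closed-form solution: W_out = (X^T X + ridge * I)^{-1} X^T Y
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)

Y_hat = X @ W_out                     # readout predictions
mse = float(np.mean((Y_hat - Y) ** 2))
print(W_out.shape)  # (50, 3)
```

The regularization term keeps the readout stable even when reservoir states are highly correlated, which matters for the small corpora the abstract emphasizes.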

    Learning to Parse Grounded Language using Reservoir Computing

    Recently, new models for language processing and learning using Reservoir Computing have become popular. However, these models are typically not grounded in sensorimotor systems or robots. In this paper, we develop a Reservoir Computing model called Reservoir Parser (ResPars) for learning to parse natural language from grounded data coming from humanoid robots. Previous work showed that ResPars is able to perform syntactic generalization over different sentences (surface structure) with the same meaning (deep structure). We argue that this ability is key to guiding linguistic generalization in a grounded architecture. We show that ResPars is able to generalize over grounded compositional semantics by combining it with Incremental Recruitment Language (IRL). Additionally, we show that ResPars can learn to generalize over the same sentences processed not word by word, but as an unsegmented sequence of phonemes. This ability enables the architecture to rely not only on the words recognized by a speech recognizer, but to process the sub-word level directly. We additionally test the model's robustness to word recognition errors.
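The "unsegmented sequence of phonemes" input can be illustrated by removing word boundaries from a transcription, so the parser must segment implicitly. The toy lexicon below is invented for illustration:

```python
# Same command, with and without word boundaries (toy transcription).
lexicon = {"grasp": ["g", "r", "A:", "s", "p"], "the": ["D", "@"],
           "toy": ["t", "OI"]}
words = ["grasp", "the", "toy"]

segmented = [lexicon[w] for w in words]               # word-level input
unsegmented = [p for w in words for p in lexicon[w]]  # raw phoneme stream

print(segmented)    # [['g', 'r', 'A:', 's', 'p'], ['D', '@'], ['t', 'OI']]
print(unsegmented)  # ['g', 'r', 'A:', 's', 'p', 'D', '@', 't', 'OI']
```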

    Teach Your Robot Your Language! Trainable Neural Parser for Modelling Human Sentence Processing: Examples for 15 Languages

    We present a Recurrent Neural Network (RNN) that performs thematic role assignment and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (e.g. predicates). Previously, we showed that the model is able to generalize on English and French corpora. In this study, we investigate its ability to adapt to various languages originating from Asia or Europe. We show that it can successfully learn to parse sentences related to home scenarios in fifteen languages: English, German, French, Spanish, Catalan, Basque, Portuguese, Italian, Bulgarian, Turkish, Persian, Hindi, Marathi, Malay and Mandarin Chinese. Moreover, we deliberately included variably complex sentences in the corpora in order to explore the flexibility of the predicate-like output representations. This demonstrates that (1) the learning principle of our model is not limited to a particular language (or particular sentence structures) but is more generic in nature, and (2) it can deal with various kinds of representations (not only predicates), which enables users to adapt it to their own needs. As the model is inspired by neuroscience and language acquisition theories, this generic and language-independent aspect makes it a good candidate for modelling human sentence processing. It is especially relevant when the model is implemented in grounded multimodal robotic architectures.
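Thematic role assignment, as described here, amounts to reading out which role each content word fills. A sketch of decoding such a predicate-like output, where the role names, words, and activation values are invented for illustration (they are not the paper's corpora or actual readout):

```python
# Decode thematic roles from hypothetical readout activations: one score
# per role for each content word; the active role is the argmax.
roles = ["action", "object", "location"]
# Invented activations for the content words of "put the ball on the left":
activations = {
    "put":  [0.9, 0.1, 0.0],
    "ball": [0.2, 0.8, 0.1],
    "left": [0.0, 0.3, 0.7],
}

def decode(acts):
    """Assign each content word the role with maximal activation."""
    return {w: roles[max(range(len(roles)), key=scores.__getitem__)]
            for w, scores in acts.items()}

print(decode(activations))
# {'put': 'action', 'ball': 'object', 'left': 'location'}
```

Because the output is just a vector of role scores per word, swapping in a different set of roles (or a different representation entirely) only changes the readout targets, which is what makes the scheme adaptable across the fifteen languages.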

    Recurrent Neural Network for Syntax Learning with Flexible Representations

    We present a Recurrent Neural Network (RNN), namely an Echo State Network (ESN), that performs sentence comprehension and can be used for Human-Robot Interaction (HRI). The RNN is trained to map sentence structures to meanings (e.g. predicates). We have previously shown that this ESN is able to generalize to unknown sentence structures in English and French. The meaning representations it can learn to produce are flexible: one can use any kind of "series of slots" (or, more generally, any vector representation), not only predicates. Moreover, preliminary work has shown that the model can be trained fully incrementally, which enables the exploration of language acquisition in a developmental approach. Furthermore, an "inverse" version of the model has also been studied, which produces sentence structures from meaning representations. Therefore, if these two models are combined in the same agent, one can investigate the emergence of language (and in particular syntax) through agent-based simulations. The model has been encapsulated in a ROS module, which enables its use in a cognitive robotic architecture or in a distributed agent simulation.
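The "series of slots" idea can be sketched as a flat vector with one block of word units per slot; encoding and decoding are exact inverses, which is the property the comprehension and production ("inverse") directions would share. The vocabulary and slot names below are invented for illustration:

```python
import numpy as np

# A "series of slots" meaning vector: the meaning put(ball, left) flattened
# into one block of one-hot word units per slot (toy vocabulary and slots).
vocab = ["put", "ball", "left", "grasp", "toy"]
slots = ["predicate", "arg1", "arg2"]

def encode(meaning):
    """meaning: dict slot -> word; returns a flat one-hot-per-slot vector."""
    v = np.zeros(len(slots) * len(vocab))
    for i, s in enumerate(slots):
        if s in meaning:
            v[i * len(vocab) + vocab.index(meaning[s])] = 1.0
    return v

def decode(v):
    """Inverse mapping: recover the word filling each active slot."""
    out = {}
    for i, s in enumerate(slots):
        block = v[i * len(vocab):(i + 1) * len(vocab)]
        if block.max() > 0.5:
            out[s] = vocab[int(block.argmax())]
    return out

m = {"predicate": "put", "arg1": "ball", "arg2": "left"}
assert decode(encode(m)) == m
print(encode(m).shape)  # (15,)
```

Any vector of this shape works as a training target, so users are not tied to predicate notation, consistent with the flexibility claim in the abstract.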

    From Phonemes to Sentence Comprehension: A Neurocomputational Model of Sentence Processing for Robots

    There has been important progress in recent years in speech recognition systems. The word recognition error rate went down with the arrival of deep learning methods. However, if one uses a cloud speech API and integrates it inside a robotic architecture, one faces a non-negligible number of wrongly recognized sentences. Thus speech recognition cannot be considered solved (because many sentences are ambiguous out of context). We believe that contextual solutions (i.e. adaptable and trainable on different HRI applications) have to be found. In this perspective, the way children learn language and how our brains process utterances may help us improve how robots process language. Taking inspiration from language acquisition theories and from how the brain processes sentences, we previously developed a neuro-inspired model of sentence processing. In this study, we investigate how this model can process different levels of abstraction as input: sequences of phonemes, sequences of words, or grammatical constructions. We find that even though the model had only been tested on grammatical constructions before, it performs better with word and phoneme inputs.