Integration of real-time speech recognition and action in humanoid robots

Abstract

Human speech and visual data are two crucial channels of communication that help people interact with their surrounding environment. Both speech and visual inputs are therefore essential and should contribute to the robot's actions, promoting the use of the robot as a cognitive tool. Speech recognition and face recognition are two demanding areas of research: they represent two means by which intelligent behavior can be expressed. In this thesis, we investigate whether a robot can integrate visual and speech information to make decisions and perform actions accordingly. The iCub robot listens to real-time human speech from the user and points its finger at a person's face in an image as dictated by the user. In the following sections, we further discuss our methods, experimental results, and future work.
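As an illustration of the pipeline the abstract describes (real-time speech in, a pointing action out), the following is a minimal sketch, assuming Python with the speech_recognition package for transcription and OpenCV's Haar cascade for face detection; the point_at function is a hypothetical stand-in for the iCub motor command, and the ordinal-based command grammar is an illustrative assumption, not the thesis's method.

```python
# Sketch of the speech-to-action loop: transcribe one spoken command,
# detect faces in an image, and point at the face the command selects.
# Assumptions (not from the thesis): speech_recognition + OpenCV;
# `point_at` is a hypothetical placeholder for the iCub arm controller.

import cv2
import speech_recognition as sr


def listen_for_command(recognizer: sr.Recognizer) -> str:
    """Capture one utterance from the microphone and transcribe it."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio).lower()


def detect_faces(image):
    """Return face bounding boxes (x, y, w, h), ordered left to right."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return sorted(boxes, key=lambda box: box[0])


def point_at(x: int, y: int) -> None:
    """Hypothetical placeholder for the iCub pointing action."""
    print(f"Pointing at image coordinates ({x}, {y})")


if __name__ == "__main__":
    image = cv2.imread("group_photo.jpg")  # image presented to the robot
    if image is None:
        raise SystemExit("Could not load the input image.")
    faces = detect_faces(image)
    command = listen_for_command(sr.Recognizer())
    # Map a spoken ordinal (e.g. "point at the second person") to a face.
    ordinals = {"first": 0, "second": 1, "third": 2, "fourth": 3}
    for word, index in ordinals.items():
        if word in command and index < len(faces):
            x, y, w, h = faces[index]
            point_at(x + w // 2, y + h // 2)  # centre of the chosen face
            break
```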
