Vision-based interaction within a multimodal framework

Abstract

Our contribution is to the field of video-based interaction techniques and is integrated into the home environment of the EMBASSI project. This project addresses innovative methods of man-machine interaction achieved through the development of intelligent assistance and anthropomorphic user interfaces. Within this project, multimodal techniques are a basic requirement, especially those related to the integration of modalities. We use a stereoscopic approach to allow the natural selection of devices via pointing gestures. The pointing hand is segmented from the video images, and the 3D position and orientation of the forefinger are calculated. This modality is subsequently integrated with speech in the context of a multimodal interaction infrastructure. In a first phase, we use semantic fusion with amodal input, treating the modalities in a so-called late-fusion stage.
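The core geometric step described above — recovering a 3D fingertip position from two camera views — can be sketched as follows. This is a minimal illustration of rectified-stereo triangulation, not the paper's actual implementation; the camera parameters (focal length, baseline, principal point) and the helper names are hypothetical.

```python
# Hedged sketch: 3D point recovery from a matched pixel pair in a
# rectified stereo rig, as one might use for fingertip localization.
# All intrinsics below are illustrative placeholders.

def triangulate(x_left, x_right, y, f=700.0, baseline=0.12, cx=320.0, cy=240.0):
    """Triangulate a 3D point (metres, left-camera frame) from a rectified
    stereo correspondence. f: focal length in pixels; baseline: camera
    separation in metres; (cx, cy): principal point in pixels."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    z = f * baseline / disparity      # depth from disparity
    x = (x_left - cx) * z / f         # lateral offset
    y3 = (y - cy) * z / f             # vertical offset
    return (x, y3, z)

def pointing_direction(tip, knuckle):
    """Unit vector from knuckle to fingertip: the pointing ray used to
    intersect with candidate devices in the room."""
    d = [t - k for t, k in zip(tip, knuckle)]
    norm = sum(c * c for c in d) ** 0.5
    return tuple(c / norm for c in d)

# Triangulating both the fingertip and a second point on the forefinger
# yields the orientation needed to cast the pointing ray:
tip = triangulate(350.0, 320.0, 230.0)      # disparity 30 px -> z = 2.8 m
knuckle = triangulate(352.0, 320.0, 235.0)  # disparity 32 px -> z = 2.625 m
ray = pointing_direction(tip, knuckle)
```

In the integration stage, a ray like this would be fused with a concurrent speech event (e.g. "switch that on") to resolve the deictic reference to a device.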