modular, adaptable software systems for vision-based human-computer interaction. The key idea of this approach is to use local visual interaction cues (VICs) on a video stream shared between the user and the machine. A VIC consists of a graphical representation (e.g., an icon) superimposed on the video stream (and thus visible to the user), associated image processing algorithms for activating the cue, and other application-specific code. The video stream can be monocular or stereo, enabling 2-D and 3-D interaction, and may be combined with speech or haptics to provide enhanced interaction capabilities. VICs are intended for situations where large-scale spatial motion, particularly hand-eye coordinated motion, is essential. For example, manipulatin