thesis
Multimodal interactions in virtual environments using eye tracking and gesture control.
- Publication date
- Publisher
Abstract
Multimodal interactions provide users with more natural ways to interact with virtual environments than using traditional input methods. An emerging approach is gaze modulated pointing, which enables users to perform virtual content selection and manipulation conveniently through the use of a combination of gaze and other hand control techniques/pointing devices, in this thesis, mid-air gestures. To establish a synergy between the two modalities and evaluate the affordance of this novel multimodal interaction technique, it is important to understand their behavioural patterns and relationship, as well as any possible perceptual conflicts and interactive ambiguities. More specifically, evidence shows that eye movements lead hand movements but the question remains that whether the leading relationship is similar when interacting using a pointing device. Moreover, as gaze modulated pointing uses different sensors to track and detect user behaviours, its performance relies on users perception on the exact spatial mapping between the virtual space and the physical space. It raises an underexplored issue that whether gaze can introduce misalignment of the spatial mapping and lead to users misperception and interactive errors. Furthermore, the accuracy of eye tracking and mid-air gesture control are not comparable with the traditional pointing techniques (e.g., mouse) yet. This may cause pointing ambiguity when fine grainy interactions are required, such as selecting in a dense virtual scene where proximity and occlusion are prone to occur. This thesis addresses these concerns through experimental studies and theoretical analysis that involve paradigm design, development of interactive prototypes, and user study for verification of assumptions, comparisons and evaluations. Substantial data sets were obtained and analysed from each experiment. The results conform to and extend previous empirical findings that gaze leads pointing devices movements in most cases both spatially and temporally. It is testified that gaze does introduce spatial misperception and three methods (Scaling, Magnet and Dual-gaze) were proposed and proved to be able to reduce the impact caused by this perceptual conflict where Magnet and Dual-gaze can deliver better performance than Scaling. In addition, a coarse-to-fine solution is proposed and evaluated to compensate the degradation introduced by eye tracking inaccuracy, which uses a gaze cone to detect ambiguity followed by a gaze probe for decluttering. The results show that this solution can enhance the interaction accuracy but requires a compromise on efficiency. These findings can be used to inform a more robust multimodal inter- face design for interactions within virtual environments that are supported by both eye tracking and mid-air gesture control. This work also opens up a technical pathway for the design of future multimodal interaction techniques, which starts from a derivation from natural correlated behavioural patterns, and then considers whether the design of the interaction technique can maintain perceptual constancy and whether any ambiguity among the integrated modalities will be introduced