17 research outputs found
Design and Evaluation of Controller-based Raycasting Methods for Efficient Alphanumeric and Special Character Entry in Virtual Reality
Alphanumeric and special characters are essential during text entry. Text entry in virtual reality (VR) is usually performed on a virtual Qwerty keyboard to minimize the need to learn new layouts. As such, entering capitals, symbols, and numbers in VR is often a direct migration from a physical/touchscreen Qwerty keyboard, that is, using mode-switching keys to switch between different types of characters and symbols. However, there are inherent differences between a keyboard in VR and a physical/touchscreen keyboard, so a direct adaptation of mode switching via switch keys may not be suitable for VR. The high flexibility afforded by VR opens up more possibilities for entering alphanumeric and special characters using the Qwerty layout. In this work, we designed two controller-based raycasting text entry methods for alphanumeric and special character input (Layer-ButtonSwitch and Key-ButtonSwitch) and compared them with two other methods (Standard Qwerty Keyboard and Layer-PointSwitch) derived from physical and soft Qwerty keyboards. We explored the performance and user preference of these four methods via two user studies (one short-term and one of prolonged use), in which participants were instructed to input text containing alphanumeric and special characters. Our results show that Layer-ButtonSwitch led to the highest performance, a statistically significant difference, followed by Key-ButtonSwitch and the Standard Qwerty Keyboard, while Layer-PointSwitch had the slowest speed. With continuous practice, participants' performance with Key-ButtonSwitch reached that of Layer-ButtonSwitch. Further, the results show that the key-level layout used in Key-ButtonSwitch let users perform mode-switching and character-input operations in parallel because this layout shows all characters on one layer. We distill three recommendations from the results that can help guide the design of text entry techniques for alphanumeric and special characters in VR.
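As a concrete illustration of the layer-based mode switching the paper compares, the sketch below models a layered Qwerty keyboard in Python. It is a minimal, hypothetical sketch: the layer contents, key indexing, and the `LayeredKeyboard` class are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the layer-switching logic shared by the
# Layer-PointSwitch and Layer-ButtonSwitch methods: the keyboard holds
# several character layers, and a switch action changes the active layer
# before a raycast key press is resolved.

from dataclasses import dataclass, field

LAYERS = {
    "lower":  "abcdefghijklmnopqrstuvwxyz",
    "upper":  "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "symbol": "0123456789!@#$%^&*()-_=+.,",
}

@dataclass
class LayeredKeyboard:
    active: str = "lower"
    buffer: list = field(default_factory=list)

    def switch_layer(self, layer: str) -> None:
        # Layer-PointSwitch triggers this by pointing at a soft switch key;
        # Layer-ButtonSwitch binds it to a controller button, saving the
        # ray travel to and from the switch key.
        self.active = layer

    def press(self, key_index: int) -> None:
        # Resolve the raycast hit against the currently active layer.
        self.buffer.append(LAYERS[self.active][key_index])

kb = LayeredKeyboard()
kb.press(7)                  # 'h' on the lower-case layer
kb.switch_layer("symbol")
kb.press(10)                 # '!' on the symbol layer
print("".join(kb.buffer))    # -> h!
```

Key-ButtonSwitch, by contrast, shows all characters on a single key-level layer, so a button press only selects which character of the focused key is committed; this is what lets switching and pointing proceed in parallel.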
XAIR: A Framework of Explainable AI in Augmented Reality
Explainable AI (XAI) has established itself as an important component of AI-driven interactive systems. With Augmented Reality (AR) becoming more integrated into daily life, the role of XAI also becomes essential in AR because end-users will frequently interact with intelligent services. However, it is unclear how to design effective XAI experiences for AR. We propose XAIR, a design framework that addresses "when", "what", and "how" to provide explanations of AI output in AR. The framework was based on a multi-disciplinary literature review of XAI and HCI research, a large-scale survey probing 500+ end-users' preferences for AR-based explanations, and three workshops with 12 experts collecting their insights about XAI design in AR. XAIR's utility and effectiveness were verified via a study with 10 designers and another study with 12 end-users. XAIR can provide guidelines for designers, inspiring them to identify new design opportunities and achieve effective XAI designs in AR.
Comment: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems
Machine-Learning-Enabled Gestural Interaction in Mixed Reality
Mixed Reality (MR) describes the seamless blending of a physical environment with a digitally generated one, creating an integrated space where real and virtual elements coexist and interact. With commercial headsets such as the Microsoft HoloLens and Meta Quest, MR has introduced the possibility of new computing and interaction paradigms. However, various factors have prevented these products from becoming widely adopted as everyday devices, and the lack of frictionless interaction is one of them. This thesis focuses on gestural interactions that do not require controllers and are instead performed with bare hands. Gestural interaction in MR is, however, still in its infancy, with much room for exploration and development. The work on gestural interaction in this thesis is partitioned into two parts: a mid-air gesture keyboard for text entry, and gesture recognition for eliciting commands; these are analogous to the keyboard and mouse of a personal computer. Machine learning has significantly advanced various technologies; thus, the central hypothesis of this thesis is that machine learning enables fast and accurate gestural interaction systems in Mixed Reality.
The design of machine-learning-enabled gestural interaction systems faces many challenges.
Standard machine learning models require a significant amount of training data so that complex patterns can be recognized in the data. Owing to the limited adoption of MR devices, and considering that novel interaction systems are being proposed, acquiring large amounts of data can be time-consuming and challenging. Data sparsity therefore poses the first challenge, which leads to Research Question 1: How can generative machine learning models be used to synthesize skeletal gesture data? This thesis proposes a novel model, the Imaginative Generative Adversarial Network (GAN), to automatically synthesize skeleton-based hand gesture data for augmenting the training of gesture classification models. The results demonstrate that the proposed model trains quickly and can enhance classification accuracy compared to conventional data augmentation strategies. Furthermore, this model is extended to generate trajectory data for mid-air gesture keyboards; it is compared with other generative models, and the comparative advantages of each are discussed.
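For intuition, the PyTorch sketch below shows the general shape of GAN-based augmentation for scarce skeleton data. It is a toy stand-in, not the Imaginative GAN: the frame/joint counts, network sizes, and training loop are all assumptions.

```python
# Toy GAN for skeleton-gesture augmentation (illustrative only). A gesture
# sample is flattened to a vector of shape (frames * joints * 3); the
# generator maps noise to synthetic gestures, the discriminator scores them.

import torch
import torch.nn as nn

FRAMES, JOINTS, DIM = 30, 21, 3          # assumed hand-skeleton format
SAMPLE = FRAMES * JOINTS * DIM
NOISE = 64

G = nn.Sequential(nn.Linear(NOISE, 256), nn.ReLU(),
                  nn.Linear(256, SAMPLE), nn.Tanh())
D = nn.Sequential(nn.Linear(SAMPLE, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real: torch.Tensor) -> None:
    batch = real.size(0)
    fake = G(torch.randn(batch, NOISE))

    # Discriminator: push real samples toward 1, synthetic toward 0.
    opt_d.zero_grad()
    loss_d = (bce(D(real), torch.ones(batch, 1)) +
              bce(D(fake.detach()), torch.zeros(batch, 1)))
    loss_d.backward()
    opt_d.step()

    # Generator: try to fool the discriminator.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(batch, 1))
    loss_g.backward()
    opt_g.step()

# After training, synthetic samples can augment a small training set:
with torch.no_grad():
    augmented = G(torch.randn(100, NOISE)).view(100, FRAMES, JOINTS, DIM)
```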
Designing a mid-air gesture keyboard involves both user interface and user experience design.
MR introduces a vast design space through its high degree of interactivity and expansive display area. Consequently, the optimal size of a mid-air gesture keyboard varies among users owing to differences in gesture motions and preferences. Thus, Research Question 2 is: How can machine learning-based optimization methods adaptively personalize the size of a mid-air gesture keyboard? This thesis proposes a multi-objective Bayesian optimization approach for adapting the layout size of a mid-air gesture keyboard to individual users. The results demonstrate that this process can achieve a 14.4% improvement in speed and a 13.8% improvement in accuracy relative to a baseline design with a constant size.
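A simplified sketch of the idea follows, using scikit-optimize's `gp_minimize`. The thesis uses a genuinely multi-objective formulation; here speed and accuracy are scalarized into one objective for brevity, and `measure_typing`, the size ranges, and the weights are assumed placeholders.

```python
# Illustrative Bayesian optimization of keyboard size (assumptions labeled).
import random
from skopt import gp_minimize

def measure_typing(width_cm: float, height_cm: float):
    # Placeholder: in practice, run a typing trial with the user at this
    # keyboard size and return the measured (speed_wpm, accuracy).
    speed = 25 - 0.05 * (width_cm - 30) ** 2 + random.gauss(0, 0.5)
    accuracy = 0.95 - 0.002 * (height_cm - 12) ** 2
    return speed, accuracy

def objective(params):
    width, height = params
    speed, accuracy = measure_typing(width, height)
    # Negate because gp_minimize minimizes; the weights set the trade-off.
    return -(0.5 * speed / 40.0 + 0.5 * accuracy)

result = gp_minimize(
    objective,
    dimensions=[(15.0, 45.0),   # keyboard width in cm (assumed range)
                (6.0, 18.0)],   # keyboard height in cm (assumed range)
    n_calls=20,                 # number of user trials
    random_state=0,
)
best_width, best_height = result.x
print(f"personalized size: {best_width:.1f} x {best_height:.1f} cm")
```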
Furthermore, MR products face challenges related to inaccurate, high-latency tracking and the lack of haptic feedback. These issues lower text entry accuracy and speed on mid-air gesture keyboards, resulting in a sub-optimal user experience. Therefore, Research Question 3 is: How can machine learning methods accurately decode gesture trajectories into intended text on a mid-air gesture keyboard, thereby facilitating innovative designs that enhance user experience? This thesis introduces a novel gesture trajectory decoding model that robustly translates users' three-dimensional fingertip trajectories into their intended text. This accurate decoding model enables the investigation of innovative open-loop interaction designs, including the removal of visual feedback and the relaxation of delimitation thresholds.
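For intuition, the sketch below decodes a 3D fingertip trajectory with a bidirectional LSTM and greedy CTC decoding. This is an assumed stand-in architecture, not the thesis' decoding model; training such a model would pair trajectories with transcripts under a loss like `nn.CTCLoss`.

```python
# Assumed trajectory-to-text decoder: per-frame character logits from an
# RNN over (time, 3) fingertip positions, collapsed with greedy CTC rules.

import torch
import torch.nn as nn

CHARS = "abcdefghijklmnopqrstuvwxyz"
BLANK = 0                                  # CTC blank index

class TrajectoryDecoder(nn.Module):
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.rnn = nn.LSTM(3, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, len(CHARS) + 1)  # +1 for blank

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(traj)              # (batch, time, 2*hidden)
        return self.out(h)                 # per-frame character logits

model = TrajectoryDecoder()
logits = model(torch.randn(1, 200, 3))     # one 200-frame trajectory

# Greedy CTC decoding: collapse repeated labels, then drop blanks.
ids = logits.argmax(-1).squeeze(0).tolist()
text, prev = [], BLANK
for i in ids:
    if i != prev and i != BLANK:
        text.append(CHARS[i - 1])
    prev = i
print("".join(text))
```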
Lastly, gesture recognition is a complex task that requires recognizing noisy and intricate gesture data in real time. Moreover, for those without experience in machine learning, applying these techniques to gesture recognition can be a challenge. Consequently, Research Question 4 is: How can machine learning models perform accurate gesture recognition in real time while empowering non-experts to use this technology? This thesis proposes a key gesture spotting architecture that comprises a novel gesture classifier model and a novel single-time activation algorithm. The architecture is evaluated on four separate skeleton-based hand gesture datasets and achieves high recognition accuracy with early detection. Furthermore, various data processing and augmentation strategies, along with the proposed key gesture spotting architecture, are encapsulated into an interactive application, which demonstrates high usability in a user study.
Tinkerable AAC Keyboard
This is the source code of the Tinkerable AAC Keyboard, which derives from a study on helping non-speaking individuals with motor impairments communicate with other people. The software can also be used for human interaction with machine-generated text.
The code is also maintained on GitHub: https://github.com/TuringFallAsleep/Tinkerable-AAC-Keyboar
Gesture Spotter: A Rapid Prototyping Tool for Key Gesture Spotting in Virtual and Augmented Reality Applications.
In this paper, we examine the task of key gesture spotting: accurate and timely online recognition of hand gestures. We specifically seek to address two key challenges faced by developers when integrating key gesture spotting functionality into their applications: i) achieving high accuracy and zero or negative activation lag with single-time activation; and ii) avoiding the requirement for deep domain expertise in machine learning. We address the first challenge by proposing a key gesture spotting architecture consisting of a novel gesture classifier model and a novel single-time activation algorithm. This architecture was evaluated on four separate hand skeleton gesture datasets and achieved high recognition accuracy with early detection. We address the second challenge by encapsulating different data processing and augmentation strategies, as well as the proposed key gesture spotting architecture, into a graphical user interface and an application programming interface. Two user studies demonstrate that developers are able to efficiently construct custom recognizers using both the graphical user interface and the application programming interface.
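To give a feel for what single-time activation requires, here is a toy heuristic in Python (an assumed illustration, not the paper's algorithm): the per-frame classifier's output stream fires each gesture's command at most once, as soon as the prediction has been confident and stable for a few consecutive frames.

```python
# Toy single-time activation: emit a gesture label exactly once per
# occurrence, then stay silent until confidence drops and a new gesture
# appears. Thresholds are assumed values for illustration.

from collections import deque

CONFIDENCE = 0.9      # assumed per-frame probability threshold
CONSECUTIVE = 5       # frames of agreement required before firing

class SingleTimeActivator:
    def __init__(self):
        self.history = deque(maxlen=CONSECUTIVE)
        self.fired = None                    # gesture already emitted

    def update(self, label: str, prob: float):
        """Feed one frame of classifier output; return a label at most once."""
        self.history.append(label if prob >= CONFIDENCE else None)
        stable = (len(self.history) == CONSECUTIVE
                  and len(set(self.history)) == 1)
        current = self.history[0] if stable else None
        if stable and current is None:
            self.fired = None                # low-confidence stretch: re-arm
        if current is not None and current != self.fired:
            self.fired = current
            return current                   # activate exactly once
        return None

spotter = SingleTimeActivator()
stream = [("swipe", 0.95)] * 8 + [("none", 0.2)] * 6 + [("pinch", 0.97)] * 8
for label, p in stream:
    event = spotter.update(label, p)
    if event:
        print("activate:", event)            # -> swipe, then pinch
```

A waiting window like this trades a few frames of latency for stability; the paper's stated goal of zero or negative activation lag implies firing before the gesture even completes, which this simple heuristic does not attempt.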