
    To Draw or Not to Draw: Recognizing Stroke-Hover Intent in Gesture-Free Bare-Hand Mid-Air Drawing Tasks

    Over the past several decades, technological advancements have introduced new modes of communication with computers, driving a shift away from traditional mouse and keyboard interfaces. While touch-based interactions are widely used today, recent developments in computer vision, body-tracking stereo cameras, and augmented and virtual reality now enable communication with computers through spatial input in physical 3D space. These techniques are being integrated into design-critical tasks such as sketching and modeling through sophisticated methodologies and the use of specialized instrumented devices. One of the prime challenges in design research is to make this spatial interaction with the computer as intuitive as possible for users. Drawing curves in mid-air with fingers is a fundamental task with applications to 3D sketching, geometric modeling, handwriting recognition, and authentication. Sketching in general is a crucial mode of effective idea communication between designers. Mid-air curve input is typically accomplished through instrumented controllers, specific hand postures, or pre-defined hand gestures, in the presence of depth- and motion-sensing cameras. The user may use any of these modalities to express the intention to start or stop sketching. However, apart from suffering from a lack of robustness, the use of such gestures, specific postures, or instrumented controllers for design-specific tasks places an additional cognitive load on the user. To address the problems associated with different mid-air curve input modalities, the presented research discusses the design, development, and evaluation of data-driven models for intent recognition in non-instrumented, gesture-free, bare-hand mid-air drawing tasks. The research is motivated by a behavioral study that demonstrates the need for such an approach, given the lack of robustness and intuitiveness of hand postures and instrumented devices. The main objective is to study how users move during mid-air sketching, develop qualitative insights regarding such movements, and consequently implement a computational approach to determine when the user intends to draw in mid-air without an explicit mechanism (such as an instrumented controller or a specified hand posture). The idea is to record the user's hand trajectory and classify each recorded point as either hover or stroke; the resulting model allows for the classification of points along the user's spatial trajectory. Drawing inspiration from the way users sketch in mid-air, this research first establishes the need for an alternate approach for processing bare-hand mid-air curves in a continuous fashion. It then presents a novel drawing-intent recognition workflow applied to every recorded drawing point, using three different approaches. We begin by recording mid-air drawing data and developing a classification model based on the geometric properties extracted from the recorded data; the main goal of this model is to identify drawing intent from critical geometric and temporal features. In the second approach, we explore the variations in the model's prediction quality when the dimensionality of the data used as mid-air curve input is increased. Finally, in the third approach, we seek to understand drawing intention from mid-air curves using dimensionality-reduction neural networks such as autoencoders. The broader implications of this research are then discussed, along with potential development areas in the design and research of mid-air interactions.
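
    As a rough illustration of the first approach described above (classifying each recorded trajectory point as hover or stroke from geometric and temporal features), the sketch below derives per-point speed, acceleration, and turning angle from a 3D hand trajectory and feeds them to an off-the-shelf classifier. The feature set and the use of scikit-learn's RandomForestClassifier are assumptions for illustration, not the models developed in the thesis.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def point_features(points, timestamps):
    """Per-point geometric/temporal features from a 3D hand trajectory.

    points: (N, 3) array of hand positions, timestamps: (N,) seconds.
    Returns an (N, 3) feature matrix: speed, acceleration, turning angle.
    """
    vel = np.gradient(points, timestamps, axis=0)
    speed = np.linalg.norm(vel, axis=1)
    accel = np.gradient(speed, timestamps)
    # Turning angle between consecutive velocity vectors (0 for degenerate cases).
    dots = np.einsum("ij,ij->i", vel[:-1], vel[1:])
    norms = np.linalg.norm(vel[:-1], axis=1) * np.linalg.norm(vel[1:], axis=1)
    angle = np.zeros(len(points))
    valid = norms > 1e-9
    angle[1:][valid] = np.arccos(np.clip(dots[valid] / norms[valid], -1.0, 1.0))
    return np.column_stack([speed, accel, angle])

# Hypothetical labelled recordings: 1 = stroke (drawing), 0 = hover.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(point_features(train_points, train_times), train_labels)
# hover_or_stroke = clf.predict(point_features(new_points, new_times))
```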

    Gestural interaction in Virtual Environments: user studies and applications

    With currently available technology, there has been increased interest in the development of virtual reality (VR) applications, some of which have already become commercial products. This has also raised many issues related to their usability. One of the main challenges is the design of interfaces for interacting with these immersive virtual environments (IVEs), in particular for setups relying on hand tracking as a means of input, and in general for devices that stray from standard choices such as keyboards and mice. Finding appropriate ways to deal with these usability issues is key to the success of VR applications, not only for entertainment but also for professional applications in the medical and engineering fields. Research on this topic has been quite active in recent years, and several problems have been identified as relevant to achieving satisfying usability results. This research project aims to tackle some of the critical aspects closely related to interaction in IVEs. It focuses first on object manipulation: an initial evaluation allowed us to highlight the nature of some critical issues, from which we derived design guidelines for manipulation techniques; we then proposed a number of solutions and performed user tests to validate them. Secondly, we tried to verify the feasibility of gestural interfaces in applications by developing and testing a gesture recognition algorithm based on 3D hand trajectories. This work showed promising results in terms of accuracy, hinting at the possibility of a reliable gestural interface for VR applications.
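
    The abstract does not spell out the recognition algorithm; as a minimal sketch of one common way to classify 3D hand trajectories, the code below does nearest-template matching under dynamic time warping (DTW). The template-based formulation and the Euclidean local distance are assumptions for illustration, not necessarily the thesis' method.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 3D trajectories (N, 3) and (M, 3)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def classify(trajectory, templates):
    """Return the label of the recorded template trajectory closest under DTW.

    templates: list of (label, (N, 3) trajectory) pairs, one or more per gesture.
    """
    return min(templates, key=lambda t: dtw_distance(trajectory, t[1]))[0]
```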

    Dynamic motion coupling of body movement for input control

    Touchless gestures are used for input when touch is unsuitable or unavailable, such as when interacting with displays that are remote, large, or public, or when touch is prohibited for hygienic reasons. Traditionally, user input is spatially or semantically mapped to system output; however, in the context of touchless gestures these interaction principles suffer from several disadvantages, including poor memorability, fatigue, and ill-defined mappings. This thesis investigates motion correlation as the third interaction principle for touchless gestures, which maps user input to system output based on spatiotemporal matching of reproducible motion. We demonstrate the versatility of motion correlation by using movement as the primary sensing principle, relaxing the restrictions on how a user provides input. Using TraceMatch, a novel computer-vision-based system, we show how users can provide effective input, through an investigation of input performance with different parts of the body, and how users can switch modes of input spontaneously in realistic application scenarios. Secondly, spontaneous spatial coupling shows how motion correlation can bootstrap spatial input, allowing any body movement, or movement of tangible objects, to be appropriated for ad hoc touchless pointing on a per-interaction basis. We operationalise the concept in MatchPoint, and demonstrate its unique capabilities through an exploration of the design space with application examples. Thirdly, we explore how users synchronise with moving targets in the context of motion correlation, revealing how simple harmonic motion leads to better synchronisation. Using the insights gained, we explore the robustness of the algorithms used for motion correlation, showing how it is possible to detect a user's intent to interact whilst suppressing accidental activations from common spatial and semantic gestures. Finally, we look across our work to distil guidelines for interface design, and further considerations of how motion correlation can be used, both in general and for touchless gestures.
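
    Motion correlation, as described above, matches user movement against the motion of displayed targets over time. The sketch below shows the core matching step in a simplified form: correlate a tracked point's recent motion with each target's motion and fire a selection when the correlation exceeds a threshold. The window representation, per-axis Pearson correlation, and the 0.8 threshold are illustrative assumptions, not TraceMatch's actual parameters.

```python
import numpy as np

def correlation_score(user_xy, target_xy):
    """Mean Pearson correlation of the x and y components of two motion windows.

    user_xy, target_xy: (N, 2) arrays of positions sampled at the same times.
    """
    scores = []
    for axis in range(2):
        u, t = user_xy[:, axis], target_xy[:, axis]
        if u.std() < 1e-6 or t.std() < 1e-6:  # no motion on this axis
            scores.append(0.0)
        else:
            scores.append(np.corrcoef(u, t)[0, 1])
    return float(np.mean(scores))

def select_target(user_window, target_windows, threshold=0.8):
    """Return the index of the target whose motion best matches the user's, or None."""
    scores = [correlation_score(user_window, t) for t in target_windows]
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else None
```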

    Towards Naturalistic Interfaces of Virtual Reality Systems

    Interaction plays a key role in achieving a realistic experience in virtual reality (VR). Its realization depends on interpreting the intent of human motions to provide input to VR systems. Thus, understanding human motion from a computational perspective is essential to the design of naturalistic interfaces for VR. This dissertation studied three types of human motion in the context of VR: locomotion (walking), head motion, and hand motion. For locomotion, the dissertation presented a machine learning approach for developing a mechanical repositioning technique based on a 1-D treadmill for interacting with a unique new large-scale projective display, called the Wide-Field Immersive Stereoscopic Environment (WISE). The usability of the proposed approach was assessed through a novel user study that asked participants to pursue a rolling ball at variable speed in a virtual scene. In addition, the dissertation studied the role of stereopsis in avoiding virtual obstacles while walking by asking participants to step over obstacles and gaps under both stereoscopic and non-stereoscopic viewing conditions in VR experiments. For head motion, the dissertation presented a head gesture interface for interaction in VR that recognizes real-time head gestures on head-mounted displays (HMDs) using Cascaded Hidden Markov Models. Two experiments were conducted to evaluate the proposed approach: the first assessed its offline classification performance, while the second estimated the latency of the algorithm in recognizing head gestures. The dissertation also conducted a user study that investigated the effects of visual and control latency on teleoperation of a quadcopter using head motion tracked by a head-mounted display; as part of this study, a method for objectively estimating the end-to-end latency of HMDs was presented. For hand motion, the dissertation presented an approach that recognizes dynamic hand gestures to implement a hand gesture interface for VR, based on a static head gesture recognition algorithm. The proposed algorithm was evaluated offline in terms of its classification performance, and a user study compared the performance and usability of the head gesture interface, the hand gesture interface, and a conventional gamepad interface for answering Yes/No questions in VR. Overall, the dissertation makes two main contributions towards improving the naturalism of interaction in VR systems. First, the interaction techniques presented can be directly integrated into existing VR systems, offering end users of VR technology more choices for interaction. Second, the results of the user studies of the presented VR interfaces serve as guidelines for VR researchers and engineers designing future VR systems.
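
    The dissertation's head gesture recognizer uses Cascaded Hidden Markov Models; the cascade itself is not reproduced here, but the sketch below shows the per-class HMM recipe such a recognizer typically builds on: train one Gaussian HMM per gesture on head-orientation sequences and label a new sequence by the highest log-likelihood. The hmmlearn library, the choice of features, and the number of hidden states are assumptions for illustration.

```python
import numpy as np
from hmmlearn import hmm

def train_gesture_models(sequences_by_label, n_states=4):
    """Fit one Gaussian HMM per gesture class.

    sequences_by_label: dict mapping label -> list of (T_i, D) observation arrays,
    e.g. head yaw/pitch/roll sampled from an HMD.
    """
    models = {}
    for label, seqs in sequences_by_label.items():
        X = np.vstack(seqs)                       # concatenated observations
        lengths = [len(s) for s in seqs]          # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[label] = m
    return models

def classify_gesture(models, sequence):
    """Label a (T, D) observation sequence by maximum log-likelihood."""
    return max(models, key=lambda label: models[label].score(sequence))
```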

    Understanding egocentric human actions with temporal decision forests

    Understanding human actions is a fundamental task in computer vision with a wide range of applications, including pervasive health-care, robotics, and game control. This thesis focuses on the problem of egocentric action recognition from RGB-D data, wherein the world is viewed through the eyes of the actor whose hands describe the actions. The main contributions of this work are its findings regarding egocentric actions as described by hands in two application scenarios, and the proposal of a new technique based on temporal decision forests. The thesis first introduces a novel framework to recognise fingertip writing in mid-air in the context of human-computer interaction. This framework detects whether the user is writing and tracks the fingertip over time to generate spatio-temporal trajectories, which are recognised using a Hough forest variant that encourages temporal consistency in prediction. A problem with using such a forest approach for action recognition is that the learning of temporal dynamics is limited to hand-crafted temporal features and temporal regression, which may break temporal continuity and lead to inconsistent predictions. To overcome this limitation, the thesis proposes transition forests. Besides any temporal information encoded in the feature space, the forest automatically learns the temporal dynamics during training and exploits them at inference time in an online and efficient manner, achieving state-of-the-art results. The last contribution of this thesis is the introduction of the first RGB-D benchmark for the study of egocentric hand-object actions with both hand and object pose annotations. This study conducts an extensive evaluation of different baselines, state-of-the-art approaches, and temporal decision forest models using colour, depth, and hand pose features. Furthermore, it extends the transition forest model to incorporate data from different modalities and demonstrates the benefit of using hand pose features to recognise egocentric human actions. The thesis concludes by discussing and analysing the contributions and proposing a few ideas for future work.
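
    The thesis' transition forests learn temporal dynamics inside the forest itself and are not reproduced here. As a much simpler stand-in, the sketch below classifies each frame with a standard random forest over a sliding window of hand-pose features and smooths the per-frame predictions by majority vote, roughly the hand-crafted-temporal-feature baseline the thesis improves upon. The window size, feature layout, and scikit-learn classifier are assumptions.

```python
from collections import Counter
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def windowed_features(frames, window=5):
    """Stack each frame's pose vector with its preceding `window - 1` frames.

    frames: (T, D) array of per-frame hand-pose features; returns (T, window * D).
    """
    padded = np.vstack([np.repeat(frames[:1], window - 1, axis=0), frames])
    return np.hstack([padded[i:i + len(frames)] for i in range(window)])

# Hypothetical training data: per-frame pose features and action labels.
clf = RandomForestClassifier(n_estimators=300, random_state=0)
# clf.fit(windowed_features(train_frames), train_labels)

def predict_actions(frames, smooth=9):
    """Per-frame action labels with majority-vote smoothing over `smooth` frames."""
    raw = clf.predict(windowed_features(frames))
    half = smooth // 2
    return [Counter(raw[max(0, t - half): t + half + 1]).most_common(1)[0][0]
            for t in range(len(raw))]
```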

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of the fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements, it is possible to recognize gestures, which are often used by people to communicate information non-verbally. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively, while the processing of body movements plays a key role in the action recognition and affective computing fields. The former is essential to understanding how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) is proposed for the recognition of sign language and semaphoric hand gestures; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, the last module provides a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them well suited to analysing body movements. All modules were tested on challenging datasets that are well known in the state of the art, showing remarkable results compared to current literature methods.
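
    The thesis' exact architectures (two-branch stacked LSTMs, etc.) are not reproduced here; the sketch below is a generic single-branch stacked LSTM classifier over skeleton-joint sequences in PyTorch, just to make the ingredients concrete. The layer sizes, joint count, and the use of the last hidden state as the sequence summary are assumptions.

```python
import torch
import torch.nn as nn

class SkeletonLSTM(nn.Module):
    """Stacked LSTM that maps a sequence of skeleton poses to a class label."""

    def __init__(self, n_joints=20, coords=3, hidden=128, n_layers=2, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_joints * coords, hidden_size=hidden,
                            num_layers=n_layers, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, n_joints * coords) flattened joint coordinates per frame.
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])          # logits from the last layer's final state

model = SkeletonLSTM()
dummy = torch.randn(4, 60, 20 * 3)         # batch of 4 sequences, 60 frames each
logits = model(dummy)                      # (4, 10) class scores
```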

    A vision-based approach for human hand tracking and gesture recognition.

    Hand gesture interfaces have become an active topic in human-computer interaction (HCI). The use of hand gestures in a human-computer interface enables operators to interact with computer environments in a natural and intuitive manner. In particular, bare-hand interpretation techniques free users from the cumbersome devices typically required to communicate with computers, thus offering ease and naturalness in HCI. Meanwhile, virtual assembly (VA) applies virtual reality (VR) techniques to mechanical assembly. It constructs computer tools to help product engineers plan, evaluate, optimize, and verify the assembly of mechanical systems without the need for physical objects. However, traditional devices such as keyboards and mice are no longer adequate due to their inefficiency in handling three-dimensional (3D) tasks, and special VR devices, such as data gloves, have been mandatory in VA. This thesis proposes a novel gesture-based interface for VA applications. It develops a hybrid approach that combines an appearance-based hand localization technique with a skin-tone filter to support gesture recognition and hand tracking in 3D space. With this interface, bare hands become a convenient substitute for special VR devices. Experimental results demonstrate the flexibility and robustness that the proposed method brings to HCI. Thesis (M.Sc.), University of Windsor (Canada), 2004. Adviser: Xiaobu Yuan.
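
    The hybrid localization method itself is not detailed in the abstract; as a rough illustration of the skin-tone-filter component it mentions, the sketch below masks skin-coloured pixels in HSV space with OpenCV and takes the largest connected region as the hand candidate. The HSV thresholds and the largest-contour heuristic are assumptions for illustration, not the thesis' calibrated method.

```python
import cv2
import numpy as np

# Approximate skin-tone bounds in HSV; real systems calibrate these per user/lighting.
SKIN_LOWER = np.array([0, 40, 60], dtype=np.uint8)
SKIN_UPPER = np.array([25, 180, 255], dtype=np.uint8)

def locate_hand(frame_bgr):
    """Return the centroid (x, y) of the largest skin-coloured region, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LOWER, SKIN_UPPER)
    # Clean up the mask with morphological opening and closing.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)
    m = cv2.moments(hand)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```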