1,243 research outputs found

    Gaze modulated disambiguation technique for gesture control in 3D virtual objects selection

    © 2017 IEEE. Inputs with multimodal information provide more natural ways to interact with a virtual 3D environment. An emerging technique that integrates gaze-modulated pointing with mid-air gesture control enables fast target acquisition and rich control expressions. The performance of this technique relies on eye-tracking accuracy, which is not yet comparable to that of traditional pointing techniques (e.g., the mouse). This causes problems when fine-grained interactions are required, such as selecting in a dense virtual scene where proximity and occlusion are prone to occur. This paper proposes a coarse-to-fine solution that compensates for the degradation introduced by eye-tracking inaccuracy, using a gaze cone to detect ambiguity and then a gaze probe for decluttering. It was tested in a comparative experiment involving 12 participants and 3,240 runs. The results show that the proposed technique enhanced selection accuracy and user experience, although its efficiency can still be improved. This study contributes a robust multimodal interface design supported by both eye tracking and mid-air gesture control.
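The abstract does not specify how the gaze cone is implemented; a minimal sketch of cone-based ambiguity detection (all function and parameter names, and the default half-angle, are illustrative assumptions) might look like:

```python
import math

def objects_in_gaze_cone(gaze_origin, gaze_dir, objects, half_angle_deg=5.0):
    """Return the objects whose direction from the eye lies inside the gaze cone.

    More than one hit signals ambiguity, which would trigger the
    fine-grained "gaze probe" stage the abstract describes. The names
    and the 5-degree default are assumptions, not the paper's values.
    """
    cos_limit = math.cos(math.radians(half_angle_deg))
    # Normalize the gaze direction.
    norm = math.sqrt(sum(c * c for c in gaze_dir))
    g = tuple(c / norm for c in gaze_dir)
    hits = []
    for name, pos in objects:
        v = tuple(p - o for p, o in zip(pos, gaze_origin))
        vnorm = math.sqrt(sum(c * c for c in v))
        if vnorm == 0:
            continue
        # Compare the angle between the gaze ray and the object direction
        # against the cone half-angle, via the dot product.
        cos_angle = sum(a * b for a, b in zip(g, v)) / vnorm
        if cos_angle >= cos_limit:
            hits.append(name)
    return hits

# Two targets sit close together near the gaze ray; one is far off-axis.
targets = [("cube", (0.0, 0.0, 5.0)),
           ("sphere", (0.3, 0.0, 5.0)),
           ("cone", (3.0, 0.0, 5.0))]
hits = objects_in_gaze_cone((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), targets)
ambiguous = len(hits) > 1
```

With the two near-axis targets both falling inside the cone, `ambiguous` is true and a second, finer stage would be needed to resolve the selection.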

    Study of the interaction with a virtual 3D environment displayed on a smartphone

    3D Virtual Environments (3D VEs) are increasingly used in applications such as CAD, games, and teleoperation. Improvements in smartphone hardware performance have brought 3D applications to mobile devices as well. In addition, smartphones provide computing capabilities far beyond traditional voice communication, enabled by a wide variety of built-in sensors and by Internet connectivity. Consequently, interesting 3D applications can be designed by using the device's capabilities to interact with a 3D VE. Because smartphones have small, flat screens while a 3D VE is wide and dense, containing many targets of various sizes, mobile devices face several constraints when interacting with a 3D VE: environment density, target depth, and occlusion. The selection task must contend with all three problems to select a target. Moreover, the selection task can be decomposed into three subtasks: navigation, pointing, and validation. Researchers in 3D virtual environments have therefore developed new techniques and metaphors for 3D interaction in order to improve the usability of 3D applications on mobile devices, to support the selection task, and to address the factors affecting selection performance.
    In light of these considerations, this thesis presents a state of the art of existing selection techniques in 3D VEs and of selection techniques on smartphones. It organizes the selection techniques for 3D VEs around the three selection subtasks: navigation, pointing, and validation. It also describes disambiguation techniques, which allow a target to be selected from a set of pre-selected objects. It then presents interaction techniques described in the literature and designed for implementation on smartphones, divided into two groups: techniques performing two-dimensional selection tasks and techniques performing three-dimensional selection tasks. Finally, it covers techniques that use the smartphone as an input device. The thesis then discusses the problem of selection in a 3D VE displayed on a smartphone. It details the three identified selection problems: environment density, target depth, and occlusion. It then establishes the improvement offered by each existing technique in solving these problems, analysing the assets of the different techniques, the way they address the problems, and their advantages and drawbacks. Furthermore, it classifies the selection techniques for 3D VEs according to the three discussed problems (density, depth, and occlusion) affecting selection performance in a dense 3D VE. Except for video games, the use of 3D virtual environments on smartphones has not yet become widespread. This is due to the lack of interaction techniques for dense 3D VEs composed of many objects close to each other and displayed on a small, flat screen, and to the selection problems raised by displaying a 3D VE on a small screen rather than a large one. Accordingly, this thesis focuses on defining and describing the fruit of this study: the DichotoZoom interaction technique.
    It compares and evaluates the proposed technique against the Circulation technique suggested by the literature; the comparative analysis shows the effectiveness of DichotoZoom over its counterpart. DichotoZoom was then evaluated with the different interaction modalities available on smartphones. The thesis reports on the performance of the proposed selection technique under four interaction modalities: physical buttons, graphical buttons, gestural interaction via the touchscreen, and movement of the device itself. Finally, it lists our contributions to the field of 3D interaction techniques for dense 3D virtual environments displayed on small screens and proposes future work.
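The abstract does not detail how DichotoZoom subdivides the candidate set; the name suggests repeated dichotomous halving, which can be sketched as follows (the subdivision strategy, function names, and the simulated user input are all assumptions):

```python
def dichoto_select(targets, choose_first_half):
    """Narrow a candidate set by repeated halving, as a DichotoZoom-style
    technique might do. `choose_first_half` stands in for the user's input
    at each step (a physical button, graphical button, touch gesture, or
    device movement, per the four modalities the thesis evaluates).
    """
    candidates = list(targets)
    steps = 0
    while len(candidates) > 1:
        mid = (len(candidates) + 1) // 2
        first, second = candidates[:mid], candidates[mid:]
        # The user zooms into whichever half contains the target.
        candidates = first if choose_first_half(first, second) else second
        steps += 1
    return candidates[0], steps

# Simulate a user homing in on target "f" among 8 objects.
objs = list("abcdefgh")
target = "f"
picked, steps = dichoto_select(objs, lambda first, second: target in first)
```

The appeal of such a scheme on a small screen is that the number of user actions grows logarithmically with the number of candidates: 8 objects are resolved in 3 halving steps.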

    Deep Projective 3D Semantic Segmentation

    Semantic segmentation of 3D point clouds is a challenging problem with numerous real-world applications. While deep learning has revolutionized the field of image semantic segmentation, its impact on point cloud data has been limited so far. Recent attempts based on 3D deep learning approaches (3D-CNNs) have achieved below-expected results. Such methods require voxelization of the underlying point cloud data, leading to decreased spatial resolution and increased memory consumption. Additionally, 3D-CNNs greatly suffer from the limited availability of annotated datasets. In this paper, we propose an alternative framework that avoids the limitations of 3D-CNNs. Instead of directly solving the problem in 3D, we first project the point cloud onto a set of synthetic 2D images. These images are then used as input to a 2D-CNN designed for semantic segmentation. Finally, the obtained prediction scores are re-projected onto the point cloud to obtain the segmentation results. We further investigate the impact of multiple modalities, such as color, depth, and surface normals, in a multi-stream network architecture. Experiments are performed on the recent Semantic3D dataset. Our approach sets a new state of the art, achieving a relative gain of 7.9% compared to the previous best approach.
    Comment: Submitted to CAIP 201
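The final re-projection stage of this pipeline can be sketched compactly. The paper renders multiple synthetic views for a 2D-CNN; in this illustrative sketch, `score_image` stands in for one CNN score map, and the simple orthographic top-down projection is an assumption, not the paper's actual camera model:

```python
def project_scores_to_points(points, score_image, img_size, span=1.0):
    """Re-project per-pixel class scores back onto 3D points, the last
    stage of the projective segmentation pipeline. Returns one predicted
    class label per point.
    """
    labels = []
    for x, y, _z in points:
        # Map world coordinates in [-span, span] to pixel indices.
        px = min(img_size - 1, max(0, int((x + span) / (2 * span) * img_size)))
        py = min(img_size - 1, max(0, int((y + span) / (2 * span) * img_size)))
        scores = score_image[py][px]
        # Each point inherits the argmax class of the pixel it lands in.
        labels.append(max(range(len(scores)), key=lambda c: scores[c]))
    return labels

# A 2x2 score map over two classes (stand-in for 2D-CNN output).
score_image = [[[0.9, 0.1], [0.2, 0.8]],
               [[0.3, 0.7], [0.6, 0.4]]]
points = [(-0.5, -0.5, 0.0), (0.5, -0.5, 1.0),
          (-0.5, 0.5, 2.0), (0.5, 0.5, 3.0)]
labels = project_scores_to_points(points, score_image, img_size=2)
```

A real multi-view system would accumulate scores from several such projections per point before taking the argmax, which is how occlusions in any single view are compensated.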

    Computational interaction techniques for 3D selection, manipulation and navigation in immersive VR

    3D interaction provides a natural interplay for HCI. Many techniques involving diverse sets of hardware and software components have been proposed, generating an explosion of Interaction Techniques (ITes), Interactive Tasks (ITas), and input devices, and thus increasing the heterogeneity of tools in 3D User Interfaces (3DUIs). Moreover, most of those techniques are based on general formulations that fail to fully exploit human capabilities for interaction. This is because while 3D interaction enables naturalness, it also produces complexity and limitations when using 3DUIs. In this thesis, we aim to generate approaches that better exploit humans' high potential for interaction by combining human factors, mathematical formalizations, and computational methods. Our approach focuses on exploring the close coupling between specific ITes and ITas while addressing common issues of 3D interaction. We specifically focus on the stages of interaction within Basic Interaction Tasks (BITas), i.e., data input, manipulation, navigation, and selection. Common limitations of these tasks are: (1) the complexity of mapping generation for input devices; (2) fatigue in mid-air object manipulation; (3) space constraints in VR navigation; and (4) low accuracy in 3D mid-air selection. Along with two chapters of introduction and background, this thesis presents five main works. Chapter 3 focuses on the design of mid-air gesture mappings based on human tacit knowledge. Chapter 4 presents a solution to address user fatigue in mid-air object manipulation. Chapter 5 addresses space limitations in VR navigation. Chapter 6 describes an analysis and a correction method for the drift effects involved in scale-adaptive VR navigation; and Chapter 7 presents a hybrid 3D/2D technique that allows for precise selection of virtual objects in highly dense environments (e.g., point clouds).
    Finally, we conclude by discussing how the contributions obtained from this exploration provide techniques and guidelines to design more natural 3DUIs.

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of both Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for analysing people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields: the former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements. Both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first, a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) is proposed for the recognition of sign language and semaphoric hand gestures; the second module presents a solution based on 2D skeletons and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, the last module provides a solution for basic non-acted emotion recognition using 3D skeletons and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, which makes them suitable for analysing body movements. All the modules were tested on challenging datasets well known in the state of the art, showing remarkable results compared to current literature methods.
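The recurrent building block behind all three modules is the LSTM cell, whose gating is what lets it retain long-term context across a movement sequence. A minimal single-unit sketch over scalar inputs (the weight values and the toy "joint coordinate" sequence are illustrative assumptions; real models use vector states, stacked layers, and learned weights):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W):
    """One step of a single-unit LSTM. W maps each gate name to an
    (input weight, recurrent weight, bias) triple.
    """
    pre = {}
    for name in ("i", "f", "o", "g"):
        wx, wh, b = W[name]
        pre[name] = wx * x + wh * h + b
    i = sigmoid(pre["i"])        # input gate: how much new info to admit
    f = sigmoid(pre["f"])        # forget gate: how much old cell state to keep
    o = sigmoid(pre["o"])        # output gate: how much state to expose
    g = math.tanh(pre["g"])      # candidate cell value
    c_new = f * c + i * g        # cell state carries long-term context
    h_new = o * math.tanh(c_new) # hidden state is the step's output
    return h_new, c_new

# Run a short "movement" sequence (e.g. one joint coordinate over time)
# through the cell with small fixed weights.
W = {k: (0.5, 0.1, 0.0) for k in ("i", "f", "o", "g")}
h = c = 0.0
for x in [0.2, 0.4, 0.8]:
    h, c = lstm_step(x, h, c, W)
```

The forget gate's multiplicative path through `c` is what mitigates vanishing gradients over long sequences, which is the property the thesis relies on for modelling body movements.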

    Point-and-shake: selecting from levitating object displays

    Acoustic levitation enables a radical new type of human-computer interface composed of small levitating objects. For the first time, we investigate the selection of such objects, an important part of interaction with a levitating object display. We present Point-and-Shake, a mid-air pointing interaction for selecting levitating objects, with feedback given through object movement. We describe the implementation of this technique and present two user studies that evaluate it. The first study found that users could accurately (96%) and quickly (4.1 s) select objects by pointing at them. The second study found that users were able to accurately (95%) and quickly (3 s) select occluded objects. These results show that Point-and-Shake is an effective way of initiating interaction with levitating object displays.

    Designing for Mixed Reality Urban Exploration

    This paper introduces a design framework for mixed reality urban exploration (MRUE), based on a concrete implementation in a historical city. The framework integrates different modalities, such as virtual reality (VR), augmented reality (AR), and haptic-audio interfaces, as well as advanced features such as personalized recommendations, social exploration, and itinerary management. It makes it possible to address a number of concerns regarding information overload, safety, and quality of experience that are not sufficiently tackled in traditional non-integrated approaches. This study presents an integrated mobile platform built on top of this framework and reflects on the lessons learned.