591 research outputs found

    Eyewear Computing – Augmenting the Human with Head-Mounted Wearable Assistants

    The seminar was composed of workshops and tutorials on head-mounted eye tracking, egocentric vision, optics, and head-mounted displays. The seminar welcomed 30 academic and industry researchers from Europe, the US, and Asia with diverse backgrounds, including wearable and ubiquitous computing, computer vision, developmental psychology, optics, and human-computer interaction. In contrast to several previous Dagstuhl seminars, we used an ignite talk format to reduce the time of talks to one half-day and to leave the rest of the week for hands-on sessions, group work, general discussions, and socialising. The key results of this seminar are 1) the identification of key research challenges and summaries of breakout groups on multimodal eyewear computing, egocentric vision, security and privacy issues, skill augmentation and task guidance, eyewear computing for gaming, as well as prototyping of VR applications, 2) a list of datasets and research tools for eyewear computing, 3) three small-scale datasets recorded during the seminar, 4) an article in ACM Interactions entitled “Eyewear Computers for Human-Computer Interaction”, as well as 5) two follow-up workshops on “Egocentric Perception, Interaction, and Computing” at the European Conference on Computer Vision (ECCV) and “Eyewear Computing” at the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp).

    MISAR: A Multimodal Instructional System with Augmented Reality

    Augmented reality (AR) requires the seamless integration of visual, auditory, and linguistic channels for optimized human-computer interaction. While auditory and visual inputs facilitate real-time and contextual user guidance, the potential of large language models (LLMs) in this landscape remains largely untapped. Our study introduces an innovative method harnessing LLMs to assimilate information from visual, auditory, and contextual modalities. Focusing on the unique challenge of task performance quantification in AR, we utilize egocentric video, speech, and context analysis. The integration of LLMs facilitates enhanced state estimation, marking a step towards more adaptive AR systems. Code, dataset, and demo will be available at https://github.com/nguyennm1024/misar. Comment: Accepted at ICCV 2023 - AV4D, 6 figures, 2 tables

    An Outlook into the Future of Egocentric Vision

    What will the future be? We wonder! In this survey, we explore the gap between current research in egocentric vision and the ever-anticipated future, where wearable computing, with outward-facing cameras and digital overlays, is expected to be integrated in our everyday lives. To understand this gap, the article starts by envisaging the future through character-based stories, showcasing through examples the limitations of current technology. We then provide a mapping between this future and previously defined research tasks. For each task, we survey its seminal works, current state-of-the-art methodologies and available datasets, then reflect on shortcomings that limit its applicability to future research. Note that this survey focuses on software models for egocentric vision, independent of any specific hardware. The paper concludes with recommendations for areas of immediate exploration so as to unlock our path to the future always-on, personalised and life-enhancing egocentric vision. Comment: We invite comments, suggestions and corrections here: https://openreview.net/forum?id=V3974SUk1

    Deformable Objects for Virtual Environments


    Efficient Distance Accuracy Estimation Of Real-World Environments In Virtual Reality Head-Mounted Displays

    Virtual reality (VR) is a very promising technology with many compelling industrial applications. Although many recent advances have made it possible to deploy and use VR technology in virtual environments, these systems are still less mature when used to render real environments. Current VR system settings, which were developed for rendering virtual environments, fail to adequately address the challenges of capturing and displaying real-world content. Before these systems can be used in real-life settings, their performance needs to be investigated, specifically depth perception and how distances to objects in the rendered scenes are estimated. Perceived depth is influenced by head-mounted displays (HMDs), which inevitably degrade the depth perception of virtual content. Distances are consistently underestimated in virtual environments (VEs) compared to the real world. The reason behind this underestimation is still not understood. This thesis investigates a variant of this kind of system that, to the best of the author's knowledge, has not been explored in previous research. Previous research used computer-generated scenes. This work examines distance estimation in real environments rendered to head-mounted displays, where distance estimation is among the most challenging issues still under investigation and not fully understood. This thesis introduces a dual-camera video feed system viewed through a virtual reality head-mounted display with two models, a video-based model and a static photo-based model, whose purpose is to explore whether the misjudgment of distances in HMDs could be due to a lack of realism, using a real-world scene rendering system. Distance judgment performance in the real world and in these two VE models was compared using protocols already proven to accurately measure real-world distance estimation.
An improved model that widens the field of view (FOV) of the displayed scenes was developed to improve distance judgments when displaying real-world VR content on HMDs. It mitigates the limited FOV, which is among the primary potential causes of distance underestimation, especially the mismatch between the camera and HMD fields of view. The proposed model uses a set of two cameras to generate the video instead of the hundreds of input cameras, or tens of cameras mounted on a circular rig, used in previous work. Results from the first implementation of this system showed that when the model was rendered as static photo-based, the underestimation was smaller than with the live video feed rendering. The video-based (real + HMD) model and the static photo-based (real + photo + HMD) model averaged 80.2% and 81.4% of the actual distance, respectively, compared to the real-world estimations, which averaged 92.4%. The improved approach (Real + HMD + FOV) was compared to these two models and showed an improvement of 11%, increasing the estimation accuracy from 80% to 91% and reducing the estimation error from 1.29% to 0.56%. The results of this thesis present strong evidence of the need for novel methods to improve distance estimation in real-world VR content systems and provide effective initial work towards this goal.

    Visual Perception and Cognition in Image-Guided Intervention

    Surgical image visualization and interaction systems can dramatically affect the efficacy and efficiency of surgical training, planning, and interventions. This is even more profound in the case of minimally-invasive surgery, where restricted access to the operative field in conjunction with a limited field of view necessitates a visualization medium to provide patient-specific information at any given moment. Unfortunately, little research has been devoted to studying human factors associated with medical image displays, and the need for robust, intuitive visualization and interaction interfaces has remained largely unfulfilled to this day. Failure to engineer efficient medical solutions and design intuitive visualization interfaces is argued to be one of the major barriers to the meaningful transfer of innovative technology to the operating room. This thesis was, therefore, motivated by the need to study various cognitive and perceptual aspects of human factors in surgical image visualization systems, to increase the efficiency and effectiveness of medical interfaces, and ultimately to improve patient outcomes. To this end, we chose four different minimally-invasive interventions in the realm of surgical training, planning, training for planning, and navigation. The first chapter involves the use of stereoendoscopes to reduce morbidity in endoscopic third ventriculostomy. The results of this study suggest that, compared with conventional endoscopes, the detection of the basilar artery on the surface of the third ventricle can be facilitated with the use of stereoendoscopes, increasing the safety of targeting in third ventriculostomy procedures. In the second chapter, a contour enhancement technique is described to improve preoperative planning of arteriovenous malformation interventions.
The proposed method, particularly when combined with stereopsis, is shown to increase the speed and accuracy of understanding the spatial relationship between vascular structures. In the third chapter, an augmented-reality system is proposed to facilitate the training of planning brain tumour resection. The results of our user study indicate that the proposed system improves subjects' performance, particularly novices', in formulating the optimal point of entry and surgical path independent of the sensorimotor tasks performed. In the last chapter, the role of fully-immersive simulation environments in the surgeons' non-technical skills for performing vertebroplasty procedures is investigated. Our results suggest that while training surgeons may increase their technical skills, the introduction of crisis scenarios significantly disturbs their performance, emphasizing the need for realistic simulation environments as part of the training curriculum.

    A Modular Approach to the Development of Interactive Augmented Reality Applications.

    Augmented reality (AR) technologies are becoming increasingly popular as a result of the increase in the power of mobile computing devices. Emerging AR applications have the potential to have an enormous impact on industries such as education, healthcare, research, training and entertainment. A number of augmented reality toolkits and libraries are available for the development of these applications; however, there is currently no standard tool for development. In this thesis we propose a modular approach to the organization and development of AR systems in order to enable the creation of novel AR experiences. We also investigate the incorporation of the framework that resulted from our approach into game engines to enable the creation and visualization of immersive virtual reality experiences. We address issues in the development process of AR systems and provide a solution for reducing the time, cost and barrier to entry for development while simultaneously providing a framework in which researchers can test and apply advanced augmented reality technologies.