2,474 research outputs found
AdaptiX -- A Transitional XR Framework for Development and Evaluation of Shared Control Applications in Assistive Robotics
With the ongoing efforts to empower people with mobility impairments and the
increase in technological acceptance by the general public, assistive
technologies, such as collaborative robotic arms, are gaining popularity. Yet,
their widespread success is limited by usability issues, specifically the
disparity between user input and software control along the autonomy continuum.
To address this, shared control concepts provide opportunities to combine the
targeted increase of user autonomy with a certain level of computer assistance.
This paper presents the free and open-source AdaptiX XR framework for
developing and evaluating shared control applications in a high-resolution
simulation environment. The initial framework consists of a simulated robotic
arm with an example scenario in Virtual Reality (VR), multiple standard control
interfaces, and a specialized recording/replay system. AdaptiX can easily be
extended for specific research needs, allowing Human-Robot Interaction (HRI)
researchers to rapidly design and test novel interaction methods, intervention
strategies, and multi-modal feedback techniques, without requiring an actual
physical robotic arm during the early phases of ideation, prototyping, and
evaluation. In addition, a Robot Operating System (ROS) integration enables
control of a real robotic arm in a PhysicalTwin approach without any
simulation-reality gap. Here, we review the capabilities and limitations of
AdaptiX in detail and present three bodies of research based on the framework.
AdaptiX can be accessed at https://adaptix.robot-research.de.
Comment: Accepted submission at The 16th ACM SIGCHI Symposium on Engineering
Interactive Computing Systems (EICS'24)
Object Pose Detection to Enable 3D Interaction from 2D Equirectangular Images in Mixed Reality Educational Settings
In this paper, we address the challenge of estimating the 6DoF pose of objects in 2D equirectangular images. This solution allows the transition to the objects’ 3D model from their current pose. In particular, it finds application in the educational use of 360° videos, where it makes the learning experience more engaging and immersive by letting students interact with 3D virtual models. We developed a general approach usable for any object and shape. The only requirement is an accurate CAD model of the item whose pose must be estimated; textures are not needed. The developed pipeline has two main steps: vehicle segmentation from the image background and estimation of the vehicle pose. To accomplish the first task, we used deep learning methods, while for the second, we developed a 360° camera simulator in Unity to generate synthetic equirectangular images used for comparison. We conducted our tests using a miniature truck model whose CAD model was at our disposal. The developed algorithm was tested using a metrological analysis applied to real data. The results showed a mean difference of 1.5° with a standard deviation of 1° from the ground truth data for rotations, and 1.4 cm with a standard deviation of 1.5 cm for translations over a research range of ±20° and ±20 cm, respectively.
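The abstract reports rotation and translation differences from ground truth but does not say which metric was used. As a minimal sketch, assuming poses are given as 3x3 rotation matrices and 3-vectors, one common choice is the geodesic angle between the two rotations and the Euclidean distance between the two translations. The function names below are illustrative and not taken from the paper.

```python
import math


def rot_z(deg):
    """Return a 3x3 rotation matrix for a rotation of `deg` degrees about z."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_error_deg(r_est, r_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices.

    Uses angle = arccos((trace(r_est^T r_gt) - 1) / 2). The trace of
    A^T B equals the sum of the element-wise products of A and B.
    """
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error(t_est, t_gt):
    """Euclidean distance between two translation vectors (same units)."""
    return math.dist(t_est, t_gt)
```

Averaging these per-sample errors over a test set would give summary figures of the kind quoted in the abstract, assuming this metric.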
Developing a Data Model for an Omnidirectional Image-Based Multi-Scale Representation of Space
One of the major challenges that existing spatial data is facing is the fragmentation of its representation of indoor and outdoor space. Although studies on the use of omnidirectional images in representing space and providing Location-based Services (LBS) have been increasing, the representation of the different scales of space, both indoors and outdoors, has yet to be addressed. This study aims to develop a data model for generating a multi-scale image-based representation of space using spatial relationships based on omnidirectional images. This paper identifies the different scales of space that are represented in spatial data and extends previous approaches of using omnidirectional images in providing indoor LBS towards representing the other scales of space, particularly in outdoor space. Using sample data, we present an experimental implementation to demonstrate the potential of the proposed data model. Results show that apart from the realistic visualization that image data provides, basic spatial functions can be performed on the image data constructed based on the proposed data model.
Omnidirectional camera pose estimation and projective texture mapping for photorealistic 3D virtual reality experiences
Modern applications in virtual reality require that users can experience the environment as if it were real. In applications that have to deal with real scenarios, it is important to acquire both their three-dimensional (3D) structure and details to enable the users to achieve good immersive experiences. The purpose of this paper is to illustrate a method to obtain a mesh with high-quality texture by combining a raw 3D mesh model of the environment and 360° images. The main outcome is a mesh with a high level of photorealistic detail. This enables both good depth perception thanks to the mesh model and high visualization quality thanks to the 2D resolution of modern omnidirectional cameras. The fundamental step to reach this goal is the correct alignment between the 360° camera and the 3D mesh model. For this reason, we propose a method that comprises two steps: 1) find the 360° camera's pose within the current 3D environment; 2) project the high-quality 360° image on top of the mesh. After the method description, we outline its validation in two virtual reality scenarios, a mine and a city environment, respectively, which allows us to compare the achieved results with the ground truth.
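The projection step described above can be illustrated with a minimal sketch that maps a 3D mesh vertex to pixel coordinates in an equirectangular image once the camera pose is known. The conventions here are assumptions and are not taken from the paper: the camera frame has x to the right, y up, and z forward; `r_wc` is a world-to-camera rotation matrix; and the image centre corresponds to the forward direction.

```python
import math


def world_to_camera(point, cam_center, r_wc):
    """Express a world-space point in the camera frame.

    `r_wc` is an assumed 3x3 world-to-camera rotation; `cam_center` is
    the camera position in world coordinates.
    """
    d = [point[i] - cam_center[i] for i in range(3)]
    return [sum(r_wc[i][j] * d[j] for j in range(3)) for i in range(3)]


def equirect_pixel(p_cam, width, height):
    """Map a camera-frame point to (u, v) in an equirectangular image.

    Longitude is measured around the y (up) axis from the forward z axis,
    latitude from the horizontal plane. u grows to the right, v downward.
    """
    x, y, z = p_cam
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the camera centre")
    lon = math.atan2(x, z)                          # range [-pi, pi]
    lat = math.asin(max(-1.0, min(1.0, y / r)))     # range [-pi/2, pi/2]
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v
```

Dividing `u` and `v` by the image width and height gives texture coordinates for the vertex. A full texturing pipeline would also need a visibility test so that occluded faces do not receive colour, which this sketch omits.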
Materialising contexts: virtual soundscapes for real-world exploration
© 2020, The Author(s). This article presents the results of a study based on a group of participants’ interactions with an experimental sound installation at the National Science and Media Museum in Bradford, UK. The installation used audio augmented reality to attach virtual sound sources to a vintage radio receiver from the museum’s collection, with a view to understanding the potentials of this technology for promoting exploration and engagement within museums and galleries. We employ a practice-based design ethnography, including a thematic analysis of our participants’ interactions with spatialised interactive audio, and present an identified sequence of interactional phases. We discuss how audio augmented artefacts can communicate and engage visitors beyond their traditional confines of line-of-sight, and how visitors can be drawn to engage further, beyond the realm of their original encounter. Finally, we provide evidence of how contextualised and embodied interactions, along with authentic audio reproduction, evoked personal memories associated with our museum artefact, and how this can promote interest in the acquisition of declarative knowledge. Additionally, through the adoption of a functional and theoretical aura-based model, we present ways in which this could be achieved, and, overall, we demonstrate a material object’s potential role as an interface for engaging users with, and contextualising, immaterial digital audio archival content.
Adaptive Vision Based Scene Registration for Outdoor Augmented Reality
Augmented Reality (AR) involves adding virtual content into real scenes. Scenes are viewed using a Head-Mounted Display or other display type. In order to place content into the user's view of a scene, the user's position and orientation relative to the scene, commonly referred to as their pose, must be determined accurately. This allows the objects to be placed in the correct positions and to remain there when the user moves or the scene changes. It is achieved by tracking the user in relation to their environment using a variety of technologies. One technology which has proven to provide accurate results is computer vision. Computer vision involves a computer analysing images and achieving an understanding of them. This may be locating objects such as faces in the images, or in the case of AR, determining the pose of the user.
One of the ultimate goals of AR systems is to be capable of operating under any condition. For example, a computer vision system must be robust under a range of different scene types, and under unpredictable environmental conditions due to variable illumination and weather. The majority of existing literature tests algorithms under the assumption of ideal or 'normal' imaging conditions. To ensure robustness under as many circumstances as possible it is also important to evaluate the systems under adverse conditions.
This thesis seeks to analyse the effects that variable illumination has on computer vision algorithms. To enable this analysis, test data is required to isolate weather and illumination effects, without other factors such as changes in viewpoint that would bias the results. A new dataset is presented which also allows controlled viewpoint differences in the presence of weather and illumination changes. This is achieved by capturing video from a camera undergoing a repeatable motion sequence. Ground truth data is stored per frame, allowing images from the same position under differing environmental conditions to be easily extracted from the videos.
An in-depth analysis of six detection algorithms and five matching techniques demonstrates the impact that non-uniform illumination changes can have on vision algorithms. Specifically, shadows can degrade performance and reduce confidence in the system, decrease reliability, or even completely prevent successful operation.
An investigation into approaches to improve performance yields techniques that can help reduce the impact of shadows. A novel algorithm is presented that merges reference data captured at different times, resulting in reference data with minimal shadow effects. This can significantly improve performance and reliability when operating on images containing shadow effects. These advances improve the robustness of computer vision systems and extend the range of conditions in which they can operate. This can increase the usefulness of the algorithms and the AR systems that employ them
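The abstract does not describe how the thesis merges reference data, so the following is not its algorithm. As a generic illustration of the idea of combining captures taken at different times, a pixel-wise median over aligned grayscale images suppresses a shadow that covers a given pixel in only a minority of the captures. It assumes the images are already registered to the same viewpoint and are represented as nested lists of intensities; the function name is hypothetical.

```python
from statistics import median


def merge_reference_images(images):
    """Pixel-wise median of aligned grayscale images (lists of rows).

    Illustrative baseline only: assumes every image has the same size and
    shows the same viewpoint, so only illumination differs between them.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same dimensions")
    return [
        [median(img[r][c] for img in images) for c in range(width)]
        for r in range(height)
    ]
```

A median is used rather than a mean so that a dark shadow value in one capture does not pull the merged pixel toward it.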
Augmented Reality Interfaces for Enabling Fast and Accurate Task Localization
Changing viewpoints is a common technique to gain additional visual information about the spatial relations among the objects contained within an environment. In many cases, not all of the necessary visual information is available from a single vantage point, due to factors such as occlusion, level of detail, and limited field of view. In certain instances, strategic viewpoints may need to be visited multiple times (e.g., after each step of an iterative process), which makes being able to transition between viewpoints precisely and with minimum effort advantageous for improved task performance (e.g., faster completion time, fewer errors, less dependence on memory).
Many augmented reality (AR) applications are designed to make tasks easier to perform by supplementing a user's first-person view with virtual instructions. For those tasks that benefit from being seen from more than a single viewpoint, AR users typically have to physically relocalize (i.e., move a see-through display and typically themselves since those displays are often head-worn or hand-held) to those additional viewpoints. However, this physical motion may be costly or difficult, due to increased distances or obstacles in the environment.
We have developed a set of interaction techniques that enable fast and accurate task localization in AR. Our first technique, SnapAR, allows users to take snapshots of augmented scenes that can be virtually revisited at later times. The system stores still images of scenes along with camera poses, so that augmentations remain dynamic and interactive. Our prototype implementation features a set of interaction techniques specifically designed to enable quick viewpoint switching. A formal evaluation of the capability to manipulate virtual objects within snapshot mode showed significant savings in time spent and gain in accuracy when compared to physically traveling between viewpoints.
For cases when a user has to physically travel to a strategic viewpoint (e.g., to perform maintenance and repair on a large physical piece of equipment), we present ParaFrustum, a geometric construct that represents this set of strategic viewpoints and viewing directions and establishes constraints on a range of acceptable locations for the user's eyes and a range of acceptable angles in which the user's head can be oriented. Providing tolerance in the allowable viewing positions and directions avoids burdening the user with the need to assume a tightly constrained 6DOF pose when it is not required by the task. We describe two visualization techniques, ParaFrustum-InSitu and ParaFrustum-HUD, that guide a user to assume one of the poses defined by a ParaFrustum. A formal user study corroborated that speed improvements increase with larger tolerances and reveals interesting differences in participant trajectories based on the visualization technique.
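To make the idea of a tolerant viewing constraint concrete, here is a deliberately simplified sketch. It is not the ParaFrustum geometry itself, which the abstract does not specify. The sketch assumes that acceptable eye positions form a sphere and that acceptable viewing directions lie within a fixed angle of the line from the eye to a target point. All names and parameters are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ViewConstraint:
    """Simplified stand-in for a tolerant viewpoint constraint."""
    eye_center: tuple      # centre of the acceptable eye region
    eye_radius: float      # positional tolerance (same units as positions)
    look_at: tuple         # point the user should be looking toward
    max_angle_deg: float   # angular tolerance on the viewing direction


def satisfies(constraint, eye, forward):
    """Return True if the eye position and view direction are acceptable."""
    if math.dist(eye, constraint.eye_center) > constraint.eye_radius:
        return False
    to_target = [t - e for t, e in zip(constraint.look_at, eye)]
    n_target = math.sqrt(sum(c * c for c in to_target))
    n_forward = math.sqrt(sum(c * c for c in forward))
    if n_target == 0.0 or n_forward == 0.0:
        return False
    dot = sum(a * b for a, b in zip(to_target, forward))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (n_target * n_forward)))
    return math.degrees(math.acos(cos_angle)) <= constraint.max_angle_deg
```

Increasing `eye_radius` or `max_angle_deg` enlarges the set of accepted poses, which corresponds to the larger tolerances that the user study associates with faster completion.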
When the object to be operated on is smaller and can be handheld, instead of being large and stationary, it can be manually rotated instead of the user moving to a strategic viewpoint. Examples of such situations include tasks in which one object must be oriented relative to a second prior to assembly and tasks in which objects must be held in specific ways to inspect them. Researchers have investigated guidance mechanisms for some 6DOF tasks, using wide-field-of-view (FOV), stereoscopic virtual and augmented reality head-worn displays (HWDs). However, there has been relatively little work directed toward smaller-FOV, lightweight monoscopic HWDs, such as Google Glass, which may remain more comfortable and less intrusive than stereoscopic HWDs in the near future. In our Orientation Assistance work, we have designed and implemented a novel visualization approach and three additional visualizations representing different paradigms for guiding unconstrained manual 3DOF rotation, targeting these monoscopic HWDs. This chapter includes our exploration of these paradigms and the results of a user study evaluating the relative performance of the visualizations and showing the advantages of our new approach.
In summary, we investigated ways of enabling an AR user to obtain visual information from multiple viewpoints, both physically and virtually. In the virtual case, we showed how one can change viewpoints precisely and with less effort. In the physical case, we explored how we can interactively guide users to obtain strategic viewpoints, either by moving their heads or re-orienting handheld objects. In both cases, we showed that our techniques help users accomplish certain types of tasks more quickly and with fewer errors, compared to when they have to change viewpoints following alternative, previously suggested methods
Scene creation and exploration in outdoor augmented reality
This thesis investigates Outdoor Augmented Reality (AR), especially its scene creation and exploration aspects. We decompose a scene into several components: a) Device, b) Target Object(s), c) Task, and discuss their interrelations. Based on those relations we outline use-cases and workflows. The main contribution of this thesis is providing AR-oriented workflows for selected professional fields, specifically for scene creation and exploration purposes, through case studies as well as analyzing the relations between AR scene components. Our contributions include, but are not limited to: i) analysis of scene components and factoring in inherently available errors, to create a transitional hybrid tracking scheme for multiple targets, ii) a novel image-based approach that uses a building block analogy for modelling and introduces volumetric and temporal labeling for annotations, iii) an evaluation of the state-of-the-art X-Ray visualization methods as well as our proposed multi-view method. AR technology and capabilities tend to change rapidly; however, we believe the relation between scene components and the practical advantages their analysis provides are valuable. Moreover, we have chosen case studies as diverse as possible in order to cover a wide range of professional field studies. We believe our research is extendible to a variety of field studies for disciplines including but not limited to: archaeology, architecture, cultural heritage, tourism, stratigraphy, civil engineering, and urban maintenance.
The Politics of Twilights: Notes on the Semiotics of Horizon Photography
Visual sociology is crucial for exploring the indexical meanings that thick description cannot capture within a cultural setting. This paper explores how such meanings are created within a subset of the domain of photography. Using data gathered over several years, I constructed the semiotic code “horizon” photographers use when “in the field” for photographing periods of twilight. This code explains the relevance of subject matter to the photograph’s aesthetics. Specifically, I detail how “the horizon” communicates the potential for the photographer to “capture” the index of a symbol that later permits the photographer to culturally mark scenes with “light”. In doing so, the paper explains how photography is a means through which a given truth about a given culture is made intelligible, elaborating the relationship between cultural meaning, narrative and decision-making despite the increased automation of the means of production of photographs. This is done to examine how this process of cultural marking is changing and why the agency of “the photographer” still matters for evaluating the cultural significance of the resulting photograph and for photography as a vital part of ethnographic research. This paper concludes with a commentary on the aesthetics of twilight as an allegorical reflection of society.