
    Spatial calibration of an optical see-through head-mounted display

    We present here a method for calibrating an optical see-through Head Mounted Display (HMD) using techniques usually applied to camera calibration (photogrammetry). Using a camera placed inside the HMD to take pictures simultaneously of a tracked object and features in the HMD display, we could exploit established camera calibration techniques to recover both the intrinsic and extrinsic properties of the HMD (width, height, focal length, optic centre and principal ray of the display). Our method gives low re-projection errors and, unlike existing methods, involves no time-consuming and error-prone human measurements, nor any prior estimates about the HMD geometry.
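    As a hedged illustration of the photogrammetric machinery this abstract refers to, the sketch below recovers intrinsics and extrinsics with OpenCV's standard cv2.calibrateCamera. The grid layout, poses, and display resolution are invented stand-ins for the tracked-object/display correspondences the authors capture with the in-HMD camera; this is not their code.

        import numpy as np
        import cv2

        # Hypothetical planar calibration target (z = 0), standing in for the
        # feature correspondences collected through the in-HMD camera.
        grid = np.array([[x, y, 0] for y in range(5) for x in range(7)], np.float32) * 0.03

        # Synthesize a few views from a known ground-truth camera so the sketch
        # runs end to end; in practice these come from the captured photographs.
        K_true = np.array([[900, 0, 640], [0, 900, 360], [0, 0, 1]], np.float64)
        object_points, display_points = [], []
        for i in range(6):
            rvec = np.array([0.1 * i, -0.05 * i, 0.02 * i])
            tvec = np.array([0.01 * i, 0.0, 0.5])
            pts, _ = cv2.projectPoints(grid, rvec, tvec, K_true, np.zeros(5))
            object_points.append(grid)
            display_points.append(pts.reshape(-1, 2).astype(np.float32))

        # Recover intrinsics (focal length, optic centre) and per-view extrinsics,
        # along with the re-projection error the abstract reports.
        rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
            object_points, display_points, (1280, 720), None, None)
        print("re-projection RMS (px):", rms)
        print("estimated intrinsics:\n", K)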

    Occlusion Handling using Semantic Segmentation and Visibility-Based Rendering for Mixed Reality

    Real-time occlusion handling is a major problem in outdoor mixed reality systems because the complexity of the scene makes it computationally expensive. Using segmentation alone, it is difficult to accurately render a virtual object occluded by complex objects such as trees and bushes. In this paper, we propose a novel occlusion handling method for a real-time, outdoor, omni-directional mixed reality system that uses only the information from a monocular image sequence. We first present a semantic segmentation scheme for predicting the amount of visibility for different types of objects in the scene. We simultaneously calculate a foreground probability map using depth estimation derived from optical flow. Finally, we combine the segmentation result and the probability map to render the computer-generated object and the real scene using a visibility-based rendering method. Our results show great improvement in handling occlusions compared to existing blending-based methods.
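    The core idea lends itself to a short sketch: combine a per-class visibility prior from the segmentation with the flow-derived foreground probability to obtain a per-pixel alpha for compositing. Everything below (class table, array shapes, function name) is an assumed simplification, not the paper's implementation.

        import numpy as np

        def composite(real_rgb, virtual_rgba, seg_labels, class_visibility, foreground_prob):
            # Visibility prior from the semantic class of each real pixel
            # (e.g. trees let the virtual object show through partially).
            vis = np.vectorize(class_visibility.get)(seg_labels).astype(np.float32)
            # A real pixel that is probably in front should occlude the virtual object.
            vis *= 1.0 - foreground_prob
            # Modulate the virtual object's own alpha by the predicted visibility.
            alpha = (virtual_rgba[..., 3:4] / 255.0) * vis[..., None]
            out = alpha * virtual_rgba[..., :3] + (1.0 - alpha) * real_rgb
            return out.astype(np.uint8)

        # Hypothetical classes: 0 = sky (object fully visible), 1 = tree, 2 = building.
        class_visibility = {0: 1.0, 1: 0.4, 2: 0.0}
        h, w = 4, 4
        frame = np.full((h, w, 3), 120, np.uint8)
        virtual = np.full((h, w, 4), 255, np.uint8)
        labels = np.random.randint(0, 3, (h, w))
        fg_prob = np.random.rand(h, w).astype(np.float32)
        print(composite(frame, virtual, labels, class_visibility, fg_prob).shape)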

    Towards transparent telepresence

    It is proposed that the concept of transparent telepresence can be closely approached through high-fidelity technological mediation. It is argued that matching the system capabilities to those of the human user will yield a strong sense of immersion and presence at a remote site. Some applications of such a system are noted. The concept is explained, and critical system elements are described together with an overview of some of the necessary system specifications.

    Augmented Reality Meets Computer Vision: Efficient Data Generation for Urban Driving Scenes

    The success of deep learning in computer vision is based on the availability of large annotated datasets. To lower the need for hand-labeled images, virtually rendered 3D worlds have recently gained popularity. Creating realistic 3D content is challenging on its own and requires significant human effort. In this work, we propose an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation and object detection models. Exploiting the fact that not all aspects of the scene are equally important for this task, we propose to augment real-world imagery with virtual objects of the target category. Capturing real-world images at large scale is easy and cheap, and directly provides real background appearances without the need for creating complex 3D models of the environment. We present an efficient procedure to augment real images with virtual objects. This allows us to create realistic composite images which exhibit both realistic background appearance and a large number of complex object arrangements. In contrast to modeling complete 3D environments, our augmentation approach requires only a few user interactions in combination with 3D shapes of the target object. Through extensive experimentation, we determine the set of parameters that produces augmented data which maximally enhances the performance of instance segmentation models. Further, we demonstrate the utility of our approach by training standard deep models for semantic instance segmentation and object detection of cars in outdoor driving scenes. We test the models trained on our augmented data on the KITTI 2015 dataset, which we have annotated with pixel-accurate ground truth, and on the Cityscapes dataset. Our experiments demonstrate that models trained on augmented imagery generalize better than those trained on synthetic data alone or on a limited amount of annotated real data.
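    A minimal sketch of the compositing step such augmentation needs: paste a pre-rendered RGBA object crop (e.g. a 3D car rendering) onto a real photograph and emit the matching instance mask for supervision. The shapes, placement, and stand-in data are assumptions for illustration, not the authors' pipeline.

        import numpy as np

        def augment(background, object_rgba, top_left):
            out = background.copy()
            instance_mask = np.zeros(background.shape[:2], np.uint8)
            y, x = top_left
            h, w = object_rgba.shape[:2]
            alpha = object_rgba[..., 3:4] / 255.0
            region = out[y:y + h, x:x + w]
            # Alpha-composite the rendered object over the real background.
            out[y:y + h, x:x + w] = (alpha * object_rgba[..., :3]
                                     + (1.0 - alpha) * region).astype(np.uint8)
            # Pixel-accurate ground truth falls out of the object's alpha channel.
            instance_mask[y:y + h, x:x + w] = (object_rgba[..., 3] > 127).astype(np.uint8)
            return out, instance_mask

        bg = np.zeros((256, 512, 3), np.uint8)                   # stand-in for a real street image
        car = np.random.randint(0, 256, (64, 128, 4), np.uint8)  # stand-in for a rendered car crop
        image, mask = augment(bg, car, top_left=(150, 200))
        print(image.shape, int(mask.sum()))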

    Planar Refrains

    My practice explores phenomenal poetic truths that exist in fissures between the sensual and physical qualities of material constructs. Magnifying this confounding interspace, my work activates specific instruments within mutable, relational systems of installation, movement, and documentation. The tools I fabricate function within variable orientations and are implemented as both physical barriers and thresholds into alternate, virtual domains. Intersecting fragments of sound and moving image build a nexus of superimposed spatialities, while material constructions are enveloped in ephemeral intensities. Within this compounded environment, both mind and body are charged as active sites through which durational, contemplative experiences can pass. Reverberation, the ghostly refrain of a sound calling back to our ears from a distant plane, can intensify our emotional experience of place. My project Planar Refrains utilizes four electro-mechanical reverb plates, analog audio filters designed to simulate expansive acoustic arenas. Historically, these devices have provided emotive voicings to popular studio recordings, dislocating the performer from the commercial studio and into a simulated reverberant territory of mythic proportions. The material resonance of steel is used to filter a recorded signal, shaping the sound of a human performance into something more transformative, a sound embodying otherworldly dynamics. In subverting the designed utility of reverb plates, I am exploring their value as active surfaces extending across different spatial realities. The background of ephemeral sonic residue is collapsed into the foreground, a filter becomes sculpture, and this sculpture becomes an instrument in an evolving soundscape.

    GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB

    We address the highly challenging problem of real-time 3D hand tracking from a monocular RGB-only sequence. Our tracking method combines a convolutional neural network with a kinematic 3D hand model, such that it generalizes well to unseen data, is robust to occlusions and varying camera viewpoints, and leads to anatomically plausible as well as temporally smooth hand motions. For training our CNN, we propose a novel approach for the synthetic generation of training data that is based on a geometrically consistent image-to-image translation network. More specifically, we use a neural network that translates synthetic images to "real" images, such that the generated images follow the same statistical distribution as real-world hand images. For training this translation network, we combine an adversarial loss and a cycle-consistency loss with a geometric consistency loss in order to preserve geometric properties (such as hand pose) during translation. We demonstrate that our hand tracking system outperforms the current state of the art on challenging RGB-only footage.
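    The combined objective is compact enough to sketch. Below is a hedged PyTorch rendering of the three terms named in the abstract, with placeholder networks: G_sr translates synthetic to "real", G_rs is its inverse, D is a discriminator on real images, and P regresses hand pose from an image. The weights and exact loss forms are assumptions, not the paper's formulation.

        import torch
        import torch.nn.functional as F

        def translation_loss(G_sr, G_rs, D, P, synth, pose_gt,
                             w_adv=1.0, w_cyc=10.0, w_geo=1.0):
            fake_real = G_sr(synth)
            # Adversarial term: translated images should be scored as real.
            logits = D(fake_real)
            adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
            # Cycle consistency: translating there and back reproduces the input.
            cyc = F.l1_loss(G_rs(fake_real), synth)
            # Geometric consistency: translation must preserve the hand pose.
            geo = F.mse_loss(P(fake_real), pose_gt)
            return w_adv * adv + w_cyc * cyc + w_geo * geo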