Search CORE

4 research outputs found

Field-of-View IoU for Object Detection in 360{\deg} Images

Author: Aizawa Kiyoharu
Cao Miao
Ikehata Satoshi
Publication venue
Publication date: 22/09/2022
Field of study

360{\deg} cameras have gained popularity over the last few years. In this paper, we propose two fundamental techniques -- Field-of-View IoU (FoV-IoU) and 360Augmentation for object detection in 360{\deg} images. Although most object detection neural networks designed for the perspective images are applicable to 360{\deg} images in equirectangular projection (ERP) format, their performance deteriorates owing to the distortion in ERP images. Our method can be readily integrated with existing perspective object detectors and significantly improves the performance. The FoV-IoU computes the intersection-over-union of two Field-of-View bounding boxes in a spherical image which could be used for training, inference, and evaluation while 360Augmentation is a data augmentation technique specific to 360{\deg} object detection task which randomly rotates a spherical image and solves the bias due to the sphere-to-plane projection. We conduct extensive experiments on the 360indoor dataset with different types of perspective object detectors and show the consistent effectiveness of our method

arXiv.org e-Print Archive

Transitioning360: Content-aware NFoV Virtual Camera Paths for 360° Video Playback

Author: Hu Shi-Min
Li Yi-Jun
Richardt Christian
Wang Miao
Zhang Wen-Xuan
Publication venue: IEEE
Publication date: 31/12/2020
Field of study

Despite the increasing number of head-mounted displays, many 360° VR videos are still being viewed by users on existing 2D displays. To this end, a subset of the 360° video content is often shown inside a manually or semi-automatically selected normal-field-of-view (NFoV) window. However, during the playback, simply watching an NFoV video can easily miss concurrent off-screen content. We present Transitioning360, a tool for 360° video navigation and playback on 2D displays by transitioning between multiple NFoV views that track potentially interesting targets or events. Our method computes virtual NFoV camera paths considering content awareness and diversity in an offline preprocess. During playback, the user can watch any NFoV view corresponding to a precomputed camera path. Moreover, our interface shows other candidate views, providing a sense of concurrent events. At any time, the user can transition to other candidate views for fast navigation and exploration. Experimental results including a user study demonstrate that the viewing experience using our method is more enjoyable and convenient than previous methods

OPUS

Transitioning360: Content-aware NFoV Virtual Camera Paths for 360° Video Playback

Author: Hu Shi-Min
Li Yi-Jun
Richardt Christian
Wang Miao
Zhang Wen-Xuan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 31/12/2020
Field of study

OPUS

Fusing Multimedia Data Into Dynamic Virtual Environments

Author: Du Ruofei
Publication venue
Publication date: 01/01/2018
Field of study

In spite of the dramatic growth of virtual and augmented reality (VR and AR) technology, content creation for immersive and dynamic virtual environments remains a significant challenge. In this dissertation, we present our research in fusing multimedia data, including text, photos, panoramas, and multi-view videos, to create rich and compelling virtual environments. First, we present Social Street View, which renders geo-tagged social media in its natural geo-spatial context provided by 360° panoramas. Our system takes into account visual saliency and uses maximal Poisson-disc placement with spatiotemporal filters to render social multimedia in an immersive setting. We also present a novel GPU-driven pipeline for saliency computation in 360° panoramas using spherical harmonics (SH). Our spherical residual model can be applied to virtual cinematography in 360° videos. We further present Geollery, a mixed-reality platform to render an interactive mirrored world in real time with three-dimensional (3D) buildings, user-generated content, and geo-tagged social media. Our user study has identified several use cases for these systems, including immersive social storytelling, experiencing the culture, and crowd-sourced tourism. We next present Video Fields, a web-based interactive system to create, calibrate, and render dynamic videos overlaid on 3D scenes. Our system renders dynamic entities from multiple videos, using early and deferred texture sampling. Video Fields can be used for immersive surveillance in virtual environments. Furthermore, we present VRSurus and ARCrypt projects to explore the applications of gestures recognition, haptic feedback, and visual cryptography for virtual and augmented reality. Finally, we present our work on Montage4D, a real-time system for seamlessly fusing multi-view video textures with dynamic meshes. We use geodesics on meshes with view-dependent rendering to mitigate spatial occlusion seams while maintaining temporal consistency. Our experiments show significant enhancement in rendering quality, especially for salient regions such as faces. We believe that Social Street View, Geollery, Video Fields, and Montage4D will greatly facilitate several applications such as virtual tourism, immersive telepresence, and remote education

Digital Repository at the University of Maryland