MusA: Using Indoor Positioning and Navigation to Enhance Cultural Experiences in a museum
In recent years there has been growing interest in the use of multimedia mobile guides in museum environments. Mobile devices can detect the user's context and provide information that helps visitors discover and follow the logical and emotional connections that develop during the visit. In this scenario, location-based services (LBS) currently represent an asset, and the choice of technology to determine users' position, together with the definition of methods that can effectively convey information, are key issues in the design process. In this work, we present MusA (Museum Assistant), a general framework for the development of multimedia interactive guides for mobile devices. Its main feature is a vision-based indoor positioning system that allows the provision of several LBS, from way-finding to the contextualized communication of cultural contents, aimed at providing a meaningful exploration of exhibits according to visitors' personal interest and curiosity. Starting from a thorough description of the system architecture, the article presents the implementation of two mobile guides, developed to address adults and children respectively, and discusses the evaluation of the user experience and the visitors' appreciation of these applications.
Interactive videos: Plausible video editing using sparse structure points
Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object-level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometries to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow for plausible editing and mixing of raw video footage using only sparse structure points (SSP) directly recovered from the raw sequences. First, we utilize user scribbles to structure the point representations obtained using structure-from-motion on the input videos. The resultant structure points, even when noisy and sparse, are then used to enable various video edits in 3D, including view perturbation, keyframe animation, and object duplication and transfer across videos. Specifically, we describe how to synthesize object images from new views by adopting a novel image-based rendering technique that uses the SSPs as a proxy for the missing 3D scene information. We propose a structure-preserving image warping on multiple input frames adaptively selected from the object video, followed by a spatio-temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate plausible video sequences mimicking real-world interactions. We demonstrate our system on a variety of input videos to produce complex edits, which are otherwise difficult to achieve.
Capture4VR: From VR Photography to VR Video
Virtual reality (VR) enables the display of dynamic visual content with unparalleled realism and immersion. However, VR is still a relatively young medium that requires new ways to author content, particularly for visual content that is captured from the real world. This course, therefore, provides a comprehensive overview of the latest progress in bringing photographs and video into VR. Ultimately, the techniques, approaches and systems we discuss aim to faithfully capture the visual appearance and dynamics of the real world, and to bring it into virtual reality to create unparalleled realism and immersion by providing freedom of head motion and motion parallax, which is a vital depth cue for the human visual system. In this half-day course, we take the audience on a journey from VR photography to VR video that began more than a century ago but has accelerated tremendously in the last five years. We discuss both commercial state-of-the-art systems by Facebook, Google and Microsoft, and the latest research techniques and prototypes.
360Roam: Real-Time Indoor Roaming Using Geometry-Aware 360 Radiance Fields
Virtual tours built from sparse 360 images are widely used, but they hinder smooth and immersive roaming experiences. The emergence of Neural Radiance Fields (NeRF) has showcased significant progress in synthesizing novel views, unlocking the potential for immersive scene exploration. Nevertheless, previous NeRF works primarily focused on object-centric scenarios, resulting in noticeable performance degradation when applied to outward-facing and large-scale scenes due to limitations in scene parameterization. To achieve seamless and real-time indoor roaming, we propose a novel approach using geometry-aware radiance fields with adaptively assigned local radiance fields. Initially, we employ multiple 360 images of an indoor scene to progressively reconstruct explicit geometry in the form of a probabilistic occupancy map, derived from a global omnidirectional radiance field. Subsequently, we assign local radiance fields through an adaptive divide-and-conquer strategy based on the recovered geometry. By incorporating geometry-aware sampling and decomposition of the global radiance field, our system effectively utilizes positional encoding and compact neural networks to enhance rendering quality and speed. Additionally, the extracted floorplan of the scene aids in providing visual guidance, contributing to a realistic roaming experience. To demonstrate the effectiveness of our system, we curated a diverse dataset of 360 images encompassing various real-life scenes, on which we conducted extensive experiments. Quantitative and qualitative comparisons against baseline approaches illustrated the superior performance of our system in large-scale indoor scene roaming.
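The 360Roam abstract mentions positional encoding but does not say which variant the system uses. As a minimal sketch, assuming the standard frequency encoding from the original NeRF formulation (each coordinate p mapped to sin(2^k π p) and cos(2^k π p) for k = 0, ..., L-1), the idea can be illustrated as follows. The function names `positional_encoding` and `encode_point` and the default `num_freqs=4` are hypothetical choices for this illustration, not taken from the paper.

```python
import math


def positional_encoding(p, num_freqs):
    """Encode one scalar coordinate with sin/cos pairs at doubling frequencies.

    Assumed form (standard NeRF encoding, not confirmed for 360Roam):
    [sin(2^0 * pi * p), cos(2^0 * pi * p), ..., sin(2^(L-1) * pi * p), cos(2^(L-1) * pi * p)]
    """
    features = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * p
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def encode_point(point, num_freqs=4):
    """Encode every coordinate of a point and concatenate the results."""
    encoded = []
    for coordinate in point:
        encoded.extend(positional_encoding(coordinate, num_freqs))
    return encoded


if __name__ == "__main__":
    # A 3D point with 4 frequencies yields 3 * 2 * 4 features.
    print(len(encode_point((0.1, 0.2, 0.3))))
```

Higher frequencies let a small network represent fine spatial detail, which is one plausible reason a compact network per local field can still render sharply; the frequency count actually used by the authors is not stated in the abstract.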