From Capture to Display: A Survey on Volumetric Video
Volumetric video, which offers immersive viewing experiences, is gaining
increasing prominence. With its six degrees of freedom, it provides viewers
with greater immersion and interactivity compared to traditional videos.
Despite their potential, volumetric video services pose significant
challenges. This survey conducts a comprehensive review of the existing
literature on volumetric video. We first provide a general framework of
volumetric video services, followed by a discussion on prerequisites for
volumetric video, encompassing representations, open datasets, and quality
assessment metrics. Then we delve into the current methodologies for each stage
of the volumetric video service pipeline, detailing capturing, compression,
transmission, rendering, and display techniques. Lastly, we explore various
applications enabled by this pioneering technology and we present an array of
research challenges and opportunities in the domain of volumetric video
services. This survey aspires to provide a holistic understanding of this
burgeoning field and shed light on potential future research trajectories,
aiming to bring the vision of volumetric video to fruition.
Neural Radiance Fields: Past, Present, and Future
The various aspects like modeling and interpreting 3D environments and
surroundings have enticed humans to progress their research in 3D Computer
Vision, Computer Graphics, and Machine Learning. The work of Mildenhall
et al. on NeRFs (Neural Radiance Fields) sparked a boom in Computer
Graphics, Robotics, and Computer Vision, and high-resolution, low-storage
Augmented Reality and Virtual Reality-based 3D models have since gained
traction among researchers, with more than 1000 NeRF-related preprints
published. This paper serves as a bridge for people starting to study
these fields by building on the basics of Mathematics, Geometry, Computer
Vision, and Computer Graphics to the difficulties encountered in Implicit
Representations at the intersection of all these disciplines. This survey
provides the history of rendering, Implicit Learning, and NeRFs, the
progression of research on NeRFs, and the potential applications and
implications of NeRFs in today's world. In doing so, this survey categorizes
all the NeRF-related research in terms of the datasets used, objective
functions, applications solved, and evaluation criteria for these applications.
Comment: 413 pages, 9 figures, 277 citations
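At the core of the NeRF rendering surveyed above is classical volume rendering: colors and densities sampled along a camera ray are alpha-composited via the quadrature rule of Mildenhall et al. A minimal sketch (NumPy, toy sample values, not any paper's actual code):

```python
import numpy as np

def composite_ray(colors, sigmas, deltas):
    """Alpha-composite samples along one ray (classic NeRF quadrature).

    colors: (N, 3) RGB at each sample, sigmas: (N,) volume densities,
    deltas: (N,) distances between consecutive samples along the ray.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)       # opacity of each segment
    trans = np.cumprod(1.0 - alphas + 1e-10)      # transmittance after each sample
    trans = np.concatenate([[1.0], trans[:-1]])   # light reaching sample i
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)  # rendered RGB, shape (3,)

# toy example: a translucent blue sample in front of a nearly opaque red one
colors = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
sigmas = np.array([0.5, 50.0])
deltas = np.array([0.1, 0.1])
rgb = composite_ray(colors, sigmas, deltas)
```

The front sample's low density lets most light through, so the rendered pixel is dominated by the red sample behind it, with a small blue contribution.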
MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
We introduce a method to convert stereo 360° (omnidirectional stereo)
imagery into a layered, multi-sphere image representation for six
degree-of-freedom (6DoF) rendering. Stereo 360° imagery can be captured
from multi-camera systems for virtual reality (VR), but lacks motion parallax
and correct-in-all-directions disparity cues. Together, these can quickly lead
to VR sickness when viewing content. One solution is to generate a
format suitable for 6DoF rendering, such as by estimating depth. However, this
raises questions as to how to handle disoccluded regions in dynamic scenes. Our
approach is to simultaneously learn depth and disocclusions via a multi-sphere
image representation, which can be rendered with correct 6DoF disparity and
motion parallax in VR. This significantly improves comfort for the viewer, and
can be inferred and rendered in real time on modern GPU hardware. Together,
these move towards making VR video a more comfortable immersive medium.
Comment: 25 pages, 13 figures, Published at European Conference on Computer
Vision (ECCV 2020), Project Page: http://visual.cs.brown.edu/matryodshk
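A multi-sphere image like the one described above is rendered by alpha-compositing the concentric RGBA sphere layers from far to near with the standard "over" operator. A minimal sketch (NumPy, omitting the ray/sphere resampling step, not the authors' implementation):

```python
import numpy as np

def composite_msi(layers_rgba):
    """Back-to-front 'over' compositing of multi-sphere image layers.

    layers_rgba: (L, H, W, 4) RGBA samples, ordered far -> near, assumed
    already resampled to the output viewpoint.
    """
    out = np.zeros(layers_rgba.shape[1:3] + (3,))
    for layer in layers_rgba:                # farthest sphere first
        rgb, a = layer[..., :3], layer[..., 3:4]
        out = rgb * a + out * (1.0 - a)      # standard alpha 'over' operator
    return out

# toy 1x1 example: an opaque gray far layer under a half-transparent red one
far = np.zeros((1, 1, 4)); far[..., :3] = 0.5; far[..., 3] = 1.0
near = np.zeros((1, 1, 4)); near[..., 0] = 1.0; near[..., 3] = 0.5
img = composite_msi(np.stack([far, near]))
```

Because each layer sits at a different radius, re-intersecting the spheres from a slightly moved head position shifts near layers more than far ones, which is what produces the 6DoF motion parallax.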
VIRD: Immersive Match Video Analysis for High-Performance Badminton Coaching
Badminton is a fast-paced sport that requires a strategic combination of
spatial, temporal, and technical tactics. To gain a competitive edge at
high-level competitions, badminton professionals frequently analyze match
videos to gain insights and develop game strategies. However, the current
process for analyzing matches is time-consuming and relies heavily on manual
note-taking, due to the lack of automatic data collection and appropriate
visualization tools. As a result, there is a gap in effectively analyzing
matches and communicating insights among badminton coaches and players. This
work proposes an end-to-end immersive match analysis pipeline designed in close
collaboration with badminton professionals, including Olympic and national
coaches and players. We present VIRD, a VR Bird (i.e., shuttle) immersive
analysis tool that supports interactive badminton game analysis in an
immersive environment based on 3D reconstructed game views of the match video.
We propose a top-down analytic workflow that allows users to seamlessly move
from a high-level match overview to a detailed game view of individual rallies
and shots, using situated 3D visualizations and video. We collect 3D spatial
and dynamic shot data and player poses with computer vision models and
visualize them in VR. Through immersive visualizations, coaches can
interactively analyze situated spatial data (player positions, poses, and shot
trajectories) with flexible viewpoints while navigating between shots and
rallies effectively with embodied interaction. We evaluated the usefulness of
VIRD with Olympic and national-level coaches and players in real matches.
Results show that immersive analytics supports effective badminton match
analysis with reduced context-switching costs and enhances spatial
understanding with a high sense of presence.
Comment: To Appear in IEEE Transactions on Visualization and Computer Graphics
(IEEE VIS), 202
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided, including Convolutional Neural Networks (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the `creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human centric -- where it is designed to augment, rather than
replace, human creativity.
Pathway to Future Symbiotic Creativity
This report presents a comprehensive view of our vision on the development
path of the human-machine symbiotic art creation. We propose a classification
of creative systems with a hierarchy of five classes, showing the pathway of
creativity evolving from mimic-human artists (Turing Artists) to Machine
Artists in their own right. We begin with an overview of the limitations of
Turing Artists, then focus on the top two levels, Machine Artists,
emphasizing machine-human communication in art creation. In art creation,
machines must understand humans' mental states, including desires,
appreciation, and emotions; humans also need to understand machines' creative
capabilities and limitations. The rapid development of immersive environments
and their further evolution into the metaverse enable symbiotic art
creation through unprecedented flexibility of bi-directional communication
between artists and art manifestation environments. By examining the latest
sensor and XR technologies, we illustrate a novel way of collecting art data
to form the basis of a new kind of human-machine bidirectional
communication and understanding in art creation. Based on such communication
and understanding mechanisms, we propose a novel framework for building future
Machine artists, which comes with the philosophy that a human-compatible AI
system should be based on the "human-in-the-loop" principle rather than the
traditional "end-to-end" dogma. By proposing a new form of inverse
reinforcement learning model, we outline the platform design of machine
artists, demonstrate its functions and showcase some examples of technologies
we have developed. We also provide a systematic exposition of the ecosystem for
AI-based symbiotic art form and community with an economic model built on NFT
technology. Ethical issues for the development of machine artists are also
discussed.
TCP-Based Distributed Offloading Architecture for the Future of Untethered Immersive Experiences in Wireless Networks
IMX '22: ACM International Conference on Interactive Media Experiences, 22-24 June 2022, Aveiro, Portugal.

Task offloading has become a key term in the field of immersive media technologies: it can enable lighter and cheaper devices while providing them with greater remote computational capabilities. In this paper we present our TCP-based offloading architecture. The architecture has been specifically designed for immersive media offloading tasks, with particular care taken to reduce any processing overhead that could degrade network performance. We tested the architecture under different offloading scenarios and conditions on two different wireless networks: WiFi and 5G millimeter wave. In addition, to test the network on alternative millimeter wave configurations not currently available in actual 5G millimeter wave rollouts, we used a 5G Radio Access Network (RAN) real-time emulator. This emulator was also used to test the offloading architecture for a simulated immersive user sharing network resources with other users. We provide insights into the importance of user prioritization techniques for successful immersive media offloading. The results show strong performance for the tested immersive media scenarios, highlighting the relevance of millimeter wave technology for the future of immersive media applications.

This work has received funding from the European Union (EU) Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie ETN TeamUp5G, grant agreement No. 813391.
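The paper's own architecture is not reproduced here; as an illustrative sketch of the kind of low-overhead TCP transport such offloading pipelines typically build on, the example below frames each offloaded task with a 4-byte length prefix so task boundaries survive TCP's byte-stream semantics (all names and the echoed "task" are hypothetical):

```python
import socket
import struct
import threading

def send_frame(sock, payload: bytes):
    """Prefix each task payload with a 4-byte big-endian length."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_frame(sock) -> bytes:
    """Read exactly one length-prefixed frame from the stream."""
    (length,) = struct.unpack("!I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)

def _recv_exact(sock, n: int) -> bytes:
    buf = b""
    while len(buf) < n:            # recv() may return fewer bytes than asked
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

# toy round trip on localhost: the "edge server" echoes the offloaded task
def _echo_server(srv):
    conn, _ = srv.accept()
    with conn:
        send_frame(conn, recv_frame(conn))

srv = socket.socket()
srv.bind(("127.0.0.1", 0))         # port 0: let the OS pick a free port
srv.listen(1)
threading.Thread(target=_echo_server, args=(srv,), daemon=True).start()

cli = socket.create_connection(srv.getsockname())
send_frame(cli, b"pose-estimation-task")
result = recv_frame(cli)
cli.close()
srv.close()
```

In a latency-sensitive offloading setting one would typically also set `TCP_NODELAY` on the client socket to avoid Nagle-induced queuing of small task frames.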