Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is a new interesting functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data since the server can generally not transmit images
for all possible viewpoints due to resource constraints. In this paper, we
propose a novel multiview data representation that satisfies bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely in the segment without requesting further data from the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system achieves compression performance similar to that of classical
inter-view coding, while it provides the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services.
NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds
We present NeRFVS, a novel neural radiance fields (NeRF) based method to
enable free navigation in a room. NeRF achieves impressive performance in
rendering images for novel views similar to the input views but degrades for
novel views that are significantly different from the training views. To
address this issue, we utilize the holistic priors, including pseudo depth maps
and view coverage information, from neural reconstruction to guide the learning
of implicit neural representations of 3D indoor scenes. Concretely, an
off-the-shelf neural reconstruction method is leveraged to generate a geometry
scaffold. Then, two loss functions based on the holistic priors are proposed to
improve the learning of NeRF: 1) A robust depth loss that can tolerate the
error of the pseudo depth map to guide the geometry learning of NeRF; 2) A
variance loss to regularize the variance of implicit neural representations to
reduce the geometry and color ambiguity in the learning procedure. These two
loss functions are modulated during NeRF optimization according to the view
coverage information to reduce the negative influence brought by the view
coverage imbalance. Extensive results demonstrate that our NeRFVS outperforms
state-of-the-art view synthesis methods quantitatively and qualitatively on
indoor scenes, achieving high-fidelity free navigation results.
Comment: 10 pages, 7 figures
VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations
Recent advancements in implicit neural representations have contributed to
high-fidelity surface reconstruction and photorealistic novel view synthesis.
However, the computational complexity inherent in these methodologies presents
a substantial impediment, constraining the attainable frame rates and
resolutions in practical applications. In response to this predicament, we
propose VQ-NeRF, an effective and efficient pipeline for enhancing implicit
neural representations via vector quantization. The essence of our method
involves reducing the sampling space of NeRF to a lower resolution and
subsequently reinstating it to the original size utilizing a pre-trained VAE
decoder, thereby effectively mitigating the sampling time bottleneck
encountered during rendering. Although the codebook furnishes representative
features, reconstructing fine texture details of the scene remains challenging
due to high compression rates. To overcome this constraint, we design an
innovative multi-scale NeRF sampling scheme that concurrently optimizes the
NeRF model at both compressed and original scales to enhance the network's
ability to preserve fine details. Furthermore, we incorporate a semantic loss
function to improve the geometric fidelity and semantic coherence of our 3D
reconstructions. Extensive experiments demonstrate the effectiveness of our
model in achieving the optimal trade-off between rendering quality and
efficiency. Evaluation on the DTU, BlendMVS, and H3DS datasets confirms the
superior performance of our approach.
Comment: Submitted to the 38th Annual AAAI Conference on Artificial
Intelligence
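
The codebook lookup at the heart of vector quantization can be illustrated with a minimal sketch. This is generic nearest-neighbour quantization, not the VQ-NeRF pipeline itself: the function names, the toy two-dimensional codebook, and the sample features are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each feature vector to its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices = []
    for feature in features:
        # Pick the index of the codebook entry closest to this feature.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(best)
    quantized = [list(codebook[i]) for i in indices]
    return indices, quantized


# Hypothetical toy data: three codebook entries, three feature vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]
indices, quantized = quantize(features, codebook)
print(indices)
```

Storing only the indices is what yields the compression; the loss of fine detail mentioned in the abstract comes from replacing each feature with its representative codebook entry.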
It's a long way to Monte-Carlo: probabilistic display in GPS navigation
We present a mobile, GPS-based multimodal navigation system, equipped with inertial control that allows users to explore and navigate through an augmented physical space, incorporating and displaying the uncertainty resulting from inaccurate sensing and unknown user intentions. The system propagates uncertainty appropriately via Monte Carlo sampling and predicts at a user-controllable time horizon. Control of the Monte Carlo exploration is entirely tilt-based. The system output is displayed both visually and in audio. Audio is rendered via granular synthesis to accurately display the probability of the user reaching targets in the space. We also demonstrate the use of uncertain prediction in a trajectory following task, where a section of music is modulated according to the changing predictions of user position with respect to the target trajectory. We show that appropriate display of the full distribution of potential future user positions with respect to sites-of-interest can improve the quality of interaction over a simplistic interpretation of the sensed data.
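
The idea of propagating uncertainty by Monte Carlo sampling up to a time horizon can be sketched as follows. This is only an illustration under stated assumptions: the constant-velocity motion model, the Gaussian noise on position and velocity, the circular target region, and every name and parameter value below are hypothetical and are not taken from the system described above.

```python
import math
import random
from typing import Tuple


def probability_of_reaching(
    position: Tuple[float, float],
    velocity: Tuple[float, float],
    target: Tuple[float, float],
    radius: float,
    horizon: float,
    pos_sigma: float,
    vel_sigma: float,
    n_samples: int = 2000,
    seed: int = 0,
) -> float:
    """Estimate the probability of being within `radius` of `target`
    after `horizon` seconds, by sampling noisy positions and velocities."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(n_samples):
        # Sample one plausible current state (assumed Gaussian sensing noise).
        x = position[0] + rng.gauss(0.0, pos_sigma)
        y = position[1] + rng.gauss(0.0, pos_sigma)
        vx = velocity[0] + rng.gauss(0.0, vel_sigma)
        vy = velocity[1] + rng.gauss(0.0, vel_sigma)
        # Propagate it to the horizon (assumed constant-velocity motion).
        future_x = x + vx * horizon
        future_y = y + vy * horizon
        if math.hypot(future_x - target[0], future_y - target[1]) <= radius:
            hits += 1
    return hits / n_samples


# Hypothetical scenario: walking east at 1 m/s, predicting 10 s ahead.
p_near = probability_of_reaching((0.0, 0.0), (1.0, 0.0), (10.0, 0.0),
                                 radius=10.0, horizon=10.0,
                                 pos_sigma=5.0, vel_sigma=0.2)
p_far = probability_of_reaching((0.0, 0.0), (1.0, 0.0), (100.0, 100.0),
                                radius=10.0, horizon=10.0,
                                pos_sigma=5.0, vel_sigma=0.2)
print(p_near, p_far)
```

Lengthening the horizon spreads the sampled future positions further apart, which is one way a display could convey that longer-range predictions are less certain.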
Motion Textures: Modeling, Classification, and Segmentation Using Mixed-State Markov Random Fields
Proposals for evaluating the regularity of a scientist's research output
Evaluating the career of individual scientists according to their scientific output is a common bibliometric problem. Two aspects are classically taken into account: overall productivity and overall diffusion/impact, which can be measured by a plethora of indicators that consider publications and/or citations separately or synthesise these two quantities into a single number (e.g. h-index). A secondary aspect, which is sometimes mentioned in the rules of competitive examinations for research position/promotion, is the time regularity of a researcher's scientific output. Although it is sometimes invoked, a clear definition of regularity is still lacking. We define it as the ability to generate an active and stable research output over time, in terms of both publications/quantity and citations/diffusion. The goal of this paper is to introduce three analysis tools for performing qualitative/quantitative evaluations of the regularity of a scientist's output in a simple and organic way. These tools are respectively (1) the PY/CY diagram, (2) the publication/citation Ferrers diagram and (3) a simplified procedure for comparing the research output of several scientists according to their publication and citation temporal distributions (Borda's ranking). The description of these tools is supported by several examples.
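
Two of the quantities named above, the h-index and a Borda-style ranking, can be sketched briefly. The h-index follows its standard definition; the Borda function is the generic Borda count applied to hypothetical per-criterion rankings, and it is an assumption that this resembles the paper's simplified procedure. All names and sample data are illustrative.

```python
from typing import Dict, List, Sequence


def h_index(citations: Sequence[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def borda_ranking(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Combine several best-to-worst rankings with the Borda count.

    In a ranking of n candidates, the first place earns n - 1 points
    and the last place earns 0. Ties are broken alphabetically.
    """
    scores: Dict[str, int] = {}
    for ranking in rankings:
        n = len(ranking)
        for place, name in enumerate(ranking):
            scores[name] = scores.get(name, 0) + (n - 1 - place)
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical data: citation counts for one scientist's papers, and
# three criteria each ranking scientists A, B and C from best to worst.
example_h = h_index([10, 8, 5, 4, 3])
combined = borda_ranking([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]])
print(example_h, combined)
```

Here each inner list could stand for one criterion, such as publications or citations in a given period, so the combined order rewards scientists who rank well consistently.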