Analyzing and Modeling the Performance of the HemeLB Lattice-Boltzmann Simulation Environment
We investigate the performance of the HemeLB lattice-Boltzmann simulator for
cerebrovascular blood flow, aimed at providing timely and clinically relevant
assistance to neurosurgeons. HemeLB is optimised for sparse geometries,
supports interactive use, and scales well to 32,768 cores for problems with ~81
million lattice sites. We obtain a maximum performance of 29.5 billion site
updates per second, with only an 11% slowdown for highly sparse problems (5%
fluid fraction). We present steering and visualisation performance measurements
and provide a model that allows users to predict performance and thereby
determine how to run simulations with maximum accuracy within time
constraints.
Comment: Accepted by the Journal of Computational Science. 33 pages, 16
figures, 7 tables.
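A hypothetical Python sketch of the kind of performance model the abstract refers to: wall-clock time per lattice-Boltzmann time step split into a compute term proportional to fluid sites per core and a communication term. All constants and parameter names are illustrative assumptions, not HemeLB's actual model or measured values.

```python
# Hypothetical lattice-Boltzmann performance model (illustrative only):
# predicted step time = compute term + communication term.

def predicted_step_time(fluid_sites, cores,
                        site_updates_per_core_per_s=1.0e6,   # assumed machine constant
                        halo_sites_per_core=5.0e3,           # assumed halo-exchange size
                        bytes_per_halo_site=152,             # assumed: 19 distributions * 8 bytes
                        bandwidth_bytes_per_s=5.0e9,
                        latency_s=2.0e-6):
    """Return an estimated wall-clock time (seconds) for one LB time step."""
    compute = fluid_sites / (cores * site_updates_per_core_per_s)
    comm = latency_s + (halo_sites_per_core * bytes_per_halo_site) / bandwidth_bytes_per_s
    return compute + comm

# Example: ~81 million lattice sites on 32,768 cores.
print(predicted_step_time(81e6, 32768))
```

A user could invert such a model to pick the smallest core count that still meets a clinical time budget.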
Real-time content-aware texturing for deformable surfaces
Animation of models often introduces distortions to their parameterisation, as these are typically optimised for a single frame. The net effect is that under deformation, the mapped features, i.e. UV texture maps, bump maps or displacement maps, may appear to stretch or scale in an undesirable way. Ideally, what we would like is for the appearance of such features to remain feasible given any underlying deformation. In this paper we introduce a real-time technique that reduces such distortions based on a distortion control (rigidity) map. In two versions of our proposed technique, the parameter space is warped in either an axis or a non-axis aligned manner based on the minimisation of a non-linear distortion metric. This in turn is solved using a highly optimised hybrid CPU-GPU strategy. The result is real-time dynamic content-aware texturing that reduces distortions in a controlled way. The technique can be applied to reduce distortions in a variety of scenarios, including reusing a low geometric complexity animated sequence with a multitude of detail maps, dynamic procedurally defined features mapped on deformable geometry and animation authoring previews on texture-mapped models. © 2013 ACM
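As a rough illustration of warping parameter space under a distortion-control (rigidity) map, the following hypothetical Python sketch optimises a 1-D axis-aligned warp by gradient descent on a rigidity-weighted distortion metric. The stretch values, rigidity map, and plain-Adam solver are illustrative assumptions, not the paper's hybrid CPU-GPU method.

```python
# Hypothetical 1-D axis-aligned parameter-space warp: columns that are both
# stretched by the deformation and marked rigid receive more parameter space,
# which counteracts the apparent stretching of mapped features.

import torch

stretch = torch.tensor([1.0, 2.0, 1.0, 1.0])     # assumed per-column stretch from deformation
rigidity = torch.tensor([0.0, 1.0, 0.0, 0.0])    # distortion-control map: 1 = keep rigid

raw = torch.zeros(len(stretch), requires_grad=True)   # warp parameters (pre-softplus increments)
opt = torch.optim.Adam([raw], lr=0.05)

for _ in range(500):
    inc = torch.nn.functional.softplus(raw)            # positive increments -> monotone warp
    inc = inc / inc.sum()                               # warped column widths sum to 1
    # apparent stretch of each column after the warp (uniform width would be 1/N)
    eff = stretch / (inc * len(stretch))
    loss = (rigidity * (eff - 1.0) ** 2).mean()         # rigidity-weighted distortion metric
    opt.zero_grad(); loss.backward(); opt.step()

inc = torch.nn.functional.softplus(raw)
print((inc / inc.sum()).detach())   # stretched, rigid column ends up with a larger share of u
```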
Binary Radiance Fields
In this paper, we propose binary radiance fields (BiRF), a storage-efficient
radiance field representation employing binary feature encoding that encodes
local features using binary encoding parameters in a format of either +1 or
-1. This binarization strategy lets us represent the feature grid with highly
compact feature encoding and a dramatic reduction in storage size. Furthermore,
our 2D-3D hybrid feature grid design enhances the compactness of feature
encoding as the 3D grid includes main components while 2D grids capture
details. In our experiments, the binary radiance field representation
outperforms state-of-the-art (SOTA) efficient radiance field models in
reconstruction quality while requiring less storage. In particular, our model
achieves impressive results in static scene reconstruction, with a PSNR of
31.53 dB for Synthetic-NeRF scenes, 34.26 dB for Synthetic-NSVF scenes, and
28.02 dB for Tanks and Temples scenes, while using only 0.7 MB, 0.8 MB, and 0.8 MB
of storage space, respectively. We hope the proposed binary radiance field
representation will make radiance fields more accessible without a storage
bottleneck.
Comment: 21 pages, 12 figures, and 11 tables.
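A minimal PyTorch sketch of the binary feature-encoding idea: real-valued grid parameters are binarised to +1/-1 with a sign function in the forward pass, while a straight-through estimator passes gradients to the underlying real-valued parameters. The 2D-3D hybrid grid layout is omitted, and all module names and shapes here are illustrative, not BiRF's actual implementation.

```python
import torch

class BinaryGrid(torch.nn.Module):
    def __init__(self, resolution=128, channels=4):
        super().__init__()
        # real-valued latent grid; only its sign is used at inference time
        self.weight = torch.nn.Parameter(
            torch.randn(channels, resolution, resolution, resolution) * 0.1)

    def forward(self, pts):
        # pts: (N, 3) in [-1, 1]; binarise the grid, then trilinearly interpolate.
        binary = torch.sign(self.weight)
        binary = binary + (self.weight - self.weight.detach())  # straight-through estimator
        grid = binary.unsqueeze(0)                               # (1, C, D, H, W)
        sample = pts.view(1, -1, 1, 1, 3)
        feats = torch.nn.functional.grid_sample(grid, sample, align_corners=True)
        return feats.view(grid.shape[1], -1).t()                 # (N, C)

feats = BinaryGrid()(torch.rand(1024, 3) * 2 - 1)
print(feats.shape)  # torch.Size([1024, 4])
```

Storing only the signs (1 bit per entry) rather than full-precision floats is what yields the dramatic storage reduction described above.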
DeepVoxels: Learning Persistent 3D Feature Embeddings
In this work, we address the lack of 3D understanding of generative neural
networks by introducing a persistent 3D feature embedding for view synthesis.
To this end, we propose DeepVoxels, a learned representation that encodes the
view-dependent appearance of a 3D scene without having to explicitly model its
geometry. At its core, our approach is based on a Cartesian 3D grid of
persistent embedded features that learn to make use of the underlying 3D scene
structure. Our approach combines insights from 3D geometric computer vision
with recent advances in learning image-to-image mappings based on adversarial
loss functions. DeepVoxels is supervised, without requiring a 3D reconstruction
of the scene, using a 2D re-rendering loss and enforces perspective and
multi-view geometry in a principled manner. We apply our persistent 3D scene
representation to the problem of novel view synthesis, demonstrating
high-quality results for a variety of challenging scenes.
Comment: Video: https://www.youtube.com/watch?v=HM_WsZhoGXw Supplemental
material: https://drive.google.com/file/d/1BnZRyNcVUty6-LxAstN83H79ktUq8Cjp/view?usp=sharing
Code: https://github.com/vsitzmann/deepvoxels Project page: https://vsitzmann.github.io/deepvoxels
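A toy PyTorch sketch of the core supervision idea: a persistent 3D grid of learned features is resampled at points along each target view's rays, aggregated, decoded by a 2D network, and trained with a 2D re-rendering loss. Projection details, occlusion reasoning, and the adversarial loss are omitted; shapes and names are illustrative assumptions, not the released DeepVoxels code.

```python
import torch

C, D = 8, 32
volume = torch.nn.Parameter(torch.randn(1, C, D, D, D) * 0.01)    # persistent 3D feature embedding
decoder = torch.nn.Sequential(                                     # 2D rendering network
    torch.nn.Conv2d(C, 32, 3, padding=1), torch.nn.ReLU(),
    torch.nn.Conv2d(32, 3, 1))

def render(sample_points):
    """sample_points: (H, W, S, 3) world points along each pixel's ray, in [-1, 1]."""
    H, W, S, _ = sample_points.shape
    grid = sample_points.view(1, H, W, S, 3).permute(0, 3, 1, 2, 4)    # (1, S, H, W, 3)
    feats = torch.nn.functional.grid_sample(volume, grid, align_corners=True)  # (1, C, S, H, W)
    pooled = feats.mean(dim=2)                                      # crude aggregation along the ray
    return decoder(pooled)                                          # (1, 3, H, W)

target = torch.rand(1, 3, 64, 64)                                   # ground-truth view
pred = render(torch.rand(64, 64, 16, 3) * 2 - 1)
loss = torch.nn.functional.l1_loss(pred, target)                    # 2D re-rendering loss
loss.backward()                                                     # gradients reach the 3D volume
```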
Steered mixture-of-experts for light field images and video: representation and coding
Research in light field (LF) processing has increased heavily over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, such 2-D regular grids are less suited for high-dimensional data such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about the light rays arriving at a certain region from any angle. The global model thus consists of a set of kernels that define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application to 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparably to the state of the art for low-to-mid-range bitrates with respect to the subjective visual quality of 4-D LF images. For 5-D LF video, we observe superior decorrelation and coding performance, with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently offers functionality for LF rendering that is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution.
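A small, hypothetical NumPy sketch of evaluating a steered mixture-of-experts model at query coordinates: each kernel is a Gaussian in the joint coordinate space whose normalised responsibility gates a per-kernel linear colour expert, giving a continuous approximation of the plenoptic function. The kernel parameters and the linear-expert form are illustrative assumptions, not the authors' implementation or coding scheme.

```python
import numpy as np

def smoe_eval(x, centers, covs, weights, biases):
    """x: (N, d) query coordinates (e.g. d=4 for a light field: u, v, s, t).
    centers: (K, d), covs: (K, d, d), weights: (K, d, 3), biases: (K, 3)."""
    diff = x[:, None, :] - centers[None, :, :]                        # (N, K, d)
    inv = np.linalg.inv(covs)                                         # (K, d, d)
    mahal = np.einsum('nkd,kde,nke->nk', diff, inv, diff)             # squared Mahalanobis distance
    resp = np.exp(-0.5 * mahal)                                       # kernel responsibilities
    gates = resp / (resp.sum(axis=1, keepdims=True) + 1e-12)          # soft gating
    expert = np.einsum('nkd,kdc->nkc', diff, weights) + biases[None]  # per-kernel linear experts
    return np.einsum('nk,nkc->nc', gates, expert)                     # (N, 3) colour

K, d = 16, 4
rgb = smoe_eval(np.random.rand(8, d),
                np.random.rand(K, d),
                np.stack([np.eye(d) * 0.05] * K),
                np.random.randn(K, d, 3) * 0.1,
                np.random.rand(K, 3))
print(rgb.shape)  # (8, 3)
```

Because the model is continuous in all coordinates, evaluating it between captured viewpoints gives the intrinsic view interpolation and super-resolution mentioned above.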
Spiking NeRF: Making Bio-inspired Neural Networks See through the Real World
Spiking neuron networks (SNNs) have been thriving on numerous tasks to
leverage their promising energy efficiency and exploit their potentialities as
biologically plausible intelligence. Meanwhile, the Neural Radiance Fields
(NeRF) render high-quality 3D scenes with massive energy consumption, and few
works delve into the energy-saving solution with a bio-inspired approach. In
this paper, we propose spiking NeRF (SpikingNeRF), which aligns the radiance
ray with the temporal dimension of SNN, to naturally accommodate the SNN to the
reconstruction of Radiance Fields. Thus, the computation turns into a
spike-based, multiplication-free manner, reducing the energy consumption. In
SpikingNeRF, each sampled point on the ray is matched onto a particular time
step, and represented in a hybrid manner where the voxel grids are maintained
as well. Based on the voxel grids, sampled points are determined whether to be
masked for better training and inference. However, this operation also incurs
irregular temporal length. We propose the temporal condensing-and-padding (TCP)
strategy to tackle the masked samples to maintain regular temporal length,
i.e., regular tensors, for hardware-friendly computation. Extensive experiments
on a variety of datasets demonstrate that our method reduces the
energy consumption on average and obtains comparable synthesis quality with the
ANN baseline
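An illustrative PyTorch sketch of the condensing-and-padding idea: per-ray valid (unmasked) samples are packed to the front of the temporal axis and the remainder zero-padded, so every ray yields a regular tensor. This is a simplified reading of TCP with illustrative names, not the authors' code.

```python
import torch

def condense_and_pad(samples, valid):
    """samples: (R, T, C) per-ray features; valid: (R, T) bool mask of unmasked samples.
    Returns packed samples of shape (R, T_max, C) plus the packed mask."""
    R, T, C = samples.shape
    t_max = int(valid.sum(dim=1).max())
    idx = torch.arange(T, device=samples.device).expand_as(valid)
    keys = torch.where(valid, idx, idx + T)                  # valid samples sort first, order kept
    order = torch.argsort(keys, dim=1)
    packed = torch.gather(samples, 1, order.unsqueeze(-1).expand_as(samples))
    packed_valid = torch.gather(valid, 1, order)
    packed = packed[:, :t_max] * packed_valid[:, :t_max].unsqueeze(-1)   # zero out padding
    return packed, packed_valid[:, :t_max]

x = torch.randn(4, 8, 3)                 # 4 rays, 8 time steps, 3 channels
mask = torch.rand(4, 8) > 0.5            # which samples survive the voxel-grid masking
packed, packed_mask = condense_and_pad(x, mask)
print(packed.shape, packed_mask.sum(dim=1))
```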
Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement
Human perception reliably identifies movable and immovable parts of 3D
scenes, and completes the 3D structure of objects and background from
incomplete observations. We learn this skill not via labeled examples, but
simply by observing objects move. In this work, we propose an approach that
observes unlabeled multi-view videos at training time and learns to map a
single image observation of a complex scene, such as a street with cars, to a
3D neural scene representation that is disentangled into movable and immovable
parts while plausibly completing its 3D structure. We separately parameterize
movable and immovable scene parts via 2D neural ground plans. These ground
plans are 2D grids of features aligned with the ground plane that can be
locally decoded into 3D neural radiance fields. Our model is trained
self-supervised via neural rendering. We demonstrate that the structure
inherent to our disentangled 3D representation enables a variety of downstream
tasks in street-scale 3D scenes using simple heuristics, such as extraction of
object-centric 3D representations, novel view synthesis, instance segmentation,
and 3D bounding box prediction, highlighting its value as a backbone for
data-efficient 3D scene understanding models. This disentanglement further
enables scene editing via object manipulation such as deletion, insertion, and
rigid-body motion.
Comment: Project page: https://prafullsharma.net/see3d
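A minimal, hypothetical PyTorch sketch of a 2D neural ground plan: features on a ground-aligned 2D grid are sampled bilinearly at a 3D point's (x, z) location and decoded, together with the height y, into a local radiance field (density and colour) by a small MLP. All shapes and names are illustrative assumptions, not the authors' implementation.

```python
import torch

class GroundPlan(torch.nn.Module):
    def __init__(self, res=64, channels=16):
        super().__init__()
        self.plan = torch.nn.Parameter(torch.zeros(1, channels, res, res))   # ground-aligned 2D grid
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(channels + 1, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, 4))                                          # (density, r, g, b)

    def forward(self, pts):
        # pts: (N, 3) with components (x, y, z) in [-1, 1]; y is height above the ground plane.
        xz = pts[:, [0, 2]].view(1, -1, 1, 2)                                # (1, N, 1, 2)
        feats = torch.nn.functional.grid_sample(self.plan, xz, align_corners=True)
        feats = feats.view(self.plan.shape[1], -1).t()                       # (N, C)
        out = self.mlp(torch.cat([feats, pts[:, 1:2]], dim=1))
        return torch.relu(out[:, :1]), torch.sigmoid(out[:, 1:])             # density, colour

density, colour = GroundPlan()(torch.rand(512, 3) * 2 - 1)
print(density.shape, colour.shape)  # torch.Size([512, 1]) torch.Size([512, 3])
```

Keeping separate plans for movable and immovable content, as the abstract describes, is what allows object-level editing such as deletion, insertion, and rigid-body motion.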