The effects of viewpoint on the virtual space of pictures
Pictorial displays whose primary purpose is to convey accurate information about the 3-D spatial layout of an environment are discussed. How, and how well, pictures can convey such information is also discussed. It is suggested that picture perception is not best approached as a unitary, indivisible process. Rather, it is a complex process depending on multiple, partially redundant, interacting sources of visual information for both the real surface of the picture and the virtual space beyond. Each picture must be assessed for the particular information that it makes available. This determines how accurately the virtual space represented by the picture is seen, as well as how it is distorted when seen from the wrong viewpoint.
Selecting texture resolution using a task-specific visibility metric
In real-time rendering, the appearance of scenes is greatly affected by the quality and resolution of the textures used for image synthesis. At the same time, the size of textures determines the performance and the memory requirements of rendering. As a result, finding the optimal texture resolution is critical, but also a non-trivial task, since the visibility of texture imperfections depends on the underlying geometry, illumination, interactions between several texture maps, and viewing positions. Ideally, we would like to automate the task with a visibility metric which could predict the optimal texture resolution. To maximize the performance of such a metric, it should be trained on a given task. This, however, requires sufficient user data, which is often difficult to obtain. To address this problem, we develop a procedure for training an image visibility metric for a specific task while reducing the effort required to collect new data. The procedure involves generating a large dataset using an existing visibility metric, followed by refining that dataset with the help of an efficient perceptual experiment. The refined dataset is then used to retune the metric. In this way, we augment sparse perceptual data into a large number of per-pixel annotated visibility maps, which serve as the training data for application-specific visibility metrics. While our approach is general and could potentially be applied to different image distortions, we demonstrate an application in a game engine, where we optimize the resolution of various textures, such as albedo and normal maps.
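The selection step this abstract describes can be sketched as a simple search over candidate resolutions: pick the smallest one whose predicted artifact visibility falls below a tolerance. The `predict_visibility` function below is a hypothetical stand-in for the trained per-pixel metric, not the paper's implementation; here it is simulated as a decreasing function of resolution purely for illustration.

```python
def predict_visibility(resolution: int) -> float:
    """Stand-in metric: predicted fraction of pixels with visible
    texture artifacts. A real metric would evaluate rendered images;
    this linear model (reference size 2048) is an assumption."""
    reference = 2048
    return max(0.0, 1.0 - resolution / reference)


def select_resolution(candidates, threshold=0.05):
    """Return the smallest candidate resolution whose predicted
    artifact visibility stays at or below the threshold."""
    for res in sorted(candidates):
        if predict_visibility(res) <= threshold:
            return res
    return max(candidates)  # fall back to the largest available size
```

With the stand-in model above, `select_resolution([256, 512, 1024, 2048], threshold=0.6)` returns 1024, since 1024 is the first size whose predicted visibility (0.5) drops below the tolerance.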
Haptic Hybrid Prototyping (HHP): An AR Application for Texture Evaluation with Semantic Content in Product Design
The manufacture of prototypes is costly in economic and temporal terms, and carrying it out requires accepting certain deviations from the final finishes. This article proposes Haptic Hybrid Prototyping (HHP), a haptic-visual product prototyping method created to help product design teams evaluate and select the semantic information conveyed between product and user through the texturing and ribs of a product in the early stages of conceptualization. To evaluate this tool, an experiment was conducted in which the haptic experience during interaction with final products was compared with that experienced through the HHP. The responses of the interviewees coincided in both situations in 81% of the cases. It was concluded that the HHP reveals the semantic information transmitted through haptic-visual means between product and user, and also quantifies the clarity with which this information is transmitted. This new tool therefore makes it possible to reduce both the manufacturing lead time of prototypes and the conceptualization phase of the product, providing information on the future success of the product in the market and its economic return.
SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields
Neural Radiance Fields (NeRFs) have emerged as a popular approach for novel view synthesis. While NeRFs are quickly being adapted for a wider set of applications, intuitively editing NeRF scenes is still an open challenge. One important editing task is the removal of unwanted objects from a 3D scene, such that the replaced region is visually plausible and consistent with its context. We refer to this task as 3D inpainting. In 3D, solutions must be both consistent across multiple views and geometrically valid. In this paper, we propose a novel 3D inpainting method that addresses these challenges. Given a small set of posed images and sparse annotations in a single input image, our framework first rapidly obtains a 3D segmentation mask for a target object. Using the mask, a perceptual optimization-based approach is then introduced that leverages learned 2D image inpainters, distilling their information into 3D space while ensuring view consistency. We also address the lack of a diverse benchmark for evaluating 3D scene inpainting methods by introducing a dataset of challenging real-world scenes. In particular, our dataset contains views of the same scene with and without a target object, enabling more principled benchmarking of the 3D inpainting task. We first demonstrate the superiority of our approach on multiview segmentation, comparing to NeRF-based methods and 2D segmentation approaches. We then evaluate on the task of 3D inpainting, establishing state-of-the-art performance against other NeRF manipulation algorithms, as well as a strong 2D image inpainter baseline. Comment: Project Page: https://spinnerf3d.github.i
Perceptually optimized real-time computer graphics
Perceptual optimization, the application of human visual perception models to remove imperceptible components in a graphics system, has been proven effective in achieving significant computational speedup. Previous implementations of this technique have focused on spatial level of detail reduction, which typically results in noticeable degradation of image quality. This thesis introduces refresh rate modulation (RRM), a novel perceptual optimization technique that produces better performance enhancement while more effectively preserving image quality and resolving static scene elements in full detail. In order to demonstrate the effectiveness of this technique, a graphics framework has been developed that interfaces with eye tracking hardware to take advantage of user fixation data in real-time. Central to the framework is a high-performance GPGPU ray-tracing engine written in OpenCL. RRM reduces the frequency with which pixels outside of the foveal region are updated by the ray-tracer. A persistent pixel buffer is maintained such that peripheral data from previous frames provides context for the foveal image in the current frame. Traditional optimization techniques have also been incorporated into the ray-tracer for improved performance. Applying the RRM technique to the ray-tracing engine results in a speedup of 2.27 (252 fps vs. 111 fps at 1080p) for the classic Whitted scene with reflection and transmission enabled. A speedup of 3.41 (140 fps vs. 41 fps at 1080p) is observed for a high-polygon scene that depicts the Stanford Bunny. A small pilot study indicates that RRM achieves these results with minimal impact to perceived image quality. A secondary investigation is conducted regarding the performance benefits of increasing physics engine error tolerance for bounding volume hierarchy based collision detection when the scene elements involved are in the user's periphery.
The open-source Bullet Physics Library was used to add accurate collision detection to the full resolution ray-tracing engine. For a scene with a static high-polygon model and 50 moving spheres, a speedup of 1.8 was observed for physics calculations. The development and integration of this subsystem demonstrates the extensibility of the graphics framework.
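The core of the RRM scheme described above is a per-pixel update decision: foveal pixels are re-traced every frame, while peripheral pixels are refreshed less often and otherwise keep their value from the persistent buffer. A minimal sketch of that decision logic follows; the foveal radius, refresh interval, and the pixel-index staggering are assumptions for illustration, not values from the thesis.

```python
import math


def should_update(px, py, gaze, frame,
                  foveal_radius=100, peripheral_interval=4):
    """Decide whether to re-trace pixel (px, py) this frame.

    Pixels within foveal_radius of the gaze point update every frame;
    peripheral pixels update only every peripheral_interval frames,
    staggered by pixel index to avoid visible refresh waves.
    (All parameter values are illustrative assumptions.)"""
    dist = math.hypot(px - gaze[0], py - gaze[1])
    if dist <= foveal_radius:
        return True
    return (frame + px + py) % peripheral_interval == 0


def render_frame(buffer, frame, gaze, trace):
    """Persistent-buffer rendering pass: stale peripheral pixels
    simply retain their value from a previous frame."""
    for y, row in enumerate(buffer):
        for x in range(len(row)):
            if should_update(x, y, gaze, frame,
                             foveal_radius=2, peripheral_interval=4):
                row[x] = trace(x, y)
```

With a peripheral interval of 4, roughly three quarters of the peripheral ray-tracing work is skipped on any given frame, which is the source of the reported speedup.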
Relating Objective and Subjective Performance Measures for AAM-based Visual Speech Synthesizers
We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that utilizes acoustic features as input, and one that utilizes a phonetic transcription as input. Both synthesizers are trained using the same data, and their performance is measured using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect, we find that the subjective score for the entire sequence is lower than that for sequences generated by our synthesizers. This observation motivates further consideration of an often ignored issue: to what extent are subjective measures correlated with objective measures of performance? Significantly, we find that the most commonly used objective measures of performance are not necessarily the best indicators of viewer perception of quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality.
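The dynamic time warp cost named in the last sentence is the standard DTW alignment cost between two parameter trajectories. A minimal sketch over 1-D sequences follows; using absolute difference as the local distance is an assumption for illustration (the paper's parameters are AAM vectors, for which a vector norm would be used instead).

```python
def dtw_cost(a, b):
    """Classic dynamic time warp cost between two 1-D trajectories.

    D[i][j] holds the minimum cumulative cost of aligning the first
    i elements of a with the first j elements of b; each step pays
    the local distance |a[i-1] - b[j-1]| and extends the cheaper of
    the three predecessor alignments (match, insertion, deletion)."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # skip in a
                                 D[i][j - 1],      # skip in b
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Because DTW tolerates small timing differences, a synthesized trajectory that is slightly time-shifted relative to the ground truth incurs little cost, which plausibly matches viewer perception better than a frame-by-frame error measure.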