474 research outputs found

    Binaural Rendering of Spherical Microphone Array Signals

    Get PDF
    The presentation of extended reality for consumer and professional applications requires major advancements in the capture and reproduction of its auditory component to provide a plausible listening experience. A spatial representation of the acoustic environment needs to be considered to allow for movement within or an interaction with the augmented or virtual reality. This thesis focuses on the application of capturing a real-world acoustic environment by means of a spherical microphone array with the subsequent head-tracked binaural reproduction to a single listener via headphones. The introduction establishes the fundamental concepts and relevant terminology for non-experts of the field. Furthermore, the specific challenges of the method due to spatial oversampling the sound field as well as physical limitations and imperfections of the microphone array are presented to the reader. The first objective of this thesis was to develop a software in the Python programming language, which is capable of performing all required computations for the acoustic rendering of the captured signals in real-time. The implemented processing pipeline was made publicly available under an open-source license. Secondly, specific parameters of the microphone array hardware as well as the rendering software that are required for a perceptually high reproduction quality have been identified and investigated by means of multiple user studies. Lastly, the results provide insights into how unwanted additive noise components in the captured microphone signals from different spherical array configurations contribute to the reproduced ear signals

    Perceptual Visibility Model for Temporal Contrast Changes in Periphery

    Get PDF
    Modeling perception is critical for many applications and developments in computer graphics to optimize and evaluate content generation techniques. Most of the work to date has focused on central (foveal) vision. However, this is insufficient for novel wide-field-of-view display devices, such as virtual and augmented reality headsets. Furthermore, the perceptual models proposed for the fovea do not readily extend to the off-center, peripheral visual field, where human perception is drastically different. In this paper, we focus on modeling the temporal aspect of visual perception in the periphery. We present new psychophysical experiments that measure the sensitivity of human observers to different spatio-temporal stimuli across a wide field of view. We use the collected data to build a perceptual model for the visibility of temporal changes at different eccentricities in complex video content. Finally, we discuss, demonstrate, and evaluate several problems that can be addressed using our technique. First, we show how our model enables injecting new content into the periphery without distracting the viewer, and we discuss the link between the model and human attention. Second, we demonstrate how foveated rendering methods can be evaluated and optimized to limit the visibility of temporal aliasing

    Image synthesis based on a model of human vision

    Get PDF
    Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision

    Validating Stereoscopic Volume Rendering

    Get PDF
    The evaluation of stereoscopic displays for surface-based renderings is well established in terms of accurate depth perception and tasks that require an understanding of the spatial layout of the scene. In comparison direct volume rendering (DVR) that typically produces images with a high number of low opacity, overlapping features is only beginning to be critically studied on stereoscopic displays. The properties of the specific images and the choice of parameters for DVR algorithms make assessing the effectiveness of stereoscopic displays for DVR particularly challenging and as a result existing literature is sparse with inconclusive results. In this thesis stereoscopic volume rendering is analysed for tasks that require depth perception including: stereo-acuity tasks, spatial search tasks and observer preference ratings. The evaluations focus on aspects of the DVR rendering pipeline and assess how the parameters of volume resolution, reconstruction filter and transfer function may alter task performance and the perceived quality of the produced images. The results of the evaluations suggest that the transfer function and choice of recon- struction filter can have an effect on the performance on tasks with stereoscopic displays when all other parameters are kept consistent. Further, these were found to affect the sensitivity and bias response of the participants. The studies also show that properties of the reconstruction filters such as post-aliasing and smoothing do not correlate well with either task performance or quality ratings. Included in the contributions are guidelines and recommendations on the choice of pa- rameters for increased task performance and quality scores as well as image based methods of analysing stereoscopic DVR images

    Application of sound source separation methods to advanced spatial audio systems

    Full text link
    This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in twochannel stereo format, special up-converters are required to use advanced spatial audio reproduction formats, such as WFS. This is due to the fact that WFS needs the original source signals to be available, in order to accurately synthesize the acoustic field inside an extended listening area. Thus, an object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to find out what the original signals were. Therefore, SSS algorithms can be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult and stronger assumptions have to be taken, often related to the sparsity of the sources under some signal transformation. This thesis is focused on the application of SSS techniques to the spatial sound reproduction field. As a result, its contributions can be categorized within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables to perform a fast and unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same clustering type, the features considered by each of them are related to different localization cues that enable to perform separation of either instantaneous or real mixtures.Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is afterwards evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented.Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969Palanci

    Hybrid Rugosity Mesostructures (HRMs) for fast and accurate rendering of fine haptic detail

    Get PDF
    The haptic rendering of surface mesostructure (fine relief features) in dense triangle meshes requires special structures, equipment, and high sampling rates for detailed perception of rugged models. Low cost approaches render haptic texture at the expense of fidelity of perception. We propose a faster method for surface haptic rendering using image-based Hybrid Rugosity Mesostructures (HRMs), paired maps with per-face heightfield displacements and normal maps, which are layered on top of a much decimated mesh, effectively adding greater surface detail than actually present in the geometry. The haptic probe’s force response algorithm is modulated using the blended HRM coat to render dense surface features at much lower costs. The proposed method solves typical problems at edge crossings, concave foldings and texture transitions. To prove the wellness of the approach, a usability testbed framework was built to measure and compare experimental results of haptic rendering approaches in a common set of specially devised meshes, HRMs, and performance tests. Trial results of user testing evaluations show the goodness of the proposed HRM technique, rendering accurate 3D surface detail at high sampling rates, deriving useful modeling and perception thresholds for this technique.Peer ReviewedPostprint (published version

    The influence of olfaction on the perception of high-fidelity computer graphics

    Get PDF
    The computer graphics industry is constantly demanding more realistic images and animations. However, producing such high quality scenes can take a long time, even days, if rendering on a single PC. One of the approaches that can be used to speed up rendering times is Visual Perception, which exploits the limitations of the Human Visual System, since the viewers of the results will be humans. Although there is an increasing body of research into how haptics and sound may affect a viewer's perception in a virtual environment, the in uence of smell has been largely ignored. The aim of this thesis is to address this gap and make smell an integral part of multi-modal virtual environments. In this work, we have performed four major experiments, with a total of 840 participants. In the experiments we used still images and animations, related and unrelated smells and finally, a multi-modal environment was considered with smell, sound and temperature. Beside this, we also investigated how long it takes for an average person to adapt to smell and what affect there may be when performing a task in the presence of a smell. The results of this thesis clearly show that a smell present in the environment firstly affects the perception of object quality within a rendered image, and secondly, enables parts of the scene or the whole animation to be selectively rendered in high quality while the rest can be rendered in a lower quality without the viewer noticing the drop in quality. Such selective rendering in the presence of smell results in significant computational performance gains without any loss in the quality of the image or animations perceived by a viewer

    Low-Latency Rendering With Dataflow Architectures

    Get PDF
    Recent years have seen a resurgence of virtual reality (VR), sparked by the repurposing of low-cost COTS components. VR aims to generate stimuli that appear to come from a source other than the interface through which they are delivered. The synthetic stimuli replace real-world stimuli, and transport the user to another, perhaps imaginary, “place.” To do this, we must overcome many challenges, often related to matching the synthetic stimuli to the expectations and behavior of the real world. One way in which the stimuli can fail is its latency–– the time between a user's action and the computer's response. We constructed a novel VR renderer, that optimized latency above all else. Our prototype allowed us to explore how latency affects human–computer interaction. We had to completely reconsider the interaction between time, space, and synchronization on displays and in the traditional graphics pipeline. Using a specialized architecture––dataflow computing––we combined consumer, industrial, and prototype components to create an integrated 1:1 room-scale VR system with a latency of under 3 ms. While this was prototype hardware, the considerations in achieving this performance inform the design of future VR pipelines, and our human factors studies have provided new and sometimes surprising contributions to the body of knowledge on latency in HCI

    Local sound field synthesis

    Get PDF
    This thesis investigates the physical and perceptual properties of selected methods for (Local) Sound Field Synthesis ((L)SFS). In agreement with numerical sound field simulations, a specifically developed geometric model shows an increase of synthesis accuracy for LSFS compared to conventional SFS approaches. Different (L)SFS approaches are assessed within listening experiments, where LSFS performs at least as good as conventional methods for azimuthal sound source localisation and achieves a significant increase of timbral fidelity for distinct parametrisations.Die Arbeit untersucht die physikalischen und perzeptiven Eigenschaften von ausgewählten Verfahren zur (lokalen) Schallfeldsynthese ((L)SFS). Zusammen mit numerischen Simulationen zeigt ein eigens entwickeltes geometrisches Modell, dass LSFS gegenüber konventioneller SFS zu einer genauere Synthese führt. Die Verfahren werden in Hörversuchen evaluiert, wobei LSFS bei der horizontalen Lokalisierung von Schallquellen eine Genauigkeit erreicht, welche mindestens gleich der von konventionellen Methoden ist. Für bestimmte Parametrierung wird eine signifikant verbesserte klangliche Treue erreicht
    corecore