
    Efficient rendering for three-dimensional displays

    This thesis explores more efficient methods for visualizing point data sets on three-dimensional (3D) displays. Point data sets are used in many scientific applications, e.g. cosmological simulations. Visualizing these data sets in 3D is desirable because it can more readily reveal structure and unknown phenomena. However, cutting-edge scientific point data sets are very large, and producing/rendering even a single image is expensive. Furthermore, current literature suggests that the ideal number of views for 3D (multiview) displays can be in the hundreds, which compounds the costs. The accepted notion that many views are required for 3D displays is challenged through a novel human-factors trial. The results suggest that humans are surprisingly insensitive to the number of viewpoints with regard to their task performance, when occlusion in the scene is not a dominant factor. Existing stereoscopic rendering algorithms can have high set-up costs, which limits their use, and none are tuned for uncorrelated 3D point rendering. This thesis shows that it is possible to improve rendering speeds for a low number of views by perspective reprojection. The novelty of the approach lies in delaying the reprojection and generation of the viewpoints until the fragment stage of the pipeline, and in streamlining the rendering pipeline for points only. Theoretical analysis suggests a fragment reprojection scheme will render at least 2.8 times faster than naïvely re-rendering the scene from multiple viewpoints. Building upon the fragment reprojection technique, further rendering performance is shown to be possible (at the cost of some rendering accuracy) by restricting the amount of reprojection required according to the stereoscopic resolution of the display. A significant benefit is that the scene depth can be mapped arbitrarily to the perceived depth range of the display at no extra cost over a single-region mapping approach. Using an average case study (rendering 500k points for a 9-view high-definition 3D display), theoretical analysis suggests that this new approach is capable of twice the performance gain of simply reprojecting every fragment, and quantitative measures show the algorithm to be 5 times faster than a naïve rendering approach. Further detailed quantitative results, under varying scenarios, are provided and discussed.
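
    As a rough illustration of the delayed, per-fragment reprojection idea, the sketch below applies the standard off-axis (sheared-frustum) disparity relation to shift a fragment rendered from the central camera into each view of a horizontal-parallax multiview display. This is a minimal numpy sketch under assumed conventions, not the thesis's actual fragment-stage shader; the function name, parameters, and the 9-view geometry are illustrative.

```python
import numpy as np

def reproject_fragment(x_px, depth, baselines, focal_px, z_conv):
    """Shift a central-view fragment into each off-axis view.

    For a camera translated by baseline b with a convergence plane at
    z_conv, a fragment at eye-space depth z moves on screen by
    d = f * b * (1/z_conv - 1/z) pixels (off-axis stereo disparity).
    """
    baselines = np.asarray(baselines, dtype=float)
    disparity = focal_px * baselines * (1.0 / z_conv - 1.0 / depth)
    return x_px + disparity

# Hypothetical 9-view display: views spread over a 6.5 cm eye baseline.
offsets = (np.arange(9) - 4) * 0.065 / 8
print(reproject_fragment(x_px=960.0, depth=3.0, baselines=offsets,
                         focal_px=1500.0, z_conv=2.0))
```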

    Real-time synthetic primate vision


    Optimization techniques for computationally expensive rendering algorithms

    Realistic rendering in computer graphics simulates the interactions of light and surfaces. While many accurate models for surface reflection and lighting, including solid surfaces and participating media, have been described, most of them rely on intensive computation. Common practices such as adding constraints and assumptions can increase performance. However, they may compromise the quality of the resulting images or the variety of phenomena that can be accurately represented. In this thesis, we will focus on rendering methods that require high amounts of computational resources. Our intention is to consider several conceptually different approaches capable of reducing these requirements with only limited implications for the quality of the results. The first part of this work will study rendering of time-varying participating media. Examples of this type of matter are smoke, optically thick gases and any material that, unlike a vacuum, scatters and absorbs the light that travels through it. We will focus on a subset of algorithms that approximate realistic illumination using images of real-world scenes. Starting from the traditional ray marching algorithm, we will suggest and implement different optimizations that will allow performing the computation at interactive frame rates. This thesis will also analyze two different aspects of the generation of anti-aliased images. The first targets the rendering of screen-space anti-aliased images and the reduction of the artifacts generated in rasterized lines and edges. We expect to describe an implementation that, working as a post-process, is efficient enough to be added to existing rendering pipelines with reduced performance impact. A third method will take advantage of the limitations of the human visual system (HVS) to reduce the resources required to render temporally anti-aliased images. While film and digital cameras naturally produce motion blur, rendering pipelines need to simulate it explicitly. This process is known to be one of the most important burdens for every rendering pipeline. Motivated by this, we plan to run a series of psychophysical experiments targeted at identifying groups of motion-blurred images that are perceptually equivalent. A possible outcome is the proposal of criteria that may lead to reductions of the rendering budgets.
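
    The starting point named above, ray marching through a participating medium, can be sketched in a few lines. The following is a minimal emission-absorption marcher with early ray termination, one example of the kind of optimization that trades negligible accuracy for speed; the function names and the simplified source term are assumptions for illustration, not the thesis's implementation.

```python
import math

def ray_march(sigma_a, sigma_s, emission, t_max, n_steps=128):
    """March one ray through a medium, accumulating emitted radiance
    attenuated by Beer-Lambert transmittance.

    sigma_a, sigma_s, emission are callables of the ray parameter t:
    absorption and scattering coefficients, and emitted radiance per
    unit length (a simplified source term).
    """
    dt = t_max / n_steps
    radiance, transmittance = 0.0, 1.0
    for i in range(n_steps):
        t = (i + 0.5) * dt                      # midpoint sample
        sigma_t = sigma_a(t) + sigma_s(t)       # extinction coefficient
        radiance += transmittance * emission(t) * dt
        transmittance *= math.exp(-sigma_t * dt)
        if transmittance < 1e-4:                # early ray termination
            break
    return radiance
```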

    Mixed-reality visualization environments to facilitate ultrasound-guided vascular access

    Ultrasound-guided needle insertions at the site of the internal jugular vein (IJV) are routinely performed to access the central venous system. Ultrasound-guided insertions still carry high rates of carotid artery puncture, as clinicians rely on 2D information to perform a 3D procedure. The limitations of 2D ultrasound guidance motivated the research question: “Do 3D ultrasound-based environments improve IJV needle insertion accuracy?” We addressed this by developing advanced surgical navigation systems based on tracked surgical tools and ultrasound with various visualizations. The point-to-line ultrasound calibration enables the use of tracked ultrasound. We automated the fiducial localization required for this calibration method such that fiducials can be automatically localized within 0.25 mm of the manual equivalent. The point-to-line calibration obtained with both manual and automatic localizations produced average normalized distance errors of less than 1.5 mm from point targets. Another calibration method was developed that registers an optical tracking system and the VIVE Pro head-mounted display (HMD) tracking system with sub-millimetre and sub-degree accuracy compared to ground truth values. This co-calibration enabled the development of an HMD needle navigation system, in which the calibrated ultrasound image and tracked models of the needle, needle trajectory, and probe were visualized in the HMD. In a phantom experiment, 31 clinicians had a 96% success rate using the HMD system compared to 70% for the ultrasound-only approach (p = 0.018). We developed a machine-learning-based vascular reconstruction pipeline that automatically returns accurate 3D reconstructions of the carotid artery and IJV given sequential tracked ultrasound images. This reconstruction pipeline was used to develop a surgical navigation system, where tracked models of the needle, needle trajectory, and the 3D z-buffered vasculature from a phantom were visualized in a common coordinate system on a screen. This system improved the insertion accuracy and resulted in 100% success rates compared to 70% under ultrasound guidance (p = 0.041) across 20 clinicians during the phantom experiment. Overall, accurate calibrations and machine learning algorithms enable the development of advanced 3D ultrasound systems for needle navigation, both in an immersive first-person perspective and on a screen, illustrating that 3D ultrasound environments outperform the 2D ultrasound guidance used clinically.
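
    For readers unfamiliar with point-to-line calibration: the optimization seeks the image-to-tracker transform that brings wire fiducials, localized as points in the ultrasound image, onto the known 3D lines of the wires. The sketch below shows the residual such a calibration minimizes; the names, the homogeneous-coordinate convention, and the data layout are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def point_to_line_distance(p, a, d):
    """Distance from point p to the line through a with unit direction d."""
    v = p - a
    return np.linalg.norm(v - np.dot(v, d) * d)

def calibration_rms(T, image_points_mm, lines):
    """RMS point-to-line residual of a candidate 4x4 transform T.

    image_points_mm: localized fiducials as homogeneous (x, y, 0, 1)
    vectors, already scaled from pixels to millimetres.
    lines: list of (point_on_line, unit_direction) wire definitions
    in tracker coordinates.
    """
    residuals = [point_to_line_distance((T @ p)[:3], a, d)
                 for p, (a, d) in zip(image_points_mm, lines)]
    return float(np.sqrt(np.mean(np.square(residuals))))
```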

    Quality of Experience in Immersive Video Technologies

    Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high definition. Nevertheless, considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewers' QoE, we apply the proposed framework for designing experiments and analyzing collected subjects' ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time-consuming, expensive, and not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this aim, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reduce the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewers' preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts are not only of a fundamental nature, but have also shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency.
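
    A core analysis step in any such subjective evaluation framework is turning raw ratings into mean opinion scores with confidence intervals. The sketch below shows the standard Student's t computation used in ITU-style experiments; it is a generic illustration under assumed names, not the specific analysis tools proposed in the thesis.

```python
import numpy as np
from scipy import stats

def mos_with_ci(ratings, confidence=0.95):
    """Mean opinion score and confidence interval for one stimulus.

    ratings: scores given by the subject pool, e.g. on a 1-5 ACR scale.
    Returns (MOS, (lower bound, upper bound)).
    """
    ratings = np.asarray(ratings, dtype=float)
    mos = ratings.mean()
    half = (stats.t.ppf((1 + confidence) / 2, len(ratings) - 1)
            * ratings.std(ddof=1) / np.sqrt(len(ratings)))
    return mos, (mos - half, mos + half)

print(mos_with_ci([4, 5, 4, 3, 4, 5, 4, 4]))
```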

    Situated face detection

    In the last twenty years, important advances have been made in the field of automatic face processing, given the importance of human faces for personal identification, emotional expression and verbal and non-verbal communication. The very first step in a face processing algorithm is the detection of faces; while this is a trivial problem in controlled environments, the detection of faces in real environments is still a challenging task. Until now, the most successful approaches for face detection represent the face as a grey-level pattern, and the problem itself is considered as the classification between "face" and "non-face" patterns. Satisfactory results have been achieved in this area. The main disadvantage is that an exhaustive search has to be done on each image in order to locate the faces. This search normally involves testing every single position on the image at different scales, and although this does not represent an important drawback in off-line face processing systems, in those cases where a real-time response is needed it is still a problem. In the different proposed methods for face detection, the "observer" is a disembodied entity, which holds no relationship with the observed scene. This thesis presents a framework for an efficient location of faces in real scenes, in which, by considering both the observer to be situated in the world, and the relationships that hold between the two, a set of constraints in the search space can be defined. The constraints rely on two main assumptions: first, the observer can purposively interact with the world (i.e. change its position relative to the observed scene) and second, the camera is fully calibrated. The first source of constraint is the structural information about the observer's environment, represented as a depth map of the scene in front of the camera. From this representation the search space can be constrained in terms of the range of scales where a face might be found at different positions in the image. The second source of constraint is the geometrical relationship between the camera and the scene, which allows us to project a model of the subject into the scene in order to eliminate those areas where faces are unlikely to be found. In order to test the proposed framework, a system based on the premises stated above was constructed. It is based on three different modules: a face/non-face classifier, a depth estimation module and a search module. The classifier is composed of a set of convolutional neural networks (CNN) that were trained to differentiate between face and non-face patterns; the depth estimation module uses a multilevel algorithm to compute the scene depth map from a sequence of captured images; and the search module projects the depth information and the subject model into the image where the search will be performed in order to constrain the search space. Finally, the proposed system was validated by running a set of experiments on the individual modules and then on the whole system.
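
    The scale constraint from the depth map follows directly from the pinhole model: a face of physical width W at depth Z spans roughly f·W/Z pixels for focal length f in pixels. The sketch below is a minimal rendering of that idea; the assumed average face width and the tolerance band are illustrative parameters, not values from the thesis.

```python
import numpy as np

FACE_WIDTH_M = 0.16  # assumed average human face width in metres

def candidate_scales(depth_map, focal_px, tolerance=0.25):
    """Per-pixel expected face width in pixels, w = f * W / Z.

    The classifier then only needs to test windows whose size lies
    within +/- tolerance of this expectation at each image position,
    instead of exhaustively sweeping all scales everywhere.
    """
    with np.errstate(divide="ignore"):
        w = focal_px * FACE_WIDTH_M / depth_map
    return w * (1.0 - tolerance), w * (1.0 + tolerance)
```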

    Using Bilinear Transformations to Estimate the Ratios of Accommodation and Vergence Responses of Binocular Vision


    Colour depth-from-defocus incorporating experimental point spread function measurements

    Depth-From-Defocus (DFD) is a monocular computer vision technique for creating depth maps from two images taken on the same optical axis with different intrinsic camera parameters. A pre-processing stage for optimally converting colour images to monochrome using a linear combination of the colour planes has been shown to improve the accuracy of the depth map. It was found that the first component formed using Principal Component Analysis (PCA) and a technique to maximise the signal-to-noise ratio (SNR) performed better than using an equal weighting of the colour planes with an additive noise model. When the noise is non-isotropic, the Mean Square Error (MSE) of the depth map obtained by maximising the SNR was improved by 7.8 times compared to an equal weighting and 1.9 times compared to PCA. The fractal dimension (FD) of a monochrome image gives a measure of its roughness, and an algorithm was devised to maximise its FD through colour mixing. The formulation using a fractional Brownian motion (fBm) model reduced the SNR and thus produced depth maps that were less accurate than using PCA or an equal weighting. An active DFD algorithm to reduce the image overlap problem has been developed, called Localisation through Colour Mixing (LCM), that uses a projected colour pattern. Simulation results showed that LCM produces an MSE 9.4 times lower than equal weighting and 2.2 times lower than PCA. The Point Spread Function (PSF) of a camera system models how a point source of light is imaged. For depth maps to be accurately created using DFD, a high-precision PSF must be known. Improvements to a sub-sampled, knife-edge based technique are presented that account for non-uniform illumination of the light box; this reduced the MSE by 25%. The Generalised Gaussian is presented as a model of the PSF and shown to be up to 16 times better than the conventional models of the Gaussian and pillbox.
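
    Of the colour-to-monochrome weightings compared above, the PCA variant is the simplest to sketch: project each pixel's colour onto the first principal component of the image's colour distribution. The snippet below is a minimal numpy illustration under assumed conventions, not the thesis's pre-processing code.

```python
import numpy as np

def pca_grayscale(img):
    """Monochrome conversion using the first principal component.

    img: H x W x 3 RGB array. The weighting vector is the direction of
    greatest colour variance; its sign is fixed so weights sum positive.
    """
    pixels = img.reshape(-1, 3).astype(float)
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    w = vt[0]                 # first principal direction
    if w.sum() < 0:           # resolve the sign ambiguity
        w = -w
    return (pixels @ w).reshape(img.shape[:2])
```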

    Approximate Spatial Layout Processing in the Visual System: Modeling Texture-Based Segmentation and Shape Estimation

    Moving through the environment, grasping objects, orienting oneself, and countless other tasks all require information about spatial organization. This in turn requires determining where surfaces, objects and other elements of a scene are located and how they are arranged. Humans and other animals can extract spatial organization from vision rapidly and automatically. To better understand this capability, it would be useful to know how the visual system can make an initial estimate of the spatial layout. Without time or opportunity for a more careful analysis, a rough estimate may be all that the system can extract. Nevertheless, rough spatial information may be sufficient for many purposes, even if it is devoid of details that are important for tasks such as object recognition. The human visual system uses many sources of information for estimating layout. Here I focus on one source in particular: visual texture. I present a biologically reasonable, computational model of how the system can exploit patterns of texture for performing two basic tasks in spatial layout processing: locating possible surfaces in the visual input, and estimating their approximate shapes. Separately, these two tasks have been studied extensively, but they have not previously been examined together in the context of a model grounded in neurophysiology and psychophysics. I show that by integrating segmentation and shape estimation, a system can share information between these processes, allowing the processes to constrain and inform each other as well as save on computations. The model developed here begins with the responses of simulated complex cells of the primary visual cortex, and combines a weak membrane/functional minimization approach to segmentation with a shape estimation method based on tracking changes in the average dominant spatial frequencies across a surface. It includes mechanisms for detecting untextured areas and flat areas in an input image. In support of the model, I present a software simulation that can perform texture-based segmentation and shape estimation on images containing multiple, curved, textured surfaces.
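
    The shape cue the model tracks, a change in average dominant spatial frequency across a surface, can be illustrated with a crude per-patch Fourier estimate: texture on a receding or curving surface shifts toward higher frequencies as it compresses. The sketch below is a simple proxy for that measurement, not the model's complex-cell front end; the magnitude-weighted mean radial frequency is an assumed stand-in for the dominant frequency.

```python
import numpy as np

def mean_radial_frequency(patch):
    """Magnitude-weighted mean radial spatial frequency of a 2-D patch.

    Higher values indicate finer (more compressed) texture; tracking
    this quantity across a surface gives a rough shape-from-texture cue.
    """
    f = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    cy, cx = patch.shape[0] // 2, patch.shape[1] // 2
    f[cy, cx] = 0.0                              # discard the DC term
    fy, fx = np.indices(f.shape)
    radius = np.hypot(fy - cy, fx - cx)          # frequency magnitude
    return (radius * f).sum() / f.sum()
```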

    Evolution of the subcontinental lithosphere during Mesozoic Tethyan rifting: constraints from the External Ligurian mantle section (Northern Apennine, Italy)

    Our study is focussed on mantle bodies from the External Ligurian ophiolites, within the Monte Gavi and Monte Sant'Agostino areas. Here, two distinct pyroxenite-bearing mantle sections were recognized, mainly based on their plagioclase-facies evolution. The Monte Gavi mantle section is nearly undeformed and records reactive melt infiltration under plagioclase-facies conditions. This process involved both peridotites (clinopyroxene-poor lherzolites) and enclosed spinel pyroxenite layers, and occurred at 0.7–0.8 GPa. In the Monte Gavi peridotites and pyroxenites, the spinel-facies clinopyroxene was replaced by Ca-rich plagioclase and new orthopyroxene, typically associated with secondary clinopyroxene. The reactive melt migration caused an increase of TiO2 contents in relict clinopyroxene and spinel, with the latter also recording a Cr2O3 increase. In the Monte Gavi peridotites and pyroxenites, geothermometers based on slowly diffusing elements (REE and Y) record high-temperature conditions (1200–1250 °C) related to the melt infiltration event, followed by subsolidus cooling until ca. 900 °C. The Monte Sant'Agostino mantle section is characterized by widespread ductile shearing with no evidence of melt infiltration. The deformation recorded by the Monte Sant'Agostino peridotites (clinopyroxene-rich lherzolites) occurred at 750–800 °C and 0.3–0.6 GPa, leading to protomylonitic to ultramylonitic textures with extreme grain size reduction (10–50 μm). Compared to the peridotites, the enclosed pyroxenite layers gave higher temperature-pressure estimates for the plagioclase-facies re-equilibration (870–930 °C and 0.8–0.9 GPa). We propose that the earlier plagioclase crystallization in the pyroxenites enhanced strain localization and formation of mylonite shear zones in the entire mantle section. We subdivide the subcontinental mantle section from the External Ligurian ophiolites into three distinct domains, developed in response to the rifting evolution that ultimately formed a Middle Jurassic ocean-continent transition: (1) a spinel tectonite domain, characterized by subsolidus static formation of plagioclase, i.e. the Suvero mantle section (Hidas et al., 2020), (2) a plagioclase mylonite domain experiencing melt-absent deformation and (3) a nearly undeformed domain that underwent reactive melt infiltration under plagioclase-facies conditions, exemplified by the Monte Sant'Agostino and the Monte Gavi mantle sections, respectively. We relate mantle domains (1) and (2) to a rifting-driven uplift in the Late Triassic accommodated by large-scale shear zones consisting of anhydrous plagioclase mylonites. Hidas K., Borghini G., Tommasi A., Zanetti A. & Rampone E. 2021. Interplay between melt infiltration and deformation in the deep lithospheric mantle (External Liguride ophiolite, North Italy). Lithos 380–381, 105855.