38 research outputs found

    Deformable Beamsplitters: Enhancing Perception with Wide Field of View, Varifocal Augmented Reality Displays

    Get PDF
    An augmented reality head-mounted display with full environmental awareness could present data in new ways and provide a new type of experience, allowing seamless transitions between real life and virtual content. However, creating a light-weight, optical see-through display providing both focus support and wide field of view remains a challenge. This dissertation describes a new dynamic optical element, the deformable beamsplitter, and its applications for wide field of view, varifocal, augmented reality displays. Deformable beamsplitters combine a traditional deformable membrane mirror and a beamsplitter into a single element, allowing reflected light to be manipulated by the deforming membrane mirror, while transmitted light remains unchanged. This research enables both single element optical design and correct focus while maintaining a wide field of view, as demonstrated by the description and analysis of two prototype hardware display systems which incorporate deformable beamsplitters. As a user changes the depth of their gaze when looking through these displays, the focus of virtual content can quickly be altered to match the real world by simply modulating air pressure in a chamber behind the deformable beamsplitter; thus ameliorating vergence–accommodation conflict. Two user studies verify the display prototypes’ capabilities and show the potential of the display in enhancing human performance at quickly perceiving visual stimuli. This work shows that near-eye displays built with deformable beamsplitters allow for simple optical designs that enable wide field of view and comfortable viewing experiences with the potential to enhance user perception.Doctor of Philosoph

    A real-time low-cost vision sensor for robotic bin picking

    Get PDF
    This thesis presents an integrated approach of a vision sensor for bin picking. The vision system that has been devised consists of three major components. The first addresses the implementation of a bifocal range sensor which estimates the depth by measuring the relative blurring between two images captured with different focal settings. A key element in the success of this approach is that it overcomes some of the limitations that were associated with other related implementations and the experimental results indicate that the precision offered by the sensor discussed in this thesis is precise enough for a large variety of industrial applications. The second component deals with the implementation of an edge-based segmentation technique which is applied in order to detect the boundaries of the objects that define the scene. An important issue related to this segmentation technique consists of minimising the errors in the edge detected output, an operation that is carried out by analysing the information associated with the singular edge points. The last component addresses the object recognition and pose estimation using the information resulting from the application of the segmentation algorithm. The recognition stage consists of matching the primitives derived from the scene regions, while the pose estimation is addressed using an appearance-based approach augmented with a range data analysis. The developed system is suitable for real-time operation and in order to demonstrate the validity of the proposed approach it has been examined under varying real-world scenes

    Stereoscopic high dynamic range imaging

    Get PDF
    Two modern technologies show promise to dramatically increase immersion in virtual environments. Stereoscopic imaging captures two images representing the views of both eyes and allows for better depth perception. High dynamic range (HDR) imaging accurately represents real world lighting as opposed to traditional low dynamic range (LDR) imaging. HDR provides a better contrast and more natural looking scenes. The combination of the two technologies in order to gain advantages of both has been, until now, mostly unexplored due to the current limitations in the imaging pipeline. This thesis reviews both fields, proposes stereoscopic high dynamic range (SHDR) imaging pipeline outlining the challenges that need to be resolved to enable SHDR and focuses on capture and compression aspects of that pipeline. The problems of capturing SHDR images that would potentially require two HDR cameras and introduce ghosting, are mitigated by capturing an HDR and LDR pair and using it to generate SHDR images. A detailed user study compared four different methods of generating SHDR images. Results demonstrated that one of the methods may produce images perceptually indistinguishable from the ground truth. Insights obtained while developing static image operators guided the design of SHDR video techniques. Three methods for generating SHDR video from an HDR-LDR video pair are proposed and compared to the ground truth SHDR videos. Results showed little overall error and identified a method with the least error. Once captured, SHDR content needs to be efficiently compressed. Five SHDR compression methods that are backward compatible are presented. The proposed methods can encode SHDR content to little more than that of a traditional single LDR image (18% larger for one method) and the backward compatibility property encourages early adoption of the format. The work presented in this thesis has introduced and advanced capture and compression methods for the adoption of SHDR imaging. In general, this research paves the way for a novel field of SHDR imaging which should lead to improved and more realistic representation of captured scenes

    Design and Development of a Research Framework for Prototyping Control Tower Augmented Reality Tools

    Get PDF
    The purpose of the air traffic management system is to ensure the safe and efficient flow of air traffic. Therefore, while augmenting efficiency, throughput and capacity in airport operations, attention has rightly been placed on doing it in a safe manner. In the control tower, many advances in operational safety have come in the form of visualization tools for tower controllers. However, there is a paradox in developing such systems to increase controllers' situational awareness: by creating additional computer displays, the controller's vision is pulled away from the outside view and the time spent looking down at the monitors is increased. This reduces their situational awareness by forcing them to mentally and physically switch between the head-down equipment and the outside view. This research is based on the idea that augmented reality may be able to address this issue. The augmented reality concept has become increasingly popular over the past decade and is being proficiently used in many fields, such as entertainment, cultural heritage, aviation, military & defense. This know-how could be transferred to air traffic control with a relatively low effort and substantial benefits for controllers’ situation awareness. Research on this topic is consistent with SESAR objectives of increasing air traffic controllers’ situation awareness and enable up to 10 % of additional flights at congested airports while still increasing safety and efficiency. During the Ph.D., a research framework for prototyping augmented reality tools was set up. This framework consists of methodological tools for designing the augmented reality overlays, as well as of hardware and software equipment to test them. Several overlays have been designed and implemented in a simulated tower environment, which is a virtual reconstruction of Bologna airport control tower. The positive impact of such tools was preliminary assessed by means of the proposed methodology

    Augmented reality fonts with enhanced out-of-focus text legibility

    Get PDF
    In augmented reality, information is often distributed between real and virtual contexts, and often appears at different distances from the viewer. This raises the issues of (1) context switching, when attention is switched between real and virtual contexts, (2) focal distance switching, when the eye accommodates to see information in sharp focus at a new distance, and (3) transient focal blur, when information is seen out of focus, during the time interval of focal distance switching. This dissertation research has quantified the impact of context switching, focal distance switching, and transient focal blur on human performance and eye fatigue in both monocular and binocular viewing conditions. Further, this research has developed a novel font that when seen out-of-focus looks sharper than standard fonts. This SharpView font promises to mitigate the effect of transient focal blur. Developing this font has required (1) mathematically modeling out-of-focus blur with Zernike polynomials, which model focal deficiencies of human vision, (2) developing a focus correction algorithm based on total variation optimization, which corrects out-of-focus blur, and (3) developing a novel algorithm for measuring font sharpness. Finally, this research has validated these fonts through simulation and optical camera-based measurement. This validation has shown that, when seen out of focus, SharpView fonts are as much as 40 to 50% sharper than standard fonts. This promises to improve font legibility in many applications of augmented reality

    Methods for Light Field Display Profiling and Scalable Super-Multiview Video Coding

    Get PDF
    Light field 3D displays reproduce the light field of real or synthetic scenes, as observed by multiple viewers, without the necessity of wearing 3D glasses. Reproducing light fields is a technically challenging task in terms of optical setup, content creation, distributed rendering, among others; however, the impressive visual quality of hologramlike scenes, in full color, with real-time frame rates, and over a very wide field of view justifies the complexity involved. Seeing objects popping far out from the screen plane without glasses impresses even those viewers who have experienced other 3D displays before.Content for these displays can either be synthetic or real. The creation of synthetic (rendered) content is relatively well understood and used in practice. Depending on the technique used, rendering has its own complexities, quite similar to the complexity of rendering techniques for 2D displays. While rendering can be used in many use-cases, the holy grail of all 3D display technologies is to become the future 3DTVs, ending up in each living room and showing realistic 3D content without glasses. Capturing, transmitting, and rendering live scenes as light fields is extremely challenging, and it is necessary if we are about to experience light field 3D television showing real people and natural scenes, or realistic 3D video conferencing with real eye-contact.In order to provide the required realism, light field displays aim to provide a wide field of view (up to 180°), while reproducing up to ~80 MPixels nowadays. Building gigapixel light field displays is realistic in the next few years. Likewise, capturing live light fields involves using many synchronized cameras that cover the same display wide field of view and provide the same high pixel count. Therefore, light field capture and content creation has to be well optimized with respect to the targeted display technologies. Two major challenges in this process are addressed in this dissertation.The first challenge is how to characterize the display in terms of its capabilities to create light fields, that is how to profile the display in question. In clearer terms this boils down to finding the equivalent spatial resolution, which is similar to the screen resolution of 2D displays, and angular resolution, which describes the smallest angle, the color of which the display can control individually. Light field is formalized as 4D approximation of the plenoptic function in terms of geometrical optics through spatiallylocalized and angularly-directed light rays in the so-called ray space. Plenoptic Sampling Theory provides the required conditions to sample and reconstruct light fields. Subsequently, light field displays can be characterized in the Fourier domain by the effective display bandwidth they support. In the thesis, a methodology for displayspecific light field analysis is proposed. It regards the display as a signal processing channel and analyses it as such in spectral domain. As a result, one is able to derive the display throughput (i.e. the display bandwidth) and, subsequently, the optimal camera configuration to efficiently capture and filter light fields before displaying them.While the geometrical topology of optical light sources in projection-based light field displays can be used to theoretically derive display bandwidth, and its spatial and angular resolution, in many cases this topology is not available to the user. Furthermore, there are many implementation details which cause the display to deviate from its theoretical model. In such cases, profiling light field displays in terms of spatial and angular resolution has to be done by measurements. Measurement methods that involve the display showing specific test patterns, which are then captured by a single static or moving camera, are proposed in the thesis. Determining the effective spatial and angular resolution of a light field display is then based on an automated analysis of the captured images, as they are reproduced by the display, in the frequency domain. The analysis reveals the empirical limits of the display in terms of pass-band both in the spatial and angular dimension. Furthermore, the spatial resolution measurements are validated by subjective tests confirming that the results are in line with the smallest features human observers can perceive on the same display. The resolution values obtained can be used to design the optimal capture setup for the display in question.The second challenge is related with the massive number of views and pixels captured that have to be transmitted to the display. It clearly requires effective and efficient compression techniques to fit in the bandwidth available, as an uncompressed representation of such a super-multiview video could easily consume ~20 gigabits per second with today’s displays. Due to the high number of light rays to be captured, transmitted and rendered, distributed systems are necessary for both capturing and rendering the light field. During the first attempts to implement real-time light field capturing, transmission and rendering using a brute force approach, limitations became apparent. Still, due to the best possible image quality achievable with dense multi-camera light field capturing and light ray interpolation, this approach was chosen as the basis of further work, despite the massive amount of bandwidth needed. Decompression of all camera images in all rendering nodes, however, is prohibitively time consuming and is not scalable. After analyzing the light field interpolation process and the data-access patterns typical in a distributed light field rendering system, an approach to reduce the amount of data required in the rendering nodes has been proposed. This approach, on the other hand, requires rectangular parts (typically vertical bars in case of a Horizontal Parallax Only light field display) of the captured images to be available in the rendering nodes, which might be exploited to reduce the time spent with decompression of video streams. However, partial decoding is not readily supported by common image / video codecs. In the thesis, approaches aimed at achieving partial decoding are proposed for H.264, HEVC, JPEG and JPEG2000 and the results are compared.The results of the thesis on display profiling facilitate the design of optimal camera setups for capturing scenes to be reproduced on 3D light field displays. The developed super-multiview content encoding also facilitates light field rendering in real-time. This makes live light field transmission and real-time teleconferencing possible in a scalable way, using any number of cameras, and at the spatial and angular resolution the display actually needs for achieving a compelling visual experience
    corecore