
    Applying image processing techniques to pose estimation and view synthesis.

    Fung Yiu-fai Phineas. Thesis (M.Phil.), Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 142-148). Abstracts in English and Chinese.
    Contents:
    Chapter 1 Introduction: 1.1 Model-based Pose Estimation; 1.1.1 Application - 3D Motion Tracking; 1.2 Image-based View Synthesis; 1.3 Thesis Contribution; 1.4 Thesis Outline
    Chapter 2 General Background: 2.1 Notations; 2.2 Camera Models; 2.2.1 Generic Camera Model; 2.2.2 Full-perspective Camera Model; 2.2.3 Affine Camera Model; 2.2.4 Weak-perspective Camera Model; 2.2.5 Paraperspective Camera Model; 2.3 Model-based Motion Analysis; 2.3.1 Point Correspondences; 2.3.2 Line Correspondences; 2.3.3 Angle Correspondences; 2.4 Panoramic Representation; 2.4.1 Static Mosaic; 2.4.2 Dynamic Mosaic; 2.4.3 Temporal Pyramid; 2.4.4 Spatial Pyramid; 2.5 Image Pre-processing; 2.5.1 Feature Extraction; 2.5.2 Spatial Filtering; 2.5.3 Local Enhancement; 2.5.4 Dynamic Range Stretching or Compression; 2.5.5 YIQ Color Model
    Chapter 3 Model-based Pose Estimation: 3.1 Previous Work; 3.1.1 Estimation from Established Correspondences; 3.1.2 Direct Estimation from Image Intensities; 3.1.3 Perspective-3-Point Problem; 3.2 Our Iterative P3P Algorithm; 3.2.1 Gauss-Newton Method; 3.2.2 Dealing with Ambiguity; 3.2.3 3D-to-3D Motion Estimation; 3.3 Experimental Results; 3.3.1 Synthetic Data; 3.3.2 Real Images; 3.4 Discussions
    Chapter 4 Panoramic View Analysis: 4.1 Advanced Mosaic Representation; 4.1.1 Frame Alignment Policy; 4.1.2 Multi-resolution Representation; 4.1.3 Parallax-based Representation; 4.1.4 Multiple Moving Objects; 4.1.5 Layers and Tiles; 4.2 Panorama Construction; 4.2.1 Image Acquisition; 4.2.2 Image Alignment; 4.2.3 Image Integration; 4.2.4 Significant Residual Estimation; 4.3 Advanced Alignment Algorithms; 4.3.1 Patch-based Alignment; 4.3.2 Global Alignment (Block Adjustment); 4.3.3 Local Alignment (Deghosting); 4.4 Mosaic Application; 4.4.1 Visualization Tool; 4.4.2 Video Manipulation; 4.5 Experimental Results
    Chapter 5 Panoramic Walkthrough: 5.1 Problem Statement and Notations; 5.2 Previous Work; 5.2.1 3D Modeling and Rendering; 5.2.2 Branching Movies; 5.2.3 Texture Window Scaling; 5.2.4 Problems with Simple Texture Window Scaling; 5.3 Our Walkthrough Approach; 5.3.1 Cylindrical Projection onto Image Plane; 5.3.2 Generating Intermediate Frames; 5.3.3 Occlusion Handling; 5.4 Experimental Results; 5.5 Discussions
    Chapter 6 Conclusion
    Appendices: A Formulation of Fischler and Bolles' Method for P3P Problems; B Derivation of z1 and z3 in terms of z2; C Derivation of e1 and e2; D Derivation of the Update Rule for the Gauss-Newton Method; E Proof of (λ1λ2 − λ4) > 0; F Derivation of φ and hi; G Derivation of w1j to w4j; H More Experimental Results on Panoramic Stitching Algorithms
    Bibliography

    Spherical Image Processing for Immersive Visualisation and View Generation

    This research studies the processing of panoramic spherical images for immersive visualisation of real environments and for the generation of in-between views from two acquired views. For visualisation based on one spherical image, the surrounding environment is modelled as a unit sphere mapped with the spherical image, and the user is then allowed to navigate within the modelled scene. For visualisation based on two spherical images, a view generation algorithm is developed for modelling an indoor man-made environment, so that new views can be generated at an arbitrary position with respect to the existing two. This allows the scene to be modelled with multiple spherical images and the user to move smoothly from one sphere-mapped image to another by passing through generated in-between sphere-mapped images.
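
    As a concrete illustration of the single-image visualisation step, the sketch below maps a viewing direction onto a spherical panorama so that the sphere-mapped image can be sampled per view ray. The equirectangular storage format, the function names and the nearest-neighbour lookup are assumptions made for illustration, not the thesis's actual implementation.

```python
import numpy as np

def direction_to_equirect_uv(d):
    """Map a unit viewing direction to (u, v) texture coordinates of an
    equirectangular panorama: u follows longitude, v follows latitude."""
    x, y, z = d / np.linalg.norm(d)
    lon = np.arctan2(x, z)                   # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(y, -1.0, 1.0))   # latitude in [-pi/2, pi/2]
    u = lon / (2.0 * np.pi) + 0.5
    v = 0.5 - lat / np.pi
    return u, v

def sample_panorama(pano, d):
    """Nearest-neighbour colour lookup of the panorama for one view ray."""
    h, w = pano.shape[:2]
    u, v = direction_to_equirect_uv(np.asarray(d, dtype=float))
    col = min(int(u * w), w - 1)
    row = min(int(v * h), h - 1)
    return pano[row, col]
```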

    Appearance Preserving Rendering of Out-of-Core Polygon and NURBS Models

    In Computer Aided Design (CAD), trimmed NURBS surfaces are widely used due to their flexibility. For rendering and simulation, however, piecewise linear representations of these objects are required. A relatively new field in CAD is the analysis of long-term strain tests. After such a test the object is scanned with a 3D laser scanner for further processing on a PC. In all these areas of CAD the number of primitives, as well as their complexity, has grown constantly in recent years. This growth far exceeds the increase in processor speed and memory size, creating the need for fast out-of-core algorithms. This thesis describes a processing pipeline from the input data, in the form of triangular or trimmed NURBS models, to the interactive rendering of these models at high visual quality. After discussing the motivation for this work and introducing basic concepts of complex polygon and NURBS models, the second part of this thesis starts with a review of existing simplification and tessellation algorithms. Additionally, an improved stitching algorithm is presented that generates a consistent model after tessellation of a trimmed NURBS model. Since surfaces need to be modified interactively during the design phase, a novel trimmed NURBS rendering algorithm is presented. This algorithm removes the bottleneck of generating and transmitting a new tessellation to the graphics card after each modification of a surface by evaluating and trimming the surface on the GPU. To achieve high visual quality, the appearance of a surface can be preserved using texture mapping. Therefore, a texture mapping algorithm for trimmed NURBS surfaces is presented. To reduce the memory requirements for the textures, the algorithm is modified to generate compressed normal maps that preserve the shading of the original surface. Since texturing is only possible when a parametric mapping of the surface, requiring additional memory, is available, a new simplification and tessellation error measure is introduced that preserves the appearance of the original surface by controlling the deviation of normal vectors. The preservation of normals, and possibly other surface attributes, allows interactive visualization for quality control applications (e.g. isophotes and reflection lines). In the last part, out-of-core techniques for processing and rendering of gigabyte-sized polygonal and trimmed NURBS models are presented. Then the modifications necessary to support streaming of simplified geometry from a central server are discussed, and finally an LOD selection algorithm to support interactive rendering of hard and soft shadows is described.
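
    The appearance-preserving error measure is only described at a high level in the abstract; the sketch below shows one plausible form of such a test, bounding the angular deviation of sampled surface normals before accepting a simplification or tessellation step. The sampling strategy, the threshold value and the function names are assumptions for illustration, not the thesis's actual measure.

```python
import numpy as np

def normal_deviation_deg(n_orig, n_new):
    """Angular deviation (degrees) between two corresponding surface normals."""
    n0 = n_orig / np.linalg.norm(n_orig)
    n1 = n_new / np.linalg.norm(n_new)
    cos_a = np.clip(np.dot(n0, n1), -1.0, 1.0)
    return np.degrees(np.arccos(cos_a))

def accept_simplification(normals_orig, normals_new, max_dev_deg=5.0):
    """Accept a simplification/tessellation step only if no sampled normal
    deviates from the original surface by more than the given bound."""
    devs = [normal_deviation_deg(a, b)
            for a, b in zip(normals_orig, normals_new)]
    return max(devs) <= max_dev_deg
```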

    Virtual light fields for global illumination in computer graphics

    This thesis presents novel techniques for the generation and real-time rendering of globally illuminated environments with surfaces described by arbitrary materials. Real-time rendering of globally illuminated virtual environments has long been an elusive goal. Many techniques have been developed that can compute still images with full global illumination, and this is still an area of active research. Other techniques have dealt with only certain aspects of global illumination in order to speed up computation and thus rendering. These include radiosity, ray-tracing and hybrid methods. Radiosity, due to its view-independent nature, can easily be rendered in real time after pre-computing and storing the energy equilibrium. Ray-tracing, however, is view-dependent and requires substantial computational resources in order to run in real time. Attempts at providing full global illumination at interactive rates include caching methods, fast rendering from photon maps, light fields, brute-force ray-tracing and GPU-accelerated methods. Currently, these methods either apply only to special cases, are incomplete with poor image quality, and/or scale badly such that only modest scenes can be rendered in real time with current hardware. The techniques developed in this thesis extend earlier research and provide a novel, comprehensive framework for storing global illumination in a data structure, the Virtual Light Field (VLF), that is suitable for real-time rendering. The techniques trade memory usage and precompute time for rapid rendering. The main weaknesses of the VLF method are targeted in this thesis. It is the expensive pre-compute stage, with best-case O(N^2) performance where N is the number of faces, which makes light propagation impractical for all but simple scenes. This is analysed, and greatly superior alternatives are presented and evaluated in terms of efficiency and error. Several orders of magnitude improvement in computational efficiency are achieved over the original VLF method. A novel propagation algorithm running entirely on the Graphics Processing Unit (GPU) is presented. It is incremental in that it can resolve visibility along a set of parallel rays in O(N) time, and it can produce a virtual light field for a moderately complex scene (tens of thousands of faces), with complex illumination stored in millions of elements, in minutes, and for simple scenes in seconds. It is approximate but gracefully converges to a correct solution; a linear increase in resolution results in a linear increase in computation time. Finally, a GPU rendering technique is presented which can render from Virtual Light Fields at real-time frame rates in high-resolution VR presentation devices such as the CAVE™.
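
    The abstract describes the Virtual Light Field only as a data structure holding millions of directional illumination elements. The toy sketch below shows one way a parallel-ray radiance store could be organised: a 2D grid of radiance cells per global ray direction, written to during propagation and read back at render time. The class name, grid layout and basis construction are illustrative assumptions, not the structure actually used in the thesis.

```python
import numpy as np

class ParallelRayRadianceField:
    """Toy radiance store: for each of a small set of global ray directions,
    keep a 2D grid of radiance samples, indexed by where a point projects
    onto a plane perpendicular to that direction. Purely illustrative."""

    def __init__(self, directions, grid_res, extent):
        self.directions = [np.asarray(d, dtype=float) / np.linalg.norm(d)
                           for d in directions]
        self.grid_res = grid_res
        self.extent = float(extent)  # half-width of the scene bounding cube
        # RGB radiance accumulated per (direction, cell)
        self.radiance = np.zeros((len(directions), grid_res, grid_res, 3))

    def _cell(self, dir_idx, point):
        """Project a 3D point into the 2D cell grid of one ray direction."""
        d = self.directions[dir_idx]
        up = np.array([0.0, 1.0, 0.0]) if abs(d[1]) < 0.9 else np.array([1.0, 0.0, 0.0])
        u = np.cross(up, d)
        u /= np.linalg.norm(u)
        v = np.cross(d, u)
        s = (np.dot(point, u) / (2.0 * self.extent) + 0.5) * self.grid_res
        t = (np.dot(point, v) / (2.0 * self.extent) + 0.5) * self.grid_res
        return (int(np.clip(s, 0, self.grid_res - 1)),
                int(np.clip(t, 0, self.grid_res - 1)))

    def deposit(self, dir_idx, point, rgb):
        """Accumulate radiance carried along direction dir_idx through point."""
        i, j = self._cell(dir_idx, np.asarray(point, dtype=float))
        self.radiance[dir_idx, j, i] += np.asarray(rgb, dtype=float)

    def lookup(self, dir_idx, point):
        """Read back the radiance stored for the cell containing point."""
        i, j = self._cell(dir_idx, np.asarray(point, dtype=float))
        return self.radiance[dir_idx, j, i]
```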

    Development of Immersive and Interactive Virtual Reality Environment for Two-Player Table Tennis

    Although the history of Virtual Reality (VR) is only about half a century old, technologies across the VR field are developing rapidly. VR is a computer-generated simulation that replaces or augments the real world through various media. In a VR environment, participants have a perception of “presence”, which can be described by the sense of immersion and intuitive interaction. One of the major VR applications is in the field of sports, in which a life-like sports environment is simulated and the body actions of players can be tracked and represented using VR tracking and visualisation technology. In the entertainment field, exergaming, which merges video games with physical exercise by employing tracking or even 3D display technology, can be considered a small-scale VR. For the research presented in this thesis, a novel, realistic, real-time table tennis game combining immersive, interactive and competitive features is developed. The implemented system integrates the InterSense tracking system, a SwissRanger 3D camera and a three-wall rear-projection stereoscopic screen. The InterSense tracking system is based on ultrasonic and inertial sensing techniques, which provide fast and accurate six-degree-of-freedom (6-DOF) tracking information for four trackers. Two trackers are placed on the two players’ heads to provide the players’ viewing positions. The other two trackers are held by the players as the racquets. The SwissRanger 3D camera is mounted on top of the screen to capture the player’

    Low Latency Rendering with Dataflow Architectures

    The research presented in this thesis concerns latency in VR and synthetic environments. Latency is the end-to-end delay experienced by the user of an interactive computer system, between their physical actions and the perceived response to these actions. Latency is a product of the various processing, transport and buffering delays present in any current computer system. For many computer-mediated applications latency can be distracting, but it is not critical to the utility of the application. Synthetic environments, on the other hand, attempt to facilitate direct interaction with a digitised world. Direct interaction here implies the formation of a sensorimotor loop between the user and the digitised world; that is, the user makes predictions about how their actions affect the world and sees these predictions realised. By facilitating the formation of this loop, the synthetic environment allows users to directly sense the digitised world, rather than the interface, and induces perceptions such as that of the digital world existing as a distinct physical place. This has many applications for knowledge transfer and efficient interaction through the use of enhanced communication cues. The complication is that the formation of the sensorimotor loop that underpins this is highly dependent on the fidelity of the virtual stimuli, including latency. The main research questions we ask are how the characteristics of dataflow computing can be leveraged to improve the temporal fidelity of the visual stimuli, and what implications this has for other aspects of fidelity. Secondarily, we ask what effects latency itself has on user interaction. We test the effects of latency on physical interaction at levels previously hypothesized but unexplored. We also test for a previously unconsidered effect of latency on higher-level cognitive functions. To do this, we create prototype image generators for interactive systems and virtual reality, using dataflow computing platforms. We integrate these into real interactive systems to gain practical experience of the real, perceptible benefits of alternative rendering approaches, but also of the implications when they are subject to the constraints of real systems. We quantify the differences between our systems and traditional systems using latency and objective image fidelity measures. We use our novel systems to perform user studies into the effects of latency. Our high-performance apparatuses allow experimentation at latencies lower than previously tested in comparable studies. The low-latency apparatuses are designed to minimise what is currently the largest delay in traditional rendering pipelines, and we find that the approach is successful in this respect. Our 3D low-latency apparatus achieves lower latencies and higher fidelities than traditional systems. The conditions under which it can do this are, however, highly constrained. We do not foresee dataflow computing shouldering the bulk of the rendering workload in the future, but rather facilitating the augmentation of the traditional pipeline with a very high-speed local loop. This may be an image distortion stage or otherwise. Our latency experiments revealed that many predictions about the effects of low latency should be re-evaluated and that experimenting in this range requires great care.

    Point based graphics rendering with unified scalability solutions.

    Standard real-time 3D graphics rendering algorithms use brute-force polygon rendering, with complexity linear in the number of polygons and little regard for limiting processing to data that contributes to the image. Modern hardware can now render smaller scenes to pixel levels of detail, relaxing surface connectivity requirements. Sub-linear scalability optimizations are typically self-contained, requiring specific data structures, without shared functions and data. A new point-based rendering algorithm, 'Canopy', is investigated that combines multiple, typically sub-linear, scalability solutions using a small core of data structures. Specifically, locale management, hierarchical view-volume culling, backface culling, occlusion culling, level of detail and depth ordering are addressed. To demonstrate versatility further, shadows and collision detection are examined. Polygon models are voxelized with interpolated attributes to provide points. A scene tree is constructed, based on a BSP tree of points, with compressed attributes. The scene tree is embedded in a compressed, partitioned, procedurally based scene graph architecture that mimics conventional systems with groups, instancing, inlines and basic read-on-demand rendering from backing store. Hierarchical refinement of the scene tree constructs its image-space equivalent, an image tree, by projecting object-space scene-node points to form image nodes. An image graph of image nodes is maintained, describing image- and object-space occlusion relationships; it is hierarchically refined in front-to-back order to a specified threshold while occlusion culling with occluder fusion. Visible nodes at medium levels of detail are refined further to rasterization scales. Occlusion culling defines a set of visible nodes that can support caching for temporal coherence. Occlusion culling is approximate, so it may not suit critical applications. Quality and performance are tested against standard rendering. Although the algorithm has an O(f) upper bound in the scene size f, it is shown to scale sub-linearly in practice. Scenes that would conventionally require several hundred billion polygons are rendered at interactive frame rates with minimal graphics hardware support.
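
    Canopy combines several culling and level-of-detail stages around one point hierarchy. As a rough illustration of the hierarchical view-volume culling and level-of-detail cut-off alone, the sketch below walks a tree of bounding spheres and emits representative point samples. The node layout, plane convention and pixel-size test are illustrative assumptions; the thesis's scene tree, occlusion culling and depth ordering are not reproduced here.

```python
import numpy as np

class PointNode:
    """Node of a point hierarchy: bounding sphere, a representative point
    sample (e.g. position and colour), and child nodes (empty at leaves)."""
    def __init__(self, center, radius, sample, children=None):
        self.center = np.asarray(center, dtype=float)
        self.radius = float(radius)
        self.sample = sample
        self.children = children or []

def collect_visible(node, frustum_planes, eye, pixel_angle, out):
    """Hierarchical view-volume culling with a simple level-of-detail cut-off:
    stop refining once a node's angular size drops below roughly one pixel."""
    # frustum planes given as (normal, d), with the inside satisfying dot(n, x) + d >= 0
    for n, d in frustum_planes:
        if np.dot(n, node.center) + d < -node.radius:
            return  # bounding sphere entirely outside this plane
    dist = np.linalg.norm(node.center - eye)
    if not node.children or node.radius / max(dist, 1e-6) < pixel_angle:
        out.append(node.sample)  # emit the representative point sample
        return
    for child in node.children:
        collect_visible(child, frustum_planes, eye, pixel_angle, out)
```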

    Example Based Caricature Synthesis

    The likeness of a caricature to the original face image is an essential and often overlooked part of caricature production. In this paper we present an example-based caricature synthesis technique consisting of shape exaggeration, relationship exaggeration, and optimization for likeness. Rather than relying on a large training set of caricature face pairs, our shape exaggeration step is based on only one or a small number of examples of facial features. The relationship exaggeration step introduces two definitions which facilitate global facial feature synthesis. The first is the T-Shape rule, which describes the relative relationship between the facial elements in an intuitive manner. The second is the so-called proportions, which characterize the facial features in proportional form. Finally, we introduce a likeness metric based on the Modified Hausdorff Distance (MHD), which allows us to optimize the configuration of facial elements, maximizing likeness while satisfying a number of constraints. The effectiveness of our algorithm is demonstrated with experimental results.
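
    The likeness metric builds on the Modified Hausdorff Distance. For reference, a minimal NumPy version of the MHD between two 2D point sets, in the symmetric mean-of-minima form of Dubuisson and Jain, is sketched below. How facial features are sampled into point sets, and any weighting used in the paper's optimization, are not shown and would be assumptions.

```python
import numpy as np

def modified_hausdorff_distance(A, B):
    """Modified Hausdorff Distance between point sets A and B,
    each given as an array of shape (n_points, 2)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # pairwise Euclidean distances, shape (|A|, |B|)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    d_ab = d.min(axis=1).mean()   # mean nearest-neighbour distance A -> B
    d_ba = d.min(axis=0).mean()   # mean nearest-neighbour distance B -> A
    return max(d_ab, d_ba)
```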

    User-oriented markerless augmented reality framework based on 3D reconstruction and loop closure detection

    An augmented reality (AR) system needs to track the user view to perform accurate augmentation registration. This research proposes a conceptual markerless, natural-feature-based AR framework whose process is divided into two stages: an offline database training session for application developers, and an online AR tracking and display session for end users. In the offline session, two types of 3D reconstruction, RGB-D SLAM and structure-from-motion (SfM), are integrated into the development framework for building the reference template of a target environment. The performance and applicable conditions of these two methods are presented in this thesis, and application developers can choose which method to apply according to their development needs. A general development user interface is provided to the developer for interaction, including a simple GUI tool for augmentation configuration. The proposal also applies a Bag of Words strategy to enable rapid loop-closure detection in the online session, efficiently querying the application user-view against the trained database to locate the user pose. The rendering and display of the augmentation is currently implemented within an OpenGL window, an aspect of the research that merits future detailed investigation and development.
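
    The online stage relies on a Bag of Words query to relocalise the user view against the trained database. A minimal sketch of such a retrieval step is given below, using a flat visual vocabulary and cosine similarity between word histograms. Real systems typically use a hierarchical vocabulary with TF-IDF weighting (e.g. DBoW2); the function names and the flat vocabulary here are assumptions for illustration, not the framework's actual implementation.

```python
import numpy as np

def bow_histogram(descriptors, vocabulary):
    """Quantise local feature descriptors (n x d) against a visual vocabulary
    of k cluster centres (k x d) and return a normalised word histogram."""
    d = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=-1)
    words = d.argmin(axis=1)                       # nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / (hist.sum() + 1e-12)

def best_match(query_hist, database_hists):
    """Index of the most similar reference view by cosine similarity."""
    sims = [np.dot(query_hist, h) /
            (np.linalg.norm(query_hist) * np.linalg.norm(h) + 1e-12)
            for h in database_hists]
    return int(np.argmax(sims))
```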