540 research outputs found

    A view-based deformation tool-kit, Master's Thesis, August 2006

    Camera manipulation is a hard problem since a graphics camera is defined by specifying 11 independent parameters. Manipulating such a high-dimensional space to accomplish specific tasks is difficult and requires a certain amount of expertise. We present an intuitive interface that allows novice users to perform camera operations in terms of the change they want to see in the image. In addition to developing a natural means for camera interaction, our system also includes a novel interface for viewing and organizing previously saved views. When exploring complex 3D data sets, a single view is not sufficient. Instead, a composite view built from multiple views may be more useful. While changing a single camera is hard enough, manipulating several cameras in a single scene is harder still. In this thesis, we also present a framework for creating composite views and an interface that allows users to manipulate such views in real time.
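    One common way to arrive at 11 parameters is the 3x4 projective camera matrix P = K [R | t], which has 12 entries but is defined only up to scale. The sketch below assumes this pinhole parameterization (5 intrinsics, 3 rotation angles, 3 translation components); the thesis may parameterize its camera differently, and the function and argument names are illustrative only.

```python
import numpy as np

def rotation_from_euler(rx, ry, rz):
    """Rotation matrix from three Euler angles in radians, composed as Rz @ Ry @ Rx."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    return Rz @ Ry @ Rx

def camera_matrix(fx, fy, skew, cx, cy, rx, ry, rz, tx, ty, tz):
    """Build P = K [R | t] from 11 scalars: 5 intrinsic, 3 rotation, 3 translation."""
    K = np.array([[fx, skew, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])
    R = rotation_from_euler(rx, ry, rz)
    t = np.array([[tx], [ty], [tz]])
    return K @ np.hstack([R, t])

def project(P, point_3d):
    """Project a 3D point to pixel coordinates with the 3x4 matrix P."""
    x = P @ np.append(point_3d, 1.0)
    return x[:2] / x[2]

# Hypothetical camera: 800 px focal length, principal point (320, 240),
# no rotation, scene origin 5 units in front of the camera.
P = camera_matrix(800.0, 800.0, 0.0, 320.0, 240.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0)
pixel = project(P, np.array([0.0, 0.0, 0.0]))
```

    Changing any one of the 11 arguments alters the image in a way that is hard to predict by hand, which is the difficulty the abstract describes.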

    A Tool for the 3D Spatio-Temporal Structuring of Historic Building Reconstructions

    The difficulty in describing, analysing and understanding cultural heritage often stems from the fact that buildings undergo numerous changes over time. Three factors condition the knowledge of historical heritage. Firstly, 3D reconstructions of heritage buildings normally focus on existing states and not on the management of historical evolutions. Secondly, while iconographic sources are generally used as a visual memory of a temporal state of a building to be restored graphically, few works today focus on the use of all the metric and visual information contained in those sources. Lastly, iconographic documentation concerning past states of a building is sometimes contradictory, dubious and incomplete. As a consequence, uncertainties, contradictions and gaps in information should be highlighted in 3D reconstructions. We present a methodological approach based on the existing iconographic corpus for the analysis and the 3D management of building transformations. This approach joins three main aspects in a complete workflow. Firstly, it concerns the spatial and temporal referencing of 2D iconographic sources for the 3D reconstruction of disappeared building states. Secondly, it allows the analysis of building transformations by means of a temporal state distribution. Lastly, it uses spatial relations established between 2D iconography and 3D representation for the visual browsing of information based on spatiotemporal criteria. In particular, in this paper we detail the interface developed to accomplish multiple related tasks concerning the spatio-temporal structuring of the morphology to be reconstructed.

    Development, Implementation and Pre-clinical Evaluation of Medical Image Computing Tools in Support of Computer-aided Diagnosis: Respiratory, Orthopedic and Cardiac Applications

    Over the last decade, image processing tools have become crucial components of all clinical and research efforts involving medical imaging and associated applications. The imaging data available to radiologists continue to increase their workload, raising the need for efficient identification and visualization of the image data necessary for clinical assessment. Computer-aided diagnosis (CAD) in medical imaging has evolved in response to the need for techniques that can assist radiologists to increase throughput while reducing human error and bias without compromising the outcome of the screening, diagnosis or disease assessment. More intelligent, but simple, consistent and less time-consuming methods will become more widespread, reducing user variability, while also revealing information in a clearer, more visual way. Several routine image processing approaches, including localization, segmentation, registration, and fusion, are critical for enhancing and enabling the development of CAD techniques. However, changes in clinical workflow require significant adjustments and re-training and, despite the efforts of the academic research community to develop state-of-the-art algorithms and high-performance techniques, their footprint often hampers their clinical use. Currently, the main challenge seems not to be the lack of tools and techniques for medical image processing, analysis, and computing, but rather the lack of clinically feasible solutions that leverage the already developed and existing tools and techniques, as well as a demonstration of the potential clinical impact of such tools. Recently, more and more efforts have been dedicated to devising new algorithms for localization, segmentation or registration, while their potential and much intended clinical use and their actual utility are dwarfed by the scientific, algorithmic and developmental novelty that only results in incremental improvements over existing algorithms.

    In this thesis, we propose and demonstrate the implementation and evaluation of several different methodological guidelines that ensure the development of image processing tools (localization, segmentation and registration) and illustrate their use across several medical imaging modalities (X-ray, computed tomography, ultrasound and magnetic resonance imaging) and several clinical applications:
    Lung CT image registration in support of the assessment of pulmonary nodule growth rate and disease progression from thoracic CT images.
    Automated reconstruction of standing X-ray panoramas from multi-sector X-ray images for assessment of long limb mechanical axis and knee misalignment.
    Left and right ventricle localization, segmentation, reconstruction and ejection fraction measurement from cine cardiac MRI or multi-plane trans-esophageal ultrasound images for cardiac function assessment.

    When devising and evaluating our developed tools, we use clinical patient data to illustrate the inherent clinical challenges associated with highly variable imaging data that need to be addressed before potential pre-clinical validation and implementation. In an effort to provide plausible solutions to the selected applications, the proposed methodological guidelines ensure the development of image processing tools that help achieve sufficiently reliable solutions that not only have the potential to address the clinical needs, but are sufficiently streamlined to be potentially translated into eventual clinical tools, given proper implementation.

    G1: Reducing the number of degrees of freedom (DOF) of the designed tool, with a plausible example being avoiding the use of inefficient non-rigid image registration methods. This guideline addresses the risk of artificial deformation during registration and aims at reducing complexity and the number of degrees of freedom.
    G2: The use of shape-based features to most efficiently represent the image content, either by using edges instead of or in addition to intensities and motion, where useful. Edges capture the most useful information in the image and can be used to identify the most important image features. As a result, this guideline ensures a more robust performance when key image information is missing.
    G3: Efficient method of implementation. This guideline focuses on efficiency in terms of the minimum number of steps required and avoiding the recalculation of terms that only need to be calculated once in an iterative process. An efficient implementation leads to reduced computational effort and improved performance.
    G4: Commence the workflow by establishing an optimized initialization and gradually converge toward the final acceptable result. This guideline aims to ensure reasonable outcomes in consistent ways and it avoids convergence to local minima, while gradually ensuring convergence to the global minimum solution.

    These guidelines lead to the development of interactive, semi-automated or fully-automated approaches that still enable the clinicians to perform final refinements, while they reduce the overall inter- and intra-observer variability, reduce ambiguity, increase accuracy and precision, and have the potential to yield mechanisms that will aid with providing an overall more consistent diagnosis in a timely fashion.
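    The abstract does not spell out how ejection fraction is measured once the ventricle has been segmented. A minimal sketch using the standard clinical definition, EF = (EDV - ESV) / EDV, is shown here; the function name and the example volumes are illustrative assumptions, not values from the thesis.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction in percent from end-diastolic and end-systolic volumes (mL)."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and the end-diastolic volume")
    # Fraction of the end-diastolic volume that is ejected during systole.
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Hypothetical volumes, e.g. obtained from segmented cine MRI frames.
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
```

    Because the result depends only on two segmented volumes, errors in localization or segmentation propagate directly into the reported value.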

    Applying image processing techniques to pose estimation and view synthesis.

    Fung Yiu-fai Phineas. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 142-148). Abstracts in English and Chinese.

    Contents:
    Chapter 1: Introduction
        1.1 Model-based Pose Estimation
            1.1.1 Application - 3D Motion Tracking
        1.2 Image-based View Synthesis
        1.3 Thesis Contribution
        1.4 Thesis Outline
    Chapter 2: General Background
        2.1 Notations
        2.2 Camera Models
            2.2.1 Generic Camera Model
            2.2.2 Full-perspective Camera Model
            2.2.3 Affine Camera Model
            2.2.4 Weak-perspective Camera Model
            2.2.5 Paraperspective Camera Model
        2.3 Model-based Motion Analysis
            2.3.1 Point Correspondences
            2.3.2 Line Correspondences
            2.3.3 Angle Correspondences
        2.4 Panoramic Representation
            2.4.1 Static Mosaic
            2.4.2 Dynamic Mosaic
            2.4.3 Temporal Pyramid
            2.4.4 Spatial Pyramid
        2.5 Image Pre-processing
            2.5.1 Feature Extraction
            2.5.2 Spatial Filtering
            2.5.3 Local Enhancement
            2.5.4 Dynamic Range Stretching or Compression
            2.5.5 YIQ Color Model
    Chapter 3: Model-based Pose Estimation
        3.1 Previous Work
            3.1.1 Estimation from Established Correspondences
            3.1.2 Direct Estimation from Image Intensities
            3.1.3 Perspective-3-Point Problem
        3.2 Our Iterative P3P Algorithm
            3.2.1 Gauss-Newton Method
            3.2.2 Dealing with Ambiguity
            3.2.3 3D-to-3D Motion Estimation
        3.3 Experimental Results
            3.3.1 Synthetic Data
            3.3.2 Real Images
        3.4 Discussions
    Chapter 4: Panoramic View Analysis
        4.1 Advanced Mosaic Representation
            4.1.1 Frame Alignment Policy
            4.1.2 Multi-resolution Representation
            4.1.3 Parallax-based Representation
            4.1.4 Multiple Moving Objects
            4.1.5 Layers and Tiles
        4.2 Panorama Construction
            4.2.1 Image Acquisition
            4.2.2 Image Alignment
            4.2.3 Image Integration
            4.2.4 Significant Residual Estimation
        4.3 Advanced Alignment Algorithms
            4.3.1 Patch-based Alignment
            4.3.2 Global Alignment (Block Adjustment)
            4.3.3 Local Alignment (Deghosting)
        4.4 Mosaic Application
            4.4.1 Visualization Tool
            4.4.2 Video Manipulation
        4.5 Experimental Results
    Chapter 5: Panoramic Walkthrough
        5.1 Problem Statement and Notations
        5.2 Previous Work
            5.2.1 3D Modeling and Rendering
            5.2.2 Branching Movies
            5.2.3 Texture Window Scaling
            5.2.4 Problems with Simple Texture Window Scaling
        5.3 Our Walkthrough Approach
            5.3.1 Cylindrical Projection onto Image Plane
            5.3.2 Generating Intermediate Frames
            5.3.3 Occlusion Handling
        5.4 Experimental Results
        5.5 Discussions
    Chapter 6: Conclusion
    Chapter A: Formulation of Fischler and Bolles' Method for P3P Problems
    Chapter B: Derivation of z1 and z3 in terms of z2
    Chapter C: Derivation of e1 and e2
    Chapter D: Derivation of the Update Rule for Gauss-Newton Method
    Chapter E: Proof of (λ1λ2-λ4) > 0
    Chapter F: Derivation of φ and hi
    Chapter G: Derivation of w1j to w4j
    Chapter H: More Experimental Results on Panoramic Stitching Algorithms
    Bibliography

    Learning Equivariant Representations

    State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being the shift-equivariance. By sliding a filter over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present
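    The shift-equivariance property described above can be checked numerically. The following is a minimal sketch, assuming a 1-D signal with circular (wrap-around) boundaries so that the property holds exactly; with zero padding it holds only away from the borders. The names are illustrative and not taken from the thesis.

```python
import numpy as np

def circular_correlate(signal, kernel):
    """Slide `kernel` over `signal` with wrap-around boundaries (1-D cross-correlation)."""
    n = len(signal)
    out = np.zeros(n)
    for i in range(n):
        for j, w in enumerate(kernel):
            out[i] += w * signal[(i + j) % n]
    return out

rng = np.random.default_rng(0)
signal = rng.normal(size=16)
kernel = np.array([1.0, -2.0, 1.0])
shift = 3

# Shifting the input and then filtering ...
shifted_then_filtered = circular_correlate(np.roll(signal, shift), kernel)
# ... gives the same response as filtering and then shifting the output.
filtered_then_shifted = np.roll(circular_correlate(signal, kernel), shift)

is_equivariant = np.allclose(shifted_then_filtered, filtered_then_shifted)
```

    The same check fails for a rotation or scaling of the input, which is the gap the equivariant models listed in the abstract are meant to close.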

    Panoramic Image-to-Image Translation

    In this paper, we tackle the challenging task of Panoramic Image-to-Image translation (Pano-I2I) for the first time. This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse conditions, like weather or time. To address these challenges, we propose a panoramic distortion-aware I2I model that preserves the structure of the panoramic images while consistently translating their global style referenced from a pinhole image. To mitigate the distortion issue in naive 360 panorama translation, we adopt spherical positional embedding to our transformer encoders, introduce a distortion-free discriminator, and apply sphere-based rotation for augmentation and its ensemble. We also design a content encoder and a style encoder to be deformation-aware to deal with a large domain gap between panoramas and pinhole images, enabling us to work on diverse conditions of pinhole images. In addition, considering the large discrepancy between panoramas and pinhole images, our framework decouples the learning procedure of the panoramic reconstruction stage from the translation stage. We show distinct improvements over existing I2I models in translating the StreetLearn dataset in the daytime into diverse conditions. The code will be publicly available online for our community
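    The abstract does not detail its sphere-based rotation. As a minimal sketch of the simplest case, assuming the panorama is stored in equirectangular form (longitude along the image width), a rotation about the vertical axis reduces to a circular shift of the columns and needs no resampling. General 3D rotations require resampling on the sphere and are not covered here; the function name and the sign convention are assumptions.

```python
import numpy as np

def rotate_yaw(pano, degrees):
    """Rotate an equirectangular panorama (H x W x C) about the vertical axis."""
    width = pano.shape[1]
    # 360 degrees of longitude span the full image width.
    shift = int(round(degrees / 360.0 * width))
    return np.roll(pano, shift, axis=1)

# Tiny synthetic panorama: 4 rows, 8 columns, 3 channels.
pano = np.arange(4 * 8 * 3).reshape(4, 8, 3)
rotated = rotate_yaw(pano, 90.0)      # a quarter turn moves content by 2 of 8 columns
full_turn = rotate_yaw(pano, 360.0)   # a full turn returns the original image
```

    Because the left and right image edges are adjacent on the sphere, the wrapped columns remain a valid panorama.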

    Laser Pointer Tracking in Projector-Augmented Architectural Environments

    We present a system that applies a custom-built pan-tilt-zoom camera for laser-pointer tracking in arbitrary real environments. Once placed in a building environment, it carries out a fully automatic self-registration, registrations of projectors, and sampling of surface parameters, such as geometry and reflectivity. After these steps, it can be used for tracking a laser spot on the surface as well as an LED marker in 3D space, using inter-playing fisheye context and controllable detail cameras. The captured surface information can be used for masking out areas that are critical to laser-pointer tracking, and for guiding geometric and radiometric image correction techniques that enable a projector-based augmentation on arbitrary surfaces. We describe a distributed software framework that couples laser-pointer tracking for interaction, projector-based AR as well as video see-through AR for visualizations with the domain specific functionality of existing desktop tools for architectural planning, simulation and building surveying

    Drone Journalism as Visual Aggregation: Toward a Critical History

    The use of unmanned aerial vehicles (UAVs, commonly referred to as drones) in journalism has emerged only recently and has grown significantly. This article explores what makes drone imagery, as an instance of what scholars of visual culture call an aerial view, so compelling for major news organizations as to warrant such attention and investment. To do this, the concept of ‘visual aggregation’ is introduced to theorize the authority of drone imagery in conventional journalistic practice. Imagery produced through drone journalism is a visual analogy to statistical summary and, more recently, to what is referred to as data journalism. Just as these combine an aggregate of cases to produce an understanding of an overall trend, drone imagery aggregates space visually, its broad visual field revealing large-scale spatial patterns in ways analogous to the statistical capture and analysis of large bodies of data. The article then employs a cultural and historical approach to identify key points in the emergence of visual aggregation as authoritative truth. The aerial view as a claim to truth is manifest in a wide range of antecedent social formations, devices and practices prior to their amalgamation in what has today become drone journalism. This analysis aids understanding of how drone journalism is a response to the institutional crises of journalism today.