1,667 research outputs found

    Bandpass filters for unconstrained target recognition and their implementation in coherent optical correlators

    An up-dateable correlator is simulated which is based on the non-degenerate four wave mixing (NDFWM) interaction in the photorefractive material bismuth silicon oxide (Bi12SiO20). Specifically, it is shown that variable bandpass filters can be implemented directly in the correlator by adjusting the relative strengths of the signal and reference beams used to write the Fourier transform hologram into the photorefractive. The synthetic discriminant function (SDF) method of grey-level multiplexing is reviewed. A bandpass modification of this technique is used in the design of a multiplexed filter for the recognition of an industrial test component from a limited number of known stable state orientations when viewed from an overhead camera position. Its performance in this task when implemented in the up-dateable correlator is assessed through simulation. The conclusion of this work is that filter multiplexing must be used judiciously for orientation invariant recognition. Only a limited number of images, typically under ten, may be multiplexed into each filter since correlation peak heights and peak-to-sidelobe ratios inevitably progressively deteriorate as images are added to the filter. The effect of severe amplitude disruptions in the frequency plane on correlation peak localisation is examined. In two or higher dimensions simulations show the localisation is very robust to this disruption; an analysis is developed to indicate the reason for this. The effect is exploited by the implementation of an algorithm that locally removes the spatial frequencies that exhibit close phase matching between intra- and inter-class images. The inter-class response can be forced to zero while simultaneously improving the intra-class tolerance to orientation changes. 
The technique is assessed through simulation with images of two types of motor vehicle, in a variety of orientations, and shown to be effective in improving discrimination and intra-class tolerance for examples in which these were initially very poor. Bandpass filters are experimentally implemented in a joint transform correlator (JTC) based on an NDFWM interaction in Bi12SiO20. The JTC is described and its full bandwidth performance initially assessed. As anticipated from the previous considerations, inter-class discrimination was high but the intra-class tolerance very poor due to the high sensitivity of the filter. The difference of Gaussian approximation to a Laplacian of a Gaussian filter is described and its experimental implementation in the JTC detailed. Experimental results are presented for the orientation independent recognition of a car while maintaining discrimination against another car. An intra-class to inter-class correlation ratio of 7.5 dB was obtained as a best case and 3.6 dB as a worst case, the intra-class variation being at 11° increments in orientation at zero elevation angle. The results are extrapolated to estimate that approximately 80 filters would be required for a full 2π steradian orientation coverage. The implementation of the frequency removal technique and the Wiener filter in the JTC is briefly considered in conclusion to this work.
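The abstract above relies on the difference of Gaussians acting as a bandpass filter. The following is a minimal sketch of why, not the thesis's optical implementation: it evaluates the frequency response of the difference of two unit-area Gaussians. The function names, the one-dimensional treatment, and the width ratio of 1.6 (a commonly quoted choice for approximating a Laplacian of a Gaussian) are assumptions made for illustration.

```python
import math


def dog_transfer(freq, sigma_narrow, sigma_wide):
    """Frequency response of (narrow Gaussian - wide Gaussian).

    A unit-area spatial Gaussian of width sigma has the Fourier transform
    exp(-2 * (pi * sigma * freq)**2), so the difference vanishes at freq = 0
    and at high frequencies, leaving a pass band in between.
    """
    g_narrow = math.exp(-2.0 * (math.pi * sigma_narrow * freq) ** 2)
    g_wide = math.exp(-2.0 * (math.pi * sigma_wide * freq) ** 2)
    return g_narrow - g_wide


def dog_peak_frequency(sigma_narrow, sigma_wide):
    """Frequency at which the response is largest (from d/dfreq = 0)."""
    ratio = math.log(sigma_wide / sigma_narrow)
    return math.sqrt(ratio / (sigma_wide ** 2 - sigma_narrow ** 2)) / math.pi


if __name__ == "__main__":
    # Hypothetical widths in pixels; 1.6 is an assumed ratio, not from the thesis.
    narrow, wide = 1.0, 1.6
    centre = dog_peak_frequency(narrow, wide)
    for f in (0.0, 0.5 * centre, centre, 2.0 * centre, 4.0 * centre):
        print(f"f = {f:.4f} cycles/pixel  response = {dog_transfer(f, narrow, wide):.4f}")
```

Changing the two widths moves and reshapes the pass band, which is the software analogue of tuning the band in the correlator.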

    Virtual Reality Aided Mobile C-arm Positioning for Image-Guided Surgery

    Image-guided surgery (IGS) is a minimally invasive procedure based on a pre-operative volume in conjunction with intra-operative X-ray images, which are commonly captured by mobile C-arms to confirm surgical outcomes. Although some commercial navigation systems are currently employed, one critical issue with such systems is that they neglect the radiation exposure to the patient and surgeons. In practice, when one surgical stage is finished, several X-ray images have to be acquired repeatedly by the mobile C-arm to obtain the desired image. Excessive radiation exposure may increase the risk of some complications. Therefore, it is necessary to develop a positioning system for mobile C-arms that achieves one-time imaging and avoids the additional radiation exposure. In this dissertation, a mobile C-arm positioning system is proposed with the aid of virtual reality (VR). The surface model of the patient is reconstructed by a camera mounted on the mobile C-arm. A novel registration method is proposed to align this model and the pre-operative volume based on a tracker, so that surgeons can visualize the hidden anatomy directly from the outside view and determine a reference pose of the C-arm. Considering the congested operating room, the C-arm is modeled as a manipulator with a movable base to maneuver the image intensifier to the desired pose. In the registration procedure above, intensity-based 2D/3D registration is used to transform the pre-operative volume into the coordinate system of the tracker. Although it provides high accuracy, its small capture range hinders its clinical use because of its dependence on the initial guess. To address this problem, a robust and fast initialization method is proposed based on automatic tracking-based initialization and multi-resolution estimation in the frequency domain. This hardware-software integrated approach provides almost optimal transformation parameters for intensity-based registration.
To determine the pose of the mobile C-arm, high-quality visualization is necessary to locate the pathology in the hidden anatomy. A novel dimensionality reduction method based on sparse representation is proposed for the design of multi-dimensional transfer functions in direct volume rendering. It not only achieves performance similar to that of conventional methods, but is also able to handle large data sets.

    Content based image pose manipulation

    This thesis proposes the application of space-frequency transformations to the domain of pose estimation in images. This idea is explored using the Wavelet Transform with illustrative applications in pose estimation for face images, and images of planar scenes. The approach is based on examining the spatial frequency components in an image, to allow the inherent scene symmetry balance to be recovered. For face images with restricted pose variation (looking left or right), an algorithm is proposed to maximise this symmetry in order to transform the image into a fronto-parallel pose. This scheme is further employed to identify the optimal frontal facial pose from a video sequence to automate facial capture processes. These features are an important pre-requisite in facial recognition and expression classification systems. The underlying principles of this spatial-frequency approach are examined with respect to images with planar scenes. Using the Continuous Wavelet Transform, full perspective planar transformations are estimated within a featureless framework. Restoring central symmetry to the wavelet transformed images in an iterative optimisation scheme removes this perspective pose. This advances upon existing spatial approaches that require segmentation and feature matching, and frequency only techniques that are limited to affine transformation recovery. To evaluate the proposed techniques, the pose of a database of subjects portraying varying yaw orientations is estimated and the accuracy is measured against the captured ground truth information. Additionally, full perspective homographies for synthesised and imaged textured planes are estimated. Experimental results are presented for both situations that compare favourably with existing techniques in the literature.

    An algebraic reconstruction technique (ART) for the synthesis of three-dimensional models of particle aggregates from projective representations

    There exists considerable evidence that the shear and flow behavior of granular materials is significantly dependent on particle morphology. However, quantification of this dependence is a challenging task owing to a dearth of quantitative models for describing particle shape and the difficulty of modeling angular particle assemblies. The situation becomes more complex when discrete element analyses of realistic 3-D particle shapes are required. The thesis attempts to address this problem by adapting the algebraic reconstruction technique (ART) to synthesize composite 3-D granular particles from statistically obtained 3-D shape descriptors of the particles in an aggregate mixture. This thesis extends previous work where it was demonstrated that the 3-D shape characteristics of particles in an aggregate mixture can be numerically expressed by statistical models obtained from 2-D projective representations of multiple particles in the mixture. In this thesis, attempts were made to validate the premise that multiple projective representations of multiple particles could be used to synthesize a composite 3-D particle that represents the entire mixture in terms of its 3-D shape descriptors. Also, single particles isolated from the aggregate mix were scanned using optical and X-ray tomography techniques to generate 2-D multiple projections and synthesize the 3-D particle shape. This research work proves useful for generating realistic shapes for discrete element applications or in obtaining more fundamental understanding of the micromechanics of granular solids.
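For readers unfamiliar with ART, the sketch below shows the classic row-action (Kaczmarz) form of the technique on a toy problem. It is not the adaptation developed in the thesis: the 2x2 "image", the six ray sums, and all function and parameter names are hypothetical and chosen only to make the iteration visible.

```python
def art_reconstruct(rows, measurements, n_unknowns, sweeps=200, relaxation=1.0):
    """Classic ART: project the estimate onto one ray equation at a time.

    rows[i] holds the weights of each unknown along ray i and
    measurements[i] is the measured ray sum. The estimate starts at zero.
    """
    x = [0.0] * n_unknowns
    for _ in range(sweeps):
        for a, b in zip(rows, measurements):
            norm_sq = sum(v * v for v in a)
            if norm_sq == 0.0:
                continue  # a ray that touches no pixel carries no information
            residual = b - sum(ai * xi for ai, xi in zip(a, x))
            step = relaxation * residual / norm_sq
            x = [xi + step * ai for xi, ai in zip(x, a)]
    return x


# Toy 2x2 image [[1, 2], [3, 4]] flattened row by row (assumed test data).
TRUE_IMAGE = [1.0, 2.0, 3.0, 4.0]
RAYS = [
    [1, 1, 0, 0],  # top row
    [0, 0, 1, 1],  # bottom row
    [1, 0, 1, 0],  # left column
    [0, 1, 0, 1],  # right column
    [1, 0, 0, 1],  # main diagonal
    [0, 1, 1, 0],  # anti-diagonal
]
RAY_SUMS = [sum(w * p for w, p in zip(ray, TRUE_IMAGE)) for ray in RAYS]

if __name__ == "__main__":
    estimate = art_reconstruct(RAYS, RAY_SUMS, n_unknowns=4)
    print([round(v, 4) for v in estimate])
```

With consistent ray sums and a relaxation factor between 0 and 2, the estimate should approach the true pixel values; real projection data is noisy, which is where the relaxation factor and the number of sweeps start to matter.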

    A semidiscrete version of the Citti-Petitot-Sarti model as a plausible model for anthropomorphic image reconstruction and pattern recognition

    In his beautiful book [66], Jean Petitot proposes a sub-Riemannian model for the primary visual cortex of mammals. This model is neurophysiologically justified. Further developments of this theory lead to efficient algorithms for image reconstruction, based upon the consideration of an associated hypoelliptic diffusion. The sub-Riemannian model of Petitot and Citti-Sarti (or certain of its improvements) is a left-invariant structure over the group SE(2) of rototranslations of the plane. Here, we propose a semi-discrete version of this theory, leading to a left-invariant structure over the group SE(2,N), restricting to a finite number of rotations. This apparently very simple group is in fact quite atypical: it is maximally almost periodic, which leads to much simpler harmonic analysis compared to SE(2). Based upon this semi-discrete model, we improve on previous image-reconstruction algorithms and we develop a pattern-recognition theory that leads also to very efficient algorithms in practice. Comment: 123 pages, revised version
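As a small illustration of the group SE(2,N) mentioned above, the sketch below encodes an element as a translation in the plane paired with an integer rotation index k, standing for a rotation by 2πk/N, and composes elements with the usual semidirect-product law. The representation and the function names are assumptions for illustration and are not taken from the paper.

```python
import math


def rotate(point, k, n):
    """Rotate a 2-D point by the angle 2*pi*k/n."""
    angle = 2.0 * math.pi * k / n
    c, s = math.cos(angle), math.sin(angle)
    x, y = point
    return (c * x - s * y, s * x + c * y)


def compose(g, h, n):
    """Group law: (t_g, k_g) * (t_h, k_h) = (t_g + R_{k_g} t_h, k_g + k_h mod n)."""
    (tg, kg), (th, kh) = g, h
    rx, ry = rotate(th, kg, n)
    return ((tg[0] + rx, tg[1] + ry), (kg + kh) % n)


def inverse(g, n):
    """Inverse element: (t, k)^-1 = (-R_{-k} t, -k mod n)."""
    t, k = g
    rx, ry = rotate(t, -k, n)
    return ((-rx, -ry), (-k) % n)


if __name__ == "__main__":
    n = 8  # hypothetical number of allowed rotations
    g = ((1.0, 2.0), 3)
    h = ((-0.5, 4.0), 6)
    print(compose(g, h, n))
    print(compose(g, inverse(g, n), n))  # close to the identity ((0, 0), 0)
```

Only the rotation part is discretised; translations stay continuous, so the structure remains a group for any N.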

    Object Recognition

    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving our car in the correct lane. We do these tasks effortlessly in real-time. In the last decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability of visual recognition. Such a capability will allow machines to free humans from boring or dangerous jobs.

    Learning Equivariant Representations

    State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being the shift-equivariance. By sliding a filter over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present.
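The shift-equivariance property described in this abstract can be checked numerically. The sketch below is an assumed one-dimensional toy with circular boundary handling, not any of the thesis's models: it slides a small filter over a signal and verifies that shifting the input first gives the same result as shifting the output afterwards. The signal, the filter, and the function names are hypothetical.

```python
def circular_correlate(signal, kernel):
    """Slide `kernel` over `signal`, wrapping around at the ends."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def circular_shift(values, shift):
    """Move every sample `shift` positions to the right, with wrap-around."""
    n = len(values)
    return [values[(i - shift) % n] for i in range(n)]


if __name__ == "__main__":
    signal = [0, 1, 3, 2, 0, 0, 5, 1]  # made-up input
    kernel = [1, -2, 1]                # made-up filter
    shift = 3

    shifted_then_filtered = circular_correlate(circular_shift(signal, shift), kernel)
    filtered_then_shifted = circular_shift(circular_correlate(signal, kernel), shift)
    print(shifted_then_filtered == filtered_then_shifted)
```

The same check fails for a generic fully connected layer, which is the sense in which weight sharing builds the symmetry into the model; the thesis extends this idea from shifts to rotations and scalings.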

    3-D surface modelling of the human body and 3-D surface anthropometry

    This thesis investigates three-dimensional (3-D) surface modelling of the human body and 3-D surface anthropometry. These are two separate, but closely related, areas. 3-D surface modelling is an essential technology for representing and describing the surface shape of an object on a computer. 3-D surface modelling of the human body has wide applications in engineering design, work space simulation, the clothing industry, medicine, biomechanics and animation. These applications require increasingly realistic surface models of the human body. 3-D surface anthropometry is a new interdisciplinary subject. It is defined in this thesis as the art, science, and technology of acquiring, modelling and interrogating 3-D surface data of the human body. [Continues.]

    Deep learning for fast and robust medical image reconstruction and analysis

    Medical imaging is an indispensable component of modern medical research as well as clinical practice. Nevertheless, imaging techniques such as magnetic resonance imaging (MRI) and computed tomography (CT) are costly and are less accessible to the majority of the world. To make medical devices more accessible, affordable and efficient, it is crucial to re-calibrate our current imaging paradigm for smarter imaging. In particular, as medical imaging techniques have highly structured forms in the way they acquire data, they provide us with an opportunity to optimise the imaging techniques holistically by leveraging data. The central theme of this thesis is to explore different opportunities where we can exploit data and deep learning to improve the way we extract information for better, faster and smarter imaging. This thesis explores three distinct problems. The first problem is the time-consuming nature of dynamic MR data acquisition and reconstruction. We propose deep learning methods for accelerated dynamic MR image reconstruction, resulting in up to 10-fold reduction in imaging time. The second problem is the redundancy in our current imaging pipeline. Traditionally, the imaging pipeline has treated acquisition, reconstruction and analysis as separate steps. However, we argue that one can approach them holistically and optimise the entire pipeline jointly for a specific target goal. To this end, we propose deep learning approaches for obtaining high fidelity cardiac MR segmentation directly from significantly undersampled data, greatly exceeding the undersampling limit for image reconstruction. The final part of this thesis tackles the problem of interpretability of the deep learning algorithms. We propose attention-models that can implicitly focus on salient regions in an image to improve accuracy for ultrasound scan plane detection and CT segmentation.
More crucially, these models can provide explainability, which is a crucial stepping stone for the harmonisation of smart imaging and current clinical practice.