143 research outputs found
Dense 3D Face Correspondence
We present an algorithm that automatically establishes dense correspondences
between a large number of 3D faces. Starting from automatically detected sparse
correspondences on the outer boundary of 3D faces, the algorithm triangulates
existing correspondences and expands them iteratively by matching points of
distinctive surface curvature along the triangle edges. After exhausting
keypoint matches, further correspondences are established by generating evenly
distributed points within triangles by evolving level set geodesic curves from
the centroids of large triangles. A deformable model (K3DM) is constructed from
the dense corresponded faces and an algorithm is proposed for morphing the K3DM
to fit unseen faces. This algorithm iterates between rigid alignment of an
unseen face followed by regularized morphing of the deformable model. We have
extensively evaluated the proposed algorithms on synthetic data and real 3D
faces from the FRGCv2, Bosphorus, BU3DFE and UND Ear databases using
quantitative and qualitative benchmarks. Our algorithm achieved dense
correspondences with a mean localisation error of 1.28mm on synthetic faces and
detected anthropometric landmarks on unseen real faces from the FRGCv2
database with 3mm precision. Furthermore, our deformable model fitting
algorithm achieved 98.5% face recognition accuracy on the FRGCv2 and 98.6% on
Bosphorus database. Our dense model is also able to generalize to unseen
datasets.Comment: 24 Pages, 12 Figures, 6 Tables and 3 Algorithm
Retinal Fundus Image Registration via Vascular Structure Graph Matching
Motivated by the observation that a retinal fundus image may contain some unique geometric structures within
its vascular trees which can be utilized for feature matching, in this paper, we proposed a graph-based registration
framework called GM-ICP to align pairwise retinal images. First, the retinal vessels are automatically detected and
represented as vascular structure graphs. A graph matching is then performed to find global correspondences between
vascular bifurcations. Finally, a revised ICP algorithm incorporating with quadratic transformation model is used at
fine level to register vessel shape models. In order to eliminate the incorrect matches from global correspondence
set obtained via graph matching, we proposed a structure-based sample consensus (STRUCT-SAC) algorithm. The
advantages of our approach are threefold: (1) global optimum solution can be achieved with graph matching; (2)
our method is invariant to linear geometric transformations; and (3) heavy local feature descriptors are not required.
The effectiveness of our method is demonstrated by the experiments with 48 pairs retinal images collected from
clinical patients
A Survey on Deep Learning in Medical Image Registration: New Technologies, Uncertainty, Evaluation Metrics, and Beyond
Over the past decade, deep learning technologies have greatly advanced the
field of medical image registration. The initial developments, such as
ResNet-based and U-Net-based networks, laid the groundwork for deep
learning-driven image registration. Subsequent progress has been made in
various aspects of deep learning-based registration, including similarity
measures, deformation regularizations, and uncertainty estimation. These
advancements have not only enriched the field of deformable image registration
but have also facilitated its application in a wide range of tasks, including
atlas construction, multi-atlas segmentation, motion estimation, and 2D-3D
registration. In this paper, we present a comprehensive overview of the most
recent advancements in deep learning-based image registration. We begin with a
concise introduction to the core concepts of deep learning-based image
registration. Then, we delve into innovative network architectures, loss
functions specific to registration, and methods for estimating registration
uncertainty. Additionally, this paper explores appropriate evaluation metrics
for assessing the performance of deep learning models in registration tasks.
Finally, we highlight the practical applications of these novel techniques in
medical imaging and discuss the future prospects of deep learning-based image
registration
From 3D Point Clouds to Pose-Normalised Depth Maps
We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in a relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of the following four stages: (i) data filtering, (ii) nose tip identification and sub-vertex localisation, (iii) computation of the (relative) face orientation, (iv) generation of either a pose aligned or a pose normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface and this is employed within all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined, which allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data)
3-D Scene Reconstruction from Aerial Imagery
3-D scene reconstructions derived from Structure from Motion (SfM) and Multi-View Stereo (MVS) techniques were analyzed to determine the optimal reconnaissance flight characteristics suitable for target reconstruction. In support of this goal, a preliminary study of a simple 3-D geometric object facilitated the analysis of convergence angles and number of camera frames within a controlled environment. Reconstruction accuracy measurements revealed at least 3 camera frames and a 6 convergence angle were required to achieve results reminiscent of the original structure. The central investigative effort sought the applicability of certain airborne reconnaissance flight profiles to reconstructing ground targets. The data sets included images collected within a synthetic 3-D urban environment along circular, linear and s-curve aerial flight profiles equipped with agile and non-agile sensors. S-curve and dynamically controlled linear flight paths provided superior results, whereas with sufficient data conditioning and combination of orthogonal flight paths, all flight paths produced quality reconstructions under a wide variety of operational considerations
Motion correction of PET/CT images
Indiana University-Purdue University Indianapolis (IUPUI)The advances in health care technology help physicians make more accurate diagnoses about the health conditions of their patients. Positron Emission Tomography/Computed Tomography (PET/CT) is one of the many tools currently used to diagnose health and disease in patients. PET/CT explorations are typically used to detect: cancer, heart diseases, disorders in the central nervous system. Since PET/CT studies can take up to 60 minutes or more, it is impossible for patients to remain motionless throughout the scanning process. This movements create motion-related artifacts which alter the quantitative and qualitative results produced by the scanning process. The patient's motion results in image blurring, reduction in the image signal to noise ratio, and reduced image contrast, which could lead to misdiagnoses.
In the literature, software and hardware-based techniques have been studied to implement motion correction over medical files. Techniques based on the use of an external motion tracking system are preferred by researchers because they present a better accuracy. This thesis proposes a motion correction system that uses 3D affine registrations using particle swarm optimization and an off-the-shelf Microsoft Kinect camera to eliminate or reduce errors caused by the patient's motion during a medical imaging study
Methods for multi-spectral image fusion: identifying stable and repeatable information across the visible and infrared spectra
Fusion of images captured from different viewpoints is a well-known challenge in computer vision with many established approaches and applications; however, if the observations are captured by sensors also separated by wavelength, this challenge is compounded significantly. This dissertation presents an investigation into the fusion of visible and thermal image information from two front-facing sensors mounted side-by-side. The primary focus of this work is the development of methods that enable us to map and overlay multi-spectral information; the goal is to establish a combined image in which each pixel contains both colour and thermal information. Pixel-level fusion of these distinct modalities is approached using computational stereo methods; the focus is on the viewpoint alignment and correspondence search/matching stages of processing. Frequency domain analysis is performed using a method called phase congruency. An extensive investigation of this method is carried out with two major objectives: to identify predictable relationships between the elements extracted from each modality, and to establish a stable representation of the common information captured by both sensors. Phase congruency is shown to be a stable edge detector and repeatable spatial similarity measure for multi-spectral information; this result forms the basis for the methods developed in the subsequent chapters of this work. The feasibility of automatic alignment with sparse feature-correspondence methods is investigated. It is found that conventional methods fail to match inter-spectrum correspondences, motivating the development of an edge orientation histogram (EOH) descriptor which incorporates elements of the phase congruency process. A cost function, which incorporates the outputs of the phase congruency process and the mutual information similarity measure, is developed for computational stereo correspondence matching. An evaluation of the proposed cost function shows it to be an effective similarity measure for multi-spectral information
Robotic Assembly Using 3D and 2D Computer Vision
The content of this thesis concerns the development and evaluation of a robotic cell used for
automated assembly. The automated assembly is made possible by a combination of an eye-inhand
2D camera and a stationary 3D camera used to automatically detect objects. Computer
vision, kinematics and programming is the main topics of the thesis. Possible approaches to
object detection has been investigated and evaluated in terms of performance. The kinematic
relation between the cameras in the robotic cell and robotic manipulator movements has been
described. A functioning solution has been implemented in the robotic cell at the Department
of Production and Quality Engineering laboratory.
Theory with significant importance to the developed solution is presented. The methods used
to achieve each part of the solution is anchored in theory and presented with the decisions and guidelines made throughout the project work in order to achieve the final solution.
Each part of the system is presented with associated results. The combination of these results yields a solution which proves that the methods developed to achieve automated assembly works as intended. Limitations, challenges and future possibilities and improvements for the solution is then discussed.
The results from the experiments presented in this thesis demonstrates the performance of the
developed system. The system fulfills the specifications defined in the problem description and is functioning as intended considering the instrumentation used
- …