77 research outputs found

    Active appearance pyramids for object parametrisation and fitting

    Object class representation is one of the key problems in various medical image analysis tasks. We propose a part-based parametric appearance model we refer to as an Active Appearance Pyramid (AAP). The parts are delineated by multi-scale Local Feature Pyramids (LFPs) for superior spatial specificity and distinctiveness. An AAP models the variability within a population with local translations of multi-scale parts and linear appearance variations of the assembly of the parts. It can fit and represent new instances by adjusting the shape and appearance parameters. The fitting process uses a two-step iterative strategy: local landmark searching followed by shape regularisation. We present a simultaneous local feature searching and appearance fitting algorithm based on the weighted Lucas and Kanade method. A shape regulariser is derived to calculate the maximum likelihood shape with respect to the prior and multiple landmark candidates from multi-scale LFPs, with a compact closed-form solution. We apply the 2D AAP on the modelling of variability in patients with lumbar spinal stenosis (LSS) and validate its performance on 200 studies consisting of routine axial and sagittal MRI scans. Intervertebral sagittal and parasagittal cross-sections are typically used for the diagnosis of LSS; we therefore build three AAPs on L3/4, L4/5 and L5/S1 axial cross-sections and three on parasagittal slices. Experiments show significant improvement in convergence range, robustness to local minima and segmentation precision compared with Constrained Local Models (CLMs), Active Shape Models (ASMs) and Active Appearance Models (AAMs), as well as superior performance in appearance reconstruction compared with AAMs. We also validate the performance on 3D CT volumes of hip joints from 38 studies. Compared to AAMs, AAPs achieve a higher segmentation and reconstruction precision. Moreover, AAPs are significantly more efficient, consuming about half the memory, less than 10% of the training time and 15% of the testing time.
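The shape regularisation step mentioned in this abstract can be illustrated with a generic closed-form estimate. The sketch below is not the authors' regulariser: it assumes a linear shape model with Gaussian-distributed mode coefficients and a single weighted candidate per landmark coordinate, whereas the abstract describes multiple candidates from multi-scale LFPs. The function name `regularise_shape` and all parameter names are hypothetical.

```python
import numpy as np

def regularise_shape(candidates, weights, mean_shape, modes, variances):
    """Most probable shape under a linear-Gaussian prior (illustrative only).

    candidates : (2n,) stacked landmark coordinates found by the local search
    weights    : (2n,) confidence per coordinate (0 = ignore this coordinate)
    mean_shape : (2n,) mean of the training shapes
    modes      : (2n, k) shape modes, e.g. principal components
    variances  : (k,) variance of each mode in the training set
    """
    W = np.diag(weights)
    # Normal equations of: sum_i w_i * residual_i^2 + sum_j b_j^2 / variance_j
    A = modes.T @ W @ modes + np.diag(1.0 / variances)
    rhs = modes.T @ W @ (candidates - mean_shape)
    b = np.linalg.solve(A, rhs)          # closed-form shape coefficients
    return mean_shape + modes @ b, b
```

Because low-confidence coordinates receive small weights, an outlying candidate pulls the fitted shape less than a trusted one, and the prior term keeps the coefficients within the range seen in training.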

    Deformable appearance pyramids for anatomy representation, landmark detection and pathology classification

    Purpose: Representation of anatomy appearance is one of the key problems in medical image analysis. An appearance model represents the anatomies with parametric forms, which are then vectorised for prior learning, segmentation and classification tasks. Methods: We propose a part-based parametric appearance model we refer to as a deformable appearance pyramid (DAP). The parts are delineated by multi-scale local feature pyramids extracted from an image pyramid. Each anatomy is represented by an appearance pyramid, with the variability within a population approximated by local translations of the multi-scale parts and linear appearance variations in the assembly of the parts. We introduce DAPs built on two types of image pyramids, namely Gaussian and wavelet pyramids, and present two approaches to model the prior and fit the model, one explicitly using a subspace Lucas–Kanade algorithm and the other implicitly using the supervised descent method (SDM). Results: We validate the performance of the DAP instances with different configurations on the problem of lumbar spinal stenosis for localising the landmarks and classifying the pathologies. We also compare them with classic methods such as active shape models, active appearance models and constrained local models. Experimental results show that the DAP built on wavelet pyramids and fitted with SDM gives the best results in both landmark localisation and classification. Conclusion: A new appearance model is introduced with several configurations presented and evaluated. The DAPs can be readily applied to other clinical problems for the tasks of prior learning, landmark detection and pathology classification.
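As a rough illustration of multi-scale local feature pyramids extracted from an image pyramid, the following sketch builds a Gaussian pyramid and cuts one fixed-size patch per level around a landmark. It is an assumption-laden toy, not the paper's implementation: the 5-tap binomial filter, the patch size and the function names (`blur`, `gaussian_pyramid`, `local_feature_pyramid`) are all chosen here for illustration.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # 5-tap binomial filter

def blur(image):
    """Separable binomial blur with edge padding; output keeps the input shape."""
    padded = np.pad(image, 2, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, KERNEL, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, mode="valid"), 0, rows)

def gaussian_pyramid(image, levels):
    """Level 0 is the input; each further level is blurred and subsampled by 2."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

def local_feature_pyramid(pyramid, row, col, half=2):
    """One (2*half+1)-square patch per level, centred on the same landmark."""
    patches = []
    for level, img in enumerate(pyramid):
        r, c = row >> level, col >> level      # integer landmark position at this scale
        padded = np.pad(img, half, mode="edge")
        patches.append(padded[r:r + 2 * half + 1, c:c + 2 * half + 1])
    return patches
```

Since every patch has the same pixel size, patches from coarser levels cover a wider region of the original image, which is what gives the stack its multi-scale context.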

    A learning approach to swarm-based path detection and tracking

    Dissertation submitted for the degree of Master in Electrical and Computer Engineering. This dissertation presents a set of top-down modulation mechanisms for the swarm-based visual saliency computation process proposed by Santana et al. (2010) in the context of path detection and tracking. In the original visual saliency computation process, two swarms of agents sensitive to bottom-up conspicuity information interact via pheromone-like signals so as to converge on the most likely location of the path being sought. The behaviours ruling the agents' motion are composed of a set of perception-action rules that embed top-down knowledge about the path's overall layout. This reduces ambiguity in the face of distractors. However, distractors with a shape similar to that of the path being sought can still misguide the system. To mitigate this issue, this dissertation proposes the use of a contrast model to modulate the conspicuity computation and the use of an appearance model to modulate the pheromone deployment. Given the heterogeneity of the paths, these models are learnt online. Because they are used in a modulation context rather than for direct image processing, the complexity of these models can be reduced without hampering robustness. The result is a computationally parsimonious system with a working frequency of 20 Hz. Experimental results obtained from a data set encompassing 39 diverse videos show the ability of the proposed model to localise the path in 98.67% of the 29789 evaluated frames.

    Deformable and articulated 3D reconstruction from monocular video sequences

    PhD thesis. This thesis addresses the problem of deformable and articulated structure from motion from monocular uncalibrated video sequences. Structure from motion is defined as the problem of recovering information about the 3D structure of scenes imaged by a camera in a video sequence. Our study aims at the challenging problem of non-rigid shapes (e.g. a beating heart or a smiling face). Non-rigid structures appear constantly in our everyday life: think of a bicep curling, a torso twisting or a smiling face. Our research seeks a general method to perform 3D shape recovery purely from data, without having to rely on a pre-computed model or training data. Open problems in the field are the difficulty of the non-linear estimation, the lack of a real-time system, large amounts of missing data in real-world video sequences, measurement noise and strong deformations. Solving these problems would take us far beyond the current state of the art in non-rigid structure from motion. This dissertation presents our contributions in the field of non-rigid structure from motion, detailing a novel algorithm that enforces the exact metric structure of the problem at each step of the minimisation by projecting the motion matrices onto the correct deformable or articulated metric motion manifolds respectively. An important advantage of this new algorithm is its ability to handle missing data, which becomes crucial when dealing with real video sequences. We present a generic bilinear estimation framework, which improves convergence and makes use of the manifold constraints. Finally, we demonstrate a sequential, frame-by-frame estimation algorithm, which provides a 3D model and camera parameters for each video frame, while simultaneously building a model of object deformation.

    Statistical shape analysis for bio-structures : local shape modelling, techniques and applications

    A Statistical Shape Model (SSM) is a statistical representation of a shape obtained from data to study variation in shapes. Work on shape modelling is constrained by many unsolved problems, for instance, difficulties in modelling local versus global variation. SSMs have been successfully applied in medical image applications such as the analysis of brain anatomy. Since brain structure is so complex and varies across subjects, methods to identify morphological variability can be useful for diagnosis and treatment. The main objective of this research is to generate and develop a statistical shape model to analyse local variation in shapes. Within this particular context, this work addresses the question of which local elements need to be identified for effective shape analysis. Here, the proposed method is based on a Point Distribution Model and uses a combination of other well-known techniques: Fractal analysis; Markov Chain Monte Carlo methods; and the Curvature Scale Space representation for the problem of contour localisation. Similarly, Diffusion Maps are employed as a spectral shape clustering tool to identify sets of local partitions useful in the shape analysis. Additionally, a novel Hierarchical Shape Analysis method based on the Gaussian and Laplacian pyramids is explained and used to compare the featured Local Shape Model. Experimental results on a number of real contours such as animal, leaf and brain white matter outlines are presented to demonstrate the effectiveness of the proposed model. These results show that local shape models are efficient in modelling the statistical variation of shape of biological structures. In particular, the development of this model provides an approach to the analysis of brain images and brain morphometrics. Likewise, the model can be adapted to the problem of content-based image retrieval, where global and local shape similarity needs to be measured.

    The application of range imaging for improved local feature representations

    This thesis presents an investigation into the integration of information extracted from co-aligned range and intensity images to achieve pose invariant object recognition. Local feature matching is a fundamental technique in image analysis that underpins many computer vision-based applications; the approach comprises identifying a collection of interest points in an image, characterising the local image region surrounding the interest point by means of a descriptor, and matching these descriptors between example images. Such local feature descriptors are formed from a measure of the local image statistics in the region surrounding the interest point. The interest point locations and the means of measuring local image statistics should be chosen such that the resultant descriptor remains stable across a range of common image transformations. Recently the availability of low cost, high quality range imaging devices has motivated an interest in local feature extraction from range images. It has been widely assumed in the vision community that the range imaging domain has properties which remain quasi-invariant through a wide range of changes in illumination and pose. Accordingly, it has been suggested that local feature extraction in the range domain should allow the calculation of local feature descriptors that are potentially more robust than those calculated from the intensity imaging domain alone. However, range images capture different characteristics from intensity images, which are frequently used, independently of range images, to create robust local features. Therefore, this work attempts to establish the best means of combining information from these two imaging modalities to further increase the reliability of matching local features. Local feature extraction comprises a series of processes applied to an image location such that a collection of repeatable descriptors can be established.
    By using co-aligned range and intensity images this work investigates the choice of modality and method for each step in the extraction process as an approach to optimising the resulting descriptor. Additionally, multimodal features are formed by combining information from both domains in a single stage in the extraction process. To further improve the quality of feature descriptors, the surface normals and the 3D structure calculated from the range image are used to correct the 3D appearance of a local sample patch, thereby increasing the similarity between observations. The matching performance of local features is evaluated using an experimental setup comprising a turntable and stereo pair of cameras. This experimental setup is used to create a database of intensity and range images for 5 objects imaged at 72 calibrated viewpoints, creating a database of 360 object observations. The use of a calibrated turntable, in combination with the 3D object surface coordinates supplied by the range image, allows location correspondences between object observations to be established, and therefore descriptor matches to be labelled as either true positive or false positive. Applying this methodology to the formulated local features shows that two approaches demonstrate state-of-the-art performance, with a ~40% increase in area under ROC curve at a False Positive Rate of 10% when compared with standard SIFT. These approaches are range affine corrected intensity SIFT and element corrected surface gradients SIFT. Furthermore, this work uses the 3D structure encoded in the range image to organise collections of interest points from a series of observations into a collection of canonical views in a new model local feature. The canonical views for an interest point are stored in a view-compartmentalised structure which allows the appearance of a local interest point to be characterised across the view sphere.
    Each canonical view is assigned a confidence measure based on the 3D pose of the interest point at observation; this confidence measure is then used to match similar canonical views of model and query interest points, thereby achieving a pose invariant interest point description. This approach does not produce a statistically significant performance increase. It does, however, contribute a validated methodology for combining multiple descriptors with differing confidence weightings into a single keypoint.
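The labelling of descriptor matches as true or false positives, as described in this abstract, can be sketched as follows. This is a hypothetical illustration rather than the thesis's procedure: it assumes the 3D surface coordinates of both observations have already been transformed into a common object frame (which the calibrated turntable would make possible), uses plain nearest-neighbour matching on Euclidean descriptor distance, and the function name `label_matches` and tolerance `tol` are invented for the example.

```python
import numpy as np

def label_matches(desc_a, desc_b, xyz_a, xyz_b, tol=2.0):
    """Match descriptors from view A to view B and label each match.

    desc_a, desc_b : (na, d) and (nb, d) descriptor arrays
    xyz_a, xyz_b   : (na, 3) and (nb, 3) keypoint positions in a shared object frame
    tol            : largest 3D distance (same units as xyz) accepted as the same point
    """
    # Pairwise Euclidean distances between all descriptors of A and B
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)                       # best match in B for each A
    match_dist = dists[np.arange(len(nearest)), nearest]
    # A match is a true positive if the two keypoints lie on the same surface point
    is_true = np.linalg.norm(xyz_a - xyz_b[nearest], axis=1) <= tol
    return nearest, match_dist, is_true
```

Sweeping an acceptance threshold over `match_dist` and counting the `is_true` labels on either side of it would then yield the points of an ROC curve.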

    Statistical Modelling of Craniofacial Shape

    With prior knowledge and experience, people can easily observe rich shape and texture variation for a certain type of object, such as human faces, cats or chairs, in both 2D and 3D images. This ability helps us recognise the same person, distinguish different kinds of creatures and sketch unseen samples of the same object class. The process of capturing this prior knowledge is mathematically interpreted as statistical modelling. The outcome is a morphable model, a vector space representation of objects, that captures the variation of shape and texture. This thesis presents research aimed at constructing 3D Morphable Models (3DMMs) of craniofacial shape and texture using new algorithms and processing pipelines to offer enhanced modelling abilities over existing techniques. In particular, we present several fully automatic modelling approaches and apply them to a large dataset of 3D images of the human head, the Headspace dataset, thus generating the first public shape-and-texture 3DMM of the full human head. We call this the Liverpool-York Head Model, reflecting the data collection and statistical modelling respectively. We also explore craniofacial symmetry and asymmetry in template morphing and statistical modelling. We propose a Symmetry-aware Coherent Point Drift (SA-CPD) algorithm, which mitigates the tangential sliding problem seen in competing morphing algorithms. Based on the symmetry-constrained correspondence output of SA-CPD, we present a symmetry-factored statistical modelling method for craniofacial shape. Also, we propose an iterative process of refinement for a 3DMM of the human ear that employs data augmentation. Then we merge the proposed 3DMMs of the ear with the full head model. As craniofacial clinicians like to look at head profiles, we propose a new pipeline to build a 2D morphable model of the craniofacial sagittal profile and augment it with profile models from frontal and top-down views. Our models and data are made publicly available online for research purposes.

    Transform domain texture synthesis on surfaces

    In the recent past, application areas such as virtual reality experiences, digital cinema and computer gaming have resulted in renewed interest in advanced research topics in computer graphics. Although many research challenges in computer graphics have been met due to worldwide efforts, many more are yet to be met. Two key challenges that remain open research problems are the lack of perfect realism in animated or virtually created objects when represented in graphical format, and the need to transmit, store and exchange massive amounts of information between remote locations when 3D computer-generated objects are used in remote visualisations. These challenges call for further research to be focused in the above directions. Although a significant number of ideas have been proposed by the international research community in an effort to meet these challenges, they still suffer from excessive complexity, resulting in high processing times and practical inapplicability when bandwidth-constrained transmission media are used or when the storage space or computational power of the display device is limited. In the proposed work we investigate the appropriate use of geometric representations of 3D structure (e.g. Bezier surface, NURBS, polygons) and multi-resolution, progressive representation of texture on such surfaces. This joint approach to texture synthesis has not been considered before and has significant potential in resolving current challenges in virtual realism, digital cinema and the computer gaming industry. The main focus of the novel approaches proposed in this thesis is performing photo-realistic texture synthesis on surfaces. We provide experimental results and detailed analysis to show that the proposed algorithms allow fast, progressive building of texture on arbitrarily shaped 3D surfaces. In particular, we investigate the above ideas in association with Bezier patch representation of 3D objects, an approach which has not been considered so far by any published research effort worldwide, yet has flexibility of utmost practical importance. Further, we discuss the novel application domains that can be served by the inclusion of additional functionality within the proposed algorithms.
    EThOS - Electronic Theses Online Service