
    Pose-invariant, model-based object recognition, using linear combination of views and Bayesian statistics

    This thesis presents an in-depth study of the problem of object recognition, and in particular the detection of 3-D objects in 2-D intensity images which may be viewed from a variety of angles. A solution to this problem remains elusive to this day, since it involves dealing with variations in geometry, photometry and viewing angle, noise, occlusions and incomplete data. This work restricts its scope to a particular kind of extrinsic variation: variation of the image due to changes in the viewpoint from which the object is seen. A technique is proposed and developed to address this problem, which falls into the category of view-based approaches, that is, methods in which an object is represented as a collection of a small number of 2-D views, as opposed to the generation of a full 3-D model. This technique is based on the theoretical observation that the geometry of the set of possible images of an object undergoing 3-D rigid transformations and scaling may, under most imaging conditions, be represented by a linear combination of a small number of 2-D views of that object. It is therefore possible to synthesise a novel image of an object given at least two existing and dissimilar views of the object, and a set of linear coefficients that determine how these views are to be combined in order to synthesise the new image. The method works in conjunction with a powerful optimization algorithm to search for and recover the optimal linear combination coefficients that will synthesise a novel image that is as similar as possible to the target scene view. If the similarity between the synthesised and the target images is above some threshold, then an object is determined to be present in the scene and its location and pose are defined, in part, by the coefficients. The key benefit of this technique is that, because it works directly with pixel values, it avoids the need for problematic low-level feature extraction and solution of the correspondence problem.
As a result, a linear combination of views (LCV) model is easy to construct and use, since it only requires a small number of stored 2-D views of the object in question, and the selection of a few landmark points on the object, a process that is easily carried out during the offline model-building stage. In addition, this method is general enough to be applied across a variety of recognition problems and different types of objects. The development and application of this method are initially explored on two-dimensional problems, and the same principles are then extended to 3-D. Additionally, the method is evaluated across synthetic and real-image datasets containing variations in the objects' identity and pose. Future work on possible extensions to incorporate a foreground/background model and lighting variations of the pixels is examined.
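
    The geometric idea behind the linear combination of views can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes one common form of the LCV equations, in which each landmark of the novel view is an affine combination of the corresponding landmark coordinates in two stored views. The function name `synthesise_landmarks` and the coefficient layout are hypothetical, and the sketch covers only landmark geometry, not the pixel-level synthesis or the coefficient search described in the abstract.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def synthesise_landmarks(
    view1: Sequence[Point],
    view2: Sequence[Point],
    a: Sequence[float],
    b: Sequence[float],
) -> List[Point]:
    """Combine corresponding landmarks of two stored views into a novel view.

    Assumed (hypothetical) form of the LCV equations:
        x = a0 + a1*x1 + a2*y1 + a3*x2 + a4*y2
        y = b0 + b1*x1 + b2*y1 + b3*x2 + b4*y2
    where (x1, y1) and (x2, y2) are the same landmark in view 1 and view 2.
    """
    if len(view1) != len(view2):
        raise ValueError("both views must contain the same landmarks")
    if len(a) != 5 or len(b) != 5:
        raise ValueError("a and b must each hold five coefficients")

    novel = []
    for (x1, y1), (x2, y2) in zip(view1, view2):
        basis = (1.0, x1, y1, x2, y2)
        x = sum(coef * term for coef, term in zip(a, basis))
        y = sum(coef * term for coef, term in zip(b, basis))
        novel.append((x, y))
    return novel


# Two made-up views of three landmarks.
view1 = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
view2 = [(1.0, 0.0), (3.0, 1.0), (1.0, 3.0)]

# These coefficients reproduce view 1 exactly.
same_as_view1 = synthesise_landmarks(
    view1, view2, a=(0, 1, 0, 0, 0), b=(0, 0, 1, 0, 0)
)

# These coefficients place each landmark halfway between the two views.
halfway = synthesise_landmarks(
    view1, view2, a=(0, 0.5, 0, 0.5, 0), b=(0, 0, 0.5, 0, 0.5)
)
```

    In a recognition setting along the lines the abstract describes, an optimizer would search over the coefficients `a` and `b` and score each synthesised view against the target image; that search is not shown here.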

    DEFORM'06 - Proceedings of the Workshop on Image Registration in Deformable Environments

    Preface. These are the proceedings of DEFORM'06, the Workshop on Image Registration in Deformable Environments, associated with BMVC'06, the 17th British Machine Vision Conference, held in Edinburgh, UK, in September 2006. The goal of DEFORM'06 was to bring together people from different domains with an interest in deformable image registration. In response to our Call for Papers, we received 17 submissions and selected 8 for oral presentation at the workshop. In addition to the regular papers, Andrew Fitzgibbon from Microsoft Research Cambridge gave an invited talk at the workshop. The conference website, including online proceedings, remains open; see http://comsee.univ-bpclermont.fr/events/DEFORM06. We would like to thank the BMVC'06 co-chairs, Mike Chantler, Manuel Trucco and especially Bob Fisher for his great help with the local arrangements, Andrew Fitzgibbon, and the Programme Committee members who provided insightful reviews of the submitted papers. Special thanks go to Marc Richetin, head of the CNRS Research Federation TIMS, which sponsored the workshop. August 2006. Adrien Bartoli, Nassir Navab, Vincent Lepetit.

    Doctor of Philosophy

    Image segmentation entails the partitioning of an image domain, usually in two or three dimensions, so that each partition or segment has some meaning that is relevant to the application at hand. Accurate image segmentation is a crucial challenge in many disciplines, including medicine, computer vision, and geology. In some applications, heterogeneous pixel intensities; noisy, ill-defined, or diffusive boundaries; and irregular shapes with high variability can make it challenging to meet accuracy requirements. Various segmentation approaches tackle such challenges by casting the segmentation problem as an energy-minimization problem, and solving it using efficient optimization algorithms. These approaches are broadly classified as either region-based or edge (surface)-based depending on the features on which they operate. The focus of this dissertation is on the development of a surface-based energy model, the design of efficient formulations of optimization frameworks to incorporate such energy, and the solution of the energy-minimization problem using graph cuts. This dissertation utilizes a set of four papers whose motivation is the efficient extraction of the left atrium wall from the late gadolinium enhancement magnetic resonance imaging (LGE-MRI) image volume. This dissertation also utilizes these energy formulations for other applications, including contact lens segmentation in optical coherence tomography (OCT) data and the extraction of geologic features in seismic data. Chapters 2 through 5 (papers 1 through 4) explore building a surface-based image segmentation model by progressively adding components to improve its accuracy and robustness. The first paper defines a parametric search space and its discrete formulation in the form of a multilayer three-dimensional mesh model within which the segmentation takes place. It includes a generative intensity model, and we optimize using a graph formulation of the surface net problem.
The second paper proposes a Bayesian framework with a Markov random field (MRF) prior that gives rise to another class of surface nets, which provides better segmentation with smooth boundaries. The third paper presents a maximum a posteriori (MAP)-based surface estimation framework that relies on a generative image model by incorporating global shape priors, in addition to the MRF, within the Bayesian formulation. Thus, the resulting surface not only depends on the learned model of shapes, but also accommodates the test data irregularities through smooth deviations from these priors. Further, the paper proposes a new shape parameter estimation scheme, in closed form, for segmentation as a part of the optimization process. Finally, the fourth paper (under review at the time of this document) presents an extensive analysis of the MAP framework and presents improved mesh generation and generative intensity models. It also performs a thorough analysis of the segmentation results that demonstrates the effectiveness of the proposed method qualitatively, quantitatively, and clinically. Chapter 6, consisting of unpublished work, demonstrates the application of an MRF-based Bayesian framework to segment coupled surfaces of contact lenses in optical coherence tomography images. This chapter also shows an application related to the extraction of geological structures in seismic volumes. Due to the large sizes of seismic volume datasets, we also present fast, approximate surface-based energy minimization strategies that achieve better speed-ups and memory consumption.
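
    The connection between MAP estimation and energy minimization mentioned in this abstract can be illustrated with the generic identity below. The symbols are assumptions introduced here for illustration ($S$ for the surface, $I$ for the image data); this is the standard Bayesian relation, not the dissertation's specific energy.

```latex
\hat{S}_{\mathrm{MAP}}
  = \arg\max_{S} \; p(S \mid I)
  = \arg\max_{S} \; p(I \mid S)\, p(S)
  = \arg\min_{S} \; \bigl[ -\log p(I \mid S) - \log p(S) \bigr]
```

    Read this way, $-\log p(I \mid S)$ acts as a data term supplied by a generative image model and $-\log p(S)$ as a regularizing term supplied by the prior, for example an MRF or shape prior.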

    Retinal Vessel Segmentation Using the 2-D Morlet Wavelet and Supervised Classification

    We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or non-vessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and continuous two-dimensional Morlet wavelet transform responses taken at multiple scales. The Morlet wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification while being able to model complex decision surfaces, and we compare its performance with that of the linear minimum squared error classifier. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE and STARE databases of manually labeled non-mydriatic images. On the DRIVE database, it achieves an area under the receiver operating characteristic (ROC) curve of 0.9598, slightly superior to that of the method of Staal et al. Comment: 9 pages, 7 figures and 1 table. Accepted for publication in IEEE Trans Med Imag; added copyright notice.
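
    The classification rule described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a single scalar feature instead of the multi-scale feature vector, and the mixture parameters, prior, and function names (`gaussian_pdf`, `mixture_likelihood`, `classify_pixel`) are all hypothetical. In the paper the distributions are estimated from labeled training pixels.

```python
import math
from typing import List, Tuple

# One mixture component: (weight, mean, variance). Weights should sum to 1.
Component = Tuple[float, float, float]


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_likelihood(x: float, components: List[Component]) -> float:
    """Class-conditional likelihood p(x | class) as a Gaussian mixture."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


def classify_pixel(
    x: float,
    vessel_gmm: List[Component],
    background_gmm: List[Component],
    prior_vessel: float = 0.5,
) -> str:
    """Bayes decision rule: pick the class with the larger prior * likelihood."""
    score_vessel = prior_vessel * mixture_likelihood(x, vessel_gmm)
    score_background = (1.0 - prior_vessel) * mixture_likelihood(x, background_gmm)
    return "vessel" if score_vessel > score_background else "non-vessel"


# Hypothetical mixtures over a single feature value (e.g. a filter response).
vessel_gmm = [(0.6, 2.0, 0.5), (0.4, 3.5, 1.0)]
background_gmm = [(1.0, 0.0, 1.0)]

label_high = classify_pixel(3.0, vessel_gmm, background_gmm)
label_low = classify_pixel(-0.5, vessel_gmm, background_gmm)
```

    With these made-up parameters, a strong response is labeled as vessel and a weak one as non-vessel. A multi-dimensional version would replace `gaussian_pdf` with a multivariate Gaussian density over the full feature vector.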

    3D Brain Segmentation Using Dual-Front Active Contours with Optional User Interaction

    Important attributes of 3D brain cortex segmentation algorithms include robustness, accuracy, computational efficiency, and facilitation of user interaction, yet few algorithms incorporate all of these traits. Manual segmentation is highly accurate but tedious and laborious. Most automatic techniques, while less demanding on the user, are much less accurate. It would be useful to employ a fast automatic segmentation procedure to do most of the work but still allow an expert user to interactively guide the segmentation to ensure an accurate final result. We propose a novel 3D brain cortex segmentation procedure utilizing dual-front active contours, which minimize image-based energies in a manner that yields flexibly global minimizers based on active regions. Region-based information and boundary-based information may be combined flexibly in the evolution potentials for accurate segmentation results. The resulting scheme is not only more robust but also much faster, and it allows the user to guide the final segmentation through simple mouse clicks which add extra seed points. Due to the flexibly global nature of the dual-front evolution model, single mouse clicks yield corrections to the segmentation that extend far beyond their initial locations, thus minimizing the user effort. Results on 15 simulated and 20 real 3D brain images demonstrate the robustness, accuracy, and speed of our scheme compared with other methods.