
    Variational methods for shape and image registrations.

    Estimation and analysis of deformation, either rigid or non-rigid, is an active area of research in various medical imaging and computer vision applications. Its importance stems from the inherent inter- and intra-subject variability of biological and biomedical object shapes and from the dynamic nature of the scenes usually dealt with in computer vision research. For instance, quantifying the growth of a tumor, recognizing a person's face, tracking a facial expression, or retrieving an object from a database requires the estimation of some sort of motion or deformation undergone by the object of interest. To solve these and similar problems, registration comes into play: the process of bringing two or more data sets into correspondence. Depending on the application at hand, these data sets can be, for instance, gray-scale/color images or objects' outlines. In the latter case one speaks of shape registration; in the former, of image/volume registration. In some situations, combinations of different types of data can be used complementarily to establish point correspondences. One of the most important image analysis tools that greatly benefits from registration, and which is addressed in this dissertation, is image segmentation, the process of localizing objects in images. Several challenges are encountered in image segmentation, including noise, gray-scale inhomogeneities, and occlusions. To cope with such issues, shape information is often incorporated into the segmentation process as a statistical model. Building such statistical models requires an accurate shape alignment approach. In addition, segmentation of anatomical structures can be solved accurately by registering the input data set to a predefined anatomical atlas.
    Variational approaches for shape/image registration and segmentation have received considerable interest in recent years. Unlike traditional discrete approaches, variational methods are based on continuous modelling of the input data through partial differential equations (PDEs), which brings to bear the extensive literature on the theory and numerical solution of PDEs. This dissertation addresses the registration problem from a variational point of view, with particular focus on shape registration. First, a novel variational framework for global-to-local shape registration is proposed. The input shapes are implicitly represented through their signed distance maps. A new Sum-of-Squared-Differences (SSD) criterion, which measures the disparity between the implicit representations of the input shapes, is introduced to recover the global alignment parameters. This new criterion has the advantage over some existing ones of accurately handling scale variations; in addition, the proposed alignment model is computationally less expensive. Complementary to the global registration, the local deformation field is explicitly established between the two globally aligned shapes by minimizing a new energy functional. This functional incrementally and simultaneously updates the displacement field while keeping the implicit representation of the globally warped source shape as close to a signed distance function as possible, under regularization constraints that enforce the smoothness of the recovered deformations.
    The overall process leads to a set of coupled equations that are solved simultaneously through a gradient descent scheme. Several applications in which the developed tools play a major role are addressed throughout this dissertation. For instance, some insight is given into how one can solve the challenging problem of three-dimensional face recognition in the presence of facial expressions. Statistical modelling of shapes is presented as a way of benefiting from the proposed shape registration framework. Second, this dissertation will visit th…
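
    The global step of such a framework can be illustrated with a minimal sketch: align two binary shape masks by minimizing an SSD energy between their signed distance maps over the parameters of a similarity transform. The mask inputs, the parameterization, and the use of a derivative-free optimizer in place of a gradient descent scheme are assumptions made for brevity; this is not the exact criterion proposed in the dissertation.

```python
# A minimal sketch (not the dissertation's exact formulation) of global shape
# alignment by minimizing a Sum-of-Squared-Differences (SSD) energy between
# the signed distance maps of two binary shapes.
import numpy as np
from scipy.ndimage import distance_transform_edt, affine_transform
from scipy.optimize import minimize

def signed_distance(mask):
    """Signed distance map: negative inside the shape, positive outside."""
    mask = mask.astype(bool)
    return distance_transform_edt(~mask) - distance_transform_edt(mask)

def warp_similarity(image, params):
    """Warp by a similarity transform (log-scale, rotation, translation)."""
    log_s, theta, tx, ty = params
    c, s = np.cos(theta), np.sin(theta)
    # affine_transform expects the output-to-input (pull-back) mapping.
    A = np.exp(-log_s) * np.array([[c, s], [-s, c]])
    return affine_transform(image, A, offset=(tx, ty), order=1, mode='nearest')

def ssd_energy(params, phi_src, phi_tgt):
    """SSD disparity between the warped source and the target distance map."""
    return np.sum((warp_similarity(phi_src, params) - phi_tgt) ** 2)

def align(mask_src, mask_tgt):
    phi_src, phi_tgt = signed_distance(mask_src), signed_distance(mask_tgt)
    # A derivative-free optimizer stands in for the gradient descent scheme.
    res = minimize(ssd_energy, x0=np.zeros(4), args=(phi_src, phi_tgt),
                   method='Nelder-Mead')
    return res.x  # recovered global alignment parameters
```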

    Feature detection and description for image matching: from hand-crafted design to deep learning

    In feature-based image matching, distinctive features in images are detected and represented by feature descriptors. Matching is then carried out by assessing the similarity of the descriptors of potentially conjugate points. In this paper, we first briefly discuss the general framework. We then review feature detection as well as the determination of the affine shape and orientation of local features, before analyzing feature description in more detail. In the feature description review, the general framework of local feature description is presented first. The review then discusses the evolution from hand-crafted feature descriptors, e.g. SIFT (Scale Invariant Feature Transform), to descriptors based on machine learning and deep learning. The machine learning models, the training losses, and the respective training data of learning-based algorithms are examined in more detail; subsequently, the various advantages and challenges of the different approaches are discussed. Finally, we present and assess some current research directions before concluding the paper.
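
    As a concrete instance of the pipeline reviewed here, the following sketch matches two images with a hand-crafted descriptor (SIFT) via OpenCV and Lowe's ratio test; the file names and the ratio threshold are illustrative assumptions.

```python
# Detect, describe, and match local features between two images with OpenCV.
import cv2

img1 = cv2.imread('left.png', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('right.png', cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
# Detection yields keypoints with scale/orientation; description yields 128-D SIFT vectors.
kp1, desc1 = sift.detectAndCompute(img1, None)
kp2, desc2 = sift.detectAndCompute(img2, None)

# Match descriptors of potentially conjugate points by nearest-neighbour search.
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(desc1, desc2, k=2)

# Lowe's ratio test keeps only distinctive (unambiguous) matches.
good = [m for m, n in knn if m.distance < 0.75 * n.distance]
print(f'{len(good)} putative correspondences')
```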

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Study of object recognition and identification based on shape and texture analysis

    The objective of object recognition is to enable computers to recognize image patterns without human intervention. Depending on the application, it is mainly divided into two parts: recognition of object categories and detection/identification of objects. My thesis studies techniques of object feature analysis and identification strategies, which solve the object recognition problem by employing effective and perceptually important object features. Shape information is of particular interest, and a review of shape representation and description is presented, together with the latest research on object recognition. In the second chapter of the thesis, a novel content-based approach is proposed for efficient shape classification and retrieval of 2D objects. Two object detection approaches, designed according to the characteristics of the shape context and SIFT descriptors, respectively, are analyzed and compared. It is found that an identification strategy built on a single type of object feature is only able to recognize the target object under the specific conditions the identifier is adapted to. Such identifiers are usually designed to detect target objects that are rich in the feature type captured by the identifier, a feature type that also tends to distinguish the target object from the complex scene. To overcome this constraint, a novel prototype-based object identification method is presented that detects the target object in a complex scene by employing different types of descriptors to capture heterogeneous features. All types of descriptors are modified to meet the requirements of the detection strategy’s framework, so the new method is able to describe and identify various kinds of objects whose dominant features are quite different. The identification system employs cosine similarity to evaluate the resemblance between the prototype image and image windows in the complex scene; a ‘resemblance map’ is then established, with the value at each patch representing the likelihood of the target object’s presence. Simulations show that this novel object detection strategy is efficient, robust, and invariant to scale and rotation.
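
    The sliding-window scoring behind the ‘resemblance map’ can be sketched as follows; the normalized intensity patch used as a descriptor here is only a stand-in for the heterogeneous descriptors employed in the thesis.

```python
# Build a 'resemblance map' by scoring each scene window against a prototype
# descriptor with cosine similarity.
import numpy as np

def describe(patch):
    """Toy descriptor: mean-centred, unit-norm flattened intensities."""
    v = patch.astype(float).ravel()
    v -= v.mean()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def resemblance_map(scene, prototype, step=4):
    h, w = prototype.shape
    proto_vec = describe(prototype)
    rows = (scene.shape[0] - h) // step + 1
    cols = (scene.shape[1] - w) // step + 1
    rmap = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            window = scene[i * step:i * step + h, j * step:j * step + w]
            # Cosine similarity of unit-norm vectors is their dot product.
            rmap[i, j] = describe(window) @ proto_vec
    return rmap  # high values mark likely locations of the target object
```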

    Registration of 3D Point Clouds and Meshes: A Survey From Rigid to Non-Rigid

    Three-dimensional surface registration transforms multiple three-dimensional data sets into the same coordinate system so as to align overlapping components of these sets. Recent surveys have covered different aspects of either rigid or nonrigid registration, but seldom discuss them as a whole. Our study serves two purposes: 1) to give a comprehensive survey of both types of registration, focusing on three-dimensional point clouds and meshes, and 2) to provide a better understanding of registration from the perspective of data fitting. Registration is closely related to data fitting in that it comprises three core interwoven components: model selection, correspondences and constraints, and optimization. Study of these components 1) provides a basis for comparing the novelties of different techniques, 2) reveals the similarity of rigid and nonrigid registration in terms of problem representation, and 3) shows how overfitting arises in nonrigid registration and why interest in intrinsic techniques is increasing. We further summarize some practical issues of registration, including initialization and evaluation, and discuss some of our own observations, insights, and foreseeable research trends.
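
    The data-fitting view can be made concrete with a compact ICP sketch, in which the model is a rigid transform, correspondences come from nearest-neighbour search, and the optimization is a closed-form least-squares update; this illustrates the framing rather than any specific method covered by the survey.

```python
# Classic ICP: rigid model + nearest-neighbour correspondences + SVD update.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def icp(source, target, iters=50):
    """Align an (N, 3) source cloud to an (M, 3) target cloud."""
    tree = cKDTree(target)
    src = source.copy()
    for _ in range(iters):
        _, idx = tree.query(src)     # correspondences: nearest target points
        R, t = best_rigid_transform(src, target[idx])
        src = src @ R.T + t          # optimization: apply the rigid update
    return src
```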

    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g. indoor surveillance), unconstrained videos (e.g. YouTube), depth or skeletal data (e.g. captured by Kinect), and person images (e.g. Flickr). In particular, we are interested in answering questions like: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of high-dimension, low-sample-size (HDLSS) nature, how can human actions be recognized efficiently in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine saliency regions for recognizing actions in still images?

    Fully Automatic Expression-Invariant Face Correspondence

    We consider the problem of computing accurate point-to-point correspondences among a set of human face scans with varying expressions. Our fully automatic approach does not require any manually placed markers on the scans. Instead, the approach learns the locations of a set of landmarks present in a database and uses this knowledge to automatically predict the locations of these landmarks on a newly available scan. The predicted landmarks are then used to compute point-to-point correspondences between a template model and the newly available scan. To accurately fit the expression of the template to the expression of the scan, we use a blendshape model as the template. Our algorithm was tested on a database of human faces of different ethnic groups with strongly varying expressions. Experimental results show that the obtained point-to-point correspondences are both highly accurate and consistent for most of the tested 3D face models.
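
    One ingredient of such an approach, fitting blendshape weights so that template landmarks match the landmarks predicted on a scan, can be sketched as a regularized linear least-squares problem; the array layout and the ridge term below are illustrative assumptions rather than the authors' implementation.

```python
# Fit blendshape weights from landmark correspondences (hedged sketch).
import numpy as np

def fit_blendshape_weights(neutral, offsets, landmark_idx, scan_landmarks, reg=1e-3):
    """
    neutral        : (N, 3) template vertices in the neutral expression
    offsets        : (K, N, 3) per-blendshape displacement of each vertex
    landmark_idx   : (L,) template vertex indices of the landmarks
    scan_landmarks : (L, 3) landmark positions predicted on the scan
    """
    # Residual to explain with the blendshapes, restricted to the landmarks.
    b = (scan_landmarks - neutral[landmark_idx]).ravel()               # (3L,)
    A = offsets[:, landmark_idx, :].reshape(offsets.shape[0], -1).T    # (3L, K)
    # Regularized linear least squares for the expression weights.
    k = A.shape[1]
    w = np.linalg.solve(A.T @ A + reg * np.eye(k), A.T @ b)
    return w

# The fitted expression is then neutral + np.tensordot(w, offsets, axes=1),
# which can subsequently be brought into dense correspondence with the scan.
```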