662 research outputs found

    Novel Correspondence-based Approach for Consistent Human Skeleton Extraction

    This paper presents a novel base-points-driven shape correspondence (BSC) approach to extract skeletons of articulated objects from 3D mesh shapes. Skeleton extraction based on the BSC approach is more accurate than traditional direct skeleton extraction methods. Since 3D shapes provide more geometric information, BSC offers consistent information between the source shape and the target shapes. In this paper, we first automatically extract the skeleton from a template shape, which serves as the source shape. Then, the skeletons of target shapes in different poses are generated based on their correspondence with the source shape. The accuracy of the proposed method is demonstrated through a comprehensive performance evaluation on multiple benchmark datasets. The results of the proposed approach can be applied to applications such as skeleton-driven animation, shape segmentation, and human motion analysis.

    Slippage Features

    In this report, we present a novel feature detection technique for unstructured point clouds. We introduce a generalized concept of geometric features that detects locally uniquely identifiable keypoints as centroids of areas with locally minimal slippage. We extend the concept to multiple scales and extract features using multi-scale mean shift clustering. To validate matches between feature points, we employ a two-stage technique that first sorts out unlikely matches and then approximately aligns the remaining features by a rotational cross-correlation analysis and a local iterative closest point (ICP) registration. The resulting residuals are then used as the final similarity measure. The proposed combination of techniques results in a robust and reliable correspondence detection technique that yields registration results in situations where previous techniques are not able to detect usable feature correspondences. We provide a detailed empirical analysis of the method and apply the technique to global registration, symmetry detection, and deformable matching problems.

    Indexing and Retrieval of 3D Articulated Geometry Models

    In this PhD research study, we focus on building a content-based search engine for 3D articulated geometry models. 3D models are essential components of today's graphics applications and are widely used in the game, animation, and movie production industries. With the increasing number of these models, a search engine not only provides an entry point for exploring such a huge dataset, but also facilitates sharing and reuse among different users. In general, it reduces the cost and time of developing these 3D models. Although many retrieval systems have been proposed in recent years, search engines for 3D articulated geometry models are still in their infancy. Among all the works that we have surveyed, reliability and efficiency are the two main issues that hinder the popularity of such systems. In this research, we have focused mainly on addressing these two issues. We have found that most existing works design features and matching algorithms to reflect the intrinsic properties of these 3D models. For instance, to handle 3D articulated geometry models, it is common to extract skeletons and use graph matching algorithms to compute the similarity. However, since this kind of feature representation is complex, it leads to high complexity in the matching algorithms. As an example, sub-graph isomorphism can be NP-hard for model graph matching. Our solution is based on the understanding that skeletal matching seeks correspondences between the two models being compared. If we can define descriptive features, the correspondence problem can be solved by bag-based matching, for which fast algorithms are available. In the first part of the research, we propose a feature extraction algorithm to extract such descriptive features. We then convert the skeletal matching problem into bag-based matching. We further define a metric similarity measure to support fast search. We demonstrate the advantages of this idea in our experiments. 
Precision improves by 12% at high recall. The indexed search of 3D models is 24 times faster than the state of the art if only the first relevant result is returned. However, improving the quality of descriptive features comes at the price of high dimensionality. The curse of dimensionality is a notorious problem in large multimedia databases. The computation time scales exponentially as the dimension increases, and indexing techniques may not be useful in such situations. In the second part of the research, we focus on developing an embedding retrieval framework to solve the high dimensionality problem. We first argue that our proposed matching method projects 3D models onto manifolds. We then use a manifold learning technique to reduce dimensionality and maximize intra-class distances. We further propose a numerical method to sub-sample databases and search them quickly. To preserve retrieval accuracy using fewer landmark objects, we propose an alignment method which is also beneficial to existing works for fast search. Our experiments demonstrate that the retrieval framework alleviates the curse of dimensionality. It also improves the efficiency (3.4 times faster) and accuracy (30% more accurate) of our matching algorithm proposed above. In the third part of the research, we also study a closely related area, 3D motions. 3D motions are captured by attaching sensors to human subjects. These captured data are real human motions that are used to animate 3D articulated geometry models. Creating realistic 3D motions is an expensive and tedious task. Although 3D motions are very different from 3D articulated geometry models, we observe that existing works also suffer from the problem of temporal structure matching. This also leads to low efficiency in the matching algorithms. We apply the same idea of bag-based matching to 3D motions. 
In our experiments, the proposed method achieves a 13% improvement in precision at high recall and is 12 times faster than existing works. In summary, we have developed algorithms for 3D articulated geometry models and 3D motions, covering feature extraction, feature matching, indexing, and fast search methods. Through various experiments, we show that our idea of converting restricted matching to bag-based matching improves matching efficiency and reliability for both 3D articulated geometry models and 3D motions. We have also connected 3D matching to the area of manifold learning. The embedding retrieval framework not only improves efficiency and accuracy, but has also opened a new area of research.
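
The bag-based matching idea described in this abstract can be illustrated with a minimal sketch. It assumes each model is reduced to an unordered bag of feature vectors and compares two bags with a symmetric average nearest-neighbour (Chamfer-style) distance. The function names, the toy features, and the choice of distance are all assumptions made for illustration; the abstract does not state which bag distance the thesis uses, and this particular distance does not satisfy the triangle inequality, so it is not the metric measure the thesis refers to.

```python
import math


def nearest_distance(feature, bag):
    """Euclidean distance from one feature vector to its closest match in a bag."""
    return min(math.dist(feature, other) for other in bag)


def bag_distance(bag_a, bag_b):
    """Symmetric average nearest-neighbour distance between two bags.

    Hypothetical stand-in for a bag-based similarity measure: no graph
    structure is used, only unordered sets of feature vectors.
    """
    if not bag_a or not bag_b:
        raise ValueError("both bags must contain at least one feature vector")
    a_to_b = sum(nearest_distance(f, bag_b) for f in bag_a) / len(bag_a)
    b_to_a = sum(nearest_distance(f, bag_a) for f in bag_b) / len(bag_b)
    return 0.5 * (a_to_b + b_to_a)


def rank_models(query_bag, database):
    """Return model names sorted by increasing bag distance to the query."""
    return sorted(database, key=lambda name: bag_distance(query_bag, database[name]))


# Toy 2D features (hypothetical); real descriptors would be higher-dimensional.
query = [(0.0, 0.0), (1.0, 0.0)]
database = {
    "other_shape": [(5.0, 5.0), (6.0, 5.0)],
    "same_shape": [(1.0, 0.0), (0.0, 0.0)],
}
ranking = rank_models(query, database)
```

Because the bags are unordered, the comparison costs a number of pairwise distance evaluations that is quadratic in bag size, which avoids the sub-graph isomorphism step mentioned in the abstract.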

    Doctor of Philosophy

    Shape analysis is a well-established tool for processing surfaces. It is often a first step in performing tasks such as segmentation, symmetry detection, and finding correspondences between shapes. Shape analysis is traditionally employed on well-sampled surfaces where the geometry and topology are precisely known. When the surface takes the form of a point cloud containing nonuniform sampling, noise, and incomplete measurements, traditional shape analysis methods perform poorly. Although one may first perform reconstruction on such a point cloud prior to performing shape analysis, if the reconstructed geometry and topology are far from the true surface, then this can have an adverse impact on the subsequent analysis. Furthermore, for triangulated surfaces containing noise, thin sheets, and poorly shaped triangles, existing shape analysis methods can be highly unstable. This thesis explores methods of shape analysis applied directly to such defect-laden shapes. We first study the problem of surface reconstruction, in order to obtain a better understanding of the types of point clouds for which reconstruction methods have difficulties. To this end, we have devised a benchmark for surface reconstruction, establishing a standard for measuring error in reconstruction. We then develop a new method for consistently orienting normals of such challenging point clouds by using a collection of harmonic functions, intrinsically defined on the point cloud. Next, we develop a new shape analysis tool which is tolerant to imperfections, by constructing distances directly on the point cloud defined as the likelihood of two points belonging to a mutually common medial ball, and apply this for segmentation and reconstruction. We extend this distance measure to define a diffusion process on the point cloud, tolerant to missing data, which is used for the purposes of matching incomplete shapes undergoing a nonrigid deformation. 
Lastly, we have developed an intrinsic method for multiresolution remeshing of a poor-quality triangulated surface via spectral bisection.

    Appearance-based motion recognition of human actions

    Thesis (M.S.), Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (leaves 52-53). By James William Davis.

    Video content analysis for intelligent forensics

    Networks of surveillance cameras installed in public places and private territories continuously record video data with the aim of detecting and preventing unlawful activities. This increases the importance of video content analysis applications, either for real-time (i.e. analytic) or post-event (i.e. forensic) analysis. In this thesis, the primary focus is on four key aspects of video content analysis, namely: (1) moving object detection and recognition; (2) correction of colours in the video frames and recognition of colours of moving objects; (3) make and model recognition of vehicles and identification of their type; and (4) detection and recognition of text information in outdoor scenes. To address the first issue, a framework is presented in the first part of the thesis that efficiently detects and recognizes moving objects in videos. The framework targets the problem of object detection in the presence of complex backgrounds. The object detection part of the framework relies on a background modelling technique and a novel post-processing step where the contours of the foreground regions (i.e. moving objects) are refined by classifying edge segments as belonging either to the background or to the foreground region. Further, a novel feature descriptor is devised for the classification of moving objects into humans, vehicles and background. The proposed feature descriptor captures the texture information present in the silhouette of foreground objects. To address the second issue, a framework for the correction and recognition of true colours of objects in videos is presented with novel noise reduction, colour enhancement and colour recognition stages. The colour recognition stage makes use of temporal information to reliably recognize the true colours of moving objects in multiple frames. 
The proposed framework is specifically designed to perform robustly on videos that have poor quality because of surrounding illumination, camera sensor imperfections and artefacts due to high compression. In the third part of the thesis, a framework for vehicle make and model recognition and type identification is presented. As a part of this work, a novel feature representation technique for distinctive representation of vehicle images has emerged. The feature representation technique uses dense feature description and a mid-level feature encoding scheme to capture the texture in the frontal view of the vehicles. The proposed method is insensitive to minor in-plane rotation and skew within the image. The proposed framework can be extended to any number of vehicle classes without re-training. Another important contribution of this work is the publication of a comprehensive, up-to-date dataset of vehicle images to support future research in this domain. The problem of text detection and recognition in images is addressed in the last part of the thesis. A novel technique is proposed that exploits the colour information in the image for the identification of text regions. Apart from detection, the colour information is also used to segment characters from the words. The recognition of identified characters is performed using shape features and supervised learning. Finally, a lexicon-based alignment procedure is adopted to finalize the recognition of strings present in word images. Extensive experiments have been conducted on benchmark datasets to analyse the performance of the proposed algorithms. The results show that the proposed moving object detection and recognition technique outperformed well-known baseline techniques. The proposed framework for the correction and recognition of object colours in video frames achieved all the aforementioned goals. 
The performance analysis of the vehicle make and model recognition framework on multiple datasets has shown the strength and reliability of the technique when used in various scenarios. Finally, the experimental results for the text detection and recognition framework on benchmark datasets have revealed the potential of the proposed scheme for accurate detection and recognition of text in the wild.
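
The background-modelling step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which background model the thesis uses, so this example assumes the simplest common choice, a running-average model with a fixed difference threshold on grayscale frames stored as nested lists. The function names, `alpha`, and `threshold` are hypothetical, and the contour refinement step described in the abstract is not reproduced.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the running-average background model.

    alpha is an assumed learning rate: higher values adapt faster to
    scene changes but absorb slow-moving objects sooner.
    """
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25.0):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


# Toy 2x2 grayscale frames (hypothetical values): one pixel changes sharply.
background = [[10.0, 10.0], [10.0, 10.0]]
frame = [[10.0, 200.0], [10.0, 10.0]]

mask = foreground_mask(background, frame)
updated = update_background(background, frame, alpha=0.5)
```

In a full pipeline the mask would be computed for every incoming frame before the background is updated, and the resulting foreground regions would then be passed to the refinement and classification stages.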

    Object Recognition

    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving a car in the correct lane. We perform these tasks effortlessly in real time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability of visual recognition. Such a capability will allow machines to free humans from boring or dangerous jobs.