184 research outputs found

    Geometric modeling of non-rigid 3D shapes : theory and application to object recognition.

    Get PDF
    One of the major goals of computer vision is the development of flexible and efficient methods for shape representation. This is true, especially for non-rigid 3D shapes where a great variety of shapes are produced as a result of deformations of a non-rigid object. Modeling these non-rigid shapes is a very challenging problem. Being able to analyze the properties of such shapes and describe their behavior is the key issue in research. Also, considering photometric features can play an important role in many shape analysis applications, such as shape matching and correspondence because it contains rich information about the visual appearance of real objects. This new information (contained in photometric features) and its important applications add another, new dimension to the problem\u27s difficulty. Two main approaches have been adopted in the literature for shape modeling for the matching and retrieval problem, local and global approaches. Local matching is performed between sparse points or regions of the shape, while the global shape approaches similarity is measured among entire models. These methods have an underlying assumption that shapes are rigidly transformed. And Most descriptors proposed so far are confined to shape, that is, they analyze only geometric and/or topological properties of 3D models. A shape descriptor or model should be isometry invariant, scale invariant, be able to capture the fine details of the shape, computationally efficient, and have many other good properties. A shape descriptor or model is needed. This shape descriptor should be: able to deal with the non-rigid shape deformation, able to handle the scale variation problem with less sensitivity to noise, able to match shapes related to the same class even if these shapes have missing parts, and able to encode both the photometric, and geometric information in one descriptor. This dissertation will address the problem of 3D non-rigid shape representation and textured 3D non-rigid shapes based on local features. Two approaches will be proposed for non-rigid shape matching and retrieval based on Heat Kernel (HK), and Scale-Invariant Heat Kernel (SI-HK) and one approach for modeling textured 3D non-rigid shapes based on scale-invariant Weighted Heat Kernel Signature (WHKS). For the first approach, the Laplace-Beltrami eigenfunctions is used to detect a small number of critical points on the shape surface. Then a shape descriptor is formed based on the heat kernels at the detected critical points for different scales. Sparse representation is used to reduce the dimensionality of the calculated descriptor. The proposed descriptor is used for classification via the Collaborative Representation-based Classification with a Regularized Least Square (CRC-RLS) algorithm. The experimental results have shown that the proposed descriptor can achieve state-of-the-art results on two benchmark data sets. For the second approach, an improved method to introduce scale-invariance has been also proposed to avoid noise-sensitive operations in the original transformation method. Then a new 3D shape descriptor is formed based on the histograms of the scale-invariant HK for a number of critical points on the shape at different time scales. A Collaborative Classification (CC) scheme is then employed for object classification. The experimental results have shown that the proposed descriptor can achieve high performance on the two benchmark data sets. An important observation from the experiments is that the proposed approach is more able to handle data under several distortion scenarios (noise, shot-noise, scale, and under missing parts) than the well-known approaches. For modeling textured 3D non-rigid shapes, this dissertation introduces, for the first time, a mathematical framework for the diffusion geometry on textured shapes. This dissertation presents an approach for shape matching and retrieval based on a weighted heat kernel signature. It shows how to include photometric information as a weight over the shape manifold, and it also propose a novel formulation for heat diffusion over weighted manifolds. Then this dissertation presents a new discretization method for the weighted heat kernel induced by the linear FEM weights. Finally, the weighted heat kernel signature is used as a shape descriptor. The proposed descriptor encodes both the photometric, and geometric information based on the solution of one equation. Finally, this dissertation proposes an approach for 3D face recognition based on the front contours of heat propagation over the face surface. The front contours are extracted automatically as heat is propagating starting from a detected set of landmarks. The propagation contours are used to successfully discriminate the various faces. The proposed approach is evaluated on the largest publicly available database of 3D facial images and successfully compared to the state-of-the-art approaches in the literature. This work can be extended to the problem of dense correspondence between non-rigid shapes. The proposed approaches with the properties of the Laplace-Beltrami eigenfunction can be utilized for 3D mesh segmentation. Another possible application of the proposed approach is the view point selection for 3D objects by selecting the most informative views that collectively provide the most descriptive presentation of the surface

    Two and three dimensional segmentation of multimodal imagery

    Get PDF
    The role of segmentation in the realms of image understanding/analysis, computer vision, pattern recognition, remote sensing and medical imaging in recent years has been significantly augmented due to accelerated scientific advances made in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this effect, by using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels that contain higher gradient densities are included by the dynamic generation of segments as the algorithm progresses to generate an initial region map. Subsequently, texture modeling is performed and the obtained gradient, texture and intensity information along with the aforementioned initial partition map are used to perform a multivariate refinement procedure, to fuse groups with similar characteristics yielding the final output segmentation. Experimental results obtained in comparison to published/state-of the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, for the purpose of achieving improved computational efficiency we propose an extension of the aforestated methodology in a multi-resolution framework, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the aforementioned algorithm demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes

    Extraction and representation of semantic information in digital media

    Get PDF

    Learning effective binary representation with deep hashing technique for large-scale multimedia similarity search

    Get PDF
    The explosive growth of multimedia data in modern times inspires the research of performing an efficient large-scale multimedia similarity search in the existing information retrieval systems. In the past decades, the hashing-based nearest neighbor search methods draw extensive attention in this research field. By representing the original data with compact hash code, it enables the efficient similarity retrieval by only conducting bitwise operation when computing the Hamming distance. Moreover, less memory space is required to process and store the massive amounts of features for the search engines owing to the nature of compact binary code. These advantages make hashing a competitive option in large-scale visual-related retrieval tasks. Motivated by the previous dedicated works, this thesis focuses on learning compact binary representation via hashing techniques for the large-scale multimedia similarity search tasks. Particularly, several novel frameworks are proposed for popular hashing-based applications like a local binary descriptor for patch-level matching (Chapter 3), video-to-video retrieval (Chapter 4) and cross-modality retrieval (Chapter 5). This thesis starts by addressing the problem of learning local binary descriptor for better patch/image matching performance. To this end, we propose a novel local descriptor termed Unsupervised Deep Binary Descriptor (UDBD) for the patch-level matching tasks, which learns the transformation invariant binary descriptor via embedding the original visual data and their transformed sets into a common Hamming space. By imposing a l2,1-norm regularizer on the objective function, the learned binary descriptor gains robustness against noises. Moreover, a weak bit scheme is applied to address the ambiguous matching in the local binary descriptor, where the best match is determined for each query by comparing a series of weak bits between the query instance and the candidates, thus improving the matching performance. Furthermore, Unsupervised Deep Video Hashing (UDVH) is proposed to facilitate large-scale video-to-video retrieval. To tackle the imbalanced distribution issue in the video feature, balanced rotation is developed to identify a proper projection matrix such that the information of each dimension can be balanced in the fixed-bit quantization, thus improving the retrieval performance dramatically with better code quality. To provide comprehensive insights on the proposed rotation, two different video feature learning structures: stacked LSTM units (UDVH-LSTM) and Temporal Segment Network (UDVH-TSN) are presented in Chapter 4. Lastly, we extend the research topic from single-modality to cross-modality retrieval, where Self-Supervised Deep Multimodal Hashing (SSDMH) based on matrix factorization is proposed to learn unified binary code for different modalities directly without the need for relaxation. By minimizing graph regularization loss, it is prone to produce discriminative hash code via preserving the original data structure. Moreover, Binary Gradient Descent (BGD) accelerates the discrete optimization against the bit-by-bit fashion. Besides, an unsupervised version termed Unsupervised Deep Cross-Modal Hashing (UDCMH) is proposed to tackle the large-scale cross-modality retrieval when prior knowledge is unavailable

    Similarity reasoning for local surface analysis and recognition

    Get PDF
    This thesis addresses the similarity assessment of digital shapes, contributing to the analysis of surface characteristics that are independent of the global shape but are crucial to identify a model as belonging to the same manufacture, the same origin/culture or the same typology (color, common decorations, common feature elements, compatible style elements, etc.). To face this problem, the interpretation of the local surface properties is crucial. We go beyond the retrieval of models or surface patches in a collection of models, facing the recognition of geometric patterns across digital models with different overall shape. To address this challenging problem, the use of both engineered and learning-based descriptions are investigated, building one of the first contributions towards the localization and identification of geometric patterns on digital surfaces. Finally, the recognition of patterns adds a further perspective in the exploration of (large) 3D data collections, especially in the cultural heritage domain. Our work contributes to the definition of methods able to locally characterize the geometric and colorimetric surface decorations. Moreover, we showcase our benchmarking activity carried out in recent years on the identification of geometric features and the retrieval of digital models completely characterized by geometric or colorimetric patterns
    corecore