453 research outputs found

    Generalized Two-Dimensional Quaternion Principal Component Analysis with Weighting for Color Image Recognition

    A generalized two-dimensional quaternion principal component analysis (G2DQPCA) approach with weighting is presented for color image analysis. As a general framework of 2DQPCA, G2DQPCA is flexible enough to adapt to different constraints or requirements by imposing $L_p$ norms on both the constraint function and the objective function. The gradient operator of quaternion vector functions is redefined through the structure-preserving gradient operator of real vector functions. Under the minorization-maximization (MM) framework, an iterative algorithm is developed to obtain the optimal closed-form solution of G2DQPCA. The projection vectors generated by the deflating scheme are required to be orthogonal to each other. A weighting matrix is defined to magnify the effect of the main features. With the weighted projection bases, face recognition accuracy stays unchanged or varies within a tight range as the number of features increases. Numerical results on real face databases show that the proposed method performs better than the state-of-the-art algorithms.
    Comment: 15 pages, 15 figures

    Robust online subspace learning

    In this thesis, I aim to advance the theory of online non-linear subspace learning by developing strategies that are both efficient and robust. Subspace learning methods are very popular in computer vision and have been applied to numerous tasks. With the increasing need for real-time applications, the formulation of online (i.e. incremental and real-time) learning methods is a vibrant research field that has received much attention from the research community. A major advantage of incremental systems is that they update the hypothesis during execution, which allows them to incorporate the real data seen in the testing phase. Tracking is an attractive and popular evaluation tool for incremental systems, so the connection between online learning and adaptive tracking is common in the literature. The system proposed in this thesis facilitates learning from noisy input data, e.g. data affected by occlusions, cast shadows and pose variations, which are challenging problems in general tracking frameworks. First, a fast and robust alternative to standard L2-norm principal component analysis (PCA) is introduced, which I term Euler PCA (e-PCA). The formulation of e-PCA is based on robust, non-linear kernel PCA (KPCA) with a cosine-based kernel function that is expressed via an explicit feature space. When applied to tracking, face reconstruction and background modeling, it achieves promising results. In the second part, the problem of matching vectors of 3D rotations is explicitly targeted. A novel distance that is robust for 3D rotations is introduced and formulated as a kernel function. The kernel leads to a new representation of 3D rotations, the full-angle quaternion (FAQ) representation. Finally, I propose 3D object recognition from point clouds, and object tracking with color values using FAQs. A domain-specific kernel function designed for visual data is then presented. Because this kernel is indefinite, KPCA with Krein space kernels is introduced, and an exact incremental learning framework for the new kernel is developed. In a tracker framework, the presented online learning outperforms the competitors on nine popular and challenging video sequences. In the final part, the generalized eigenvalue problem is studied. Specifically, incremental slow feature analysis (SFA) with indefinite kernels is proposed and applied to temporal video segmentation and to tracking with change detection. As online SFA allows for drift detection, further improvements are achieved in the evaluation of the tracking task.
    Open Access
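
The abstract above describes e-PCA as kernel PCA with a cosine-based kernel that has an explicit feature space. The following is a minimal sketch of that idea, not the thesis's implementation. It assumes inputs scaled to [0, 1], a complex exponential feature map, no mean centering, and an illustrative value `alpha=1.9`; the function names `euler_features`, `cosine_kernel` and `euler_pca` are hypothetical.

```python
import numpy as np

def euler_features(X, alpha=1.9):
    """Assumed explicit feature map: each value x in [0, 1] goes to
    exp(i * alpha * pi * x) / sqrt(2) on a scaled unit circle."""
    return np.exp(1j * alpha * np.pi * X) / np.sqrt(2.0)

def cosine_kernel(X, Y, alpha=1.9):
    """Cosine-based kernel k(x, y) = 1/2 * sum_j cos(alpha * pi * (x_j - y_j))."""
    diff = X[:, None, :] - Y[None, :, :]
    return 0.5 * np.cos(alpha * np.pi * diff).sum(axis=-1)

def euler_pca(X, n_components, alpha=1.9):
    """PCA in the explicit complex feature space (no centering, by assumption).

    X has one sample per row. Returns the feature-space basis B with
    orthonormal columns and the complex coefficients of each sample.
    """
    Z = euler_features(X, alpha)            # (n_samples, n_features), complex
    K = Z.conj() @ Z.T                      # Hermitian Gram matrix
    evals, evecs = np.linalg.eigh(K)        # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]
    B = (Z.T @ evecs) / np.sqrt(evals)      # (n_features, n_components)
    coeffs = Z @ B.conj()                   # (n_samples, n_components)
    return B, coeffs

rng = np.random.default_rng(0)
X = rng.random((6, 5))                      # toy data in [0, 1]
Z = euler_features(X)
K = Z.conj() @ Z.T
B, coeffs = euler_pca(X, n_components=3)
```

The real part of the complex Gram matrix equals the cosine kernel, which is why the eigenproblem can be solved with explicit features rather than an implicit kernel map.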

    The Study of Scene Classification in the Multisensor Remote Sensing Image Fusion

    We propose a scene classification method that speeds up multisensor remote sensing image fusion, using the singular value decomposition of quaternion matrices and kernel principal component analysis (KPCA) to extract features. First, images are segmented into patches on a regular grid. For each patch, color features are extracted with quaternion singular value decomposition (QSVD), and grey features are extracted with a Gabor filter and then summarized by an orientation histogram. The color features and the orientation histogram are then combined with equal weight to form the patch descriptor. All patch descriptors are clustered to obtain visual words for each category, and KPCA is applied to the visual words to obtain a subspace for each category. The descriptors of a test image are projected onto the subspaces of all categories, giving the image's projection length for each category. Finally, a support vector machine (SVM) with a linear kernel performs the scene classification. We experiment with three classification settings on the OT8 dataset and compare our method with a typical scene classification method, probabilistic latent semantic analysis (pLSA); the results confirm the feasibility of our method.
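
The projection-length step in the abstract above can be illustrated with a small sketch. This is an assumption-laden simplification: it swaps the paper's KPCA for a plain linear subspace obtained by SVD, skips the QSVD, Gabor, clustering and SVM stages, and uses hypothetical names (`fit_subspace`, `projection_lengths`) and toy data.

```python
import numpy as np

def fit_subspace(words, n_components):
    """Linear stand-in for KPCA: span of the leading right singular
    vectors of the visual-word matrix (one visual word per row)."""
    _, _, vt = np.linalg.svd(words, full_matrices=False)
    return vt[:n_components].T              # (dim, n_components), orthonormal columns

def projection_lengths(descriptor, subspaces):
    """Length of the descriptor's projection onto each category subspace."""
    return np.array([np.linalg.norm(B.T @ descriptor) for B in subspaces])

rng = np.random.default_rng(0)

# Toy visual words: category A lives in the first two axes, category B in the next two.
words_a = np.zeros((10, 6))
words_a[:, :2] = rng.normal(size=(10, 2))
words_b = np.zeros((10, 6))
words_b[:, 2:4] = rng.normal(size=(10, 2))

subspaces = [fit_subspace(words_a, 2), fit_subspace(words_b, 2)]

# A descriptor that lies entirely in category A's subspace.
descriptor = np.array([3.0, 4.0, 0.0, 0.0, 0.0, 0.0])
lengths = projection_lengths(descriptor, subspaces)
```

In the described pipeline, a vector of such lengths (one entry per category) would be the input to the linear SVM; here the descriptor's full norm is recovered by category A's subspace and none of it by category B's.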

    Robust and affordable localization and mapping for 3D reconstruction. Application to architecture and construction

    Simultaneous localization and mapping from a single moving camera is known as monocular SLAM. This thesis addresses the problem with low-cost cameras, where the main challenge is robustness to noise, blurring and other artifacts that affect the image. The approach is discrete, using only salient image points to localize the camera and map the environment. The main contribution is a simplification of the pose graph that improves accuracy in the most common scenes, evaluated exhaustively on 4 datasets. The mapping results yield a 3D reconstruction of the scene that can be used in architecture and construction for Building Information Modeling (BIM). In the second part of the thesis, we propose incorporating this information into an advanced visualization system based on WebGL that helps simplify the adoption of the BIM methodology.
    Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos)
    Doctorado en Informática