245 research outputs found

    Modular dynamic RBF neural network for face recognition

    RBF neural networks have seen increasing use in face recognition over the years. However, second-order algorithms are rarely used to train all the adjustable parameters of such networks, because computing the Jacobian and Hessian matrices is computationally expensive. In this paper, we therefore propose a modular structural training architecture that adapts the Levenberg-Marquardt (LM) based RBF neural network to face recognition. We also investigate different front-end processors for reducing the dimensionality of the feature vectors before they are fed to the LM-based RBF neural network. The study was carried out on three standard face databases: ORL, Yale, and AR.
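
For orientation, the textbook Levenberg-Marquardt update that makes such training expensive can be written as follows. This is the standard form, not necessarily the exact variant used in the paper; the symbols (parameter vector w, error vector e, Jacobian J, damping factor mu) are notation assumed here.

```latex
% Standard Levenberg-Marquardt step (assumed notation):
%   \mathbf{w}_k : adjustable network parameters at iteration k
%   \mathbf{e}   : vector of output errors over the training set
%   \mathbf{J}   : Jacobian of \mathbf{e} with respect to \mathbf{w}
%   \mu          : damping factor, adapted between iterations
\[
  \mathbf{w}_{k+1}
    = \mathbf{w}_k
      - \left( \mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I} \right)^{-1}
        \mathbf{J}^{\top}\mathbf{e},
  \qquad
  \mathbf{H} \approx \mathbf{J}^{\top}\mathbf{J}.
\]
```

The Jacobian has one row per training error and one column per parameter, so forming and inverting the damped matrix grows quickly with network size, which is the cost a modular training scheme would aim to contain.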

    Multimodal decision-level fusion for person authentication

    In this paper, the use of clustering algorithms for decision-level data fusion is proposed. Person authentication results from several modalities (e.g., still image, speech) are combined using the fuzzy k-means (FKM) and fuzzy vector quantization (FVQ) algorithms and a median radial basis function (MRBF) network. A quality measure of each modality's data is used for fuzzification. Two modifications of the FKM and FVQ algorithms, based on a novel fuzzy vector distance definition, are proposed to handle the fuzzy data and utilize the quality measure. Simulations show that the fuzzy clustering algorithms perform better than classical clustering algorithms and other known fusion algorithms. MRBF performs better especially when two modalities are combined. Moreover, using the quality measure via the proposed modified algorithms increases the performance of the fusion system.
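
As a rough illustration of the fuzzy clustering step, the sketch below computes standard fuzzy k-means memberships for a vector of per-modality scores. It uses the plain Euclidean distance rather than the paper's modified fuzzy vector distance, and the function name, the score layout, and the two cluster centers ("client" and "impostor") are assumptions made for this example.

```python
import math

def fuzzy_memberships(point, centers, fuzzifier=2.0):
    """Standard fuzzy k-means membership of `point` in each cluster.

    point     -- sequence of per-modality scores (assumed layout)
    centers   -- list of cluster centers with the same dimension
    fuzzifier -- the exponent m > 1 controlling how soft the partition is
    """
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    power = 2.0 / (fuzzifier - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

# Hypothetical example: (image score, speech score) against two centers.
centers = [(1.0, 1.0), (0.0, 0.0)]   # assumed "client" and "impostor" centers
memberships = fuzzy_memberships((0.9, 0.8), centers)
```

A decision could then be taken by accepting the claim when the membership in the client cluster exceeds that in the impostor cluster.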

    Contribuciones a la estimación de pose de cámara

    Camera pose estimation is the problem of finding the orientation and location of a camera with respect to an arbitrary coordinate system. Image-based solutions to this problem are an attractive option because of their low cost. Their main drawback is that the accuracy of the results is affected by noise in the images. Using images for camera pose estimation is closely related to two problems: Perspective-n-Point (PnP) and Bundle Adjustment. Given a set of n correspondences between 3D points and their 2D projections in an image, PnP methods estimate the camera pose. When the 3D positions of the points are unknown, but a set of 2D projections of the same 3D point taken from different viewpoints is known, Bundle Adjustment methods simultaneously estimate the 3D positions of the points and the camera pose. Finding correspondences, whether between 3D points and their 2D projections or between 2D projections in different images, is therefore a non-trivial and fundamental step in solving both problems. This PhD thesis proposes two novel approaches to the correspondence problem, using natural and artificial features.
    In our first contribution, based on natural features, we propose a method for finding 2D correspondences between images through a new fusion approach that combines the information provided by several descriptors using Dempster-Shafer theory. The proposed method fuses different sources of information while taking their relative confidence into account, in order to provide a better solution. Our second contribution focuses on the problem of finding the 2D projections of known 3D points. We propose a novel approach to identifying artificial markers, which are a very popular choice when robustness and speed are required. In particular, we propose to treat the marker identification problem as a classification problem. As a result, we have trained methods able to detect such markers in images affected by complex conditions such as blur and non-uniform lighting. Both contributions have been compared with state-of-the-art methods, showing statistically significant improvements.
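
To make the fusion idea concrete, the following sketch applies Dempster's rule of combination to two mass functions, after discounting each source by a reliability factor (Shafer's classical discounting). It illustrates the general theory only and is not the thesis's method: the frame {match, nomatch}, the mass values, and the reliability factors are hypothetical.

```python
from itertools import product

def discount(masses, reliability, frame):
    """Shafer discounting: move (1 - reliability) of the mass to the full frame."""
    out = {s: reliability * w for s, w in masses.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - reliability)
    return out

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass falling on the empty set
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict")
    return {s: w / norm for s, w in combined.items()}

# Hypothetical evidence from two descriptors about one candidate correspondence.
MATCH, NOMATCH = frozenset({"match"}), frozenset({"nomatch"})
FRAME = MATCH | NOMATCH
descriptor_a = {MATCH: 0.6, FRAME: 0.4}
descriptor_b = {MATCH: 0.7, NOMATCH: 0.1, FRAME: 0.2}

fused = dempster_combine(descriptor_a, descriptor_b)
# Same fusion after discounting each source by an assumed reliability.
fused_discounted = dempster_combine(discount(descriptor_a, 0.9, FRAME),
                                    discount(descriptor_b, 0.5, FRAME))
```

Discounting a less reliable descriptor shifts part of its mass to the whole frame, so it pulls the fused belief less strongly in either direction.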

    Inexpensive fusion methods for enhancing feature detection

    Recent successful approaches to high-level feature detection in image and video data have treated the problem as a pattern classification task. These typically leverage techniques from statistical machine learning, coupled with ensemble architectures that create multiple feature detection models. Once created, co-occurrence between learned features can be captured to further boost performance. At multiple stages throughout these frameworks, various pieces of evidence can be fused together in order to boost performance. These approaches, whilst very successful, are computationally expensive and, depending on the task, require significant computational resources. In this paper we propose two fusion methods that aim to combine the output of an initial basic statistical machine learning approach with a lower-quality information source, in order to gain diversity in the classified results whilst requiring only modest computing resources. Our approaches, validated experimentally on TRECVid data, are designed to be complementary to existing frameworks and can be regarded as possible replacements for the more computationally expensive combination strategies used elsewhere.

    The effective use of the DSmT for multi-class classification

    The extension of the Dezert-Smarandache theory (DSmT) to the multi-class framework has a feasible computational complexity for various applications when the number of classes is small, typically two. In contrast, when the number of classes is large, the DSmT incurs a high computational complexity. This paper investigates the effective use of the DSmT for multi-class classification in conjunction with Support Vector Machines using the One-Against-All (OAA) implementation, which offers two advantages: first, it allows partial ignorance to be modeled by including the complementary classes in the set of focal elements during the combination process; second, it drastically reduces the number of focal elements, using a supervised model that introduces exclusive constraints when classes are naturally and mutually exclusive. To illustrate the effective use of the DSmT for multi-class classification, two SVM-OAA implementations are combined in three steps: transformation of the SVM classifier outputs into posterior probabilities using Platt's sigmoid technique, estimation of masses directly through the proposed model, and combination of masses through the Proportional Conflict Redistribution rule (PCR6). To demonstrate the effectiveness of the proposed framework, a case study is conducted on handwritten digit recognition. Experimental results show that it is possible to efficiently reduce both the number of focal elements and the classification error rate.
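
The first of the three steps, mapping SVM outputs to posterior probabilities with Platt's sigmoid, can be sketched as below. The sigmoid form P(y = 1 | f) = 1 / (1 + exp(A f + B)) is the standard one; the parameter values and the per-class scores are placeholders chosen for illustration (in practice A and B are fitted on held-out data), and the mass estimation and PCR6 steps are not shown.

```python
import math

def platt_probability(score, a, b):
    """Platt's sigmoid: P(y = 1 | score) = 1 / (1 + exp(a * score + b)).

    Written in a numerically stable form so large |a * score + b| does not overflow.
    """
    z = a * score + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

# Placeholder parameters; a negative `a` makes larger scores more probable.
A, B = -2.0, 0.0

# Hypothetical one-against-all SVM decision values for three classes.
oaa_scores = {"class_0": 1.3, "class_1": -0.4, "class_2": -2.1}
posteriors = {label: platt_probability(s, A, B) for label, s in oaa_scores.items()}
```

Each one-against-all classifier yields its own posterior, so the values need not sum to one across classes; the evidence-combination stage is what reconciles them.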

    A New Approach to Arabic Sign Language Recognition System


    Visual tracking with spatio-temporal Dempster-Shafer information fusion

    A key problem in visual tracking is how to effectively combine spatio-temporal visual information from throughout a video to accurately estimate the state of an object. We address this problem by incorporating Dempster-Shafer information fusion into the tracking approach. To implement this fusion task, the entire image sequence is partitioned into spatially and temporally adjacent subsequences. A support vector machine (SVM) classifier is trained for object/non-object classification on each of these subsequences, and the classifier outputs act as separate data sources. To combine the discriminative information from these classifiers, we further present a spatio-temporal weighted Dempster-Shafer (STWDS) scheme. Moreover, temporally adjacent sources are likely to share discriminative information on object/non-object classification. To exploit this shared information, an adaptive SVM learning scheme is designed to transfer discriminative information across sources. Finally, the corresponding Dempster-Shafer belief function of the STWDS scheme is embedded into a Bayesian tracking model. Experimental results on challenging videos demonstrate the effectiveness and robustness of the proposed tracking approach.
    Xi Li, Anthony Dick, Chunhua Shen, Zhongfei Zhang, Anton van den Hengel, Hanzi Wan