572 research outputs found

    Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding

    Gait recognition and understanding systems have broad application prospects. However, their reliance on unstructured image and video data degrades performance: they are easily affected by multiple views, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate method for estimating 3D human body pose and shape semantic parameters from 2-dimensional (2D) input is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, by using gait semantic folding, the estimated body parameters are encoded as a sparse 2D matrix to construct the structural gait semantic image. To achieve time-based gait recognition, an HTM network is constructed to obtain the sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to deal with various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition tasks overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.

    Riemannian Sparse Coding for Positive Definite Matrices

    Inspired by the great success of sparse coding for vector valued data, our goal is to represent symmetric positive definite (SPD) data matrices as sparse linear combinations of atoms from a dictionary, where each atom itself is an SPD matrix. Since SPD matrices follow a non-Euclidean (in fact a Riemannian) geometry, existing sparse coding techniques for Euclidean data cannot be directly extended. Prior works have approached this problem by defining a sparse coding loss function using either extrinsic similarity measures (such as the log-Euclidean distance) or kernelized variants of statistical measures (such as the Stein divergence, Jeffrey's divergence, etc.). In contrast, we propose to use the intrinsic Riemannian distance on the manifold of SPD matrices. Our main contribution is a novel mathematical model for sparse coding of SPD matrices; we also present a computationally simple algorithm for optimizing our model. Experiments on several computer vision datasets showcase superior classification and retrieval performance compared with state-of-the-art approaches.
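To make the "intrinsic Riemannian distance" concrete, here is a minimal sketch. It assumes the abstract refers to the standard affine-invariant distance, d(X, Y) = sqrt(sum_i log(lambda_i)^2), where lambda_i are the eigenvalues of X^{-1} Y; the abstract does not name the metric explicitly. The function name `airm_distance_2x2` is hypothetical, and the code is restricted to 2x2 matrices so it needs only the standard library. It illustrates the distance only, not the paper's sparse coding algorithm.

```python
import math


def airm_distance_2x2(x, y):
    """Affine-invariant Riemannian distance between two 2x2 SPD matrices.

    Assumed formula: d(X, Y) = sqrt(sum_i log(lambda_i)^2), where lambda_i
    are the eigenvalues of X^{-1} Y (all positive when X and Y are SPD).
    """
    (a, b), (c, d) = x
    det_x = a * d - b * c
    if det_x <= 0:
        raise ValueError("x must be positive definite")
    # Closed-form inverse of the 2x2 matrix x.
    inv = [[d / det_x, -b / det_x], [-c / det_x, a / det_x]]
    # m = x^{-1} y
    m = [[sum(inv[i][k] * y[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    trace = m[0][0] + m[1][1]
    det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # Eigenvalues from the characteristic polynomial; clamp rounding noise.
    root = math.sqrt(max(trace * trace - 4.0 * det_m, 0.0))
    lam1 = (trace + root) / 2.0
    lam2 = det_m / lam1  # product of the eigenvalues equals det(m)
    return math.sqrt(math.log(lam1) ** 2 + math.log(lam2) ** 2)
```

Unlike the Euclidean (Frobenius) distance, this quantity is unchanged when both matrices are replaced by their inverses, and it grows without bound as either matrix approaches singularity.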

    Improved terrain type classification using UAV downwash dynamic texture effect

    Autonomously navigating an unknown, dynamic environment while classifying various terrain types remains a significant challenge for the computer vision research community. Addressing these problems is of great interest for the development of collaborative autonomous navigation robots. For example, an Unmanned Aerial Vehicle (UAV) can be used to determine a path, while an Unmanned Surface Vehicle (USV) follows that path to reach the target destination. For the UAV to determine whether a path is valid, it must be able to identify the type of terrain it is flying over. With the help of its rotor air flow (known as the downwash effect), it becomes possible to extract advanced texture features for terrain type classification. This dissertation presents a complete analysis of the extraction of static and dynamic texture features, proposing various algorithms and analyzing their pros and cons. A UAV equipped with a single RGB camera was used to capture images, and a Multilayer Neural Network was used for the automatic classification of water and non-water terrain by means of the downwash effect created by the UAV rotors. The terrain type classification results are then merged into a georeferenced dynamic map, where it is possible to distinguish between water and non-water areas in real time. To improve the algorithms' processing time, several sequential processes were converted into parallel processes and executed on the UAV's onboard GPU with the CUDA framework, achieving speedups of up to 10x. A comparison between the processing times of these two modes, sequential on the CPU and parallel on the GPU, is also presented. All the algorithms were developed using open-source libraries and were analyzed and validated in both simulated and real environments. To evaluate the robustness of the proposed algorithms, the studied terrains were tested with and without the downwash effect. It was concluded that the classifier could be improved by combining static and dynamic features, achieving an accuracy higher than 99% in the classification of water and non-water terrain.

    Result Oriented Based Face Recognition using Neural Network with Erosion and Dilation Technique

    It has been observed that many face recognition algorithms fail to recognize faces after plastic surgery or when the subject wears spectacles, which poses a new challenge for automatic face recognition. Face detection is one of the challenging problems in image processing. This seminar introduces a face detection and recognition system that finds faces from a database of known people. Detecting the face before trying to recognize it saves a lot of work, because only a restricted region of the image is analyzed, in contrast to many algorithms that consider the whole image. We present a study of Face Recognition After Plastic Surgery (FRAPS) and after wearing spectacles, with careful analysis of the effects on face appearance and the resulting challenges for face recognition. To address FRAPS and the spectacles problem, we propose an approach based on optimized weight selection by a genetic algorithm for training an artificial neural network (ANN), using image erosion and dilation. Furthermore, given our results, we suggest that face detection deserves more attention. We also use an edge detection method to process the input image effectively. Alongside edge detection, the genetic algorithm optimizes the ANN weights; the trained ANN is saved to a database and used for face comparison in future recognition. DOI: 10.17762/ijritcc2321-8169.16041
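For readers unfamiliar with the erosion and dilation operations named in this abstract, the following is a minimal sketch of both on a binary image. It assumes a 3x3 square structuring element and considers only in-bounds pixels at the image border; the abstract does not specify either choice, and the function names are illustrative.

```python
def _morph(image, combine):
    """Apply a 3x3 square structuring element to a binary image.

    `image` is a list of equal-length rows of 0/1 values. Only in-bounds
    neighbours are considered, so the border is not treated as background.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, rows))
                      for cc in range(max(c - 1, 0), min(c + 2, cols))]
            out[r][c] = combine(window)
    return out


def erode(image):
    """A pixel stays 1 only if every pixel in its neighbourhood is 1."""
    return _morph(image, min)


def dilate(image):
    """A pixel becomes 1 if any pixel in its neighbourhood is 1."""
    return _morph(image, max)
```

Erosion shrinks foreground regions and removes small specks, while dilation grows them and fills small gaps; pipelines often chain the two to clean up a mask before edge detection.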

    Design and Real-World Application of Novel Machine Learning Techniques for Improving Face Recognition Algorithms

    Recent progress in machine learning has made possible the development of real-world face recognition applications that can match face images as well as or better than humans. However, several challenges remain unsolved. In this PhD thesis, some of these challenges are studied and novel machine learning techniques to improve the performance of real-world face recognition applications are proposed. Current face recognition algorithms based on deep learning techniques are able to achieve outstanding accuracy when dealing with face images taken in unconstrained environments. However, training these algorithms is often costly due to the very large datasets and the high computational resources needed. On the other hand, traditional methods for face recognition are better suited when these requirements cannot be satisfied. This PhD thesis presents new techniques for both traditional and deep learning methods. In particular, a novel traditional face recognition method that combines texture and shape features together with subspace representation techniques is first presented. The proposed method is lightweight and can be trained quickly with small datasets. This method is used for matching face images scanned from identity documents against face images stored in the biometric chip of such documents. Next, two new techniques to increase the performance of face recognition methods based on convolutional neural networks are presented. Specifically, a novel training strategy that increases face recognition accuracy when dealing with face images presenting occlusions, and a new loss function that improves the performance of the triplet loss function are proposed. Finally, the problem of collecting large face datasets is considered, and a novel method based on generative adversarial networks to synthesize both face images of existing subjects in a dataset and face images of new subjects is proposed. The accuracy of existing face recognition algorithms can be increased by training with datasets augmented with the synthetic face images generated by the proposed method. In addition to the main contributions, this thesis provides a comprehensive literature review of face recognition methods and their evolution over the years. A significant amount of the work presented in this PhD thesis is the outcome of a 3-year-long research project partially funded by Innovate UK as part of a Knowledge Transfer Partnership between University of Hertfordshire and IDscan Biometrics Ltd (partnership number: 009547).
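As background for the triplet loss mentioned in this abstract, here is a minimal sketch of the standard (baseline) formulation, not the thesis's improved loss, which the abstract does not describe. It assumes squared Euclidean distance between embeddings and a margin of 0.2; both are common choices but are assumptions here, as are the function names.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Baseline triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`; otherwise it penalises the shortfall.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)
```

A triplet whose negative is already well separated contributes no gradient, which is why practical training schemes tend to mine "hard" triplets rather than sample them uniformly.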