
    MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition

    Gait recognition, which aims to identify individuals by their walking patterns, has recently drawn increasing research attention. However, gait recognition still suffers from the conflict between the limited binary visual clues of the silhouette and the numerous covariates of diverse scales, which challenges the model's adaptiveness. In this paper, we address this conflict by developing a novel MetaGait that learns to learn an omni sample adaptive representation. Towards this goal, MetaGait injects meta-knowledge, which guides the model to perceive sample-specific properties, into the calibration network of the attention mechanism to improve adaptiveness from the omni-scale, omni-dimension and omni-process perspectives. Specifically, we leverage meta-knowledge across the entire process: Meta Triple Attention adaptively captures omni-scale dependencies from the spatial, channel and temporal dimensions simultaneously, while Meta Temporal Pooling adaptively aggregates temporal information by integrating the merits of three complementary temporal aggregation methods. Extensive experiments demonstrate the state-of-the-art performance of the proposed MetaGait. On CASIA-B, we achieve rank-1 accuracy of 98.7%, 96.0% and 89.3% under the three walking conditions, respectively. On OU-MVLP, we achieve rank-1 accuracy of 92.4%. Comment: Accepted by ECCV 2022.
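
    The abstract describes Meta Temporal Pooling as a sample-adaptive blend of three complementary temporal aggregation methods. As a rough illustration only, the PyTorch sketch below blends max, mean and median pooling over time with per-sample weights predicted by a small meta-network; the module name, the choice of the three aggregators and the meta-network design are assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class AdaptiveTemporalPooling(nn.Module):
    """Hypothetical sample-adaptive temporal pooling in the spirit of Meta Temporal Pooling."""

    def __init__(self, channels: int):
        super().__init__()
        # Meta-network: maps a per-sample descriptor to 3 mixing weights.
        self.meta = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 3),
            nn.Softmax(dim=-1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) sequence features.
        f_max = x.max(dim=2).values        # peak response over time
        f_mean = x.mean(dim=2)             # smooth average over time
        f_med = x.median(dim=2).values     # outlier-robust aggregation
        w = self.meta(f_mean)              # (batch, 3) sample-specific weights
        stacked = torch.stack([f_max, f_mean, f_med], dim=-1)   # (B, C, 3)
        return (stacked * w.unsqueeze(1)).sum(dim=-1)           # (B, C)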

    GaitStrip: Gait Recognition via Effective Strip-based Feature Representations and Multi-Level Framework

    Many gait recognition methods first partition the human body into N parts and then combine them to establish part-based feature representations. Their recognition performance is often affected by the partitioning strategy, which is chosen empirically for each dataset. However, we observe that strips, the basic components of parts, are agnostic to the partitioning strategy. Motivated by this observation, we present a strip-based multi-level gait recognition network, named GaitStrip, to extract comprehensive gait information at different levels. To be specific, our high-level branch explores the context of gait sequences and our low-level branch focuses on detailed posture changes. We introduce a novel StriP-Based feature extractor (SPB) to learn strip-based feature representations by directly taking each strip of the human body as the basic unit. Moreover, we propose a novel multi-branch structure, called the Enhanced Convolution Module (ECM), to extract different representations of gait. ECM consists of a Spatial-Temporal feature extractor (ST), a Frame-Level feature extractor (FL) and SPB, and has two clear advantages. First, each branch focuses on a specific representation, which improves the robustness of the network: ST extracts spatial-temporal features of gait sequences, while FL generates a feature representation of each frame. Second, the parameters of the ECM can be reduced at test time through a structural re-parameterization technique. Extensive experimental results demonstrate that GaitStrip achieves state-of-the-art performance in both normal walking and complex conditions. Comment: Accepted to ACCV 2022.
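
    Since SPB's key idea, as summarized above, is to treat each horizontal strip of the body as the basic unit, a minimal sketch of strip-based pooling may help; the (B, C, H, W) shape and the max-plus-mean pooling are illustrative assumptions, not the paper's code.

import torch

def strip_pool(feat: torch.Tensor) -> torch.Tensor:
    """feat: (B, C, H, W) -> (B, C, H), one descriptor per height strip."""
    # Pool within each one-pixel-high strip along the width axis; combining
    # max and mean responses is a common choice in strip/part-based gait models.
    return feat.max(dim=3).values + feat.mean(dim=3)

    Each of the H strip descriptors can then be fed to its own projection head, which is what makes the representation independent of any hand-chosen part partitioning.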

    Person recognition based on deep gait: a survey.

    Gait recognition, also known as walking pattern recognition, has attracted increasing interest from the computer vision and biometrics communities due to its potential to identify individuals from a distance, its range of potential applications and its non-invasive nature. Since 2014, deep learning approaches have shown promising results in gait recognition by automatically extracting features. However, recognizing gait accurately remains challenging due to covariate factors, the complexity and variability of environments, and variations in human body representation. This paper provides a comprehensive overview of the advancements in this field, along with the challenges and limitations associated with deep learning methods. It first examines the gait datasets used in the literature and analyzes the performance of state-of-the-art techniques. It then presents a taxonomy of deep learning methods to characterize and organize the research landscape, highlighting the basic limitations of deep learning methods in the context of gait recognition. The paper concludes by discussing present challenges and suggesting several research directions for improving gait recognition performance in the future.

    Gait recognition for person re-identification

    Person re-identification across multiple cameras is an essential task in computer vision applications, particularly for tracking the same person across different scenes. Gait recognition, i.e., recognition based on walking style, is commonly used for this purpose because human gait has unique characteristics that allow a person to be recognized from a distance. However, gait-based recognition can be limited by the viewpoint of the captured images or videos. Hence, this paper proposes a gait recognition approach for person re-identification. The proposed approach first estimates the gait angle and then performs recognition using convolutional neural networks. Multi-task convolutional neural network models and extracted gait energy images (GEIs) are used to estimate the angle and recognize the gait. GEIs are extracted by first detecting the moving objects using background subtraction techniques, as sketched below. Training and testing are carried out on three well-known datasets: CASIA-B, OU-ISIR and OU-MVLP. The background modeling component is evaluated on the Scene Background Modeling and Initialization (SBI) dataset. The proposed gait recognition method achieved an accuracy of more than 98% on almost all datasets, outperforming other methods on CASIA-B and OU-MVLP and giving the best results on the OU-ISIR dataset.
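
    As a minimal sketch of the GEI pipeline just described (and not the paper's exact implementation), the following uses OpenCV's MOG2 background subtractor to obtain moving-object masks, binarises and size-normalises them, and averages them over a gait cycle; the threshold and the 64x64 output size are illustrative choices.

import cv2
import numpy as np

def compute_gei(frames, size=(64, 64)):
    """frames: iterable of BGR images covering roughly one gait cycle."""
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
    silhouettes = []
    for frame in frames:
        mask = subtractor.apply(frame)          # moving-object mask in 0..255
        _, binary = cv2.threshold(mask, 127, 1, cv2.THRESH_BINARY)
        silhouettes.append(cv2.resize(binary.astype(np.float32), size))
    return np.mean(silhouettes, axis=0)         # pixel-wise average = GEI

    In practice each silhouette is also cropped and centred on the body before resizing, so that the averaged energy image is alignment-consistent across frames.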

    Soft Biometric Analysis: Multi-Person and Real-Time Pedestrian Attribute Recognition in Crowded Urban Environments

    Traditionally, recognition systems were based only on hard biometrics. However, ubiquitous CCTV cameras have raised the desire to analyze human biometrics from far distances, without people's participation in the acquisition process. High-resolution face close-shots are rarely available at far distances, so face-based systems cannot provide reliable results in surveillance applications. Human soft biometrics, such as body and clothing attributes, are believed to be more effective for analyzing human data collected by security cameras. This thesis contributes to human soft biometric analysis in uncontrolled environments and focuses mainly on two tasks: Pedestrian Attribute Recognition (PAR) and person re-identification (re-id). We first review the literature of both tasks and highlight the history of advancements, recent developments, and the existing benchmarks. The difficulties of PAR and person re-id stem from significant distances between intra-class samples, which originate from variations in factors such as body pose, illumination, background, occlusion and data resolution. Recent state-of-the-art approaches present end-to-end models that can extract discriminative and comprehensive feature representations of people. The correlation between different body regions and dealing with limited learning data are also objectives of many recent works. Moreover, class imbalance and correlation between human attributes are challenges specific to the PAR problem. We collect a large surveillance dataset to train a novel gender recognition model suitable for uncontrolled environments. We propose a deep residual network that extracts several pose-wise patches from samples and obtains a comprehensive feature representation. In the next step, we develop a model for recognizing multiple attributes at once. Considering the correlation between human semantic attributes and the class imbalance, we use a multi-task model and a weighted loss function, respectively. We also propose a multiplication layer on top of the backbone feature-extraction layers to exclude background features from the final representation of samples and to draw the model's attention to the foreground area. We address the person re-id problem by implicitly defining the receptive fields of deep learning classification frameworks. The receptive fields of deep learning models determine the regions of the input data most significant for producing correct decisions. Therefore, we synthesize a set of learning data in which the destructive regions (e.g., background) in each pair of instances are interchanged. A segmentation module determines destructive and useful regions in each sample, and the label of a synthesized instance is inherited from the sample that contributed the useful regions to the synthesized image. The synthesized learning data are then used in the learning phase and help the model rapidly learn that identity and background regions are not correlated. Meanwhile, the proposed solution can be seen as a data augmentation approach that fully preserves the label information and is compatible with other data augmentation techniques. When re-id methods are trained in scenarios where the target person appears with identical garments in the gallery, the visual appearance of clothes is given the most importance in the final feature representation. Clothing-based representations are not reliable in long-term re-id settings, as people may change their clothes.
Therefore, solutions that ignore clothing cues and focus on identity-relevant features are in demand. We transform the original data such that the identity-relevant information of people (e.g., face and body shape) is removed, while the identity-unrelated cues (i.e., color and texture of clothes) remain unchanged. A model learned on the synthesized dataset predicts the identity-unrelated (short-term) cues. We then train a second model, coupled with the first, that learns embeddings of the original data such that the similarity between the embeddings of the original and synthesized data is minimized. This way, the second model predicts based on the identity-related (long-term) representation of people. To evaluate the performance of the proposed models, we use PAR and person re-id datasets, namely BIODI, PETA, RAP, Market-1501, MSMT-V2, PRCC, LTCC and MIT, and compare our experimental results with state-of-the-art methods in the field. In conclusion, the data collected from surveillance cameras have low resolution, such that the extraction of hard biometric features is not possible and face-based approaches produce poor results. In contrast, soft biometrics are robust to variations in data quality, so we propose approaches for both PAR and person re-id that learn discriminative features from each instance, and we evaluate our solutions on several publicly available benchmarks. This thesis was prepared at the University of Beira Interior and IT - Instituto de Telecomunicações, Soft Computing and Image Analysis Laboratory (SOCIA Lab), Covilhã Delegation, and was submitted to the University of Beira Interior for defense in a public examination session.
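
    Two of the ideas above lend themselves to a compact illustration: a per-attribute weighted loss for class imbalance and a multiplication layer that suppresses background features. The PyTorch sketch below is a plausible rendering under assumed shapes and an exponential weighting scheme, not the thesis code.

import torch
import torch.nn.functional as F

def weighted_attribute_bce(logits, targets, pos_ratio):
    """logits, targets: (B, num_attrs); pos_ratio: (num_attrs,) positive fraction."""
    # Rare positive attributes receive larger weights, countering class imbalance.
    w_pos = torch.exp(1.0 - pos_ratio)
    w_neg = torch.exp(pos_ratio)
    w = targets * w_pos + (1.0 - targets) * w_neg
    return F.binary_cross_entropy_with_logits(logits, targets, weight=w)

def mask_background(features, fg_mask):
    """features: (B, C, H, W) backbone maps; fg_mask: (B, 1, H, W) in [0, 1]."""
    # Element-wise multiplication keeps foreground responses and damps background.
    return features * fg_mask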

    Human Gait Analysis using Spatiotemporal Data Obtained from Gait Videos

    With the development of deep learning techniques, deep neural network (NN)-based methods have become the standard for image processing tasks such as human motion tracking and pose estimation, human activity recognition, and face recognition. Deep learning techniques have improved the design, implementation and deployment of complex and diverse applications, which are now used in a variety of fields, including biomedical engineering. The application of computer vision techniques to medical image and video analysis has produced remarkable results in event detection. The built-in ability of convolutional neural networks (CNNs) to extract features from complex medical images, combined with the ability of long short-term memory (LSTM) networks to preserve the temporal information between events, has opened many new horizons for medical research. Gait is one of the critical physiological domains that can reflect many disorders related to aging and neurodegeneration. A comprehensive and accurate gait analysis can provide insights into a person's physiological condition. Existing gait analysis procedures require a special environment, complex medical equipment and trained personnel to record gait data. In the case of wearable systems, such a setup can interfere with cognitive abilities and be uncomfortable for patients. It has also been reported that patients usually try to perform better during laboratory tests, which may not reflect their actual gait. Despite technological advances, we still face limitations in measuring human walking in clinical and laboratory settings. Current gait analysis methods remain expensive and time-consuming, and access to specialized equipment and expertise is difficult. It is therefore imperative to have methods that provide long-term data on a patient's state of health without dual cognitive tasks or the discomfort of wearable sensors. This thesis therefore proposes a simple, easy-to-implement and cost-effective method for recording gait data. The method is based on recording walking videos with a smartphone camera in a home environment under free-living conditions. A deep neural network then processes these videos to extract the gait events. The detected events are further used to quantify various spatiotemporal gait parameters that are important for any gait analysis system. In this work, gait videos recorded outside the laboratory with a low-resolution smartphone camera were used. Several deep learning-based NNs were implemented to detect basic gait events, such as the position of the foot relative to the ground, from these videos. In the first study, the AlexNet architecture was used to train a model from scratch on walking videos and publicly available datasets, achieving an overall accuracy of 74%. In the next step, an LSTM layer was integrated into the same architecture.
The built-in ability of LSTM to exploit temporal information led to improved prediction of the foot-position labels, and an accuracy of 91% was achieved. However, difficulties remained in predicting the correct labels during the final phase of the swing and stance of each foot. Next, transfer learning was applied to exploit the advantages of already-trained deep NNs by using pre-trained weights. Two well-known models, inceptionresnetv2 (IRNV-2) and densenet201 (DN-201), were retrained on the new data starting from their learned weights. The transfer learning-based pre-trained NN improved the prediction of labels for the different foot positions; in particular, it reduced the fluctuations in the predictions during the final phase of the gait swing and stance. An accuracy of 94% was achieved in predicting the class labels of the test data. Since the deviation from the true label was mostly one frame, it could be ignored at a frame rate of 30 frames per second. The predicted labels were used to extract various spatiotemporal gait parameters that are crucial for any gait analysis system. In total, 12 gait parameters were quantified and compared with ground truth obtained through observational methods. The NN-based spatiotemporal parameters showed a high correlation with the ground truth, and in some cases a very high correlation was achieved. The results demonstrate the usefulness of the proposed method. The value of a parameter over time yields a time series, a long-term representation of gait, which can be further analyzed with various mathematical methods. As the third contribution of this dissertation, improvements to existing mathematical methods for the time-series analysis of temporal gait data were proposed. To this end, two refinements of existing entropy-based methods for analyzing stride-interval time series are proposed. These refinements were validated on stride-interval time-series data from healthy subjects and patients with neurodegenerative diseases, downloaded from the publicly available PhysioNet database. The results showed that the proposed method enables a clear separation between healthy and diseased groups. In the future, advanced medical support systems that use artificial intelligence and are derived from the methods presented here could assist physicians in diagnosing and monitoring patients' gait over the long term, thereby reducing the clinical workload and improving patient safety.
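
    To make the parameter-extraction step concrete, the sketch below turns per-frame foot-contact labels (as predicted by the CNN+LSTM described above) into a stride-interval time series at 30 frames per second; the 1 = stance / 0 = swing label convention is an assumption for illustration. The resulting series is exactly the kind of input the refined entropy-based methods analyze.

import numpy as np

def stride_intervals(contact_labels, fps=30):
    """contact_labels: 1-D array of per-frame 0/1 labels for one foot."""
    labels = np.asarray(contact_labels)
    # Heel strikes: transitions from swing (0) to stance (1).
    strikes = np.flatnonzero((labels[1:] == 1) & (labels[:-1] == 0)) + 1
    # Stride interval = time between consecutive heel strikes of the same foot.
    return np.diff(strikes) / fps    # in seconds

# Example: two strides of ~1.0 s each at 30 fps.
labels = np.r_[np.zeros(5), np.ones(10), np.zeros(20),
               np.ones(10), np.zeros(20), np.ones(10)]
print(stride_intervals(labels))      # -> [1. 1.]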

    Gender and gaze gesture recognition for human-computer interaction

    © 2016 Elsevier Inc. The identification of visual cues in facial images has been widely explored in the broad area of computer vision. However, theoretical analyses are often not transformed into widespread assistive Human-Computer Interaction (HCI) systems, due to factors such as inconsistent robustness, low efficiency, large computational expense or strong dependence on complex hardware. We present a novel gender recognition algorithm, a modular eye centre localisation approach and a gaze gesture recognition method, aiming to increase the intelligence, adaptability and interactivity of HCI systems by combining demographic data (gender) and behavioural data (gaze) to enable the development of a range of real-world assistive-technology applications. The gender recognition algorithm uses Fisher Vectors as facial features, encoded from low-level local features in facial images. We experimented with four types of low-level features: greyscale values, Local Binary Patterns (LBP), LBP histograms and Scale Invariant Feature Transform (SIFT) descriptors. The corresponding Fisher Vectors were classified using a linear Support Vector Machine. The algorithm has been tested on the FERET, LFW and FRGCv2 databases, yielding 97.7%, 92.5% and 96.7% accuracy, respectively. The eye centre localisation algorithm takes a modular approach, following a coarse-to-fine, global-to-regional scheme and utilising isophote and gradient features. A Selective Oriented Gradient filter has been specifically designed to detect and remove strong gradients from eyebrows, eye corners and self-shadows, which undermine most eye centre localisation methods. The trajectories of the eye centres are then interpreted as gaze gestures for active HCI. The eye centre localisation algorithm has been compared with ten other state-of-the-art algorithms with similar functionality and outperformed them in accuracy while maintaining excellent real-time performance. These methods have been employed to develop a data recovery system that supports the implementation of advanced assistive-technology tools. The high accuracy, reliability and real-time performance achieved for attention monitoring, gaze gesture control and recovery of demographic data can enable the advanced human-robot interaction needed to develop systems that assist with everyday actions, thereby improving the quality of life of the elderly and/or disabled.
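
    To illustrate the encoding pipeline described above, here is a simplified Fisher Vector implementation (mean gradients only, diagonal-covariance GMM) with power- and L2-normalisation, classified by a linear SVM; it is a sketch of the general technique, not the authors' code, and the helper names are ours.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import LinearSVC

def fisher_vector(descriptors, gmm):
    """descriptors: (N, D) local features of one image; gmm: fitted diagonal GMM."""
    q = gmm.predict_proba(descriptors)                  # (N, K) soft assignments
    # covariances_ has shape (K, D) because covariance_type="diag" is assumed.
    diffs = (descriptors[:, None, :] - gmm.means_[None]) / np.sqrt(gmm.covariances_)
    fv = (q[:, :, None] * diffs).sum(axis=0)            # (K, D) mean-gradient block
    fv /= descriptors.shape[0] * np.sqrt(gmm.weights_)[:, None]
    fv = fv.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power-normalisation
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2-normalisation

# Typical usage: fit the vocabulary, encode every image, train a linear SVM.
# gmm = GaussianMixture(n_components=64, covariance_type="diag").fit(all_descriptors)
# X = np.stack([fisher_vector(d, gmm) for d in per_image_descriptors])
# clf = LinearSVC().fit(X, labels)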