541 research outputs found

Learning optimised representations for view-invariant gait recognition

Author: Jia Ning
Li Chang-Tsun
Sanchez Silva Victor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Gait recognition can be performed without subject cooperation under harsh conditions, so it is an important tool in forensic gait analysis, security control, and other commercial applications. One critical issue that prevents gait recognition systems from being widely accepted is the performance drop when the camera viewpoint varies between the registered templates and the query data. In this paper, we explore the potential of combining feature optimisers and representations learned by convolutional neural networks (CNNs) to achieve efficient view-invariant gait recognition. The experimental results indicate that CNNs learn highly discriminative representations across moderate view variations, and that these representations can be further improved using view-invariant feature selectors, achieving a high matching accuracy across views.

Deakin Research Online

Warwick Research Archives Portal Repository

Review of Person Re-identification Techniques

Author: Aini Hussain
Allouch A.
Bhattacharyya A.
Bilmes J.A.
Cong D‐N.T.
Cong T.
Corvee E.
De Oliveira I.O.
Du Y.
Forssén P.E.
Gheissari N.
Goldmann L.
Halimah Badioze Zaman
Hamdoun O.
Horprasert T.
Kawai R.
Khedher M.I.
Lantagne M.
Layne R.
Mohamad Hanif Md. Saad
Mohammad Ali Saghafi
Musa Z.B.
Nguyen H.Q.
Ohara Y.
Skog D.
Stauffer C.
Sun J.
Wang J.
Xiang J.
Yang H.
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 01/12/2014

Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher accuracy rate and lower computational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Dynamic distance-based shape features for gait recognition

Author: Belyaev Alexander
Robertson Neil
Whytock Tenika
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/03/2014

Queen's University Belfast Research Portal

Heriot Watt Pure

Crossref

Springer - Publisher Connector

Person Re-Identification by Discriminative Selection in Video Ranking

Author: Gong S
Wang S
Wang T
Zhu X
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/01/2016

Current person re-identification (ReID) methods typically rely on single-frame imagery features, whilst ignoring space-time information from image sequences often available in practical surveillance scenarios. Single-frame (single-shot) based visual appearance matching is inherently limited for person ReID in public spaces due to the challenging visual ambiguity and uncertainty arising from non-overlapping camera views, where changes in viewing conditions can cause significant variations in people's appearance. In this work, we present a novel model to automatically select the most discriminative video fragments from noisy/incomplete image sequences of people from which reliable space-time and appearance features can be computed, whilst simultaneously learning a video ranking function for person ReID. Using the PRID2011, iLIDS-VID, and HDA+ image sequence datasets, we conducted extensive comparative evaluations to demonstrate the advantages of the proposed model over contemporary gait recognition, holistic image sequence matching and state-of-the-art single-/multi-shot ReID methods.

arXiv.org e-Print Archive

Crossref

Queen Mary Research Online

Covariate factor mitigation techniques for robust gait recognition

Author: Whytock Tenika P.
Publication venue: Engineering and Physical Sciences
Publication date: 01/05/2015

The human gait is a discriminative feature capable of recognising a person by their unique walking manner. Currently, gait recognition is based on videos captured in a controlled environment. These videos contain challenges, termed covariate factors, which affect the natural appearance and motion of gait, e.g. carrying a bag, clothing, shoe type and time. However, gait recognition has yet to achieve robustness to these covariate factors. To achieve enhanced robustness capabilities, it is essential to address the existing gait recognition limitations. Specifically, this thesis develops an understanding of how covariate factors behave while a person is in motion and the impact covariate factors have on the natural appearance and motion of gait. Enhanced robustness is achieved by producing a combination of novel gait representations and novel covariate factor detection and removal procedures. Having addressed the limitations regarding covariate factors, this thesis achieves the goal of robust gait recognition. Using a skeleton representation of the human figure, the Skeleton Variance Image condenses a skeleton sequence into a single compact 2D gait representation to express the natural gait motion. In addition, a covariate factor detection and removal module is used to maximise the mitigation of covariate factor effects. By establishing the average pixel distribution within training (covariate factor free) representations, a comparison against test (covariate factor) representations achieves effective covariate factor detection. The corresponding difference can effectively remove covariate factors which occur at the boundary of, and hidden within, the human figure. The Engineering and Physical Sciences Research Council (EPSRC

ROS: The Research Output Service. Heriot-Watt University Edinburgh

A comparative study of pose representation and dynamics modelling for online motion quality assessment

Author: Adeline Paiement
Aggarwal
Arias
Beal
Blackburn
Charles
Coifman
Coifman
De Rosa
Dima Damen
Duong
Elgammal
Gerber
Hussein
Ian Craddock
Ke
Krüger
Kviatkovsky
Kwon
Laptev
Lili Tao
Lin
Liu
Lv
Majid Mirmehdi
Massimo Camplani
Narasimhan
Natarajan
Nater
Nowozin
Ohn-Bar
Paiement
Parra-Dominguez
Peursum
Pirsiavash
Platt
Popoola
Poppe
Rabiner
Shotton
Sion Hannuna
Snoek
Thuc
Tilo Burghardt
Tuytelaars
Uddin
Uddin
Valstar
VanSwearingen
Vaswani
Vemulapalli
Wang
Wang
Wang
Wolfson
Xia
Xia
Yao
Ye
Publication venue: 'Elsevier BV'
Publication date: 01/07/2016

© 2015 The Authors. Published by Elsevier Inc. Quantitative assessment of the quality of motion is increasingly in demand by clinicians in healthcare and rehabilitation monitoring of patients. We study and compare the performances of different pose representations and HMM models of dynamics of movement for online quality assessment of human motion. In a general sense, our assessment framework builds a model of normal human motion from skeleton-based samples of healthy individuals. It encapsulates the dynamics of human body pose using a robust manifold representation and a first-order Markovian assumption. We then assess deviations from it via a continuous online measure. We compare different feature representations, reduced dimensionality spaces, and HMM models on motions typically tested in clinical settings, such as gait on stairs and flat surfaces, and transitions between sitting and standing. Our dataset is manually labelled by a qualified physiotherapist. The continuous-state HMM, combined with pose representation based on body-joints' location, outperforms standard discrete-state HMM approaches and other skeleton-based features in detecting gait abnormalities, as well as assessing deviations from the motion model on a frame-by-frame basis.

Elsevier - Publisher Connector

Crossref

UWE Bristol Research Repository

Cronfa at Swansea University

Explore Bristol Research

Person Recognition in Low-Quality Imagery.

Author: Cheng Zhiyi
Publication venue: Queen Mary University of London.
Publication date: 01/01/2021

PhD theses. Person recognition aims to recognise and track the same individuals over space and time with subtle identity class information in automatically detected person images captured by unconstrained camera views. There are multi-source visual biometrical cues for person identity recognition. Specifically, compared to other widely-used cues that tend to easily change over time and space, the facial appearance is considered a more reliable non-intrusive visual cue. Person recognition, especially person face recognition, enables a wide range of practical applications, ranging from law enforcement and information security to business, entertainment and e-commerce. However, person recognition under realistic application scenarios remains significantly challenging, mainly due to the usual low resolutions (LR) of the images captured by low-quality cameras with unconstrained distances between cameras and people. Compared to high-resolution (HR) images, LR person images contain much less fine-grained discriminative detail for robust identity recognition. To tackle the challenge of person recognition on low-resolution imagery data, one effective approach is to utilise super-resolution (SR) methods to recover or enhance the image details that are beneficial for identity recognition. However, this thesis reveals that conventional SR models suffer from a significant performance drop when applied to low-quality LR person images, especially natively captured surveillance facial images. Moreover, as the SR and identity recognition models advance independently, direct super-resolution is less compatible with identity recognition, and hence has a minor benefit or even a negative effect for low-resolution person recognition. To tackle the above problems, this thesis explores person recognition methods with improved generalisation ability to realistic low-quality person images, by adopting dedicated super-resolution algorithms. More specifically, this thesis addresses the issues for person face recognition and body recognition in low-resolution images as follows:
Chapter 3 Whilst recent person face recognition techniques have made significant progress on recognising constrained high-resolution web images, the same cannot be said of natively unconstrained low-resolution images at large scales. This chapter systematically examines this under-studied person face recognition problem, and introduces a novel Complement Super-Resolution and Identity (CSRI) joint deep learning method with a unified end-to-end network architecture. The proposed learning mechanism is dedicated to overcoming the inherent challenge of genuine low resolution, namely the absence of HR facial images coupled with native LR faces, which are typically required for optimising image super-resolution models. This is realised by transferring the super-resolving knowledge from good-quality HR web images to the genuine LR facial data, subject to the face identity label constraints of native LR faces in every mini-batch during training. This chapter further constructs a new large-scale dataset, TinyFace, of native unconstrained low-resolution face images from selected public datasets. The extensive experiments show that there is a significant gap between the reported person face recognition performances on popular benchmarks and the results on TinyFace, and demonstrate the advantages of the proposed CSRI over a variety of state-of-the-art face recognition and super-resolution deep models on this largely ignored person face recognition scenario. However, the lack of supervision in pixel space leads to low-fidelity super-resolved images, which may hinder further downstream facial analysis applications.
Chapter 4 Even with a more advanced joint-learning scheme for person face recognition by super-resolution (introduced in Chapter 3), one can by no means claim that the proposed method solves the real-world low-resolution face recognition problem, which remains a significantly challenging task. In terms of human understanding, when people are faced with a challenging face identity recognition task, they often make decisions by selecting discriminative facial features. If a recognition model can be optimised with results that can be explained in a human-understandable way, such an interpretable model may have the potential to shed light on discriminative facial feature selection for better identity recognition. To achieve this, recognising faces from high-fidelity super-resolved outputs could be a viable approach. However, existing facial super-resolution methods focus mostly on improving “artificially down-sampled” low-resolution (LR) imagery. Such SR models, although strong at handling artificial LR images, often suffer from a significant performance drop on genuine LR test data. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data as well as a cycle consistency loss formulation. However, this renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. Importantly, this makes the end-to-end model training ineffective due to the difficulty of back-propagating gradients through two concatenated CNNs. To solve this problem, this chapter formulates a method that joins the advantages of conventional SR and UDA models. Specifically, the optimisations for characteristics consistifying and image super-resolving are separated and controlled by introducing Characteristic Regularisation (CR) between them. This task split makes the model training more effective and computationally tractable, and enables a high-fidelity super-resolution process on genuine low-resolution faces.
Chapter 5 Although the facial appearance is a more reliable visual cue for person recognition, it is often challenging or even impossible to detect the facial region in images captured by unconstrained low-quality cameras, where the faces can exhibit extreme poses, blur or distortion, or even be invisible in human back-view images. In such cases, person body recognition is an important aspect of identity recognition and tracking. However, person images captured by unconstrained surveillance cameras often have low resolutions (LR). This causes a resolution mismatch problem when they are matched against high-resolution (HR) gallery images, negatively affecting the performance of person body recognition. An effective approach is to leverage image super-resolution (SR) along with body recognition in a joint learning manner. However, this scheme is limited because gradient backpropagation becomes dramatically more difficult during training. This chapter introduces a novel model training regularisation method, called Inter-Task Association Critic (INTACT), to address this fundamental problem. Specifically, INTACT discovers the underlying association knowledge between image SR and person body recognition, and leverages it as an extra learning constraint for enhancing the compatibility of the SR model with person body recognition in HR image space. This is realised by parameterising the association constraint, which can be automatically learned from the training data. Extensive experiments validate the superiority of INTACT over the state-of-the-art approaches on the cross-resolution person body recognition task using five standard datasets.
Chapter 6 draws conclusions and suggests future work on open questions arising from the studies of this thesis.

Queen Mary Research Online

Image based human body rendering via regression & MRF energy minimization

Author: Li Xinfeng
Publication venue
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A machine learning method for synthesising human images is explored to create new images without relying on 3D modelling. Machine learning allows the creation of new images through prediction from existing data based on the use of training images. In the present study, image synthesis is performed at two levels: contour and pixel. A class of learning-based methods is formulated to create object contours from the training image for the synthetic image that allow pixel synthesis within the contours in the second level. The methods rely on applying robust object descriptions, dynamic learning models after appropriate motion segmentation, and machine learning-based frameworks. Image-based human image synthesis using machine learning is a research focus that has recently gained considerable attention in the field of computer graphics. It makes use of techniques from image/motion analysis in computer vision. The problem lies in the estimation of methods for image-based object configuration (i.e. segmentation, contour outline). Using the results of these analysis methods as bases, the research adopts the machine learning approach, in which human images are synthesised by executing the synthesis of contour and pixels through learning from training images. Firstly, the thesis shows how an accurate silhouette is distilled using developed background subtraction for accuracy and efficiency. The traditional vector machine approach is used to avoid ambiguities within the regression process. Images can be represented as a class of accurate and efficient vectors for single images as well as sequences. Secondly, the framework is explored using a unique view of machine learning methods, i.e., support vector regression (SVR), to obtain the convergence result of vectors for contour allocation.
The changing relationship between the synthetic image and the training image is expressed as a vector and represented in functions. Finally, a pixel synthesis is performed based on belief propagation. This thesis proposes a novel image-based rendering method for colour image synthesis using SVR and belief propagation for generalisation to enable the prediction of contour and colour information from input colour images. The methods rely on using appropriately defined and robust input colour images, optimising the input contour images within a sparse SVR framework. Firstly, the thesis shows how contour can effectively and efficiently be predicted from small numbers of input contour images. In addition, the thesis exploits the sparse properties of SVR for efficiency, and makes use of SVR to estimate the regression function. The image-based rendering method employed in this study enables contour synthesis for the prediction of small numbers of input source images. This procedure avoids the use of complex models and geometry information. Secondly, the method used for human body contour colouring is extended to define eight differently connected pixels, and construct a link distance field via the belief propagation method. The link distance, which acts as the message in propagation, is transformed by improving the low-envelope method in fast distance transform. Finally, the methodology is tested by considering human facial and human body clothing information. The accuracy of the test results for the human body model confirms the efficiency of the proposed method.

Brunel University Research Archive