16 research outputs found
GAIT Technology for Human Recognition using CNN
Gait is a distinctive biometric characteristic that can be captured at a distance; as a result, it has many applications in social security, forensic identification, and crime prevention. Existing gait identification techniques either use a gait template, which makes it difficult to preserve temporal information, or a gait sequence, which imposes unnecessary sequential constraints and loses flexibility in representing a gait. We instead regard gait as a set of independent frames. Built on this deep-set viewpoint, our technique is immune to frame permutations and can seamlessly combine frames from videos taken in different contexts, such as different viewing angles, different outfits, or different carrying conditions. In experiments, our single-model approach achieves an average rank-1 accuracy of 96.1% on the CASIA-B gait dataset and 87.9% on the OU-MVLP gait dataset under normal walking conditions. The model is also highly robust under several challenging conditions: when the subject walks carrying a bag or wearing a coat, it reaches accuracies on CASIA-B of 90.8% and 70.3%, respectively, greatly surpassing the best existing approaches. The proposed method also maintains satisfactory accuracy when only a few frames are available in the test samples, achieving, for instance, 85.0% on CASIA-B with only 7 frames.
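The set viewpoint described above, treating the frames of a walk as an unordered set, can be sketched minimally. In the sketch below the per-frame feature extractor is a trivial stand-in for the paper's CNN, and the element-wise max is one possible permutation-invariant pooling; both are assumptions of this illustration, not the paper's actual architecture.

```python
import numpy as np

def frame_features(frames):
    """Stand-in per-frame feature extractor (a CNN in the paper);
    here just a flattened silhouette, for illustration only."""
    return np.array([f.reshape(-1) for f in frames], dtype=float)

def set_pool(features):
    """Permutation-invariant set pooling: element-wise max over frames."""
    return features.max(axis=0)

rng = np.random.default_rng(0)
frames = [rng.integers(0, 2, size=(4, 4)) for _ in range(5)]

a = set_pool(frame_features(frames))
b = set_pool(frame_features(frames[::-1]))  # same frames, reversed order
assert np.allclose(a, b)  # pooling ignores frame order
```

Because the pooled descriptor ignores frame order, frames gathered from different videos of the same subject can be merged into one set before pooling.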
SIS 2017. Statistics and Data Science: new challenges, new generations
The 2017 SIS Conference aims to highlight the crucial role of Statistics in Data Science. In this new domain, where 'meaning' is extracted from data, the ever-increasing amount of data produced and stored in databases has brought new challenges. These involve different fields: statistics, machine learning, information and computer science, optimization, and pattern recognition. Together, these fields make a considerable contribution to the analysis of 'big data', open data, relational and complex data, structured and unstructured. The aim is to collect contributions from the different domains of Statistics on high-dimensional data quality validation, sample extraction, dimensionality reduction, pattern selection, data modelling, hypothesis testing, and confirming conclusions drawn from the data.
View-invariant gait person re-identification with spatial and temporal attention
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Person re-identification at a distance across multiple non-overlapping cameras has been an active research area for years. In the past ten years, short-term person Re-Id
techniques have made great strides in accuracy using only appearance features in limited environments. However, massive intra-class variations and inter-class confusion limit their use in practical applications. Moreover, appearance
consistency can only be assumed in a short time span from one camera to the other.
Since holistic appearance changes drastically over days and weeks, the techniques mentioned above become ineffective. Practical applications usually require a long-term solution in which the subject's appearance and clothing might have changed after a significant period has elapsed. Facing these problems, soft biometric features
such as gait have been proposed in the past. Nevertheless, even gait can vary with illness, ageing, changes in emotional state, walking surface, shoe type, clothing type, objects carried by the subject and even clutter in the scene. Gait is therefore considered a temporal cue that can provide biometric motion information.
On the other hand, the shape of the human body could be viewed as a spatial signal
which can produce valuable information. So, extracting discriminative features from
both spatial and temporal domains would be very beneficial to this research. Therefore,
this thesis focuses on finding the most robust method to tackle the gait-based human re-identification problem and solve it for practical applications. In real-world surveillance scenarios, the human gait cycle is primarily abnormal. These abnormalities include, but are not limited to, changes in temporal and spatial characteristics such as walking speed, broken gait phases and, most importantly, varied camera angles. Our
work performed an extensive literature study on spatial and temporal gait feature extraction
methods with a focus on deep learning. Next, we conducted a comparative
study and proposed a spatial-temporal approach for gait feature extraction using the
fusion of multiple modalities, including optical-flow, raw silhouettes and RGB images.
This approach was tested on two of the most challenging publicly available datasets for
gait recognition TUM-GAID and CASIA-B, with excellent results presented in chapter
3.
Furthermore, a modern spatial-temporal attention mechanism was proposed and tested on the CASIA-B and OULP datasets; it learns salient features independently of the gait cycle and view variations. The spatial attention layer in the proposed method extracts spatial feature maps using a two-layered architecture whose outputs are combined by late fusion. Using these spatial feature maps, it can attend discriminatively to identity-related salient regions in silhouette sequences. The temporal attention layer
consists of an LSTM that encodes the temporal motion of silhouette sequences. It uses the encoded output vectors in the temporal attention architecture to focus on the most critical timesteps in the gait cycle and discard the rest. Furthermore, we improved the performance of our method by mapping the extracted spatial-temporal gait features to a discriminative null space for use in our Siamese architecture for cross-matching.
We also conducted an element-removal (ablation) experiment on each segment of our spatial-temporal attention network to gain insight into each component's contribution to the performance. Our method showed outstanding robustness against abnormal gait cycles as well as viewpoint variations on both benchmark datasets.
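The temporal attention step described in the abstract above can be sketched as a softmax weighting over per-timestep encodings. In this illustration the LSTM outputs are replaced by random vectors and the scoring vector is untrained; both are assumptions of the sketch, not the thesis's trained model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def temporal_attention(encoded, w):
    """encoded: (T, D) per-timestep vectors (LSTM outputs in the thesis);
    w: (D,) scoring vector (random here, for illustration only).
    Returns attention weights and the attention-weighted summary."""
    scores = encoded @ w            # one scalar score per timestep
    alpha = softmax(scores)         # weights sum to 1
    summary = alpha @ encoded       # emphasises high-weight timesteps
    return alpha, summary

rng = np.random.default_rng(1)
encoded = rng.normal(size=(10, 8))   # 10 timesteps, 8-dim encodings
alpha, summary = temporal_attention(encoded, rng.normal(size=8))
assert np.isclose(alpha.sum(), 1.0) and summary.shape == (8,)
```

Discarding "the rest" of the gait cycle corresponds to the low-weight timesteps contributing almost nothing to the summary vector.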
Investigation of gait representations and partial body gait recognition
Recognising an individual by the way they walk has been one of the most popular research subjects in the field of soft biometrics over the last few decades. Advances in technology and equipment such as Closed-Circuit Television (CCTV), wireless internet and wearable sensors make it easier than ever to obtain gait data. The gait biometric can be used widely in areas such as biomedicine, forensics and surveillance. However, gait recognition still faces many challenges and fundamental issues. These problems motivate researchers to explore various gait topics, overcome the challenges and improve the field of gait recognition.
Gait recognition currently performs well under very specific conditions such as normal walking, limited obstruction from certain types of clothing and fixed camera view angles. When these conditions change, the classification rate drops dramatically. This study aims to address the problems of clothing, carried objects and camera view angles in an indoor environment with video-based data collection. Two gait-related databases are used for testing in this study: CASIA dataset B and the OU-ISIR Large Population dataset with Bag (OU-LP-Bag). Three main tasks are tested with CASIA dataset B, while only gait recognition is tested with OU-LP-Bag.
A gait recognition framework is developed to address the three main tasks: gait recognition from an identical view, view classification and cross-view recognition. This framework takes a gait image sequence as input and generates a gait compact image. Next, gait features are extracted via the optimal feature map obtained by Principal Component Analysis (PCA), and a linear Support Vector Machine (SVM) is then used as the one-against-all multiclass classifier.
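The PCA-plus-linear-SVM stage of the framework can be sketched as below. The synthetic data stands in for flattened gait compact images (an assumption of this sketch); scikit-learn's LinearSVC trains one-vs-rest classifiers by default, which matches the one-against-all setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for gait compact images: 60 samples, 3 subjects,
# 256-dimensional flattened images (synthetic, illustration only).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(loc=c, size=(20, 256)) for c in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 20)

# PCA yields the compact feature map; the linear SVM then
# classifies each subject against all others (one-vs-rest).
clf = make_pipeline(PCA(n_components=10), LinearSVC())
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy on the toy data
```

In the study itself the input vectors would be the gait compact images described below, with PCA dimensionality chosen on validation data rather than fixed at 10.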
Four gait compact images are used as basic gait representations: Gait Energy Image (GEI), Gait Entropy Image (GEnI), Gait Gaussian Image (GGI) and a novel gait image called Gait Gaussian Entropy Image (GGEnI). Three secondary gait representations are then generated from these basic representations: Gradient Histogram Gait Image (GHGI) and two novel representations called Convolutional Gait Image (CGI) and Convolutional Gradient Histogram Gait Image (CGHGI). All representations are tested on the three main tasks.
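Of the representations above, the GEI is the simplest to sketch: it is the pixel-wise mean of aligned, size-normalised binary silhouettes over a gait cycle. The toy 3x3 silhouettes below are an assumption of this illustration.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """GEI: pixel-wise mean of aligned, size-normalised binary
    silhouettes over one gait cycle (values in [0, 1])."""
    stack = np.asarray(silhouettes, dtype=float)
    return stack.mean(axis=0)

# Two toy 3x3 binary silhouettes (illustration only).
s1 = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
s2 = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]])
gei = gait_energy_image([s1, s2])
assert gei[1, 0] == 0.5 and gei.max() == 1.0
```

Static regions (pixels that are silhouette in every frame) come out near 1, while swinging limbs produce intermediate values, which is what makes the GEI a compact summary of motion.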
When people walk, each body part carries different locomotion information; for example, there is much more motion in the legs than in the shoulders. Moreover, clothing and carried objects do not affect every part of the body equally; a handbag, for example, does not generally affect leg motion. This study divides the human body into fourteen different body parts based on height, and body parts and gait representations are combined to solve the three main tasks. Three combined-part techniques, each using two different parts, are proposed. The first is Part Score Fusion (PSF), which sums the scores of the two models, one per part, and selects the model with the highest summed score. The second is Part Image Fusion (PIF), which concatenates two parts into a single image at a 1:1 ratio and selects the highest-scoring model generated from the fused image. The third is Multi Region Duplication (MRD), which follows the same idea as PIF but increases the second part's ratio to 1:2, 1:3 and 1:4. These techniques are tested on gait recognition from an identical view.
In conclusion, the general framework is effective for all three main tasks. GHGI-GEI generated from the full silhouette is the most effective representation for gait recognition from an identical view and for cross-view recognition. GHGI-GGI with the lower-knee region is the most effective representation for view-angle
classification. The GHGI-GEI PIF combination of the full body and limb parts is the most effective combination on OU-LP-Bag. A more detailed description of each aspect is given in the following chapters.
Shortest Route at Dynamic Location with Node Combination-Dijkstra Algorithm
Abstract— Online transportation has become a basic need of the general public, supporting daily activities such as travelling to work, to school or to tourist sights. Public transportation services compete to provide the best service so that consumers feel comfortable using them; one aspect of this is finding the shortest route when picking up a customer or delivering to a destination. The node combination method can minimise memory usage and is more optimal than A* and Ant Colony for shortest-route search in the manner of Dijkstra's algorithm, but it cannot store the history of nodes that have been passed. The plain node combination algorithm is therefore well suited to finding the shortest distance, but not the shortest route. This paper modifies the node combination algorithm to solve the problem of finding the shortest route at a dynamic location obtained from the transport fleet, displaying the nodes that form the shortest path, and implements it in a geographic information system in the form of a map to make the system easy to use.
Keywords— Shortest Path, Dijkstra Algorithm, Node Combination, Dynamic Location
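The route-versus-distance distinction above is the crux: a shortest-path search recovers the route only if each node's predecessor is stored. The sketch below is ordinary Dijkstra with predecessor tracking, not the paper's node-combination variant; the toy graph is an assumption of this illustration.

```python
import heapq

def dijkstra_with_route(graph, start, goal):
    """Dijkstra's algorithm that also records each node's predecessor,
    so the shortest *route* (not just its length) can be read back.
    graph: {node: [(neighbour, weight), ...]}."""
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    # Walk the stored predecessors back from goal to start.
    route, node = [goal], goal
    while node != start:
        node = prev[node]
        route.append(node)
    return dist[goal], route[::-1]

g = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 6)], "C": [("D", 3)]}
print(dijkstra_with_route(g, "A", "D"))  # (6, ['A', 'B', 'C', 'D'])
```

The paper's contribution is achieving the same route recovery within the memory-efficient node combination scheme, which normally discards exactly this predecessor history.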
GII Representation-Based Cross-View Gait Recognition by Discriminative Projection With List-Wise Constraints
Remote person identification by gait is one of the most important topics in computer vision and pattern recognition. However, gait recognition suffers severely from the appearance variance caused by view changes. It is very common for gait recognition to perform well when the view is fixed, yet drop sharply when the view variance becomes significant. Existing approaches have tried all kinds of strategies, such as tensor analysis or view transformation models, to slow this performance decrease, but there is still room for improvement. In this paper, a discriminative projection with list-wise constraints (DPLC) is proposed to deal with view variance in cross-view gait recognition; it is further refined by introducing a rectification term that automatically captures the principal discriminative information. The DPLC with rectification (DPLCR) embeds list-wise relative similarity measurements among intra-class and inter-class individuals, which allows a more discriminative and robust projection to be learned. Based on the original DPLCR, we introduce the kernel trick to exploit nonlinear cross-view correlations and extend DPLCR to the problem of multi-view gait recognition. Moreover, a simple yet efficient gait representation based on the gait energy image, namely the gait individuality image (GII), is proposed, which better captures the discriminative information for cross-view gait recognition. Experiments conducted on the CASIA-B database demonstrate the outstanding performance of both the DPLCR framework and the new GII representation. DPLCR-based cross-view gait recognition outperforms the state-of-the-art approaches in almost all cases under large view variance, and the combination of the GII representation and the DPLCR further enhances performance, setting a new benchmark for cross-view gait recognition.
Probabilistic 3D Pose Recovery and Action Recognition
Doctoral dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2016. Advisor: Songhwai Oh.
These days, computer vision technology has become popular and plays an important role in intelligent systems such as augmented reality and video and image analysis, to name a few. Although cost-effective depth cameras, like the Microsoft Kinect, have recently been developed, most computer vision algorithms assume that observations are obtained from RGB cameras, which produce 2D observations. If, somehow, we can estimate 3D information from 2D observations, it may give better solutions for many computer vision problems.
In this dissertation, we focus on estimating 3D information from 2D observations, which is well known as non-rigid structure from motion (NRSfM).
More formally, NRSfM finds the three-dimensional structure of an object by analysing image streams under the assumption that the object lies in a low-dimensional space. However, a human body observed over a long period can exhibit complex shape variations, which makes NRSfM challenging due to the increased degrees of freedom. In order to handle complex shape variations, we propose a Procrustean normal distribution mixture model (PNDMM) by extending the recently proposed Procrustean normal distribution (PND), which captures the distribution of non-rigid variations of an object while excluding the effects of rigid motion.
Unlike existing methods, which use a single model to solve an NRSfM problem, the proposed PNDMM decomposes complex shape variations into a collection of simpler ones, so that model learning becomes more tractable and accurate. We perform experiments showing that the proposed method outperforms existing methods on highly complex and long human motion sequences.
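The decomposition idea, splitting complex variation into a mixture of simpler components learned by EM, can be illustrated with a plain one-dimensional Gaussian mixture. This is not the PNDMM itself (which additionally factors out rigid motion and works on shapes); the quantile initialisation and toy data below are assumptions of this sketch.

```python
import numpy as np

def em_gmm_1d(x, k, iters=50):
    """Minimal EM for a 1-D Gaussian mixture: complex variation is
    decomposed into k simpler Gaussian components."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread initial means
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        dens = (pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances
        n = r.sum(axis=0)
        pi, mu = n / len(x), (r * x[:, None]).sum(axis=0) / n
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n
    return pi, mu, var

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-5, 1, 300), rng.normal(5, 1, 300)])
pi, mu, var = em_gmm_1d(x, k=2)  # recovers means near -5 and +5
```

Each component only has to model a simple cluster, which is the same reason a mixture of PNDs is easier to learn accurately than a single PND over all shape variations.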
In addition, we extend the PNDMM to a single view 3D human pose estimation problem. While recovering a 3D structure of a human body from an image is important, it is a highly ambiguous problem due to the deformation of an articulated human body. Moreover, before estimating a 3D human pose from a 2D human pose, it is important to obtain an accurate 2D human pose. In order to address inaccuracy of 2D pose estimation on a single image and 3D human pose ambiguities, we estimate multiple 2D and 3D human pose candidates and select the best one which can be explained by a 2D human pose detector and a 3D shape model. We also introduce a model transformation which is incorporated into the 3D shape prior model, such that the proposed method can be applied to a novel test image.
Experimental results show that the proposed method can provide good 3D reconstruction results when tested on a novel test image, despite inaccuracies of 2D part detections and 3D shape ambiguities.
Finally, we handle the problem of action recognition from a video clip. Current studies show that high-level features obtained from estimated 2D human poses enable action recognition performance beyond current state-of-the-art methods that use low- and mid-level features based on appearance and motion, despite the inaccuracy of human pose estimation. Based on these findings, we propose an action recognition method using estimated 3D human pose information, since the proposed PNDMM is able to reconstruct 3D shapes from 2D shapes. Experimental results show that 3D-pose-based descriptors are better than 2D-pose-based descriptors for action recognition, regardless of the classification method. Considering that we use simple 3D pose descriptors based on a 3D shape model learned from 2D shapes, the results reported in this dissertation are promising, and obtaining accurate 3D information from 2D observations remains an important research issue for reliable computer vision systems.
Chapter 1 Introduction
1.1 Motivation
1.2 Research Issues
1.3 Organization of the Dissertation
Chapter 2 Preliminary
2.1 Generalized Procrustes Analysis (GPA)
2.2 EM-GPA Algorithm
2.2.1 Objective function
2.2.2 E-step
2.2.3 M-step
2.3 Implementation Considerations for EM-GPA
2.3.1 Preprocessing stage
2.3.2 Small update rate for the covariance matrix
2.4 Experiments
2.4.1 Shape alignment with missing information
2.4.2 3D shape modeling
2.4.3 2D+3D active appearance models
2.5 Chapter Summary and Discussion
Chapter 3 Procrustean Normal Distribution Mixture Model
3.1 Non-Rigid Structure from Motion
3.2 Procrustean Normal Distribution (PND)
3.3 PND Mixture Model
3.4 Learning a PNDMM
3.4.1 E-step
3.4.2 M-step
3.5 Learning an Adaptive PNDMM
3.6 Experiments
3.6.1 Experimental setup
3.6.2 CMU Mocap database
3.6.3 UMPM dataset
3.6.4 Simple and short motions
3.6.5 Real sequence - qualitative representation
3.7 Chapter Summary
Chapter 4 Recovering a 3D Human Pose from a Novel Image
4.1 Single View 3D Human Pose Estimation
4.2 Candidate Generation
4.2.1 Initial pose generation
4.2.2 Part recombination
4.3 3D Shape Prior Model
4.3.1 Procrustean mixture model learning
4.3.2 Procrustean mixture model fitting
4.4 Model Transformation
4.4.1 Model normalization
4.4.2 Model adaptation
4.5 Result Selection
4.6 Experiments
4.6.1 Implementation details
4.6.2 Evaluation of the joint 2D and 3D pose estimation
4.6.3 Evaluation of the 2D pose estimation
4.6.4 Evaluation of the 3D pose estimation
4.7 Chapter Summary
Chapter 5 Application to Action Recognition
5.1 Appearance and Motion Based Descriptors
5.2 2D Pose Based Descriptors
5.3 Bag-of-Features with a Multiple Kernel Method
5.4 Classification - Kernel Group Sparse Representation
5.4.1 Group sparse representation for classification
5.4.2 Kernel group sparse (KGS) representation for classification
5.5 Experiment on sub-JHMDB Dataset
5.5.1 Experimental setup
5.5.2 3D pose based descriptor
5.5.3 Experimental results
5.6 Chapter Summary
Chapter 6 Conclusion and Future Work
Appendices
A Proof of Propositions in Chapter 2
A.1 Proof of Proposition 1
A.2 Proof of Proposition 3
A.3 Proof of Proposition 4
B Calculation of p(Xi | Di) in Chapter 3
B.1 Without the Dirac-delta term
B.2 With the Dirac-delta term
C Procrustean Mixture Model Learning and Fitting in Chapter 4
C.1 Procrustean Mixture Model Learning
C.2 Procrustean Mixture Model Fitting
Bibliography
Abstract (in Korean)