    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing may run in real time or take hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    A Data Fusion Perspective on Human Motion Analysis Including Multiple Camera Applications

    Proceedings of: 5th International Work-Conference on the Interplay Between Natural and Artificial Computation (IWINAC 2013), Mallorca, Spain, June 10-14. Human motion analysis methods have received increasing attention during the last two decades. In parallel, data fusion technologies have emerged as a powerful tool for the estimation of properties of objects in the real world. This paper presents a view of human motion analysis from the viewpoint of data fusion. The JDL process model and Dasarathy's input-output hierarchy are employed to categorize the works in the area. A survey of the literature in human motion analysis from multiple cameras is included. Future research directions in the area are identified after this review.

    Evaluation of face recognition algorithms under noise

    One of the major applications of computer vision and image processing is face recognition, where a computerized algorithm automatically identifies a person’s face from a large image dataset or even from a live video. This thesis addresses facial recognition, a topic that has been widely studied due to its importance in many applications in both civilian and military domains. The application of face recognition systems has expanded from security purposes to social networking sites, fraud management, and improving user experience. Numerous algorithms have been designed to perform face recognition with good accuracy. The problem is challenging due to the dynamic nature of the human face and the different poses that it can take. Regardless of the algorithm, facial recognition accuracy can be heavily affected by the presence of noise. This thesis presents a comparison of traditional and deep learning face recognition algorithms in the presence of noise. For this purpose, Gaussian and salt-and-pepper noise are applied to face images drawn from the ORL dataset. Recognition is performed using each of the following eight algorithms: principal component analysis (PCA), two-dimensional PCA (2D-PCA), linear discriminant analysis (LDA), independent component analysis (ICA), discrete cosine transform (DCT), support vector machine (SVM), convolutional neural network (CNN) and AlexNet. The ORL dataset is used in the experiments to calculate the evaluation accuracy of each of the investigated algorithms. Each algorithm is evaluated in two experiments: in the first, only one image per person is used for training, whereas in the second, five images per person are used. The traditional algorithms are implemented in MATLAB and the deep learning algorithms in Python.
The results show that, among the traditional algorithms, the best performance was obtained using the DCT algorithm with 92% dominant eigenvalues, at 95.25% accuracy, whereas among the deep learning algorithms the best performance was obtained using a CNN, at 97.95% accuracy, which makes the CNN the best choice under noisy conditions.
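
The two noise models named in this abstract can be illustrated with a minimal, dependency-free sketch. This is not the thesis's code: the function names, the noise parameters (`sigma`, `density`) and the flat 8x8 test image are assumptions made for illustration, and the abstract does not state which noise levels were used.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel, then clip to [0, 255]."""
    return [
        [min(255, max(0, round(pixel + rng.gauss(0.0, sigma)))) for pixel in row]
        for row in image
    ]


def add_salt_and_pepper(image, density, rng):
    """Replace roughly `density` of the pixels with 0 (pepper) or 255 (salt)."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            u = rng.random()
            if u < density / 2:
                new_row.append(0)        # pepper
            elif u < density:
                new_row.append(255)      # salt
            else:
                new_row.append(pixel)    # pixel left untouched
        noisy.append(new_row)
    return noisy


# Hypothetical 8x8 mid-grey image standing in for a face image.
rng = random.Random(0)
image = [[128] * 8 for _ in range(8)]
gaussian = add_gaussian_noise(image, sigma=10.0, rng=rng)
salt_pepper = add_salt_and_pepper(image, density=0.2, rng=rng)
```

Gaussian noise perturbs every pixel slightly, whereas salt-and-pepper noise leaves most pixels intact and drives a few to the extremes, which is one reason the two can affect recognition algorithms differently.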

    Covariate-invariant gait recognition using random subspace method and its extensions

    Compared with other biometric traits like fingerprint or iris, the most significant advantage of gait is that it can be used for remote human identification without cooperation from the subjects. The technology of gait recognition may play an important role in crime prevention, law enforcement, etc. Yet the performance of automatic gait recognition may be affected by covariate factors such as speed, carrying condition, elapsed time, shoe, walking surface, clothing, camera viewpoint, video quality, etc. In this thesis, we propose a random subspace method (RSM) based classifier ensemble framework and its extensions for robust gait recognition. Covariates change the human gait appearance in different ways. For example, speed may change the appearance of human arms or legs; camera viewpoint alters the human visual appearance in a global manner; carrying condition and clothing may change the appearance of any part of the human body (depending on what is being carried or worn). Due to the unpredictable nature of covariates, it is difficult to collect all the representative training data. We claim that overfitting may be the main problem that hampers the performance of gait recognition algorithms that rely on learning. First, for speed-invariant gait recognition, we employ a basic RSM model, which can reduce the generalisation errors by combining a large number of weak classifiers at the decision level (i.e., by using majority voting). We find that the performance of RSM decreases when the intra-class variations are large. In RSM, although weak classifiers with lower dimensionality tend to have better generalisation ability, they may have to contend with the underfitting problem if the dimensionality is too low. We thus enhance the RSM-based weak classifiers by extending RSM to multimodal-RSM. In tackling the elapsed time covariate, we use face information to enhance the RSM-based gait classifiers before the decision-level fusion.
We find that a significant performance gain can be achieved when lower weight is assigned to the face information. We also employ a weak form of multimodal-RSM for gait recognition from low quality videos (with low resolution and low frame-rate) when other modalities are unavailable. In this case, model-based information is used to enhance the RSM-based weak classifiers. Then we point out the relationship between base classifier accuracy, classifier ensemble accuracy, and diversity among the base classifiers. By incorporating the model-based information (with lower weight) into the RSM-based weak classifiers, the diversity of the classifiers, which is positively correlated with the ensemble accuracy, can be enhanced. In contrast to multimodal systems, large intra-class variations may have a significant impact on unimodal systems. We model the effect of various unknown covariates as a partial feature corruption problem with unknown locations in the spatial domain. By making some assumptions in an ideal-case analysis, we provide the theoretical basis of the RSM-based classifier ensemble in the application of covariate-invariant gait recognition. However, in real cases, these assumptions may not hold precisely, and the performance may be affected when the intra-class variations are large. We propose a criterion to address this issue: in the decision-level fusion stage, for a query gait with unknown covariates, we need to dynamically suppress the ratio of false votes to true votes before the majority voting. Two strategies are employed, i.e., local enhancing (LE), which can increase true votes, and the proposed hybrid decision-level fusion (HDF), which can decrease false votes. Based on this criterion, the proposed RSM-based HDF (RSM-HDF) framework achieves very competitive performance in tackling covariates such as walking surface, clothing, and elapsed time, which were deemed open questions. The factor of camera viewpoint is different from other covariates.
It alters the human appearance in a global manner. By employing unitary projection (UP), we form a new space in which samples of the same subject from different views are closer together. However, it may also give rise to a large amount of feature distortion. We treat these distortions as corrupted features with unknown locations in the new space (after UP), and use the RSM-HDF framework to address this issue. Robust view-invariant gait recognition can be achieved by using the UP-RSM-HDF framework. In this thesis, we propose an RSM-based classifier ensemble framework and its extensions to realise covariate-invariant gait recognition. It is less sensitive to most of the covariate factors such as speed, shoe, carrying condition, walking surface, video quality, clothing, elapsed time, camera viewpoint, etc., and it outperforms other state-of-the-art algorithms significantly on all the major public gait databases. Specifically, our method can achieve very competitive performance against (large changes in) view, clothing, walking surface, elapsed time, etc., which were deemed the most difficult covariate factors.
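
The basic idea of a random subspace ensemble with majority voting, applied to a query whose features are partially corrupted at unknown locations, can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's framework: the 1-nearest-neighbour weak classifiers, the 12-dimensional feature vectors, the two subjects and all function names are hypothetical, and the LE and HDF strategies are not modelled.

```python
import random
from collections import Counter


def squared_distance(a, b, dims):
    """Squared Euclidean distance restricted to the feature indices in `dims`."""
    return sum((a[d] - b[d]) ** 2 for d in dims)


def nearest_label(gallery, labels, query, dims):
    """Weak classifier: 1-nearest-neighbour label using only the selected features."""
    best = min(
        range(len(gallery)),
        key=lambda i: squared_distance(gallery[i], query, dims),
    )
    return labels[best]


def draw_subspaces(n_features, n_classifiers, subspace_size, rng):
    """Each weak classifier is defined by a random subset of feature indices."""
    return [rng.sample(range(n_features), subspace_size) for _ in range(n_classifiers)]


def predict_by_majority(subspaces, gallery, labels, query):
    """Decision-level fusion: every weak classifier casts one vote."""
    votes = Counter(nearest_label(gallery, labels, query, dims) for dims in subspaces)
    return votes.most_common(1)[0][0]


# Hypothetical gallery with one template per subject.
gallery = [[0.0] * 12, [1.0] * 12]
labels = ["A", "B"]
# Query from subject A whose last two features are corrupted by an unknown covariate.
query = [0.0] * 10 + [4.0, 4.0]

rng = random.Random(0)
subspaces = draw_subspaces(n_features=12, n_classifiers=301, subspace_size=2, rng=rng)
full_space_label = nearest_label(gallery, labels, query, range(12))
ensemble_label = predict_by_majority(subspaces, gallery, labels, query)
```

In this toy case, a single classifier over all 12 features is pulled toward subject B by the two corrupted features, while most 2-feature subspaces (45 of the 66 possible pairs) avoid the corrupted features entirely and vote for subject A, so the majority vote should recover the correct subject.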

    Principal Component Analysis

    This book aims to raise awareness among researchers, scientists and engineers of the benefits of Principal Component Analysis (PCA) in data analysis. In this book, the reader will find applications of PCA in fields such as image processing, biometrics, face recognition and speech processing. It also includes the core concepts and the state-of-the-art methods in data analysis and feature extraction.

    Loughborough University Spontaneous Expression Database and baseline results for automatic emotion recognition

    The study of facial expressions in humans dates back to the 19th century, and the study of the emotions that these facial expressions portray dates back even further. It is a natural part of non-verbal communication for humans to convey messages using facial expressions, either consciously or subconsciously; it is also routine for other humans to recognize these facial expressions and understand or deduce the underlying emotions they represent. Over two decades ago, following technological advances, particularly in the area of image processing, research began into the use of machines for the recognition of facial expressions from images with the aim of inferring the corresponding emotion. Given a previously unknown test sample, the supervised learning problem is to accurately determine the facial expression class to which the test sample belongs, using the known class memberships of each image in a set of training images. The solution to this problem, building an effective classifier to recognize the facial expression, hinges on the availability of representative training data. To date, much of the research in the area of Facial Expression Recognition (FER) is still based on posed (acted) facial expression databases, which are often exaggerated and therefore not representative of real-life affective displays; as such, there is a need for more publicly accessible spontaneous databases that are well labelled. This thesis therefore reports on the development of the newly collected Loughborough University Spontaneous Expression Database (LUSED), designed to bolster the development of new recognition systems and to provide a benchmark for researchers to compare results with more natural expression classes than most existing databases. To collect the database, an experiment was set up in which volunteers were discreetly videotaped while they watched a selection of emotion-inducing video clips.
The utility of the new LUSED dataset is validated using both traditional and more recent pattern recognition techniques. (1) Baseline results are presented using combinations of Principal Component Analysis (PCA), Fisher Linear Discriminant Analysis (FLDA) and their kernel variants, Kernel Principal Component Analysis (KPCA) and Kernel Fisher Discriminant Analysis (KFDA), with a Nearest Neighbour-based classifier. These results are compared with the performance on an existing natural expression database, the Natural Visible and Infrared Expression (NVIE) database. A scheme for the recognition of encrypted facial expression images is also presented. (2) Benchmark results are presented by combining PCA, FLDA, KPCA and KFDA with a Sparse Representation-based Classifier (SRC). A maximum accuracy of 68% was obtained when recognizing five expression classes, which compares favourably with the known maximum for a natural database, around 70% obtained on NVIE when recognizing only three classes.

    Biometric face recognition using multilinear projection and artificial intelligence

    PhD Thesis. Numerous problems of automatic facial recognition in linear and multilinear subspace learning have been addressed; nevertheless, many difficulties remain. This work focuses on two key problems for automatic facial recognition and feature extraction: object representation and high dimensionality. To address these problems, a bidirectional two-dimensional neighborhood preserving projection (B2DNPP) approach for human facial recognition has been developed. Compared with 2DNPP, the proposed method operates on 2-D facial images and performs reduction along both the row and column directions of images. Furthermore, it has the ability to reveal variations between these directions. To further improve the performance of the B2DNPP method, a new B2DNPP based on the curvelet decomposition of human facial images (C-B2DNPP) is introduced. The curvelet multi-resolution tool enhances the representation of edges and other singularities along curves, and thus improves directional features. In this method, an extreme learning machine (ELM) classifier is used, which significantly improves the classification rate. The proposed C-B2DNPP method decreases the error rate from 5.9% to 3.5%, from 3.7% to 2.0% and from 19.7% to 14.2% on the ORL, AR, and FERET databases, respectively, compared with 2DNPP. It therefore achieves relative decreases in error rate of more than 40%, 45%, and 27% on the ORL, AR, and FERET databases, respectively. Facial images have particular natural structures in the form of two-, three-, or even higher-order tensors. Therefore, a novel method of supervised and unsupervised multilinear neighborhood preserving projection (MNPP) is proposed for face recognition. This allows the natural representation of multidimensional images as 2-D, 3-D or higher-order tensors and extracts useful information directly from tensorial data rather than from matrices or vectors.
As opposed to B2DNPP, which derives only two subspaces, the MNPP method obtains multiple interrelated subspaces over different tensor directions; the subspaces are learned iteratively by unfolding the tensor along the different directions. The performance of MNPP has been evaluated in terms of the two modes of facial recognition biometric systems, identification and verification. The proposed supervised MNPP method achieved decreases in error rate of over 50.8%, 75.6%, and 44.6% on the ORL, AR, and FERET databases, respectively, compared with 2DNPP. The results therefore demonstrate that the MNPP approach obtains the best overall performance in various learning scenarios.
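
The percentage decreases quoted for C-B2DNPP appear to be relative reductions in error rate, which can be checked directly from the before and after figures given in the abstract. The short sketch below does that arithmetic; the function name is an illustrative assumption, and the MNPP figures cannot be checked the same way because the abstract does not give the underlying error rates.

```python
def relative_reduction(baseline_error, new_error):
    """Relative decrease in error rate, in percent of the baseline error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# (2DNPP error %, C-B2DNPP error %) as quoted in the abstract.
reported = {"ORL": (5.9, 3.5), "AR": (3.7, 2.0), "FERET": (19.7, 14.2)}

reductions = {
    name: relative_reduction(old, new) for name, (old, new) in reported.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative reduction")
```

The computed values are about 40.7%, 45.9% and 27.9%, consistent with the stated "more than 40%, 45%, and 27%".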