12 research outputs found

    Person re-identification by robust canonical correlation analysis

    Get PDF
    Person re-identification is the task to match people in surveillance cameras at different time and location. Due to significant view and pose change across non-overlapping cameras, directly matching data from different views is a challenging issue to solve. In this letter, we propose a robust canonical correlation analysis (ROCCA) to match people from different views in a coherent subspace. Given a small training set as in most re-identification problems, direct application of canonical correlation analysis (CCA) may lead to poor performance due to the inaccuracy in estimating the data covariance matrices. The proposed ROCCA with shrinkage estimation and smoothing technique is simple to implement and can robustly estimate the data covariance matrices with limited training samples. Experimental results on two publicly available datasets show that the proposed ROCCA outperforms regularized CCA (RCCA), and achieves state-of-the-art matching results for person re-identification as compared to the most recent methods

    Multitarget Tracking in Nonoverlapping Cameras Using a Reference Set

    Get PDF
    Tracking multiple targets in nonoverlapping cameras are challenging since the observations of the same targets are often separated by time and space. There might be significant appearance change of a target across camera views caused by variations in illumination conditions, poses, and camera imaging characteristics. Consequently, the same target may appear very different in two cameras. Therefore, associating tracks in different camera views directly based on their appearance similarity is difficult and prone to error. In most previous methods, the appearance similarity is computed either using color histograms or based on pretrained brightness transfer function that maps color between cameras. In this paper, a novel reference set based appearance model is proposed to improve multitarget tracking in a network of nonoverlapping cameras. Contrary to previous work, a reference set is constructed for a pair of cameras, containing subjects appearing in both camera views. For track association, instead of directly comparing the appearance of two targets in different camera views, they are compared indirectly via the reference set. Besides global color histograms, texture and shape features are extracted at different locations of a target, and AdaBoost is used to learn the discriminative power of each feature. The effectiveness of the proposed method over the state of the art on two challenging real-world multicamera video data sets is demonstrated by thorough experiments

    Dense Invariant Feature Based Support Vector Ranking for Cross-Camera Person Re-identification

    Get PDF
    Recently, support vector ranking has been adopted to address the challenging person re-identification problem. However, the ranking model based on ordinary global features cannot well represent the significant variation of pose and viewpoint across camera views. To address this issue, a novel ranking method which fuses the dense invariant features is proposed in this paper to model the variation of images across camera views. An optimal space for ranking is learned by simultaneously maximizing the margin and minimizing the error on the fused features. The proposed method significantly outperforms the original support vector ranking algorithm due to the invariance of the dense invariant features, the fusion of the bidirectional features and the adaptive adjustment of parameters. Experimental results demonstrate that the proposed method is competitive with state-of-the-art methods on two challenging datasets, showing its potential for real-world person re-identification

    Modeling feature distances by orientation driven classifiers for person re-identification

    Get PDF
    6siTo tackle the re-identification challenges existing methods propose to directly match image features or to learn the transformation of features that undergoes between two cameras. Other methods learn optimal similarity measures. However, the performance of all these methods are strongly dependent from the person pose and orientation. We focus on this aspect and introduce three main contributions to the field: (i) to propose a method to extract multiple frames of the same person with different orientations in order to capture the complete person appearance; (ii) to learn the pairwise feature dissimilarities space (PFDS) formed by the subspaces of similar and different image pair orientations; and (iii) within each subspace, a classifier is trained to capture the multi-modal inter-camera transformation of pairwise image dissimilarities and to discriminate between positive and negative pairs. The experiments show the superior performance of the proposed approach with respect to state-of-the-art methods using two publicly available benchmark datasets. © 2016 Elsevier Inc. All rights reserved.partially_openopenGarcía, Jorge; Martinel, Niki; Gardel, Alfredo; Bravo, Ignacio; Foresti, Gian Luca; Micheloni, ChristianGarcía, Jorge; Martinel, Niki; Gardel, Alfredo; Bravo, Ignacio; Foresti, Gian Luca; Micheloni, Christia

    Person Re-Identification using Deep Convnets with Multi-task Learning

    Get PDF

    Biometric Systems

    Get PDF
    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECT) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study

    Person Reidentification Using Deep Convnets With Multitask Learning

    Full text link

    Resource-constrained re-identification in camera networks

    Get PDF
    PhDIn multi-camera surveillance, association of people detected in different camera views over time, known as person re-identification, is a fundamental task. Re-identification is a challenging problem because of changes in the appearance of people under varying camera conditions. Existing approaches focus on improving the re-identification accuracy, while no specific effort has yet been put into efficiently utilising the available resources that are normally limited in a camera network, such as storage, computation and communication capabilities. In this thesis, we aim to perform and improve the task of re-identification under constrained resources. More specifically, we reduce the data needed to represent the appearance of an object through a proposed feature selection method and a difference-vector representation method. The proposed feature-selection method considers the computational cost of feature extraction and the cost of storing the feature descriptor jointly with the feature’s re-identification performance to select the most cost-effective and well-performing features. This selection allows us to improve inter-camera re-identification while reducing storage and computation requirements within each camera. The selected features are ranked in the order of effectiveness, which enable a further reduction by dropping the least effective features when application constraints require this conformity. We also reduce the communication overhead in the camera network by transferring only a difference vector, obtained from the extracted features of an object and the reference features within a camera, as an object representation for the association. In order to reduce the number of possible matches per association, we group the objects appearing within a defined time-interval in un-calibrated camera pairs. Such a grouping improves the re-identification, since only those objects that appear within the same time-interval in a camera pair are needed to be associated. For temporal alignment of cameras, we exploit differences between the frame numbers of the detected objects in a camera pair. Finally, in contrast to pairwise camera associations used in literature, we propose a many-to-one camera association method for re-identification, where multiple cameras can be candidates for having generated the previous detections of an object. We obtain camera-invariant matching scores from the scores obtained using the pairwise re-identification approaches. These scores measure the chances of a correct match between the objects detected in a group of cameras. Experimental results on publicly available and in-lab multi-camera image and video datasets show that the proposed methods successfully reduce storage, computation and communication requirements while improving the re-identification rate compared to existing re-identification approaches

    Minimising Human Annotation for Scalable Person Re-Identification

    Get PDF
    PhDAmong the diverse tasks performed by an intelligent distributed multi-camera surveillance system, person re-identification (re-id) is one of the most essential. Re-id refers to associating an individual or a group of people across non-overlapping cameras at different times and locations, and forms the foundation of a variety of applications ranging from security and forensic search to quotidian retail and health care. Though attracted rapidly increasing academic interests over the past decade, it still remains a non-trivial and unsolved problem for launching a practical reid system in real-world environments, due to the ambiguous and noisy feature of surveillance data and the potentially dramatic visual appearance changes caused by uncontrolled variations in human poses and divergent viewing conditions across distributed camera views. To mitigate such visual ambiguity and appearance variations, most existing re-id approaches rely on constructing fully supervised machine learning models with extensively labelled training datasets which is unscalable for practical applications in the real-world. Particularly, human annotators must exhaustively search over a vast quantity of offline collected data, manually label cross-view matched images of a large population between every possible camera pair. Nonetheless, having the prohibitively expensive human efforts dissipated, a trained re-id model is often not easily generalisable and transferable, due to the elastic and dynamic operating conditions of a surveillance system. With such motivations, this thesis proposes several scalable re-id approaches with significantly reduced human supervision, readily applied to practical applications. More specifically, this thesis has developed and investigated four new approaches for reducing human labelling effort in real-world re-id as follows: Chapter 3 The first approach is affinity mining from unlabelled data. Different from most existing supervised approaches, this work aims to model the discriminative information for reid without exploiting human annotations, but from the vast amount of unlabelled person image data, thus applicable to both semi-supervised and unsupervised re-id. It is non-trivial since the human annotated identity matching correspondence is often the key to discriminative re-id modelling. In this chapter, an alternative strategy is explored by specifically mining two types of affinity relationships among unlabelled data: (1) inter-view data affinity and (2) intra-view data affinity. In particular, with such affinity information encoded as constraints, a Regularised Kernel Subspace Learning model is developed to explicitly reduce inter-view appearance variations and meanwhile enhance intra-view appearance disparity for more discriminative re-id matching. Consequently, annotation costs can be immensely alleviated and a scalable re-id model is readily to be leveraged to plenty of unlabelled data which is inexpensive to collect. Chapter 4 The second approach is saliency discovery from unlabelled data. This chapter continues to investigate the problem of what can be learned in unlabelled images without identity labels annotated by human. Other than affinity mining as proposed by Chapter 3, a different solution is proposed. That is, to discover localised visual appearance saliency of person appearances. Intuitively, salient and atypical appearances of human are able to uniquely and representatively describe and identify an individual, whilst also often robust to view changes and detection variances. Motivated by this, an unsupervised Generative Topic Saliency model is proposed to jointly perform foreground extraction, saliency detection, as well as discriminative re-id matching. This approach completely avoids the exhaustive annotation effort for model training, and thus better scales to real-world applications. Moreover, its automatically discovered re-id saliency representations are shown to be semantically interpretable, suitable for generating useful visual analysis for deployable user-oriented software tools. Chapter 5 The third approach is incremental learning from actively labelled data. Since learning from unlabelled data alone yields less discriminative matching results, and in some cases there will be limited human labelling resources available for re-id modelling, this chapter thus investigate the problem of how to maximise a model’s discriminative capability with minimised labelling efforts. The challenges are to (1) automatically select the most representative data from a vast number of noisy/ambiguous unlabelled data in order to maximise model discrimination capacity; and (2) incrementally update the model parameters to accelerate machine responses and reduce human waiting time. To that end, this thesis proposes a regression based re-id model, characterised by its very fast and efficient incremental model updates. Furthermore, an effective active data sampling algorithm with three novel joint exploration-exploitation criteria is designed, to make automatic data selection feasible with notably reduced human labelling costs. Such an approach ensures annotations to be spent only on very few data samples which are most critical to model’s generalisation capability, instead of being exhausted by blindly labelling many noisy and redundant training samples. Chapter 6 The last technical area of this thesis is human-in-the-loop learning from relevance feedback. Whilst former chapters mainly investigate techniques to reduce human supervision for model training, this chapter motivates a novel research area to further minimise human efforts spent in the re-id deployment stage. In real-world applications where camera network and potential gallery size increases dramatically, even the state-of-the-art re-id models generate much inferior re-id performances and human involvements at deployment stage is inevitable. To minimise such human efforts and maximise re-id performance, this thesis explores an alternative approach to re-id by formulating a hybrid human-computer learning paradigm with humans in the model matching loop. Specifically, a Human Verification Incremental Learning model is formulated which does not require any pre-labelled training data, therefore scalable to new camera pairs; Moreover, the proposed model learns cumulatively from human feedback to provide an instant improvement to re-id ranking of each probe on-the-fly, thus scalable to large gallery sizes. It has been demonstrated that the proposed re-id model achieves significantly superior re-id results whilst only consumes much less human supervision effort. For facilitating a holistic understanding about this thesis, the main studies are summarised and framed into a graphical abstract as shown in Figur
    corecore