
    Review of Person Re-identification Techniques

    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve higher accuracy and lower computational cost. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared. Comment: Published 201
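
The contrast drawn in this abstract between simple constant metrics and optimised (learned) metrics can be illustrated with a minimal sketch. It is not taken from the reviewed paper: the function names, the toy feature vectors, and the matrix `M` are assumptions chosen for illustration, and `M` is simply hard-coded here, whereas a real metric-learning method would estimate it from labelled pairs.

```python
import numpy as np

def euclidean_distance(x, y):
    """Fixed (constant) metric: every feature dimension counts equally."""
    return float(np.linalg.norm(x - y))

def mahalanobis_distance(x, y, M):
    """Metric parameterised by a positive semi-definite matrix M.

    In metric learning, M would be estimated from training pairs;
    here it is supplied by hand (an assumption for illustration).
    """
    d = x - y
    return float(np.sqrt(d @ M @ d))

def rank_gallery(probe, gallery, dist):
    """Return gallery indices sorted from most to least similar to the probe."""
    scores = np.array([dist(probe, g) for g in gallery])
    return np.argsort(scores)

# Toy data: one probe vector and two gallery vectors (hypothetical features).
probe = np.array([1.0, 0.0])
gallery = np.array([[0.0, 0.0],
                    [1.0, 2.0]])

# Hypothetical metric that trusts dimension 0 and discounts dimension 1.
M = np.diag([4.0, 0.25])

fixed_ranking = rank_gallery(probe, gallery, euclidean_distance)
weighted_ranking = rank_gallery(
    probe, gallery, lambda a, b: mahalanobis_distance(a, b, M)
)
```

With these toy values the two metrics disagree on the best match, which is the reason a metric adapted to the data can change re-identification results even when the features are unchanged.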


    Face recognition has been a long-standing problem in computer vision. General face recognition is challenging because of large appearance variability due to factors including pose, ambient lighting, expression, size of the face, age, and distance from the camera. There are very accurate techniques to perform face recognition in controlled environments, especially when large numbers of samples are available for each face (individual). However, face identification under uncontrolled (unconstrained) environments or with limited training data is still an unsolved problem. There are two face recognition tasks: face identification (who is who in a probe face set, given a gallery face set) and face verification (same or not, given two faces). In this work, we study both face identification and verification in unconstrained environments. Firstly, we propose a face verification framework that combines Partial Least Squares (PLS) and the One-Shot similarity model [1]. The idea is to describe a face with a large feature set combining shape, texture and color information. PLS regression is applied to perform multi-channel feature weighting on this large feature set. Finally, the PLS regression is used to compute the similarity score of an image pair by One-Shot learning (using a fixed negative set). Secondly, we study face identification with image sets, where the gallery and probe are sets of face images of an individual. We model a face set by its covariance matrix (COV), which is a natural second-order statistic of a sample set. By exploring an efficient metric for SPD matrices, i.e., the Log-Euclidean Distance (LED), we derive a kernel function that explicitly maps the covariance matrix from the Riemannian manifold to Euclidean space. Then, discriminative learning is performed on the COV manifold: the learning aims to maximize the between-class COV distance and minimize the within-class COV distance.
Sparse representation and dictionary learning have been widely used in face recognition, especially when large numbers of samples are available for each face (individual). Sparse coding is promising since it provides a more stable and discriminative face representation. In the last part of our work, we explore sparse coding and dictionary learning for face verification. More specifically, in one approach, we apply sparse representations to face verification in two ways, using a fixed reference set as the dictionary. In the other approach, we propose a dictionary learning framework with explicit pairwise constraints, which unifies discriminative dictionary learning for the pair matching (face verification) and classification (face recognition) problems.
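
The covariance-based set model and the Log-Euclidean Distance mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard definition, d(A, B) = ||log(A) - log(B)||_F for SPD matrices A and B, and not the authors' kernel or their discriminative learning step. The function names and the small ridge term `eps` (added so the covariance stays strictly positive definite) are assumptions.

```python
import numpy as np

def set_covariance(features, eps=1e-6):
    """Covariance matrix of an image set.

    features: array of shape (n_samples, dim), one feature vector per image,
              with dim >= 2.
    eps: small ridge term (an assumption) keeping the matrix positive definite.
    """
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_distance(a, b):
    """Frobenius norm between the matrix logarithms of two SPD matrices."""
    return float(np.linalg.norm(spd_log(a) - spd_log(b), ord="fro"))

# Toy usage with random "face sets" (hypothetical 5-dimensional features).
rng = np.random.default_rng(0)
set_a = rng.normal(size=(40, 5))
set_b = rng.normal(size=(30, 5)) * 2.0
dist_ab = log_euclidean_distance(set_covariance(set_a), set_covariance(set_b))
```

Because the matrix logarithm maps SPD matrices into the vector space of symmetric matrices, ordinary Euclidean tools can be applied after the mapping, which is the property the abstract relies on.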

    Re-identifying people in the crowd

    Developing an automated surveillance system is of great interest for various reasons including forensic and security applications. In the case of a network of surveillance cameras with non-overlapping fields of view, person detection and tracking alone are insufficient to track a subject of interest across the network. In this case, instances of a person captured in one camera view need to be retrieved among a gallery of different people in other camera views. This vision problem is commonly known as person re-identification (re-id). Cross-view instances of pedestrians exhibit varied levels of illumination, viewpoint, and pose variations, which makes the problem very challenging. Despite recent progress towards improving accuracy, existing systems suffer from low applicability to real-world scenarios. This is mainly caused by the need for large amounts of annotated data from pairwise camera views to be available for training. Given the difficulty of obtaining such data and annotating it, this thesis aims to bring the person re-id problem a step closer to real-world deployment. In the first contribution, the single-shot protocol, where each individual is represented by a pair of images that need to be matched, is considered. Following the extensive annotation of four datasets for six attributes, an evaluation of the most widely used feature extraction schemes is conducted. The results reveal two high-performing descriptors among those evaluated, and show illumination variation to have the most impact on re-id accuracy. Motivated by the wide availability of videos from surveillance cameras and the additional visual and temporal information they provide, video-based person re-id is then investigated, and a supervised system is developed. This is achieved by improving and extending the best performing image-based person descriptor into three dimensions and combining it with distance metric learning.
The resulting system achieves state-of-the-art results on two widely used datasets. Given the cost and difficulty of obtaining labelled data from pairwise cameras in a network to train the model, an unsupervised video-based person re-id method is also developed. It is based on a set-based distance measure that leverages rank vectors to estimate the similarity scores between person tracklets. The proposed system outperforms other unsupervised methods by a large margin on two datasets while competing with deep learning methods on another large-scale dataset.

    BiCov: a novel image representation for person re-identification and face verification

    This paper proposes a novel image representation which can properly handle both background and illumination variations. It is therefore well suited to person and face re-identification tasks, avoiding the use of any additional pre-processing steps such as foreground-background separation or face and body part segmentation. This novel representation relies on the combination of Biologically Inspired Features (BIF) and covariance descriptors used to compute the similarity of the BIF features at neighboring scales. Hence, we will refer to it as the BiCov representation. To show the effectiveness of BiCov, this paper conducts experiments on two person re-identification tasks (VIPeR and ETHZ) and one face verification task (LFW), on which it improves the current state-of-the-art performance.

    Person Re-identification by Local Maximal Occurrence Representation and Metric Learning

    Person re-identification is an important technique towards automatic search of a person's presence in a surveillance video. Two fundamental problems are critical for person re-identification: feature representation and metric learning. An effective feature representation should be robust to illumination and viewpoint changes, and a discriminant metric should be learned to match various person images. In this paper, we propose an effective feature representation called Local Maximal Occurrence (LOMO), and a subspace and metric learning method called Cross-view Quadratic Discriminant Analysis (XQDA). The LOMO feature analyzes the horizontal occurrence of local features, and maximizes the occurrence to make a stable representation against viewpoint changes. Besides, to handle illumination variations, we apply the Retinex transform and a scale invariant texture operator. To learn a discriminant metric, we propose to learn a discriminant low dimensional subspace by cross-view quadratic discriminant analysis, and simultaneously, a QDA metric is learned on the derived subspace. We also present a practical computation method for XQDA, as well as its regularization. Experiments on four challenging person re-identification databases, VIPeR, QMUL GRID, CUHK Campus, and CUHK03, show that the proposed method improves the state-of-the-art rank-1 identification rates by 2.2%, 4.88%, 28.91%, and 31.55% on the four databases, respectively. Comment: This paper has been accepted by CVPR 2015. For source codes and extracted features please visit http://www.cbsr.ia.ac.cn/users/scliao/projects/lomo_xqda
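
The rank-1 identification rate quoted in this abstract is commonly computed from a probe-by-gallery distance matrix. The sketch below shows one way to do that, assuming a simple closed-set setting in which every probe identity appears in the gallery. The function name `rank_k_rate`, the toy distances, and the identity labels are illustrative assumptions, not the paper's evaluation code.

```python
import numpy as np

def rank_k_rate(dist, probe_ids, gallery_ids, k=1):
    """Fraction of probes whose true identity is among the k nearest gallery items.

    dist: array of shape (n_probe, n_gallery); smaller means more similar.
    probe_ids, gallery_ids: identity labels for the rows and columns of dist.
    """
    dist = np.asarray(dist)
    probe_ids = np.asarray(probe_ids)
    gallery_ids = np.asarray(gallery_ids)
    # Indices of the k closest gallery items for each probe.
    nearest = np.argsort(dist, axis=1)[:, :k]
    # A probe counts as a hit if any of its k nearest items shares its identity.
    hits = (gallery_ids[nearest] == probe_ids[:, None]).any(axis=1)
    return float(hits.mean())

# Toy distance matrix for three probes and three gallery images (made-up values).
dist = np.array([[0.1, 0.9, 0.5],
                 [0.8, 0.7, 0.2],
                 [0.6, 0.3, 0.4]])
ids = [0, 1, 2]

rank1 = rank_k_rate(dist, ids, ids, k=1)
rank2 = rank_k_rate(dist, ids, ids, k=2)
```

In this toy example only the first probe is matched correctly at rank 1, while all three are matched within the top two, so the rate rises with k in the way a cumulative matching curve does.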