
    Re-identification by Covariance Descriptors

    This chapter addresses the problem of appearance matching using the covariance descriptor. We tackle the extremely challenging case in which the same non-rigid object has to be matched across disjoint camera views. Covariance statistics averaged over a Riemannian manifold are fundamental for designing appearance models invariant to camera changes. We discuss different ways of extracting an object's appearance under various training strategies. Appearance matching is enhanced either by discriminative analysis using images from a single camera or by selecting distinctive features in a covariance metric space using data from two cameras. By selecting only the features essential for a specific class of objects (e.g. humans), without defining an a priori feature vector for extracting the covariance, we remove redundancy from the covariance descriptor and ensure low computational cost. By using a feature selection technique instead of learning on a manifold, we avoid the over-fitting problem. The proposed models have been successfully applied to the person re-identification task, in which a human appearance has to be matched across non-overlapping cameras. We carry out detailed experiments on the suggested strategies, demonstrating their pros and cons w.r.t. recognition rate and suitability for video analytics systems.
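    A minimal sketch of the underlying descriptor, assuming a Tuzel-style region covariance over a hand-picked pixel feature set (the feature choice below is illustrative, not the one produced by the chapter's feature selection step):

    import numpy as np

    def region_covariance(image: np.ndarray) -> np.ndarray:
        """Covariance of per-pixel features [x, y, I, |Ix|, |Iy|] over a grayscale region."""
        h, w = image.shape
        ys, xs = np.mgrid[0:h, 0:w]
        iy, ix = np.gradient(image.astype(np.float64))  # gradients along rows (y) and columns (x)
        feats = np.stack([xs.ravel(), ys.ravel(), image.ravel().astype(np.float64),
                          np.abs(ix).ravel(), np.abs(iy).ravel()], axis=1)
        return np.cov(feats, rowvar=False)  # 5x5 symmetric positive semi-definite descriptor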

    Person re-identification employing 3D scene information

    This paper addresses the person re-identification task in a real-world scenario. Finding people in a network of cameras is challenging due to significant variations in lighting conditions, different colour responses and different camera viewpoints. State-of-the-art algorithms are likely to fail due to serious perspective and pose changes. Most existing approaches try to cope with all these changes by applying metric learning tools to find a transfer function between a camera pair, while ignoring the body alignment issue. Additionally, this transfer function usually depends on the camera pair and requires labeled training data for each camera, which might be unattainable in a large camera network. In this paper we employ 3D scene information to minimise perspective distortions and estimate the target pose. The estimated pose is further used to split a target trajectory into reliable chunks, each with a uniform pose. These chunks are matched through a network of cameras using a previously learned metric pool. Instead of learning transfer functions that cope with all appearance variations, we propose to learn a generic metric pool that focuses only on pose changes. This pool consists of metrics, each learned to match a specific pair of poses rather than a specific camera pair. Automatically estimated poses determine the proper metric, thus improving matching. We show that metrics learned using only a single camera can significantly improve matching across the whole camera network, providing a scalable solution. We validated our approach on publicly available datasets, demonstrating an increase in re-identification performance.
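    A minimal sketch of the pose-indexed metric pool idea, assuming the metric matrices have already been learned by some Mahalanobis-style metric learning step (the class name and pose discretisation are illustrative, not the paper's):

    import numpy as np

    class MetricPool:
        """Metrics keyed by (pose, pose) pairs rather than camera pairs."""
        def __init__(self, metrics):
            self.metrics = metrics  # dict: (pose_a, pose_b) -> SPD matrix M

        def distance(self, feat_a, feat_b, pose_a, pose_b):
            # pick the metric learned for this pose pair (order-insensitive)
            m = self.metrics[tuple(sorted((pose_a, pose_b)))]
            d = np.asarray(feat_a) - np.asarray(feat_b)
            return float(d @ m @ d)  # squared Mahalanobis distance under M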

    People detection and re-identification for multi surveillance cameras

    Re-identifying people in a network of non-overlapping cameras requires people to be accurately detected and tracked in order to build a strong visual signature of their appearance. Traditional surveillance cameras do not provide image resolution high enough for iris recognition algorithms. State-of-the-art face recognition cannot easily be applied to surveillance videos, as people need to face the camera at close range. The different lighting environments of camera scenes and the strong illumination variability that occurs as people walk through a scene induce great variability in their appearance. In addition, people's images occlude each other on the image plane, making people detection difficult to achieve. We propose novel simplified Local Binary Pattern features to detect people, heads and faces. A Mean Riemannian Covariance Grid (MRCG) is used to model the appearance of tracked people and obtain a highly discriminative human signature. The methods are evaluated and compared with state-of-the-art algorithms. We have created a new dataset from a network of two cameras, showing the usefulness of our system for detecting, tracking and re-identifying people using appearance and face features.
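    A minimal sketch of the standard 8-neighbour Local Binary Pattern the detector builds on (the paper's simplified variant is not reproduced here):

    import numpy as np

    def lbp8(image: np.ndarray) -> np.ndarray:
        """Basic 3x3 LBP codes for the interior pixels of a grayscale image."""
        img = image.astype(np.int32)
        h, w = img.shape
        centre = img[1:-1, 1:-1]
        # neighbours clockwise from top-left; each comparison contributes one bit
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
        code = np.zeros_like(centre)
        for bit, (dy, dx) in enumerate(offsets):
            neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
            code |= (neighbour >= centre).astype(np.int32) << bit
        return code.astype(np.uint8)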

    Human Re-identification System On Highly Parallel GPU and CPU Architectures

    The paper presents a new approach to the human re-identification problem using covariance features. In many cases, a distance operator between signatures, based on generalized eigenvalues, has to be computed efficiently, especially when a real-time response is expected from the system. This is a challenging problem, as many of the procedures are computationally intensive and must be repeated constantly. To deal with this problem we have designed and tested a new video surveillance system. To obtain the required high efficiency we took advantage of highly parallel computing architectures such as FPGA, GPU and CPU units to perform the calculations, and we propose a new GPU-based implementation of the distance operator for querying the example database. In this paper we present an experimental evaluation of the proposed solution in terms of database response time as a function of database size.
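    A plain CPU reference for the distance in question, assuming the standard covariance dissimilarity based on generalized eigenvalues (Förstner-Moonen); the paper's GPU kernel itself is not reproduced:

    import numpy as np
    from scipy.linalg import eigh

    def covariance_distance(c1: np.ndarray, c2: np.ndarray) -> float:
        """rho(C1, C2) = sqrt(sum_i ln^2 lambda_i), with lambda_i the generalized eigenvalues of (C1, C2)."""
        lam = eigh(c1, c2, eigvals_only=True)  # solves C1 x = lambda C2 x for SPD matrices
        return float(np.sqrt(np.sum(np.log(lam) ** 2)))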

    Surpassing Real-World Source Training Data: Random 3D Characters for Generalizable Person Re-Identification

    Person re-identification has seen significant advancement in recent years. However, the ability of learned models to generalize to unknown target domains remains limited. One possible reason is the lack of large-scale, diverse source training data, since manually labeling such a dataset is very expensive and privacy-sensitive. To address this, we propose to automatically synthesize a large-scale person re-identification dataset following a set-up similar to real surveillance but in virtual environments, and then use the synthesized person images to train a generalizable person re-identification model. Specifically, we design a method to generate a large number of random UV texture maps and use them to create different 3D clothing models. Then, an automated procedure randomly generates various 3D characters with diverse clothes, races and attributes. Next, we simulate a number of different virtual environments using Unity3D, with customized camera networks similar to real surveillance systems, and import multiple 3D characters at the same time, with various movements and interactions along different paths through the camera networks. As a result, we obtain a virtual dataset, called RandPerson, with 1,801,816 person images of 8,000 identities. By training person re-identification models on these synthesized person images, we demonstrate, for the first time, that models trained on virtual data can generalize well to unseen target images, surpassing models trained on various real-world datasets, including CUHK03, Market-1501 and DukeMTMC-reID, and almost surpassing those trained on MSMT17. The RandPerson dataset is available at this https URL.

    Boosted human re-identification using Riemannian manifolds

    This paper presents an appearance-based model to address the human re-identification problem, an important and still unsolved task in computer vision. In many systems there is a requirement to identify individuals or determine whether a given individual has already appeared over a network of cameras. The human appearance obtained in one camera is usually different from that obtained in another camera. In order to re-identify people, a human signature should handle differences in illumination, pose and camera parameters. The paper focuses on a new appearance model based on Mean Riemannian Covariance (MRC) patches extracted from tracks of a particular individual. A new similarity measure using Riemannian manifold theory is also proposed to distinguish sets of patches belonging to a specific individual. We investigate the significance of MRC patches based on their reliability, extracted during tracking, and their discriminative power, obtained by a boosting scheme. Our method is evaluated and compared with the state of the art using benchmark video sequences from the ETHZ and i-LIDS datasets. Re-identification performance is presented using a cumulative matching characteristic (CMC) curve. We demonstrate that the proposed approach outperforms state-of-the-art methods. Finally, the results of our approach are shown on two further, more pertinent datasets.
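    A minimal sketch of a mean covariance on the SPD manifold (a Karcher/Fréchet mean under the affine-invariant metric), the kind of Riemannian averaging an MRC-style model relies on; the iteration count and tolerance below are illustrative:

    import numpy as np
    from scipy.linalg import expm, inv, logm, sqrtm

    def karcher_mean(covs, iters=20, tol=1e-6):
        """Iterative Karcher mean of a list of SPD covariance matrices."""
        mean = np.mean(covs, axis=0)  # Euclidean mean as initialisation
        for _ in range(iters):
            s = sqrtm(mean)
            s_inv = inv(s)
            # average the log-maps of all matrices in the tangent space at the current mean
            t = np.mean([logm(s_inv @ c @ s_inv) for c in covs], axis=0)
            mean = s @ expm(t) @ s  # exp-map back onto the manifold
            if np.linalg.norm(t) < tol:
                break
        return mean.real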

    Representing Visual Appearance by Video Brownian Covariance Descriptor for Human Action Recognition

    This paper addresses the problem of recognizing human actions in video sequences. Recent studies have shown that methods using bag-of-features and space-time features achieve high recognition accuracy. Such methods extract both appearance-based and motion-based features; this paper focuses only on appearance features. We propose to model relationships between different pixel-level appearance features, such as intensity and gradient, using Brownian covariance, a natural extension of the classical covariance measure. While classical covariance can model only linear relationships, Brownian covariance models all kinds of possible relationships. We propose a method to compute Brownian covariance on the space-time volume of a video sequence. We show that the proposed Video Brownian Covariance (VBC) descriptor carries information complementary to the Histogram of Oriented Gradients (HOG) descriptor. The fusion of these two descriptors gives a significant improvement in performance on three challenging action recognition datasets.
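    A minimal sketch of the sample distance (Brownian) covariance the descriptor builds on, using the plain O(n^2) estimator rather than the paper's space-time volume computation:

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def distance_covariance(x: np.ndarray, y: np.ndarray) -> float:
        """Sample distance covariance of paired observations x, y of length n."""
        def centred(a):
            a = np.asarray(a, dtype=np.float64).reshape(len(a), -1)
            d = squareform(pdist(a))  # pairwise Euclidean distance matrix
            return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()
        ax, by = centred(x), centred(y)
        return float(np.sqrt(np.mean(ax * by)))  # non-negative by construction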

    Improving Person Re-identification by Viewpoint Cues

    Re-identifying people in a network of cameras requires an invariant human representation. State-of-the-art algorithms are likely to fail in real-world scenarios due to serious perspective changes. Most existing approaches focus on invariant and discriminative features while ignoring the body alignment issue. In this paper we propose three methods for improving the performance of person re-identification, focusing on eliminating perspective distortions by using 3D scene information. Perspective changes are minimized by (1) affine transformations of cropped images containing the target. We further estimate the human pose for (2) clustering data from a video stream and (3) weighting image features. The pose is estimated using 3D scene information and the motion of the target. We validated our approach on a publicly available dataset with a network of 8 cameras. The results demonstrate a significant increase in re-identification performance over the state of the art.
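    A minimal sketch of method (1), assuming the perspective correction reduces to an in-plane rotation whose angle comes from 3D scene calibration (the angle estimation itself is not shown, and the function name is illustrative):

    import cv2
    import numpy as np

    def upright_crop(crop: np.ndarray, tilt_deg: float) -> np.ndarray:
        """Warp a cropped detection so the target's body axis becomes vertical."""
        h, w = crop.shape[:2]
        m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), tilt_deg, 1.0)  # rotate about the crop centre
        return cv2.warpAffine(crop, m, (w, h), flags=cv2.INTER_LINEAR)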