3 research outputs found

    Features for Multi-Target Multi-Camera Tracking and Re-Identification

    Multi-Target Multi-Camera Tracking (MTMCT) tracks many people through video taken from several cameras. Person Re-Identification (Re-ID) retrieves from a gallery images of people similar to a person query image. We learn good features for both MTMCT and Re-ID with a convolutional neural network. Our contributions include an adaptive weighted triplet loss for training and a new technique for hard-identity mining. Our method outperforms the state of the art both on the DukeMTMC benchmarks for tracking, and on the Market-1501 and DukeMTMC-ReID benchmarks for Re-ID. We examine the correlation between good Re-ID and good MTMCT scores, and perform ablation studies to elucidate the contributions of the main components of our system. Code is available. Comment: Accepted as spotlight at CVPR 201

    Appearance Descriptors for Person Re-identification: a Comprehensive Review

    In video-surveillance, person re-identification is the task of recognising whether an individual has already been observed over a network of cameras. Typically, this is achieved by exploiting the clothing appearance, as classical biometric traits like the face are impractical in real-world video surveillance scenarios. Clothing appearance is represented by means of low-level local and/or global features of the image, usually extracted according to some part-based body model to treat different body parts (e.g. torso and legs) independently. This paper provides a comprehensive review of current approaches to build appearance descriptors for person re-identification. The most relevant techniques are described in detail, and categorised according to the body models and features used. The aim of this work is to provide a structured body of knowledge and a starting point for researchers willing to conduct novel investigations on this challenging topic.

    Dissimilarity-based people re-identification and search for intelligent video surveillance

    Intelligent video-surveillance is at present one of the most active research fields in computer science. It brings together a wide variety of computer vision and machine learning techniques to provide useful tools for surveillance operators and forensic video analytics. Person re-identification is among these tools; it consists of recognising whether an individual has already been observed over a network of cameras. Person re-identification has various possible applications, e.g., off-line retrieval of all the video-sequences showing an individual of interest whose image is given as query, or on-line pedestrian tracking over multiple cameras. The task is typically achieved by exploiting the clothing appearance, as classical biometric traits like the face are impractical in real-world video surveillance scenarios. Clothing appearance is represented by means of low-level local and global features of the images, usually extracted according to some part-based body model to treat different body parts (e.g. torso and legs) independently. The use of novel sensor technologies, e.g. RGB-D cameras like the MS Kinect, could also allow for the extraction of anthropometric measures from a reconstructed 3D model of the body, that can be used in combination with the clothing appearance to increase recognition accuracy. This thesis presents a novel framework, named Multiple Component Dissimilarity (MCD), to construct descriptors of images of persons, using dissimilarity representations, a recent paradigm in machine learning in which the objects of interest are described as vectors of dissimilarities to a set of predefined prototypes. MCD extends the original dissimilarity paradigm to objects decomposable in multiple parts and with localised characteristics, to better deal with the peculiarities of the human body.
The use of MCD has at least three important advantages: (i) a drastic reduction of computational needs, mostly due to the compactness of dissimilarity representations (basically, small vectors of real numbers, easy to store and very fast to be matched); (ii) a totally generic formulation of the underlying low-level representation, that allows one to combine different descriptors, even if they are heterogeneous in terms of the model and features used, into a single dissimilarity vector; (iii) it provides a natural way to learn high-level concepts from low-level representations. Building on its above salient features, MCD is used in this thesis to achieve several objectives: (i) develop an approach to speed up existing person re-identification methods; (ii) implement a novel person re-identification method based on the combination of different local and global features into a single dissimilarity vector, able to attain state-of-the-art performance; (iv) develop a multi-modal approach to person re-identification (a novelty in the literature), by combining the clothing appearance with anthropometric measures extracted through the use of novel RGB-D sensors, into a single dissimilarity vector; (v) develop a method to perform a novel task, proposed for the first time in this thesis, consisting in finding, among a set of images of individuals, those relevant to a textual, semantic query describing clothing appearance of an individual of interest. This task has been named appearance-based people search and can be useful in applications like forensics video analysis, where a textual description of the individual of interest given by a witness can be available, instead of an image. Person re-identification and appearance-based people search are different tasks, aimed at addressing different problems.
Still, they can be seen as instances of the more general problem of searching and matching people on multi-media data, e.g., video footage, range/depth data, speech audio data. Building on the commonalities with Information Retrieval, in the final part of the thesis, a possible formulation of the task of people search on multimedia data will be proposed, with some suggestions and guidelines on how to exploit the MCD framework for addressing this novel class of problems.
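
The dissimilarity representation mentioned in the abstract above (an object described as a vector of dissimilarities to predefined prototypes, extended to several body parts) can be illustrated with a minimal sketch. This is not the thesis's MCD implementation: the function names, the Euclidean distance, the two-part body model and the toy prototypes are all assumptions chosen for illustration.

```python
import numpy as np


def dissimilarity_vector(descriptor, prototypes):
    """Describe one object by its distance to each prototype.

    descriptor: array of shape (d,)
    prototypes: array of shape (k, d)
    Returns an array of shape (k,). Euclidean distance is an assumption;
    any dissimilarity measure could be substituted.
    """
    return np.linalg.norm(prototypes - descriptor, axis=1)


def multi_part_dissimilarity(part_descriptors, part_prototypes):
    """Concatenate per-part dissimilarity vectors into a single vector.

    Both arguments are dicts keyed by a (hypothetical) body-part name.
    Parts are visited in sorted order so the layout is deterministic.
    """
    return np.concatenate([
        dissimilarity_vector(part_descriptors[part], part_prototypes[part])
        for part in sorted(part_prototypes)
    ])


# Toy prototypes: 2 for the torso, 3 for the legs (made-up values).
part_prototypes = {
    "torso": np.array([[0.0, 0.0], [3.0, 4.0]]),
    "legs": np.array([[1.0, 1.0], [1.0, 2.0], [4.0, 5.0]]),
}

# Toy low-level descriptors for one person image.
person = {"torso": np.array([0.0, 0.0]), "legs": np.array([1.0, 1.0])}

vec = multi_part_dissimilarity(person, part_prototypes)
print(vec)  # legs distances first, then torso distances
```

In this sketch the resulting vector has one entry per prototype, so its length is independent of the dimensionality of the underlying low-level descriptors, which is consistent with the compactness the abstract claims for dissimilarity representations.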