2,768 research outputs found

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher accuracy rate and lower computational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared. Comment: Published 201
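    The review contrasts simple constant metrics with learned (optimised) metrics applied to extracted feature vectors. As a minimal illustrative sketch, not taken from the paper, the snippet below compares a constant Euclidean distance with a Mahalanobis-style learned metric parameterised by a matrix M; all names and dimensions are assumptions.

```python
import numpy as np

def euclidean_distance(x, y):
    """Constant metric: plain Euclidean distance between two feature vectors."""
    return np.linalg.norm(x - y)

def mahalanobis_distance(x, y, M):
    """Learned metric: d_M(x, y) = sqrt((x - y)^T M (x - y)),
    where M is a positive semi-definite matrix obtained by metric learning."""
    d = x - y
    return np.sqrt(d @ M @ d)

# Toy example with random descriptors standing in for colour/texture features.
rng = np.random.default_rng(0)
x, y = rng.random(64), rng.random(64)
M = np.eye(64)  # with M = I the learned metric reduces to Euclidean distance
print(euclidean_distance(x, y), mahalanobis_distance(x, y, M))
```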

    Sistema para análise automatizada de movimento durante a marcha usando uma câmara RGB-D

    Get PDF
    Nowadays it is still common in clinical practice to assess the gait (or way of walking) of a given subject through visual observation and the use of a rating scale, which is a subjective approach. However, sensors including RGB-D cameras, such as the Microsoft Kinect, can be used to obtain quantitative information that allows gait analysis to be performed in a more objective way. The quantitative gait analysis results can be very useful, for example, to support the clinical assessment of patients with diseases that can affect their gait, such as Parkinson's disease. The main motivation of this thesis was thus to support gait assessment by enabling quantitative gait analysis to be carried out in an automated way. This objective was achieved by using 3-D data, provided by a single RGB-D camera, to automatically select the data corresponding to walking and then detect the gait cycles performed by the subject while walking. For each detected gait cycle, we obtain several gait parameters, which are used together with anthropometric measures to automatically identify the subject being assessed. The automated gait data selection relies on machine learning techniques to recognize three different activities (walking, standing, and marching), as well as two different positions of the subject in relation to the camera (facing the camera and facing away from it). For gait cycle detection, we developed an algorithm that estimates the instants corresponding to given gait events. Subject identification based on gait is enabled by a solution that also relies on machine learning. The developed solutions were integrated into a system for automated gait analysis, which we found to be a viable alternative to gold standard systems for obtaining several spatiotemporal and some kinematic gait parameters. Furthermore, the system is suitable for use in clinical environments as well as ambulatory scenarios, since it relies on a single markerless RGB-D camera that is less expensive, more portable, less intrusive and easier to set up when compared with the gold standard systems (multiple cameras and several markers attached to the subject's body).
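    The thesis detects gait cycles by estimating the instants of given gait events from the 3-D data of a single RGB-D camera. The abstract does not specify the algorithm, so the following is only a hypothetical sketch of one common strategy: detecting heel-strike-like events as peaks of the forward ankle-to-pelvis distance in a skeleton stream and cutting cycles between consecutive events of the same foot. The signal names, frame rate and minimum cycle duration are assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_gait_cycles(ankle_x, pelvis_x, fps=30):
    """Detect heel-strike-like events as peaks of the forward ankle-pelvis
    distance, then return (start, end) frame indices of each gait cycle.

    ankle_x, pelvis_x: 1-D arrays with the forward (walking-direction)
    coordinate of one ankle and of the pelvis, one value per frame.
    """
    forward_dist = ankle_x - pelvis_x              # largest just before heel strike
    min_gap = int(0.7 * fps)                       # assume cycles last >= 0.7 s
    strikes, _ = find_peaks(forward_dist, distance=min_gap)
    # A gait cycle of one leg spans two consecutive heel strikes of that leg.
    return list(zip(strikes[:-1], strikes[1:]))

# Example with a synthetic 1 Hz oscillation standing in for real skeleton data.
t = np.arange(0, 5, 1 / 30)
cycles = detect_gait_cycles(0.3 * np.sin(2 * np.pi * t), np.zeros_like(t))
print(cycles)
```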

    Novel Architecture for Human Re-Identification with a Two-Stream Neural Network and Attention Mechanism

    Get PDF
    This paper proposes a novel architecture that utilises an attention mechanism in conjunction with multi-stream convolutional neural networks (CNNs) to obtain high accuracy in human re-identification (ReID). The proposed architecture consists of four blocks. First, the pre-processing block prepares the input data and feeds it into a spatial-temporal two-stream CNN (STC) with two fusion points that extracts the spatial-temporal features. Next, the spatial-temporal attentional LSTM block (STA) automatically fine-tunes the extracted features and assigns weights to the more critical frames in the video sequence by using an attention mechanism. Extensive experiments on four of the most popular datasets support our architecture. Finally, the results are compared with the state of the art, which shows the superiority of this approach.
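    The abstract describes an attention mechanism that weights the more informative frames of a video sequence after spatial-temporal features have been extracted. The PyTorch-style sketch below illustrates that general idea of temporal attention over per-frame features; it is not the paper's STC/STA blocks, and the layer size and feature dimension are assumptions.

```python
import torch
import torch.nn as nn

class TemporalAttentionPool(nn.Module):
    """Score each frame's feature vector, softmax the scores over time,
    and return the attention-weighted average as the sequence descriptor."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, 1)

    def forward(self, frame_feats):                # (batch, time, feat_dim)
        scores = self.scorer(frame_feats)          # (batch, time, 1)
        weights = torch.softmax(scores, dim=1)     # attention over time
        return (weights * frame_feats).sum(dim=1)  # (batch, feat_dim)

# Example: 2 sequences of 16 frames, 512-D features from some CNN backbone.
pooled = TemporalAttentionPool()(torch.randn(2, 16, 512))
print(pooled.shape)  # torch.Size([2, 512])
```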

    Re-identification and semantic retrieval of pedestrians in video surveillance scenarios

    Get PDF
    Person re-identification consists of recognizing individuals across different sensors of a camera network. Whereas clothing appearance cues are widely used, other modalities could be exploited as additional information sources, like anthropometric measures and gait. In this work we investigate whether the re-identification accuracy of clothing appearance descriptors can be improved by fusing them with anthropometric measures extracted from depth data, using RGB-D sensors, in unconstrained settings. We also propose a dissimilarity-based framework for building and fusing multi-modal descriptors of pedestrian images for re-identification tasks, as an alternative to the widely used score-level fusion. The experimental evaluation is carried out on two data sets including RGB-D data, one of which is a novel, publicly available data set that we acquired using Kinect sensors. In this dissertation we also consider a related task, named semantic retrieval of pedestrians in video surveillance scenarios, which consists of searching images of individuals using a textual description of clothing appearance as a query, given by a Boolean combination of predefined attributes. This can be useful in applications like forensic video analysis, where the query can be obtained from an eyewitness report. We propose a general method for implementing semantic retrieval as an extension of a given re-identification system that uses any multiple part-multiple component appearance descriptor. Additionally, we investigate deep learning techniques to improve both the accuracy and the generalization capabilities of attribute detectors. Finally, we experimentally evaluate our methods on several benchmark datasets originally built for re-identification tasks.
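    The work proposes a dissimilarity-based framework for building and fusing multi-modal descriptors as an alternative to score-level fusion. As a hedged sketch of the general dissimilarity-representation idea only, each modality's descriptor is mapped below to a vector of distances to a set of prototypes and the per-modality vectors are concatenated; the prototype choice, distance and dimensions are illustrative, not the dissertation's exact construction.

```python
import numpy as np

def dissimilarity_vector(descriptor, prototypes):
    """Represent a descriptor by its distances to a fixed set of prototypes."""
    return np.array([np.linalg.norm(descriptor - p) for p in prototypes])

def fuse_modalities(appearance, anthropometric, app_protos, anthro_protos):
    """Concatenate per-modality dissimilarity vectors into one representation."""
    return np.concatenate([
        dissimilarity_vector(appearance, app_protos),
        dissimilarity_vector(anthropometric, anthro_protos),
    ])

# Toy example: 128-D clothing-appearance descriptor and 5 anthropometric measures,
# each compared against 10 prototypes per modality.
rng = np.random.default_rng(1)
fused = fuse_modalities(rng.random(128), rng.random(5),
                        rng.random((10, 128)), rng.random((10, 5)))
print(fused.shape)  # (20,) -> matched with e.g. nearest-neighbour search
```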

    Autonomous Person-Specific Following Robot

    Full text link
    Following a specific user is a desired or even required capability for service robots in many human-robot collaborative applications. However, most existing person-following robots follow people without knowledge of whom they are following. In this paper, we propose an identity-specific person tracker, capable of tracking and identifying nearby people, to enable person-specific following. Our proposed method uses a Sequential Nearest Neighbour with Thresholding Selection algorithm that we devised to fuse an anonymous person tracker and a face recogniser. Experimental results comparing our proposed method with alternative approaches showed that our method achieves better performance in tracking and identifying people, as well as improved robot performance in following a target individual.
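    The paper fuses an anonymous person tracker with a face recogniser via its Sequential Nearest Neighbour with Thresholding Selection algorithm. The sketch below is not that algorithm, only a generic illustration of thresholded nearest-neighbour assignment of a recognised identity to the closest track; the function names, coordinate convention and threshold value are assumptions.

```python
import numpy as np

def assign_identity_to_track(face_position, face_identity, track_positions,
                             max_distance=0.5):
    """Attach a recognised identity to the nearest anonymous track, but only
    if the nearest track lies within max_distance (in metres); otherwise
    report no confident match."""
    distances = [np.linalg.norm(face_position - p) for p in track_positions]
    best = int(np.argmin(distances))
    if distances[best] <= max_distance:
        return best, face_identity
    return None, None  # detection too far from every track; skip fusion

# Example: a face recognised at (1.0, 2.0) m and two tracked people.
track_id, identity = assign_identity_to_track(
    np.array([1.0, 2.0]), "target_user",
    [np.array([0.9, 2.1]), np.array([3.0, 0.5])])
print(track_id, identity)  # 0 target_user
```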

    Gait Recognition: Databases, Representations, and Applications

    No full text
    There has been considerable progress in automatic recognition of people by the way they walk since its inception almost 20 years ago: there is now a plethora of techniques and data which continue to show that a person's walking pattern is indeed unique. Gait recognition is a behavioural biometric which is available even at a distance from a camera, when other biometrics may be occluded, obscured or suffering from insufficient image resolution (e.g. a blurred face image or a face occluded by a mask). Since gait recognition does not require subject cooperation, owing to its non-invasive capturing process, it is expected to be applied to criminal investigation from CCTV footage in public and private spaces. This article introduces current progress, the research background, and basic approaches for gait recognition in the first three sections; two important aspects of gait recognition, gait databases and gait feature representations, are described in the following sections. Publicly available gait databases are essential for benchmarking individual approaches, and such databases should contain a sufficient number of subjects as well as covariate factors to enable statistically reliable performance evaluation and robust gait recognition. Gait recognition researchers have therefore built useful gait databases which incorporate subject diversity and/or rich covariate factors. Gait feature representation is another important aspect for effective and efficient gait recognition. We describe the two main approaches to representation: model-free (appearance-based) approaches and model-based approaches. In particular, silhouette-based model-free approaches predominate in recent studies; many have been proposed and are described in detail. Performance evaluation results of such recent gait feature representations are reported on two of the publicly available gait databases: USF Human ID, with rich covariate factors such as view, surface, bag, shoes and elapsed time; and OU-ISIR LP, with more than 4,000 subjects. Since gait recognition is suitable for criminal investigation, applications of gait recognition to forensics are addressed with real criminal cases in the application section. Finally, several open problems of gait recognition are discussed to show future research avenues.
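    The article highlights silhouette-based model-free representations. One of the most widely used is the Gait Energy Image (GEI), obtained by averaging the aligned binary silhouettes of one gait cycle; the article does not single out this representation, so the sketch below is only an example of the family, assuming the silhouettes are already size-normalised and centred.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average a stack of aligned binary silhouettes (frames, height, width)
    from one gait cycle into a single grey-level Gait Energy Image (GEI)."""
    silhouettes = np.asarray(silhouettes, dtype=np.float32)
    return silhouettes.mean(axis=0)

# Toy example: 30 random binary "silhouette" frames of size 128 x 88.
rng = np.random.default_rng(2)
gei = gait_energy_image(rng.integers(0, 2, size=(30, 128, 88)))
print(gei.shape, gei.min(), gei.max())
# Matching is then typically done by comparing GEIs, e.g. with Euclidean distance.
```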

    PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS

    Full text link
    The presence of surveillance systems in our lives has drastically increased during the last years. Camera networks can be seen in almost every crowded public and private place, and they generate huge amounts of data with valuable information. The automatic analysis of these data plays an important role in extracting relevant information from the scene. In particular, the problem of person re-identification is a prominent topic that has become of great interest, especially for the fields of security and marketing. However, there are some factors, such as changes in the illumination conditions, variations in the person's pose, occlusions or the presence of outliers, that make this topic really challenging. Fortunately, the recent introduction of new technologies such as depth cameras opens new paradigms in the image processing field and brings new possibilities. This Thesis proposes a new complete framework to tackle the problem of person re-identification using commercial RGB-depth cameras. This work includes the analysis and evaluation of new approaches for the modules of segmentation, tracking, description and matching. To evaluate our contributions, a public dataset for person re-identification using RGB-depth cameras has been created. RGB-depth cameras provide accurate 3D point clouds with color information. Based on the analysis of the depth information, a novel algorithm for person segmentation is proposed and evaluated. This method accurately segments any person in the scene, and naturally copes with occlusions and connected people. The segmentation mask of a person generates a 3D person cloud, which can be easily tracked over time based on proximity. The accumulation of all the person point clouds over time generates a set of high-dimensional color features, named raw features, that provide useful information about the person's appearance. In this Thesis, we propose a family of methods to extract relevant information from the raw features in different ways. The first approach compacts the raw features into a single color vector, named Bodyprint, that provides a good generalisation of the person's appearance over time. Second, we introduce the concept of 3D Bodyprint, which is an extension of the Bodyprint descriptor that includes the angular distribution of the color features. Third, we characterise the person's appearance as a bag of color features that are independently generated over time. This descriptor receives the name of Bag of Appearances because of its similarity to the concept of Bag of Words. Finally, we use different probabilistic latent variable models to reduce the feature vectors from a statistical perspective. The evaluation of the methods demonstrates that our proposals outperform the state of the art.

    Oliver Moll, J. (2015). PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/59227
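    The thesis compacts the color features accumulated from a tracked person point cloud into descriptors such as Bodyprint. The abstract does not give the exact construction, so the sketch below is a hypothetical illustration of the general idea only: slicing the person cloud by height, averaging the color in each slice per frame, and averaging over frames into one vector. The number of slices, color space and array layout are assumptions.

```python
import numpy as np

def height_slice_colour_descriptor(frames, n_slices=20):
    """Compact a sequence of coloured person point clouds into one vector.

    frames: list of (N_i, 6) arrays with columns x, y, z, r, g, b, where z is
    height. Each frame yields n_slices mean RGB triplets; the per-frame
    descriptors are then averaged over time.
    """
    per_frame = []
    for cloud in frames:
        z, rgb = cloud[:, 2], cloud[:, 3:6]
        edges = np.linspace(z.min(), z.max(), n_slices + 1)
        slices = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (z >= lo) & (z <= hi)
            slices.append(rgb[mask].mean(axis=0) if mask.any() else np.zeros(3))
        per_frame.append(np.concatenate(slices))
    return np.mean(per_frame, axis=0)  # shape (n_slices * 3,)

# Toy example: 5 frames of 1000 random coloured points each.
rng = np.random.default_rng(3)
desc = height_slice_colour_descriptor([rng.random((1000, 6)) for _ in range(5)])
print(desc.shape)  # (60,)
```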

    Gait Recognition

    Get PDF
    Gait recognition has received increasing attention as a remote biometric identification technology, i.e. it can achieve identification at long distances where few other identification technologies can work. It shows enormous potential for application in criminal investigation, medical treatment, identity recognition, human-computer interaction and so on. In the first part of this chapter, we introduce state-of-the-art gait recognition techniques, including 3D-based and 2D-based methods. Considering the advantages of 3D-based methods, the second part introduces their related datasets as well as our gait database, which contains both 2D silhouette images and 3D joint information. In the third part, given our gait dataset, a human walking model and the corresponding static and dynamic feature extraction are presented and verified to be view-invariant. Finally, some gait-based applications are introduced.
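    The chapter builds static and dynamic gait features from 3D joint information. As a hedged sketch of typical choices rather than the chapter's exact features, the snippet below computes a limb length as a static feature and a knee-angle time series as a dynamic feature; the joint indices and skeleton layout are assumptions.

```python
import numpy as np

def limb_length(joints, a, b):
    """Static feature: mean 3D distance between two joints over a sequence.

    joints: array of shape (frames, n_joints, 3); a, b: joint indices."""
    return np.linalg.norm(joints[:, a] - joints[:, b], axis=1).mean()

def knee_angle_series(joints, hip, knee, ankle):
    """Dynamic feature: knee flexion angle (radians) for each frame."""
    thigh = joints[:, hip] - joints[:, knee]
    shank = joints[:, ankle] - joints[:, knee]
    cos = np.sum(thigh * shank, axis=1) / (
        np.linalg.norm(thigh, axis=1) * np.linalg.norm(shank, axis=1))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Toy example with random joints: 100 frames, 20 joints (indices illustrative).
rng = np.random.default_rng(4)
joints = rng.random((100, 20, 3))
print(limb_length(joints, 12, 13), knee_angle_series(joints, 12, 13, 14).shape)
```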