99 research outputs found

    Learning to rank in person re-identification with metric ensembles

    We propose an effective structured learning based approach to the problem of person re-identification which outperforms the current state-of-the-art on most benchmark data sets evaluated. Our framework is built on the basis of multiple low-level hand-crafted and high-level visual features. We then formulate two optimization algorithms, which directly optimize evaluation measures commonly used in person re-identification, also known as the Cumulative Matching Characteristic (CMC) curve. Our new approach is practical for many real-world surveillance applications, as the re-identification performance can be concentrated in the range of most practical importance. The combination of these factors leads to a person re-identification system which outperforms most existing algorithms. More importantly, we advance state-of-the-art results on person re-identification by improving the rank-1 recognition rates from 40% to 50% on the iLIDS benchmark, 16% to 18% on the PRID2011 benchmark, 43% to 46% on the VIPeR benchmark, 34% to 53% on the CUHK01 benchmark and 21% to 62% on the CUHK03 benchmark. Comment: 10 pages
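
The CMC curve mentioned in this abstract reports, for each rank k, the fraction of queries whose correct identity appears among the k closest gallery entries. The following is a minimal sketch of that evaluation measure only, not of the authors' optimization algorithms; the function name `cmc_curve` and the toy distance matrix are assumptions made for illustration.

```python
def cmc_curve(distances, query_ids, gallery_ids, max_rank):
    """Return the CMC values for ranks 1..max_rank.

    distances[i][j] is the distance between query i and gallery entry j.
    """
    hits = [0] * max_rank
    for row, qid in zip(distances, query_ids):
        # Gallery indices sorted from closest to farthest for this query.
        order = sorted(range(len(row)), key=row.__getitem__)
        # 0-based rank of the first gallery entry with the query's identity.
        first = next(
            (r for r, j in enumerate(order) if gallery_ids[j] == qid), None
        )
        if first is not None:
            # A match at rank r counts as a hit for every rank >= r.
            for r in range(first, max_rank):
                hits[r] += 1
    return [h / len(query_ids) for h in hits]


# Toy data (hypothetical): three queries against a three-entry gallery.
distances = [
    [0.2, 0.9, 0.5],
    [0.8, 0.3, 0.1],
    [0.4, 0.6, 0.7],
]
ids = ["a", "b", "c"]
cmc = cmc_curve(distances, ids, ids, max_rank=3)
print(cmc)
```

Here `cmc[0]` is the rank-1 recognition rate, the figure quoted per benchmark in the abstract.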

    Person re-identification via rich color-gradient feature

    © 2016 IEEE. Person re-identification refers to matching the same pedestrian across disjoint views in non-overlapping camera networks. Many local and global features have been put forward in the literature to solve the matching problem, where the color feature is robust to viewpoint variance and the gradient feature provides a rich representation robust to illumination change. However, how to effectively combine the color and gradient features is an open problem. In this paper, to effectively leverage the color-gradient property in multiple color spaces, we propose a novel Second Order Histogram (SOH) feature for person re-identification in large surveillance datasets. First, we utilize discrete encoding to transform commonly used color spaces into an Encoding Color Space (ECS), and calculate the statistical gradient features on each color channel. Then, a second order statistical distribution is calculated on each cell map with a spatial partition. In this way, the proposed SOH feature effectively leverages the statistical properties of gradient and color and also reduces redundant information. Finally, a metric learned by KISSME [1] with Mahalanobis distance is used for person matching. Experimental results on three public datasets, VIPeR, CAVIAR and CUHK01, show the promise of the proposed approach.
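
The final matching step described above ranks gallery descriptors by a Mahalanobis distance d(x, y) = (x - y)^T M (x - y). The sketch below shows only that distance and the resulting ranking. The matrix `M` is a hand-picked, hypothetical positive semi-definite matrix, not one learned by KISSME, and the two-dimensional vectors stand in for real SOH descriptors.

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M d, computed row by row.
    Md = [sum(m * dj for m, dj in zip(row, d)) for row in M]
    return sum(di * mdi for di, mdi in zip(d, Md))


def rank_gallery(probe, gallery, M):
    """Gallery indices ordered from best to worst match for the probe."""
    return sorted(
        range(len(gallery)),
        key=lambda j: mahalanobis_sq(probe, gallery[j], M),
    )


# Hypothetical metric and descriptors, for illustration only.
M = [[2.0, 0.0],
     [0.0, 0.5]]
probe = [1.0, 1.0]
gallery = [[1.0, 3.0], [2.5, 1.0], [1.0, 1.5]]

ranking = rank_gallery(probe, gallery, M)
print(ranking)
```

With this `M`, differences along the first dimension are penalized more heavily than differences along the second, which is the kind of reweighting a learned metric provides.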

    Covariance tracking: architecture optimizations for embedded systems


    Real-time Person Re-identification at the Edge: A Mixed Precision Approach

    A critical part of multi-person multi-camera tracking is the person re-identification (re-ID) algorithm, which recognizes and retains the identities of all detected unknown people throughout the video stream. Many re-ID algorithms today achieve state-of-the-art results, but not much work has been done to explore the deployment of such algorithms in computation- and power-constrained real-time scenarios. In this paper, we study the effect of using a lightweight model, MobileNet-V2, for re-ID and investigate the impact of single (FP32) precision versus half (FP16) precision for training on the server and inference on the edge nodes. We further compare the results with a baseline model which uses ResNet-50 on state-of-the-art benchmarks including CUHK03, Market-1501, and Duke-MTMC. The MobileNet-V2 mixed precision training method improves inference throughput on the edge node by 3.25× (reaching 27.77 fps) and training time on the server by 1.75×, and decreases power consumption on the edge node by 1.45×, while reducing accuracy by only 5.6% relative to single-precision ResNet-50, on average across the three datasets. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid. Comment: This is a pre-print of an article published in International Conference on Image Analysis and Recognition (ICIAR 2019), Lecture Notes in Computer Science. The final authenticated version is available online at https://doi.org/10.1007/978-3-030-27272-2_
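
The accuracy cost of half precision comes from its coarser number representation. As a rough illustration, and not a reproduction of the authors' training or inference pipeline, the sketch below stores one value in IEEE 754 single and half precision using Python's standard `struct` module and compares the rounding errors. The helper `round_trip` and the sample value are assumptions.

```python
import struct


def round_trip(value, fmt):
    """Store a Python float in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


sample = 0.1  # hypothetical feature value

as_fp32 = round_trip(sample, "<f")  # single precision, 32 bits
as_fp16 = round_trip(sample, "<e")  # half precision, 16 bits

err32 = abs(as_fp32 - sample)
err16 = abs(as_fp16 - sample)
print(f"FP32 error: {err32:.3e}, FP16 error: {err16:.3e}")
```

The half-precision error is several orders of magnitude larger, while each stored value takes half the memory, which is the trade-off behind the throughput and power figures reported above.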

    PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS

    The presence of surveillance systems in our lives has increased drastically in recent years. Camera networks can be seen in almost every crowded public and private place, and they generate huge amounts of data with valuable information. The automatic analysis of this data plays an important role in extracting relevant information from the scene. In particular, person re-identification is a prominent topic that has attracted great interest, especially in the fields of security and marketing. However, factors such as changes in illumination conditions, variations in person pose, occlusions or the presence of outliers make this topic really challenging. Fortunately, the recent introduction of new technologies such as depth cameras opens new paradigms in the image processing field and brings new possibilities. This thesis proposes a new complete framework to tackle the problem of person re-identification using commercial RGB-depth cameras. This work includes the analysis and evaluation of new approaches for the modules of segmentation, tracking, description and matching. To evaluate our contributions, a public dataset for person re-identification using RGB-depth cameras has been created. RGB-depth cameras provide accurate 3D point clouds with color information. Based on the analysis of the depth information, a novel algorithm for person segmentation is proposed and evaluated. This method accurately segments any person in the scene, and naturally copes with occlusions and connected people. The segmentation mask of a person generates a 3D person cloud, which can be easily tracked over time based on proximity. The accumulation of all the person point clouds over time generates a set of high-dimensional color features, named raw features, that provides useful information about the person's appearance. In this thesis, we propose a family of methods to extract relevant information from the raw features in different ways. The first approach compacts the raw features into a single color vector, named Bodyprint, that provides a good generalisation of the person's appearance over time. Second, we introduce the concept of the 3D Bodyprint, an extension of the Bodyprint descriptor that includes the angular distribution of the color features. Third, we characterise the person's appearance as a bag of color features that are independently generated over time. This descriptor is named Bag of Appearances because of its similarity to the concept of Bag of Words. Finally, we use different probabilistic latent variable models to reduce the feature vectors from a statistical perspective. The evaluation of the methods demonstrates that our proposals outperform the state of the art. Oliver Moll, J. (2015). PERSON RE-IDENTIFICATION USING RGB-DEPTH CAMERAS [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/59227

    Structured learning of metric ensembles with application to person re-identification

    Matching individuals across non-overlapping camera networks, known as person re-identification, is a fundamentally challenging problem due to the large visual appearance changes caused by variations of viewpoints, lighting, and occlusion. Approaches in the literature can be categorized into two streams: the first stream is to develop reliable features against realistic conditions by combining several visual features in a pre-defined way; the second stream is to learn a metric from training data to ensure strong inter-class differences and intra-class similarities. However, seeking an optimal combination of visual features which is generic yet adaptive to different benchmarks is an unsolved problem, and metric learning models easily overfit due to the scarcity of training data in person re-identification. In this paper, we propose two effective structured learning based approaches which explore the adaptive effects of visual features in recognizing persons in different benchmark data sets. Our framework is built on the basis of multiple low-level visual features with an optimal ensemble of their metrics. We formulate two optimization algorithms, CMCtriplet and CMCstruct, which directly optimize evaluation measures commonly used in person re-identification, also known as the Cumulative Matching Characteristic (CMC) curve. Comment: 16 pages. Extended version of "Learning to Rank in Person Re-Identification With Metric Ensembles", at http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Paisitkriangkrai_Learning_to_Rank_2015_CVPR_paper.html. arXiv admin note: text overlap with arXiv:1503.0154

    Plant species classification using flower images - a comparative study of local feature representations

    Steady improvements in image description methods have induced a growing interest in image-based plant species classification, a task vital to the study of biodiversity and ecological sensitivity. Various techniques have been proposed for general object classification over the past years and several of them have already been studied for plant species classification. However, the results of these studies are selective in the evaluated steps of a classification pipeline, in the datasets used for evaluation, and in the compared baseline methods. No study is available that evaluates the main competing methods for building an image representation on the same datasets, allowing for generalized findings regarding flower-based plant species classification. The aim of this paper is to comparatively evaluate methods, method combinations, and their parameters with respect to classification accuracy. The investigated methods range from detection, extraction, fusion and pooling to encoding of local features for quantifying shape and color information of flower images. We selected the flower image datasets Oxford Flower 17 and Oxford Flower 102 as well as our own Jena Flower 30 dataset for our experiments. Findings show large differences among the various studied techniques and that their wisely chosen orchestration allows for high accuracies in species classification. We further found that true local feature detectors in combination with advanced encoding methods yield higher classification results at lower computational costs compared to commonly used dense sampling and spatial pooling methods. Color was found to be an indispensable feature for high classification results, especially while preserving spatial correspondence to gray-level features. As a result, our study provides a comprehensive overview of competing techniques and the implications of their main parameters for flower-based plant species classification.

    Methods for iris classification and macro feature detection

    This work deals with two distinct aspects of iris-based biometric systems: iris classification and macro-feature detection. Iris classification will benefit identification systems where the query image has to be compared against all identities in the database. By pre-classifying the query image based on its texture, this comparison is executed only against those irises that are from the same class as the query image. In the proposed classification method, the normalized iris is tessellated into overlapping rectangular blocks and textural features are extracted from each block. A clustering scheme is used to generate multiple classes of irises based on the extracted features. A minimum distance classifier is then used to assign the query iris to a particular class. The use of multiple blocks with decision-level fusion in the classification process is observed to enhance the accuracy of the method. Most iris-based systems use the global and local texture information of the iris to perform matching. In order to exploit the anatomical structures within the iris during the matching stage, two methods to detect the macro-features of the iris in multi-spectral images are proposed. These macro-features typically correspond to anomalies in pigmentation and structure within the iris. The first method uses the edge-flow technique to localize these features. The second technique uses the SIFT (Scale Invariant Feature Transform) operator to detect discontinuities in the image. Preliminary results show that detection of these macro-features is a difficult problem owing to the richness and variability in iris color and texture. Thus, a large number of spurious features are detected by both methods, suggesting the need for more sophisticated algorithms. However, the ability of the SIFT operator to match partial iris images is demonstrated, thereby indicating the potential of this scheme for macro-feature detection.
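
The classification step described above (per-block minimum distance classification followed by decision-level fusion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: Euclidean distance to per-block class centroids and majority voting as the fusion rule are both assumed, and all names and numbers are hypothetical.

```python
from collections import Counter


def nearest_class(feature, centroids):
    """Minimum distance rule: index of the closest class centroid."""
    def sq_dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(feature, centroid))
    return min(range(len(centroids)), key=lambda k: sq_dist(centroids[k]))


def classify_iris(block_features, block_centroids):
    """Classify each block separately, then fuse the decisions by majority vote."""
    votes = [
        nearest_class(feature, centroids)
        for feature, centroids in zip(block_features, block_centroids)
    ]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical data: three blocks, two texture classes, 2-D block features.
centroids = [[0.0, 0.0], [1.0, 1.0]]
block_centroids = [centroids, centroids, centroids]
block_features = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1]]

predicted = classify_iris(block_features, block_centroids)
print(predicted)
```

In this toy case one block votes for the wrong class but the fused decision is still correct, which is the benefit the abstract attributes to using multiple blocks.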