2,717 research outputs found

    Recurrent Attention Models for Depth-Based Person Identification

    Get PDF
    We present an attention-based model that reasons on human body shape and motion dynamics to identify individuals in the absence of RGB information, hence in the dark. Our approach leverages unique 4D spatio-temporal signatures to address the identification problem across days. Formulated as a reinforcement learning task, our model is based on a combination of convolutional and recurrent neural networks with the goal of identifying small, discriminative regions indicative of human identity. We demonstrate that our model produces state-of-the-art results on several published datasets given only depth images. We further study the robustness of our model towards viewpoint, appearance, and volumetric changes. Finally, we share insights gleaned from interpretable 2D, 3D, and 4D visualizations of our model's spatio-temporal attention.Comment: Computer Vision and Pattern Recognition (CVPR) 201

    Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds

    Get PDF
    Sparsity-based representations have recently led to notable results in various visual recognition tasks. In a separate line of research, Riemannian manifolds have been shown useful for dealing with features and models that do not lie in Euclidean spaces. With the aim of building a bridge between the two realms, we address the problem of sparse coding and dictionary learning over the space of linear subspaces, which form Riemannian structures known as Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping. This in turn enables us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we propose closed-form solutions for learning a Grassmann dictionary, atom by atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann sparse coding and dictionary learning algorithms through embedding into Hilbert spaces. Experiments on several classification tasks (gender recognition, gesture classification, scene analysis, face recognition, action recognition and dynamic texture classification) show that the proposed approaches achieve considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as kernelized Affine Hull Method and graph-embedding Grassmann discriminant analysis.Comment: Appearing in International Journal of Computer Visio

    Sparse error gait image: a new representation for gait recognition

    Get PDF
    The performance of a gait recognition system is very much related to the usage of efficient feature representation and recognition modules. The first extracts features from an input image sequence to represent a user's distinctive gait pattern. The recognition module then compares the features of a probe user with those registered in the gallery database. This paper presents a novel gait feature representation, called Sparse Error Gait Image (SEGI), derived from the application of Robust Principal Component Analysis (RPCA) to Gait Energy Images (GEI). GEIs obtained from the same user at different instants always present some differences. Applying RPCA results in low-rank and sparse error components, the former capturing the commonalities and encompassing the small differences between input GEIs, while the larger differences are captured by the sparse error component. The proposed SEGI representation exploits the latter for recognition purposes. This paper also proposes two simple approaches for the recognition module, to exploit the SEGI, based on the computation of a Euclidean norm or the Euclidean distance. Using these simple recognition methods and the proposed SEGI representation gait recognition, results equivalent to the state-of-the-art are obtained

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher-accuracy rate and lowercomputational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.Comment: Published 201

    Person re-Identification over distributed spaces and time

    Get PDF
    PhDReplicating the human visual system and cognitive abilities that the brain uses to process the information it receives is an area of substantial scientific interest. With the prevalence of video surveillance cameras a portion of this scientific drive has been into providing useful automated counterparts to human operators. A prominent task in visual surveillance is that of matching people between disjoint camera views, or re-identification. This allows operators to locate people of interest, to track people across cameras and can be used as a precursory step to multi-camera activity analysis. However, due to the contrasting conditions between camera views and their effects on the appearance of people re-identification is a non-trivial task. This thesis proposes solutions for reducing the visual ambiguity in observations of people between camera views This thesis first looks at a method for mitigating the effects on the appearance of people under differing lighting conditions between camera views. This thesis builds on work modelling inter-camera illumination based on known pairs of images. A Cumulative Brightness Transfer Function (CBTF) is proposed to estimate the mapping of colour brightness values based on limited training samples. Unlike previous methods that use a mean-based representation for a set of training samples, the cumulative nature of the CBTF retains colour information from underrepresented samples in the training set. Additionally, the bi-directionality of the mapping function is explored to try and maximise re-identification accuracy by ensuring samples are accurately mapped between cameras. Secondly, an extension is proposed to the CBTF framework that addresses the issue of changing lighting conditions within a single camera. As the CBTF requires manually labelled training samples it is limited to static lighting conditions and is less effective if the lighting changes. This Adaptive CBTF (A-CBTF) differs from previous approaches that either do not consider lighting change over time, or rely on camera transition time information to update. By utilising contextual information drawn from the background in each camera view, an estimation of the lighting change within a single camera can be made. This background lighting model allows the mapping of colour information back to the original training conditions and thus remove the need for 3 retraining. Thirdly, a novel reformulation of re-identification as a ranking problem is proposed. Previous methods use a score based on a direct distance measure of set features to form a correct/incorrect match result. Rather than offering an operator a single outcome, the ranking paradigm is to give the operator a ranked list of possible matches and allow them to make the final decision. By utilising a Support Vector Machine (SVM) ranking method, a weighting on the appearance features can be learned that capitalises on the fact that not all image features are equally important to re-identification. Additionally, an Ensemble-RankSVM is proposed to address scalability issues by separating the training samples into smaller subsets and boosting the trained models. Finally, the thesis looks at a practical application of the ranking paradigm in a real world application. The system encompasses both the re-identification stage and the precursory extraction and tracking stages to form an aid for CCTV operators. Segmentation and detection are combined to extract relevant information from the video, while several combinations of matching techniques are combined with temporal priors to form a more comprehensive overall matching criteria. The effectiveness of the proposed approaches is tested on datasets obtained from a variety of challenging environments including offices, apartment buildings, airports and outdoor public spaces

    Uncooperative gait recognition by learning to rank

    Get PDF
    This work has partially been supported by projects CICYT TIN2009-14205-C04-04 from the Spanish Ministry of Innovation and Science, and P1-1B2012-22, PREDOC/2008/04 and E-2011-36 from Universitat Jaume I of CastellΓ³n

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table
    • …
    corecore