
    A data augmentation methodology for training machine/deep learning gait recognition algorithms

    There are several confounding factors that can reduce the accuracy of gait recognition systems. These factors can reduce the distinctiveness of, or alter, the features used to characterise gait; they include variations in clothing, lighting, pose, and environment, such as the walking surface. Full invariance to all confounding factors is challenging in the absence of high-quality labelled training data. We introduce a simulation-based methodology and a subject-specific dataset which can be used to generate synthetic video frames and sequences for data augmentation. With this methodology, we generated a multi-modal dataset. In addition, we supply simulation files that make it possible to sample from several confounding variables simultaneously. The basis of the data is real motion-capture data of subjects walking and running on a treadmill at different speeds. Results from gait recognition experiments suggest that information about the identity of subjects is retained within the synthetically generated examples. The dataset and methodology allow studies into fully invariant identity recognition spanning a far greater number of observation conditions than would otherwise be possible.
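
    As a concrete illustration of the sampling step this abstract describes, below is a minimal Python sketch of augmentation by jointly sampling several confounding variables per motion-capture clip. The factor names, their value ranges, and the render_sequence stub are illustrative assumptions, not the paper's actual simulation pipeline.

        import random

        # Illustrative factor ranges; the paper's simulation files cover
        # confounders such as clothing, lighting, pose and walking surface.
        CONFOUNDERS = {
            "clothing": ["tight", "loose", "long_coat"],
            "lighting": ["overhead", "side", "dim"],
            "camera_yaw_deg": list(range(0, 360, 15)),
            "surface": ["flat", "incline", "grass"],
        }

        def render_sequence(clip, condition):
            # Stand-in for the simulation step that would drive a 3D renderer
            # with the motion-capture clip under the sampled condition.
            return {"clip": clip, "condition": condition, "frames": []}

        def augment(clip, n_variants, seed=0):
            """Yield (condition, synthetic sequence) pairs for one mocap clip."""
            rng = random.Random(seed)
            for _ in range(n_variants):
                condition = {k: rng.choice(v) for k, v in CONFOUNDERS.items()}
                yield condition, render_sequence(clip, condition)

        for condition, seq in augment("subject01_walk_4kmh", n_variants=3):
            print(condition)

    Because each variant draws all factors jointly, the same clip can be re-rendered under combinations of conditions that were never observed together, which is what enables the wider range of observation conditions claimed above.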

    Object categorisation, object naming, and viewpoint-independence in visual remembering: Evidence from young children's drawings of a novel object

    A simple object-drawing task confirms a three-way association between object categorisation, viewpoint independence, and longer-term visual remembering. Young children (5- to 7-year-olds) drew a familiar object or a novel object, immediately after it had been hidden from view or on the following day. Both objects were shown from a full range of viewpoints or from just two viewpoints, from neither of which would either object normally be drawn after unrestricted viewing. When drawing from short-term memory after restricted viewing, both objects were most likely to be depicted from a seen viewpoint. When drawing from longer-term memory after restricted viewing, the novel object continued to be drawn from a seen viewpoint, but the mug was now most likely to be drawn from a preferred viewpoint from which it had not been seen. Naming the novel object with a novel count noun ("Look at this. This is a dax"), to signal that it belonged to an object category, resulted in it being drawn in the same way as the familiar object. The results concur with other evidence indicating that short-term and longer-term visual remembering are differentially associated with viewpoint-dependent representations of individual objects and viewpoint independent representations of object categories, respectively

    Challenges in Disentangling Independent Factors of Variation

    We study the problem of building models that disentangle independent factors of variation. Such models could be used to encode features that can efficiently be used for classification and to transfer attributes between different images in image synthesis. As data we use a weakly labeled training set: our weak labels indicate which single factor has changed between two data samples, although the relative value of the change is unknown. This labeling is of particular interest as it may be readily available without annotation costs. To make use of the weak labels we introduce an autoencoder model and train it through constraints on image pairs and triplets. We formally prove that, without additional knowledge, there is no guarantee that two images with the same factor of variation will be mapped to the same feature; we call this issue the reference ambiguity. Moreover, we show the role of feature dimensionality and adversarial training. We demonstrate experimentally that the proposed model can successfully transfer attributes on several datasets, but also show cases where the reference ambiguity occurs. Comment: Submitted to ICLR 201
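
    One plausible reading of the pair constraint, sketched below in PyTorch under stated assumptions: the latent code is split into a chunk shared by the pair and a chunk for the changed factor, and swapping the shared chunks between the two images should leave their reconstructions intact. The network sizes, the latent split, and the loss are illustrative only; the paper's full model also uses triplet and adversarial terms not shown here.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SplitAutoencoder(nn.Module):
            """Encode an image into (shared, varying) latent chunks."""
            def __init__(self, dim_in=784, dim_shared=32, dim_varying=8):
                super().__init__()
                self.dim_shared = dim_shared
                self.encoder = nn.Sequential(
                    nn.Linear(dim_in, 256), nn.ReLU(),
                    nn.Linear(256, dim_shared + dim_varying))
                self.decoder = nn.Sequential(
                    nn.Linear(dim_shared + dim_varying, 256), nn.ReLU(),
                    nn.Linear(256, dim_in))

            def split(self, x):
                z = self.encoder(x)
                return z[:, :self.dim_shared], z[:, self.dim_shared:]

        def pair_loss(model, x1, x2):
            # x1 and x2 differ only in the weakly labeled factor, so their
            # shared chunks should be interchangeable: decoding with swapped
            # shared chunks must still reconstruct each image.
            s1, v1 = model.split(x1)
            s2, v2 = model.split(x2)
            x1_hat = model.decoder(torch.cat([s2, v1], dim=1))
            x2_hat = model.decoder(torch.cat([s1, v2], dim=1))
            return F.mse_loss(x1_hat, x1) + F.mse_loss(x2_hat, x2)

    The reference ambiguity result says that constraints of this form alone cannot guarantee that the varying chunk encodes the changed factor consistently across different reference images.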

    Gait recognition under carrying condition: a static dynamic fusion method

    When an individual carries an object, such as a briefcase, conventional gait recognition algorithms based on the average silhouette/Gait Energy Image (GEI) do not always perform well, because the carried object may be mistakenly regarded as part of the human body. To address this problem, instead of directly applying the GEI to represent gait information, we propose a novel dynamic feature template for classification. Based on this extracted dynamic information and some static feature templates (i.e., the head part and trunk part), we cast gait recognition on the large USF (University of South Florida) database by adopting a static/dynamic fusion strategy. For the experiments involving the carrying-condition covariate, significant improvements are achieved when compared with other classic algorithms.
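
    For reference, the average-silhouette baseline the abstract contrasts against is straightforward to state in code. A minimal sketch, assuming size-normalised, centre-aligned binary silhouettes over one gait cycle:

        import numpy as np

        def gait_energy_image(silhouettes):
            """Average a (T, H, W) stack of aligned binary silhouettes."""
            sils = np.asarray(silhouettes, dtype=np.float64)
            if sils.ndim != 3:
                raise ValueError("expected a (T, H, W) silhouette stack")
            return sils.mean(axis=0)

    Pixels that are body in every frame approach 1 (static regions such as the head and trunk), while pixels swept by the limbs take intermediate values (dynamic regions); it is this static/dynamic split that the proposed fusion strategy exploits, and that a carried briefcase corrupts.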

    PANDA: Pose Aligned Networks for Deep Attribute Modeling

    We propose a method for inferring human attributes (such as gender, hair style, clothing style, expression, and action) from images of people under large variations of viewpoint, pose, appearance, articulation and occlusion. Convolutional Neural Networks (CNNs) have been shown to perform very well on large-scale object recognition problems. In the context of attribute classification, however, the signal is often subtle, and it may cover only a small part of the image, while the image is dominated by the effects of pose and viewpoint. Discounting for pose variation would require training on very large labeled datasets, which are not presently available. Part-based models, such as poselets and DPM, have been shown to perform well for this problem but are limited by shallow, low-level features. We propose a new method that combines part-based models and deep learning by training pose-normalized CNNs. We show substantial improvement over state-of-the-art methods on challenging attribute classification tasks in unconstrained settings. Experiments confirm that our method outperforms both the best part-based methods on this problem and conventional CNNs trained on the full bounding box of the person. Comment: 8 pages
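
    A hedged PyTorch sketch of the pose-normalised idea described above: crop each detected part to a canonical window, run a shared CNN over every crop, and classify attributes from the concatenated features. The part detector is assumed to exist upstream, and all layer sizes are illustrative, not PANDA's published architecture.

        import torch
        import torch.nn as nn

        class PartCNN(nn.Module):
            """Small CNN shared across pose-normalised part crops (3x64x64)."""
            def __init__(self, feat_dim=128):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(32, feat_dim), nn.ReLU())

            def forward(self, crop):
                return self.net(crop)

        def attribute_scores(part_crops, part_cnn, head):
            # part_crops: list of (B, 3, 64, 64) tensors, one per detected
            # part. Concatenating per-part features discounts pose before
            # the attribute classifier sees the evidence.
            feats = torch.cat([part_cnn(c) for c in part_crops], dim=1)
            return head(feats)

    Here head would be, for example, nn.Linear(n_parts * 128, n_attributes), emitting one logit per attribute; warping each part to a fixed window is what makes the shared CNN's input approximately pose-invariant.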