    Unsupervised Discovery of Parts, Structure, and Dynamics

    Humans easily recognize object parts and their hierarchical structure by watching how they move; they can then predict how each part moves in the future. In this paper, we propose a novel formulation that simultaneously learns a hierarchical, disentangled object representation and a dynamics model for object parts from unlabeled videos. Our Parts, Structure, and Dynamics (PSD) model learns to, first, recognize the object parts via a layered image representation; second, predict hierarchy via a structural descriptor that composes low-level concepts into a hierarchical structure; and third, model the system dynamics by predicting the future. Experiments on multiple real and synthetic datasets demonstrate that our PSD model works well on all three tasks: segmenting object parts, building their hierarchical structure, and capturing their motion distributions. Comment: ICLR 2019. The first two authors contributed equally to this work.

    Performance evaluation of feature sets for carried object detection in still images

    2014 Summer. Includes bibliographical references. Human activity recognition has attracted a lot of interest. The ability to accurately detect objects carried by people will directly help activity recognition. This thesis evaluates four different features for carried object detection. To detect carried objects, image chips are extracted from a video by tracking moving objects with an off-the-shelf tracker. Pixels with similar colors are grouped together using a superpixel segmentation algorithm. Features are calculated for every superpixel, encoding its location in the track chip, the shape of the superpixel, the pose of the person in the track chip, and the appearance of the superpixel. ROC curves are used to analyze the detection of a superpixel as a carried object using these features individually or in combination. These ROC curves show that the shape features, as calculated here, carry very little information for detection. The location features, though simple to calculate, carry significant usable information. Detection using the pose of the person in the track chip and the appearance of the superpixel depends largely on the data used to calculate them. Pose detections are more likely to be correct when there are no occlusions, while appearance features work better with high-resolution input images.
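    The ROC analysis described in this abstract can be illustrated with a minimal sketch. It assumes each superpixel has a detection score (higher means more likely a carried object) and a binary ground-truth label; the function names `roc_points` and `auc` and the sample values are hypothetical and do not come from the thesis.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per distinct threshold.

    Assumption: labels are 1 (carried object) or 0 (not), and both classes occur.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present to compute an ROC curve")

    # Rank superpixels from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for four superpixels and their ground-truth labels.
example_points = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
example_auc = auc(example_points)
```

    Computing one such curve per feature set (location, shape, pose, appearance) and one for their combination would allow the kind of side-by-side comparison the abstract reports.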

    Vision-based techniques for gait recognition

    Global security concerns have driven a proliferation of video surveillance devices. Intelligent surveillance systems seek to discover possible threats automatically and raise alerts. Being able to identify the surveyed object can help determine its threat level. The current generation of devices provides digital video data that can be analysed for time-varying features to assist in the identification process. Commonly, people queue up to access a facility and approach a video camera in full frontal view. In this environment, a variety of biometrics are available, for example gait, which includes temporal features such as stride period. Gait can be measured unobtrusively at a distance. The video data will also include face features, which are short-range biometrics. In this way, one can combine biometrics naturally using one set of data. In this paper we survey current techniques of gait recognition and modelling, together with the environment in which the research was conducted. We also discuss in detail the issues arising from deriving gait data, such as perspective and occlusion effects, together with the associated computer vision challenges of reliably tracking human movement. Then, after highlighting these issues and challenges related to gait processing, we discuss frameworks that combine gait with other biometrics. Finally, we provide motivation for a novel paradigm in biometrics-based human recognition, i.e. the use of the fronto-normal view of gait as a far-range biometric combined with biometrics operating at a near distance.