11,795 research outputs found

    Video matching using DC-image and local features

    Get PDF
    This paper presents a suggested framework for video matching based on local features extracted from the DCimage of MPEG compressed videos, without decompression. The relevant arguments and supporting evidences are discussed for developing video similarity techniques that works directly on compressed videos, without decompression, and especially utilising small size images. Two experiments are carried to support the above. The first is comparing between the DC-image and I-frame, in terms of matching performance and the corresponding computation complexity. The second experiment compares between using local features and global features in video matching, especially in the compressed domain and with the small size images. The results confirmed that the use of DC-image, despite its highly reduced size, is promising as it produces at least similar (if not better) matching precision, compared to the full I-frame. Also, using SIFT, as a local feature, outperforms precision of most of the standard global features. On the other hand, its computation complexity is relatively higher, but it is still within the realtime margin. There are also various optimisations that can be done to improve this computation complexity

    DC-image for real time compressed video matching

    Get PDF
    This chapter presents a suggested framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without full decompression. In addition, the relevant arguments and supporting evidences are discussed. Several local feature detectors will be examined to select the best for matching using the DC-image. Two experiments are carried to support the above. The first is comparing between the DC-image and I-frame, in terms of matching performance and computation complexity. The second experiment compares between using local features and global features regarding compressed video matching with respect to the DC-image. The results confirmed that the use of DC-image, despite its highly reduced size, it is promising as it produces higher matching precision, compared to the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features. On the other hand, its computation complexity is relatively higher, but it is still within the real-time margin which leaves a space for further optimizations that can be done to improve this computation complexity

    Gait recognition

    No full text

    A Survey on Ear Biometrics

    No full text
    Recognizing people by their ear has recently received significant attention in the literature. Several reasons account for this trend: first, ear recognition does not suffer from some problems associated with other non contact biometrics, such as face recognition; second, it is the most promising candidate for combination with the face in the context of multi-pose face recognition; and third, the ear can be used for human recognition in surveillance videos where the face may be occluded completely or in part. Further, the ear appears to degrade little with age. Even though, current ear detection and recognition systems have reached a certain level of maturity, their success is limited to controlled indoor conditions. In addition to variation in illumination, other open research problems include hair occlusion; earprint forensics; ear symmetry; ear classification; and ear individuality. This paper provides a detailed survey of research conducted in ear detection and recognition. It provides an up-to-date review of the existing literature revealing the current state-of-art for not only those who are working in this area but also for those who might exploit this new approach. Furthermore, it offers insights into some unsolved ear recognition problems as well as ear databases available for researchers

    Sparse Radial Sampling LBP for Writer Identification

    Full text link
    In this paper we present the use of Sparse Radial Sampling Local Binary Patterns, a variant of Local Binary Patterns (LBP) for text-as-texture classification. By adapting and extending the standard LBP operator to the particularities of text we get a generic text-as-texture classification scheme and apply it to writer identification. In experiments on CVL and ICDAR 2013 datasets, the proposed feature-set demonstrates State-Of-the-Art (SOA) performance. Among the SOA, the proposed method is the only one that is based on dense extraction of a single local feature descriptor. This makes it fast and applicable at the earliest stages in a DIA pipeline without the need for segmentation, binarization, or extraction of multiple features.Comment: Submitted to the 13th International Conference on Document Analysis and Recognition (ICDAR 2015

    Automatic learning of gait signatures for people identification

    Get PDF
    This work targets people identification in video based on the way they walk (i.e. gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNN) for learning high-level descriptors from low-level motion features (i.e. optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input data for CNN allows to obtain state-of-the-art results on the gait task with an image resolution eight times lower than the previously reported results (i.e. 80x60 pixels).Comment: Proof of concept paper. Technical report on the use of ConvNets (CNN) for gait recognition. Data and code: http://www.uco.es/~in1majim/research/cnngaitof.htm

    Unsupervised learning of clutter-resistant visual representations from natural videos

    Get PDF
    Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance e.g., position, scale, viewing angle [1, 2, 3]. Though the learning rules are not known, recent results [4, 5, 6] suggest the operation of an unsupervised temporal-association-based method e.g., Foldiak's trace rule [7]. Such methods exploit the temporal continuity of the visual world by assuming that visual experience over short timescales will tend to have invariant identity content. Thus, by associating representations of frames from nearby times, a representation that tolerates whatever transformations occurred in the video may be achieved. Many previous studies verified that such rules can work in simple situations without background clutter, but the presence of visual clutter has remained problematic for this approach. Here we show that temporal association based on large class-specific filters (templates) avoids the problem of clutter. Our system learns in an unsupervised way from natural videos gathered from the internet, and is able to perform a difficult unconstrained face recognition task on natural images: Labeled Faces in the Wild [8]
    corecore