2,245 research outputs found

    Automatic extraction and tracking of face sequences in MPEG video

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Vision-Based Production of Personalized Video

    No full text
    In this paper we present a novel vision-based system for the automated production of personalised video souvenirs for visitors in leisure and cultural heritage venues. Visitors are visually identified and tracked through a camera network. The system produces a personalized DVD souvenir at the end of a visitor’s stay allowing visitors to relive their experiences. We analyze how we identify visitors by fusing facial and body features, how we track visitors, how the tracker recovers from failures due to occlusions, as well as how we annotate and compile the final product. Our experiments demonstrate the feasibility of the proposed approach

    Advanced content-based semantic scene analysis and information retrieval: the SCHEMA project

    Get PDF
    The aim of the SCHEMA Network of Excellence is to bring together a critical mass of universities, research centers, industrial partners and end users, in order to design a reference system for content-based semantic scene analysis, interpretation and understanding. Relevant research areas include: content-based multimedia analysis and automatic annotation of semantic multimedia content, combined textual and multimedia information retrieval, semantic -web, MPEG-7 and MPEG-21 standards, user interfaces and human factors. In this paper, recent advances in content-based analysis, indexing and retrieval of digital media within the SCHEMA Network are presented. These advances will be integrated in the SCHEMA module-based, expandable reference system

    A morphological approach for segmentation and tracking of human faces

    Get PDF
    A new technique for segmenting and tracking human faces in video sequences is presented. The technique relies on morphological tools such as using connected operators to extract the connected component that more likely belongs to a face, and partition projection to track this component through the sequence. A binary partition tree (BPT) is used to implement the connected operator. The BPT is constructed based on the chrominance criteria and its nodes are analyzed so that the selected node maximizes an estimation of the likelihood of being part of a face. The tracking is performed using a partition projection approach. Images are divided into face and non-face parts, which are tracked through the sequence. The technique has been successfully assessed using several test sequences from the MPEG-4 (raw format) and the MPEG-7 databases (MPEG-1 format).Peer ReviewedPostprint (published version

    The DICEMAN description schemes for still images and video sequences

    Get PDF
    To address the problem of visual content description, two Description Schemes (DSs) developed within the context of a European ACTS project known as DICEMAN, are presented. The DSs, designed based on an analogy with well-known tools for document description, describe both the structure and semantics of still images and video sequences. The overall structure of both DSs including the various sub-DSs and descriptors (Ds) of which they are composed is described. In each case, the hierarchical sub-DS for describing structure can be constructed using automatic (or semi-automatic) image/video analysis tools. The hierarchical sub-DSs for describing the semantics, however, are constructed by a user. The integration of the two DSs into a video indexing application currently under development in DICEMAN is also briefly described.Peer ReviewedPostprint (published version

    A simple and efficient face detection algorithm for video database applications

    Get PDF
    The objective of this work is to provide a simple and yet efficient tool to detect human faces in video sequences. This information can be very useful for many applications such as video indexing and video browsing. In particular the paper focuses on the significant improvements made to our face detection algorithm presented by Albiol, Bouman and Delp (see IEEE Int. Conference on Image Processing, Kobe, Japan, 1999). Specifically, a novel approach to retrieve skin-like homogeneous regions is presented, which is later used to retrieve face images. Good results have been obtained for a large variety of video sequences.Peer ReviewedPostprint (published version

    A framework for dialogue detection in movies

    No full text
    In this paper, we investigate a novel framework for dialogue detection that is based on indicator functions. An indicator function defines that a particular actor is present at each time instant. Two dialogue detection rules are developed and assessed. The first rule relies on the value of the cross-correlation function at zero time lag that is compared to a threshold. The second rule is based on the cross-power in a particular frequency band that is also compared to a threshold. Experiments are carried out in order to validate the feasibility of the aforementioned dialogue detection rules by using ground-truth indicator functions determined by human observers from six different movies. A total of 25 dialogue scenes and another 8 non-dialogue scenes are employed. The probabilities of false alarm and detection are estimated by cross-validation, where 70% of the available scenes are used to learn the thresholds employed in the dialogue detection rules and the remaining 30% of the scenes are used for testing. An almost perfect dialogue detection is reported for every distinct threshold. © Springer-Verlag Berlin Heidelberg 2006
    corecore