
    Face detection and clustering for video indexing applications

    This paper describes a method for automatically detecting human faces in generic video sequences. We employ an iterative algorithm to give a confidence measure for the presence or absence of faces within video shots. Skin colour filtering is carried out on a selected number of frames per video shot, followed by the application of shape and size heuristics. Finally, the remaining candidate regions are normalized and projected into an eigenspace, with the reconstruction error serving as the measure of confidence for the presence or absence of a face. Following this, the confidence score for the entire video shot is calculated. To cluster extracted faces into a set of face classes, we employ an incremental procedure using a PCA-based dissimilarity measure in conjunction with spatio-temporal correlation. Experiments were carried out on a representative broadcast news test corpus.
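The eigenspace step in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `reconstruction_error`, the flat-list vector representation, and the assumption that an orthonormal eigenspace basis and a mean vector are already available are all hypothetical choices made for illustration. A small error means the candidate region is well explained by the face eigenspace; how the paper maps this error to a shot-level confidence is not stated in the abstract and is not modelled here.

```python
import math


def reconstruction_error(candidate, mean, basis):
    """Distance between a candidate vector and its projection onto an eigenspace.

    Assumption: `basis` is a list of orthonormal vectors (for example, leading
    principal components learned from face images). Learning the basis is out
    of scope for this sketch.
    """
    # Center the candidate on the training mean.
    centered = [c - m for c, m in zip(candidate, mean)]
    # Coefficients of the centered candidate along each basis vector.
    coeffs = [sum(b_i * x_i for b_i, x_i in zip(b, centered)) for b in basis]
    # Rebuild the centered vector from those coefficients.
    rebuilt = [
        sum(w * b[i] for w, b in zip(coeffs, basis))
        for i in range(len(centered))
    ]
    # Euclidean norm of the residual that the eigenspace cannot represent.
    return math.sqrt(sum((x - r) ** 2 for x, r in zip(centered, rebuilt)))


if __name__ == "__main__":
    mean = [0.0, 0.0, 0.0]
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy two-dimensional eigenspace
    print(reconstruction_error([3.0, 4.0, 0.0], mean, basis))  # lies in the subspace
    print(reconstruction_error([0.0, 0.0, 5.0], mean, basis))  # orthogonal to it
```

In this toy setup a vector inside the subspace reconstructs exactly, while a vector orthogonal to it is not reconstructed at all, so its error equals its own length.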

    Activity-driven content adaptation for effective video summarisation

    In this paper, we present a novel method for content adaptation and video summarization implemented fully in the compressed domain. Firstly, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion) and others, by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and the attained frame categories, activity levels for each frame are determined to adapt to the video content. Continuous frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summary for efficient streaming. Given this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have demonstrated the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic, as domain-specific tasks such as accurate recognition of objects can be avoided.
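One possible reading of "evenly sampling the accumulated activity levels" under a quota is sketched below. This is an interpretation, not the paper's algorithm: the function name `select_frames`, the midpoint placement of sampling targets, and the removal of duplicate picks are assumptions introduced here. The idea is that equal steps along the cumulative activity curve land more often in stretches where activity is high, so those stretches contribute more frames to the summary.

```python
from bisect import bisect_left
from itertools import accumulate


def select_frames(activity_levels, quota):
    """Pick up to `quota` frame indices by sampling cumulative activity evenly.

    Assumptions (not from the source): activity levels are non-negative
    numbers, one per frame, and targets sit at the midpoint of each of the
    `quota` equal slices of the total activity.
    """
    cumulative = list(accumulate(activity_levels))
    total = cumulative[-1] if cumulative else 0
    if quota <= 0 or total <= 0:
        return []
    step = total / quota
    picked = []
    for j in range(quota):
        target = (j + 0.5) * step
        # First frame whose accumulated activity reaches the target.
        idx = min(bisect_left(cumulative, target), len(cumulative) - 1)
        # Several targets can fall on one very active frame; keep it once.
        if not picked or picked[-1] != idx:
            picked.append(idx)
    return picked


if __name__ == "__main__":
    levels = [0, 0, 4, 0, 0, 4]  # hypothetical per-frame activity levels
    print(select_frames(levels, quota=2))
```

Because duplicates are dropped, the result can contain fewer frames than the quota when activity is concentrated in very few frames; a real system would need a policy for that case.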

    Real-time 3D Face Recognition using Line Projection and Mesh Sampling

    The main contribution of this paper is a novel method for automatic 3D face recognition based on sampling a 3D mesh structure in the presence of noise. A structured light method using line projection is employed, in which a 3D face is reconstructed from a single 2D shot. The process from image acquisition to recognition is described with a focus on its real-time operation. Recognition results are presented, and it is demonstrated that the method can perform recognition in just over one second per subject in continuous operation mode and is thus suitable for real-time operation.

    Scene extraction in motion pictures

    This paper addresses the challenge of bridging the semantic gap between the rich meaning users desire when they query to locate and browse media and the shallowness of media descriptions that can be computed in today's content management systems. To facilitate high-level semantics-based content annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs. We then investigate different rules and conventions followed as part of Film Grammar that would guide and shape an algorithmic solution for determining a scene. Two different techniques using intershot analysis are proposed as solutions in this paper. In addition, we present different refinement mechanisms, such as film-punctuation detection founded on Film Grammar, to further improve the results. These refinement techniques demonstrate significant improvements in overall performance. Furthermore, we analyze errors in the context of film-production techniques, which offer useful insights into the limitations of our method.

    Visualization and Correction of Automated Segmentation, Tracking and Lineaging from 5-D Stem Cell Image Sequences

    Results: We present an application that enables the quantitative analysis of multichannel 5-D (x, y, z, t, channel) and large montage confocal fluorescence microscopy images. The image sequences show stem cells together with blood vessels, enabling quantification of the dynamic behaviors of stem cells in relation to their vascular niche, with applications in developmental and cancer biology. Our application automatically segments, tracks, and lineages the image sequence data and then allows the user to view and edit the results of automated algorithms in a stereoscopic 3-D window while simultaneously viewing the stem cell lineage tree in a 2-D window. Using the GPU to store and render the image sequence data enables a hybrid computational approach. An inference-based approach utilizing user-provided edits to automatically correct related mistakes executes interactively on the system CPU while the GPU handles 3-D visualization tasks. Conclusions: By exploiting commodity computer gaming hardware, we have developed an application that can be run in the laboratory to facilitate rapid iteration through biological experiments. There is a pressing need for visualization and analysis tools for 5-D live cell image data. We combine accurate unsupervised processes with an intuitive visualization of the results. Our validation interface allows for each data set to be corrected to 100% accuracy, ensuring that downstream data analysis is accurate and verifiable. Our tool is the first to combine all of these aspects, leveraging the synergies obtained by utilizing validation information from stereo visualization to improve the low level image processing tasks. Comment: BioVis 2014 conference

    An examination of automatic video retrieval technology on access to the contents of an historical video archive

    Purpose – This paper aims to provide an initial understanding of the constraints that historical video collections pose to video retrieval technology and the potential that online access offers to both archive and users. Design/methodology/approach – A small and unique collection of videos on customs and folklore was used as a case study. Multiple methods were employed to investigate the effectiveness of technology and the modality of user access. Automatic keyframe extraction was tested on the visual content while the audio stream was used for automatic classification of speech and music clips. The user access (search vs browse) was assessed in a controlled user evaluation. A focus group and a survey provided insight on the actual use of the analogue archive. The results of these multiple studies were then compared and integrated (triangulation). Findings – The amateur material challenged automatic techniques for video and audio indexing, thus suggesting that the technology must be tested against the material before deciding on a digitisation strategy. Two user interaction modalities, browsing vs searching, were tested in a user evaluation. Results show users preferred searching, but browsing becomes essential when the search engine fails in matching query and indexed words. Browsing was also valued for serendipitous discovery; however the organisation of the archive was judged cryptic and therefore of limited use. This indicates that the categorisation of an online archive should be thought of in terms of users who might not understand the current classification. The focus group and the survey showed clearly the advantage of online access even when the quality of the video surrogate is poor. The evidence gathered suggests that the creation of a digital version of a video archive requires a rethinking of the collection in terms of the new medium: a new archive should be specially designed to exploit the potential that the digital medium offers. Similarly, users' needs have to be considered before designing the digital library interface, as needs are likely to be different from those imagined. Originality/value – This paper is the first attempt to understand the advantages offered and limitations held by video retrieval technology for small video archives like those often found in special collections.