
    Using Content Analysis for Video Compression

    This paper proposes modeling video information as a concatenation of different recurring sources. For each source, a tailored compressed representation can be designed to best match the intrinsic characteristics of the viewed scene. Since a shot or scene with similar visual content often recurs more than once in a video, even at distant intervals in time, this makes it possible to build a more compact representation of the information. In a specific implementation of this idea, we suggest a content-based approach that structures video sequences into hierarchical summaries and represents each summary by a tailored set of dictionaries of codewords. Vector quantization techniques, formerly employed for compression purposes only, are used here first to represent the visual content of video shots and then to exploit visual-content redundancy inside the video. The depth in the hierarchy determines the precision of the representation, both structurally and in terms of the quality level at which the video sequence is reproduced. The effectiveness of the proposed method is demonstrated by preliminary tests performed on a limited collection of video data excerpted from a feature movie. Some additional functionalities, such as video skimming, may benefit considerably from this type of representation.

    An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications

    The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular, we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate access to the content and help in quickly understanding the associated semantics. First we consider shot clustering techniques based on low-level features, using visual, audio and motion information, possibly combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video skimming and the extraction of sport highlights. The summarization methods discussed can be employed in the development of tools that would be useful to most mobile users: these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to and on the type of summary or highlights being targeted.

    Learning from Multiple Sources for Video Summarisation

    Many visual surveillance tasks, e.g. video summarisation, are conventionally accomplished through analysing imagery-based features. Relying solely on visual cues for public surveillance video understanding is unreliable, since visual observations obtained from public space CCTV video data are often not sufficiently trustworthy and events of interest can be subtle. On the other hand, non-visual data sources such as weather reports and traffic sensory signals are readily accessible but are not explored jointly to complement visual data for video content analysis and summarisation. In this paper, we present a novel unsupervised framework to learn jointly from both visual and independently drawn non-visual data sources for discovering meaningful latent structure of surveillance video data. In particular, we investigate ways to cope with discrepant dimension and representation whilst associating these heterogeneous data sources, and derive an effective mechanism to tolerate missing and incomplete data from different sources. We show that the proposed multi-source learning framework not only achieves better video content clustering than state-of-the-art methods, but is also capable of accurately inferring missing non-visual semantics from previously unseen videos. In addition, a comprehensive user study is conducted to validate the quality of video summarisation generated using the proposed multi-source model.