1,577 research outputs found

    Direct kernel biased discriminant analysis: a new content-based image retrieval relevance feedback algorithm

    Get PDF
    In recent years, a variety of relevance feedback (RF) schemes have been developed to improve the performance of content-based image retrieval (CBIR). Given user feedback information, the key to a RF scheme is how to select a subset of image features to construct a suitable dissimilarity measure. Among various RF schemes, biased discriminant analysis (BDA) based RF is one of the most promising. It is based on the observation that all positive samples are alike, while in general each negative sample is negative in its own way. However, to use BDA, the small sample size (SSS) problem is a big challenge, as users tend to give a small number of feedback samples. To explore solutions to this issue, this paper proposes a direct kernel BDA (DKBDA), which is less sensitive to SSS. An incremental DKBDA (IDKBDA) is also developed to speed up the analysis. Experimental results are reported on a real-world image collection to demonstrate that the proposed methods outperform the traditional kernel BDA (KBDA) and the support vector machine (SVM) based RF algorithms

    An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications

    Get PDF
    The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate the access to the content and help in quick understanding of the associated semantics. First we consider the shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video skimming and the extraction of sport highlights. Discussed summarization methods can be employed in the development of tools that would be greatly useful to most mobile users: in fact these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to, and the type of summary or highlights we are focusing on

    A generic news story segmentation system and its evaluation

    Get PDF
    The paper presents an approach to segmenting broadcast TV news programmes automatically into individual news stories. We first segment the programme into individual shots, and then a number of analysis tools are run on the programme to extract features to represent each shot. The results of these feature extraction tools are then combined using a support vector machine trained to detect anchorperson shots. A news broadcast can then be segmented into individual stories based on the location of the anchorperson shots within the programme. We use one generic system to segment programmes from two different broadcasters, illustrating the robustness of our feature extraction process to the production styles of different broadcasters

    Image-Dependent Spatial Shape-Error Concealment

    Get PDF
    Existing spatial shape-error concealment techniques are broadly based upon either parametric curves that exploit geometric information concerning a shape's contour or object shape statistics using a combination of Markov random fields and maximum a posteriori estimation. Both categories are to some extent, able to mask errors caused by information loss, provided the shape is considered independently of the image/video. They palpably however, do not afford the best solution in applications where shape is used as metadata to describe image and video content. This paper presents a novel image-dependent spatial shape-error concealment (ISEC) algorithm that uses both image and shape information by employing the established rubber-band contour detecting function, with the novel enhancement of automatically determining the optimal width of the band to achieve superior error concealment. Experimental results corroborate both qualitatively and numerically, the enhanced performance of the new ISEC strategy compared with established techniques

    Video summarisation: A conceptual framework and survey of the state of the art

    Get PDF
    This is the post-print (final draft post-refereeing) version of the article. Copyright @ 2007 Elsevier Inc.Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user based information that is unobtrusively sourced, in order to overcome longstanding challenges such as the semantic gap and providing video summaries that have greater relevance to individual users

    MASCOT : metadata for advanced scalable video coding tools : final report

    Get PDF
    The goal of the MASCOT project was to develop new video coding schemes and tools that provide both an increased coding efficiency as well as extended scalability features compared to technology that was available at the beginning of the project. Towards that goal the following tools would be used: - metadata-based coding tools; - new spatiotemporal decompositions; - new prediction schemes. Although the initial goal was to develop one single codec architecture that was able to combine all new coding tools that were foreseen when the project was formulated, it became clear that this would limit the selection of the new tools. Therefore the consortium decided to develop two codec frameworks within the project, a standard hybrid DCT-based codec and a 3D wavelet-based codec, which together are able to accommodate all tools developed during the course of the project

    COSMOS-7: Video-oriented MPEG-7 scheme for modelling and filtering of semantic content

    Get PDF
    MPEG-7 prescribes a format for semantic content models for multimedia to ensure interoperability across a multitude of platforms and application domains. However, the standard leaves it open as to how the models should be used and how their content should be filtered. Filtering is a technique used to retrieve only content relevant to user requirements, thereby reducing the necessary content-sifting effort of the user. This paper proposes an MPEG-7 scheme that can be deployed for semantic content modelling and filtering of digital video. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user
    corecore