18 research outputs found
Extraction of Significant Video Summaries by Dendrogram Analysis
In the current video analysis scenario, effective clustering of shots facilitates the access to the content and helps in understanding the associated semantics. This paper introduces a cluster analysis on shots which employs dendrogram representation to produce hierarchical summaries of the video document. Vector quantization codebooks are used to represent the visual content and to group the shots with similar chromatic consistency. The evaluation of the cluster codebook distortions, and the exploitation of the dependency relationships on the dendrograms, allow to obtain only a few significant summaries of the whole video. Finally the user can navigate through summaries and decide which one best suites his/her needs for eventual post-processing. The effectiveness of the proposed method is demonstrated by testing it on a collection of video-data from different kinds of programmes. Results are evaluated in terms of metrics that measure the content representational value of the summarization technique
Hierarchical Video Summaries by Dendrogram Cluster Analysis
In the current video analysis scenario, effective summarization of video sequences through shot clustering facilitates the access to the content and helps in understanding the associated semantics. This paper introduces a generic scheme to produce hierarchical summaries of the video document starting from a dendrogram representation of clusters of shots. The evaluation of the cluster distortions, and the exploitation of the dependency relationships between clusters on the dendrograms, allow to obtain only a few semantically significant
summaries of the whole video. Finally the user can navigate through summaries and decide which one best suites his/her needs for eventual post-processing. The effectiveness of the proposed method is demonstrated by testing it on a collection of video-data from different kinds of programmes, using and comparing different visual features on color information. Results are evaluated in terms of metrics that measure the content representational value of the summarization technique
Sparse Transfer Learning for Interactive Video Search Reranking
Visual reranking is effective to improve the performance of the text-based
video search. However, existing reranking algorithms can only achieve limited
improvement because of the well-known semantic gap between low level visual
features and high level semantic concepts. In this paper, we adopt interactive
video search reranking to bridge the semantic gap by introducing user's
labeling effort. We propose a novel dimension reduction tool, termed sparse
transfer learning (STL), to effectively and efficiently encode user's labeling
information. STL is particularly designed for interactive video search
reranking. Technically, it a) considers the pair-wise discriminative
information to maximally separate labeled query relevant samples from labeled
query irrelevant ones, b) achieves a sparse representation for the subspace to
encodes user's intention by applying the elastic net penalty, and c) propagates
user's labeling information from labeled samples to unlabeled samples by using
the data distribution knowledge. We conducted extensive experiments on the
TRECVID 2005, 2006 and 2007 benchmark datasets and compared STL with popular
dimension reduction algorithms. We report superior performance by using the
proposed STL based interactive video search reranking.Comment: 17 page
An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications
The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate the access to the content and help in quick understanding of the associated semantics. First we consider the shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video
skimming and the extraction of sport highlights. Discussed summarization methods can be employed in the development of tools that would be greatly useful to most mobile users: in fact these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to, and the type of summary or highlights we are focusing on
Recommended from our members
Adaptive Synchronization of Semantically Compressed Instructional Videos for Collaborative Distance Learning
The increasing popularity of online courses has highlighted the need for collaborative learning tools for student groups. In addition, the introduction of lecture videos into the online curriculum has drawn attention to the disparity in the network resources available to students. We present an e-Learning architecture and adaptation model called AI2TV (Adaptive Interactive Internet Team Video), which allows groups of students to collaboratively view a video in synchrony. AI2TV upholds the invariant that each student will view semantically equivalent content at all times. A semantic compression model is developed to provide instructional videos at different level-of-details to accommodate dynamic network conditions and usersĂ€Ă³û system requirements. We take advantage of the semantic compression algorithmĂ€Ă³ûs ability to provide different layers of semantically equivalent video by adapting the client to play at the appropriate layer that provides the client with the richest possible viewing experience. Video player actions, like play, pause and stop, can be initiated by any group member and and the results of those actions are synchronized with all the other students. These features allow students to review a lecture video in tandem, facilitating the learning process. Experimental trials show that AI2TV successfully synchronizes instructional videos for distributed students while concurrently optimizing the video quality, even under conditions of fluctuating bandwidth, by adaptively adjusting the quality level for each student while still maintaining the invariant
CONTENT BASED RETRIEVAL OF LECTURE VIDEO REPOSITORY: LITERATURE REVIEW
Multimedia has a significant role in communicating the information and a large amount of multimedia repositories make the browsing, retrieval and delivery of video contents. For higher education, using video as a tool for learning and teaching through multimedia application is a considerable promise. Many universities adopt educational systems where the teacher lecture is video recorded and the video lecture is made available to students with minimum post-processing effort. Since each video may cover many subjects, it is critical for an e-Learning environment to have content-based video searching capabilities to meet diverse individual learning needs. The present paper reviewed 120+ core research article on the content based retrieval of the lecture video repositories hosted on cloud by government academic and research organization of India
Towards key-frame extraction methods for 3D video: a review
The increasing rate of creation and use of 3D video content leads to a pressing need for methods capable of lowering
the cost of 3D video searching, browsing and indexing operations, with improved content selection performance.
Video summarisation methods specifically tailored for 3D video content fulfil these requirements. This paper presents
a review of the state-of-the-art of a crucial component of 3D video summarisation algorithms: the key-frame
extraction methods. The methods reviewed cover 3D video key-frame extraction as well as shot boundary detection
methods specific for use in 3D video. The performance metrics used to evaluate the key-frame extraction methods
and the summaries derived from those key-frames are presented and discussed. The applications of these methods
are also presented and discussed, followed by an exposition about current research challenges on 3D video
summarisation methods
Color image quality measures and retrieval
The focus of this dissertation is mainly on color image, especially on the images with lossy compression. Issues related to color quantization, color correction, color image retrieval and color image quality evaluation are addressed. A no-reference color image quality index is proposed. A novel color correction method applied to low bit-rate JPEG image is developed. A novel method for content-based image retrieval based upon combined feature vectors of shape, texture, and color similarities has been suggested. In addition, an image specific color reduction method has been introduced, which allows a 24-bit JPEG image to be shown in the 8-bit color monitor with 256-color display. The reduction in download and decode time mainly comes from the smart encoder incorporating with the proposed color reduction method after color space conversion stage. To summarize, the methods that have been developed can be divided into two categories: one is visual representation, and the other is image quality measure.
Three algorithms are designed for visual representation:
(1) An image-based visual representation for color correction on low bit-rate JPEG images. Previous studies on color correction are mainly on color image calibration among devices. Little attention was paid to the compressed image whose color distortion is evident in low bit-rate JPEG images. In this dissertation, a lookup table algorithm is designed based on the loss of PSNR in different compression ratio.
(2) A feature-based representation for content-based image retrieval. It is a concatenated vector of color, shape, and texture features from region of interest (ROI).
(3) An image-specific 256 colors (8 bits) reproduction for color reduction from 16 millions colors (24 bits). By inserting the proposed color reduction method into a JPEG encoder, the image size could be further reduced and the transmission time is also reduced. This smart encoder enables its decoder using less time in decoding.
Three algorithms are designed for image quality measure (IQM):
(1) A referenced IQM based upon image representation in very low-dimension. Previous studies on IQMs are based on high-dimensional domain including spatial and frequency domains. In this dissertation, a low-dimensional domain IQM based on random projection is designed, with preservation of the IQM accuracy in high-dimensional domain.
(2) A no-reference image blurring metric. Based on the edge gradient, the degree of image blur can be measured.
(3) A no-reference color IQM based upon colorfulness, contrast and sharpness