8 research outputs found

    Multimodal Stereoscopic Movie Summarization Conforming to Narrative Characteristics

    Video summarization is a timely and rapidly developing research field with broad commercial interest, due to the increasing availability of massive video data. Relevant algorithms must strike a careful balance between summary compactness, enjoyability, and content coverage. The specific case of stereoscopic 3D theatrical films has become more important in recent years, but has not received corresponding research attention. In this paper, a multi-stage, multimodal summarization process for such stereoscopic movies is proposed that is able to extract a short, representative video skim conforming to narrative characteristics from a 3D film. At the initial stage, a novel, low-level video frame description method (the frame moments descriptor) is introduced that compactly captures informative image statistics from luminance, color, optical flow, and stereoscopic disparity video data, at both a global and a local scale. Thus, scene texture, illumination, motion, and geometry properties may succinctly be contained within a single frame feature descriptor, which can subsequently be employed as a building block in any key-frame extraction scheme, e.g., for intra-shot frame clustering. The computed key-frames are then used to construct a movie summary in the form of a video skim, which is post-processed in a manner that also considers the audio modality. The next stage of the proposed summarization pipeline essentially performs shot pruning, controlled by a user-provided shot retention parameter, which removes segments from the skim based on the narrative prominence of movie characters in both the visual and the audio modalities. This novel process (multimodal shot pruning) is algebraically modeled as a multimodal matrix column subset selection problem, which is solved using an evolutionary computing approach. Subsequently, disorienting editing effects induced by summarization are dealt with through manipulation of the video skim. At the last step, the skim is suitably post-processed in order to reduce stereoscopic video defects that may cause visual fatigue.
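
    The idea of a per-frame descriptor built from global and local image statistics can be illustrated with a small sketch. This is not the paper's frame moments descriptor: the abstract does not say which statistics are used, so the choice of mean, standard deviation and skewness, the 2 x 2 block grid, and the function names `moments`, `channel_descriptor` and `frame_descriptor` are all assumptions made for illustration. Each channel (for example luminance or disparity) is a plain 2D list of numbers.

```python
from math import sqrt

def moments(values):
    """Return [mean, standard deviation, skewness] of a non-empty list of numbers."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Assumption: skewness of a constant region is taken as 0 to avoid dividing by zero.
    skew = 0.0 if std == 0 else sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    return [mean, std, skew]

def channel_descriptor(channel, grid=2):
    """Global moments of a 2D channel, followed by the moments of each grid x grid block."""
    rows, cols = len(channel), len(channel[0])
    desc = moments([v for row in channel for v in row])  # global scale
    for by in range(grid):
        for bx in range(grid):
            r0, r1 = by * rows // grid, (by + 1) * rows // grid
            c0, c1 = bx * cols // grid, (bx + 1) * cols // grid
            block = [channel[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            desc.extend(moments(block))  # local scale
    return desc

def frame_descriptor(channels, grid=2):
    """Concatenate the descriptors of all channels into one feature vector."""
    desc = []
    for channel in channels:
        desc.extend(channel_descriptor(channel, grid))
    return desc

# Toy 4 x 4 frame with two hypothetical channels.
luminance = [[0, 0, 10, 10], [0, 0, 10, 10], [20, 20, 30, 30], [20, 20, 30, 30]]
disparity = [[1, 1, 1, 1], [1, 1, 1, 1], [2, 2, 2, 2], [2, 2, 2, 2]]
descriptor = frame_descriptor([luminance, disparity])
```

    Each channel contributes 3 global values plus 3 values per block, so the two toy channels give a 30-element vector. Vectors of this kind could then be fed to any clustering routine to pick key-frames within a shot; the sketch assumes each channel has at least `grid` rows and columns so that no block is empty.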

    Towards key-frame extraction methods for 3D video: a review

    The increasing rate of creation and use of 3D video content leads to a pressing need for methods capable of lowering the cost of 3D video searching, browsing and indexing operations, with improved content selection performance. Video summarisation methods specifically tailored for 3D video content fulfil these requirements. This paper presents a review of the state of the art in a crucial component of 3D video summarisation algorithms: the key-frame extraction methods. The methods reviewed cover 3D video key-frame extraction as well as shot boundary detection methods specific to 3D video. The performance metrics used to evaluate the key-frame extraction methods and the summaries derived from those key-frames are presented and discussed. The applications of these methods are also presented and discussed, followed by a discussion of current research challenges in 3D video summarisation methods.

    Video object segmentation.

    Wei Wei. Thesis submitted in December 2005. Thesis (M.Phil.), Chinese University of Hong Kong, 2006. Includes bibliographical references (leaves 112-122). Abstracts in English and Chinese.

    Contents:
    Abstract
    List of Abbreviations
    Chapter 1. Introduction
        1.1 Overview of Content-based Video Standard
        1.2 Video Object Segmentation
            1.2.1 Video Object Plane (VOP)
            1.2.2 Object Segmentation
        1.3 Problems of Video Object Segmentation
        1.4 Objective of the research work
        1.5 Organization of This Thesis
        1.6 Notes on Publication
    Chapter 2. Literature Review
        2.1 What is segmentation?
            2.1.1 Manual Segmentation
            2.1.2 Automatic Segmentation
            2.1.3 Semi-automatic segmentation
        2.2 Segmentation Strategy
        2.3 Segmentation of Moving Objects
            2.3.1 Motion
            2.3.2 Motion Field Representation
            2.3.3 Video Object Segmentation
        2.4 Summary
    Chapter 3. Automatic Video Object Segmentation Algorithm
        3.1 Spatial Segmentation
            3.1.1 k-Medians Clustering Algorithm
            3.1.2 Cluster Number Estimation
            3.1.2 Region Merging
        3.2 Foreground Detection
            3.2.1 Global Motion Estimation
            3.2.2 Detection of Moving Objects
        3.3 Object Tracking and Extracting
            3.3.1 Binary Model Tracking
                3.3.1.2 Initial Model Extraction
            3.3.2 Region Descriptor Tracking
        3.4 Results and Discussions
            3.4.1 Objective Evaluation
            3.4.2 Subjective Evaluation
        3.5 Conclusion
    Chapter 4. Disparity Estimation and its Application in Video Object Segmentation
        4.1 Disparity Estimation
            4.1.1 Seed Selection
            4.1.2 Edge-based Matching by Propagation
        4.2 Remedy Matching Sparseness by Interpolation
        4.2 Disparity Applications in Video Conference Segmentation
        4.3 Conclusion
    Chapter 5. Conclusion and Future Work
        5.1 Conclusion and Contribution
        5.2 Future work
    Reference

    A family of stereoscopic image compression algorithms using wavelet transforms

    With the standardization of JPEG-2000, wavelet-based image and video compression technologies are gradually replacing the popular DCT-based methods. In parallel, recent developments in autostereoscopic display technology are now threatening to revolutionize the way in which consumers are used to enjoying traditional 2D-display-based electronic media such as television, computers and movies. However, due to the two-fold bandwidth/storage space requirement of stereoscopic imaging, an essential requirement of a stereo imaging system is efficient data compression. In this thesis, seven wavelet-based stereo image compression algorithms are proposed, to take advantage of the higher data compaction capability and better flexibility of wavelets. In the proposed CODEC I, block-based disparity estimation/compensation (DE/DC) is performed in the pixel domain. However, this results in an inefficiency when the DWT is applied to the whole predictive error image that results from the DE process, because of the artificial block boundaries between error blocks in the predictive error image. To overcome this problem, in the remaining proposed CODECs, DE/DC is performed in the wavelet domain. Due to the multiresolution nature of the wavelet domain, two methods of disparity estimation and compensation have been proposed. The first method performs DE/DC in each subband of the lowest/coarsest resolution level and then propagates the disparity vectors obtained to the corresponding subbands of higher/finer resolution. Note that DE is not performed in every subband due to the high overhead bits that could be required for the coding of disparity vectors of all subbands. This method is used in CODEC II. In the second method, DE/DC is performed in the wavelet-block domain. This enables disparity estimation to be performed in all subbands simultaneously without increasing the overhead bits required for coding the disparity vectors. This method is used by CODEC III. However, performing disparity estimation/compensation in all subbands would result in a significant improvement of CODEC III. To further improve the performance of CODEC III, a pioneering wavelet-block search technique is implemented in CODEC IV. The pioneering wavelet-block search technique enables the right/predicted image to be reconstructed at the decoder end without the need to transmit the disparity vectors. In the proposed CODEC V, pioneering block search is performed in all subbands of the DWT decomposition, which results in an improvement of its performance. Further, CODEC IV and V are able to perform at very low bit rates (< 0.15 bpp). In CODEC VI and CODEC VII, Overlapped Block Disparity Compensation (OBDC) is used with and without the need to code disparity vectors. Our experimental results showed that no significant coding gains could be obtained for these CODECs over CODEC IV and V. All proposed CODECs in this thesis are wavelet-based stereo image coding algorithms that maximise the flexibility and benefits offered by wavelet transform technology when applied to stereo imaging. In addition, the use of a baseline-JPEG coding architecture would enable the easy adaptation of the proposed algorithms within systems originally built for DCT-based coding. This is an important feature that would be useful during an era where DCT-based technology is only slowly being phased out to give way to DWT-based compression technology. In addition, this thesis proposes a stereo image coding algorithm that uses JPEG-2000 technology as the basic compression engine. The proposed CODEC, named RASTER, is a rate-scalable stereo image CODEC that has a unique ability to preserve the image quality at binocular depth boundaries, which is an important requirement in the design of a stereo image CODEC. The experimental results have shown that the proposed CODEC is able to achieve PSNR gains of up to 3.7 dB as compared to directly transmitting the right frame using JPEG-2000.
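
    The block-based disparity estimation/compensation in the pixel domain mentioned for CODEC I can be sketched roughly as follows. This is a generic illustration, not the thesis's implementation: the sum-of-absolute-differences (SAD) matching criterion, the horizontal-only search, the block size, the search range and the function names are all assumptions, since the abstract specifies none of them. Images are plain 2D lists whose dimensions are assumed to be multiples of the block size.

```python
def sad(left, right, y0, x0, size, d):
    """Sum of absolute differences between a right-image block and the
    left-image block displaced horizontally by d pixels."""
    return sum(
        abs(right[y][x] - left[y][x + d])
        for y in range(y0, y0 + size)
        for x in range(x0, x0 + size)
    )

def block_disparity(left, right, size=2, max_disp=3):
    """Disparity estimation: one horizontal disparity per right-image block."""
    rows, cols = len(right), len(right[0])
    field = []
    for y0 in range(0, rows - size + 1, size):
        row = []
        for x0 in range(0, cols - size + 1, size):
            # Keep only shifts whose candidate block stays inside the left image.
            candidates = [d for d in range(max_disp + 1) if x0 + size + d <= cols]
            row.append(min(candidates, key=lambda d: sad(left, right, y0, x0, size, d)))
        field.append(row)
    return field

def compensate(left, field, size=2):
    """Disparity compensation: predict the right image from matched left blocks."""
    rows, cols = len(left), len(left[0])
    predicted = [[0] * cols for _ in range(rows)]
    for by, row in enumerate(field):
        for bx, d in enumerate(row):
            for y in range(by * size, (by + 1) * size):
                for x in range(bx * size, (bx + 1) * size):
                    predicted[y][x] = left[y][x + d]
    return predicted

# Toy stereo pair: the right view is the left view shifted by two pixels,
# with the last two columns (no counterpart in the left view) set to zero.
left = [[5, 9, 1, 7, 3, 8, 2, 6],
        [4, 0, 6, 2, 9, 1, 7, 3]]
right = [row[2:] + [0, 0] for row in left]

field = block_disparity(left, right)
predicted = compensate(left, field)
# Predictive error image: the signal a codec would go on to transform and code.
residual = [[r - p for r, p in zip(rr, pr)] for rr, pr in zip(right, predicted)]
```

    Because each block is predicted independently, the residual can change abruptly from one block to the next. In the toy pair it is zero wherever a match exists and non-zero only in the last block, which has no counterpart in the left view.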

    Efficient summarization of stereoscopic video sequences
