10 research outputs found

    Fast Mode Decision Algorithms for Adaptive GOP Structure in the Scalable Extension of H.264/AVC

    We propose a fast mode decision algorithm to reduce the computational complexity of the adaptive GOP structure (AGS) in the scalable extension of H.264/AVC. AGS can improve the coding efficiency of the scalable extension of H.264. However, it needs to perform motion-compensated temporal filtering (MCTF) for all possible GOP sizes, which requires much more computation than a fixed GOP structure. In our proposed algorithm, after performing the MCTF with the maximum GOP size, we use two features to decide whether to perform the remaining MCTFs of sub-GOPs and mode selection. Experimental results show that the proposed algorithm can significantly reduce unnecessary MCTF computation for AGS while maintaining good coding efficiency. Department: Electrical Engineering.

    A hybrid error control and artifact detection mechanism for robust decoding of H.264/AVC video sequences

    This letter presents a hybrid error control and artifact detection (HECAD) mechanism that can be used to enhance the error-resilience capabilities of the standard H.264/advanced video coding (AVC) codec. The proposed solution first exploits the residual source redundancy to recover the most likely H.264/AVC bitstream. If error recovery is unsuccessful, the remaining corrupted slices are passed through a pixel-level artifact detection mechanism that detects the visually impaired macroblocks to be concealed. The proposed HECAD algorithm achieves overall peak signal-to-noise ratio gains between 0.4 dB and 4.5 dB relative to the standard, with no additional bandwidth requirement. The cost of this solution is a marginal increase in decoder complexity. In addition, this method can be applied in conjunction with other error-resilience strategies and scales well with different encoding configurations. Peer-reviewed.

    Robust decoder-based error control strategy for recovery of H.264/AVC video content

    Real-time wireless conversational and broadcasting multimedia applications pose particular transmission challenges because reliable content delivery cannot be guaranteed. Undelivered and erroneous content causes significant degradation in quality of experience. The H.264/AVC standard includes several error-resilience tools to mitigate this effect on video quality. However, the methods implemented by the standard are based on a packet-loss scenario, where corrupted slices are dropped and the lost information is concealed. Partially damaged slices still contain valuable information that can be used to enhance the quality of the recovered video. This study presents a novel error recovery solution that relies on a joint source-channel decoder to recover only feasible slices. A major advantage of this decoder-based strategy is that it grants additional robustness while keeping the same transmission data rate. Simulation results show that the proposed approach completely recovers 30.79% of the corrupted slices. This provides frame-by-frame peak signal-to-noise ratio (PSNR) gains of up to 18.1 dB, a result which, to the knowledge of the authors, is superior to all other joint source-channel decoding methods found in the literature. Furthermore, this error-resilience strategy can be combined with other error-resilience tools adopted by the standard to enhance their performance. Peer-reviewed.
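The abstracts above report their results as PSNR gains in dB. As a reminder of what that metric measures, here is a minimal sketch, assuming 8-bit samples (peak value 255) and frames flattened into plain lists; the function names `psnr` and `psnr_gain` are illustrative and do not come from the listed papers.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Return the PSNR in dB between two equally sized lists of pixel values."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and equally sized")
    # Mean squared error between the reference and the decoded frame.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(reference, baseline, improved):
    """PSNR gain in dB of an improved decoding over a baseline decoding."""
    return psnr(reference, improved) - psnr(reference, baseline)
```

A frame-by-frame gain of the kind quoted above would be obtained by calling `psnr_gain` once per frame, with the standard decoder's output as `baseline`.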

    Enhancing Wireless Video Transmissions in Virtual Collaboration Environments

    This paper introduces the virtual collaboration environment and discusses the problems encountered in the wireless video transmissions of the participating users. Different schemes are proposed and evaluated to address various problems encountered in the wireless access links of the virtual collaboration system, with the aim of enhancing the perceived visual quality. The schemes include radio network resource optimization, optimal joint source and channel rate allocation, and error resilience enhancement using SVC-MDC. These schemes have been shown to offer strong potential for incorporation in a virtual collaboration system for quality enhancement.

    Resilient Digital Video Transmission over Wireless Channels using Pixel-Level Artefact Detection Mechanisms

    Recent advances in communications and video coding technology have brought multimedia communications into everyday life, where a variety of services and applications are being integrated within different devices such that multimedia content is provided everywhere and on any device. H.264/AVC provides a major advance on preceding video coding standards, obtaining as much as twice the coding efficiency of these standards (Richardson I.E.G., 2003; Wiegand T. & Sullivan G.J., 2007). Furthermore, this new codec inserts video-related information within network abstraction layer units (NALUs), which facilitates the transmission of H.264/AVC coded sequences over a variety of network environments (Stockhammer, T. & Hannuksela M.M., 2005), making it applicable to a broad range of applications such as TV broadcasting, mobile TV, video-on-demand, digital media storage, high definition TV, multimedia streaming and conversational applications. Real-time wireless conversational and broadcast applications are particularly challenging as, in general, reliable delivery cannot be guaranteed (Stockhammer, T. & Hannuksela M.M., 2005). The H.264/AVC standard specifies several error-resilience strategies to minimise the effect of transmission errors on the perceptual quality of the reconstructed video sequences. However, these methods assume a packet-loss scenario where the receiver discards and conceals all the video information contained within a corrupted NALU packet. This implies that the error-resilience methods adopted by the standard operate at a lower bound, since not all the information contained within a corrupted NALU packet is unusable (Stockhammer, T. et al., 2003). Peer-reviewed.

    Error resilient H.264 coded video transmission over wireless channels

    The H.264/AVC recommendation was first published in 2003 and builds on the concepts of earlier standards such as MPEG-2 and MPEG-4. The H.264 recommendation represents an evolution of the existing video coding standards and was developed in response to the growing need for higher compression. Even though H.264 provides greater compression, H.264 compressed video streams are very prone to channel errors in mobile wireless fading channels such as 3G because of the high error rates experienced. Common video compression techniques include motion compensation, prediction methods, transformation, quantization and entropy coding, which are the common elements of a hybrid video codec. The ITU-T recommendation H.264 introduces several new error resilience tools, as well as several new features such as Intra Prediction and the Deblocking Filter. The channel model used for testing was a Rayleigh fading channel, with the noise component simulated as Additive White Gaussian Noise (AWGN) and QPSK as the modulation technique. The channel was simulated at several Eb/N0 values to provide bit error rates similar to those found in the literature. Though further research needs to be conducted, results have shown that when the H.264 error resilience tools are used to protect encoded bitstreams against minor channel errors, an improvement in the decoded video quality can be observed. The tools did not perform as well with mild and severe channel errors, as the resultant bitstream was too corrupted. Further research into channel coding techniques is therefore needed to determine whether the bitstream can be protected from these sorts of error rates.
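The channel model described above (QPSK over a Rayleigh fading channel with AWGN, evaluated at several Eb/N0 values) can be illustrated with a small Monte Carlo simulation. This is only a sketch under stated assumptions, not the setup used in the listed work: it assumes flat fading that is independent per symbol, Gray-mapped QPSK, and a receiver with perfect channel knowledge. The function names are hypothetical.

```python
import math
import random


def qpsk_rayleigh_ber(ebn0_db, n_bits=200_000, seed=0):
    """Estimate the bit error rate of QPSK over flat Rayleigh fading with AWGN.

    Assumptions (not from the source): independent fading per symbol and
    coherent detection with perfect knowledge of the channel coefficient.
    """
    rng = random.Random(seed)
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Each bit has unit energy, so the noise standard deviation per real
    # dimension is sqrt(N0 / 2) with N0 = 1 / (Eb/N0).
    sigma = math.sqrt(1.0 / (2.0 * ebn0))
    n_symbols = n_bits // 2
    errors = 0
    for _ in range(n_symbols):
        b0, b1 = rng.getrandbits(1), rng.getrandbits(1)
        # Map bit 0 -> +1 and bit 1 -> -1 on the I and Q components.
        symbol = complex(1 - 2 * b0, 1 - 2 * b1)
        # Complex Gaussian fading coefficient with E[|h|^2] = 1.
        h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
        noise = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        received = h * symbol + noise
        # Coherent detection: remove the channel phase, then take signs.
        z = received * h.conjugate()
        errors += (z.real < 0) != (b0 == 1)
        errors += (z.imag < 0) != (b1 == 1)
    return errors / (2 * n_symbols)


def theoretical_ber(ebn0_db):
    """Closed-form BER of coherent BPSK/QPSK over flat Rayleigh fading."""
    g = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * (1.0 - math.sqrt(g / (1.0 + g)))
```

Sweeping `ebn0_db` over a range of values and comparing the estimate with `theoretical_ber` gives the kind of bit-error-rate curve against which a bitstream protection scheme can be tested.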

    Overview of the scalable H.264/MPEG4-AVC extension

    No full text
    The scalable extension of H.264/MPEG4-AVC is a current standardization project of the Joint Video Team (JVT) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). This paper gives an overview of the design of the scalable H.264/MPEG4-AVC extension and describes the basic concepts for supporting temporal, spatial, and SNR scalability. The efficiency of the described concepts for providing spatial and SNR scalability is analyzed by means of simulation results and compared to H.264/MPEG4-AVC compliant single-layer coding.

    Analysis for Scalable Coding of Quality-Adjustable Sensor Data

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2014. 신현식. Machine-generated data such as sensor data now comprise a major portion of available information. This thesis addresses two important problems: storing massive sensor data collections and efficient sensing. We first propose a quality-adjustable sensor data archiving scheme, which compresses an entire collection of sensor data efficiently without compromising key features. Considering the data aging aspect of sensor data, we make our archiving scheme capable of controlling data fidelity to exploit users' less frequent access to older data. This quality adjustability leads to more efficient usage of storage space. In order to store data from various sensor types in a cost-effective way, we study the optimal storage configuration strategy using analytical models that capture the characteristics of our scheme. This strategy helps store sensor data blocks with the optimal configurations that maximize the data fidelity of various sensor data under a given storage space. Next, we consider efficient sensing schemes and propose a quality-adjustable sensing scheme. We adopt compressive sensing (CS), which is well suited to resource-limited sensors because of its low computational complexity. We enhance the quality adjustability intrinsic to CS with quantization and, especially, temporal downsampling. Our sensing architecture provides more rate-distortion operating points than previous schemes, which enables sensors to adapt data quality more efficiently with overall performance in mind. Moreover, the proposed temporal downsampling improves coding efficiency, which is a drawback of CS. At the same time, the downsampling, along with a sparse random matrix, further reduces the computational complexity of sensing devices. 
As a result, our quality-adjustable sensing can deliver gains to a wide variety of resource-constrained sensing techniques.
    Contents: Abstract; Contents; List of Figures; List of Tables
    Chapter 1 Introduction: 1.1 Motivation; 1.2 Spatio-Temporal Correlation in Sensor Data; 1.3 Quality Adjustability of Sensor Data; 1.4 Research Contributions; 1.5 Thesis Organization
    Chapter 2 Archiving of Sensor Data: 2.1 Encoding Sensor Data Collection (2.1.1 Archiving Architecture, 2.1.2 Data Conversion); 2.2 Compression Ratio Comparison; 2.3 Quality-Adjustable Archiving Model (2.3.1 Data Fidelity Model: Rate, 2.3.2 Data Fidelity Model: Distortion); 2.4 QP-Rate-Distortion Model; 2.5 Optimal Rate Allocation (2.5.1 Rate Allocation Strategy, 2.5.2 Optimal Storage Configuration, 2.5.3 Experimental Results)
    Chapter 3 Scalable Management of Storage: 3.1 Scalable Quality Management (3.1.1 Archiving Architecture, 3.1.2 Compression Ratio Comparison); 3.2 Enhancing Quality Adjustability (3.2.1 Data Fidelity Model: Rate, 3.2.2 Data Fidelity Model: Distortion); 3.3 Optimal Rate Allocation (3.3.1 Rate Allocation Strategy, 3.3.2 Optimal Storage Configuration, 3.3.3 Experimental Results)
    Chapter 4 Quality-Adjustable Sensing: 4.1 Compressive Sensing (4.1.1 Compressive Sensing Problem, 4.1.2 General Signal Recovery, 4.1.3 Noisy Signal Recovery); 4.2 Quality Adjustability in Sensing Environment (4.2.1 Quantization and Temporal Downsampling, 4.2.2 Optimization with Error Model); 4.3 Low-Complexity Sensing (4.3.1 Sparse Random Matrix, 4.3.2 Resource Savings)
    Chapter 5 Conclusions: 5.1 Summary; 5.2 Future Research Directions
    Bibliography; Abstract in Korean
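The sensing side of a scheme like the one described above (temporal downsampling, a sparse random measurement matrix, and quantization of the measurements) might be sketched as follows. This is an illustration under assumptions, not the thesis's design: the matrix uses a fixed number of random +1/-1 entries per row, the quantizer is uniform, and all function names and parameters are hypothetical. Signal recovery is not shown.

```python
import random


def downsample(signal, factor):
    """Temporal downsampling: keep every `factor`-th sample."""
    return signal[::factor]


def sparse_random_matrix(m, n, nonzeros_per_row, seed=0):
    """Build an m x n sparse matrix as rows of (column, sign) pairs.

    Each row has `nonzeros_per_row` entries equal to +1 or -1 (an assumed
    construction), so a measurement costs only a few additions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        columns = rng.sample(range(n), nonzeros_per_row)
        rows.append([(c, rng.choice((-1.0, 1.0))) for c in columns])
    return rows


def measure(rows, x):
    """Compute the compressive measurements y = Phi x for the sparse Phi."""
    return [sum(sign * x[c] for c, sign in row) for row in rows]


def quantize(values, step):
    """Uniform quantization of the measurements with the given step size."""
    return [step * round(v / step) for v in values]
```

In this sketch the downsampling factor, the number of measurements `m`, and the quantization step are the three knobs that trade data quality against rate and computation.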

    Error resilient stereoscopic video streaming using model-based fountain codes

    Ankara: The Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 2009. Thesis (Ph.D.) -- Bilkent University, 2009. Includes bibliographical references (leaves 101-110). Error resilient digital video streaming has been a challenging problem since the introduction and deployment of early packet switched networks. One of the most recent advances in video coding is multi-view video coding, which provides methods for the compression of correlated multiple image sequences. The existing multi-view compression techniques increase the loss sensitivity and necessitate the use of efficient loss recovery schemes. Forward Error Correction (FEC) is an efficient, powerful and practical tool for the recovery of lost data. Fountain codes are a novel class of FEC codes that are suitable for use with recent video codecs such as H.264/AVC; LT and Raptor codes are practical examples of this class. Although there are many studies on monoscopic video, the transmission of multi-view video through lossy channels with FEC has not been explored yet. To address this deficiency, an H.264-based multi-view video codec and a model-based Fountain code are combined to produce an efficient error resilient stereoscopic streaming system. Three layers of stereoscopic video with unequal importance are defined in order to exploit the benefits of Unequal Error Protection (UEP) with FEC. These layers correspond to the intra frames of the left view, the predicted frames of the left view and the predicted frames of the right view. The Rate-Distortion (RD) characteristics of these dependent layers are defined by extending the RD characteristics of monoscopic video. The parameters of the models are obtained by curve fitting using the RD samples of the video, and satisfactory results are achieved, with the average difference between the analytical models and the RD samples lying between 1.00% and 9.19%. 
A heuristic analytical model of the performance of Raptor codes is used to obtain the residual number of lost packets for a given channel bit rate, loss rate, and protection rate. This residual number is multiplied by the estimated average distortion caused by the loss of a single Network Abstraction Layer (NAL) unit to obtain the total transmission distortion. All these models are combined to minimize the end-to-end distortion and obtain optimal encoder bit rates and UEP rates. When the proposed system is used, the simulation results demonstrate up to a 2 dB increase in quality compared to equal error protection and to error protection of the left view only. Furthermore, Fountain codes are analyzed in the finite length region, and iterative performance models are derived without any assumptions or asymptotic approximations. The performance model of the belief-propagation (BP) decoder approximates either the behavior of a single simulation result or the average of the simulation results, depending on the parameters of the LT code. The performance model of the maximum likelihood decoder approximates the average of the simulation results more accurately than the model of the BP decoder. Raptor codes are modeled heuristically based on the exponential decay observed in the simulation results, and the model parameters are obtained by a line of best fit. The analytical models of systematic and non-systematic Raptor codes accurately approximate the experimental average performance. Tan, A Serdar. Ph.D.
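The distortion bookkeeping described above (residual lost packets multiplied by the average distortion per lost NAL unit, then minimized jointly with the encoder rate) can be sketched as a simple grid search. This is a hedged illustration only: the additive combination of source and transmission distortion, the single protection rate, and the caller-supplied models `source_distortion` and `residual_losses` are all assumptions standing in for the thesis's RD and Raptor-code models, which are not reproduced here.

```python
def transmission_distortion(residual_lost_packets, avg_distortion_per_nal):
    """Total transmission distortion: residual losses times per-NAL distortion."""
    return residual_lost_packets * avg_distortion_per_nal


def choose_protection_rate(candidate_rates, channel_rate, source_distortion,
                           residual_losses, avg_distortion_per_nal):
    """Grid-search the protection rate that minimizes end-to-end distortion.

    `source_distortion(rate)` and `residual_losses(protection_rate)` are
    hypothetical caller-supplied models. The channel rate is assumed to be
    split between source bits and protection bits.
    """
    best = None
    for p in candidate_rates:
        source_rate = channel_rate * (1.0 - p)
        total = source_distortion(source_rate) + transmission_distortion(
            residual_losses(p), avg_distortion_per_nal)
        if best is None or total < best[1]:
            best = (p, total)
    return best
```

The function returns the chosen protection rate together with its estimated end-to-end distortion; an unequal error protection scheme would repeat the search with one protection rate per layer.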

    High Dynamic Range Visual Content Compression

    This thesis addresses research questions in the compression of High Dynamic Range (HDR) visual content. HDR representations are intended to represent the actual physical value of the light rather than the exposed value. Current HDR compression schemes are extensions of legacy Low Dynamic Range (LDR) compression schemes that use Tone-Mapping Operators (TMO) to reduce the dynamic range of the HDR content. However, introducing a TMO increases the overall computational complexity and causes temporal artifacts. Furthermore, these compression schemes fail to compress non-salient regions differently from salient regions, even though the Human Visual System (HVS) perceives them differently. The main contribution of this thesis is a novel mapping-free, visual saliency-guided HDR content compression scheme. Firstly, the relationship between Discrete Wavelet Transform (DWT) lifting steps and TMO is explored. A novel approach is proposed that compresses HDR images with the Joint Photographic Experts Group (JPEG) 2000 codec while remaining backward compatible with LDR. This approach exploits the reversibility of tone mapping and the scalability of the DWT. Secondly, the importance of the TMO in HDR compression is evaluated. A mapping-free post HDR image compression scheme based on the JPEG and JPEG2000 standard codecs is proposed for current HDR image formats. This approach exploits the structure of HDR formats. It has equivalent compression performance and the lowest computational complexity compared to existing lossy HDR compression schemes (50% lower than the state of the art). Finally, the shortcomings of current HDR visual saliency models and of HDR visual saliency-guided compression are explored. A spatial saliency model for HDR visual content is proposed that outperforms others by 10% on the spatial visual prediction task with 70% lower computational complexity. Furthermore, the experiment suggests that more than 90% of temporal saliency is predicted by the proposed spatial model. 
Moreover, the proposed saliency model can be used to guide HDR compression by applying different quantization factors according to the intensity of the predicted saliency map.
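The idea of applying a different quantization factor according to predicted saliency can be sketched as below. The linear mapping from saliency to step size, the default step range, and the function names are assumptions made for illustration; the abstract does not specify the mapping used in the thesis.

```python
def quantization_step(saliency, q_min=1.0, q_max=8.0):
    """Map a saliency value in [0, 1] to a quantization step.

    Higher saliency gives a smaller (finer) step. The linear mapping and
    the default range are assumptions, not values from the source.
    """
    s = min(1.0, max(0.0, saliency))  # clamp out-of-range saliency values
    return q_max - s * (q_max - q_min)


def quantize_block(coefficients, saliency, q_min=1.0, q_max=8.0):
    """Uniformly quantize a block of coefficients using its saliency value."""
    q = quantization_step(saliency, q_min, q_max)
    return [q * round(c / q) for c in coefficients]
```

With these defaults, a block with saliency 1.0 keeps integer coefficients unchanged, while a block with saliency 0.0 is quantized with a step of 8 and therefore costs fewer bits.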