68 research outputs found

    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission

    Get PDF

    Selected topics on distributed video coding

    Get PDF
    Distributed Video Coding (DVC) is a new paradigm for video compression based on the information theoretical results of Slepian and Wolf (SW), and Wyner and Ziv (WZ). While conventional coding has a rigid complexity allocation as most of the complex tasks are performed at the encoder side, DVC enables a flexible complexity allocation between the encoder and the decoder. The most novel and interesting case is low complexity encoding and complex decoding, which is the opposite of conventional coding. While the latter is suitable for applications where the cost of the decoder is more critical than the encoder's one, DVC opens the door for a new range of applications where low complexity encoding is required and the decoder's complexity is not critical. This is interesting with the deployment of small and battery-powered multimedia mobile devices all around in our daily life. Further, since DVC operates as a reversed-complexity scheme when compared to conventional coding, DVC also enables the interesting scenario of low complexity encoding and decoding between two ends by transcoding between DVC and conventional coding. More specifically, low complexity encoding is possible by DVC at one end. Then, the resulting stream is decoded and conventionally re-encoded to enable low complexity decoding at the other end. Multiview video is attractive for a wide range of applications such as free viewpoint television, which is a system that allows viewing the scene from a viewpoint chosen by the viewer. Moreover, multiview can be beneficial for monitoring purposes in video surveillance. The increased use of multiview video systems is mainly due to the improvements in video technology and the reduced cost of cameras. While a multiview conventional codec will try to exploit the correlation among the different cameras at the encoder side, DVC allows for separate encoding of correlated video sources. Therefore, DVC requires no communication between the cameras in a multiview scenario. This is an advantage since communication is time consuming (i.e. more delay) and requires complex networking. Another appealing feature of DVC is the fact that it is based on a statistical framework. Moreover, DVC behaves as a natural joint source-channel coding solution. This results in an improved error resilience performance when compared to conventional coding. Further, DVC-based scalable codecs do not require a deterministic knowledge of the lower layers. In other words, the enhancement layers are completely independent from the base layer codec. This is called the codec-independent scalability feature, which offers a high flexibility in the way the various layers are distributed in a network. This thesis addresses the following topics: First, the theoretical foundations of DVC as well as the practical DVC scheme used in this research are presented. The potential applications for DVC are also outlined. DVC-based schemes use conventional coding to compress parts of the data, while the rest is compressed in a distributed fashion. Thus, different conventional codecs are studied in this research as they are compared in terms of compression efficiency for a rich set of sequences. This includes fine tuning the compression parameters such that the best performance is achieved for each codec. Further, DVC tools for improved Side Information (SI) and Error Concealment (EC) are introduced for monoview DVC using a partially decoded frame. The improved SI results in a significant gain in reconstruction quality for video with high activity and motion. This is done by re-estimating the erroneous motion vectors using the partially decoded frame to improve the SI quality. The latter is then used to enhance the reconstruction of the finally decoded frame. Further, the introduced spatio-temporal EC improves the quality of decoded video in the case of erroneously received packets, outperforming both spatial and temporal EC. Moreover, it also outperforms error-concealed conventional coding in different modes. Then, multiview DVC is studied in terms of SI generation, which differentiates it from the monoview case. More specifically, different multiview prediction techniques for SI generation are described and compared in terms of prediction quality, complexity and compression efficiency. Further, a technique for iterative multiview SI is introduced, where the final SI is used in an enhanced reconstruction process. The iterative SI outperforms the other SI generation techniques, especially for high motion video content. Finally, fusion techniques of temporal and inter-view side informations are introduced as well, which improves the performance of multiview DVC over monoview coding. DVC is also used to enable scalability for image and video coding. Since DVC is based on a statistical framework, the base and enhancement layers are completely independent, which is an interesting property called codec-independent scalability. Moreover, the introduced DVC scalable schemes show a good robustness to errors as the quality of decoded video steadily decreases with error rate increase. On the other hand, conventional coding exhibits a cliff effect as the performance drops dramatically after a certain error rate value. Further, the issue of privacy protection is addressed for DVC by transform domain scrambling, which is used to alter regions of interest in video such that the scene is still understood and privacy is preserved as well. The proposed scrambling techniques are shown to provide a good level of security without impairing the performance of the DVC scheme when compared to the one without scrambling. This is particularly attractive for video surveillance scenarios, which is one of the most promising applications for DVC. Finally, a practical DVC demonstrator built during this research is described, where the main requirements as well as the observed limitations are presented. Furthermore, it is defined in a setup being as close as possible to a complete real application scenario. This shows that it is actually possible to implement a complete end-to-end practical DVC system relying only on realistic assumptions. Even though DVC is inferior in terms of compression efficiency to the state of the art conventional coding for the moment, strengths of DVC reside in its good error resilience properties and the codec-independent scalability feature. Therefore, DVC offers promising possibilities for video compression with transmission over error-prone environments requirement as it significantly outperforms conventional coding in this case

    Computational inference and control of quality in multimedia services

    Get PDF
    Quality is the degree of excellence we expect of a service or a product. It is also one of the key factors that determine its value. For multimedia services, understanding the experienced quality means understanding how the delivered delity, precision and reliability correspond to the users' expectations. Yet the quality of multimedia services is inextricably linked to the underlying technology. It is developments in video recording, compression and transport as well as display technologies that enables high quality multimedia services to become ubiquitous. The constant evolution of these technologies delivers a steady increase in performance, but also a growing level of complexity. As new technologies stack on top of each other the interactions between them and their components become more intricate and obscure. In this environment optimizing the delivered quality of multimedia services becomes increasingly challenging. The factors that aect the experienced quality, or Quality of Experience (QoE), tend to have complex non-linear relationships. The subjectively perceived QoE is hard to measure directly and continuously evolves with the user's expectations. Faced with the diculty of designing an expert system for QoE management that relies on painstaking measurements and intricate heuristics, we turn to an approach based on learning or inference. The set of solutions presented in this work rely on computational intelligence techniques that do inference over the large set of signals coming from the system to deliver QoE models based on user feedback. We furthermore present solutions for inference of optimized control in systems with no guarantees for resource availability. This approach oers the opportunity to be more accurate in assessing the perceived quality, to incorporate more factors and to adapt as technology and user expectations evolve. In a similar fashion, the inferred control strategies can uncover more intricate patterns coming from the sensors and therefore implement farther-reaching decisions. Similarly to natural systems, this continuous adaptation and learning makes these systems more robust to perturbations in the environment, longer lasting accuracy and higher eciency in dealing with increased complexity. Overcoming this increasing complexity and diversity is crucial for addressing the challenges of future multimedia system. Through experiments and simulations this work demonstrates that adopting an approach of learning can improve the sub jective and objective QoE estimation, enable the implementation of ecient and scalable QoE management as well as ecient control mechanisms

    No-reference image and video quality assessment: a classification and review of recent approaches

    Get PDF

    Video Quality Metrics

    Get PDF

    QUALITY-DRIVEN CROSS LAYER DESIGN FOR MULTIMEDIA SECURITY OVER RESOURCE CONSTRAINED WIRELESS SENSOR NETWORKS

    Get PDF
    The strong need for security guarantee, e.g., integrity and authenticity, as well as privacy and confidentiality in wireless multimedia services has driven the development of an emerging research area in low cost Wireless Multimedia Sensor Networks (WMSNs). Unfortunately, those conventional encryption and authentication techniques cannot be applied directly to WMSNs due to inborn challenges such as extremely limited energy, computing and bandwidth resources. This dissertation provides a quality-driven security design and resource allocation framework for WMSNs. The contribution of this dissertation bridges the inter-disciplinary research gap between high layer multimedia signal processing and low layer computer networking. It formulates the generic problem of quality-driven multimedia resource allocation in WMSNs and proposes a cross layer solution. The fundamental methodologies of multimedia selective encryption and stream authentication, and their application to digital image or video compression standards are presented. New multimedia selective encryption and stream authentication schemes are proposed at application layer, which significantly reduces encryption/authentication complexity. In addition, network resource allocation methodologies at low layers are extensively studied. An unequal error protection-based network resource allocation scheme is proposed to achieve the best effort media quality with integrity and energy efficiency guarantee. Performance evaluation results show that this cross layer framework achieves considerable energy-quality-security gain by jointly designing multimedia selective encryption/multimedia stream authentication and communication resource allocation

    Prioritizing Content of Interest in Multimedia Data Compression

    Get PDF
    Image and video compression techniques make data transmission and storage in digital multimedia systems more efficient and feasible for the system's limited storage and bandwidth. Many generic image and video compression techniques such as JPEG and H.264/AVC have been standardized and are now widely adopted. Despite their great success, we observe that these standard compression techniques are not the best solution for data compression in special types of multimedia systems such as microscopy videos and low-power wireless broadcast systems. In these application-specific systems where the content of interest in the multimedia data is known and well-defined, we should re-think the design of a data compression pipeline. We hypothesize that by identifying and prioritizing multimedia data's content of interest, new compression methods can be invented that are far more effective than standard techniques. In this dissertation, a set of new data compression methods based on the idea of prioritizing the content of interest has been proposed for three different kinds of multimedia systems. I will show that the key to designing efficient compression techniques in these three cases is to prioritize the content of interest in the data. The definition of the content of interest of multimedia data depends on the application. First, I show that for microscopy videos, the content of interest is defined as the spatial regions in the video frame with pixels that don't only contain noise. Keeping data in those regions with high quality and throwing out other information yields to a novel microscopy video compression technique. Second, I show that for a Bluetooth low energy beacon based system, practical multimedia data storage and transmission is possible by prioritizing content of interest. I designed custom image compression techniques that preserve edges in a binary image, or foreground regions of a color image of indoor or outdoor objects. Last, I present a new indoor Bluetooth low energy beacon based augmented reality system that integrates a 3D moving object compression method that prioritizes the content of interest.Doctor of Philosoph

    Active and passive approaches for image authentication

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Efficient simultaneous encryption and compression of digital videos in computationally constrained applications

    Get PDF
    This thesis is concerned with the secure video transmission over open and wireless network channels. This would facilitate adequate interaction in computationally constrained applications among trusted entities such as in disaster/conflict zones, secure airborne transmission of videos for intelligence/security or surveillance purposes, and secure video communication for law enforcing agencies in crime fighting or in proactive forensics. Video content is generally too large and vulnerable to eavesdropping when transmitted over open network channels so that compression and encryption become very essential for storage and/or transmission. In terms of security, wireless channels, are more vulnerable than other kinds of mediums to a variety of attacks and eavesdropping. Since wireless communication is the main mode in the above applications, protecting video transmissions from unauthorized access through such network channels is a must. The main and multi-faceted challenges that one faces in implementing such a task are related to competing, and to some extent conflicting, requirements of a number of standard control factors relating to the constrained bandwidth, reasonably high image quality at the receiving end, the execution time, and robustness against security attacks. Applying both compression and encryption techniques simultaneously is a very tough challenge due to the fact that we need to optimize the compression ratio, time complexity, security and the quality simultaneously. There are different available image/video compression schemes that provide reasonable compression while attempting to maintain image quality, such as JPEG, MPEG and JPEG2000. The main approach to video compression is based on detecting and removing spatial correlation within the video frames as well as temporal correlations across the video frames. Temporal correlations are expected to be more evident across sequences of frames captured within a short period of time (often a fraction of a second). Correlation can be measured in terms of similarity between blocks of pixels. Frequency domain transforms such as the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) have both been used restructure the frequency content (coefficients) to become amenable for efficient detection. JPEG and MPEG use DCT while JPEG2000 uses DWT. Removing spatial/temporal correlation encodes only one block from each class of equivalent (i.e. similar) blocks and remembering the position of all other block within the equivalence class. JPEG2000 compressed images achieve higher image quality than JPEG for the same compression ratios, while DCT based coding suffer from noticeable distortion at high compression ratio but when applied to any block it is easy to isolate the significant coefficients from the non-significant ones. Efficient video encryption in computationally constrained applications is another challenge on its own. It has long been recognised that selective encryption is the only viable approach to deal with the overwhelming file size. Selection can be made in the spatial or frequency domain. Efficiency of simultaneous compression and encryption is a good reason for us to apply selective encryption in the frequency domain. In this thesis we develop a hybrid of DWT and DCT for improved image/video compression in terms of image quality, compression ratio, bandwidth, and efficiency. We shall also investigate other techniques that have similar properties to the DCT in terms of representation of significant wavelet coefficients. The statistical properties of wavelet transform high frequency sub-bands provide one such approach, and we also propose phase sensing as another alternative but very efficient scheme. Simultaneous compression and encryption, in our investigations, were aimed at finding the best way of applying these two tasks in parallel by selecting some wavelet sub-bands for encryptions and applying compression on the other sub-bands. Since most spatial/temporal correlation appear in the high frequency wavelet sub-bands and the LL sub-bands of wavelet transformed images approximate the original images then we select the LL-sub-band data for encryption and the non-LL high frequency sub-band coefficients for compression. We also follow the common practice of using stream ciphers to meet efficiency requirements of real-time transmission. For key stream generation we investigated a number of schemes and the ultimate choice will depend on robustness to attacks. The still image (i.e. RF’s) are compressed with a modified EZW wavelet scheme by applying the DCT on the blocks of the wavelet sub-bands, selecting appropriate thresholds for determining significance of coefficients, and encrypting the EZW thresholds only with a simple 10-bit LFSR cipher This scheme is reasonably efficient in terms of processing time, compression ratio, image quality, as well was security robustness against statistical and frequency attack. However, many areas for improvements were identified as necessary to achieve the objectives of the thesis. Through a process of refinement we developed and tested 3 different secure efficient video compression schemes, whereby at each step we improve the performance of the scheme in the previous step. Extensive experiments are conducted to test performance of the new scheme, at each refined stage, in terms of efficiency, compression ratio, image quality, and security robustness. Depending on the aspects of compression that needs improvement at each refinement step, we replaced the previous block coding scheme with a more appropriate one from among the 3 above mentioned schemes (i.e. DCT, Edge sensing and phase sensing) for the reference frames or the non-reference ones. In subsequent refinement steps we apply encryption to a slightly expanded LL-sub-band using successively more secure stream ciphers, but with different approaches to key stream generation. In the first refinement step, encryption utilized two LFSRs seeded with three secret keys to scramble the significant wavelet LL-coefficients multiple times. In the second approach, the encryption algorithm utilises LFSR to scramble the wavelet coefficients of the edges extracted from the low frequency sub-band. These edges are mapped from the high frequency sub-bands using different threshold. Finally, use a version of the A5 cipher combined with chaotic logistic map to encrypt the significant parameters of the LL sub-band. Our empirical results show that the refinement process achieves the ultimate objectives of the thesis, i.e. efficient secure video compression scheme that is scalable in terms of the frame size at about 100 fps and satisfying the following features; high compression, reasonable quality, and resistance to the statistical, frequency and the brute force attack with low computational processing. Although image quality fluctuates depending on video complexity, in the conclusion we recommend an adaptive implementation of our scheme. Although this thesis does not deal with transmission tasks but the efficiency achieved in terms of video encryption and compression time as well as in compression ratios will be sufficient for real-time secure transmission of video using commercially available mobile computing devices

    Efficient algorithms for scalable video coding

    Get PDF
    A scalable video bitstream specifically designed for the needs of various client terminals, network conditions, and user demands is much desired in current and future video transmission and storage systems. The scalable extension of the H.264/AVC standard (SVC) has been developed to satisfy the new challenges posed by heterogeneous environments, as it permits a single video stream to be decoded fully or partially with variable quality, resolution, and frame rate in order to adapt to a specific application. This thesis presents novel improved algorithms for SVC, including: 1) a fast inter-frame and inter-layer coding mode selection algorithm based on motion activity; 2) a hierarchical fast mode selection algorithm; 3) a two-part Rate Distortion (RD) model targeting the properties of different prediction modes for the SVC rate control scheme; and 4) an optimised Mean Absolute Difference (MAD) prediction model. The proposed fast inter-frame and inter-layer mode selection algorithm is based on the empirical observation that a macroblock (MB) with slow movement is more likely to be best matched by one in the same resolution layer. However, for a macroblock with fast movement, motion estimation between layers is required. Simulation results show that the algorithm can reduce the encoding time by up to 40%, with negligible degradation in RD performance. The proposed hierarchical fast mode selection scheme comprises four levels and makes full use of inter-layer, temporal and spatial correlation aswell as the texture information of each macroblock. Overall, the new technique demonstrates the same coding performance in terms of picture quality and compression ratio as that of the SVC standard, yet produces a saving in encoding time of up to 84%. Compared with state-of-the-art SVC fast mode selection algorithms, the proposed algorithm achieves a superior computational time reduction under very similar RD performance conditions. The existing SVC rate distortion model cannot accurately represent the RD properties of the prediction modes, because it is influenced by the use of inter-layer prediction. A separate RD model for inter-layer prediction coding in the enhancement layer(s) is therefore introduced. Overall, the proposed algorithms improve the average PSNR by up to 0.34dB or produce an average saving in bit rate of up to 7.78%. Furthermore, the control accuracy is maintained to within 0.07% on average. As aMADprediction error always exists and cannot be avoided, an optimisedMADprediction model for the spatial enhancement layers is proposed that considers the MAD from previous temporal frames and previous spatial frames together, to achieve a more accurateMADprediction. Simulation results indicate that the proposedMADprediction model reduces the MAD prediction error by up to 79% compared with the JVT-W043 implementation
    corecore