18 research outputs found

    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission

    Towards Computational Efficiency of Next Generation Multimedia Systems

    To address the throughput demands of complex applications (such as multimedia), a next-generation system designer needs to co-design and co-optimize the hardware and software layers. Hardware/software knobs must be tuned in synergy to increase the throughput efficiency. This thesis provides such algorithmic and architectural solutions, while considering new technology challenges (power cap and memory aging). The goal is to maximize the throughput efficiency under timing and hardware constraints.

    Selective encryption in the CCSDS standard for lossless and near-lossless multispectral and hyperspectral image compression

    In this paper, we investigate low-complexity encryption solutions to be embedded in the recently proposed CCSDS standard for lossless and near-lossless multispectral and hyperspectral image compression. The proposed approach is based on the randomization of selected components in the image compression pipeline, namely the sign of prediction residual and the fixed part of Rice-Golomb codes, inspired by similar solutions adopted in video coding. Thanks to the adaptive nature of the CCSDS algorithm, even simple randomization of the sign of prediction residuals can provide a sufficient scrambling of the decoded image when the encryption key is not available. Results on the standard CCSDS test set show that the proposed technique uses on average only about 20% of the keystream compared to a conventional stream cipher, with a negligible increase of the rate of the encoder
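The sign-randomization idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the keystream generator (SHA-256 in counter mode), the function names `keystream_bits` and `scramble_signs`, and the choice to skip zero residuals are all assumptions made for illustration, and the sketch works on a plain list of residuals rather than inside the CCSDS pipeline.

```python
import hashlib
from typing import List


def keystream_bits(key: bytes, n: int) -> List[int]:
    """Return n pseudo-random bits (illustrative stand-in for a stream cipher)."""
    bits: List[int] = []
    counter = 0
    while len(bits) < n:
        # Hash the key together with a running counter to get 256 fresh bits.
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(8):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:n]


def scramble_signs(residuals: List[int], key: bytes) -> List[int]:
    """Flip the sign of each nonzero residual whose keystream bit is 1.

    Zero residuals carry no sign, so they consume no keystream bit
    (an assumption of this sketch). Applying the function twice with
    the same key restores the input.
    """
    nonzero = sum(1 for r in residuals if r != 0)
    bits = iter(keystream_bits(key, nonzero))
    out = []
    for r in residuals:
        if r == 0:
            out.append(0)
        else:
            out.append(-r if next(bits) else r)
    return out


residuals = [3, -1, 0, 7, -4, 0, 2]
encrypted = scramble_signs(residuals, b"demo-key")
decrypted = scramble_signs(encrypted, b"demo-key")
```

Only one keystream bit per nonzero residual is consumed, which is the kind of saving over encrypting the whole bitstream that the abstract reports.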

    MAP Joint Source-Channel Arithmetic Decoding for Compressed Video

    To achieve robust video transmission over error-prone telecommunication channels, several mechanisms have been introduced. These mechanisms try to detect, correct or conceal the errors in the received video stream. In this thesis, the performance of the video codec is improved in terms of error rates without increasing overhead in terms of data bit rate. This is done by exploiting the residual syntactic/semantic redundancy inside compressed video along with optimizing the configuration of the state-of-the-art entropy coding, i.e., binary arithmetic coding, and optimizing the quantization of the channel output. The thesis is divided into four phases. In the first phase, a breadth-first suboptimal sequential maximum a posteriori (MAP) decoder is employed for joint source-channel arithmetic decoding of H.264 symbols. The proposed decoder uses not only the intentional redundancy inserted via a forbidden symbol (FS) but also exploits residual redundancy by means of a syntax checker. In contrast to previous methods, this is done as each channel bit is decoded. Simulations using intra prediction modes show improvements in error rates, e.g., a reduction of the syntax element error rate by an order of magnitude for a channel SNR of 7.33 dB. The cost of this improvement is additional computational complexity spent on the syntax checking. In the second phase, the configuration of the FS in the symbol set is studied. The delay probability function, i.e., the probability of the number of bits required to detect an error, is calculated for various FS configurations. The probability of missed error detection is calculated as a figure of merit for optimizing the FS configuration. The simulation results show the effectiveness of the proposed figure of merit, and indicate that the best FS configuration is the one in which the FS lies entirely between the other information-carrying symbols. In the third phase, a new method for estimating the a priori probability of particular syntax elements is proposed. 
This estimation is based on the interdependency among the syntax elements that were previously decoded. The estimate is categorized as either reliable or unreliable. The decoder uses this prior information when it is reliable; otherwise the MAP decoder treats the syntax elements as equiprobable and in turn uses maximum likelihood (ML) decoding. The reliability detection is carried out using a threshold on the local entropy of syntax elements in the neighboring macroblocks. In the last phase, a new measure to assess the performance of the channel quantizer is proposed. This measure is based on the statistics of the rank of the true candidate among the sorted list of candidates in the MAP decoder. Simulation results show that a quantizer designed using the proposed measure is superior to quantizers designed using maximum mutual information and minimum mean square error
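The MAP-with-ML-fallback rule from the third phase can be sketched as follows. This is a simplified illustration, not the thesis decoder: it assumes the prior is a smoothed histogram of neighboring values, that low local entropy means the prior is reliable, and that a threshold of 1.0 bit is reasonable. The names `local_entropy` and `decode_element` are hypothetical.

```python
import math
from collections import Counter


def local_entropy(neighbor_values):
    """Shannon entropy (in bits) of the values seen in neighboring macroblocks."""
    counts = Counter(neighbor_values)
    total = len(neighbor_values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def decode_element(likelihoods, neighbor_values, entropy_threshold=1.0):
    """Pick a syntax element value by MAP if the prior looks reliable, else by ML.

    likelihoods: dict mapping each candidate value to P(received | candidate).
    neighbor_values: previously decoded values of the same element nearby.
    """
    reliable = bool(neighbor_values) and (
        local_entropy(neighbor_values) <= entropy_threshold
    )
    if reliable:
        counts = Counter(neighbor_values)
        total = len(neighbor_values)
        k = len(likelihoods)
        # Laplace-smoothed prior so unseen candidates keep nonzero probability.
        prior = {c: (counts.get(c, 0) + 1) / (total + k) for c in likelihoods}
        mode = "MAP"
    else:
        # Equiprobable prior: MAP reduces to maximum likelihood.
        prior = {c: 1.0 for c in likelihoods}
        mode = "ML"
    best = max(likelihoods, key=lambda c: likelihoods[c] * prior[c])
    return best, mode


likelihoods = {0: 0.40, 1: 0.45, 2: 0.15}
consistent = decode_element(likelihoods, [0, 0, 0, 0])
scattered = decode_element(likelihoods, [0, 1, 2, 1])
```

With consistent neighbors the prior overrides a slightly higher channel likelihood; with scattered neighbors the decision falls back to the likelihood alone.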

    Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) for Image and Video Compression

    In the current era of booming information, image and video consumption is ubiquitous. The associated image and video coding operations require significant computing resources, both on small-scale computing systems and over larger network systems. For different scenarios, power, bitrate and image quality can impose significant time-varying constraints. For example, mobile devices (e.g., phones, tablets, laptops, UAVs) come with significant constraints on energy and power. Similarly, computer networks provide time-varying bandwidth that can depend on signal strength (e.g., wireless networks) or network traffic conditions. Alternatively, the users can impose different constraints on image quality based on their interests. Traditional image and video coding systems have focused on rate-distortion optimization. More recently, distortion measures (e.g., PSNR) are being replaced by more sophisticated image quality metrics. However, these systems are based on fixed hardware configurations that provide limited options over power consumption. The use of dynamic partial reconfiguration with Field Programmable Gate Arrays (FPGAs) provides an opportunity to effectively control dynamic power consumption by jointly considering software-hardware configurations. This dissertation extends traditional rate-distortion optimization to rate-quality-power/energy optimization and demonstrates a wide variety of applications in both image and video compression. In each application, a family of Pareto-optimal configurations is developed that allows fine control in the rate-quality-power/energy optimization space. The term Dynamically Reconfigurable Architectures and Systems for Time-varying Image Constraints (DRASTIC) is used to describe the derived systems. 
DRASTIC covers both software-only as well as software-hardware configurations to achieve fine optimization over a set of general modes that include: (i) maximum image quality, (ii) minimum dynamic power/energy, (iii) minimum bitrate, and (iv) typical mode over a set of opposing constraints to guarantee satisfactory performance. In joint software-hardware configurations, DRASTIC provides an effective approach for dynamic power optimization. For software configurations, DRASTIC provides an effective method for energy consumption optimization by controlling processing times. The dissertation provides several applications. First, stochastic methods are given for computing quantization tables that are optimal in the rate-quality space and demonstrated on standard JPEG compression. Second, a DRASTIC implementation of the DCT is used to demonstrate the effectiveness of the approach on motion JPEG. Third, a reconfigurable deblocking filter system is investigated for use in the current H.264/AVC systems. Fourth, the dissertation develops DRASTIC for all 35 intra-prediction modes as well as intra-encoding for the emerging High Efficiency Video Coding standard (HEVC)
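Selecting from a family of Pareto-optimal configurations in the rate-quality-power space can be sketched as below. The configuration values, the `Config` type, and the functions `pareto_front` and `select` are hypothetical and chosen only for illustration. The sketch covers the three single-objective modes under optional constraints; the "typical" mode is not modeled separately.

```python
from typing import List, NamedTuple, Optional


class Config(NamedTuple):
    name: str
    bitrate: float  # lower is better
    quality: float  # higher is better
    power: float    # lower is better


def dominates(a: Config, b: Config) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = a.bitrate <= b.bitrate and a.quality >= b.quality and a.power <= b.power
    better = a.bitrate < b.bitrate or a.quality > b.quality or a.power < b.power
    return no_worse and better


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


def select(front: List[Config], mode: str,
           max_bitrate: float = float("inf"),
           min_quality: float = float("-inf"),
           max_power: float = float("inf")) -> Optional[Config]:
    """Optimize one objective over the configurations that meet the constraints."""
    feasible = [c for c in front
                if c.bitrate <= max_bitrate
                and c.quality >= min_quality
                and c.power <= max_power]
    if not feasible:
        return None  # no configuration satisfies the current constraints
    if mode == "max_quality":
        return max(feasible, key=lambda c: c.quality)
    if mode == "min_power":
        return min(feasible, key=lambda c: c.power)
    if mode == "min_bitrate":
        return min(feasible, key=lambda c: c.bitrate)
    raise ValueError("unknown mode: " + mode)


configs = [
    Config("A", bitrate=1.0, quality=30.0, power=0.5),
    Config("B", bitrate=2.0, quality=36.0, power=0.8),
    Config("C", bitrate=2.5, quality=35.0, power=0.9),  # dominated by B
    Config("D", bitrate=3.0, quality=40.0, power=1.5),
]
front = pareto_front(configs)
best_quality_under_power_cap = select(front, "max_quality", max_power=1.0)
```

Because the constraints are arguments, the same front can be queried again whenever the power budget or available bandwidth changes over time.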

    Network distributed 3D video quality monitoring system

    This project description presents a research and development work whose primary goal was the design and implementation of an Internet Protocol (IP) network distributed video quality assessment tool. Even though the system was designed to monitor H.264 three-dimensional (3D) stereo video quality it is also applicable to different formats of 3D video (such as texture plus depth) and can use different video quality assessment models making it easily customizable and adaptable to varying conditions and transmission scenarios. The system uses packet level data collection done by a set of network probes located at convenient network points, that carry out packet monitoring, inspection and analysis to obtain information about 3D video packets passing through the probe's locations. The information gathered is sent to a central server for further processing including 3D video quality estimation based on packet level information. Firstly an overview of current 3D video standards, their evolution and features is presented, strongly focused on H.264/AVC and HEVC. Then follows a description of video quality assessment metrics, describing in more detail the quality estimator used in the work. Video transport methods over the Internet Protocol are also explained in detail as thorough knowledge of video packetization schemes is important to understand the information retrieval and parsing performed at the front stage of the system, the probes. After those introductory themes are addressed, a general system architecture is shown, explaining all its components and how they interact with each other. The development steps of each of the components are then thoroughly described. In addition to the main project, a 3D video streamer was created to be used in the implementation tests of the system. This streamer was purposely built for the present work as currently available free-domain streamers do not support 3D video streaming. 
The overall result is a system that can be deployed in any IP network and is flexible enough to help in future video quality assessment research, since it can be used as a testing platform to validate any proposed new quality metrics, serve as a network monitoring tool for video transmission or help to understand the impact that some network characteristics may have on video quality

    An objective and subjective quality assessment for passive gaming video streaming

    Gaming video streaming has become increasingly popular in recent times. Along with the rise and popularity of cloud gaming services and e-sports, passive gaming video streaming services such as Twitch.tv, YouTubeGaming, etc. where viewers watch the gameplay of other gamers, have seen increasing acceptance. Twitch.tv alone has over 2.2 million monthly streamers and 15 million daily active users with almost a million average concurrent users, making Twitch.tv the 4th biggest internet traffic generator, just after Netflix, YouTube and Apple. Despite the increasing importance and popularity of such live gaming video streaming services, they have until recently not caught the attention of the quality assessment research community. For the continued success of such services, it is imperative to maintain and satisfy the end user Quality of Experience (QoE), which can be measured using various Video Quality Assessment (VQA) methods. Gaming videos are synthetic and artificial in nature and have different streaming requirements as compared to traditional non-gaming content. While there exist a lot of subjective and objective studies in the field of quality assessment of Video-on-demand (VOD) streaming services, such as Netflix and YouTube, along with the design of many VQA metrics, no work has been done previously towards quality assessment of live passive gaming video streaming applications. The research work in this thesis tries to address this gap by using various subjective and objective quality assessment studies. A codec comparison using the three most popular and widely used compression standards is performed to determine their compression efficiency. Furthermore, a subjective and objective comparative study is carried out to find out the difference between gaming and non-gaming videos in terms of the trade-off between quality and data-rate after compression. 
This is followed by the creation of an open source gaming video dataset, which is then used for a performance evaluation study of the eight most popular VQA metrics. Different temporal pooling strategies and content based classification approaches are evaluated to assess their effect on the VQA metrics. Finally, due to the low performance of existing No-Reference (NR) VQA metrics on gaming video content, two machine learning based NR models are designed using NR features and existing NR metrics, which are shown to outperform existing NR metrics while performing on par with state-of-the-art Full-Reference (FR) VQA metrics
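Temporal pooling turns per-frame quality scores into one score per video. The abstract does not say which strategies were evaluated, so the four below (arithmetic mean, harmonic mean, minimum, and mean of the worst fraction of frames) are common choices picked as assumptions, and the score values are invented for illustration.

```python
import statistics


def pool_scores(frame_scores, worst_fraction=0.25):
    """Pool positive per-frame quality scores into video-level scores.

    Returns a dict with one entry per pooling strategy. The harmonic mean
    and the worst-fraction mean weight low-quality frames more heavily
    than the arithmetic mean does.
    """
    ordered = sorted(frame_scores)
    # Always keep at least one frame in the worst-fraction subset.
    k = max(1, int(len(ordered) * worst_fraction))
    return {
        "mean": statistics.mean(frame_scores),
        "harmonic_mean": statistics.harmonic_mean(frame_scores),
        "min": ordered[0],
        "worst_fraction_mean": statistics.mean(ordered[:k]),
    }


# Hypothetical per-frame scores with one short quality drop.
frame_scores = [40.0, 20.0, 40.0, 40.0]
pooled = pool_scores(frame_scores)
```

Comparing the pooled values shows how much a single poor frame pulls each strategy down, which is the effect such an evaluation would measure against subjective ratings.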

    Compressão de dados sensoriais em sistemas robóticos (Sensor data compression in robotic systems)

    One of the main problems in the development and debugging of robotic systems is the amount of data stored in files containing sensor data (e.g., ROS proprietary log files, known as bags). If we consider a robot with several cameras and other sensors that collect information from the environment several times per second, we quickly obtain very large files. Besides the concerns regarding storage and, in some cases, transmission, it becomes extremely hard to find important information in these files. In this thesis, we tried to solve both problems by studying and implementing data compression solutions to reduce the size of these files. The main focus was image and video compression, since images and video are by far the most storage-consuming data. Moreover, we conducted a detailed study of the effect of lossy compression methods on the performance of some state-of-the-art image analysis algorithms. Another contribution was the development of an intelligent video player to help roboticists in their work while they evaluate the recorded data after experiments. Parts of the video that do not contain relevant information are skipped during playback. Based on the results, we concluded that ROS native compression is not sufficient. Furthermore, solutions based on ROS, or virtually any robotic system that has to deal with image/video data, would benefit from the use of an H.265 codec, as it provides the smallest number of bits per pixel without a significant penalty on the performance of image analysis algorithms. Master's degree in Computer and Telematics Engineering

    Fast and Efficient Foveated Video Compression Schemes for H.264/AVC Platform

    Some fast and efficient foveated video compression schemes for the H.264/AVC platform are presented in this dissertation. The exponential growth in networking technologies and the widespread use of video-based multimedia information over the internet for mass communication applications like social networking, e-commerce and education have promoted the development of video coding to a great extent. Recently, foveated imaging based image or video compression schemes have been in high demand, as they not only match the perception of the human visual system (HVS), but also yield a higher compression ratio. The important or salient regions are compressed with higher visual quality while the non-salient regions are compressed with a higher compression ratio. Among the foveated video compression developments of the last few years, saliency detection based foveated schemes are an area of intense research. Keeping this in mind, we propose two multi-scale saliency detection schemes. (1) Multi-scale phase spectrum based saliency detection (FTPBSD); (2) Sign-DCT multi-scale pseudo-phase spectrum based saliency detection (SDCTPBSD). In the FTPBSD scheme, a saliency map is determined using the phase spectrum of a given image/video with unity magnitude spectrum. On the other hand, the proposed SDCTPBSD method uses the sign information of the discrete cosine transform (DCT), also known as sign-DCT (SDCT). It resembles the response of receptive field neurons of the HVS. A bottom-up spatio-temporal saliency map is obtained by a linear weighted sum of the spatial saliency map and the temporal saliency map. Based on these saliency detection techniques, foveated video compression (FVC) schemes (FVC-FTPBSD and FVC-SDCTPBSD) are developed to improve the compression performance further. Moreover, the 2D-discrete cosine transform (2D-DCT) is widely used in various video coding standards for block based transformation of spatial data. 
However, for directional featured blocks, the 2D-DCT offers sub-optimal performance and may not be able to represent video data efficiently with few coefficients, which degrades the compression ratio. Various directional transform schemes have been proposed in the literature for efficiently encoding such directional featured blocks. However, it is observed that these directional transform schemes suffer from issues such as the ‘mean weighting defect’, the use of a large number of DCTs and a number of scanning patterns. We propose a directional transform scheme based on the direction-adaptive fixed length discrete cosine transform (DAFL-DCT) for intra- and inter-frame coding to achieve higher coding efficiency for directional featured blocks. Furthermore, the proposed DAFL-DCT has the following two encoding modes. (1) Direction-adaptive fixed length, high efficiency (DAFL-HE) mode for higher compression performance; (2) Direction-adaptive fixed length, low complexity (DAFL-LC) mode for low complexity with a fair compression ratio. On the other hand, motion estimation (ME) exploits temporal correlation between video frames and yields significant improvement in compression ratio while sustaining high visual quality in video coding. Block-matching motion estimation (BMME) is the most popular approach due to its simplicity and efficiency. However, real-world video sequences may contain slow, medium and/or fast motion activities. Further, a single search pattern does not prove efficient in finding the best matched block for all motion types. In addition, it is observed that most BMME schemes are based on a uni-modal error surface. Nevertheless, real-world video sequences may exhibit a large number of local minima within a search window and thus possess a multi-modal error surface (MES). Hence, the following two uni-modal error surface based and multi-modal error surface based motion estimation schemes are developed. 
(1) Direction-adaptive motion estimation (DAME) scheme; (2) Pattern-based modified particle swarm optimization motion estimation (PMPSO-ME) scheme. Subsequently, various fast and efficient foveated video compression schemes are developed with combination of these schemes to improve the video coding performance further while maintaining high visual quality to salient regions. All schemes are incorporated into the H.264/AVC video coding platform. Various experiments have been carried out on H.264/AVC joint model reference software (version JM 18.6). Computing various benchmark metrics, the proposed schemes are compared with other existing competitive schemes in terms of rate-distortion curves, Bjontegaard metrics (BD-PSNR, BD-SSIM and BD-bitrate), encoding time, number of search points and subjective evaluation to derive an overall conclusion
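Block-matching motion estimation, which this abstract builds on, can be sketched with a plain exhaustive search. This is the textbook full-search baseline using the sum of absolute differences (SAD), not the DAME or PMPSO-ME schemes of the dissertation; the frame contents, block size, search range, and the names `sad` and `full_search` are assumptions for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of the current frame
    at (bx, by) and the reference-frame block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n=4, search_range=2):
    """Test every displacement in the search window and return
    ((dx, dy), cost) for the lowest-SAD candidate."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic 8 x 8 reference frame with a non-repeating pattern.
SIZE = 8
ref = [[(3 * x * x + 5 * y * y + 7 * x * y) % 251 for x in range(SIZE)]
       for y in range(SIZE)]
# Current frame: the reference content moved by 2 pixels in x and 1 in y,
# with zeros where the shifted content would come from outside the frame.
cur = [[ref[y + 1][x + 2] if y + 1 < SIZE and x + 2 < SIZE else 0
        for x in range(SIZE)]
       for y in range(SIZE)]

motion_vector, cost = full_search(cur, ref, bx=2, by=2)
```

Full search evaluates every candidate, so it cannot be trapped by the local minima of a multi-modal error surface, but its cost grows with the square of the search range; fast search patterns trade that guarantee for fewer search points.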