
    Region-Based Template Matching Prediction for Intra Coding

    Copy prediction is a well-known category of prediction techniques in video coding in which the current block is predicted by copying samples from a similar block located somewhere in the already decoded stream of samples. Motion-compensated prediction, intra block copy, and template matching prediction are examples. While the displacement information of the similar block is transmitted to the decoder in the bit-stream in the first two approaches, in the last one it is derived at the decoder by repeating the same search algorithm that was carried out at the encoder. Region-based template matching is a recently developed prediction algorithm that is an advanced form of standard template matching. In this method, the reference area is partitioned into multiple regions, and the region to be searched for the similar block(s) is conveyed to the decoder in the bit-stream. Further, the final prediction signal is a linear combination of already decoded similar blocks from the given region. Previous publications demonstrated that region-based template matching can achieve coding efficiency improvements for intra- as well as inter-picture coding with considerably less decoder complexity than conventional template matching. In this paper, a theoretical justification for region-based template matching prediction, supported by experimental data, is presented. Additionally, test results of the method on the latest H.266/Versatile Video Coding (VVC) test model (VTM-14.0) yield average Bjøntegaard-Delta (BD) bit-rate savings of 0.75% (a BD-rate of −0.75%) in the all intra (AI) configuration, with 130% encoder run-time and 104% decoder run-time for a particular parameter selection.
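
    A minimal Python sketch may make the mechanism concrete. Everything here (array layout, template thickness, the number of combined blocks, and the uniform weighting) is an illustrative assumption, not the VTM implementation:

    import numpy as np

    def region_tm_predict(recon, y, x, bs, region, n_best=2, tpl=2):
        """Sketch of region-based template matching prediction.

        recon   : 2-D array of already reconstructed samples
        (y, x)  : top-left corner of the current bs x bs block
        region  : (y0, y1, x0, x1) search window signalled to the decoder
        n_best  : number of best-matching blocks to combine
        tpl     : template thickness (L-shape above and left of a block)
        """
        cur_top = recon[y - tpl:y, x - tpl:x + bs].astype(np.int64)
        cur_left = recon[y:y + bs, x - tpl:x].astype(np.int64)

        candidates = []
        y0, y1, x0, x1 = region
        for cy in range(y0, y1):
            for cx in range(x0, x1):
                cand_top = recon[cy - tpl:cy, cx - tpl:cx + bs]
                cand_left = recon[cy:cy + bs, cx - tpl:cx]
                if cand_top.shape != cur_top.shape or cand_left.shape != cur_left.shape:
                    continue  # candidate template falls outside the picture
                sad = (np.abs(cand_top - cur_top).sum()
                       + np.abs(cand_left - cur_left).sum())
                candidates.append((sad, cy, cx))
        if not candidates:
            raise ValueError("search region too small for this block size")

        # Combine the n_best candidates with the smallest template cost;
        # uniform weights stand in for the linear combination used by the method.
        candidates.sort(key=lambda c: c[0])
        blocks = [recon[cy:cy + bs, cx:cx + bs].astype(np.float64)
                  for _, cy, cx in candidates[:n_best]]
        return np.mean(blocks, axis=0)

    Because the decoder repeats the same search inside the signalled region, only the region index (and not the displacement itself) has to be transmitted.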

    Extended Signaling Methods for Reduced Video Decoder Power Consumption Using Green Metadata

    In this paper, we discuss one aspect of the latest edition of the MPEG standard on energy-efficient media consumption, also known as Green Metadata (ISO/IEC 23001-11): the interactive signaling for remote decoder-power reduction in peer-to-peer video conferencing. In this scenario, the receiver of a video, e.g., a battery-driven portable device, can send a dedicated request to the sender asking for a video bitstream representation that is less complex to decode and process. Consequently, the receiver saves energy and extends its operating time. We provide an overview of the latest studies from the literature dealing with energy-saving aspects, which motivate the extension of the legacy Green Metadata standard. Furthermore, we explain the newly introduced syntax elements and verify their effectiveness in dedicated experiments. We show that the integration of these syntax elements can lead to dynamic energy savings of up to 90% for software video decoding and up to 80% for hardware video decoding.
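
    As a rough illustration of this request/response interaction (the message fields and the sender-side heuristic are invented for this sketch; they are not the syntax elements defined in ISO/IEC 23001-11):

    from dataclasses import dataclass

    @dataclass
    class PowerReductionRequest:
        """Hypothetical receiver-to-sender feedback message."""
        max_decode_complexity: float  # requested fraction of the current decoding cost (0..1)
        allow_resolution_drop: bool
        allow_framerate_drop: bool

    def choose_encoding(req, cfg):
        """Illustrative sender-side reaction to a power-reduction request."""
        cfg = dict(cfg)
        if req.allow_resolution_drop and req.max_decode_complexity <= 0.5:
            cfg["width"] //= 2          # halve the resolution
            cfg["height"] //= 2
        if req.allow_framerate_drop and req.max_decode_complexity <= 0.25:
            cfg["fps"] //= 2            # halve the frame rate
        if req.max_decode_complexity <= 0.75:
            cfg["low_complexity_tools"] = True  # e.g. restrict decoder-heavy coding tools
        return cfg

    # A battery-driven receiver asks for roughly half the decoding effort.
    new_cfg = choose_encoding(
        PowerReductionRequest(0.5, True, False),
        {"width": 1920, "height": 1080, "fps": 30, "low_complexity_tools": False},
    )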

    QoE-Based Low-Delay Live Streaming Using Throughput Predictions

    Recently, HTTP-based adaptive streaming has become the de facto standard for video streaming over the Internet. It allows clients to dynamically adapt media characteristics to network conditions in order to ensure a high quality of experience, that is, to minimize playback interruptions while maximizing video quality at a reasonable level of quality changes. In the case of live streaming, this task becomes particularly challenging due to the latency constraints. The challenge increases further if a client uses a wireless network, where the throughput is subject to considerable fluctuations. Consequently, live streams often exhibit latencies of up to 30 seconds. In the present work, we introduce an adaptation algorithm for HTTP-based live streaming called LOLYPOP (Low-Latency Prediction-Based Adaptation) that is designed to operate with a transport latency of a few seconds. To reach this goal, LOLYPOP leverages TCP throughput predictions on multiple time scales, from 1 to 10 seconds, along with an estimate of the prediction error distribution. In addition to satisfying the latency constraint, the algorithm heuristically maximizes the quality of experience by maximizing the average video quality as a function of the number of skipped segments and quality transitions. In order to select an efficient prediction method, we studied the performance of several time series prediction methods in IEEE 802.11 wireless access networks. We evaluated LOLYPOP under a large set of experimental conditions, limiting the transport latency to 3 seconds, against a state-of-the-art adaptation algorithm from the literature called FESTIVE. We observed that the average video quality is up to a factor of 3 higher than with FESTIVE. We also observed that LOLYPOP is able to reach a broader region in the quality of experience space, and is thus better adjustable to the user profile or service provider requirements. (Technical Report TKN-16-001, Telecommunication Networks Group, Technische Universität Berlin.)
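
    The core idea, predicting throughput and backing off by an error margin before picking a quality level, can be sketched as follows (the quantile-based safety margin and all parameter names are simplifications for illustration, not the published algorithm):

    import numpy as np

    def select_quality(segment_sizes_bits, predicted_tput_bps,
                       rel_error_samples, latency_budget_s, p_success=0.95):
        """Pick the highest quality whose next segment is expected to arrive
        within the latency budget with probability p_success."""
        # Empirical distribution of past relative prediction errors
        # (observed / predicted - 1) gives a pessimistic throughput estimate.
        q = np.quantile(1.0 + np.asarray(rel_error_samples), 1.0 - p_success)
        safe_tput = predicted_tput_bps * max(q, 1e-9)

        best = 0
        for level, size in enumerate(segment_sizes_bits):  # ascending quality
            if size / safe_tput <= latency_budget_s:
                best = level
        return best

    # Example: three quality levels, 5 Mbit/s predicted, 3 s transport budget.
    level = select_quality([1e6, 4e6, 9e6], 5e6,
                           rel_error_samples=[-0.3, -0.1, 0.0, 0.1, 0.2],
                           latency_budget_s=3.0)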

    Intelligent Algorithm for Enhancing MPEG-DASH QoE in eMBMS

    Multimedia streaming is the most demanding and bandwidth-hungry application on today's Internet. MPEG-DASH is a video streaming standard designed for delivering live or on-demand streams over the Internet with the best possible quality, the fewest dropouts, and the least possible buffering. The hybrid architecture of DASH and eMBMS has attracted great attention from the telecommunication industry and multimedia service providers and is deployed in response to the immense growth in multimedia traffic. However, handovers and the limited available resources of the system cause segments of the adaptive video stream to be dropped in eMBMS, which adversely impacts the Quality of Experience (QoE) and creates problems for service providers and network providers delivering the service. In this paper, we present a case study in eMBMS that provides test measures for evaluating MPEG-DASH QoE: we define the metrics that influence QoE in eMBMS, such as bandwidth and packet loss, and observe objective metrics such as stalling (number, duration, and position), buffer length, and accumulated video time. Moreover, we build an intelligent algorithm to predict the rate of segments lost in multicast adaptive video streaming; the algorithm makes an estimation-based decision on how to recover the lost segments. According to the results obtained with the proposed algorithm, the rate of lost segments is greatly decreased compared to the traditional MPEG-DASH multicast and unicast approach for a high number of users.
    This work has been partially supported by the Postdoctoral Scholarship Contratos Postdoctorales UPV 2014 (PAID-10-14) of the Universitat Politècnica de València, by the Programa para la Formación de Personal Investigador (FPI-2015-S2-884) of the Universitat Politècnica de València, and by the Ministerio de Economía y Competitividad through the Convocatoria 2014 Proyectos I+D (Programa Estatal de Investigación Científica y Técnica de Excelencia, Subprograma Estatal de Generación de Conocimiento, project TIN2014-57991-C3-1-P) and through the Convocatoria 2017 Proyectos I+D+I (Programa Estatal de Investigación, Desarrollo e Innovación, convocatoria excelencia, project TIN2017-84802-C2-1-P).
    Abdullah, M. T.; Jimenez, J. M.; Canovas Solbes, A.; Lloret, J. (2017). Intelligent Algorithm for Enhancing MPEG-DASH QoE in eMBMS. Network Protocols and Algorithms, 9(3-4), 94-114. https://doi.org/10.5296/npa.v9i3-4.12573
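
    A toy version of the segment-loss estimation and recovery decision (the independence assumption and the buffer-based rule are ours, purely for illustration; they are not the algorithm proposed in the paper):

    def segment_loss_rate(packet_loss_rate, packets_per_segment):
        """Probability that at least one packet of a segment is lost,
        assuming independent packet losses."""
        return 1.0 - (1.0 - packet_loss_rate) ** packets_per_segment

    def recovery_decision(packet_loss_rate, packets_per_segment,
                          buffer_s, seg_duration_s, unicast_rtt_s):
        """Repair a lost multicast segment over unicast only if the playout
        buffer can absorb the extra round trip; otherwise skip it."""
        p_lost = segment_loss_rate(packet_loss_rate, packets_per_segment)
        can_repair = buffer_s - seg_duration_s > unicast_rtt_s
        return p_lost, ("unicast_repair" if can_repair else "skip_segment")

    # Example: 1 % packet loss and 200 packets per segment give roughly an
    # 87 % chance that a given segment is incomplete.
    print(recovery_decision(0.01, 200, buffer_s=8.0, seg_duration_s=2.0, unicast_rtt_s=0.2))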

    Video Quality Assessment with Texture Information Fusion for Streaming Applications

    The rise in video streaming applications has increased the demand for video quality assessment (VQA). In 2016, Netflix introduced Video Multi-Method Assessment Fusion (VMAF), a full-reference VQA metric that correlates strongly with perceptual quality, but its computation is time-intensive. We propose a Discrete Cosine Transform (DCT)-energy-based VQA model with texture information fusion (VQ-TIF) for video streaming applications that determines the visual quality of the reconstructed video compared to the original video. VQ-TIF extracts Structural Similarity (SSIM) and spatiotemporal features of the frames from the original and reconstructed videos and fuses them using a long short-term memory (LSTM)-based model to estimate the visual quality. Experimental results show that VQ-TIF estimates the visual quality with a Pearson Correlation Coefficient (PCC) of 0.96 and a Mean Absolute Error (MAE) of 2.71, on average, compared to the ground-truth VMAF scores. Additionally, VQ-TIF estimates the visual quality 9.14 times faster than the state-of-the-art VMAF implementation, along with an 89.44% reduction in energy consumption, assuming an Ultra HD (2160p) display resolution. (2024 Mile High Video (MHV).)
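
    A rough PyTorch sketch of such an LSTM-based fusion model (the per-frame feature layout and all layer sizes are assumptions, not the published VQ-TIF architecture):

    import torch
    import torch.nn as nn

    class VQTIFSketch(nn.Module):
        """LSTM over per-frame features, regressing one quality score per clip."""

        def __init__(self, feat_dim=4, hidden=64):
            super().__init__()
            self.lstm = nn.LSTM(input_size=feat_dim, hidden_size=hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)  # VMAF-like score

        def forward(self, x):                 # x: (batch, frames, feat_dim)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :]).squeeze(-1)

    # Example: 8 clips, 120 frames each, 4 features per frame (e.g. SSIM plus
    # a few DCT-energy texture features of the original and reconstructed frames).
    model = VQTIFSketch()
    scores = model(torch.randn(8, 120, 4))    # tensor of 8 predicted scores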

    Shape-Adaptive Discrete Cosine Transform for Prediction Improvement in HEVC (Formangepasste diskrete Cosinus-Transformation für die Prädiktionsverbesserung im HEVC)

    To improve compression in video coding, we introduce explicit reference picture denoising into the coding loop of a video codec. Motivated by the idea that the power of the prediction error can be higher when noise is present in the video to be coded, motion-compensated prediction is improved by the introduced modules. We show how such an approach can be used for coding at very small quantization parameter settings as well as at very coarse quantization settings. The developed algorithms were tested in the reference software of the current HEVC standard. The simulation results show that the proposed approach achieves maximum bit-rate savings of up to 10% for low as well as high quantization parameter settings. On average, bit-rate savings of 7% for high quality and 5% for low quality were achieved when coding the Class B sequences.
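
    A simplified sketch of the idea, motion estimation against a denoised reference picture (the Gaussian filter stands in for the actual in-loop denoising filter, and the full-search SAD matching is only illustrative):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def denoised_motion_search(cur_block, ref_frame, y, x, search=8, sigma=1.0):
        """Find the motion vector that minimises the SAD between the current
        block and a denoised version of the reference picture."""
        ref_dn = gaussian_filter(ref_frame.astype(np.float64), sigma=sigma)
        bs = cur_block.shape[0]
        best_mv, best_sad = None, np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if yy < 0 or xx < 0 or yy + bs > ref_dn.shape[0] or xx + bs > ref_dn.shape[1]:
                    continue  # candidate block outside the picture
                sad = np.abs(ref_dn[yy:yy + bs, xx:xx + bs] - cur_block).sum()
                if sad < best_sad:
                    best_mv, best_sad = (dy, dx), sad
        return best_mv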

    GAN-Based Differential Private Image Privacy Protection Framework for the Internet of Multimedia Things.

    With the development of the Internet of Multimedia Things (IoMT), an increasing amount of image data is collected by various multimedia devices, such as smartphones, cameras, and drones. These images are widely used in every field of IoMT, which presents substantial challenges for privacy preservation. In this paper, we propose a new image privacy protection framework to protect the sensitive personal information contained in images collected by IoMT devices. We aim to use deep neural network techniques to identify the privacy-sensitive content in images, and then protect it with synthetic content generated by generative adversarial networks (GANs) with differential privacy (DP). Our experimental results show that the proposed framework can effectively protect users' privacy while maintaining image utility.
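
    A skeleton of the detect-then-replace pipeline (the detector and the generator are placeholders for the paper's DNN and GAN components, and perturbing the latent code with Laplace noise is only one simple way to inject DP-style randomness, not necessarily the mechanism used in the paper):

    import numpy as np

    def laplace_noise(shape, sensitivity, epsilon, rng):
        """Laplace mechanism: noise with scale sensitivity / epsilon."""
        return rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=shape)

    def protect_image(img, detect_sensitive, generator, epsilon=1.0, sensitivity=1.0, rng=None):
        """Replace every detected sensitive region with synthetic content
        generated from a noised latent code."""
        if rng is None:
            rng = np.random.default_rng(0)
        out = img.copy()
        for (y0, y1, x0, x1) in detect_sensitive(img):
            latent = rng.standard_normal(128)
            latent += laplace_noise(latent.shape, sensitivity, epsilon, rng)
            out[y0:y1, x0:x1] = generator(latent, (y1 - y0, x1 - x0))  # synthetic patch
        return out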