
1,270 research outputs found

An Optimized Rate Control Algorithm in Versatile Video Coding for 360° Videos

Author: He Liqiang
He Xiaohai
Sheriff Raymond
Xiong Shuhua
Zhao Zeming
Publication date: 31/10/2022

Edge Hill University Research Information Repository

Space-variant picture coding

Author: Popkin Timothy John
Publication venue: 'Queen Mary University of London'
Publication date: 01/01/2010

PhD. Space-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first of these directions is in foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model, and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy smoothly space-variant Gaussian blurring, using a specialised filter bank which approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to the brute force approach of employing a separate low-pass filter at each image location. The second direction is that of artificially increasing the depth-of-field of an image, an idea borrowed from photography with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth of field algorithms are presented herein, with the desirable properties of aiming to mimic occlusion effects as occur in natural blurring, and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach have been investigated by subjective experiments to compare it with single-viewer foveated image coding. The results found the depth-based preblurring to generally be significantly preferable to the same level of foveation blurring.
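The filter-bank idea in the abstract above can be illustrated with a minimal 1-D sketch. It is not the thesis's specialised filter bank: it assumes a small bank of uniformly blurred copies and simple linear blending between the two nearest blur levels at each sample. All names (`gaussian_kernel`, `blur_uniform`, `blur_space_variant`, `bank_sigmas`) are hypothetical, and `bank_sigmas` is assumed to be sorted ascending with at least two entries.

```python
import math

def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel; sigma <= 0 yields the identity kernel."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_uniform(signal, sigma):
    """Blur the whole signal with a single sigma, replicating edge samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def blur_space_variant(signal, blur_map, bank_sigmas):
    """Approximate a per-sample blur by blending the two nearest bank outputs.

    Assumption: linear interpolation between bank levels, chosen here only
    for illustration; the cost is one uniform blur per bank level instead of
    one dedicated low-pass filter per sample.
    """
    bank = [blur_uniform(signal, s) for s in bank_sigmas]
    out = []
    for i, target in enumerate(blur_map):
        # Keep the requested sigma inside the range covered by the bank.
        target = min(max(target, bank_sigmas[0]), bank_sigmas[-1])
        hi = 1
        while hi < len(bank_sigmas) - 1 and bank_sigmas[hi] < target:
            hi += 1
        lo = hi - 1
        t = (target - bank_sigmas[lo]) / (bank_sigmas[hi] - bank_sigmas[lo])
        out.append((1 - t) * bank[lo][i] + t * bank[hi][i])
    return out
```

For example, `blur_space_variant(row, blur_map, [0.0, 1.0, 2.0, 4.0])` leaves samples whose blur-map value is 0 untouched and blurs the others progressively more strongly. A finer bank reduces the blending error at the cost of more uniform blurs.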

Queen Mary Research Online

OpenGrey Repository

Foveation scalable video coding with automatic fixation selection

Author: A.C. Bovik
Ligang Lu
Zhou Wang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Efficient high-resolution video compression scheme using background and foreground layers

Author: Afsana Fariha
Murshed Manzur
Paul Manoranjan
Taubman David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

Video coding using dynamic background frame achieves better compression compared to the traditional techniques by encoding background and foreground separately. This process reduces coding bits for the overall frame significantly; however, encoding background still requires many bits that can be compressed further for achieving better coding efficiency. The cuboid coding framework has been proven to be one of the most effective methods of image compression which exploits homogeneous pixel correlation within a frame and has better alignment with object boundary compared to traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with the changes of the foreground. However, since the background remains static for a group of pictures, the cuboid coding exploits better spatial pixel homogeneity. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation has been improved with a novel idea by keeping human-visual sensitive information. Unlike the traditional SHVC scheme, in the proposed method, cuboid coded background and the foreground are encoded in separate layers in an implicit manner. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and BD-PSNR gain of 1.51 dB against SHVC with significant encoding time reduction for both UHD and 360 videos. It also achieves an average of 13.88% BD-Rate reduction and 0.78 dB BD-PSNR gain compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE

Federation ResearchOnline

JOINT CODING OF MULTIMODAL BIOMEDICAL IMAGES USING CONVOLUTIONAL NEURAL NETWORKS

Author: Parracho João Oliveira
Publication date: 08/11/2020

The massive volume of data generated daily by the gathering of medical images with different modalities might be difficult to store in medical facilities and share through communication networks. To alleviate this issue, efficient compression methods must be implemented to reduce the amount of storage and transmission resources required in such applications. However, since the preservation of all image details is highly important in the medical context, the use of lossless image compression algorithms is of utmost importance. This thesis presents the research results on a lossless compression scheme designed to encode both computerized tomography (CT) and positron emission tomography (PET) images. Different techniques, such as image-to-image translation, intra prediction, and inter prediction, are used. Redundancies between both image modalities are also investigated. To perform the image-to-image translation approach, we resort to lossless compression of the original CT data and apply a cross-modality image translation generative adversarial network to obtain an estimation of the corresponding PET. Two approaches were implemented and evaluated to determine a PET residue that will be compressed along with the original CT. In the first method, the residue resulting from the differences between the original PET and its estimation is encoded, whereas in the second method, the residue is obtained using the encoder's inter-prediction coding tools. Thus, instead of compressing two independent picture modalities, i.e., both images of the original PET-CT pair, the proposed method independently encodes only the CT, alongside the PET residue. Along with the proposed pipeline, a post-processing optimization algorithm that modifies the estimated PET image by altering the contrast and rescaling the image is implemented to maximize the compression efficiency. Four different versions (subsets) of a publicly available PET-CT pair dataset were tested. 
The first proposed subset was used to demonstrate that the concept developed in this work is capable of surpassing the traditional compression schemes. The obtained results showed gains of up to 8.9% using HEVC. On the other hand, JPEG2k proved less suitable, reaching a compression gain of only -9.1%. For the remaining (more challenging) subsets, the results reveal that the proposed refined post-processing scheme attains, when compared to conventional compression methods, up to 6.33% compression gain using HEVC, and 7.78% using VVC.
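The first residue approach described in this abstract (encode the difference between the original PET and its estimate, then add it back at the decoder) can be sketched as follows. This is an illustrative sketch only: `zlib` stands in for the HEVC/VVC/JPEG2k codecs named in the abstract, the estimate is passed in as a plain list rather than produced by a generative adversarial network, and the function names and the 16-bit residue format are assumptions made for this example.

```python
import zlib

def encode_residue(original, estimate):
    """Losslessly pack the per-pixel residue (original - estimate).

    Assumption: pixel values are integers whose differences fit in a
    signed 16-bit range; zlib is only a stand-in entropy coder.
    """
    if len(original) != len(estimate):
        raise ValueError("original and estimate must have the same length")
    residue = [o - e for o, e in zip(original, estimate)]
    raw = b"".join(r.to_bytes(2, "little", signed=True) for r in residue)
    return zlib.compress(raw)

def decode_residue(payload, estimate):
    """Rebuild the original pixels from the estimate plus the decoded residue."""
    raw = zlib.decompress(payload)
    residue = [int.from_bytes(raw[i:i + 2], "little", signed=True)
               for i in range(0, len(raw), 2)]
    return [e + r for e, r in zip(estimate, residue)]
```

Because the decoder can regenerate the same estimate from the losslessly coded CT, only the residue has to be transmitted for the PET image, and `decode_residue(encode_residue(pet, estimate), estimate)` returns `pet` exactly.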

IC-online


Automatic assessment and enhancement of streaming video quality under bandwidth and dynamic range limitations

Author: Venkataramanan Abhinau Kumar
Publication date: 07/08/2024

The explosion in the amount of video content being streamed over the internet in recent years has accelerated the demand for effective and efficient methods for assessing and improving the perceptual quality of images and videos while adhering to internet bandwidth and display dynamic range limitations. Objective models of perceptual quality have found extensive use in optimizing video compression and enhancement parameters to achieve desirable streaming fidelity. In this dissertation, we develop a variety of quality modeling and quality enhancement methods targeting the streaming of standard and high dynamic range (SDR/HDR) videos over the internet, subjected to compression and tone mapping. The Visual Multimethod Assessment Fusion (VMAF) algorithm has recently emerged as a state-of-the-art approach to video quality prediction that now pervades the streaming and social media industry. However, since VMAF requires the evaluation of a heterogeneous set of quality models, it is computationally expensive. Given other advances in hardware-accelerated encoding, quality assessment is emerging as a significant bottleneck in video compression pipelines. Towards alleviating this burden, we first propose a novel Fusion of Unified Quality Evaluators (FUNQUE) framework, by enabling computation sharing and by using a transform that is sensitive to visual perception to boost accuracy. Further, we expand the FUNQUE framework to define a collection of improved low-complexity fused-feature models that advance the state-of-the-art of video quality performance with respect to both accuracy, by 4.2% to 5.3%, and computational efficiency, by factors of 3.8 to 11. High Dynamic Range (HDR) videos are able to represent wider ranges of contrasts and colors than Standard Dynamic Range (SDR) videos, giving more vivid experiences. Due to this, HDR videos are expected to grow into the dominant video modality of the future. 
However, HDR videos are incompatible with existing SDR displays, which form the majority of affordable consumer displays on the market. Because of this, HDR videos must be processed by tone-mapping them to reduced bit-depths to service a broad swath of SDR-limited video consumers. Here, we analyzed the impact of tone-mapping operators on the visual quality of streaming HDR videos by building the first large-scale subjectively annotated open-source database of compressed tone-mapped HDR videos, containing 15,000 tone-mapped sequences derived from 40 unique HDR source contents. The videos in the database were labeled with more than 750,000 subjective quality annotations, collected from more than 1,600 unique human observers. We envision that the new LIVE Tone-Mapped HDR (LIVE-TMHDR) database will enable significant progress on HDR video tone mapping and quality assessment in the future. To this end, we make the database freely available to the community at https://live.ece.utexas.edu/research/LIVE_TMHDR/index.html. Server-side tone-mapping involves automating decisions regarding the choices of tone-mapping operators (TMOs) and their parameters to yield high-fidelity outputs. Moreover, these choices must be balanced against the effects of lossy compression, which is ubiquitous in streaming scenarios. To automate this process, we developed a novel, efficient model of objective video quality named Cut-FUNQUE that is able to accurately predict the visual quality of tone-mapped and compressed HDR videos. By evaluating Cut-FUNQUE on the LIVE-TMHDR database, we show that it achieves state-of-the-art accuracy. Finally, the deep learning revolution has strongly impacted low-level image processing tasks such as style/domain transfer, enhancement/restoration, and visual quality assessments. 
Despite often being treated separately, the aforementioned tasks share a common theme of understanding, editing, or enhancing the appearance of input images without modifying the underlying content. We leverage this observation to develop a novel disentangled representation learning method that decomposes inputs into content and appearance features. The model is trained in a self-supervised manner and we use the learned features to develop a new quality prediction model named DisQUE. We demonstrate through extensive evaluations that DisQUE achieves state-of-the-art accuracy across quality prediction tasks and distortion types. Moreover, we demonstrate that the same features may also be used for image processing tasks such as HDR tone mapping, where the desired output characteristics may be tuned using example input-output pairs. Electrical and Computer Engineering

Texas ScholarWorks

Quality-centered design for Peer-to-Peer live video streaming

Author: Rodríguez Bocca Pablo
Publication venue: UR. FI-INCO

Peer-to-Peer (P2P) networks are a scalable way to offer video services over the Internet. This document focuses on the definition, development, and evaluation of a P2P architecture for distributing live video. The overall network design is guided by Quality of Experience (QoE), whose main component in this case is the video quality perceived by end users, rather than by the traditional Quality of Service (QoS) based design of most systems. To measure perceived video quality automatically and in real time, we extended the recently proposed Pseudo-Subjective Quality Assessment (PSQA) methodology. Two main lines of research are developed. First, we propose a technique for distributing video from multiple sources that can be optimized to maximize perceived quality in contexts with many failures and that has very low signaling overhead (unlike existing systems). We develop a PSQA-based methodology that gives fine control over how the video signal is split into parts and how much redundancy is added, as a function of the dynamics of the network's users. In this way, the robustness of the system can be improved as much as desired, subject to the communication capacity limit. Second, we present a structured mechanism for controlling the network topology. The selection of which users serve which others is important for the robustness of the network, especially when users are heterogeneous in their capacities and connection times. Our design maximizes the expected global quality (evaluated using PSQA) by selecting a topology that improves the robustness of the system. We also study how to extend the network with two complementary services: Video on Demand (VoD) and the MyTV service. 
The challenge in these services is how to search the video library efficiently, given the highly dynamic content. We present a caching strategy for searches in these services that maximizes the total number of correct answers to queries, considering particular content dynamics and bandwidth constraints. Our overall design considers real scenarios, where the test cases and configuration parameters come from real data of a reference service in production. Our prototype is fully functional, free to use, and based on well-proven open-source technologies.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas