
    Space-variant picture coding

    PhD thesis. Space-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first direction is foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. For research into the multi-viewer and probability-based scenarios, this thesis supplies a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy, smoothly space-variant Gaussian blurring. It uses a specialised filter bank that approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy, at greatly reduced cost compared to the brute-force approach of employing a separate low-pass filter at each image location. The second direction is that of artificially increasing the depth-of-field of an image, an idea borrowed from photography, with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth-of-field algorithms are presented, with the desirable properties of aiming to mimic the occlusion effects that occur in natural blurring and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach were investigated in subjective experiments comparing it with single-viewer foveated image coding. The results found depth-based preblurring to be, in general, significantly preferable to the same level of foveation blurring.
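The idea of replacing a separate low-pass filter per location with a small filter bank can be illustrated with a minimal 1-D sketch. This is not the thesis's specialised filter bank: it swaps in a much simpler, commonly used approximation that blurs the signal uniformly at a few fixed sigmas and then linearly interpolates, per sample, between the two nearest bank levels. The function names, the bank sigmas, and the linear sigma-versus-eccentricity blur map are all illustrative assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalised 1-D Gaussian kernel; sigma <= 0 gives the identity."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_uniform(signal, sigma):
    """Blur a 1-D signal with one Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def blur_space_variant(signal, sigma_map, bank_sigmas):
    """Approximate per-sample Gaussian blur by interpolating bank levels.

    bank_sigmas must be sorted in ascending order (assumption of this sketch).
    """
    bank = [blur_uniform(signal, s) for s in bank_sigmas]
    out = []
    for i, sigma in enumerate(sigma_map):
        # Clamp the requested blur to the range covered by the bank.
        sigma = min(max(sigma, bank_sigmas[0]), bank_sigmas[-1])
        # Locate the bank interval [lo, hi] containing the requested sigma.
        hi = next(idx for idx, s in enumerate(bank_sigmas) if s >= sigma)
        lo = max(hi - 1, 0)
        if hi == lo:
            t = 0.0
        else:
            t = (sigma - bank_sigmas[lo]) / (bank_sigmas[hi] - bank_sigmas[lo])
        out.append((1 - t) * bank[lo][i] + t * bank[hi][i])
    return out

# Hypothetical blur map: blur grows linearly with distance from a fixation point.
signal = [0.0] * 16 + [1.0] * 16
fixation = 8
sigma_map = [0.15 * abs(i - fixation) for i in range(len(signal))]
bank_sigmas = [0.0, 1.0, 2.0, 4.0]
blurred = blur_space_variant(signal, sigma_map, bank_sigmas)
```

The cost is one uniform blur per bank level plus a per-sample blend, rather than one distinct kernel per sample. Linear blending of two Gaussians is only an approximation of a Gaussian at the intermediate sigma, which is the accuracy gap a purpose-built filter bank would aim to close.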

    Foveation scalable video coding with automatic fixation selection


    Efficient high-resolution video compression scheme using background and foreground layers

    Video coding using a dynamic background frame achieves better compression than traditional techniques by encoding background and foreground separately. This process significantly reduces the coding bits for the overall frame; however, encoding the background still requires many bits, which can be reduced further to achieve better coding efficiency. The cuboid coding framework has been proven to be one of the most effective methods of image compression: it exploits homogeneous pixel correlation within a frame and aligns better with object boundaries than traditional block-based coding. In a video sequence, the cuboid-based frame partitioning varies with changes in the foreground. However, since the background remains static for a group of pictures, cuboid coding can better exploit spatial pixel homogeneity there. In this work, the impact of cuboid coding on the background frame for high-resolution videos (Ultra-High-Definition (UHD) and 360-degree videos) is investigated using the multilayer framework of SHVC. After the cuboid partitioning, the method of coarse frame generation is improved by retaining information to which the human visual system is sensitive. Unlike the traditional SHVC scheme, in the proposed method the cuboid-coded background and the foreground are encoded in separate layers in an implicit manner. Simulation results show that the proposed video coding method achieves an average BD-Rate reduction of 26.69% and a BD-PSNR gain of 1.51 dB against SHVC, with a significant encoding time reduction for both UHD and 360-degree videos. It also achieves an average BD-Rate reduction of 13.88% and a BD-PSNR gain of 0.78 dB compared to the existing relevant method proposed by X. Hoang Van. © 2013 IEEE

    JOINT CODING OF MULTIMODAL BIOMEDICAL IMAGES USING CONVOLUTIONAL NEURAL NETWORKS

    The massive volume of data generated daily by the acquisition of medical images in different modalities can be difficult to store in medical facilities and to share over communication networks. To alleviate this issue, efficient compression methods must be implemented to reduce the storage and transmission resources required in such applications. However, since preserving all image details is highly important in the medical context, lossless image compression algorithms are essential. This thesis presents research results on a lossless compression scheme designed to encode both computerized tomography (CT) and positron emission tomography (PET) images. Different techniques, such as image-to-image translation, intra prediction, and inter prediction, are used. Redundancies between the two image modalities are also investigated. In the image-to-image translation approach, the original CT data is losslessly compressed and a cross-modality image translation generative adversarial network is applied to obtain an estimate of the corresponding PET. Two approaches were implemented and evaluated to determine a PET residue that is compressed along with the original CT. In the first method, the residue resulting from the difference between the original PET and its estimate is encoded, whereas in the second method, the residue is obtained using the encoder's inter-prediction coding tools. Thus, as an alternative to compressing two independent image modalities, i.e., both images of the original PET-CT pair, in the proposed method only the CT is independently encoded, alongside the PET residue. Along with the proposed pipeline, a post-processing optimization algorithm that modifies the estimated PET image by altering its contrast and rescaling it is implemented to maximize compression efficiency. Four different versions (subsets) of a publicly available PET-CT pair dataset were tested.
    The first proposed subset was used to demonstrate that the concept developed in this work can surpass traditional compression schemes. The results showed gains of up to 8.9% using HEVC. JPEG2k, on the other hand, proved less suitable: it failed to obtain good results, reaching a compression gain of only -9.1%. For the remaining (more challenging) subsets, the results reveal that the proposed refined post-processing scheme attains, compared to conventional compression methods, up to 6.33% compression gain using HEVC and 7.78% using VVC.
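The first residue method (encode the CT, then encode only the difference between the PET and an estimate derived from the CT) can be sketched as follows. This is a toy illustration under loud assumptions: samples are 8-bit, `zlib` stands in for HEVC/VVC, the hypothetical `toy_translate` function stands in for the translation network, and `compression_gain` uses an assumed definition (relative size reduction in percent), since the abstract does not define the metric.

```python
import zlib

def toy_translate(ct):
    """Hypothetical stand-in for the CT-to-PET translation network."""
    return [min(255, v // 2 + 10) for v in ct]

def encode_pair(ct, pet, estimate_from_ct):
    """Losslessly encode the CT plus the PET residue (PET minus its estimate)."""
    estimate = estimate_from_ct(ct)
    # Store the residue modulo 256 so each value fits one byte and is invertible.
    residue = bytes((p - e) % 256 for p, e in zip(pet, estimate))
    return zlib.compress(bytes(ct), 9), zlib.compress(residue, 9)

def decode_pair(ct_stream, residue_stream, estimate_from_ct):
    """Recover both modalities; the decoder re-derives the estimate from the CT."""
    ct = list(zlib.decompress(ct_stream))
    residue = zlib.decompress(residue_stream)
    estimate = estimate_from_ct(ct)
    pet = [(r + e) % 256 for r, e in zip(residue, estimate)]
    return ct, pet

def compression_gain(baseline_bytes, proposed_bytes):
    """Assumed definition: percentage size reduction relative to the baseline."""
    return 100.0 * (baseline_bytes - proposed_bytes) / baseline_bytes

# Synthetic 8-bit data: the PET is the toy estimate plus a small perturbation.
ct = [(i * 7) % 256 for i in range(256)]
pet = [min(255, v // 2 + 10 + (i % 3)) for i, v in enumerate(ct)]

ct_stream, residue_stream = encode_pair(ct, pet, toy_translate)
ct_out, pet_out = decode_pair(ct_stream, residue_stream, toy_translate)

baseline = len(zlib.compress(bytes(ct), 9)) + len(zlib.compress(bytes(pet), 9))
proposed = len(ct_stream) + len(residue_stream)
gain = compression_gain(baseline, proposed)
```

Under this assumed definition, a negative gain (such as the -9.1% quoted for JPEG2k) means the joint scheme produced a larger output than coding the two images independently. The scheme stays lossless only if the decoder can reproduce the estimate exactly, which is why the estimator is applied to the decoded CT on both sides.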

    Quality-centric design for Peer-to-Peer live video streaming (Diseño centrado en calidad para la difusión Peer-to-Peer de video en vivo)

    Peer-to-Peer (P2P) networks are a scalable way to offer video services over the Internet. This document focuses on the definition, development, and evaluation of a P2P architecture for distributing live video. The overall network design is guided by Quality of Experience (QoE), whose main component in this case is the video quality perceived by end users, rather than by the Quality of Service (QoS) that traditionally drives the design of most systems. To measure perceived video quality automatically and in real time, we extended the recently proposed Pseudo-Subjective Quality Assessment (PSQA) methodology. Two main lines of research are developed. First, we propose a multi-source video distribution technique that can be optimized to maximize perceived quality under frequent failures and that requires very little signaling (unlike existing systems). We develop a PSQA-based methodology that gives fine control over how the video signal is split into parts and how much redundancy is added, as a function of the dynamics of the network's users. In this way, the robustness of the system can be improved as much as desired, subject to the communication capacity limit. Second, we present a structured mechanism for controlling the network topology. The choice of which users serve which others is important for the robustness of the network, especially when users are heterogeneous in their capacities and connection times. Our design maximizes the expected global quality (evaluated using PSQA) by selecting a topology that improves the robustness of the system. We also study how to extend the network with two complementary services: Video on Demand (VoD) and the MyTV service.
    The challenge in these services is how to search the video library efficiently, given the highly dynamic content. We present a caching strategy for searches in these services that maximizes the total number of correct answers to queries, taking into account particular content dynamics and bandwidth constraints. Our overall design considers real scenarios, where the test cases and configuration parameters come from real data of a reference service in production. Our prototype is fully functional, free to use, and based on well-proven open-source technologies.