Search CORE

306 research outputs found

Perceptually-Driven Video Coding with the Daala Video Codec

Author: Bankoski
Daede
Daede
Dai
de Oliveira
Duda
Egge
Egge
Fukuma
Fuldseth
Grange
Han
Ponomarenko
Reader
Sezer
Stuiver
Terriberry
Terriberry
Tran
Valin
Valin
Valin
Wang
Watanabe
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 08/10/2016
Field of study

The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which ones worked and which did not. We evaluate which tools are easy to integrate into a more traditional codec design, and show results in the context of the codec being developed by the Alliance for Open Media.Comment: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 201

arXiv.org e-Print Archive

Crossref

Mathematical transforms and image compression: A review

Author: Satish K. Singh
Publication venue: Maejo University
Publication date: 01/07/2010
Field of study

It is well known that images, often used in a variety of computer and other scientific and engineering applications, are difficult to store and transmit due to their sizes. One possible solution to overcome this problem is to use an efficient digital image compression technique where an image is viewed as a matrix and then the operations are performed on the matrix. All the contemporary digital image compression systems use various mathematical transforms for compression. The compression performance is closely related to the performance by these mathematical transforms in terms of energy compaction and spatial frequency isolation by exploiting inter-pixel redundancies present in the image data. Through this paper, a comprehensive literature survey has been carried out and the pros and cons of various transform-based image compression models have also been discussed

Directory of Open Access Journals

Contributions in image and video coding

Author: Testoni Vanessa
Publication venue: [s.n.]
Publication date: 19/08/2018
Field of study

Orientador: Max Henrique Machado CostaTese (doutorado) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de ComputaçãoResumo: A comunidade de codificação de imagens e vídeo vem também trabalhando em inovações que vão além das tradicionais técnicas de codificação de imagens e vídeo. Este trabalho é um conjunto de contribuições a vários tópicos que têm recebido crescente interesse de pesquisadores na comunidade, nominalmente, codificação escalável, codificação de baixa complexidade para dispositivos móveis, codificação de vídeo de múltiplas vistas e codificação adaptativa em tempo real. A primeira contribuição estuda o desempenho de três transformadas 3-D rápidas por blocos em um codificador de vídeo de baixa complexidade. O codificador recebeu o nome de Fast Embedded Video Codec (FEVC). Novos métodos de implementação e ordens de varredura são propostos para as transformadas. Os coeficiente 3-D são codificados por planos de bits pelos codificadores de entropia, produzindo um fluxo de bits (bitstream) de saída totalmente embutida. Todas as implementações são feitas usando arquitetura com aritmética inteira de 16 bits. Somente adições e deslocamentos de bits são necessários, o que reduz a complexidade computacional. Mesmo com essas restrições, um bom desempenho em termos de taxa de bits versus distorção pôde ser obtido e os tempos de codificação são significativamente menores (em torno de 160 vezes) quando comparados ao padrão H.264/AVC. A segunda contribuição é a otimização de uma recente abordagem proposta para codificação de vídeo de múltiplas vistas em aplicações de video-conferência e outras aplicações do tipo "unicast" similares. O cenário alvo nessa abordagem é fornecer vídeo com percepção real em 3-D e ponto de vista livre a boas taxas de compressão. Para atingir tal objetivo, pesos são atribuídos a cada vista e mapeados em parâmetros de quantização. Neste trabalho, o mapeamento ad-hoc anteriormente proposto entre pesos e parâmetros de quantização é mostrado ser quase-ótimo para uma fonte Gaussiana e um mapeamento ótimo é derivado para fonte típicas de vídeo. A terceira contribuição explora várias estratégias para varredura adaptativa dos coeficientes da transformada no padrão JPEG XR. A ordem de varredura original, global e adaptativa do JPEG XR é comparada com os métodos de varredura localizados e híbridos propostos neste trabalho. Essas novas ordens não requerem mudanças nem nos outros estágios de codificação e decodificação, nem na definição da bitstream A quarta e última contribuição propõe uma transformada por blocos dependente do sinal. As transformadas hierárquicas usualmente exploram a informação residual entre os níveis no estágio da codificação de entropia, mas não no estágio da transformada. A transformada proposta neste trabalho é uma técnica de compactação de energia que também explora as similaridades estruturais entre os níveis de resolução. A idéia central da técnica é incluir na transformada hierárquica um número de funções de base adaptativas derivadas da resolução menor do sinal. Um codificador de imagens completo foi desenvolvido para medir o desempenho da nova transformada e os resultados obtidos são discutidos neste trabalhoAbstract: The image and video coding community has often been working on new advances that go beyond traditional image and video architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely, scalable coding, low-complexity coding for portable devices, multiview video coding and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low complexity video codec. The codec has received the name Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. All implementation is performed using 16-bit integer arithmetic. Only additions and bit shifts are necessary, thus lowering computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved and the encoding time is significantly smaller (around 160 times) when compared to the H.264/AVC standard. The second contribution is the optimization of a recent approach proposed for multiview video coding in videoconferencing applications or other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint video at good compression rates. To achieve such an objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimum for a Gaussian source and an optimum mapping is derived for a typical video source. The third contribution exploits several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders do not require changes in either the other coding and decoding stages or in the bitstream definition. The fourth and last contribution proposes an hierarchical signal dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that can also exploit these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec is developed in order to measure the performance of the new transform and the obtained results are discussed in this workDoutoradoTelecomunicações e TelemáticaDoutor em Engenharia Elétric

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio da Producao Cientifica e Intelectual da Unicamp

A fast algorithm for the computation of 2-D forward and inverse MDCT

Author: Luo Limin,
Senhadji Lotfi
Shu Huazhong
Wu Jiasong
Publication venue: 'Elsevier BV'
Publication date: 01/06/2008
Field of study

International audienceA fast algorithm for computing the two-dimensional (2-D) forward and inverse modified discrete cosine transform (MDCT and IMDCT) is proposed. The algorithm converts the 2-D MDCT and IMDCT with block size M N into four 2-D discrete cosine transforms (DCTs) with block size ðM=4Þ ðN=4Þ. It is based on an algorithm recently presented by Cho et al. [An optimized algorithm for computing the modified discrete cosine transform and its inverse transform, in: Proceedings of the IEEE TENCON, vol. A, 21–24 November 2004, pp. 626–628] for the efficient calculation of onedimensional MDCT and IMDCT. Comparison of the computational complexity with the traditional row–column method shows that the proposed algorithm reduces significantly the number of arithmetic operations

HAL-Rennes 1

Implementation of Lapped Biorthogonal Transform for JPEG- XR Image Coding

Author: Khan Ahmad Khalil
Raja Gulistan
ur Rehman Muhammad Riaz
Publication venue: 'IntechOpen'
Publication date: 09/01/2013
Field of study

IntechOpen

Deep Vectorization Convolutional Neural Networks for Denoising in Mammogram Using Enhanced Image

Author: A Buades
A Mencattini
EM Eksioglu
Eri Matsuyama
F Luisier
J Portilla
JL Starck
K Dabov
MS Elsherif
P Heinlein
SG Chang
T Blu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

University of Liverpool Repository

Crossref