418 research outputs found

    Is Smaller Always Better? - Evaluating Video Compression Techniques for Simulation Ensembles

    We evaluate the applicability of video compression techniques for compressing visualization image databases, which are often used for in situ visualization. Taking practical implementation aspects into account, we identify the relevant compression parameters and evaluate video compression with three different video codecs on several test cases involving several data sets and visualization methods. To quantify the benefits and drawbacks of video compression, we employ metrics for image quality, compression rate, and performance. The experiments discussed provide insight into parameter values that work well in the considered cases.
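    The abstract above does not name its image-quality or compression-rate metrics. As a minimal sketch, assuming PSNR for image quality and a raw-to-compressed byte ratio for compression rate (both choices, and the function names, are illustrative assumptions rather than the authors' method), the two measurements could look like this:

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two flat lists of pixel values.

    Assumption: PSNR stands in for the unnamed image-quality metric.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def compression_ratio(raw_bytes, compressed_bytes):
    """Ratio of uncompressed size to compressed size (higher means smaller output)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return raw_bytes / compressed_bytes
```

    Reporting both numbers per codec setting makes the trade-off in the title explicit: a smaller file is only better if the quality metric stays acceptable.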

    Neighbor Correspondence Matching for Flow-based Video Frame Synthesis

    Video frame synthesis, which consists of interpolation and extrapolation, is an essential video processing technique that can be applied to various scenarios. However, most existing methods cannot handle small objects or large motion well, especially in high-resolution videos such as 4K videos. To overcome these limitations, we introduce a neighbor correspondence matching (NCM) algorithm for flow-based frame synthesis. Since the current frame is not available in video frame synthesis, NCM is performed in a current-frame-agnostic fashion to establish multi-scale correspondences in the spatial-temporal neighborhoods of each pixel. Based on the powerful motion representation capability of NCM, we further propose to estimate intermediate flows for frame synthesis in a heterogeneous coarse-to-fine scheme. Specifically, the coarse-scale module is designed to leverage neighbor correspondences to capture large motion, while the fine-scale module is more computationally efficient to speed up the estimation process. Both modules are trained progressively to eliminate the resolution gap between the training dataset and real-world videos. Experimental results show that NCM achieves state-of-the-art performance on several benchmarks. In addition, NCM can be applied to various practical scenarios such as video compression to achieve better performance. Comment: Accepted to ACM MM 202

    VIDEO PREPROCESSING BASED ON HUMAN PERCEPTION FOR TELESURGERY

    Video transmission plays a critical role in robotic telesurgery because of its high bandwidth and high quality requirements. The goal of this dissertation is to find a preprocessing method based on human visual perception for telesurgical video, so that when preprocessed image sequences are passed to the video encoder, bandwidth can be reallocated from non-essential surrounding regions to the region of interest, ensuring excellent image quality in critical regions (e.g. the surgical region). It can also be considered a quality control scheme that gracefully degrades the video quality in the presence of network congestion. The proposed preprocessing method consists of two major parts. First, we propose a time-varying attention map whose value is highest at the gaze point and falls off progressively towards the periphery. Second, we propose adaptive spatial filtering whose parameters are adjusted according to the attention map. By adding visual adaptation to the spatial filtering, telesurgical video data can be compressed efficiently because our algorithm removes a high degree of visual redundancy. Our experimental results show that the proposed preprocessing method reduces the required bandwidth by more than half with no significant visual effect for the observer. We have also developed an optimal parameter selection algorithm, so that when the network bandwidth is limited, the overall visual distortion after preprocessing is minimized.
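    The two-part preprocessing described above can be illustrated with a small sketch. The abstract does not specify the falloff function or the filter, so the Gaussian falloff, the box filter, and every name and parameter below (`sigma`, `max_radius`, and so on) are assumptions chosen only to show the idea: sharp pixels at the gaze point, stronger smoothing towards the periphery.

```python
import math


def attention_map(width, height, gaze_x, gaze_y, sigma):
    """Attention in (0, 1]: 1.0 at the gaze point, falling off with distance.

    Assumption: a Gaussian falloff; the dissertation's actual map is not given.
    """
    return [
        [math.exp(-((x - gaze_x) ** 2 + (y - gaze_y) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def blur_radius(attention, max_radius):
    """Map attention to a filter radius: 0 at the gaze point, max_radius far away."""
    return round((1.0 - attention) * max_radius)


def adaptive_box_filter(image, attention, max_radius):
    """Smooth a grayscale image (list of rows) with a per-pixel box-filter radius."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            r = blur_radius(attention[y][x], max_radius)
            # Clamp the averaging window to the image borders.
            rows = range(max(0, y - r), min(height, y + r + 1))
            cols = range(max(0, x - r), min(width, x + r + 1))
            window = [image[j][i] for j in rows for i in cols]
            out[y][x] = sum(window) / len(window)
    return out
```

    Recomputing the attention map for each frame from the current gaze position gives the time-varying behaviour the abstract mentions; the smoothed periphery then costs the downstream encoder fewer bits.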

    Portable Video Streaming Network

    This dissertation addresses the challenge of developing a video call system capable of supporting both Android mobile devices and fixed computers. Additionally, it analyses the video quality achieved and how it varies under network bandwidth and packet loss constraints. A prototype of a video call system was implemented using a web application and the Web Real-Time Communication (WebRTC) library. Clients use WebRTC to stream video over a Traversal Using Relays around NAT (TURN) relay server, allowing them to send video to any terminal connected to the Internet. Signalling was implemented using WebSockets and a Node.js server. A quality testing prototype was also implemented, which supports sending pre-recorded videos and capturing and storing video recordings at the sender and receiver. The Video Multimethod Assessment Fusion (VMAF) metric was used as the main video quality metric, based on a comparison between the transmitted and received videos. The quality of video encoded using the open source video encoder VP8 was analysed in constrained network setups. The results measured the video quality degradation and the percentage of received frames, showing that the system is resilient to some bandwidth throttling and packet loss, although with a noticeable degradation in video quality.

    Deliverable D5.2 of the PERSEE project: 2D/3D Codec architecture

    Deliverable D5.2 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D5.2 of the project. Its title: 2D/3D Codec architecture

    Analog parallel processor solutions for video encoding

    This thesis deals with Cellular Nonlinear Network (CNN) analog parallel processor networks and their implementations in current video coding standards. The target applications are low-power video encoders within 3rd generation mobile terminals. The video codecs of such mobile terminals are defined by either the MPEG-4/H.263 or H.264 video standard. All of these standards are based on the block-based hybrid approach. As block-based motion estimation (ME) is responsible for most of the power consumption of such hybrid video encoders, this thesis deals mostly with low-power ME implementations. Low-power solutions are introduced at both the algorithmic and hardware levels. On the algorithmic level, the introduced implementations are derived from a segmentation algorithm, which has previously been partly realized. The first introduced algorithm reduces the computational complexity of ME within an object-based MPEG-4 encoder. The use of this algorithm enables a 60% drop in the power consumption of Full Search ME. The second algorithm calculates a near-optimal block-size partition for H.264 motion estimation. With this algorithm, the computationally complex Lagrange optimization in H.264 ME is not required. The third algorithm reduces the shape bit-rate of an object-based MPEG-4 encoder. On the hardware level, a CNN-type ME architecture is introduced. The architecture includes connections and circuitry to fully realize block-based ME. The analog ME implemented with this architecture is capable of lower power than comparable digital realizations. A 9×9 test chip has also been realized. A digital predictive ME realization that takes advantage of the introduced partition algorithm has also been implemented. Although the IC layout of the ME algorithm was drawn, the design was verified as an FPGA.
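    For readers unfamiliar with the baseline this thesis optimizes, the following is a minimal sketch of generic Full Search block-based motion estimation using the sum of absolute differences (SAD) as the matching cost. It is a textbook illustration, not the thesis's reduced-complexity or analog CNN method; the function names, frame layout (lists of rows), and parameters are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, block):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(block)
        for i in range(block)
    )


def full_search(cur, ref, bx, by, block, search_range):
    """Exhaustively test every displacement in [-search_range, search_range]^2.

    Returns (dx, dy, cost) for the lowest-cost match of the block whose
    top-left corner is (bx, by) in the current frame. The block itself is
    assumed to lie inside the frame, so zero displacement is always valid.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + block > width or y0 + block > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, block)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]
```

    The cost per block grows with the square of the search range times the block area, which is why the abstract identifies ME as the dominant source of power consumption in hybrid encoders.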

    Video Compression for Camera Networks: A Distributed Approach

    The problem of finding efficient communication techniques to distribute multi-view video content across different devices and users in a network has received great attention in recent years. In particular, much interest has recently been devoted to the field of Distributed Video Coding (DVC). After briefly reviewing traditional approaches to multi-view coding, this chapter introduces the field of DVC for multi-camera systems. The theoretical background of Distributed Source Coding (DSC) is first presented concisely, and the problem of applying DSC principles to video sources is then analyzed. The topic is presented by discussing approaches to DVC in both single-view and multi-view applications.

    Contributions in image and video coding

    Advisor: Max Henrique Machado Costa. Thesis (doctorate) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: The image and video coding community has been working on advances that go beyond traditional image and video coding architectures. This work is a set of contributions to various topics that have received increasing attention from researchers in the community, namely scalable coding, low-complexity coding for portable devices, multiview video coding, and run-time adaptive coding. The first contribution studies the performance of three fast block-based 3-D transforms in a low-complexity video codec, named the Fast Embedded Video Codec (FEVC). New implementation methods and scanning orders are proposed for the transforms. The 3-D coefficients are encoded bit-plane by bit-plane by entropy coders, producing a fully embedded output bitstream. The entire implementation uses 16-bit integer arithmetic. Only additions and bit shifts are necessary, which lowers computational complexity. Even with these constraints, reasonable rate versus distortion performance can be achieved, and the encoding time is significantly shorter (by a factor of around 160) than that of the H.264/AVC standard. The second contribution is the optimization of a recently proposed approach for multiview video coding in videoconferencing and other similar unicast-like applications. The target scenario in this approach is providing realistic 3-D video with free viewpoint at good compression rates. To achieve this objective, weights are computed for each view and mapped into quantization parameters. In this work, the previously proposed ad-hoc mapping between weights and quantization parameters is shown to be quasi-optimal for a Gaussian source, and an optimal mapping is derived for a typical video source. The third contribution explores several strategies for adaptive scanning of transform coefficients in the JPEG XR standard. The original global adaptive scanning order applied in JPEG XR is compared with the localized and hybrid scanning methods proposed in this work. These new orders require no changes to the other coding and decoding stages or to the bitstream definition. The fourth and last contribution proposes a hierarchical signal-dependent block-based transform. Hierarchical transforms usually exploit the residual cross-level information at the entropy coding step, but not at the transform step. The transform proposed in this work is an energy compaction technique that also exploits these cross-resolution-level structural similarities. The core idea of the technique is to include in the hierarchical transform a number of adaptive basis functions derived from the lower resolution of the signal. A full image codec was developed to measure the performance of the new transform, and the results obtained are discussed in this work.
    Doctorate. Telecommunications and Telematics. Doctor in Electrical Engineering.