A comprehensive video codec comparison
In this paper, we compare the video codecs AV1 (version 1.0.0-2242 from August 2019), HEVC (HM and x265), AVC (x264), the exploration software JEM, which is based on HEVC, and the VVC (successor of HEVC) test model VTM (version 4.0 from February 2019) under two fair and balanced configurations: All Intra for the assessment of intra coding, and Maximum Coding Efficiency with all codecs tuned for their best coding efficiency settings. VTM achieves the highest coding efficiency in both configurations, followed by JEM and AV1. The worst coding efficiency is achieved by x264 and x265, even in the placebo preset for highest coding efficiency. AV1 has gained considerably in coding efficiency compared to previous versions and now outperforms HM with a BD-Rate gain of 24%. VTM gains 5% over AV1 in terms of BD-Rate. Reporting separate numbers for the JVET and AOM test sequences ensures that the choice of test sequences introduces no bias. When comparing only intra coding tools, it is observed that the complexity increases exponentially for linearly increasing coding efficiency.
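The BD-Rate figures quoted above are average bitrate differences at equal quality. The following is a minimal sketch of that idea, not the paper's evaluation script: it assumes piecewise-linear interpolation of log10(bitrate) over quality (the usual Bjøntegaard procedure fits a cubic or piecewise-cubic curve instead), and the rate/PSNR points are synthetic values chosen only so that the result is easy to check. The names `bd_rate` and `_interp_log_rate` are hypothetical.

```python
import math


def _interp_log_rate(points, quality):
    """Linearly interpolate log10(bitrate) at `quality`.

    `points` is a list of (quality, log10_bitrate) pairs sorted by quality,
    with distinct quality values.
    """
    for (q0, r0), (q1, r1) in zip(points, points[1:]):
        if q0 <= quality <= q1:
            t = (quality - q0) / (q1 - q0)
            return r0 + t * (r1 - r0)
    raise ValueError("quality outside the measured range")


def bd_rate(anchor, test, steps=1000):
    """Average bitrate difference in percent of `test` relative to `anchor`.

    Both arguments are lists of (bitrate, quality) pairs, e.g. (kbit/s, PSNR).
    A negative result means `test` needs less bitrate for the same quality.
    Simplification (assumption): linear interpolation instead of a cubic fit.
    """
    a = sorted((q, math.log10(r)) for r, q in anchor)
    b = sorted((q, math.log10(r)) for r, q in test)
    lo = max(a[0][0], b[0][0])   # start of the overlapping quality range
    hi = min(a[-1][0], b[-1][0])  # end of the overlapping quality range
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")

    # Midpoint rule: average the log-rate gap over the overlapping range.
    total = 0.0
    for i in range(steps):
        q = lo + (hi - lo) * (i + 0.5) / steps
        total += _interp_log_rate(b, q) - _interp_log_rate(a, q)
    mean_log_diff = total / steps
    return (10.0 ** mean_log_diff - 1.0) * 100.0


# Synthetic example: the test codec needs 24% less bitrate at every quality.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 41.0)]
test_points = [(rate * 0.76, psnr) for rate, psnr in anchor_points]
print(round(bd_rate(anchor_points, test_points), 2))  # -24.0
```

Averaging in the log-rate domain is what lets one percentage summarise curves that span several bitrate octaves.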
Deep perceptual preprocessing for video coding
We introduce the concept of rate-aware deep perceptual preprocessing (DPP) for video encoding. DPP makes a single pass over each input frame in order to enhance its visual quality when the video is to be compressed with any codec at any bitrate. The resulting bitstreams can be decoded and displayed at the client side without any post-processing component. DPP comprises a convolutional neural network that is trained via a composite set of loss functions that incorporates: (i) a perceptual loss based on a trained no-reference image quality assessment model, (ii) a reference-based fidelity loss expressing L1 and structural similarity aspects, and (iii) a motion-based rate loss via block-based transform, quantization and entropy estimates that converts the essential components of standard hybrid video encoder designs into a trainable framework. Extensive testing using multiple quality metrics and AVC, AV1 and VVC encoders shows that DPP+encoder reduces, on average, the bitrate of the corresponding encoder by 11%. This marks the first time a server-side neural processing component achieves such savings over the state-of-the-art in video coding.
Portable Video Streaming Network
This dissertation addresses the challenge of developing a video call system
capable of supporting both Android mobile devices and fixed computers. Additionally, it analyses the quality of video achieved and its variation in the presence
of network bandwidth and packet loss constraints.
A prototype of a video call system was implemented using a web application
and the Web Real-Time Communication (WebRTC) library. Clients use WebRTC
to stream video over a Traversal Using Relays around NAT (TURN) relay server,
allowing them to send video to any terminal connected to the Internet. Signalling
was implemented using WebSockets and a Node.js server.
A quality testing prototype was also implemented, which supports sending
pre-recorded videos and capturing and storing video recordings at the sender and
receiver. The Video Multimethod Assessment Fusion (VMAF) metric was used as
the main video quality metric, based on the comparison between the transmitted
and received videos.
The quality of a video encoded using the open source video encoder VP8
was analysed in constrained network setups. The results measured the video
quality degradation and percentage of received frames, showing that the system
is resilient to some bandwidth throttling and packet loss, although with a
noticeable video quality degradation.
Encoder Complexity Control in SVT-AV1 by Speed-Adaptive Preset Switching
Current developments in video encoding technology lead to continuously
improving compression performance, but at the expense of increasingly higher
computational demands. Given the growth in online video traffic in recent
years and the concomitant need for video encoding, encoder complexity
control mechanisms are required to restrict the processing time to a sufficient
extent in order to find a reasonable trade-off between performance and
complexity. We present a complexity control mechanism in SVT-AV1 that uses
speed-adaptive preset switching to comply with the remaining time budget. This
method enables encoding with a user-defined time constraint within the complete
preset range with an average precision of 8.9% without introducing any
additional latencies.
Comment: 5 pages, 2 figures, accepted for IEEE International Conference on
Image Processing (ICIP) 202
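The abstract does not spell out the switching rule, so the following is only an illustrative sketch of how preset switching against a remaining time budget could work, not the authors' algorithm and not part of the SVT-AV1 API. It assumes that a higher preset number means faster encoding (as in SVT-AV1), and the function name `choose_preset`, the preset range 0 to 12, and the 5% tolerance are all hypothetical.

```python
def choose_preset(current, elapsed_s, frames_done, frames_total, budget_s,
                  min_preset=0, max_preset=12, tolerance=0.05):
    """Pick the preset for the next group of frames.

    Compares the speed achieved so far with the speed required to finish
    the remaining frames inside the remaining time budget.
    Assumption: a higher preset number encodes faster.
    """
    frames_left = frames_total - frames_done
    time_left = budget_s - elapsed_s

    # Nothing measured yet, or nothing left to encode: keep the preset.
    if frames_done <= 0 or elapsed_s <= 0 or frames_left <= 0:
        return current
    # Budget already used up: fall back to the fastest preset.
    if time_left <= 0:
        return max_preset

    achieved_fps = frames_done / elapsed_s
    required_fps = frames_left / time_left

    if achieved_fps < required_fps * (1.0 - tolerance):
        return min(current + 1, max_preset)  # behind schedule: go faster
    if achieved_fps > required_fps * (1.0 + tolerance):
        return max(current - 1, min_preset)  # ahead of schedule: go slower
    return current


# Hypothetical run: 100 of 200 frames took 50 s, 80 s budget in total.
print(choose_preset(6, elapsed_s=50, frames_done=100,
                    frames_total=200, budget_s=80))  # 7
```

Moving one preset at a time keeps quality changes gradual; a real controller would more likely rely on measured per-preset speeds than on a fixed step.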