226 research outputs found
PEA265: Perceptual Assessment of Video Compression Artifacts
The most widely used video encoders share a common hybrid coding framework
that includes block-based motion estimation/compensation and block-based
transform coding. Despite their high coding efficiency, the encoded videos
often exhibit visually annoying artifacts, denoted as Perceivable Encoding
Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience
(QoE) of end users. To monitor and improve visual QoE, it is crucial to develop
subjective and objective measures that can identify and quantify various types
of PEAs. In this work, we make the first attempt to build a large-scale
subject-labelled database composed of H.265/HEVC compressed videos containing
various PEAs. The database, namely the PEA265 database, includes 4 types of
spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types
of temporal PEAs (i.e. flickering and floating), each containing at least
60,000 image or video patches with positive and negative labels. To objectively
identify these PEAs, we train Convolutional Neural Networks (CNNs) using the
PEA265 database. It appears that the state-of-the-art ResNeXt is capable of
identifying each type of PEA with high accuracy. Furthermore, we define PEA
pattern and PEA intensity measures to quantify the PEA levels of a compressed
video sequence. We believe that the PEA265 database and our findings will benefit the
future development of video quality assessment methods and perceptually
motivated video encoders.
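As one illustration of what a PEA intensity measure might look like, the sketch below scores blocking severity as the ratio of luminance jumps across 8x8 block boundaries to jumps elsewhere in the frame. This is a hypothetical measure for intuition only, not the definition from the paper, and the CNN-based identification step is omitted entirely:

```python
import numpy as np

def blocking_intensity(frame: np.ndarray, block: int = 8) -> float:
    """Hypothetical blocking-PEA intensity: mean absolute luminance jump
    across vertical block boundaries, normalized by the mean jump between
    interior columns. Values near 1 suggest no visible blocking; larger
    values suggest stronger block-boundary discontinuities."""
    frame = frame.astype(np.float64)
    # Differences between each pair of adjacent columns.
    dh = np.abs(np.diff(frame, axis=1))
    cols = np.arange(dh.shape[1])
    # diff index block-1, 2*block-1, ... crosses a block border.
    on_boundary = (cols % block) == (block - 1)
    boundary_jump = dh[:, on_boundary].mean()
    interior_jump = dh[:, ~on_boundary].mean() + 1e-12  # avoid div by zero
    return boundary_jump / interior_jump
```

On a piecewise-constant (heavily blocked) frame this ratio is very large, while on a smooth gradient it stays near 1.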
Comparative analysis of DIRAC PRO-VC-2, H.264 AVC and AVS CHINA-P7
A video codec compresses the input video source to reduce storage and transmission bandwidth requirements while maintaining quality. It is an essential technology for applications such as digital television, DVD-Video, mobile TV, videoconferencing, and Internet video streaming. Different video codecs are used in the industry today, and understanding their operation is key to optimizing them for specific video applications. The latest advanced video codec standards have become of great importance in the multimedia industry, providing cost-effective encoding and decoding with high compression efficiency. Currently, H.264 AVC, AVS, and DIRAC are used in the industry to compress video. The H.264 codec standard was developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG). The Audio Video coding Standard (AVS) workgroup develops audio and video coding standards in China. VC-2, also known as Dirac Pro, was developed by the BBC as a royalty-free technology that anyone can use and has been standardized through SMPTE as VC-2. H.264 AVC, Dirac Pro, Dirac, and AVS-P2 are dedicated to high-definition video, while AVS-P7 targets mobile video. Among these standards, this work performs a comparative analysis of H.264 AVC, DIRAC PRO/SMPTE VC-2, and AVS-P7 in both the low-bitrate and high-bitrate regions. Bitrate control and constant QP are the methods employed for the analysis. Evaluation parameters such as compression ratio, PSNR, and SSIM are used for quality comparison. Depending on the target application and available bitrate, an order of performance is given to indicate the preferred codec.
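Two of the evaluation parameters named above, PSNR and compression ratio, have standard definitions and can be computed as follows (SSIM, which requires local windowed statistics, is omitted for brevity):

```python
import numpy as np

def psnr(ref: np.ndarray, dist: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between a reference frame and a
    reconstructed frame; `peak` is the maximum pixel value (255 for 8-bit)."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

def compression_ratio(raw_bytes: int, coded_bytes: int) -> float:
    """Ratio of raw size to coded size; higher means stronger compression."""
    return raw_bytes / coded_bytes
```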
Video Stream Adaptation In Computer Vision Systems
Computer Vision (CV) has recently been deployed in a wide range of applications, including the surveillance and automotive industries. According to a recent report, the market for CV technologies will grow to $33.3 billion by 2019, with the surveillance and automotive industries sharing over 20% of this market. This dissertation considers the design of real-time CV systems with live video streaming, especially those operating over wireless and mobile networks. Such systems include video cameras/sensors and monitoring stations. The cameras should adapt their captured videos based on the events and/or the available resources and time requirements. The monitoring station receives video streams from all cameras and runs CV algorithms for decisions, warnings, control, and/or other actions. Real-time CV systems are constrained in power, computational, and communication resources. Most video adaptation techniques have considered video distortion as the primary metric. In CV systems, however, the main objective is enhancing event/object detection/recognition/tracking accuracy, which can essentially be thought of as the quality perceived by machines, as opposed to human perceptual quality. High-Efficiency Video Coding (HEVC) is a recent encoding standard that seeks to address the limited communication bandwidth resulting from the popularity of High Definition (HD) videos. Unfortunately, HEVC adopts algorithms that greatly slow down the encoding process, which complicates its use in real-time systems.
This dissertation presents a method for adapting live video streams to limited and varying network bandwidth and energy resources. It analyzes and compares the rate-accuracy and rate-energy characteristics of various video stream adaptation techniques in CV systems. We model the video capturing, encoding, and transmission aspects and then provide an overall model of the power consumed by the video cameras and/or sensors. In addition to modeling the power consumption, we model the achieved bitrate of video encoding. We validate and analyze the power consumption models of each phase, as well as the aggregate power consumption model, through extensive experiments. The analysis includes examining individual parameters separately and examining the impact of changing more than one parameter at a time. For HEVC, we develop an algorithm that predicts the size of a block without iterating through the exhaustive Rate-Distortion Optimization (RDO) method. We demonstrate the effectiveness of the proposed algorithm in comparison with existing algorithms: it achieves approximately 5 times the encoding speed of the RDO algorithm and 1.42 times the encoding speed of the fastest analyzed algorithm.
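A minimal sketch of the kind of aggregate camera-power model described above, with capture and encoding costs scaling with pixel rate and transmission cost scaling with bitrate. The additive structure follows the abstract's phase decomposition, but the energy coefficients are hypothetical placeholders, not values from the dissertation:

```python
def camera_power(frames_per_s: float, pixels_per_frame: int, bitrate_bps: float,
                 e_capture: float = 5e-9,   # J/pixel, hypothetical
                 e_encode: float = 20e-9,   # J/pixel, hypothetical
                 e_tx: float = 50e-9        # J/bit,   hypothetical
                 ) -> float:
    """Illustrative aggregate power (watts) of a camera/sensor node:
    capture + encode scale with the pixel rate, transmission with the
    coded bitrate. Each term corresponds to one modeled phase."""
    pixel_rate = frames_per_s * pixels_per_frame
    p_capture = e_capture * pixel_rate
    p_encode = e_encode * pixel_rate
    p_tx = e_tx * bitrate_bps
    return p_capture + p_encode + p_tx
```

Such a model makes the rate-energy trade-off explicit: lowering the encoded bitrate reduces transmission power but typically costs detection accuracy at the monitoring station.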
Lightweight super resolution network for point cloud geometry compression
This paper presents an approach for compressing point cloud geometry by
leveraging a lightweight super-resolution network. The proposed method involves
decomposing a point cloud into a base point cloud and the interpolation
patterns for reconstructing the original point cloud. While the base point
cloud can be efficiently compressed using any lossless codec, such as
Geometry-based Point Cloud Compression, a distinct strategy is employed for
handling the interpolation patterns. Rather than directly compressing the
interpolation patterns, a lightweight super-resolution network is utilized to
learn this information through overfitting. Subsequently, the network parameter
is transmitted to assist in point cloud reconstruction at the decoder side.
Notably, our approach differentiates itself from lookup table-based methods,
allowing us to obtain more accurate interpolation patterns by accessing a
broader range of neighboring voxels at an acceptable computational cost.
Experiments on MPEG Cat1 (Solid) and Cat2 datasets demonstrate the remarkable
compression performance achieved by our method.
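The decomposition step described above can be illustrated at a single 2x scale: each occupied voxel is mapped to its parent voxel (the base cloud), and the pattern of occupied children is the side information that the overfitted super-resolution network would learn to predict. This sketch shows only that decomposition; the network, the codec, and the multi-scale details are assumptions left out:

```python
import numpy as np

def decompose(points: np.ndarray):
    """Split a voxelized point cloud (integer coordinates, shape [N, 3])
    into a half-resolution base cloud plus an 8-bit child-occupancy
    pattern per base voxel. Illustrative, single-scale only."""
    points = np.asarray(points, dtype=np.int64)
    parents = points >> 1                      # parent voxel of each point
    base, inverse = np.unique(parents, axis=0, return_inverse=True)
    inverse = inverse.ravel()                  # guard against shape quirks
    # 3-bit child index from the low bit of each coordinate (z-order style).
    child_bits = ((points & 1) * [4, 2, 1]).sum(axis=1)
    patterns = np.zeros(len(base), dtype=np.uint8)
    np.bitwise_or.at(patterns, inverse, (1 << child_bits).astype(np.uint8))
    return base, patterns   # base cloud + occupancy pattern per base voxel
```

The base cloud would go to a lossless codec such as G-PCC, while the patterns are what the lightweight network is overfitted to reproduce.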
Low-complexity video compression algorithm and video encoder LSI design
Degree system: new ; Report number: Ko 2876 ; Degree type: Doctor of Engineering ; Date conferred: 2009/9/15 ; Waseda University degree record number: Shin 510
MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding
The rapid advancement of artificial intelligence (AI) technology has led to
the prioritization of standardizing the processing, coding, and transmission of
video using neural networks. To address this priority area, the Moving Picture,
Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a
suite of standards called MPAI-EEV for "end-to-end optimized neural video
coding." The aim of this AI-based video standard project is to reduce the
number of bits required to represent high-fidelity video data by utilizing
data-trained neural coding technologies. This approach is not constrained by
how data coding has traditionally been applied in the context of a hybrid
framework. This paper presents an overview of recent and ongoing
standardization efforts in this area and highlights the key technologies and
design philosophy of EEV. It also provides a comparison and report on some
primary efforts such as the coding efficiency of the reference model.
Additionally, it discusses emerging activities, such as learned Unmanned
Aerial Vehicle (UAV) video coding, which are currently planned, under
development, or in the exploration phase. With a focus on UAV video signals,
this paper addresses the current status of these preliminary efforts. It also
indicates development timelines, summarizes the main technical details, and
provides pointers to further points of reference. The exploration experiment
shows that the EEV model performs better than the state-of-the-art video coding
standard H.266/VVC in terms of perceptual evaluation metrics.
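Coding-efficiency comparisons of this kind are conventionally reported as Bjontegaard-delta (BD) rates. The sketch below implements the standard cubic-fit BD-rate computation as background on the methodology; it is not code from the MPAI-EEV reference model, and in the EEV-vs-VVC comparison a perceptual metric would replace PSNR on the quality axis:

```python
import numpy as np

def bd_rate(rate_a, psnr_a, rate_b, psnr_b) -> float:
    """Bjontegaard-delta bitrate: average percent bitrate change of codec B
    relative to codec A at equal quality, via cubic fits of log-rate over
    the overlapping quality range. Negative values mean B saves bits."""
    la, lb = np.log(rate_a), np.log(rate_b)
    pa = np.polyfit(psnr_a, la, 3)             # log-rate as cubic in quality
    pb = np.polyfit(psnr_b, lb, 3)
    lo = max(min(psnr_a), min(psnr_b))         # shared quality interval
    hi = min(max(psnr_a), max(psnr_b))
    ia, ib = np.polyint(pa), np.polyint(pb)
    avg_a = (np.polyval(ia, hi) - np.polyval(ia, lo)) / (hi - lo)
    avg_b = (np.polyval(ib, hi) - np.polyval(ib, lo)) / (hi - lo)
    return (np.exp(avg_b - avg_a) - 1.0) * 100.0
```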
LOW-DELAY WINDOW-BASED RATE CONTROL SCHEME FOR VIDEO QUALITY OPTIMIZATION IN VIDEO ENCODER
ABSTRACT Consistent video quality and the encoding latency due to buffering are two important aspects in designing a rate control scheme for real-time video coding systems. To balance these two conflicting objectives, we first analyze the buffer latency constraint and the definition of "consistent" video quality. A window-based rate control scheme is then proposed, with one window controlling rate and latency and the other optimizing video quality. By applying a low-complexity frame-level rate-distortion model to the test sequences, our proposed method shows excellent performance in balancing encoder buffer latency against optimized video quality. Moreover, this one-pass rate control scheme is highly practical for real-time video coding applications. Keywords: rate control, buffer latency, video quality, window-based, video encoder
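A toy sketch of the window idea, assuming bits inside a window are allocated in proportion to estimated frame complexity (a stand-in for the paper's low-complexity frame-level R-D model) and the channel drains the encoder buffer at a constant rate per frame interval; the actual two-window scheme is more elaborate:

```python
def allocate_window_bits(window_budget: float, complexities: list) -> list:
    """Distribute one window's bit budget across its frames in proportion
    to each frame's estimated complexity (hypothetical allocation rule)."""
    total = sum(complexities)
    return [window_budget * c / total for c in complexities]

def simulate_buffer(bits_per_frame: list, drain_per_frame: float,
                    buf: float = 0.0):
    """Track encoder buffer fullness: each coded frame adds its bits, the
    channel removes `drain_per_frame` bits per frame interval. Encoding
    latency stays bounded when peak fullness stays below the buffer size."""
    peak = 0.0
    for b in bits_per_frame:
        buf = max(0.0, buf + b - drain_per_frame)
        peak = max(peak, buf)
    return buf, peak
```

The peak fullness returned by `simulate_buffer` is the quantity the latency window must keep bounded, while the quality window smooths the per-frame allocation.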